Simon Miles

From Provenance WG Wiki
General
- The document is dated 28 October 2012.

-->Done

- I think the document would benefit from being clearer about how to apply the information contained in it. That is, I'd argue it should be clearly
a methodological document from the start, explaining to the reader how to map their data. At the moment, the document discusses and presents
information, but without a context of how and when the reader should take it into account. If an overall methodology were articulated at the start
of the document, the reader could better understand the mapping and discussion in that context.
-->I have added a new subsection (Structure of the document), which aims to summarize each section and guide users reading the document.
I have also added some paragraphs on how to use the mappings in their corresponding sections. Is that enough? I mean, the methodology is just to apply
the patterns and mappings offered in the document; it is not rocket science.
Abstract
- "prov" should be "PROV" in the final line.

-->Done

- "here here" in the final line.

-->Done

- I don't think readers will know what "direct mappings" or "prov refinements" mean by this point in the document.

-->Done: removed "direct" and explained what the "refinement" does.

Status of This Document
- Missing space before [DCTERMS] citation.

-->Must have been fixed in a previous review (no [DCTERMS] citation in Status)

1. Introduction

- Paragraph 1: "The Dublin Core Metadata Initiative provides a core metadata vocabulary..." - A vocabulary about what? The first two paragraphs provide
lots of minor details about DC, but don't give context explaining what it is for, which seems inconsistent.

-->Done: "The Dublin Core Metadata Initiative (DCMI) provides a core metadata vocabulary for simple and generic resource descriptions..."

- Paragraph 4: "A classification... is provided in Table 1. This classification is... conservative" - I think we should say what the classification
is before talking about its characteristics, otherwise the reader can get lost. What is the classification for?

--> Done. Paragraph changed to reflect the purpose of the classification.

- Paragraph 4: "can be considered as provenance related" - Delete "as".

--> Done

- Why do the categories in the text not follow the order of the categories in the table? This makes it harder to follow. Similarly, it would be good
to have a headed paragraph on the first category, Descriptive, so that it is clear you are explaining the table contents.

-->Done

- I don't think the purpose of the paragraphs on the categories ("Dates and Time terms" etc.) is clear. It seems to be a discussion about provenance,
but without much context beforehand. This should be clarified.

--> Done

- The paragraph about each category talks about just a few of the terms in the category, and it is not clear why. Could we be more rigorous and talk
about every term one by one?

--> We were asked by some reviewers not to include all the terms in each category. They can be found in the tables at the end of the mapping, and I don't see the need to repeat the definitions here. The purpose of the categories is to explain why they are there and what kind of terms are being grouped, not necessarily to define each term in each category.

- "Dates and Time terms" - should be "Date and Time terms"

-->Done

- Why is "terms" in lowercase for "Dates and Time" but capitalised for the other categories, "Terms"?

-->Done

- "Dates typically belong to the provenance record of a resource." - As opposed to belonging to what? Not clear what you are trying to convey.

--> Done, removed that sentence.

- "the publication can be seen as an action" - Delete "the".

-->Done

- In Agency Terms, the sentence starting "All properties that have..." is not grammatically meaningful. Maybe you need to delete "that"?

-->Done

- In Derivation Terms, put "and" before the "dct:source" in the list of derivations.

-->Done

- "dct:references is a weaker relation" - Why? Surely it is a form of derivation like the others? If A references B, then part of A's content
(the reference) is specifically determined by B. I don't understand how it could be weaker than derivation.

-->This term created some discussion within the group. The problem is that adding a reference doesn't always mean that part of the referenced content is used in the current document. A possible example is if my text included the following sentence: "This work has nothing to do with this random reference [REF1]". The work has not been derived from it at all. Normally this is not the case, but since it can happen, we cannot assume the derivation. I have clarified this in the text.

- Paragraph under Table 1: "Despite being relevant for provenance..." - This is confusing, as you are currently talking about provenance. I assume
you need to distinguish between dct:provenance and applicability for mapping to PROV.

--> Done, added a specific comment saying that it is left out of the mapping.

- What is the conclusion of the discursive paragraph about dct:provenance?

--> Done

- Paragraph under Example 1: "ex:doc2 which had probably" - Should be "which probably had"

-->Done

1.1 Namespaces
- Shouldn't this table come before the introduction text, as that uses dct: plenty of times?

--> Moved, and restructured some other parts of the document.

- Where is it explained what the ex: namespace is?

--> Added it to the table.

2.1 Basic considerations
- Point 1): "can be expressed in form" - Missing "the".

--> Fixed

2.2 What is ex:doc1? Entities in Dublin Core
- Paragraph 1: "As a dc metadata record describes the resulting document as a whole." - Resulting from what?

--> Fixed (removed "resulting")

- "According to the PROV ontology, the activity of issuing a document involves two different states of the document" - I don't think PROV-O says
this or requires it. It is more that, if you want to use PROV to model an activity affecting a document or any other thing, then this is most
naturally done using one entity for the state before and one entity for the state after being affected.

--> I disagree: An entity can't be used before being generated by an activity. Thus, for an "Issue" activity we need 2 states: the article before the issue and the issued article, which was generated in that activity. This is taken from the definition of generation: http://www.w3.org/TR/2012/CR-prov-dm-20121211/Overview.html#term-Generation
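
The two-state pattern argued for in this response can be sketched as follows. This is only an illustration, assuming plain (subject, predicate, object) tuples rather than an RDF library; the names ex:doc1_unissued, ex:doc1_issued and ex:issueActivity are hypothetical and do not come from the mapping document.

```python
def issue_pattern(doc, activity):
    """Return triples modelling an 'issue' activity with two states of one document.

    The suffixes '_unissued' and '_issued' are hypothetical naming choices.
    """
    before = f"{doc}_unissued"  # state of the document before being issued
    after = f"{doc}_issued"     # state generated by the issue activity
    return [
        # Both states are specializations (in the PROV sense) of the document.
        (before, "prov:specializationOf", doc),
        (after, "prov:specializationOf", doc),
        # The activity uses the earlier state and generates the later one.
        (activity, "prov:used", before),
        (after, "prov:wasGeneratedBy", activity),
    ]


triples = issue_pattern("ex:doc1", "ex:issueActivity")
for s, p, o in triples:
    print(s, p, o)
```

The entity that the activity uses and the entity it generates are distinct resources here, which is the point of the disagreement above: the issued article only exists once the activity has generated it.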

- "Each of these states corresponds to a different specialization of the document" - Be clear that this is "specialization" in the PROV sense. Maybe
there should be a font style used to show when PROV vocabulary is being used in the text?

--> Done.

- Paragraph 2: "somehow find some activity and agent" - Delete (or explain) "somehow". What activity? I don't think it is necessarily clear to the reader.

--> Done

- I think Figure 1 needs more explanation. What does it show and why?

--> Added.

- I think you need to explain what the bold arrow towards the top of the figures means.
- It seems confusing to use the same notation in the figures for both RDF resources and PROV entities. The top of Figure 1 seems to show
ex:publisher to be an entity, but that isn't shown in the mapped PROV (and probably shouldn't be).


--> TO DO. Yes, we haven't used PROV notation to refer to dct resources.
2.3 Direct mappings
- Paragraph 1: "The direct mappings provide basic interoperability" - Interoperability of what? Basic in what way?

--> Rephrased.

- "By means of OWL 2 RL reasoning, any PROV application can at least make some sense from Dublin Core data." - This seems too vague to mean much. 
What are you specifically trying to convey?

--> That by using the mapping, now PROV applications can interoperate with Dublin Core data. Rephrased.

- Paragraph under Table 3: "a metadata record such as example 1 will infer that" - A record cannot infer.

--> Fixed.

- I still find the prov:wasRevisionOf-dct:isVersionOf mapping strange, even though it may well be correct. This is not necessarily for the mapping
document itself, but I'd be interested to see an example of an assertion that was expressible in dct:isVersionOf but not in prov:wasRevisionOf.

--> This could be complicated, as PROV and DC don't specify what the attributes of a revision are. Since PROV only includes revisions, while DC includes revisions, editions and adaptations, this approach seems more sensible. As an example, I would say that an adaptation into English of a Spanish book is not a prov:revision of that book. However, it is a dct:version. ---> The definitions may have changed since we last brought this to the DC community. We could try to raise the debate once again and discuss this particular mapping.
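
To make the direction of the direct mappings concrete, here is a minimal sketch of the subproperty inference that a reasoner would apply. It is an assumption-laden illustration: the two axioms in SUBPROPERTY_OF are assumed for the example rather than quoted from the mapping tables, the infer function is a hand-written stand-in for a real OWL 2 RL reasoner, and all ex: resources are hypothetical.

```python
# Assumed mapping axioms, read as "key rdfs:subPropertyOf value".
SUBPROPERTY_OF = {
    "dct:creator": "prov:wasAttributedTo",
    "prov:wasRevisionOf": "dct:isVersionOf",
}


def infer(triples, subproperty_of=SUBPROPERTY_OF):
    """Apply the subproperty rule (s p o, p subPropertyOf q => s q o) to a fixpoint."""
    inferred = set(triples)
    frontier = list(inferred)
    while frontier:
        s, p, o = frontier.pop()
        parent = subproperty_of.get(p)
        if parent is not None and (s, parent, o) not in inferred:
            inferred.add((s, parent, o))
            frontier.append((s, parent, o))
    return inferred


data = {
    ("ex:doc1", "dct:creator", "ex:author1"),
    ("ex:rev2", "prov:wasRevisionOf", "ex:rev1"),
    # A translation: a version in the DC sense, not asserted as a revision.
    ("ex:book_en", "dct:isVersionOf", "ex:book_es"),
}
result = infer(data)
for triple in sorted(result - data):
    print(*triple)
```

Under these assumed axioms the inference only runs one way: the revision ex:rev2 is also inferred to be a version of ex:rev1, but the translated book is never inferred to be a prov:wasRevisionOf of the original, which matches the example given in the response.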

- The column headings in Table 5 are wrong (PROV Term and DC Term need to be switched).

--> Fixed.

2.4 PROV refinements
- Why is "Activity" or "Role" part of the names? Why not just "prov:Creation" or "prov:Modification"? This does not seem consistent with the other documents.

--> Fixed

- Is it correct that the namespace of these new terms is prov: and not a separate namespace for the mapping? I assume it is correct, but just checking.

--> We originally had a different one, but the group voted to have the whole PROV family of specifications under the same namespace. So yes, it is correct :)

- The final paragraph uses "should" a couple of times. It is unclear who this obligates, and might be read as an instruction to the reader/implementer,
as this is how "should" is used in the other PROV documents.

--> Fixed

2.5 Complex Mappings
- Paragraph 1: "consist on a set" - Should be "consist of a set"

--> Fixed.

- Figure 4: We need to say, in the figure caption and possibly the accompanying text, which activity is before which other activity (as this is the key point
of the conflation).

--> Added

B.2 Informative references
- The dates for PROV-CONSTRAINTS, PROV-DM and PROV-O are not correct for the forthcoming release.

--> Done, taken from prob_bib.js