ISSUE-403: Feedback on the mapping from Tim Lebo

Feedback_TL

State: CLOSED
Product: Mapping PROV-O to Dublin Core
Raised by: Daniel Garijo
Opened on: 2012-06-09
Description:
Regarding https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer#wiki-References

1)
"To be more precise, we define provenance metadata as metadata providing provenance information according to the definition of the W3C Provenance Incubator Group"

Why are you still using the XG's definition? Does PROV-WG still not provide one that you like? Should PROV-WG be explicit about its definition of provenance (since its materials will become a Recommendation and the XG's will not)?


2)

"For the complex mappings, we take the following approach: "

is confusing. Is one of the "three parts" enumerated above "complex"? Ah, yes: the third.

I suggest drawing that connection more clearly.

3)

The points in the second half of the paragraph:

". A rationale for these two steps is that the mappings in stage 1 are context free and do not depend on the existence of any other statements. On the other hand, by employing the patterns developed for stage 2, any kind of generated PROV data could be cleaned up at a later point, for instance after the integration with provenance information from a different source, which could be advantageous. "

really should be promoted to the first half of the paragraph. It takes too long to determine what the distinction is between the two phases.

4)

The use of blank nodes is disturbing (http://linkeddatabook.com/editions/1.0/#htoc16). Please make it clear that the bnodes only exist during the processing that you suggest, and that bnodes are not produced in resulting PROV or DC records.

5)

Direct mappings:

-1 dct:references rdfs:subPropertyOf prov:wasDerivedFrom .
+1 dct:creator rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
-1 (casting a broad to a specific) dct:date rdfs:subPropertyOf prov:generatedAtTime .
+1 dct:Agent owl:equivalentClass prov:Agent .
-1 (reverse these) prov:hadOriginalSource rdfs:subPropertyOf dct:source .
+1 prov:wasRevisionOf rdfs:subPropertyOf dct:isVersionOf .

Voting for all of them (in https://github.com/dcmi/DC-PROV-Mapping/wiki/Direct-Mappings):

+1 dct:Agent owl:equivalentClass prov:Agent.
-1 dct:references rdfs:subPropertyOf prov:wasDerivedFrom .

+1 dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:creator rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:publisher rdfs:subPropertyOf prov:wasAttributedTo .
+1 dct:contributor rdfs:subPropertyOf prov:wasAttributedTo .

+1 dct:isVersionOf rdfs:subPropertyOf prov:wasDerivedFrom .
+1 dct:isFormatOf rdfs:subPropertyOf prov:alternateOf .
+1 dct:replaces rdfs:subPropertyOf prov:tracedTo .
+1 dct:source rdfs:subPropertyOf prov:wasDerivedFrom .

-1 dct:date rdfs:subPropertyOf prov:generatedAtTime .

I would support reversing the above. As it is, you are casting a general "any date you wish" into a very specific meaning.

At first glance, the following are concerning. If the same instance has all of these properties, then it was generated at many distinct times. Perhaps your complex mappings tease this out.

-1 dct:issued rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:dateAccepted rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:dateCopyrighted rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:dateSubmitted rdfs:subPropertyOf prov:generatedAtTime .
-1 dct:modified rdfs:subPropertyOf prov:generatedAtTime .
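
The concern can be sketched in a few lines of Python. This is only an illustration under assumptions: the dates are made up, `entail_subproperties` is a hypothetical helper, and it applies nothing more than the RDFS subPropertyOf rule to the proposed axioms.

```python
# Illustration only: RDFS subPropertyOf entailment applied to one document
# that carries several Dublin Core date properties.
SUBPROPERTY_OF = {
    "dct:issued": "prov:generatedAtTime",
    "dct:dateAccepted": "prov:generatedAtTime",
    "dct:dateSubmitted": "prov:generatedAtTime",
    "dct:modified": "prov:generatedAtTime",
}


def entail_subproperties(triples):
    """Return the input triples plus those entailed by the subPropertyOf axioms."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SUBPROPERTY_OF:
            inferred.add((subject, SUBPROPERTY_OF[predicate], obj))
    return inferred


# Made-up example data (assumption, not taken from the wiki).
document_triples = {
    (":doc", "dct:dateSubmitted", "2012-01-10"),
    (":doc", "dct:dateAccepted", "2012-03-02"),
    (":doc", "dct:issued", "2012-04-15"),
}

generation_times = sorted(
    obj
    for subject, predicate, obj in entail_subproperties(document_triples)
    if predicate == "prov:generatedAtTime"
)
print(generation_times)  # ['2012-01-10', '2012-03-02', '2012-04-15']
```

A single entity ends up with three distinct prov:generatedAtTime values, which is the situation described above.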

The following casts a range into an instant of time.

-1 dct:valid rdfs:subPropertyOf prov:generatedAtTime .

-1 prov:hadOriginalSource rdfs:subPropertyOf dct:source .

I would support reversing the above. PROV is pointing to a subset of the sources that dct:source intends to cite. dct:source is the union of hadOriginalSource and any of its derivations (and more, perhaps).

+1 prov:wasRevisionOf rdfs:subPropertyOf dct:isVersionOf .


6)

In https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer

For readability, I'd reverse the order of these:

dcprov:CreationActivity rdfs:subClassOf
prov:Activity, dcprov:ContributionActivity .
dcprov:ContributionActivity rdfs:subClassOf
prov:Activity .

7)

In https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer

For readability, I'd reverse the order of these:

dcprov:CreatorRole rdfs:subClassOf
prov:Role, dcprov:ContributorRole .
dcprov:ContributorRole rdfs:subClassOf
prov:Role .

8)

If we apply the SPARQL queries from the complex mappings twice, do we get two distinct blank nodes that should be identified with each other?
If so, this leads to a proliferation of bnodes, which should be avoided. If the queries are only informative, and those bnodes are meant to be appropriately named to avoid duplication, then I suggest this be clearly stated.
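
To make the question concrete, here is a minimal Python sketch, not an implementation of the mapping. It assumes a simplified rule that only emits an activity, and `minted_name` with its `urn:example:` scheme is a hypothetical naming convention. It compares fresh bnodes against names derived from the matched bindings when the same rule runs twice.

```python
import hashlib
import itertools

_bnode_counter = itertools.count()


def fresh_bnode():
    """Mimic a CONSTRUCT template bnode: a new label on every rule application."""
    return f"_:b{next(_bnode_counter)}"


def minted_name(doc, agent, kind):
    """Hypothetical scheme: derive a stable name from the matched bindings."""
    digest = hashlib.sha1(f"{doc}|{agent}|{kind}".encode("utf-8")).hexdigest()[:12]
    return f"urn:example:{kind}-{digest}"


def apply_rule(graph, doc, agent, name_activity):
    """Add the activity triples that a simplified publisher-style rule would construct."""
    activity = name_activity(doc, agent)
    graph.add((activity, "rdf:type", "prov:Activity"))
    graph.add((activity, "prov:wasAssociatedWith", agent))


with_bnodes, with_names = set(), set()
for _ in range(2):  # run the same rule twice over the same input
    apply_rule(with_bnodes, ":doc", ":alice", lambda d, a: fresh_bnode())
    apply_rule(with_names, ":doc", ":alice", lambda d, a: minted_name(d, a, "activity"))

print(len(with_bnodes), len(with_names))  # 4 2
```

With fresh bnodes the second run duplicates the activity; with names derived from the bindings the second run adds nothing.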

9)

In https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1 section "List of dc terms excluded from the mapping",
I suggest organizing the terms by descriptive vs. provenance metadata. That way I can review your categorization more easily and focus only on the provenance metadata (which is the point of the mapping).

10)

In https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer

No bibliography entry for (DCMI Usage Board, 2010b) or (DCMI Usage Board, 2010a)

You don't reference the URL http://dublincore.org/documents/dcmi-terms/ ?

11)

It seems like you could include the content of https://github.com/dcmi/DC-PROV-Mapping/wiki/Direct-Mappings and https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations directly in the "primer" - the redundancy is dissonant.

Why three complex mappings in the primer? Why now fewer?

The organization across 4 pages makes it difficult to determine "what is where". I think the content as it is could stand on its own as one document.

12)

Where is stage 2 of the complex mappings?


13)

Are there implementations of your complex mapping?



14)

https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations

The following order makes more sense to me

dcprov:PublicationActivity rdfs:subClassOf prov:Activity .
dcprov:ContributionActivity rdfs:subClassOf prov:Activity .
dcprov:CreationActivity rdfs:subClassOf prov:Activity, dcprov:ContributionActivity .
dcprov:ContributorRole rdfs:subClassOf prov:Role .
dcprov:PublisherRole rdfs:subClassOf prov:Role .
dcprov:CreatorRole rdfs:subClassOf prov:Role, dcprov:ContributorRole .



15)

https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations

Are the following used in the complex rules? It would be very nice to show which rules each specialization is used in. Similarly, it would be nice to group rules by their use of PROV terms, and by "in the where" versus "in the construct". A navigation like this would really bring the material together nicely.

dcprov:PublicationActivity rdfs:subClassOf prov:Activity .
dcprov:ContributionActivity rdfs:subClassOf prov:Activity .
dcprov:CreationActivity rdfs:subClassOf prov:Activity, dcprov:ContributionActivity .
dcprov:ContributorRole rdfs:subClassOf prov:Role .
dcprov:PublisherRole rdfs:subClassOf prov:Role .
dcprov:CreatorRole rdfs:subClassOf prov:Role, dcprov:ContributorRole .
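
One way such a navigation could be generated is sketched below in Python. This is a rough illustration under assumptions: the two rule texts in `RULES` are abbreviated stand-ins rather than the wiki's actual rules, and `index_terms` is a hypothetical helper that relies on each rule containing a single uppercase WHERE keyword.

```python
import re
from collections import defaultdict

# Abbreviated, hypothetical rule texts keyed by section name.
RULES = {
    "dct:publisher": """
        CONSTRUCT { _:act a prov:Activity, dcprov:PublicationActivity .
                    _:assoc prov:hadRole dcprov:PublisherRole . }
        WHERE { ?doc dct:publisher ?agent . }
    """,
    "dct:creator": """
        CONSTRUCT { _:act a prov:Activity, dcprov:CreationActivity .
                    _:assoc prov:hadRole dcprov:CreatorRole . }
        WHERE { ?doc dct:creator ?agent . }
    """,
}

# Matches prefixed names in the prov: and dcprov: namespaces only.
TERM = re.compile(r"\b(?:prov|dcprov):\w+")


def index_terms(rules):
    """Map each prov:/dcprov: term to the (rule, clause) pairs that use it."""
    index = defaultdict(set)
    for name, text in rules.items():
        construct, _, where = text.partition("WHERE")
        for clause, body in (("construct", construct), ("where", where)):
            for term in TERM.findall(body):
                index[term].add((name, clause))
    return index


index = index_terms(RULES)
print(sorted(index["dcprov:PublisherRole"]))  # [('dct:publisher', 'construct')]
print(sorted(index["prov:Activity"]))
```

The resulting index answers both questions at once: which rules use a given specialization, and whether the term appears "in the where" or "in the construct".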


16)

Is the following a copy-paste error (publisher is never mentioned)?

https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1

Section: dct:publisher

CONSTRUCT {
?doc a prov:Entity .
prov:wasAttributedTo ?ag .
_:out a prov:Entity .
prov:specializationOf ?doc .
?ag a prov:Agent .
_:act a prov:Activity, dcprov:PublicationActivity ;
prov:wasAssociatedWith ?ag ;
prov:qualifiedAssociation _:assoc .
_:assoc a prov:Association ;
prov:agent ?ag ;
prov:hadRole dcprov:PublisherRole .
_:out prov:wasGeneratedBy _:act ;
prov:wasAttributedTo ?ag .
} WHERE {
?doc dct:creator ?ag .
}



17)

https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1

Spacing is off in:


dct:rightsHolder

The rightsHolder is different, here we propose to omit the activity and just add the rights holder to the entity by means of
prov:wasAttributedTo. This mapping could actually be omitted as the statements can be inferred from the direct mapping.

CONSTRUCT {
?doc a prov:Entity .
?ag a prov:Agent .
?doc prov:wasAttributedTo ?ag .
} WHERE {
?doc dct:rightsHolder?ag .
}


18)

https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1

Recommend expanding variable names to be more readable (e.g., ?ag to ?agent)

19)

https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1

Is there a reason why you use "_:iss_entity" instead of just the "[]" syntax? Smearing a node across the CONSTRUCT makes it more difficult to read. You used the "[]" syntax in:


dct:modified

[ a prov:Generation ;
prov:atTime ?date ;
prov:activity _:act . ] .

Related Action Items:
No related actions
Related emails:
  1. Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (from dgarijo@delicias.dia.fi.upm.es on 2012-07-05)
  2. Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (from lebot@rpi.edu on 2012-07-04)
  3. Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (from dgarijo@delicias.dia.fi.upm.es on 2012-07-04)
  4. PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core] (from sysbot+tracker@w3.org on 2012-06-09)

Related notes:

No additional notes.

$Id: 403.html,v 1.1 2013-06-20 07:37:42 vivien Exp $