Complex Mappings S1

From Provenance WG Wiki

Authors: Kai Eckert, Daniel Garijo, Simon Miles, Michael Panzer

This document is part of the Dublin Core PROV mapping; see ProvDCMapping.

PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcprov: ???
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>


dct:creator

CONSTRUCT {
   ?doc a prov:Entity ;
      prov:wasAttributedTo ?ag .
   _:out a prov:Entity ;
      prov:specializationOf ?doc .
   ?ag a prov:Agent .
   _:act a prov:Activity, dcprov:CreationActivity ;
      prov:wasAssociatedWith ?ag ;
      prov:qualifiedAssociation _:assoc .
   _:assoc a prov:Association ;
      prov:agent ?ag ;
      prov:hadRole dcprov:CreatorRole .
   _:out prov:wasGeneratedBy _:act ;
      prov:wasAttributedTo ?ag .
} WHERE {
   ?doc dct:creator ?ag .
}

Publisher and contributor can be mapped in the same way; only the roles and activities change:


dct:contributor

CONSTRUCT {
   ?doc a prov:Entity ;
      prov:wasAttributedTo ?ag .
   _:out a prov:Entity ;
      prov:specializationOf ?doc .
   ?ag a prov:Agent .
   _:act a prov:Activity, dcprov:ContributionActivity ;
      prov:wasAssociatedWith ?ag ;
      prov:qualifiedAssociation _:assoc .
   _:assoc a prov:Association ;
      prov:agent ?ag ;
      prov:hadRole dcprov:ContributorRole .
   _:out prov:wasGeneratedBy _:act ;
      prov:wasAttributedTo ?ag .
} WHERE {
   ?doc dct:contributor ?ag .
}


dct:publisher

CONSTRUCT {
   ?doc a prov:Entity ;
      prov:wasAttributedTo ?ag .
   _:out a prov:Entity ;
      prov:specializationOf ?doc .
   ?ag a prov:Agent .
   _:act a prov:Activity, dcprov:PublicationActivity ;
      prov:wasAssociatedWith ?ag ;
      prov:qualifiedAssociation _:assoc .
   _:assoc a prov:Association ;
      prov:agent ?ag ;
      prov:hadRole dcprov:PublisherRole .
   _:out prov:wasGeneratedBy _:act ;
      prov:wasAttributedTo ?ag .
} WHERE {
   ?doc dct:publisher ?ag .
}
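
The three agent mappings above differ only in the matched property, the activity class and the role. The following is a minimal sketch in plain Python of the triples the templates are intended to produce, not an implementation of the mapping itself. It assumes triples are represented as (subject, predicate, object) tuples of prefixed names; the function name map_agent_statements and the numbered blank-node labels (_:out1, _:act1, ...) are hypothetical and only stand in for the fresh blank nodes a SPARQL engine would create per match.

```python
from itertools import count

# Property -> (activity class, role), as used in the three CONSTRUCT queries.
AGENT_MAPPINGS = {
    "dct:creator": ("dcprov:CreationActivity", "dcprov:CreatorRole"),
    "dct:contributor": ("dcprov:ContributionActivity", "dcprov:ContributorRole"),
    "dct:publisher": ("dcprov:PublicationActivity", "dcprov:PublisherRole"),
}


def map_agent_statements(triples):
    """Return the PROV triples for every creator/contributor/publisher statement."""
    fresh = count(1)
    out = set()
    for doc, prop, agent in triples:
        if prop not in AGENT_MAPPINGS:
            continue
        activity_class, role = AGENT_MAPPINGS[prop]
        n = next(fresh)  # assumption: one set of fresh blank nodes per match
        out_entity, act, assoc = f"_:out{n}", f"_:act{n}", f"_:assoc{n}"
        out |= {
            (doc, "rdf:type", "prov:Entity"),
            (doc, "prov:wasAttributedTo", agent),
            (agent, "rdf:type", "prov:Agent"),
            (out_entity, "rdf:type", "prov:Entity"),
            (out_entity, "prov:specializationOf", doc),
            (out_entity, "prov:wasGeneratedBy", act),
            (out_entity, "prov:wasAttributedTo", agent),
            (act, "rdf:type", "prov:Activity"),
            (act, "rdf:type", activity_class),
            (act, "prov:wasAssociatedWith", agent),
            (act, "prov:qualifiedAssociation", assoc),
            (assoc, "rdf:type", "prov:Association"),
            (assoc, "prov:agent", agent),
            (assoc, "prov:hadRole", role),
        }
    return out


example = map_agent_statements([
    ("ex:doc1", "dct:creator", "ex:alice"),
    ("ex:doc1", "dct:publisher", "ex:press"),
])
```

In this example, ex:doc1 is attributed to both agents, while the creation and the publication each generate their own specialization of ex:doc1.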


dct:rightsHolder

The rightsHolder is different: here we propose to omit the activity and just attach the rights holder to the entity by means of prov:wasAttributedTo. This mapping could actually be omitted, as the statements can be inferred from the direct mapping.

CONSTRUCT {
 ?doc     a                         prov:Entity .
 ?ag       a                         prov:Agent .
 ?doc     prov:wasAttributedTo      ?ag .
} WHERE { 
 ?doc dct:rightsHolder ?ag .
}


dct:issued

When using Dublin Core terms, a resource is usually annotated with several dc assertions such as creator, publisher, issued, date, etc. If we assume that each date corresponds to a generation by an activity (creationActivity, publishingActivity, etc.), then we cannot say that all of those activities generated :doc1. Instead, in order to generate "proper" provenance records, we say that each of those activities generated an entity that is a specialization of :doc1.

CONSTRUCT{
?doc               a                         prov:Entity .
# DATE can have different formats, hopefully xsd:date...
_:act               a                         prov:Activity, dcprov:PublicationActivity .
_:act               prov:used                 _:used_entity .
# The “output”
_:iss_entity        a                         prov:Entity .
_:iss_entity        prov:specializationOf     ?doc .
_:iss_entity        prov:wasGeneratedBy       _:act .
_:iss_entity        prov:qualifiedGeneration  [ a prov:Generation ;
                                                prov:atTime ?date  ;
                                                prov:activity _:act ] .
# The “input”
_:used_entity       a                         prov:Entity .
_:used_entity       prov:specializationOf     ?doc .
_:iss_entity        prov:wasDerivedFrom       _:used_entity .
} WHERE { 
 ?doc dct:issued ?date.
}

dct:modified

As the following terms show, most entity/date properties have a similar structure.

CONSTRUCT{
?doc               a                         prov:Entity .
_:act              a                         prov:Activity, dcprov:ModificationActivity .
_:act              prov:used                 _:used_entity .
# The “output”
_:iss_entity        a                         prov:Entity .
_:iss_entity        prov:specializationOf     ?doc .
_:iss_entity        prov:wasGeneratedBy       _:act .
_:iss_entity        prov:qualifiedGeneration  [ a prov:Generation ;
                                                prov:atTime ?date  ;
                                                prov:activity _:act ] .
# The “input”
_:used_entity       a                         prov:Entity .
_:used_entity       prov:specializationOf     ?doc .
_:iss_entity        prov:wasDerivedFrom       _:used_entity .
} WHERE { 
 ?doc dct:modified ?date.
}

dct:dateAccepted

CONSTRUCT{
?doc               a                         prov:Entity .
_:act              a                         prov:Activity, dcprov:AcceptanceActivity .
_:act              prov:used                 _:used_entity .
# The “output”
_:iss_entity        a                         prov:Entity .
_:iss_entity        prov:specializationOf     ?doc .
_:iss_entity        prov:wasGeneratedBy       _:act .
_:iss_entity        prov:qualifiedGeneration  [ a prov:Generation ;
                                                prov:atTime ?date  ;
                                                prov:activity _:act ] .
# The “input”
_:used_entity       a                         prov:Entity .
_:used_entity       prov:specializationOf     ?doc .
_:iss_entity        prov:wasDerivedFrom       _:used_entity .
} WHERE { 
 ?doc dct:dateAccepted ?date.
}

dct:dateCopyrighted

CONSTRUCT{
?doc               a                         prov:Entity .
_:act              a                         prov:Activity, dcprov:CopyrightingActivity .
_:act              prov:used                 _:used_entity .
# The “output”
_:iss_entity        a                         prov:Entity .
_:iss_entity        prov:specializationOf     ?doc .
_:iss_entity        prov:wasGeneratedBy       _:act .
_:iss_entity        prov:qualifiedGeneration  [ a prov:Generation ;
                                                prov:atTime ?date  ;
                                                prov:activity _:act ] .
# The “input”
_:used_entity       a                         prov:Entity .
_:used_entity       prov:specializationOf     ?doc .
_:iss_entity        prov:wasDerivedFrom       _:used_entity .
} WHERE { 
 ?doc dct:dateCopyrighted ?date.
}

dct:dateSubmitted

CONSTRUCT{
?doc               a                         prov:Entity .
_:act              a                         prov:Activity, dcprov:SubmissionActivity .
_:act              prov:used                 _:used_entity .
# The “output”
_:iss_entity        a                         prov:Entity .
_:iss_entity        prov:specializationOf     ?doc .
_:iss_entity        prov:wasGeneratedBy       _:act .
_:iss_entity        prov:qualifiedGeneration  [ a prov:Generation ;
                                                prov:atTime ?date  ;
                                                prov:activity _:act ] .
# The “input”
_:used_entity       a                         prov:Entity .
_:used_entity       prov:specializationOf     ?doc .
_:iss_entity        prov:wasDerivedFrom       _:used_entity .
} WHERE { 
 ?doc dct:dateSubmitted ?date.
}

dct:created

CONSTRUCT{
?doc               a                         prov:Entity .
_:act              a                         prov:Activity, dcprov:CreationActivity .
_:act              prov:used                 _:used_entity .
# The “output”
_:iss_entity        a                         prov:Entity .
_:iss_entity        prov:specializationOf     ?doc .
_:iss_entity        prov:wasGeneratedBy       _:act .
_:iss_entity        prov:qualifiedGeneration  [ a prov:Generation ;
                                                prov:atTime ?date  ;
                                                prov:activity _:act ] .
# The “input”
_:used_entity       a                         prov:Entity .
_:used_entity       prov:specializationOf     ?doc .
_:iss_entity        prov:wasDerivedFrom       _:used_entity .
} WHERE { 
 ?doc dct:created ?date.
}
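
The date mappings above share one template and vary only in the Dublin Core property and the activity class. The sketch below, in plain Python with triples as (subject, predicate, object) tuples, illustrates why each date yields its own generated entity rather than a generation of the document itself. The function name map_date_statements, the numbered blank-node labels and the _:gen label for the qualified generation are assumptions made for illustration only.

```python
from itertools import count

# Date property -> activity class, following the section headings above.
DATE_MAPPINGS = {
    "dct:issued": "dcprov:PublicationActivity",
    "dct:modified": "dcprov:ModificationActivity",
    "dct:dateAccepted": "dcprov:AcceptanceActivity",
    "dct:dateCopyrighted": "dcprov:CopyrightingActivity",
    "dct:dateSubmitted": "dcprov:SubmissionActivity",
    "dct:created": "dcprov:CreationActivity",
}


def map_date_statements(triples):
    """Return the PROV triples for every dated Dublin Core statement."""
    fresh = count(1)
    out = set()
    for doc, prop, date in triples:
        activity_class = DATE_MAPPINGS.get(prop)
        if activity_class is None:
            continue
        n = next(fresh)  # assumption: fresh blank nodes for each match
        act, gen = f"_:act{n}", f"_:gen{n}"
        output, used = f"_:iss_entity{n}", f"_:used_entity{n}"
        out |= {
            (doc, "rdf:type", "prov:Entity"),
            (act, "rdf:type", "prov:Activity"),
            (act, "rdf:type", activity_class),
            (act, "prov:used", used),
            # The "output": a specialization of the document, generated at ?date.
            (output, "rdf:type", "prov:Entity"),
            (output, "prov:specializationOf", doc),
            (output, "prov:wasGeneratedBy", act),
            (output, "prov:qualifiedGeneration", gen),
            (gen, "rdf:type", "prov:Generation"),
            (gen, "prov:atTime", date),
            (gen, "prov:activity", act),
            # The "input": another specialization of the document.
            (used, "rdf:type", "prov:Entity"),
            (used, "prov:specializationOf", doc),
            (output, "prov:wasDerivedFrom", used),
        }
    return out


example = map_date_statements([
    ("ex:doc1", "dct:created", "2012-01-15"),
    ("ex:doc1", "dct:issued", "2012-03-01"),
])
generated = {s for (s, p, o) in example if p == "prov:wasGeneratedBy"}
```

Here generated contains two distinct blank nodes, one per date, and never ex:doc1 itself; both are linked back to the document through prov:specializationOf.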


dct:isVersionOf / dct:hasVersion

Here, a single SPARQL CONSTRUCT query can deal with both a Dublin Core property and its inverse.

isVersionOf: In dcterms, isVersionOf is defined as “A related resource of which the described resource is a version, edition, or adaptation”. This is much more general than prov:wasRevisionOf, so we map it to prov:wasDerivedFrom. In terms of generality, we would say that prov:wasDerivedFrom > dcterms:isVersionOf > prov:wasRevisionOf. Thus:

CONSTRUCT {
   ?doc1 a prov:Entity ;
      prov:wasDerivedFrom ?doc2.
   ?doc2 a prov:Entity .
} WHERE {
   # UNION matches a statement in either direction independently.
   { ?doc1 dct:isVersionOf ?doc2 . }
   UNION
   { ?doc2 dct:hasVersion ?doc1 . }
}


dct:isFormatOf / dct:hasFormat

isFormatOf: “A related resource that is substantially the same as the described resource, but in another format”. This maps to prov:alternateOf. We don’t know which entity both of them specialize, but we know that one is an alternate of the other.

CONSTRUCT {
   ?doc1 a prov:Entity ;
      prov:alternateOf ?doc2.
   ?doc2 a prov:Entity .
} WHERE {
   { ?doc1 dct:isFormatOf ?doc2 . }
   UNION
   { ?doc2 dct:hasFormat ?doc1 . }
}


dct:isReferencedBy / dct:references

IsReferencedBy: A related resource that references, cites, or otherwise points to the described resource.

CONSTRUCT {
   ?doc1 a prov:Entity ;
      prov:wasDerivedFrom ?doc2.
   ?doc2 a prov:Entity .
} WHERE {
   { ?doc1 dct:isReferencedBy ?doc2 . }
   UNION
   { ?doc2 dct:references ?doc1 . }
}

dct:replaces / dct:isReplacedBy

CONSTRUCT {
   ?doc1 a prov:Entity ;
      prov:tracedTo ?doc2.
   ?doc2 a prov:Entity .
} WHERE {
   { ?doc1 dct:replaces ?doc2 . }
   UNION
   { ?doc2 dct:isReplacedBy ?doc1 . }
}
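
Each of the four queries above is meant to accept a statement in either direction and emit the same PROV relation. As a rough illustration, the sketch below normalizes a property and its inverse to one (doc1, doc2) pair before emitting the triples. It is plain Python over (subject, predicate, object) tuples; the function name map_relation_statements is hypothetical, and the property-to-PROV table simply copies what the queries above state.

```python
# (property, inverse property) -> PROV property, copied from the queries above.
INVERSE_PAIR_MAPPINGS = {
    ("dct:isVersionOf", "dct:hasVersion"): "prov:wasDerivedFrom",
    ("dct:isFormatOf", "dct:hasFormat"): "prov:alternateOf",
    ("dct:isReferencedBy", "dct:references"): "prov:wasDerivedFrom",
    ("dct:replaces", "dct:isReplacedBy"): "prov:tracedTo",
}


def map_relation_statements(triples):
    """Return the PROV triples for statements given in either direction."""
    out = set()
    for subject, prop, obj in triples:
        for (direct, inverse), prov_prop in INVERSE_PAIR_MAPPINGS.items():
            if prop == direct:
                doc1, doc2 = subject, obj   # ?doc1 direct ?doc2
            elif prop == inverse:
                doc1, doc2 = obj, subject   # ?doc2 inverse ?doc1
            else:
                continue
            out |= {
                (doc1, "rdf:type", "prov:Entity"),
                (doc1, prov_prop, doc2),
                (doc2, "rdf:type", "prov:Entity"),
            }
    return out


example = map_relation_statements([
    ("ex:v2", "dct:isVersionOf", "ex:v1"),
    ("ex:report", "dct:hasVersion", "ex:draft"),
])
```

Both input statements produce a prov:wasDerivedFrom triple pointing from the version to the resource it is a version of, regardless of which of the two properties was used in the source data.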


dct:source

dct:source is defined as "A related resource from which the described resource is derived". Thus, we have mapped it as a kind of derivation. prov:hadOriginalSource is more specific.

CONSTRUCT{
  ?doc1         a                      prov:Entity ;
                 prov:wasDerivedFrom    ?doc2 .
  ?doc2         a                      prov:Entity .
 } WHERE { 
  ?doc1 dct:source ?doc2.
 }

List of dc terms excluded from the mapping

  • dcterms:abstract: Summary of the resource. Thus, not part of its provenance.
  • dcterms:accessRights: Who can access the resource (security status). Since the access privileges are part of the description of the resource, it is not included in the mapping.
  • dcterms:accrualPeriodicity: Frequency with which items are added to a collection. Descriptive metadata.
  • dcterms:accrualPolicy: Policy governing the addition of items to a collection. We could use it to enrich the qualified involvement, but there is no direct mapping of this relationship.
  • dcterms:alternative: Refers to an alternative name of the resource. Descriptive metadata.
  • dcterms:audience: The audience for whom the resource is useful. Not related to provenance.
  • dcterms:conformsTo: Indicates if the resource conforms to a standard. Descriptive metadata.
  • dcterms:coverage: Topic of the resource. Descriptive metadata.
  • dcterms:description: An account of the resource. Descriptive metadata.
  • dcterms:educationLevel: The educational level of the audience for which the resource is intended.
  • dcterms:extent: Size or duration of the resource. Descriptive metadata.
  • dcterms:format: Format of the resource. Descriptive metadata.
  • dcterms:hasPart?: A resource that is included in the current resource. Entity composition is out of the scope of PROV-DM, so we leave it out of the mapping as well.
  • dcterms:identifier: An unambiguous reference within a given context. Descriptive metadata.
  • dcterms:instructionalMethod: Method used to create the knowledge that the resource is supposed to support. It is not related to the provenance of the resource.
  • dcterms:isPartOf: Inverse of hasPart.
  • dcterms:isRequiredBy: The current resource is required for supporting the function of another resource. This is not related to provenance, since it refers to something that may not have happened yet (e.g., a library dependency, but the program that needs it hasn’t been executed yet).
  • dcterms:language: Language of the resource. Descriptive metadata.
  • dcterms:license: License of the resource.
  • dcterms:mediator: Entity that mediates access to the resource. Not related to its provenance.
  • dcterms:medium: Material of the resource. Descriptive metadata.
  • dcterms:relation: A related resource. This relationship is very broad and may or may not relate resources in terms of provenance. It could be seen as a superproperty of wasDerivedFrom, tracedTo, alternateOf, specializationOf, etc. Thus there is no direct mapping.
  • dcterms:requires: Inverse property of isRequiredBy (see isRequiredBy).
  • dcterms:rights: Metadata about the rights of the resource. Descriptive metadata.
  • dcterms:spatial: Spatial characteristics of the resource. Descriptive metadata. It describes the spatial characteristics of the content of the resource, which is why we can’t map it to prov:hadLocation.
  • dcterms:subject: Topic of the resource. Descriptive metadata.
  • dcterms:tableOfContents: List of subunits of the resource. Descriptive metadata.
  • dcterms:temporal: Temporal characteristics of the resource. Descriptive metadata.
  • dcterms:title: Title of the resource. Descriptive metadata.
  • dcterms:type: rdf:type of the resource.
  • dcterms:bibliographicCitation: It relates the Literal representing the bibliographic citation of the resource to the actual resource, so it is descriptive metadata.