ProvRDF
From Provenance WG Wiki
Introduction
This document gives a draft (detailed) translation from (some of) PROV-DM to PROV-O, and sketches how to go in the reverse direction (i.e. how to extract PROV-DM from an RDF graph that includes PROV-O data as well as possibly other RDF).
Note that I (jcheney) am not being careful about using a standardized RDF syntax, as I don't know any of them. I am just giving the flavor of what I have in mind.
Coverage: This is NOT complete, only for illustration so far. It covers many of the basic element and relation records. It does not cover: derivation, acts on behalf of, accounts, ....
Guideline: Include all RDF assertions associated with a DM assertion, even if some of them wind up being redundant/inferrable.
From PROV-DM to PROV-O
We define a translation from PROV-DM formulas to RDF conforming to PROV-O as follows.
There are some places where it's non-obvious (to jcheney) what to do, marked with "???".
Mapping coverage
The undersigned have reviewed DM WD3 and agree that all ASN signatures in WD3 appear as left hand sides of the rules shown on this page. Further, the rules here are in the same order as DM WD3 and no rules appear here without appearing in DM WD3.
- Daniel Garijo (10-Feb-2012)
- Add your name here (date)
- and here (date)
The formulas are listed in an order that corresponds to the order given in PROV-DM WD3.
Translating element formulas
Entity
Uses before defined: prov:type
Activity
Uses before defined: prov:type
Issues (LHS):
- How are startedAt and endedAt distinct from the other attribute-value pairs associated with an Activity? (Satya)
Agent
prov:Person rdfs:subClassOf prov:Agent .
prov:Organization rdfs:subClassOf prov:Agent .
prov:SoftwareAgent rdfs:subClassOf prov:Agent .
[] a owl:AllDisjointClasses; owl:members ( prov:Person prov:Organization prov:SoftwareAgent ) .
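The disjointness axiom above can be illustrated with a small check. This is only a sketch: it assumes triples are held as plain (subject, predicate, object) string tuples, and the helper name `disjointness_violations` and the `ex:` identifiers are hypothetical, not part of PROV-O.

```python
# Sketch only: triples are plain (subject, predicate, object) string tuples.
# The three classes are the ones declared pairwise disjoint on this page.
DISJOINT_AGENT_CLASSES = {"prov:Person", "prov:Organization", "prov:SoftwareAgent"}


def disjointness_violations(triples):
    """Map each subject typed with two or more of the disjoint classes
    to the sorted list of those classes."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type" and o in DISJOINT_AGENT_CLASSES:
            types.setdefault(s, set()).add(o)
    return {s: sorted(cs) for s, cs in types.items() if len(cs) > 1}


# Hypothetical example data (the ex: names are made up for illustration).
graph = [
    ("ex:alice", "rdf:type", "prov:Person"),
    ("ex:acme", "rdf:type", "prov:Organization"),
    ("ex:bot", "rdf:type", "prov:SoftwareAgent"),
    ("ex:bot", "rdf:type", "prov:Person"),  # conflicts with the line above
]

print(disjointness_violations(graph))
```

Here only ex:bot is reported, because it is typed with two of the three disjoint classes.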
Mentions but does not define:
- prov:Person,
- prov:Organization,
- prov:SoftwareAgent .
Uses before defined:
- prov:type
- wasStartedBy
- wasAssociatedWith
- prov:role
Issues:
- How should an agent be typed as Person, Organization, or SoftwareAgent? With a prov:type attribute? (The example shows it, but it is not stated explicitly.)
Note
Concerns:
- Use of notes is reasonable for things like "GUI Color",
- but NOT for the much heavier-duty use that DM offers (meta-provenance).
Uses before defined: hasAnnotation
Translating relation formulas
Generation
Issues:
- An entity can only be generated once (Tim's claim, does DM say anything about it?):
prov:Entity rdfs:subClassOf [ owl:onProperty prov:wasGeneratedAt; owl:maxCardinality 1 ] .
- For PROV-O, the activity id cannot be optional. (Satya)
- Having activity id as optional violates the DM requirement that all relations "have two primary elements" (Section 5.3, DM TPWD). (Satya)
- Why is time [t] listed as a distinct attribute from other attribute-value pairs? Isn't time of generation yet another attribute? (Satya)
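To make the "generated only once" issue concrete, here is a rough closed-world check for the cardinality restriction. It is a sketch under stated assumptions: triples are plain string tuples, the function name `multiply_generated` is hypothetical, and distinct values are treated as different. An OWL reasoner working under the open-world assumption would instead infer that the values denote the same thing unless they are known to differ.

```python
from collections import defaultdict


def multiply_generated(triples, prop="prov:wasGeneratedAt"):
    """Return the sorted subjects that have more than one distinct value
    for `prop` (closed-world reading of owl:maxCardinality 1)."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return sorted(s for s, vs in values.items() if len(vs) > 1)


# Hypothetical example data (the ex: names are made up for illustration).
generation_triples = [
    ("ex:e1", "prov:wasGeneratedAt", "ex:t1"),
    ("ex:e2", "prov:wasGeneratedAt", "ex:t2"),
    ("ex:e2", "prov:wasGeneratedAt", "ex:t3"),  # second generation of ex:e2
]

print(multiply_generated(generation_triples))
```

Only ex:e2 is flagged, since it carries two different values for the property.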
Usage
Issues:
- If activity id is optional for generation record, why is it not so for usage record? These two points need to be reconciled either way. (Satya)
- Similar to generation, time can be "folded" into the "attribute" list. (Satya)
Agent Association
Uses before defined: prov:type, prov:role
Issues:
- All the descriptions associated with Plan in DM TPWD are in the context of Activity, so why should a Plan be associated with both Activity and Agent (also raised as Issue-203 by Stephan)? (Satya)
Starting
PROV-DM (eg)
[PROV-O] (eg)
We seemed to agree at F2F2 that 1) who started and 2) when it was started would be separated
Note: we agreed to leave this relationship out of the first alignment
Issues:
- The majority of uses for the qualified start should actually be Activities (Tim)
Ending
PROV-DM
[PROV-O]
We seemed to agree at F2F2 that 1) who started and 2) when it was started would be separated
Note: we agreed to leave this relationship out of the first alignment
Responsibility
Uses before defined: prov:type, prov:role
Issues:
- might be nice to rename "hadQualifiedEntity/Activity" to "involvedEntity/Activity"
Derivation
precise-1
imprecise-1
imprecise-n
not in DM: consolidated derivation signature
Note: Daniel used the [a,[g2], [u1]] notation to indicate that a, g2, and u1 are optional, but if g2 or u1 are present, then a is required as well. (Imprecise and precise derivations are mixed, but we could separate them)
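The optionality rule in this note can be written as a small predicate. This is a minimal sketch: the function name `valid_derivation_args` and the use of `None` for an absent argument are assumptions made for illustration, not part of DM or PROV-O.

```python
def valid_derivation_args(a=None, g2=None, u1=None):
    """Check the [a, [g2], [u1]] rule: a, g2 and u1 are all optional,
    but g2 or u1 may be given only when a is given as well."""
    if a is None:
        return g2 is None and u1 is None
    return True


# Imprecise derivation: no activity, generation, or usage.
print(valid_derivation_args())
# Activity only, or activity with generation and usage: allowed.
print(valid_derivation_args(a="ex:a"))
print(valid_derivation_args(a="ex:a", g2="ex:g2", u1="ex:u1"))
# Generation without an activity: rejected.
print(valid_derivation_args(g2="ex:g2"))
```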
AlternateOf
Note: we agreed to leave this relationship out of the first alignment
SpecializationOf
Note: we agreed to leave this relationship out of the first alignment
Annotation
PROV-DM (eg)
[PROV-O] (eg)
Note: we'll be using rdfs:comment and rdfs:label to handle the annotations
Account
PROV-DM (eg)
[PROV-O] (eg)
Note: we agreed to leave this relationship out of the first alignment
Record Container
PROV-DM (eg)
[PROV-O] (eg)
Note: we agreed to leave this relationship out of the first alignment
Time
Asserter
PROV-DM (eg)
[PROV-O] (eg)
Location
Traceability
PROV-DM (eg)
[PROV-O] (eg)
Activity Ordering
Revision
PROV-DM (eg)
[PROV-O] (eg)
Attribution
Quotation
Summary
Original Source
old monolithic list
Questions/problems
- The element formula for activities is the only one that mentions additional things besides attributes. This seems odd.
- It isn't obvious whether we should emit a triple saying that the plan element of an activity is a
. I guess this can be inferred if we omit it?
- In the rule for note, there is no class we can assign to the id. (The obvious idea of using rdfs:comment doesn't work because there's no separate class for the comments, and the range of rdfs:comment is Literal.) Is this a problem? Proposed solution: add class prov:Note.
- wasGeneratedBy has a time which can be linked to the generated entity by
, but I think the time should be linked directly to the id. Proposed solution: introduce
, define
as the composition of
and
.
- used has a time and it's not obvious what this should be linked to in RDF and how. There is no relation for linking the used id to the time. Proposed solution: introduce
.
- wasStartedBy and wasEndedBy are treated as events (and they have id's and attributes), but there is no class for them. Proposed solution: introduce
and
as subclasses of QualifiedInvolvement.
- wasStartedBy and wasEndedBy rules have no obvious way to link the start and end time.
- In hasAnnotation, should the attributes be connected to r or to n? Given that the note n can have arbitrary attributes, why does hasAnnotation have additional attributes?
From PROV-O to PROV-DM
Given an instance of PROV-O, we want to compute an instance of PROV-DM that has the "same meaning".
The basic idea is:
- For each node in the RDF graph, check whether the node is an instance of one of the PROV-O classes Entity, Agent, or Activity.
- For each such node, look for the appropriate edges in the prov: namespace needed to fill in the fields of the corresponding PROV-DM record.
- Any additional fields in other namespaces are added as attributes.
- For each of the edges / graph patterns corresponding to PROV-DM relations, look for the corresponding data and generate the appropriate relation.
[TODO: Flesh this out!]
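As a starting point, the basic idea above might be sketched roughly as follows. This is not a worked-out translation. It assumes triples are plain string tuples, it handles only two relation patterns (prov:used and prov:wasGeneratedBy, picked for illustration), it skips the prov: edges that would fill in record fields, and the function name `extract_records` and the tuple layout of the output are hypothetical.

```python
# Sketch only: triples are (subject, predicate, object) string tuples.
ELEMENT_CLASSES = {
    "prov:Entity": "entity",
    "prov:Agent": "agent",
    "prov:Activity": "activity",
}
# Assumed edge-to-relation mapping, limited to two patterns for illustration.
RELATIONS = {"prov:used": "used", "prov:wasGeneratedBy": "wasGeneratedBy"}


def extract_records(triples):
    """Return element records (kind, id, attributes) followed by
    relation records (relation, subject, object)."""
    # Find the nodes typed as one of the PROV-O element classes.
    kinds = {}
    for s, p, o in triples:
        if p == "rdf:type" and o in ELEMENT_CLASSES:
            kinds.setdefault(s, set()).add(ELEMENT_CLASSES[o])

    records = []
    for node in sorted(kinds):
        # Edges outside the prov: namespace (other than rdf:type) become attributes.
        attrs = sorted(
            (p, o) for s, p, o in triples
            if s == node and p != "rdf:type" and not p.startswith("prov:")
        )
        for kind in sorted(kinds[node]):
            records.append((kind, node, attrs))

    # Edges that correspond to PROV-DM relations become relation records.
    for s, p, o in triples:
        if p in RELATIONS:
            records.append((RELATIONS[p], s, o))
    return records


# Hypothetical example data (ex: and dct: names are made up for illustration).
example = [
    ("ex:report", "rdf:type", "prov:Entity"),
    ("ex:report", "dct:title", "Annual report"),
    ("ex:writing", "rdf:type", "prov:Activity"),
    ("ex:report", "prov:wasGeneratedBy", "ex:writing"),
    ("ex:writing", "prov:used", "ex:notes"),
]

for record in extract_records(example):
    print(record)
```

Note that ex:notes appears in a relation record but gets no element record, because the example graph never types it. Whether such nodes should be given a record by inference is left open here.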
