PIL OWL Ontology Meeting 2011-11-07

From Provenance WG Wiki
Revision as of 18:51, 7 November 2011 by Tlebo (Talk | contribs)


Meeting Information

prov-wg - Modeling Task Force - OWL group telecon


Agenda

Satya's

  • 1. Review the OPMO-based solution for modeling role information in PROV-O OWL and the instantiation as RDF using James's "division example" (Tim, Daniel, Stephan)
  • 2. Review the modified html document - (a) examples for wasRevisionOf, Recipe etc. (b) classes in "holding section", (c) properties in "holding section"
  • 3. Review new section in html document - Mapping PROV-DM to PROV-O
  • 4. Discuss proposal for simplifying the PROV-O for readers/users by re-structuring some of the properties in a "core" and an "extended" model
  • 5. Discuss addition of diagrams for some of the object properties

Luc's comments

A few questions/comments:

1. There is an outstanding issue (raised by Paul) that we should be able to have time associated with derivation. If adopted, this may require a derivation qualifier. Would your approach still work? In this case, which is the prov:entity?

2. You have introduced a qualifier in participation; there is none in prov-dm. Why is it required here, since it seems to just link entity and pe (no time here, for instance)? Should it be introduced in prov-dm? What else do we want to have?

3. It's the same for revision: there is no qualifier in prov-dm. But here, the Revision qualifier you introduce is of a different nature: it is there to capture a ternary relation between entities (while before, it was a binary relation between pe/entity, with a hook for "extra stuff").

4. I understand why Generation "points to the future", but it makes it the odd one. It also seems that you can't write provenance "linearly" from future to past. Are you satisfied with this?

5. prov-dm introduces a relation "precedes" between events to give some interpretation to the data model. Potentially, it becomes possible to express it in the ontology. I am not suggesting that it should be encoded in the core ontology!

6. It would be good to write that unqualified involvement is "imprecise". When we assert used(pe,e), it could be because of QualifiedUse(pe,e,t1,role=r1) and QualifiedUse(pe,e,t2,role=r2). So, used(pe,e) gives a *lower bound* on the number of actual uses.

7. Your picture (which BTW, I like very much, and we could adopt in prov-dm!) has a Use and a Usage. BTW, what is it you crossed out? Can you explain?

8. I like the fact you use nouns for properties of qualified involvement. There seems to be an exception, which is hadQuotee/hadQuoter (which is in the picture but not described below).

9. On the choice of term "Involvement": is it appropriate to use this term in the case of revision and quotation (BTW, there was a suggestion that complementOf could indicate time intervals, so the same technique could be used here), where there doesn't seem to be a process execution at all? You seem to have introduced "Qualified Relations" really.
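Luc's comment 6 can be illustrated with a small sketch. It assumes qualified uses are held as plain (pe, entity, time, role) tuples, which is a hypothetical representation chosen for illustration and not anything defined by PROV-DM or PROV-O. Projecting them onto unqualified used(pe, e) pairs loses the multiplicity, so the number of unqualified pairs can only be a lower bound on the number of actual uses.

```python
from collections import Counter

# Hypothetical qualified uses: (process execution, entity, time, role).
qualified_uses = [
    ("pe", "e", "t1", "r1"),
    ("pe", "e", "t2", "r2"),
    ("pe", "e2", "t3", "r1"),
]


def unqualified_used(qualified):
    """Project qualified uses onto the set of unqualified used(pe, e) pairs."""
    return {(pe, e) for pe, e, _time, _role in qualified}


def uses_per_pair(qualified):
    """Count how many qualified uses stand behind each used(pe, e) pair."""
    return Counter((pe, e) for pe, e, _time, _role in qualified)


used = unqualified_used(qualified_uses)
counts = uses_per_pair(qualified_uses)
```

Here used(pe, e) appears once in the unqualified view although two qualified uses stand behind it.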

Attendees

  • Tim
  • Satya
  • Khalid
  • Stephan
  • James
  • Stian 
  • Jim Mc.

Discussions

Division example

Example 2: algebra

P is a ProcessExecution that "uses" 40 and 5, and divides them to get 8. 40 is a numerator and 5 is a denominator. The same entity/thing could have different roles with respect to different processes. For example:

(a) 40 / 5 = 8 by PE p1

(b) 8 / 2 = 4 by PE p2,

where 8 plays the roles of "result" and "numerator" respectively.


This is Daniel's preference for sneaking:

    :p
        a prov:ProcessExecution;
        prov:used      "40"^^xsd:integer;   # Could be inferred, but we materialize it directly to avoid the need for inference.
        prov:used      "5"^^xsd:integer;
        prov:generated "8"^^xsd:integer;

        algebra:usedAsNumerator   "40"^^xsd:integer;
        algebra:usedAsDenominator "5"^^xsd:integer;
        algebra:generatedAsResult "8"^^xsd:integer;
    .
    algebra:usedAsNumerator   rdfs:subPropertyOf prov:used .
    algebra:usedAsDenominator rdfs:subPropertyOf prov:used .
    algebra:generatedAsResult rdfs:subPropertyOf prov:generated .
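The comment "could be inferred" in the example above refers to RDFS sub-property entailment. The following is a minimal sketch of that inference over plain string triples, assuming a toy in-memory graph rather than a real RDF library; the function name entail_subproperties is hypothetical. It repeatedly applies the rule "if p rdfs:subPropertyOf q and (s p o), then (s q o)" until nothing new is produced.

```python
SUBPROP = "rdfs:subPropertyOf"


def entail_subproperties(triples):
    """Add (s, q, o) for every (s, p, o) where p is a sub-property of q.

    The rule is re-applied until a fixed point is reached, so chains of
    sub-properties are followed as well.
    """
    closed = set(triples)
    while True:
        sub_props = {(s, o) for s, p, o in closed if p == SUBPROP}
        inferred = {
            (s, sup, o)
            for s, p, o in closed
            for sub, sup in sub_props
            if p == sub
        }
        if inferred <= closed:
            return closed
        closed |= inferred


# Only the role-specific statements are asserted; prov:used and
# prov:generated are left to be inferred.
graph = {
    (":p", "algebra:usedAsNumerator", '"40"^^xsd:integer'),
    (":p", "algebra:usedAsDenominator", '"5"^^xsd:integer'),
    (":p", "algebra:generatedAsResult", '"8"^^xsd:integer'),
    ("algebra:usedAsNumerator", SUBPROP, "prov:used"),
    ("algebra:usedAsDenominator", SUBPROP, "prov:used"),
    ("algebra:generatedAsResult", SUBPROP, "prov:generated"),
}

inferred_graph = entail_subproperties(graph)
```

Materializing prov:used directly, as the example does, trades a few redundant triples for not needing this step at query time.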

Many potential issues with the above: 

a) Can't extract any additional qualifiers or roles (unless you say role=algebra:usedAsNumerator) to  map with PROV-DM.

b) I don't think we can use a Literal as a prov:Entity in PROV-O.

c) "8"^^xsd:integer is here generated by :p - but the number 40 can be "generated" by many different things - at different times.  It is the *entity* ("the output of :p") that *characterizes the integer 40* that was generated by :p. The characterisation could just be 

    :40 a prov:Entity;
        rdf:value "40"^^xsd:integer .


    :p
        a prov:ProcessExecution;
        prov:used "40"^^xsd:integer; # OWL FULL? I know....
        prov:qualifiedUse [
            prov:entity :40
        ];
        prov:used "5"^^xsd:integer;
        prov:qualifiedUse [
        ];
        tim:generated "8"^^xsd:integer;
        prov:qualifiedGeneration [
            prov:entity "8"^^xsd:integer;
        ];
    .
    tim:generated owl:inverseOf prov:wasGeneratedBy .

Review OPMO-style solution for modelling roles

Tim talked through http://www.w3.org/2011/prov/wiki/Qualifed_Involvements_in_PROV-O

Satya: "QualifiedInvolvements are subordinate to instances of the primary ProcessExecution or Entity classes" - what does subordinate mean?

Tim: a QualifiedInvolvement can't be used on its own, it's a link from an already asserted Entity and ProcessExecution.

Satya: Modelled in RDF, any resource stands on its own.

Tim: Not trying to encode this subordinate relationship in OWL, but if that is possible. A QI must be pointed to by a prov:qualifiedWith property.

Satya: Yes, we could restrict that in OWL.

Tim: OPM-O permits each of these relationships to stand on its own - but in PROV-O we want to move the model to just Process Executions and Entities - so a QualifiedInvolvement must reference a PE and an Entity.

Khalid: ??

Tim: The QualifiedInvolvement must refer to an Entity with prov:entity. 

Khalid: Just on the class Usage for instance, say that property prov:entity must have a value and prov:processExecution

Tim: .. except we are removing prov:processExecution, keeping only the inverse property like prov:qualifiedUsage

Khalid: ...? Don't think there is much difference.

Tim: Yes, but the relation is inverse. QualifiedInvolvement is the inverse of opmo:Cause etc.

Satya: Inverse of a class??

Stian: A QualifiedInvolvement is basically forming a chain from Process Execution to Entity.

Tim: I mean conceptually inverse. I don't care about the actual inverse properties or model them - just to compare with OPM-O.

Satya: In OPM-O a (?) is standing on its own, but here you require properties linking the Usage etc. to a Process Execution and Entity? A Usage can't exist unless it is linked to an instance of a PE and an Entity?

Tim: Qualified Usage section - prov:qualifiedUsage. A prov:Usage must be linked to the PE using prov:qualifiedUsage and must point to a prov:Entity with prov:entity.

Stian: Good

Satya: We're all on the same page.
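The constraint Tim describes (a prov:Usage is reached from a PE via prov:qualifiedUsage and points to an entity via prov:entity) can be sketched as a simple check over string triples. This is an illustrative closed-world check on a toy graph, with a hypothetical function name; it is not how an OWL reasoner would treat the restriction, since OWL's open-world semantics would not flag a merely missing link.

```python
def usage_problems(triples):
    """Return {usage: [problems]} for prov:Usage nodes that dangle.

    A Usage is considered well linked when some subject points to it with
    prov:qualifiedUsage and it has at least one prov:entity value.
    """
    triples = set(triples)
    usages = {s for s, p, o in triples if p == "rdf:type" and o == "prov:Usage"}
    problems = {}
    for usage in sorted(usages):
        missing = []
        if not any(p == "prov:qualifiedUsage" and o == usage for _s, p, o in triples):
            missing.append("no incoming prov:qualifiedUsage")
        if not any(s == usage and p == "prov:entity" for s, p, _o in triples):
            missing.append("no prov:entity")
        if missing:
            problems[usage] = missing
    return problems


# Hypothetical data: :u1 is fully linked, :u2 stands on its own.
example = {
    (":pe", "rdf:type", "prov:ProcessExecution"),
    (":pe", "prov:qualifiedUsage", ":u1"),
    (":u1", "rdf:type", "prov:Usage"),
    (":u1", "prov:entity", ":e"),
    (":u2", "rdf:type", "prov:Usage"),
}

report = usage_problems(example)
```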


Tim: This pattern applied again for Participation, Generation etc. prov:hadParticipant and qualification with prov:qualifiedParticipation.

Satya: Diagram does not show the Role bit - there's a link from prov:Participation with a predicate.. prov:role?

Tim: yes, big blue Participation has 3 properties, rdf:type, prov:role and prov:entity

Stephan: Not to focus too much on predicate names

Had participant: Qualified. qualified control, say:

    prov:hadParticipant :bystander;
    prov:qualifiedParticipation [
        a prov:Participation;
        prov:entity :bystander;
        prov:role   part:practical-joke-target;
    ];

Fixme: Should also say bystander as role

Tim: If an entity wishes to qualify how it was generated, it needs to state the prov:wasGeneratedBy to the ProcessExecution (as it is the PE that qualifies) - which has a prov:qualifiedGeneration to specify the qualified generation. 

In the Overview section, shows Unqualified Predicate, Qualified Predicate and QualifiedInvolvement - the rdfs:range of the rdfs:Property before.

Satya: is reification of triples required for annotation? 

Tim: Only the rdf object in reality is stated again.


Comments on the qualified involvement approach

Tim is NOT proposing to actually use reification to model it, but conceptually that is what we are inherently doing.

    st1 rdf:type Usage (rdf:Statement)
    st1 hasSubject "a PE"  # rdf:subject ?
    st1 hasPredicate "used"
    st1 hasObject "a Entity"
    st1 hasTime t1
    st1 hasRole r1

but.. there is no reason why some crazy people could not model prov:entity as subproperty of hasObject etc.

Satya: How to model: 4 (when playing role of result) wasDerivedFrom 8 (when playing role of numerator)

Stian's approach from reading Qualified Involvements:

  1. wasDerivedFrom implies a PE

    _:pe a prov:ProcessExecution ;
        prov:used :8 ;
        prov:qualifiedUsage [
            a prov:Usage ;
            prov:entity :8 ;
            prov:role :numerator
        ] ;
        prov:qualifiedGeneration [
            a prov:Generation ;
            prov:entity :4 ;
            prov:role :result
        ] .

    :4 prov:wasDerivedFrom :8 ;
        prov:wasGeneratedBy _:pe .

In PROV-ASN:

    wasDerivedFrom(4, 8, qualifier(role="result"), qualifier(role="numerator"))

which decomposes to:

    entity(8)
    entity(4)
    wasGeneratedBy(4, _pe, qualifier(role="result"))
    used(_pe, 8, qualifier(role="numerator"))
    processExecution(_pe)
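The decomposition can be written down as a small function. This is only a sketch: the tuple encoding of ASN expressions and the name decompose_derivation are hypothetical, and the argument order used(pe, e) follows the form in Luc's comments rather than any confirmed PROV-ASN grammar.

```python
def decompose_derivation(derived, source, generation_role, usage_role, pe="_pe"):
    """Expand a qualified wasDerivedFrom into the expressions it implies.

    Each expression is a tuple whose first item is the ASN expression name.
    `pe` names the process execution implied by the derivation.
    """
    return [
        ("entity", source),
        ("entity", derived),
        ("wasGeneratedBy", derived, pe, {"role": generation_role}),
        ("used", pe, source, {"role": usage_role}),
        ("processExecution", pe),
    ]


# wasDerivedFrom(4, 8, qualifier(role="result"), qualifier(role="numerator"))
expressions = decompose_derivation("4", "8", "result", "numerator")
```

The two roles end up on different expressions: the "result" role qualifies the generation of 4, while the "numerator" role qualifies the use of 8.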

Tim: http://dvcs.w3.org/hg/prov/file/tip/ontology/components/wasDerivedFrom/prov-dm-e5-wasDerivedFrom-e3.ttl

with search/replace of :channel "in" with prov:role :numerator and :channel "out" with prov:role :result

We'll discuss this more tomorrow at the same time.


Stephan: We're not just trying to do roles, but any qualifier

Satya: In OWL when you qualify something, you are trying to distinguish a thing from something else - generally by subclass or subproperties. Role is trickier.

Stephan: I mean the terminology of qualifier in PROV-DM.

Satya: Struggled with PROV-DM being non-RDF in its approach.

Stephan: PROV-DM specifies its qualifiers and roles - which we are trying to map here.

Satya: Generally the qualifiers can be subclasses and properties. Role stands out.

Stephan: Well, we've not tried much. Say time.

Satya: We have TemporalEntity and hasTemporalValue.

Stephan: Don't think role is a special thing, just one kind of (PROV-DM) qualifier.

Satya: What PROV-DM says is that the qualifier is a qualifier on the relationship. On a predicate in RDF we can't have qualifiers. That's the fundamental issue here. We'll talk about this tomorrow.


Review the modified html document

Satya: Just don't introduce new terms etc. now. 

Satya: Refer to current version of PROV-O http://dvcs.w3.org/hg/prov/raw-file/3dc836813ce6/ontology/ProvenanceFormalModel.html

Satya: Went through Class and Property section, and put in a holding section:

http://dvcs.w3.org/hg/prov/raw-file/3dc836813ce6/ontology/ProvenanceFormalModel.html#holding-section-for-classes

Satya: Do we need Time class?

    <prov:startedAt>
      <rdf:Description rdf:about="http://www.example.com/crimeFile#t1">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Time"/>
        <time:inXSDDateTime>2011-10-20T16:26:45Z</time:inXSDDateTime>
      </rdf:Description>
    </prov:startedAt>

Stian: time:inXSDDateTime has domain time:Instant and range xsd:dateTime:

    <owl:DatatypeProperty rdf:about="&time;inXSDDateTime">
        <rdfs:range rdf:resource="&xsd;dateTime"/>
        <rdfs:domain rdf:resource="&time;Instant"/>
    </owl:DatatypeProperty>

yes - this is imported from time.owl

Satya: Holding section for properties http://dvcs.w3.org/hg/prov/raw-file/3dc836813ce6/ontology/ProvenanceFormalModel.html#holding-section-for-properties

followed as a placeholder for preceded by. wasScheduledAfter as a name is not good, as it implies the plan/recipe - not what happened. Propose to replace wasScheduledAfter with followed.

Khalid: yes - but wasSchedAfter had different. We added 'followed' to be more general than wasScheduledAfter - saying that it started after the end of the other pe.

If a PE generated an entity and another PE used it - that should be sufficient. 

Stian: Basically is overlap allowed or not.

Khalid: One is stronger than the other.

PROV-DM wasScheduledAfter: https://dvcs.w3.org/hg/prov/raw-file/f805364c879d/model/ProvenanceModel.html#expression-OrderingOfProcessExecutions

http://dvcs.w3.org/hg/prov/raw-file/3dc836813ce6/ontology/components/Time/example-3-extension.ttl shows how to do this:

    :pe2 a prov:ProcessExecution ;
        # starts immediately after :pe1
        time:intervalMetBy :pe1 ;
        # lasts until :pe3 finishes
        time:intervalFinishes :pe3 .

this makes :pe1 also a time:Interval (which can have prov:startedAt as subproperty of time:hasBeginning)
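The difference between 'followed', the interval relations in the example, and overlap can be sketched with plain datetimes. The Execution class, the function names, and the reading of 'followed' as "starts at or after the other's end" are assumptions made for illustration; the minutes only say that 'followed' means starting after the end of the other PE.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Execution:
    """A process execution viewed as a time interval (hypothetical model)."""
    name: str
    started_at: datetime
    ended_at: datetime


def followed(later, earlier):
    """True if `later` starts at or after the end of `earlier` (no overlap)."""
    return later.started_at >= earlier.ended_at


def met_by(later, earlier):
    """True if `later` starts exactly when `earlier` ends (cf. time:intervalMetBy)."""
    return later.started_at == earlier.ended_at


def finishes(inner, outer):
    """True if `inner` starts after `outer` and both end together (cf. time:intervalFinishes)."""
    return inner.started_at > outer.started_at and inner.ended_at == outer.ended_at


def overlaps(a, b):
    """True if the two intervals share some stretch of time."""
    return a.started_at < b.ended_at and b.started_at < a.ended_at


pe1 = Execution("pe1", datetime(2011, 11, 7, 10, 0), datetime(2011, 11, 7, 11, 0))
pe2 = Execution("pe2", datetime(2011, 11, 7, 11, 0), datetime(2011, 11, 7, 12, 0))
pe3 = Execution("pe3", datetime(2011, 11, 7, 10, 30), datetime(2011, 11, 7, 12, 0))
```

With these sample times, pe2 is met by pe1 and finishes pe3, matching the Turtle example, while pe3 overlaps pe1 and therefore does not follow it.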

Section:  Mapping PROV-DM to PROV-O

https://dvcs.w3.org/hg/prov/raw-file/tip/ontology/ProvenanceFormalModel.html#mapping-the-prov-dm-terms-to-prov-ontology

Discuss tomorrow.

Diagrams

Khalid: Diagram for "hadTemporalValue" 3.2.13

https://dvcs.w3.org/hg/prov/raw-file/tip/ontology/ProvenanceFormalModel.html#hadtemporalvalue

W3C Working Draft 07 November 2011

This version: http://www.w3.org/TR/2011/WD-prov-o-20111107/
Latest published version: http://www.w3.org/TR/prov-o/
Latest editor's draft: http://www.w3.org/TR/2011/WD-prov-o-20111013/

Editors:

  • Satya Sahoo, Case Western Reserve University, USA
  • Deborah McGuinness, Rensselaer Polytechnic Institute, USA

Authors (in alphabetical order):

  • Khalid Belhajjame, University of Manchester, UK
  • James Cheney, University of Edinburgh, UK
  • Daniel Garijo, Universidad Politécnica de Madrid, Spain
  • Timothy Lebo, Rensselaer Polytechnic Institute, USA
  • Stian Soiland-Reyes, University of Manchester, UK

Copyright © 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.

Abstract

The PROV Ontology (also PROV-O) encodes the PROV Data Model [PROV-DM] in the OWL2 Web Ontology Language (OWL2). The PROV ontology consists of a set of classes, properties, and restrictions that can be used to represent provenance information. The PROV ontology is specialized to create domain-specific provenance ontologies that model the provenance information specific to different applications. The PROV ontology supports a set of entailments based on OWL2 formal semantics and provenance specific inference rules. The PROV ontology is available for download as a separate OWL2 document.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was published by the Provenance Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-prov-wg@w3.org (subscribe, archives). All feedback is welcome.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1. Introduction
    1.1 Guide to this Document
2. PROV Ontology
    2.1 Mapping the PROV-DM terms to PROV Ontology
    2.2 OWL2 Syntax Used in this Document
    2.3 Namespace and OWL2 version
3. PROV Ontology: Classes and Properties
    3.1 Classes
        3.1.1 Entity
        3.1.2 ProcessExecution
        3.1.3 Agent
        3.1.4 TemporalEntity
        3.1.5 ProvenanceContainer
            3.1.5.1 Modeling ProvenanceContainer and Account as RDF Graph
        3.1.6 Location
        3.1.7 EntityInRole
        3.1.8 Recipe
        3.1.9 Holding Section for Classes
            3.1.9.1 Time
            3.1.9.2 EntityInRole
            3.1.9.3 Collections
                3.1.9.3.1 Expansion
                3.1.9.3.2 Reduction
                3.1.9.3.3 EmptyCollection
                3.1.9.3.4 Collection content
    3.2 Object Properties
        3.2.1 wasGeneratedBy
        3.2.2 wasRevisionOf
        3.2.3 wasDerivedFrom
        3.2.4 wasEventuallyDerivedFrom
        3.2.5 dependedOn
        3.2.6 used
        3.2.7 hadParticipant
        3.2.8 wasComplementOf
        3.2.9 wasControlledBy
        3.2.10 hadRecipe
        3.2.11 wasInformedBy
        3.2.12 wasScheduledAfter
        3.2.13 hadTemporalValue
        3.2.14 wasAssumedBy
        3.2.15 wasAttributedTo
        3.2.16 wasQuoteOf
        3.2.17 wasSummaryOf
        3.2.18 hadOriginalSource
        3.2.19 Holding Section for Properties
            3.2.19.1 followed
            3.2.19.2 hadTemporalValue
            3.2.19.3 wasAssumedBy
            3.2.19.4 startedAt
            3.2.19.5 endedAt
            3.2.19.6 wasGeneratedAt
            3.2.19.7 assumedRole
            3.2.19.8 assumedRoleAt
    3.3 Characteristics of Object Properties
    3.4 Annotation Properties
    3.5 Overview of the ontology
4. Specializing Provenance Ontology for Domain-specific Provenance Applications
    4.1 Modeling the Crime File Scenario
        4.1.1 Specialization of PROV Ontology Classes
            4.1.1.1 cf:Journalist
            4.1.1.2 cf:CrimeFile
            4.1.1.3 cf:FileCreation, cf:FileEditing, cf:FileAppending, cf:Emailing, cf:SpellChecking
        4.1.2 Specialization of PROV Ontology Properties
            4.1.2.1 cf:hadFilePath
    4.2 Modeling an Example Scientific Workflow Scenario
        4.2.1 Workflow extensions to PROV classes
        4.2.2 Workflow extensions to PROV properties
        4.2.3 Workflow structure
        4.2.4 Example workflow
        4.2.5 Example workflow run
5. Formal Semantics of the PROV Ontology
    5.1 RDF Semantics for PROV Ontology
    5.2 OWL2 Semantics for PROV Ontology
    5.3 Provenance-specific Entailments Supported by PROV Ontology
        5.3.1 Provenance constraint on ProcessExecution
        5.3.2 Provenance constraint on wasGeneratedBy (generation-affects-attributes)
        5.3.3 Provenance constraint on wasGeneratedBy (generation-pe-ordering)
        5.3.4 Provenance constraint on wasGeneratedBy (generation-unicity)
        5.3.5 Provenance constraint on Used (use-attributes)
        5.3.6 Provenance constraint on Used (use-pe-ordering)
        5.3.7 Provenance constraint on wasDerivedFrom (derivation-attributes)
        5.3.8 Provenance constraint on wasDerivedFrom (derivation-use-generation-ordering)
        5.3.9 Provenance constraint on wasDerivedFrom (derivation-events)
        5.3.10 Provenance constraint on wasDerivedFrom (derivation-events)
        5.3.11 Provenance constraint on wasDerivedFrom (derivation-use)
        5.3.12 Provenance constraint on wasEventuallyDerivedFrom (derivation-generation-generation-ordering)
        5.3.13 Provenance constraint on wasEventuallyDerivedFrom (derivation-linked-independent)
        5.3.14 Provenance constraint on wasComplementOf (wasComplementOf-necessary-cond)
        5.3.15 Provenance constraint on hadParticipant (participant)
A. Acknowledgements
B. References
    B.1 Normative references
    B.2 Informative references

1. Introduction

PROV Ontology (also PROV-O) defines the normative modeling of the PROV Data Model [PROV-DM] using the W3C OWL2 Web Ontology Language. This specification describes the set of classes, properties, and restrictions that constitute the PROV ontology, which have been introduced in the PROV Data Model [PROV-DM]. This ontology specification provides the foundation for implementation of provenance applications in different domains using the PROV ontology for representing, exchanging, and integrating provenance information. Together with the PROV Access and Query [PROV-PAQ] and PROV Data Model [PROV-DM], this document forms a framework for provenance information interchange and management in domain-specific Web-based applications.

The PROV ontology classes and properties are defined such that they can not only be used directly to represent provenance information, but can also be specialized for modeling application-specific provenance details in a variety of domains. Thus, the PROV ontology is expected both to be directly usable in applications and to serve as a reference model for the creation of domain-specific provenance ontologies, thereby facilitating interoperable provenance modeling. This document uses an example provenance scenario introduced in the PROV Data Model [PROV-DM] to demonstrate the use of PROV-O classes and properties to model provenance information.

Finally, this document describes the formal semantics of the PROV ontology using the OWL2 semantics, [OWL2-DIRECT-SEMANTICS], [OWL2-RDF-BASED-SEMANTICS], and a set of provenance-specific inference rules. This is expected to support provenance implementations to automatically check for consistency of provenance information represented using PROV ontology and explicitly assert implicit provenance knowledge.

The key words "must", "must not", "required", "shall", "shall not", "should", "should not", "recommended", "may", and "optional" in this document are to be interpreted as described in [RFC2119].

1.1 Guide to this Document

This document is intended to provide an understanding of the PROV ontology and how it can be used by different applications to represent their provenance information. The intended audience of this document includes users who are new to provenance modeling as well as experienced users who would like their provenance model to be compatible with the PROV ontology to facilitate standardization. This document assumes a basic understanding of the W3C RDF(S) and OWL2 specifications. Readers are referred to the OWL2 and RDF(S) documentation, starting with the [OWL2-PRIMER] and [RDF-PRIMER], for further details about the OWL2 and RDF(S) specifications respectively.

Section 2 describes the mapping of the PROV Data Model [PROV-DM] to the PROV ontology. Section 3 introduces the classes and properties of the PROV ontology. Section 4 describes the approach used to specialize the PROV ontology to create a domain-specific ontology for an example provenance scenario introduced in the PROV Data Model [PROV-DM]. The PROV ontology supports a set of provenance entailments and these are described in Section 5.

2. PROV Ontology

The PROV Data Model [PROV-DM] introduces a minimal set of concepts to represent provenance information in a variety of application domains. This document maps the PROV Data Model to PROV Ontology using the OWL2 ontology language, which facilitates a fixed interpretation and use of the PROV Data Model concepts based on the formal semantics of OWL2 [OWL2-DIRECT-SEMANTICS] [OWL2-RDF-BASED-SEMANTICS].

The PROV Ontology can be used directly in a domain application, though many domain applications may require specialization of PROV-O Classes and Properties for representing domain-specific provenance information. We briefly introduce some of the OWL2 modeling terms that will be used to describe the PROV ontology. An OWL2 instance is an individual object in a domain of discourse, for example a person named Alice or a car, and a set of individuals sharing a set of common characteristics is called a class. Person and Car are examples of classes representing the set of individual persons and cars respectively. The OWL2 object properties are used to link individuals, classes, or create a property hierarchy. For example, the object property "hasOwner" can be used to link car with person. The OWL2 datatype properties are used to link individuals or classes to data values, including XML Schema datatypes [XMLSCHEMA-2].

The PROV Data Model document [PROV-DM] introduces an example provenance scenario describing the creation of a crime statistics file stored on a shared file system and edited by journalists Alice, Bob, Charles, David, and Edith. This scenario is used as a running example in this document to describe the PROV ontology classes and properties, the specialization mechanism, and the entailments supported by the PROV ontology.

2.1 Mapping the PROV-DM terms to PROV Ontology

The PROV Data Model [PROV-DM] uses an Abstract Syntax Notation (ASN) to describe the set of provenance terms that are used to construct the PROV ontology. There are a number of differences between the PROV-DM ASN and the Semantic Web RDF, RDFS and OWL2 technologies; hence the approach used to model the provenance terms in the PROV ontology differs, partially or significantly, from the PROV-DM approach.

For example, the notion of "expressions" used in the PROV-DM maps to RDF triple assertions in PROV-O. Similarly, the PROV-DM discusses the use of "Qualifier" to assert additional information about provenance terms. Following general knowledge representation practices, and OWL2 ontologies specifically, the PROV ontology specializes a given provenance term to create either a sub class or a sub property to represent "additionally" qualified terms. Throughout this document, we explicitly state the difference, if any, between the PROV-DM term and the PROV ontology term.

In addition, RDF is strictly monotonic and "...it cannot express closed-world assumptions, local default preferences, and several other commonly used non-monotonic constructs."[RDF-MT], but the PROV-DM seems to introduce the notion of non-monotonic assertions through the "Account" construct [PROV-DM]. For example, the Account description in PROV-DM states that "It provides a scoping mechanism for expression identifiers and for some constraints (such as generation-unicity and derivation-use)."

2.2 OWL2 Syntax Used in this Document

This document uses the RDF/XML syntax, which is the mandatory syntax supported by all OWL2 tools [OWL2-PRIMER], to represent the PROV ontology. Provenance assertions using PROV-O can use any of the RDF syntaxes defined in the RDF specification [RDF-PRIMER].

2.3 Namespace and OWL2 version

The corresponding OWL2 version of this PROV Ontology is available at [PROV-Ontology-Namespace] and as ProvenanceOntology.owl. The namespace for the PROV ontology and all terms defined in this document is http://www.w3.org/ns/prov-o/ [PROV-Ontology-Namespace] and is in this document denoted by the prefix prov.

It has been suggested that [PROV-DM] and PROV-O should instead use the namespace http://www.w3.org/ns/prov/ for terms that are common in both models. This is ISSUE-90

3. PROV Ontology: Classes and Properties

We now introduce the classes and properties that constitute the PROV ontology. We first give a textual description of each ontology term, followed by OWL2 syntax representing the ontology term and an example use of the class in the provenance scenario.

3.1 Classes

The PROV ontology consists of classes that can be organized into a hierarchical structure using the rdfs:subClassOf property.

Note: CamelBack notation is used for class names

3.1.1 Entity

Class Description Entity is defined to be "An Entity represents an identifiable characterized thing." [PROV-DM]

OWL syntax

    prov:Entity rdfs:subClassOf owl:Thing.

Example

Examples of instances of class Entity from the provenance scenario are files with identifiers e1 and e2. The RDF/XML syntax for asserting that e1 is an instance of Entity is given below.

    <rdf:Description rdf:about="http://www.example.com/crimeFile#e1">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
    </rdf:Description>

Additional assertions can be made about the Entity instances that describe additional attributes of the entities. Following the common knowledge representation approach, the Entity class can be specialized to create multiple sub classes, using the rdfs:subClassOf property, representing distinct categories of entities using additional characterizing attributes (as defined in the [PROV-DM]). The additional attributes should use an appropriate namespace, and the new sub classes may be introduced by application-specific provenance ontologies.

Example

    <rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
        <rdf:type rdf:resource="http://www.example.com/crime#CrimeFile"/>
    </rdf:Description>
    <rdf:Description rdf:about="http://www.example.com/crime#CrimeFile">
        <rdfs:subClassOf rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
    </rdf:Description>

3.1.2 ProcessExecution

Class Description ProcessExecution is defined to be "an identifiable activity, which performs a piece of work." [PROV-DM]

OWL syntax

    prov:ProcessExecution rdfs:subClassOf owl:Thing.

Example

Example instances of the class ProcessExecution (from the provenance scenario) are "file creation" (pe0) and "file editing" (pe2). The RDF/XML syntax for asserting that pe2 is an instance of ProcessExecution is given below.

    <rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
    </rdf:Description>

pe2 is an instance of class :Emailing, which is defined to be a sub-class of class prov:ProcessExecution in the CrimeFile ontology. Hence, standard RDFS entailment allows us to infer that pe2 is also an instance of prov:ProcessExecution.

3.1.3 Agent

Class Description Agent is defined to be a "characterized entity capable of activity" [PROV-DM]

OWL syntax

    prov:Agent rdfs:subClassOf prov:Entity.

Example

Examples of instances of class Agent from the provenance scenario are Alice and Edith. The RDF/XML syntax for asserting that Alice is an instance of Agent is given below.

    <rdf:Description rdf:about="http://www.example.com/crimeFile#Alice">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Agent"/>
    </rdf:Description>

Similar to the example for Entity, both Alice and Edith are instances of class Journalist, which is defined to be a "sub-class" of class Agent in the CrimeFile ontology. Hence, standard RDFS entailment allows us to infer that both Alice and Edith are also instances of Agent.

3.1.4 TemporalEntity

Class Description TemporalEntity represents temporal information about entities in the Provenance model. This class has been re-used from the OWL Time ontology [OWL-TIME]. The PROV ontology also models the two sub classes of TemporalEntity, namely Instant and Interval.

The Instant class represents "point-like" temporal information that has "no interior points" [OWL-TIME]. The Interval class represents temporal information that has a non-zero duration [OWL-TIME].

OWL syntax

    time:TemporalEntity rdfs:subClassOf owl:Thing.

Example

Examples of instances of class TemporalEntity from the provenance scenario are t and t+1. t+1 is associated with the instance of ProcessExecution pe2. The instances of TemporalEntity are linked to instances of Entity or ProcessExecution classes by the hadTemporalValue property that is described later in this document.

The RDF/XML syntax for asserting that t+1 is an instance of class TemporalEntity and that t+1 is associated with pe2 is given below.

    <rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
      <prov:hadTemporalValue>
        <rdf:Description rdf:about="http://www.example.com/crimeFile#t+1">
          <rdf:type rdf:resource="http://www.w3.org/2006/time#TemporalEntity"/>
        </rdf:Description>
      </prov:hadTemporalValue>
    </rdf:Description>

3.1.5 ProvenanceContainer

Class Description ProvenanceContainer is defined to be an aggregation of provenance assertions. A provenance container should have an URI associated with it. The ProvenanceContainer class can also be used to model the PROV-DM concept of Account.

OWL syntax

    prov:ProvenanceContainer rdfs:subClassOf owl:Thing.

An example of an instance of class ProvenanceContainer is an RDF graph containing a set of assertions describing the provenance of a car, such as its manufacturer, date of manufacture, and place of manufacture.

    <rdf:Description rdf:about="http://www.example.com/crimeFile#ProvenanceContainer1">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProvenanceContainer"/>
        <cf:contains rdf:resource="http://www.example.com/crimeFile#Statement1"/>
        <cf:contains rdf:resource="http://www.example.com/crimeFile#Statement2"/>
        <cf:assertedBy rdf:resource="http://www.example.com/crimeFile#Alice"/>
    </rdf:Description>

According to the definitions of ProvenanceContainer and Account, both contain a set of provenance assertions and have an identifier. Hence, the ProvenanceContainer class can also be used to create instances of accounts.

Scope and Identifiers. This is ISSUE-81.

3.1.5.1 Modeling ProvenanceContainer and Account as RDF Graph

If an RDF graph contains a set of RDF assertions, then (a) if an explicit asserter is associated with the RDF graph, it corresponds to the term "Account" in PROV-DM, and (b) if an asserter is not associated with the RDF graph, it corresponds to the term "ProvenanceContainer" in PROV-DM.
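
The rule above can be sketched in a few lines of Python. This is only an illustration: the triples are plain tuples, and the function name classify_graph and the use of cf:assertedBy as the marker for an asserter are assumptions taken from the ProvenanceContainer example, not something defined by PROV-O.

```python
def classify_graph(triples, asserter_property="cf:assertedBy"):
    """Classify a graph as 'Account' or 'ProvenanceContainer'.

    Assumption: an asserter is recognised by the presence of at least one
    triple whose predicate is `asserter_property` (hypothetical choice).
    """
    has_asserter = any(p == asserter_property for (_s, p, _o) in triples)
    return "Account" if has_asserter else "ProvenanceContainer"


# Graph with an explicit asserter (case a).
with_asserter = [
    ("cf:ProvenanceContainer1", "cf:contains", "cf:Statement1"),
    ("cf:ProvenanceContainer1", "cf:contains", "cf:Statement2"),
    ("cf:ProvenanceContainer1", "cf:assertedBy", "cf:Alice"),
]

# Same graph without an asserter (case b).
without_asserter = [t for t in with_asserter if t[1] != "cf:assertedBy"]

print(classify_graph(with_asserter))
print(classify_graph(without_asserter))
```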

3.1.6 Location

Class Description
A Location "is an identifiable geographic place (ISO 19112)." [PROV-DM]

OWL syntax
prov:Location rdfs:subClassOf owl:Thing.

An example of an instance of class Location from the provenance scenario is the location of the crime file in the shared directory /share, with file path /share/crime.txt. The RDF/XML syntax for asserting that the location of the crime file is the shared directory is given below.

<cf:hasLocation>
    <rdf:Description rdf:about="http://www.example.com/crimeFile#sharedDirectoryLocation1">
        <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Location"/>
        <cf:hasFilePath rdf:datatype="http://www.w3.org/2001/XMLSchema#string">/share/crime.txt</cf:hasFilePath>
    </rdf:Description>
</cf:hasLocation>

Need to clarify whether "geographic" includes "geospatial".

3.1.7 EntityInRole

Class Description
EntityInRole is defined to be a role "assumed by a Entity or an agent." [PROV-DM]

OWL syntax
prov:EntityInRole rdfs:subClassOf prov:Entity.

Example
Examples of instances of class EntityInRole from the provenance scenario are the author role assumed by Bob and the file creator role assumed by Alice. The RDF/XML syntax for asserting that Bob assumes the role of an author is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#Bob_as_Author">
    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/EntityInRole"/>
    <prov:wasAssumedBy rdf:resource="http://www.example.com/crimeFile#Bob"/>
</rdf:Description>

3.1.8 Recipe

Class Description
Recipe represents the specification of a ProcessExecution. The PROV ontology does not define the different types of recipes that can be created by provenance applications in different domains.

OWL syntax
prov:Recipe rdfs:subClassOf owl:Thing.

Example
An example of a recipe from the provenance scenario may be the editing protocol followed by the journalists to edit a news report.

<rdf:Description rdf:about="http://www.example.com/crimeFile#news_editing">
    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
    <prov:hadRecipe rdf:resource="http://www.example.com/crimeFile#NewsReportEditingProtocol"/>
</rdf:Description>

3.1.9 Holding Section for Classes

Temporary section for terms not part of "core" ontology.

3.1.9.1 Time

Class Description
Time is a subclass of time:Instant from [OWL-TIME] that requires the time to be defined using the time:inXSDDateTime property. This class, used with startedAt and other subproperties of hadTemporalValue, ensures compatibility with xsd:dateTime literal expressions in [PROV-DM] ASN and other serialisations.

3.1.9.2 EntityInRole

An EntityInRole is used together with the properties used, wasGeneratedBy and wasControlledBy to specify that the wasAssumedBy entity participated in the relation in a given role. The role is specified using assumedRole, referring to a Role.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
  <prov:used rdf:parseType="Resource">
      <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/EntityInRole"/>
      <prov:wasAssumedBy rdf:resource="http://www.example.com/crimeFile#Bob"/>
      <prov:assumedRole rdf:resource="http://www.example.com/crime#author"/>
      <crime:parameter>p1</crime:parameter>
  </prov:used>
</rdf:Description>

The example above corresponds to the [PROV-DM] ASN expression used(pe1, Bob, qualifier(role="author", parameter="p1")).

3.1.9.3 Collections

A Collection is a type of Entity which has been composed of other entities. A PROV-O Collection can represent any kind of collection, such as an ordered list, array, associative list, dictionary, hashtable or map. It is out of scope for PROV to further define the exact nature of the collection, but PROV-O defines shortcuts for stating that an entity has been added to or removed from a collection. These operations are modelled as subproperties of wasDerivedFrom between two static collections, corresponding to collection assertions [PROV-DM].

3.1.9.3.1 Expansion

An Entity can be added to a Collection, producing a new (derived) Collection which contains the new item in addition to the items of the old collection. The item can be added at a certain key (represented as another Entity), which could be a position (for ordered lists), a hash key (for a dictionary), or the value itself (for sets). In PROV-O the addition is specified using the functional properties wasExpandedFrom, wasExpandedBy and wasExpandedAt. These correspond to the PROV-ASN collection assertions wasAddedTo_Coll (the expanded collection), wasAddedTo_Entity (the expansion) and wasAddedTo_Key (the key it was expanded at). The properties are functional so that only one expansion is asserted at a time. This relates the three properties without requiring an explicit "Expansion" class, and also asserts that no other entities have been added to or removed from the two collections related using prov:wasExpandedFrom.

[PROV-DM] does not make the guarantee that other entities have not been added. Is it fair to make such an assumption here?

:col1 a prov:Collection ;
    prov:wasExpandedFrom :col0 ;
    prov:wasExpandedBy :e1 ;
    prov:wasExpandedAt :key1 .

:col2 a prov:Collection ;
    prov:wasExpandedFrom :col1 ;
    prov:wasExpandedBy :e2 ;
    prov:wasExpandedAt :key2 .

TODO: Write Collection examples as RDF/XML

The above example describes collections :col0, :col1 and :col2. We know that :col2 has the entries (:key2, :e2) and (:key1, :e1). As we don't have the provenance of :col0, it might or might not contain other keys and entities.

If a Collection has one of the functional prov:wasExpandedFrom, prov:wasExpandedBy or prov:wasExpandedAt properties asserted, then it is an ExpandedCollection and the existence of the remaining wasExpanded* properties is implied.

TODO: Express the constraint expanded-collection in the OWL ontology

Does prov:Collection allow replacement or multiple additions on the same key? If we do a second expansion using :key1, will :e1 still be in the collection? We recommend that, for map functionality, replacement should always be represented by an explicit removal (wasReducedBy) followed by an insertion (wasExpandedBy).

3.1.9.3.2 Reduction

Removing from a collection is modelled in a similar way to expansion, by deriving a new reduced collection which does not have the removed item or key. This is done using the properties prov:wasReducedFrom, prov:wasReducedBy and prov:wasReducedAt, where prov:wasReducedFrom and prov:wasReducedAt correspond to the [PROV-DM] properties wasRemovedFrom_Coll and wasRemovedFrom_Key, respectively.

:col3 a prov:Collection ;
    prov:wasReducedFrom :col2 ;
    prov:wasReducedAt :key1 .

:col4 a prov:Collection ;
    prov:wasReducedFrom :col3 ;
    prov:wasReducedBy :e2 ;
    prov:wasReducedAt :key2 .

The example above says that :col3 does not contain what :col2 had at :key1, i.e. (:key1, :e1). :col4 does not contain (:key2, :e2).

If a Collection has one of the functional prov:wasReducedFrom, prov:wasReducedBy or prov:wasReducedAt properties asserted, then it is a ReducedCollection and the existence of the remaining wasReduced* properties is implied. A ReducedCollection is disjoint from an ExpandedCollection, so it is not possible to combine any wasReduced* property with any wasExpanded* property.

TODO: Express the constraint reduced-collection in the OWL ontology

Does removal at :key1 mean it is no longer present in the collection? What if the collection is a linked list, where :key1 is a position? (:e2 would now be at :key1.) Does removal assert that the key existed in the collection, or simply that it is no longer in the collection? If it is possible to insert several values at the same key, is it possible to remove only one of these at a given key?

Asserting prov:wasReducedBy is optional, as prov:wasReducedAt will remove any value at that key. (PROV-DM does not describe wasRemovedFrom_Entity.)
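
Until the expanded-collection and reduced-collection constraints are expressed in OWL, the disjointness rule described above could be checked outside the ontology. The following Python sketch is one way to do that; the function name classify_collection and the representation of a collection as a plain list of property names are assumptions made for illustration only.

```python
EXPANDED = {"prov:wasExpandedFrom", "prov:wasExpandedBy", "prov:wasExpandedAt"}
REDUCED = {"prov:wasReducedFrom", "prov:wasReducedBy", "prov:wasReducedAt"}


def classify_collection(properties):
    """Return the implied class of a collection given its asserted properties.

    Raises ValueError if wasExpanded* and wasReduced* properties are mixed,
    since ExpandedCollection and ReducedCollection are disjoint.
    """
    props = set(properties)
    is_expanded = bool(props & EXPANDED)
    is_reduced = bool(props & REDUCED)
    if is_expanded and is_reduced:
        raise ValueError("wasExpanded* and wasReduced* cannot be combined")
    if is_expanded:
        return "ExpandedCollection"
    if is_reduced:
        return "ReducedCollection"
    return "Collection"


# :col3 from the reduction example asserts only wasReduced* properties.
print(classify_collection(["prov:wasReducedFrom", "prov:wasReducedAt"]))
```

Asserting a single wasReduced* property is enough for the ReducedCollection result, which mirrors the statement that the remaining properties are implied.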

3.1.9.3.3 EmptyCollection

PROV-O defines a subclass of Collection called EmptyCollection. Asserting that a collection is empty means that it does not contain any key/value pairs. Combined with expansion and reduction statements this allows the assertion of the complete content of a collection.

:col0 a prov:EmptyCollection .
:col4 a prov:EmptyCollection .

[PROV-DM] does not describe the concept of an empty collection.

With the additional information given above, one can conclude that :col1 (which prov:wasExpandedFrom :col0) only contains the expanded entity :e1, and that :col2 only contains the keys :key1 and :key2.
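
The reasoning above (known entries from the expansion chain, complete content only when the chain ends in an EmptyCollection) can be sketched in Python. The function name collection_content and the dictionary layout are hypothetical. The sketch also assumes that a later expansion at the same key takes precedence, which the draft leaves as an open question, and it ignores reductions.

```python
def collection_content(name, expansions, empty):
    """Replay wasExpandedFrom links backwards from `name`.

    expansions: maps a collection to (parent, key, entity), i.e. its
                wasExpandedFrom, wasExpandedAt and wasExpandedBy values.
    empty:      set of collections asserted to be EmptyCollection.
    Returns (entries, complete). `complete` is True only if the chain
    ends in an asserted EmptyCollection.
    """
    entries = {}
    current = name
    while current in expansions:
        parent, key, entity = expansions[current]
        # Assumption: the most recent expansion at a key wins.
        entries.setdefault(key, entity)
        current = parent
    return entries, current in empty


expansions = {
    ":col1": (":col0", ":key1", ":e1"),
    ":col2": (":col1", ":key2", ":e2"),
}

# Without knowing that :col0 is empty, the content is only partially known.
print(collection_content(":col2", expansions, empty=set()))
# With :col0 asserted as an EmptyCollection, the content is complete.
print(collection_content(":col2", expansions, empty={":col0"}))
```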

An EmptyCollection is disjoint from an ExpandedCollection. It is not valid for an asserted EmptyCollection to be in the domain of prov:wasExpandedFrom or in the range of prov:wasReducedFrom.

TODO: Include the constraints empty-collection-disjoint and empty-collection-range in the OWL ontology

3.1.9.3.4 Collection content

To describe the complete content of a Collection (its keys and values), an asserter can form a chain of wasExpandedFrom assertions starting from an EmptyCollection. Note that although this does enforce an ordering of the addition of the elements to the final collection, it does not necessarily assert that this happened sequentially, as the corresponding implied ProcessExecutions could have had zero duration. To assert that the intermediate expansions occurred "instantly" and not expose any temporal ordering of the insertions, you may state that the generation time of the initial and final collection is the same:

:col0 a prov:EmptyCollection ;
    prov:wasGeneratedAt :t0 .

:col1 a prov:Collection ;
    prov:wasExpandedFrom :col0 ;
    prov:wasExpandedBy :e1 .

:col2 a prov:Collection ;
    prov:wasExpandedFrom :col1 ;
    prov:wasExpandedBy :e2 ;
    prov:wasGeneratedAt :t0 .

The collection :col2 described above was created with the entities :e1 and :e2. Both items were inserted at the same time :t0. (The wasGeneratedAt :t0 for :col1 is implied above due to the derivation-use-generation-ordering constraint.)

FIXME: What if the asserter knows and wants to assert the content, and she knows it was inserted in a temporal order, but doesn't know that order (for instance "members of the Royal Society")? Should there be a prov:hadContent property? Is it possible to use RDF collections for such a shorthand? Is it possible to express set operations (union, difference, intersection, negation) between two collections without having to express all the individual members?

3.2 Object Properties

The PROV ontology has the following object properties.

Note: Names of properties start with a verb in lower case, followed by word(s) starting with upper case.

3.2.1 wasGeneratedBy

The wasGeneratedBy property links the Entity class with the ProcessExecution class.

Note: No arity constraints are assumed between Entity and ProcessExecution


Example
An example of the wasGeneratedBy property from the provenance scenario is e1 wasGeneratedBy pe0. The RDF/XML syntax for asserting this information is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#e1">
    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
    <prov:wasGeneratedBy>
        <rdf:Description rdf:about="http://www.example.com/crimeFile#pe0">
            <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/ProcessExecution"/>
        </rdf:Description>
    </prov:wasGeneratedBy>
</rdf:Description>

3.2.2 wasRevisionOf

The wasRevisionOf property links two instances of Entity class, where one instance is a revision of another instance, and there is explicit role of an Agent in asserting this information.

Example
An example of the wasRevisionOf property from the provenance scenario is e3 wasRevisionOf e2. The RDF/XML syntax for asserting this information is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#e3">
    <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
    <prov:wasRevisionOf>
        <rdf:Description rdf:about="http://www.example.com/crimeFile#e2">
            <rdf:type rdf:resource="http://www.w3.org/ns/prov-o/Entity"/>
        </rdf:Description>
    </prov:wasRevisionOf>
</rdf:Description>

Can instances of Agent be reasoning agents that infer that one Entity instance is a revision of another Entity instance and then assert this information? In other words, is assertion after inference supported by this property?

3.2.3 wasDerivedFrom

The wasDerivedFrom property links two instances of Entity class, where "some characterized entity is transformed from, created from, or affected by another characterized entity." [PROV-DM]


Example
An example of the wasDerivedFrom property from the provenance scenario is e3 wasDerivedFrom e2. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#e3">
    <prov:wasDerivedFrom rdf:resource="http://www.example.com/crimeFile#e2"/>
</rdf:Description>

Should derivation have a time? Which time? This is ISSUE-43.
Should we specifically mention derivation of agents? This is ISSUE-42.

3.2.4 wasEventuallyDerivedFrom

This object property is used to link two instances of Entity class that "...are not directly used and generated respectively" by a single instance of ProcessExecution class [PROV-DM].


Example
An example of the wasEventuallyDerivedFrom property from the provenance scenario is e5 wasEventuallyDerivedFrom e2. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#e5">
    <prov:wasEventuallyDerivedFrom rdf:resource="http://www.example.com/crimeFile#e2"/>
</rdf:Description>

Is the current definition of wasEventuallyDerivedFrom inconsistent with the definition of wasDerivedFrom? This is ISSUE-122 and ISSUE-126.

3.2.5 dependedOn

The dependedOn property links two instances of Entity class to model the derivation of one instance from another instance. This is a transitive property, in other words if an Entity instance a1 dependedOn a2 and a2 dependedOn a3, then a1 dependedOn a3 is also true.
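
As an illustration of what transitivity implies for asserted data, the following Python sketch computes the closure of a set of dependedOn statements. The function name depended_on_closure is hypothetical, and an OWL reasoner would normally draw these inferences itself from the transitive property declaration.

```python
def depended_on_closure(asserted):
    """Return all dependedOn pairs implied by transitivity.

    asserted: iterable of (dependent, dependency) pairs.
    """
    closure = set(asserted)
    changed = True
    while changed:
        changed = False
        # Join every pair (a, b) with every pair (b, d) to derive (a, d).
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


asserted = {("a1", "a2"), ("a2", "a3")}
print(sorted(depended_on_closure(asserted)))
```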


Example
An example of the dependedOn property from the provenance scenario is e5 dependedOn e2. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#e5">
    <prov:dependedOn rdf:resource="http://www.example.com/crimeFile#e2"/>
</rdf:Description>

Is dependedOn a parent property of wasDerivedFrom? This is ISSUE-125.

3.2.6 used

The used property links the ProcessExecution class to the Entity class, where the Entity instance is "consumed" by a ProcessExecution instance.

Note: No arity constraints are assumed between Entity and ProcessExecution


Example
An example of the used property from the provenance scenario is pe2 used e2. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
    <prov:used rdf:resource="http://www.example.com/crimeFile#e2"/>
</rdf:Description>

3.2.7 hadParticipant

The hadParticipant property links the Entity class to the ProcessExecution class, where the Entity was used by, or was generated by, the ProcessExecution.

Note: No arity constraints are assumed between Entity and ProcessExecution
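
Under the current definition, hadParticipant statements follow from used and wasGeneratedBy statements. A minimal Python sketch of that inference is given below; the function name infer_had_participant is hypothetical, and the statement that e3 wasGeneratedBy pe2 is an assumption made for the illustration rather than something taken from the provenance scenario.

```python
def infer_had_participant(used, was_generated_by):
    """Derive (process_execution, entity) hadParticipant pairs.

    used:             set of (process_execution, entity) pairs.
    was_generated_by: set of (entity, process_execution) pairs.
    """
    from_generation = {(pe, entity) for (entity, pe) in was_generated_by}
    return set(used) | from_generation


used = {("pe2", "e2")}
was_generated_by = {("e3", "pe2")}  # hypothetical statement
print(sorted(infer_had_participant(used, was_generated_by)))
```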


Example
An example of the hadParticipant property from the provenance scenario is pe2 hadParticipant e2. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe2">
    <prov:hadParticipant rdf:resource="http://www.example.com/crimeFile#e2"/>
</rdf:Description>

Suggested definition for participation. This is ISSUE-49.
The current definition of hadParticipant does not account for the involvement of an Entity in a ProcessExecution where it was neither "used" nor "generated". For example, a witness in a criminal activity.

3.2.8 wasComplementOf

The wasComplementOf property links two sets of assertions about Entity instances, where "it is relationship between two characterized entities asserted to have compatible characterization over some continuous time interval." [PROV-DM]


Should the wasComplementOf property link two instances of the ProvenanceContainer (or Account) classes, since they are the two classes modeling a set of (one or more) provenance assertions?

3.2.9 wasControlledBy

The wasControlledBy property links the ProcessExecution class to the Agent class, where control represents the involvement of the Agent in modifying the characteristics of the instance of the ProcessExecution class [PROV-DM].


Example
An example of the wasControlledBy property from the provenance scenario is FileAppending (ProcessExecution) wasControlledBy Bob. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
  <prov:wasControlledBy>
    <rdf:Description rdf:about="http://www.example.com/crimeFile#Bob">
      <rdf:type rdf:resource="http://www.example.com/crime#Journalist"/>
    </rdf:Description>
  </prov:wasControlledBy>
</rdf:Description>

3.2.10 hadRecipe

This property links the ProcessExecution class to the Recipe class, which describes the execution characteristics of the instance of the ProcessExecution class. The recipe might or might not have been followed exactly by the ProcessExecution.


Example
An example of the hadRecipe property in the (extended) provenance scenario is that pe1 (an instance of the ProcessExecution class) followed some file appending instructions (instructions1). The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe1">
    <prov:hadRecipe rdf:resource="http://www.example.com/crimeFile#instructions1"/>
</rdf:Description>

3.2.11 wasInformedBy

This object property links two instances of the ProcessExecution class. It is used to express the information that a given process execution used an entity that was generated by another process execution.
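
Read as an inference rule, the description above says that wasInformedBy can be derived by joining used with wasGeneratedBy on the shared entity. The Python sketch below illustrates this; the function name infer_was_informed_by and the entity e4 are hypothetical and are not part of the provenance scenario.

```python
def infer_was_informed_by(used, was_generated_by):
    """Derive (consumer, producer) wasInformedBy pairs.

    used:             set of (process_execution, entity) pairs.
    was_generated_by: set of (entity, process_execution) pairs.
    """
    # Index the generating process executions by entity.
    generators = {}
    for entity, pe in was_generated_by:
        generators.setdefault(entity, set()).add(pe)
    return {
        (consumer, producer)
        for consumer, entity in used
        for producer in generators.get(entity, ())
    }


used = {("pe4", "e4")}              # hypothetical: pe4 used e4
was_generated_by = {("e4", "pe3")}  # hypothetical: e4 wasGeneratedBy pe3
print(infer_was_informed_by(used, was_generated_by))
```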


Example
An example of the wasInformedBy property from the provenance scenario is pe4 wasInformedBy pe3. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe4">
    <prov:wasInformedBy rdf:resource="http://www.example.com/crimeFile#pe3"/>
</rdf:Description>

3.2.12 wasScheduledAfter

This property links two instances of ProcessExecution class to specify the order of their executions. Specifically, it is used to specify that a given process execution starts after the end of another process execution.
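
The ordering condition can be checked against recorded start and end times. The Python sketch below assumes each process execution has a known (start, end) pair; the function name was_scheduled_after and the timestamps are invented for the illustration.

```python
from datetime import datetime


def was_scheduled_after(later, earlier, intervals):
    """Return True if `later` starts strictly after `earlier` ends.

    intervals: maps a process execution to its (start, end) datetimes.
    """
    later_start, _ = intervals[later]
    _, earlier_end = intervals[earlier]
    return later_start > earlier_end


# Hypothetical times for pe3 and pe4.
intervals = {
    "pe3": (datetime(2011, 11, 7, 10, 0), datetime(2011, 11, 7, 10, 30)),
    "pe4": (datetime(2011, 11, 7, 11, 0), datetime(2011, 11, 7, 11, 15)),
}

print(was_scheduled_after("pe4", "pe3", intervals))
```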


Example
An example of the wasScheduledAfter property from the provenance scenario is pe4 wasScheduledAfter pe3. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe4">
    <prov:wasScheduledAfter rdf:resource="http://www.example.com/crimeFile#pe3"/>
</rdf:Description>

3.2.13 hadTemporalValue

This object property links an instance of ProcessExecution or Entity with a time:TemporalEntity from [OWL-TIME], thereby allowing the association of a time value with instances of the two classes and their subclasses.

Example
An example of the hadTemporalValue property from the provenance scenario is that the t+3 time value is associated with the pe3 ProcessExecution instance. The RDF/XML syntax for asserting this is given below.

<rdf:Description rdf:about="http://www.example.com/crimeFile#pe3">
    <prov:hadTemporalValue rdf:resource="http://www.example.com/crimeFile#t+3"/>
</rdf:Description>


Stian: Diagrams that are 'crisp' are exported as PNGs at 200 dpi from OmniGraffle, and included in the HTML with CSS that scales them down. See:

 <img src="diagram-history/khalidDiagrams/wasScheduledAfter.png"                style="height: 3em" alt="wasScheduledAfter links ProcessExecution to ProcessExecution" />

(https://dvcs.w3.org/hg/prov/raw-file/tip/ontology/diagram-history/khalidDiagrams/wasScheduledAfter.png is much bigger than it appears in the HTML)

As a side-effect this allows HTML scaling in browsers and nicer printouts.