PIL OWL Ontology Meeting 2011-10-31

From Provenance WG Wiki
Revision as of 19:01, 4 November 2011 by Tlebo (Talk | contribs)


Meeting Information

prov-wg - Modeling Task Force - OWL group telecon



Attendees

  • Satya
  • Tim
  • Daniel
  • Stephan
  • James
  • Paolo
  • Luc
  • Khalid


Agenda

  • Try to reach a consensus on the modeling of roles using either:
    • entity properties modelled as classes
    • entityInRole class


1) Stephan points out that they are qualifiers.

2) To what extent is this open world? E.g. with prov:used, if a second person wants to add a role, then they make a second (redundant?) assertion. Example from email: division with "8" ("8" as numerator, "8" as result). Replace the nonqualified assertion with a qualified one? No. The concern is that 2 assertions are needed.

Example Encoding 1

  • process executions: p1, p2, p3
  • p1: algebraic process (used input 8)
  • p2: 40/5 = 8
  • p3: 8/2 = 4
:8 rdf:type Entity
:8_as_numerator rdf:type Numerator (which is subclass of EntityInRole)
:8_as_result rdf:type Result (which is subclass of EntityInRole)
:p1 used :8
:p3 used :8_as_numerator
:p3 used :8
:8_as_numerator wasAssumedBy :8
:8_as_result wasGeneratedBy :p2
:8_as_result wasAssumedBy :8

"multiplicand" times "multiplier" equals "result" ?

"dividend" divided by "divisor" equals "result" ?

Example encoding 2

opmo: p3 used 8

If we want to add more info:

usage1 rdf:type Usage
usage1 cause 8
usage1 effect p3
usage1 hasRole numerator (or 8 asNumerator)

Satya, does this fall into your "two methods"?

"8 / 1 = 8"

:p5 a alg:AlgebraicProcessExecution ;
    prov:used :what_p5_used_8, :what_p5_used_1 ;
    prov:generated :what_p5_generated_8 ;
    alg:hadNumerator :what_p5_used_8 ;    # subproperty of prov:used
    alg:hadDenominator :what_p5_used_1 ;  # subproperty of prov:used
    alg:hadResult :what_p5_generated_8 .  # subproperty of prov:generated

:what_p5_used_8 rdf:value 8 .
:what_p5_used_1 rdf:value 1 .
:what_p5_generated_8 rdf:value 8 .
:what_p5_used_8 a alg:Numerator .   # Subsequently-created qualification.
:what_p5_used_1 a alg:Denominator . # Subsequently-created qualification.
:what_p5_generated_8 a math:Result .
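
The "# subproperty of ..." comments in this encoding suggest that the plain prov:used and prov:generated statements need not be asserted a second time, because they follow from the role-specific properties. The following is a minimal Python sketch of that inference, not a real reasoner. It assumes triples are held as plain tuples, and the helper name with_superproperties is hypothetical.

```python
# Assumed schema: each role-specific property is declared a subproperty,
# mirroring the "# subproperty of ..." comments in the example above.
SUBPROPERTY_OF = {
    "alg:hadNumerator": "prov:used",
    "alg:hadDenominator": "prov:used",
    "alg:hadResult": "prov:generated",
}

# Only the role-specific statements are asserted.
ASSERTED = {
    (":p5", "alg:hadNumerator", ":what_p5_used_8"),
    (":p5", "alg:hadDenominator", ":what_p5_used_1"),
    (":p5", "alg:hadResult", ":what_p5_generated_8"),
}


def with_superproperties(triples, subproperty_of):
    """Return the triples plus every statement implied by rdfs:subPropertyOf."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        # Walk up the subproperty hierarchy (assumed to be acyclic here).
        while predicate in subproperty_of:
            predicate = subproperty_of[predicate]
            inferred.add((subject, predicate, obj))
    return inferred


INFERRED = with_superproperties(ASSERTED, SUBPROPERTY_OF)
```

Under these assumptions a single role-specific assertion per input or output is enough, which speaks to the concern that two assertions are needed.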

From the primer worked example

ex1:derek a prov:Agent .
ex1:analyst a prov:Role .
ex1:aggregationByRegion
       prov:wasControlledBy [
               a prov:EntityInRole ;
               prov:assumedBy ex1:derek ;
               prov:assumedRole ex1:analyst
       ] .

From Stian:

:e1 a prov:Entity .
:e1XGeneration a prov:EntityInRole ;
 prov:assumedBy :e1 ;
 prov:assumedRole :X ;
 prov:assumedAt :t1 ;
 prov:wasGeneratedBy :pe1 .
:e1XUse a prov:EntityInRole ;
 prov:assumedBy :e1 ;
 prov:assumedRole :X ;
 prov:assumedAt :t2 .
:pe2 a prov:ProcessExecution ;
 prov:used :e1XUse .

This is the gist of Luc's objection (Oct 26):

My problems are the following:

- Encoding2 is not an extension of encoding1: it does not just add new edges, it removes some. But according to the data model, we have just added extra information.

Selecting Roled and Unroled Entities

Stephan asked: How many things were used?
Rephrased: how many distinct usages did the process execution have?
Follow-up question: are distinct usages important in provenance?

Query Pattern 1
SELECT ?pe ?usedThing
WHERE {
    {
        # Select the Entities that ARE roled.
        GRAPH :myAccount {
            # NOTE that wasAssumedBy is a subproperty of a yet-unidentified property in PROV
            # that references the more abstract notion that the Entity represents.
            ?pe prov:used [ a prov:Entity ; prov:wasAssumedBy ?usedThing ] .
        }
    } UNION {
        # Select the Entities that are NOT roled.
        # This second pattern illustrates BAD PROV modeling
        # (one never uses the actual thing (e.g. Khalid),
        # but a _view_ of it (e.g. the Person named Khalid ordering the Latte)).
        GRAPH :myAccount {
            ?pe prov:used ?usedThing .
            OPTIONAL { ?usedThing prov:wasAssumedBy ?none }
            FILTER(!bound(?none))
        }
    }
}

I think it may be simpler. This query would give us the real "usages" of a pe (binary relationships):

SELECT DISTINCT ?usedThing
WHERE {
            ?pe prov:used ?usedThing .
            ?usedThing a prov:Entity . # Assuming that we don't apply inference in EntityInRole...
}
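
To make the difference between "things used" and "asserted usages" concrete, here is a minimal Python sketch that applies the idea of Query Pattern 1 to the triples of Example Encoding 1. It is an illustration under assumptions, not a SPARQL engine: triples are plain tuples, and the helper names objects, used_things and direct_usages are hypothetical.

```python
# Triples from Example Encoding 1, written as (subject, predicate, object).
TRIPLES = {
    (":p1", "used", ":8"),
    (":p3", "used", ":8_as_numerator"),
    (":p3", "used", ":8"),
    (":8_as_numerator", "wasAssumedBy", ":8"),
    (":8_as_result", "wasGeneratedBy", ":p2"),
    (":8_as_result", "wasAssumedBy", ":8"),
}


def objects(triples, subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def direct_usages(triples, pe):
    """Everything the process execution is asserted to have used."""
    return objects(triples, pe, "used")


def used_things(triples, pe):
    """The abstract things used: roled entities are resolved via wasAssumedBy."""
    result = set()
    for used in direct_usages(triples, pe):
        assumed_by = objects(triples, used, "wasAssumedBy")
        # Roled entity: report what assumed the role. Unroled: report it directly.
        result |= assumed_by if assumed_by else {used}
    return result
```

For :p3 this gives two asserted usages (:8 and :8_as_numerator) but only one thing used (:8), which is the gap behind Stephan's question.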

Number of Usages

Stephan asked: How many distinct usages were used?

Tim: a MUCH better question.

Luc: had to change the kinds of entities you're changing in the world. Luc wants to just point to the abstract entity, not the specific usage.

Making new entities for a derivation chain: James' example from http://lists.w3.org/Archives/Public/public-prov-wg/2011Oct/0326.html

Luc's request for examples

  1. Show derivations: wdf(2,8) and wdf(8,40) and wdf(8,5)
  2. In the opmo-like example, show opmv properties (e.g. wasGeneratedBy inferred from the Generation class)
  3. Show the same examples without roles, and display the difference between an encoding with roles and an encoding without roles
  4. Show time
  5. Show qualifiers as in prov-dm

See derivations because they have transitivity. EntityInRole has trouble with transitivity/derivation: does 4 derive from 8 (Entity), from 8asNumerator (EntityInRole), or from both?

Tim: we're fighting abstraction confusion here, again. Luc wants to refer directly to the abstract notion with prov:used.

Tim: We can get "transitivity" via property chains. Where do we impose the contextualization?

Daniel to Tim: opmo has defined the binary properties AND the Usage, Control and Generation classes, so the transitivity can be asserted as usual with the binary relationships.

Luc: does not like the forced instantiation of Entities if they need to be described further. In non-described situations it is direct and clean.

   ?pe :used "8"^^xsd:integer . # This refers directly to the "abstract" 8.
   ?pe :used [ prov:abc "8"^^xsd:integer ] .
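
Tim's remark about getting "transitivity" via property chains can be sketched as follows, using only the binary relationships as Daniel suggests. This is a hedged Python illustration, not PROV semantics: it assumes that every entity generated by a process execution is derived from every entity that execution used, it reuses p2 (40/5 = 8) and p3 (8/2 = 4) from Example Encoding 1, and the function names are hypothetical.

```python
# Binary relationships for p2 (40/5 = 8) and p3 (8/2 = 4).
TRIPLES = {
    (":8", "wasGeneratedBy", ":p2"),
    (":p2", "used", ":40"),
    (":p2", "used", ":5"),
    (":4", "wasGeneratedBy", ":p3"),
    (":p3", "used", ":8"),
    (":p3", "used", ":2"),
}


def derived_from(triples):
    """One-step derivations via the chain wasGeneratedBy followed by used."""
    return {
        (generated, used)
        for generated, p, pe in triples if p == "wasGeneratedBy"
        for pe2, q, used in triples if q == "used" and pe2 == pe
    }


def transitive_closure(pairs):
    """Repeatedly join the pairs until no new derivation appears."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c} - closure
        if not new:
            return closure
        closure |= new


ONE_STEP = derived_from(TRIPLES)
ALL_DERIVATIONS = transitive_closure(ONE_STEP)
```

With unroled entities the chain closes without extra work. If :p3 had used only :8_as_numerator, the join on :8 would fail unless wasAssumedBy were added to the chain, which is one way to read the transitivity concern above.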

Jeni's Blog: http://www.jenitennison.com/blog/node/142

The OWL ontology for OPM is a very literal mapping of OPM into RDF. Each of the types of nodes is a separate class, and each of the types of edges is a separate class. Thus, it introduces a lot of n-ary relationships. Take a really simple example of an XML file being transformed into HTML using XSLT.

Some Observations

There are two things that I want to pull out about the RDF mapping described above.

  • it’s incredibly literal; every entity type within the model is mapped onto an RDF class, including the edges, the roles and the accounts (which I didn’t show above)
  • it doesn’t reuse any existing vocabularies, even when they might help (such as for the ‘value’ of a role, which is really a label)

It reminds me of the mapping of object-oriented or relational data models into each other or into XML, which often result in a god awful mess and people swearing that technology X is goddamned ugly.

The fact is that elegant uses of each modelling paradigm — ones that are easy to understand and efficient to query — always take advantage of the unique features of that paradigm. For example, good XML vocabularies take advantage of the distinctions between attributes and elements, of nesting and hierarchies, and of the ability to hold mixed content.

It’s the same with RDF. There are four features of RDF that I think good vocabularies will take suitable advantage of:

  • existing vocabularies
  • shortcuts and reasoning
  • named graphs



:pe_1 a :Telecon ;
  prov:used [ prov:abc <http://tw.rpi.edu/instances/TimLebo> ; a feel:Tired ] ;
  prov:generated [ prov:abc <http://tw.rpi.edu/instances/TimLebo> ; a feel:Tired, feel:Frustrated ] .

What about this? <-- Isn't this the opmo approach? yep :D

Isn't prov:role a specialization of prov:qualifier? (I think it should replace it according to what Luc said)

Stephan's example

# predicate names in Usage, Control, etc. are preliminary and subject to change
prov:role a owl:ObjectProperty ;
    rdfs:subPropertyOf prov:qualifier .

ex1:aggregationByRegionAlt
        a prov:ProcessExecution ;
        prov:hadUsage [
                a prov:Usage ;
                prov:time ""^^xsd:dateTime ;
                prov:entity ex1:dataSet1 ;
                prov:role ex1:dataSetToAggregate ; # Does this make it awkward to SPARQL (sketching below)? NOPE!
        ] ;
        prov:used ex1:dataSet1 ;
        prov:hadUsage [
                a prov:Usage ;
                prov:time ""^^xsd:dateTime ;
                prov:entity ex1:regionList ;
                prov:qualifier ex1:regionsToAggregateBy ;
        ] ;
        prov:used ex1:regionList ;
        prov:hadControl [
                a prov:Control ;
                prov:time ""^^xsd:dateTime ;
                prov:entity ex1:derek ;
                prov:qualifier ex1:analyst ;
        ] ;
        prov:wasControlledBy ex1:derek .
  1. This is awesome and eloquent (Tim):
SELECT ?used ?role WHERE {
     ex1:aggregationByRegionAlt prov:used ?used ;
                                prov:hadUsage [ prov:entity ?used ; prov:role ?role ] .
}

Daniel to Tim: I think you should ask just one of the two:

SELECT ?used WHERE {
    ex1:aggregationByRegionAlt prov:used ?used .
}

If you want more info, you refer to the Usage class. Tim likes this.

SELECT ?used ?role WHERE {
    ex1:aggregationByRegionAlt prov:hadUsage [ prov:entity ?used ; prov:role ?role ] .
}

BAD (do not use):

SELECT ?used ?role WHERE {
    ex1:aggregationByRegionAlt prov:used ?used .
    ?used a ?role . # This breaks when an entity is used in multiple roles.
    # ^ nope; used in ANY PE (not just this one); that is how it breaks.
    # If ?used is prov:used at any other time by any other PE, then the roles in which the others use ?used are returned in the query about JUST ex1:aggregationByRegionAlt.
    # ah, I see it now.
    # I've just abandoned my preference for typing the entity to the role. Blah. But progress!
    # I think roles only make sense in the context of both the entity and the process execution. 8_as_numerator in isolation does not provide much benefit. Which is why I like it on Usage, Control, etc.
}
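
The failure of the BAD query can be reproduced with a small Python sketch. It is only an illustration under assumptions: usages are flattened into (process execution, entity, role) tuples, ex1:otherExecution and ex1:listToPrint are hypothetical names invented for the example, and both functions are stand-ins for the SPARQL patterns rather than real queries.

```python
# Hypothetical data: ex1:regionList is used by two process executions,
# each time in a different role. ex1:otherExecution and ex1:listToPrint
# are invented for this illustration.
USAGES = [
    # (process execution, entity, role)
    ("ex1:aggregationByRegionAlt", "ex1:regionList", "ex1:regionsToAggregateBy"),
    ("ex1:otherExecution", "ex1:regionList", "ex1:listToPrint"),
]


def roles_via_entity_typing(usages, pe):
    """BAD: the role is a type on the entity, so roles from every PE leak in."""
    types = {}
    for _, entity, role in usages:
        # Typing the entity with the role loses which PE the role applied to.
        types.setdefault(entity, set()).add(role)
    used = {entity for p, entity, _ in usages if p == pe}
    return {(entity, role) for entity in used for role in types[entity]}


def roles_via_usage(usages, pe):
    """Role kept on the Usage: scoped to one entity AND one process execution."""
    return {(entity, role) for p, entity, role in usages if p == pe}


LEAKY = roles_via_entity_typing(USAGES, "ex1:aggregationByRegionAlt")
SCOPED = roles_via_usage(USAGES, "ex1:aggregationByRegionAlt")
```

Here the entity-typing variant also reports ex1:listToPrint for ex1:aggregationByRegionAlt, while the Usage-based variant returns only ex1:regionsToAggregateBy.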