Re: PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process [Conceptual Model]

Hi Luc,
My comments are inline:
> First, you will note that wasInformedBy is *not* a temporal relation
> between process executions.

The PROV-DM currently defines the following constraint for wasInformedBy:
Given two process execution expressions denoted by pe1 and pe2, the
expression wasInformedBy(pe2,pe1) holds, if and only if there is an entity
expression denoted by e and qualifiers q1 and q2, such that
wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2) hold.

Consider the two expressions wasGeneratedBy(e, pe1, q1) and used(pe2, e, q2).
Together they enforce that pe2 cannot have a start time that is "before" the
start time of pe1. This is a temporal relation/ordering between pe1 and pe2.
Hence, if both these expressions have to "hold" for wasInformedBy(pe2, pe1)
to "hold", I am not sure how it is not a temporal ordering.
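
To make the constraint concrete, here is a minimal Python sketch of how
wasInformedBy could be derived from wasGeneratedBy and used expressions. The
tuple encoding and the function name infer_was_informed_by are assumptions
made for illustration (qualifiers are dropped); none of this is PROV-DM text.

```python
# Hypothetical in-memory encoding of PROV-DM expressions (qualifiers omitted).
was_generated_by = {("e", "pe1")}   # wasGeneratedBy(e, pe1, q1)
used = {("pe2", "e")}               # used(pe2, e, q2)


def infer_was_informed_by(used, was_generated_by):
    """Return every (pe2, pe1) pair for which some entity e was
    generated by pe1 and used by pe2."""
    return {
        (consumer, producer)
        for (consumer, e_used) in used
        for (e_generated, producer) in was_generated_by
        if e_used == e_generated
    }


informed = infer_was_informed_by(used, was_generated_by)
```

With the two expressions above, the only derived pair is (pe2, pe1), that is,
wasInformedBy(pe2, pe1).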


> Second, it would be nice for PROV to have a temporal ordering relation.
> However, we have to be careful. The relations
> used/generatedBy/derivedFrom/dependedOn/... all have a notion of
> causality/influence: the source of the edge being influenced by the edge
> destination.
> We know that causal order implies temporal order, but not the converse. I
> am therefore reluctant to introduce a relation that arbitrarily capture
> temporal order. What would it give us? After all, we can associate time
> with PEs, and given such time information, we can already decide if pe1
> start precedes pe2 start, or if pe1 end precedes pe2 start. What would a
> temporal relation give us over time?

There are many non-causal properties that are part of provenance assertions.

For example, to reconstruct the activities of an accused person X on Oct 2
before X reached the crime scene, the police make the following assertions:
1. X bought a car at 2:00pm US ET - buying the car is PE pe1
2. X bought flowers at 4:00pm US ET - buying flowers is PE pe2
3. X hailed a taxi and travelled to the crime scene at 6:00pm US ET -
travelling in the taxi is PE pe3

In the above scenario, the police need a temporal ordering of the PEs to
establish that person X was in the city on the day of the crime, but there is
no causal relation between pe1, pe2, and pe3.
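
As a rough sketch of this scenario, the ordering of the three PEs can be
recovered from the asserted times alone, with no used/wasGeneratedBy link
between them. The dictionary layout and the function name preceded_by_pairs
are hypothetical, and the times are assumed to fall on the same day.

```python
from datetime import time

# Asserted times from the scenario (same day, US ET).
asserted_times = {
    "pe1": time(14, 0),  # X bought a car
    "pe2": time(16, 0),  # X bought flowers
    "pe3": time(18, 0),  # X travelled by taxi
}


def preceded_by_pairs(times):
    """Return (later, earlier) pairs ordered purely by asserted time.
    No causal relation between the PEs is implied."""
    return {
        (later, earlier)
        for later, t_later in times.items()
        for earlier, t_earlier in times.items()
        if t_earlier < t_later
    }


pairs = preceded_by_pairs(asserted_times)
```

The result contains (pe2, pe1), (pe3, pe1) and (pe3, pe2), none of which is
backed by an information flow between the PEs.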

As you stated, temporal ordering may or may not represent a causal relation
between PEs. Since non-causal ordering of PEs occurs in many provenance
applications, we need to define a property for temporal ordering of PEs;
causality-based temporal ordering is then a specialization of that property.


> The relation wasScheduleAfter attempts to capture some temporal ordering,
> with underpinning causal influence. You are incorrect to state that to
> assert wasScheduledAfter you need to know of an agent. It's exactly the
> contrary. By asserting wasScheduledAfter, you also assert the existence of
> such an agent, but don't have to specify which it is.

The PROV-DM currently defines the following constraint
for wasScheduledAfter:
Given two process execution expressions denoted by pe1 and pe2, the
expression wasScheduledAfter(pe2,pe1) holds, if and only if there are two
entity expressions denoted by e1 and e2, such that
wasControlledBy(pe1,e1,qualifier(role="end")) and
wasControlledBy(pe2,e2,qualifier(role="start")) and wasDerivedFrom(e2,e1).
and
This definition assumes that the activities represented by process execution
expressions identified by pe1 and pe2 are controlled by some agents,
represented by expressions identified by e1 and e2, where the first agent
terminates (control qualifier qualifier(role="end")) the first activity, and
the second initiates (control qualifier qualifier(role="start")) the second.
The second agent being "derived" from the first enforces temporal ordering.

If we don't know which Agents are associated with pe1 and pe2, then how can
we state that they are entities with identifiers e1 and e2?

In other words, suppose there are two PEs (from Taverna workflows),
retrieveGeneSequence and runBLASTService. If John (a research robot) ended
retrieveGeneSequence, and Tom (a research robot derived from John) started
runBLASTService, then we can assert that runBLASTService wasScheduledAfter
retrieveGeneSequence.

But if we don't know which Agents are associated with the
retrieveGeneSequence and runBLASTService PEs, then how can we assert the
wasScheduledAfter property between the two PEs?

There may be a third robot, Albert, that is not related to either Tom or John
by the wasDerivedFrom property. But a provenance application has to know
which of the three robots (agents) are associated with the two PEs (and then
verify that there is a wasDerivedFrom property linking the two robots).

The constraint defined for wasScheduledAfter is a rule and for the rule to
"fire" its conditions have to evaluate to "true".

Just knowing that there exists some Agent associated with the
retrieveGeneSequence and runBLASTService PEs will not make the constraint
evaluate to "true": the provenance application has to specify which Agents
(John and Tom) were associated with the two PEs.

Hence, according to the current PROV-DM text, my understanding is that a
provenance application will need to know about the specific agents
associated with PEs before they can use the wasScheduledAfter property. This
information may or may not be available to a provenance application.
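
The following Python sketch illustrates this reading of the constraint, in
which the rule is evaluated against explicitly asserted expressions. The
tuple encoding, the function name, and the treatment of an unknown agent as
an absent assertion are assumptions made for illustration, not PROV-DM
definitions.

```python
def was_scheduled_after(pe2, pe1, was_controlled_by, was_derived_from):
    """Evaluate the rule: there exist e1, e2 such that
    wasControlledBy(pe1, e1, role="end"),
    wasControlledBy(pe2, e2, role="start") and wasDerivedFrom(e2, e1)."""
    enders = {e for (pe, e, role) in was_controlled_by
              if pe == pe1 and role == "end"}
    starters = {e for (pe, e, role) in was_controlled_by
                if pe == pe2 and role == "start"}
    return any((e2, e1) in was_derived_from
               for e1 in enders for e2 in starters)


was_derived_from = {("Tom", "John")}  # Tom was derived from John

# Agents known: John ended the first PE, Tom started the second.
agents_known = {
    ("retrieveGeneSequence", "John", "end"),
    ("runBLASTService", "Tom", "start"),
}
# Agents not recorded at all.
agents_unknown = set()
# Albert (unrelated to Tom and John) ended the first PE.
albert_ended = {
    ("retrieveGeneSequence", "Albert", "end"),
    ("runBLASTService", "Tom", "start"),
}

args = ("runBLASTService", "retrieveGeneSequence")
fires_known = was_scheduled_after(*args, agents_known, was_derived_from)
fires_unknown = was_scheduled_after(*args, agents_unknown, was_derived_from)
fires_albert = was_scheduled_after(*args, albert_ended, was_derived_from)
```

Under these assumptions the rule only evaluates to true in the first case,
where both agents and the derivation between them are explicitly asserted.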

Therefore, I am raising the need for a generic ordering property for PEs that
can simply be asserted by provenance applications. As with other provenance
assertions, the ordering of PEs can be verified later using either
timestamps or causal relation constraints.
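
A hedged sketch of that idea: ordering pairs are asserted first and checked
later against whatever timestamps are available. The property name
wasPrecededBy follows the earlier proposal in this thread; the data layout,
the numeric timestamps, the function name, and the choice of "end of the
earlier PE is not after start of the later PE" as the check are all
hypothetical.

```python
def verify_preceded_by(asserted, start, end):
    """Check each asserted (later, earlier) pair against optional timestamps.

    Returns True/False when both timestamps are known, and None when the
    assertion cannot be checked yet.
    """
    report = {}
    for later, earlier in asserted:
        if later in start and earlier in end:
            report[(later, earlier)] = end[earlier] <= start[later]
        else:
            report[(later, earlier)] = None
    return report


# Hypothetical assertions: pe2 wasPrecededBy pe1, pe3 wasPrecededBy pe2.
asserted = {("pe2", "pe1"), ("pe3", "pe2")}
start = {"pe2": 20}            # illustrative values; start of pe3 unknown
end = {"pe1": 10, "pe2": 30}

report = verify_preceded_by(asserted, start, end)
```

Here the first assertion is confirmed by the timestamps, while the second
stays unverified because no start time for pe3 has been recorded.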

> Final point, your reference [1] had not been agreed, it is the proposal you
> made back then.

Hence, I had raised this issue (Issue-50) to discuss the property. To
clarify, have there been discussions or agreement on the two properties
wasInformedBy and wasScheduledAfter (I may have missed the particular mails
on the mailing list)?

Thanks.

Best,
Satya

On Sun, Oct 2, 2011 at 9:58 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:

> Hi Satya,
>
> First, you will note that wasInformedBy is *not* a temporal relation
> between process executions.
> It is *not* transitive.  It requires information to flow between two PEs.
> For wasInformedBy(pe1,pe2),
> a minimum constraint is that the end of pe2 does *not* precede the start of
> pe1.
> The data journalism example had an illustration of such relation. It has
> been established to be useful
> both theoretically and practically.
>
> Second, it would be nice for PROV to have a temporal ordering relation.
> However, we have to be
> careful. The relations used/generatedBy/derivedFrom/dependedOn/... all have
> a notion of causality/influence:
> the source of the edge being influenced by the edge destination.
>
> We know that causal order implies temporal order, but not the converse.  I
> am therefore reluctant
> to introduce a relation that arbitrarily capture  temporal order.  What
> would it give us? After all,
> we can associate time with
> PEs, and given such time information, we can already decide if pe1 start
> precedes pe2 start, or if pe1 end
> precedes pe2 start. What would a temporal relation give us over time?
>
> The relation wasScheduleAfter attempts to capture some temporal ordering,
> with underpinning
> causal influence.  You are incorrect to state that to assert
> wasScheduledAfter you need to know of an agent.
> It's exactly the contrary. By asserting wasScheduledAfter, you also assert
> the existence of such an
> agent, but don't have to specify which it is.
>
> Final point, your reference [1] had not been agreed, it is the proposal you
> made back then.
>
> So, in conclusion:
> 1. I would argue that wasInformedBy is useful, and should be kept as such,
> ... and definitely cannot
>    be subsumed by some temporal ordering.
>
> 2. Temporal ordering *with* some form of underpinning causal influence, is
> also useful. I agree that
>    wasScheduledAfter is a first attempt. Maybe somebody can put forward
> alternative definitions.
>
> Cheers,
> Luc
>
>
> On 02/10/11 02:03, Satya Sahoo wrote:
>
> Hi Luc,
> I would like to re-raise this issue since the two properties defined in
> PROV-DM, "wasInformedBy" and "wasScheduledAfter" do not represent the
> original property for ordering process executions that was agreed to by the
> provenance incubator group and also during the first F2F [1].
>
>  I believe there are primarily two dimensions/constraints for ordering
> process executions:
> a) Two PEs are scheduled (by agent/user) to execute in particular order at
> specific time instants, which we can represent as *time-based ordering of
> PEs*. Of course, additional information about which agent/user started or
> stopped the PEs can be specified, but the time value primarily define the
> ordering of the PEs.
>
>  b) A PE pe1 is designed to initiate/start a second PE pe2 (due to some
> condition being satisfied for example a specific state was reached or some
> entity became available), which we can represent as a *control-based
> ordering of PEs*. This ordering of process cannot be effectively captured
> by time-based ordering, since pe1 may still be executing while pe2 starts.
>
>  Both these cases are captured by the property "wasPrecededBy" (the
> corresponding property in opposite direction can be "wasSucceededBy") where
> the PEs were ordered according to their time of start/stop or explicit
> start/stop by another PE.
>
>  Some specific comments on the current PROV-DM document Section 5.3.6
> Ordering of Process Executions
> =====
> 1. An information flow ordering expression is a representation that a
> characterized thing was generated by an activity, represented by a process
> execution expression, before it was used by another activity, also
> represented by a process execution expression.
>
>  Issue: This is a particular case of "time-based ordering"; there can be
> multiple others. For example,
>
>  a) We can have the provenance assertions about two PEs Pe1 and Pe2: Pe1
> was stopped at time instant t1 and Pe2 started at time instant t2 and t2 >
> t1. Hence Pe2 wasPrecededBy Pe1
>
>  b) Similarly, we have provenance assertions about two PEs Pe1, Pe2 and an
> Entity e1: Pe1 used e1 at time t1 and PE2 used e1 at time t2 and t2 > t1,
> hence (start of) Pe2 wasPrecededBy (start of) Pe1.
>
>  My suggestion is to just create a single generic property for ordering of
> PEs (Khalid had suggested using PEs instead of Process) and allow specific
> provenance applications to create more specialized PE ordering properties
> according to their requirements.
>
>  2. According to the current definition of "wasScheduledAfter" we cannot
> assert that one PE was scheduled after another PE if we don't have
> information about the agent associated with the PEs. Further, the name of
> the property seems to refer to the intended ordering of PEs rather than
> actual execution of PEs - a workflow specification may have "scheduled" Pe1
> to execute "after" Pe2, but during the workflow run, Pe2 may have executed
> before Pe1?
>
>  Overall, I am not sure why we need two very special cases of PE ordering
> property instead of using a generic "wasPrecededBy" (or "wasSucceededBy")
> property that can be specialized as needed by different provenance
> applications.
>
>  Thanks.
>
>  Best,
> Satya
>
>  [1]
> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>
> On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>
>>
>> Hi Satya,
>>
>> Issue has been closed pending review, with the latest document version.
>> Feel free to reopen if not appropriate.
>>
>> Luc
>>
>>
>> On 27/07/2011 02:51, Provenance Working Group Issue Tracker wrote:
>>
>>> PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process
>>> [Conceptual Model]
>>>
>>> http://www.w3.org/2011/prov/track/issues/50
>>>
>>> Raised by: Satya Sahoo
>>> On product: Conceptual Model
>>>
>>> I am not sure where did we get the currently listed definition of
>>> "Ordering of Process" - it is neither listed in the original provenance
>>> concept page [1] nor in the consolidated concepts page [2].
>>>
>>> I had proposed the following definition:
>>> "Ordering of processes execution (in provenance) needs to be modeled as a
>>> property linking process entities in specific order along a particular
>>> dimension (temporal or control flow)"
>>>
>>> [1]http://www.w3.org/2011/prov/wiki/ConceptOrderingOfProcesses
>>> [2]
>>> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>
>>>
>>>
>>>
>>>
>>
>>
>

Received on Monday, 3 October 2011 00:55:04 UTC