Re: PROV-ISSUE-67 (single-execution): What is a PE? (was) Why is there a difference in what is represented by one vs multiple executions? [Conceptual Model]

Breaking the discussion of the nature of a PE out...

 

I think your case can be handled via multiple accounts already, so I'm
taking your comment to really be asking whether we need to use accounts
for this, which is a good question. 

 

Accounts with the single PE rule allow you to document when there's more
than one way to think about what happened - different levels of
granularity in the processes described or, more analogous to your
example, focusing on different dimensions of what happened. Your example
would fit if you are really thinking of cyclone formation as something
that can be viewed from a temperature, or pressure, or moisture
perspective but really involves all three. I.e. there's a difference
between a claim that Luc flew all the way to Boston *and* walked all the
way to Boston  (what the single PE rule is intended to prohibit) and a
claim that Luc flew to Boston *and* bought his way to Boston (purchased
a ticket) - these are two views of one overall financio-physical
process, not multiple histories.

 

Right now, the latter case would be handled via accounts - you could
state both and put them in two accounts. In OPM discussions we then saw
value in allowing one to assert that accounts were 'consistent' (true in
the financio-physical case, not true if there were fly and walk
accounts of Luc's travel).
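
As a rough illustration of the above (a minimal sketch only, not
anything defined by OPM or PIL): the Account class, its method, and the
names LucInBoston, flight, ticketPurchase and walk are all hypothetical.
Each account allows one generating PE per Bob, while different accounts
may each give their own view of the same Bob.

```python
class SinglePEViolation(Exception):
    """Raised when one account gives a Bob two different generating PEs."""


class Account:
    """Hypothetical account: one view of what happened."""

    def __init__(self, name):
        self.name = name
        self.generated_by = {}  # Bob name -> name of the single PE that generated it

    def assert_generated_by(self, bob, pe):
        # Single PE rule, applied per account: a second, different PE
        # for the same Bob is rejected.
        existing = self.generated_by.get(bob)
        if existing is not None and existing != pe:
            raise SinglePEViolation(
                f"account {self.name!r}: {bob!r} already generated by {existing!r}"
            )
        self.generated_by[bob] = pe


# Two views of the one financio-physical process, kept in two accounts.
physical = Account("physical")
physical.assert_generated_by("LucInBoston", "flight")

financial = Account("financial")
financial.assert_generated_by("LucInBoston", "ticketPurchase")

# Flew the whole way *and* walked the whole way, in one account: rejected.
try:
    physical.assert_generated_by("LucInBoston", "walk")
    rejected = False
except SinglePEViolation:
    rejected = True
```

Note that the sketch does not try to decide whether two accounts are
'consistent'; that would remain something the asserter states.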

 

It seems like a reasonable question to ask whether one needs accounts
for the consistent case. If the model assumes that when a Bob is
generated by multiple PEs, those PEs are all different views of some
underlying/overall/higher dimensional process, I'm not sure what we
might lose by putting all views in one account. I.e. we still have only
one history for a Bob, and if there are multiple PEs listed, the
interpretation would be that these are all aspects of that process. The
only challenge here might be that it could be difficult to disentangle
things if those multiple aspects involve the same Bobs. I'm not sure I
can make a clear example, but suppose a warm, moist airmass moves in and
the temperature change and moisture change PEs both use it, but it got
heated by the sun and picked up water from the ocean. Do we care that
it looks like the energy from the sun caused a moisture increase and the
ocean caused the air to warm? Both are perhaps true a bit, but if I just
reported that temperatureChange caused the storm, I might not include
the ocean as a heat source, and vice versa: the sun didn't add moisture,
but I can't convey this if these are all one big account. Again, I'm not
sure the example is clear, but if we think we need to differentiate as
to which Bobs and earlier processes contribute in which view of storm
formation, making separate accounts would help.
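
To make the airmass example a bit more concrete, here is a small sketch
under stated assumptions: accounts are plain dicts of "used" and
"generated_by" edges, and the PE names moistureChange, solarHeating and
evaporation are made up for illustration (only temperatureChange appears
above). The merged account deliberately lets a Bob list more than one
generating PE, which is the relaxation being considered here.

```python
def upstream_bobs(account, pe):
    """Return every Bob reachable upstream of `pe` within one account."""
    seen = set()
    visited_pes = set()
    pending = [pe]
    while pending:
        current_pe = pending.pop()
        if current_pe in visited_pes:
            continue
        visited_pes.add(current_pe)
        for bob in account["used"].get(current_pe, []):
            seen.add(bob)
            # Follow the Bob back to whichever PEs are said to have generated it.
            pending.extend(account["generated_by"].get(bob, []))
    return seen


def merge(*accounts):
    """Put all edges from several accounts into one big account."""
    merged = {"used": {}, "generated_by": {}}
    for account in accounts:
        for kind in ("used", "generated_by"):
            for key, values in account[kind].items():
                target = merged[kind].setdefault(key, [])
                for value in values:
                    if value not in target:
                        target.append(value)
    return merged


temperature_view = {
    "generated_by": {"cyclone": ["temperatureChange"], "airmass": ["solarHeating"]},
    "used": {"temperatureChange": ["airmass"], "solarHeating": ["sun"]},
}
moisture_view = {
    "generated_by": {"cyclone": ["moistureChange"], "airmass": ["evaporation"]},
    "used": {"moistureChange": ["airmass"], "evaporation": ["ocean"]},
}
one_big_account = merge(temperature_view, moisture_view)

separate = upstream_bobs(temperature_view, "temperatureChange")
combined = upstream_bobs(one_big_account, "temperatureChange")
```

With separate accounts, tracing back from temperatureChange reaches the
airmass and the sun only. In the one big account the same trace also
reaches the ocean, because the shared airmass carries both of its
histories with it.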

 

My guess is that this concern, as well as just general clarity (avoiding
a multiple histories interpretation), means that staying with a
multi-account model to address your example is workable and probably
desirable.

 

Jim

 

 

 

You raise an interesting example, but I don't see that this is an
effective argument against requiring that a single PE generates a Bob. I
also think that this requirement is not the same as requiring the PE to
be atomic, so we should separate those.

 

Looking at your example, I think it is clear that all of the things you
call 'processes' (I have a minor issue with a state change being called
a process - see below) are required for the cyclone to form, so
minimally you're not claiming that the Bob was created multiple ways -
just one complex/multifaceted way. (You don't have a case analogous to
saying Luc arrived in Boston by flying the whole way and walking the
whole way. It is more like wanting to say that Luc got to Boston by
flying and by paying for a ticket.) In that sense, while cyclone
formation is complex and potentially composite, it is still one process
- things only happened one way. 

 

I would encode this more like the following: a low pressure system A
went through a cyclone formation PE subject to the laws of nature (the
recipe) to create cyclone B with air, water, heat, solar energy, etc.
participating.  If you wanted to report different accounts that said the
cyclone was generated by an airpressurelowering PE, and one that claimed
it was a coriolisForcePE, and one that claimed a temperatureChangePE did
it, I think that's fine - those are all partial views of reality that
should be recordable in PIL as separate accounts. That's not much
different from the discussions about recording more or less granularity
in different accounts - both are valid partial views and both are
consistent - but saying that a Bob was simultaneously created by
multiple processes seems odd versus saying that these are different
descriptions of THE PE that created it. In your example, I think it is
pretty clear that there is only one process - temperature and pressure
are related by natural laws and both change together. Same for my
example - Luc getting to Boston could be viewed as a financial or
physical process but in reality, he only got there once through a
financio-physical process (there are mechanisms that link paying and
flying that are analogous to the natural laws that couple your cyclone
processes.) 

 

There could still be granularity - heating from the sun might have only
occurred during the day while the storm formed over multiple days and
PIL (with the single PE rule in place) could be used to describe either
one multiday process, or to separate out the heating processes and
multiple stages of the storm development, or both (as two accounts). 

 

So -

 

 

Going back to some of Luc's earlier email regarding the assumption
that isGeneratedBy links a Bob to a single PE - I am not sure I understand
why or how we can make this assumption. Similar to our discussion on
isDerivedFrom, we should not make any assumption that the PE in
isGeneratedBy represents an atomic process. There can be multiple PEs
involved in generation of a single Bob (a tropicalCyclone is generatedBy
the processes of changeInAirPressure and changeInTemperature and
changeInMoisture and rotationofEarth).

 

A specific application may specialize the isGeneratedBy property to define
isGeneratedBySingleProcessExecution or
isGeneratedByMultipleProcessExecution. Our provenance model should cater
for both these cases.

 

Received on Saturday, 6 August 2011 18:47:59 UTC