Re: PROV-ISSUE-105: 5.3.1 Generation (current version of the conceptual model document) [Conceptual Model] from Satya Sahoo on 2011-12-08 (public-prov-wg@w3.org from December 2011)

From: Satya Sahoo <satya.sahoo@case.edu>
Date: Wed, 7 Dec 2011 20:59:28 -0500
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Cc: public-prov-wg@w3.org
Message-ID: <CAOMwk6xW6Mn7GbXAoT6p=vs86x8+he5jf89n+g265HmwpSJ+fg@mail.gmail.com>
Hi Luc,
Again apologies about the delayed reply. My responses are interleaved:


>> 5.3.1 Generation
>> =====
>> 1. In PROV-DM, a generation expression is a representation of a world
>> event, the creation of a new characterized thing by an activity. This
>> characterized thing did not exist before creation.
>>
>> Issue: The "characterized thing" in the above statements is Entity or
>> some other resource?
>>
>>
>
> Now, we have defined entity as an identifiable characterized thing. So,
> the statement has become:
>
> In PROV-DM, a generation record is a representation of a world event, the
> creation of a new entity by an activity. This entity did not exist before
> creation. The representation of this event encompasses a description of the
> modalities of generation of this entity by this activity.
>
Ok. I have raised the issue of activity vs. event separately (generation
record as representation of a world event).



>
>  2. contains a generationQualifier q that describes the modalities of
>> generation of this thing by this activity
>>
>> Issue: How is this qualifier distinct from specialization of the
>> generation property?
>>
>>
>
> I think the work on prov-o now answers this question.


Ok, I believe we have covered this through introduction of
qualifiedInvolvement in prov-o, which is different from specialization.

>
>
>  3. The first one is available as the first value on port p1, whereas the
>> other is the second value on port p1.
>>
>> Issue: As we discussed during the telcon on Sept 15 [1] and in the email
>> thread (Subject: Roles, initiated by Paolo on Sept 15), the qualifiers, if
>> any, are on the entity and PE and not on the relation. In the above
>> statement, port p1 is a qualifier for either the entities e1, e2 (they were
>> generated on that particular port) or the PE pe1 (it was using that port
>> for listening/responding). Hence, the qualifiers are on the "class" and not
>> the "relation".
>>
>> [1] http://www.w3.org/2011/prov/meeting/2011-09-15
>>
>>
>
> I think the work on prov-o also answers this comment.
>
Ok, as above, the qualifiedInvolvement work in prov-o covers this now.


>
>  4. If two process executions sequentially set different values to some
>> attribute by means of two different generate events, then they generate
>> distinct entities.
>>
>> Issue: This is an incorrect statement. Setting values of an entity at
>> different points of time cannot be equated to generating new entities. For
>> example, we don't generate a new human being every time a PE changes the
>> value of their age. If pe1 sets Person X age = 5 years in 2005 and pe2 sets
>> Person X age = 10 years in 2010, then they are not generating a new person
>> (within an account or across accounts).
>>
>>
>
> Remember that an entity is a perspective on a thing.
> So, here, we can have multiple perspectives:
>
> e1 Luc
> e2 Luc at age=5
> e3 Luc at age=10
>
> e3 and e2 have the same attribute name, age, but different values. So they
> must be different entities, i.e. perspectives, over human being e1.
>
>
As I have discussed in other mails, interpreting an entity as a perspective
on a thing does not work in an information system, where everything is a
representation of a thing in the world and is interpreted as a thing in the
information system. A thing never enters any information system. Hence, the
distinction between a representation of a thing and the thing itself cannot
be maintained in any information system.

In addition, e1 is Luc, not human being in general. There are 7 billion
human beings, and when we make assertions about a person we use an
identifier to refer to a specific human being. So, the assertions of age=5
and age=10 are being made about Luc and not about human being per se.
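
To make the two readings concrete, here is a rough sketch in Python. It is
only an illustration: the function name must_be_distinct, the dictionary
layout and the "name" attribute are my own assumptions, not PROV-DM or
prov-o syntax.

```python
def must_be_distinct(attrs_a, attrs_b):
    """Perspective reading: two records that share an attribute name
    but disagree on its value cannot describe the same entity."""
    shared = attrs_a.keys() & attrs_b.keys()
    return any(attrs_a[name] != attrs_b[name] for name in shared)


# Hypothetical attribute sets for the three records in Luc's example.
e1 = {"name": "Luc"}
e2 = {"name": "Luc", "age": 5}
e3 = {"name": "Luc", "age": 10}

print(must_be_distinct(e2, e3))  # True: "age" differs
print(must_be_distinct(e1, e2))  # False: no shared attribute conflicts

# Reading argued in this mail: both age assertions are statements about
# one identified person, so no new entity is generated.
assertions = [("Luc", "age", 5), ("Luc", "age", 10)]
subjects = {subject for subject, _, _ in assertions}
print(subjects)  # {'Luc'}
```

Under the first reading e2 and e3 are forced apart; under the second the
same identifier simply carries two age assertions.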


>
>  5. Alternatively, for two process executions to generate an entity
>> simultaneously, they would require some synchronization by which they agree
>> the entity is released for use; the end of this synchronization would
>> constitute the actual generation of the entity, but is performed by a
>> single process execution. Given an entity expression denoted by e, two
>> process execution expressions denoted by pe1 and pe2, and two qualifiers q1
>> and q2, if the expressions wasGeneratedBy(e,pe1,q1) and
>> wasGeneratedBy(e,pe2,q2) exist in the scope of a given account, then
>> pe1=pe2 and q1=q2.
>>
>> Issue: If two sculptors collaborate on creating a human figurine statue
>> entity e1: sculptor A by PE pe1 creates the arms and legs of e1 and
>> sculptor B by PE pe2 creates the head and upper-body part of e1 then both
>> pe1 and pe2 create e1. They may or may not be synchronized. How can we
>> infer that pe1 = pe2 (whether in one account or across accounts)?
>>
>>
>
> I think you've articulated well the case that A and B create different
> parts.  If they do this at different times, you will have
> statue without head, statue with head without leg, statue with head with
> leg.
>
> The constraint with accounts on generation-unicity is enforcing some
> structure in the provenance records, so that if really pe1<>pe2, then
> they should generate the statue in different records.
>

But both of them together generated the statue, which has leg + head. The
fact that two or more distinct activities led to the creation of a single
entity does not mean that, for the sake of the above constraint, the entity
has to be "broken down" and referred to by distinct identifiers. I am
afraid any user or provenance application will find the constraint
unnecessary, as it does not reflect scores of real-world scenarios.
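
As a rough sketch of what the constraint would reject, the check below
flags any entity that has more than one distinct (activity, qualifier) pair
within a single account. The tuple layout, the account name "acc1" and the
qualifier strings are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def unicity_violations(records):
    """records: iterable of (account, entity, activity, qualifier).

    Returns the (account, entity) pairs that have more than one distinct
    (activity, qualifier) generator inside the same account.
    """
    generators = defaultdict(set)
    for account, entity, activity, qualifier in records:
        generators[(account, entity)].add((activity, qualifier))
    return {key: gens for key, gens in generators.items() if len(gens) > 1}


# The two-sculptor scenario, asserted in one (hypothetical) account.
statue_records = [
    ("acc1", "e1", "pe1", "arms-and-legs"),
    ("acc1", "e1", "pe2", "head-and-upper-body"),
]
violations = unicity_violations(statue_records)
print(sorted(violations))  # [('acc1', 'e1')]
```

With the constraint in place, the two-sculptor assertions cannot sit in one
account unless pe1 = pe2, which is exactly the point I am objecting to.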


>
> I am proposing, in the end, to follow Simon's proposal, and move this in
> an entirely different section.
>
>  6. Given an identifier pe for a process execution expression, an
>> identifier e for an entity expression, qualifier q, and optional time t, if
>> the assertion wasGeneratedBy(e,pe,r) or wasGeneratedBy(e,pe,r,t) holds,
>> then the values of some of e's attributes are determined by the activity
>> represented by process execution expression identified by pe and the entity
>> expressions used by pe. Only some (possibly none) of the attributes values
>> may be determined since, in an open world, not all used entity expressions
>> may have been asserted. [PROV:0002]
>>
>> Issue: This constraint is confusing (maybe even contradictory): some or
>> none of the attributes may be determined? Further, there is no
>> specification or mechanism defined to identify which attributes were
>> determined by the PE. The constraint does not provide any new information
>> (even as a constraint) regarding generation.
>>
>>
>
> We have decided to drop this constraint at the last teleconference.


Ok.


>
>  7. If an assertion wasGeneratedBy(x,pe,r) or wasGeneratedBy(x,pe,r,t),
>> then generation of the thing denoted by x precedes the end of pe and
>> follows the beginning of pe.
>>
>> Issue: Suggest rewording this: given the assertion that "an Entity e1 was
>> generated by a PE pe1" then "the Entity e1 did not exist before start of PE
>> pe1".
>>
>>
>>
>>
>>
> This would be an entirely different meaning that is not the same as the
> one intended.
>

Exactly. I am not sure why it is necessary for the generation of x to
precede the end of pe, since they can share the same event or time value.
For example, it is fairly common to state "the car production ended with
the production of car c1 at 10:00am on Dec 7."
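
A small sketch of the difference between the strict reading and the one
where generation may coincide with the end of pe. The function name
generated_during, the year 2011 and the 08:00 start time are assumptions
made for the illustration; only the 10:00am end on Dec 7 comes from the
example above.

```python
from datetime import datetime


def generated_during(start, generated, end, strict=True):
    """Check a generation time against the bounds of its activity.

    strict=True  : generation must fall strictly between start and end.
    strict=False : generation may coincide with the start or the end.
    """
    if strict:
        return start < generated < end
    return start <= generated <= end


production_start = datetime(2011, 12, 7, 8, 0)   # hypothetical start
production_end = datetime(2011, 12, 7, 10, 0)    # 10:00am on Dec 7
car_c1_generated = production_end                # same time value

print(generated_during(production_start, car_c1_generated, production_end))
# False: rejected under the strict reading
print(generated_during(production_start, car_c1_generated, production_end,
                       strict=False))
# True: accepted when the two may share a time value
```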

Thanks.

Best,
Satya


>
> Cheers,
> Luc
>
> --
> Professor Luc Moreau
> Electronics and Computer Science   tel:   +44 23 8059 4487
> University of Southampton          fax:   +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>
>
>
Received on Thursday, 8 December 2011 02:00:00 UTC