Re: PROV-ISSUE-465 (avoid-infinite-inferences): Avoid infinite loops in inferences [prov-dm-constraints]

Hi,

Dropping 1 is not an option for me, and I explain why here:

If we have the following PROV instance:

entity(e1)
entity(e2)
entity(e3)
wasDerivedFrom(e3,e2)
wasDerivedFrom(e2,e1)
wasDerivedFrom(e1,e3)

If we can't infer generation events, then we can't order them, then we 
can't detect violation.
Obviously, this is a simple example to illustrate the problem.

Luc


On 08/08/12 10:47, James Cheney wrote:
> Hi,
>
> I understand your concern.  At a purely technical level, we avoid nontermination, but only by drawing a fine distinction between having an entity(e) statement in the instance, and "knowing e is an entity" (represented by entity' \in typeOf(e)).
>
> We (me, Luc, Paolo, Tom) discussed three ways of avoiding this problem, before finalizing the review copy:
>
> 1.  Drop the entity-generation-invalidation and activity-start-end inferences altogether.
> 2.  Add some limitation to inference (such as your suggestion of not triggering inferences on generated existential variables, or applying the ) that recovers finiteness.
> 3.  [what we have done] demote the type inferences to only infer constraints like 'entity' in typeof(id) , not add new statements like entity(e) to the PROV instance.
>
> Of the three, the one with strongest consensus was (3).  Some of us strongly felt that the e-g-i and a-s-e inferences are needed.  Others, including me, strongly felt that (2) would be a bad idea, as it breaks the connection to logic (i.e., the e-g-i and a-s-e  and may have more radical unforeseen consequences.
>
> We believed that (3) was an acceptable compromise (if a bit hacky), but, I'm not sure how the group as a whole would feel.  That is why I'm laying out the options we considered.
>
> So, my proposal would be to make this distinction clearer (and explain why) so that it does not surprise or bite people...
>
> If you are not happy with that, then I suggest we revert to (1) by dropping the inferences, but then you will have to convince other people that it is controversial.
>
> Further comments below.
>
> --James
>
> On Aug 8, 2012, at 10:10 AM, Stian Soiland-Reyes wrote:
>
>> On Mon, Aug 6, 2012 at 4:47 PM, James Cheney <jcheney@inf.ed.ac.uk> wrote:
>>> This is not true, and the solution proposed isn't necessary (nor would it be sufficient to restore termination, if termination were a problem).
>>>
>>> We had a long internal discussion before releasing the review draft about how to avoid this.  My preference would be to get rid of inferences 7 and 8.  Instead, we agreed to weaken the type inferences so that they don't infer new entity() or activity() facts.
>> I'm sorry, that reasoning is not good enough for me alone, as we don't
>> have "an entity statement" (yes, that old discussion again..) in DM,
>> simply "an entity":
>>
>>> An entity ◊, written entity(id, [attr1=val1, ...]) in PROV-N, has:
>> wasStartedBy has:
>>> trigger: an optional identifier (e) for the entity triggering the activity;
>> Thus, if the trigger identifier is there, it's an entity.
>>
> This is fine; it means that 'entity' \in typeOf(e) holds, but this is not the same as entity(e) being added to the instance.
>
>> Inference 8 says:
>>
>> IF activity(a,t1,t2,_attrs) THEN there exist _id1, _e1, _id2, and _e2
>> such that wasStartedBy(_id1;a,_e1,_a1,t1,[]) and
>> wasEndedBy(_id2;a,_e2,_a2,t2,[]).
>>
>> so _e1 exists, it even says so! And by what we see above it's an
>> entity, nothing else is allowed (unless we add the previously
>> discussed - for blank!).
>>
>> Hence, at least in my mind, this implies:
>>
>> entity(_e1)
>>
>> and inference 7 triggers.
> This does not follow from PROV-CONSTRAINTS as written.  There is no inference that allows us to infer entity(_e1) from  wasStartedBy(_id1;a,_e1,_a1,t1,[]).  All we know is that there is *something* called _e1 (and by type constraints, it will have to satisfy 'entity' in typeOf(_e1)).
>
> The type constraints *would*
>
> There is naturally no guarantee that every entity that exists is described by a matching entity(e) statement in a given PROV instance.  Moreover, we can know that something exists but not be sure if it is an entity.  For example we could know
>
> agent(e)
>
>
>
>>
>> I see what you argue, because
>> http://dvcs.w3.org/hg/prov/raw-file/tip/model/releases/ED-prov-constraints-20120723/prov-constraints.html#type-constraints
>> does not specify the inference entity(e), just some typeOf(e)
>> relation, and there are no inference rules that imply an entity()
>> *statement*.
>>
>> This implies that one way to avoid an entity to be  generated and
>> invalidated is simply to skip the entity() statements (and equivalent
>> skip activity to avoid wasStartedBy and wasEndedBy). I think I need to
>> read the whole specification again if this is the kind of
>> interpretation you really mean. This should then also be stated very
>> early on, as it is very unexpected to me.
> Yes, that is exactly the distinction we are relying on to avoid infinite loops.  The fact that it is not well explained is probably an artifact of the fact that this change was the thing holding up the release...
>
>> It also means there could be implied activities without wasStartedBy
>> and wasEndedBy, and I am worried that those might then be using and
>> generating things in impossible orders, because the time constraints
>> rely on wasStartedBy/wasEndedBy/wasGeneratedBy/wasInvalidatedBy.
>>
>
> OK, but that is fine: we are not claiming to prevent all impossible things with validity, only that the things we do rule out are definitely nonsense.
>
> (By the same argument, though, if these inferences are both controversial and pose technically complicated, then I think it's best to drop them.  I don't know where the consensus of the group is, though.)
>
>>
>>> It's true that you can keep applying any existential rule forever as its hypotheses stay true forever, but this is not a problem since we only apply such a rule (generating a new existential variable) if the RHS is not already satisfiable.  If it were a problem, it would be a problem for all of our (existential) inferences.
>> Yes, but that would be the case for inference rule 7 and 8, because of
>> 'there exists'.
>>
> As I've tried to explain, the key distinction is that "there exists x such that ..." does not tell us anything about statements that hold of x, other than what we say inthe conclusion.  Since we don't say
>
> Perhaps another way of thinking of it is that when we say entity(e), we are (by inf 7 & 8) also asserting that e is an entity with some existing generation and invalidation events, whereas when we say 'entity' in typeOf(e), we are just saying that we know it's an entity, but we don't claim its generation and invaliation events.
>
>
> --James

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Wednesday, 8 August 2012 10:08:07 UTC