PROV-ISSUE-138 (collection-collision): Collection does not describe multiple additions/replacements [Data Model]

http://www.w3.org/2011/prov/track/issues/138

Raised by: Stian Soiland-Reyes
On product: Data Model


http://www.w3.org/TR/prov-dm/#expression-Collection introduces relations for expressing collection modifications:


Expression: wasAddedTo_Key(c,k) (resp. wasRemovedFrom_Key(c,k)) denotes that collection c had a new value with key k added to (resp. removed from) it.


It is not clear what happens if a second addition adds a value under a key that is already present.

Imagine:

  wasAddedTo_Coll(c2,c1)
  wasAddedTo_Key(c2,k1)
  wasAddedTo_Entity(c2,e1)

  wasAddedTo_Coll(c3,c2)
  wasAddedTo_Key(c3,k1)
  wasAddedTo_Entity(c3,e2)

It is clear that c3 contains (k1, e2). Does it also contain (k1, e1)?

I understand this is meant as a general collection, and the interpretation may well depend on the specific type of collection. However, that means that without knowing the type of collection you can't tell whether e1 is contained in c3 or not.
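
To make the ambiguity concrete, here is a minimal Python sketch that replays the two additions above under two possible readings. The tuple encoding of the assertions and the function names replay_as_map and replay_as_multimap are assumptions made for illustration only; none of them come from PROV-DM.

```python
# Each addition is encoded as (new_collection, old_collection, key, entity).
# This encoding is an illustrative assumption, not PROV-DM syntax.
additions = [
    ("c2", "c1", "k1", "e1"),
    ("c3", "c2", "k1", "e2"),
]


def replay_as_map(additions):
    """Reading A: a key holds one value; re-adding a key replaces it."""
    contents = {}
    for new, old, key, entity in additions:
        state = dict(contents.get(old, {}))  # copy the source collection's state
        state[key] = {entity}                # overwrite whatever was under the key
        contents[new] = state
    return contents


def replay_as_multimap(additions):
    """Reading B: a key may hold several values; re-adding accumulates."""
    contents = {}
    for new, old, key, entity in additions:
        state = {k: set(v) for k, v in contents.get(old, {}).items()}
        state.setdefault(key, set()).add(entity)
        contents[new] = state
    return contents


map_view = replay_as_map(additions)
multimap_view = replay_as_multimap(additions)

print("e1" in map_view["c3"]["k1"])       # False under reading A
print("e1" in multimap_view["c3"]["k1"])  # True under reading B
```

The same three assertions per step give opposite answers to "does c3 contain (k1, e1)?", which is the problem described above.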


I believe we should allow neither automatic replacement nor multiple values for the same key. So asserting:

  wasAddedTo_Coll(c2,c1)
  wasAddedTo_Key(c2,k1)
  wasAddedTo_Entity(c2,e1)


implies that c1 does *not* contain k1, but *might* contain e1 under a different key.


Suggested:
wasAddedTo_Coll(c2,c1); wasAddedTo_Key(c2,k1) asserts that c2 now contains the key k1, whereas c1, the collection from which c2 is derived, did not contain k1.

A second addition of the same key without an intermediate removal is not valid in PROV. If the specific type of collection performs replacement by key, such as a dictionary/map/hashtable, the asserter will need to model an intermediate wasRemovedFrom_Coll(intermediate, original) followed by wasAddedTo_Coll(new, intermediate). If the specific type of collection holds multiple values per key, the value should instead be asserted as a new nested collection of those values.
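
The suggested rule could be checked mechanically. The following is only a sketch under stated assumptions: the step encoding, the CollisionError class, the replay function and the intermediate collection name "c2a" are all hypothetical, and treating removal of an absent key as a no-op is my own choice, since the proposal does not address it.

```python
class CollisionError(ValueError):
    """Raised when a key is added although the source collection already has it."""


def replay(steps):
    """Replay (op, new, old, key, entity) steps under the proposed rule.

    op is "add" or "remove". Returns {collection_name: {key: entity}}.
    """
    contents = {}
    for op, new, old, key, entity in steps:
        state = dict(contents.get(old, {}))  # new collection starts as a copy of old
        if op == "add":
            if key in state:
                # Proposed rule: no implicit replacement, no second value per key.
                raise CollisionError(f"{old} already contains key {key}")
            state[key] = entity
        elif op == "remove":
            state.pop(key, None)  # assumption: removing an absent key is a no-op
        else:
            raise ValueError(f"unknown operation: {op}")
        contents[new] = state
    return contents


# Direct re-addition of k1: invalid under the proposal.
invalid = [
    ("add", "c2", "c1", "k1", "e1"),
    ("add", "c3", "c2", "k1", "e2"),
]

# Replacement modelled with an intermediate removal: valid.
valid = [
    ("add", "c2", "c1", "k1", "e1"),
    ("remove", "c2a", "c2", "k1", None),
    ("add", "c3", "c2a", "k1", "e2"),
]

try:
    replay(invalid)
except CollisionError as err:
    print("rejected:", err)

print(replay(valid)["c3"])
```

With this rule the question "does c3 contain (k1, e1)?" has a single answer regardless of the collection type: the first sequence is rejected outright, and in the second one c3 holds only e2 under k1.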

Received on Saturday, 29 October 2011 23:46:10 UTC