SpecializationAlternateDefinitions

From Provenance WG Wiki

Attempted Definitions of entity/specialization/alternate of

The aim is to come up with:

  • 3 short informal definitions for entity, specialization, alternate
  • compatible with prov-dm

This means that we don't want radical changes to the definition of entity, and we don't want to introduce a separate concept of "thing".

Definitions 1

To kick off, current definitions in DM:

  • An entity is a physical, digital, conceptual, or other kind of thing; entities may be real or imaginary.
  • An entity is a specialization of another if they refer to some common thing but the former is a more constrained entity than the latter. The common thing does not need to be identified.
  • An entity is alternate of another if they are both a specialization of some common entity. The common entity does not need to be identified.


Rejected because an entity does not refer to a thing.


Definitions 2

From Jim and Tim:

  • "An entity is a specialization of another if they describe some common thing but the former is a more constrained entity than the latter. The common thing does not need to be identified."


Feels like an entity is a description for a thing, while the prov-dm definition of entity says it's a thing.


Definitions 3

From GK:

  • "specializationOf_1: To express when the description of one entity is a more specific version of the same thing as the description of another entity."


Using the term 'version' is confusing in the context of provenance.

Definitions 4

From James:

  • "alternateOf: To express when one Entity is an aspect of the same Thing as another Entity."
  • "specializationOf: To express when one of two alternate Entities is a more specific aspect of the Thing they are both based on."


How do we reconcile that an entity is a thing and an aspect of a thing?

James: I don't think this is a problem [1]. See Definitions 8.

Definitions 5

From Tom:

  • An entity is a thing one wants to provide provenance for and whose situation in the world is described by some attribute-value pairs. For the purpose of this specification, things can be physical, digital, conceptual, or otherwise. An entity's attribute-value pairs are specified when the entity description is created and remain unchanged.

This further simplifies the definition of alternateOf and specializationOf. Then I would propose something like this:

  • alternateOf: Two entities are alternates if they are the same thing, but their situation in the world is described by different attribute-value pairs.
  • specializationOf: An entity is a specialization of another entity if they are the same thing, and the description of the situation in the world of the former includes all of the attribute-value pairs of the latter, and at least one more.
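
Under Definitions 5, both relations reduce to set comparisons on attribute-value pairs. The following is a minimal Python sketch of that reading, not part of PROV-DM. It assumes each entity's pairs are held in a dict with hashable values, and that the "same thing" condition has already been established by other means. The function names and the sample attributes are illustrative only.

```python
def is_specialization(specific: dict, general: dict) -> bool:
    """True if `specific` has every pair of `general` plus at least one more.

    Assumes attribute values are hashable so pairs can be placed in a set.
    """
    # Proper-subset test: all general pairs present, and specific has extras.
    return set(general.items()) < set(specific.items())


def is_alternate(a: dict, b: dict) -> bool:
    """True if the two descriptions use different attribute-value pairs."""
    return set(a.items()) != set(b.items())


# Hypothetical example entities.
james = {"name": "James"}
james_2001 = {"name": "James", "year": 2001}

print(is_specialization(james_2001, james))
print(is_alternate(james, james_2001))
```

Because the comparison is a proper-subset test, an entity is never a specialization of itself under this reading, which is one way the proposal sidesteps the reflexivity question.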

Pros:

  • almost completely independent of semantics
  • fixes a lot of issues concerning reflexivity/transitivity

Cons:

  • danger of coupling entities too strongly to their attribute-value pairs. This would leave less room to simply assert entities and alternates/specializations without them (which is probably why they are in the constraints now).
  • "if they are the same thing" is still vague


Definitions 6

(Tom) An attempt at compromise:

DM:

  • An entity represents a thing one wants to provide provenance for, in a certain situation in the world. For the purpose of this specification, things can be physical, digital, conceptual, or otherwise.
  • An entity is a specialization of another entity if they represent the same thing, and the former entity has a more specific situation in the world than the latter.
  • Two entities are alternates if they represent the same thing, with a different situation in the world.

CONSTRAINTS:

  • The situation in the world of an entity is described by a set of attribute-value pairs. An entity's attribute-value pairs are specified when the entity description is created and remain unchanged.
  • When an entity is a specialization of another entity, the more specific situation in the world of the former SHOULD be described by including all attribute-value pairs of the latter, and at least one more.
  • When two entities are alternates of each other, each entity's situation in the world SHOULD be described by at least one attribute-value pair that is not used to describe the other entity's situation in the world.

Pros:

  • general, informal definitions in the DM, leaving room for interpretation of the implementers
  • the constraints make discussion about reflexivity/transitivity impossible
  • requires minimal changes to both documents, and even supports the former definition of alternate, being "two entities are alternates if they specify some common thing"

Cons:

  • "represents" makes it sound like entities are records rather than parts of things. --James
  • In the definition of alternate, I see no reason why alternate should require that the situations are different (or that their recorded attributes differ). --James
  • Reference to attribute pairs in the definitions makes it sound like we are defining specialization/alternate in terms of the records. I agree that some of these constraints should be in PROV-DM-CONSTRAINTS somewhere (they aren't currently) - maybe as inferences rather than constraints? --James

Definitions 7

(Paul) A revision of Tom's Definition 6

DM:

  • An entity is a physical, digital, conceptual, or other kind of thing situated in the world. Entities may be real or imaginary.
  • An entity is a specialization of another entity if they are the same thing, and the former entity has a more specific situation in the world than the latter.
  • Two entities are alternates if they are the same thing, with a different situation in the world.

CONSTRAINTS:

  • The situation in the world of an entity is described by a set of attribute-value pairs. An entity's attribute-value pairs are specified when the entity description is created and remain unchanged.
  • When an entity is a specialization of another entity, the more specific situation in the world of the former SHOULD be described by including all attribute-value pairs of the latter, and at least one more.
  • When two entities are alternates of each other, each entity's situation in the world SHOULD be described by at least one attribute-value pair that is not used to describe the other entity's situation in the world.

Pros:

  • same pros as definition 6 without introducing the notion of "representation" in the definition of entity.

Cons:

  • Saying "are the same thing" confuses whole/part. How can "James on April 15th, 2001" be the same thing as "James between 1990 and 2010"? --James
  • As with def. 6, I don't see a reason to define the relations in terms of attribute-value pairs. --James


Definitions 8

Revised from def #4 & #6/7. --James

  • An entity is a fixed aspect of a physical, digital, conceptual, or other kind of thing situated in the world. Entities may be real or imaginary.
  • An entity is a specialization of another entity if they are aspects of the same thing, and the former entity is a more specific aspect of the thing than the latter.
  • Two entities are alternates if they are aspects of the same thing, with a possibly different situation in the world.

CONSTRAINTS:

  • The situation in the world of an entity is described by a set of attribute-value pairs. An entity's attribute-value pairs are established when the entity description is created and remain unchanged.
  • When an entity is a specialization of another entity, the interval of the more specific entity SHOULD be contained in the interval of the less specific one, and all of the attributes of the less specific entity SHOULD agree with those of the more specific one.
  • When two entities are alternates of each other and their intervals overlap, any shared attributes of the two entities SHOULD be equal.

Questions:

  • Q1. Is the SHOULD a constraint, or inference? i.e. if specializationOf(James-in-2001,James) holds and I say that James has haircolor = blue and I don't say what James-in-2001's haircolor is then do we get to infer that James-in-2001's haircolor is blue, or do we reject the provenance as invalid because the attributes don't match?
  • Q2. We can easily make specialization strict (never reflexive) by saying the two related entities have to be different, without saying anything about attributes in a particular description.
  • Q3. Likewise, we can easily make alternate irreflexive by saying that the two related entities have to be different, without saying anything about the attributes in a particular description.

Answers:

  • Q1.: I would say you can infer it. You make a valid point that it would be very impractical to expect all asserters to specify every attribute of an entity they want to assert a specialization of. The constraint lies in the fact that they shouldn't assert a conflicting attribute-value pair, for example that James-in-2012 has green hair (which is invalid if it is a specialization of James and James has blue hair). --Tom
  • Q2.: Doesn't this make things a lot more vague and open for discussion again? I agree that we could leave attributes out of the definition, but we should at least mention something about them in the constraints to anticipate questions of future implementers. Perhaps something less constrained than what I proposed? --Tom
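
The reading suggested in the answer to Q1 (infer missing attributes, reject conflicting ones) can be sketched as follows. This is only an illustration under that interpretation; `infer_attributes` and the dict representation of attributes are assumptions, not anything defined by PROV-DM.

```python
def infer_attributes(specific: dict, general: dict) -> dict:
    """Return the attributes of `specific` after inheriting from `general`.

    Attributes missing on the specialization are inferred from the less
    specific entity; a conflicting value is treated as invalid provenance.
    """
    conflicts = {
        key
        for key in specific.keys() & general.keys()
        if specific[key] != general[key]
    }
    if conflicts:
        raise ValueError(f"conflicting attributes: {sorted(conflicts)}")
    # General attributes first, then the specialization's own on top.
    return {**general, **specific}


# Hypothetical data following the haircolor example in Q1.
james = {"haircolor": "blue"}
james_in_2001 = {"year": 2001}
print(infer_attributes(james_in_2001, james))
```

Asserting `{"haircolor": "green"}` for the specialization would raise `ValueError` here, matching the case described as invalid in the answer.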

Pros:

  • I like the introduction of "possibly" into the definition of alternate. Makes the more intuitive case of reflexivity still possible. --Tom

Cons:

  • Same concern Jun raised in the mailing list. For non-native speakers, statements like "is an aspect of" and "should agree with" are a bit confusing. I do believe, however, that we're on the right track with these definitions. Perhaps just phrased differently? Then again, an example at the definition of entity might clarify this as well, and then we can use this terminology throughout the entire document. --Tom


Definitions 9

Another proposal, revised from cons #6 + def #8 --Tom

I tried coming up with alternatives, but I have to agree that "aspect of" is the best possible way of saying "is a thing" or "is part of a thing" or "is a feature(s) of a thing" all in one term. So these definitions by James get a +1 from me.

Actually, the only changes I propose are to the spec/alt constraints, with the remarks of def #8 incorporated:

  • When an entity is asserted as a specialization of another entity, the only constraint is that the entities SHOULD be different from one another. However, the interval of the more specific entity SHOULD be contained in the interval of the less specific one, and none of the attribute-value pairs of the former SHOULD conflict with those of the latter.
  • Alternate entities can be different or the same. However, when their intervals overlap, there SHOULD be no conflicting attribute-value pairs.

Remarks?

Attempt to state constraints formally

Specialization is a strict partial order:

  • specializationOf(x,y) and specializationOf(y,x) is impossible (asymmetry)
  • specializationOf(x,y) and specializationOf(y,z) implies specializationOf(x,z) (transitivity)
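
One way to check a set of asserted specializationOf pairs against these two rules is to compute the transitive closure and then look for an entity that ends up specializing itself. The sketch below assumes the assertions are given as (x, y) tuples; the function names are illustrative.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing `pairs` (a set of (x, y) tuples)."""
    closure = set(pairs)
    while True:
        # Join every (a, b) with every (b, d) to derive (a, d).
        derived = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if derived <= closure:
            return closure
        closure |= derived


def is_strict_partial_order(pairs) -> bool:
    """True if the closure of `pairs` never relates an entity to itself.

    With transitivity in place, a pair of mutual specializations would
    derive (x, x), so this single check also rules out that case.
    """
    return all(x != y for (x, y) in transitive_closure(pairs))


print(is_strict_partial_order({("a", "b"), ("b", "c")}))
print(is_strict_partial_order({("a", "b"), ("b", "a")}))
```

The closure loop is quadratic per pass and intended only for small examples.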

Alternate is an equivalence relation:

  • alternateOf(x,x) holds for any entity x (reflexivity)
  • alternateOf(x,y) implies alternateOf(y,x) (symmetry)
  • alternateOf(x,y) and alternateOf(y,z) implies alternateOf(x,z) (transitivity)

Specialization implies alternate:

  • specializationOf(x,y) implies alternateOf(x,y)
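
If alternate is an equivalence relation and every specialization is also an alternate, the asserted statements partition the entities into classes of mutual alternates. A small union-find sketch of that consequence follows; the function names and the tuple representation are assumptions made for illustration.

```python
def alternate_classes(entities, alternate_pairs, specialization_pairs):
    """Group entities into equivalence classes of alternates.

    Every entity mentioned in a pair must also appear in `entities`.
    """
    parent = {e: e for e in entities}

    def find(e):
        # Walk to the class representative, halving the path on the way.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    # Specialization implies alternate, so both kinds of pair merge classes.
    for x, y in list(alternate_pairs) + list(specialization_pairs):
        parent[find(x)] = find(y)

    classes = {}
    for e in entities:
        classes.setdefault(find(e), set()).add(e)
    return list(classes.values())


def alternate_of(x, y, classes) -> bool:
    """True if x and y fall in the same class (reflexive and symmetric by construction)."""
    return any(x in c and y in c for c in classes)


classes = alternate_classes(["a", "b", "c", "d"], [("a", "b")], [("c", "b")])
print(alternate_of("a", "c", classes))
```

Here `a` and `c` become alternates only through `b`, which illustrates how transitivity can relate entities that were never directly asserted to be alternates.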

Event ordering:

  • If specializationOf(x,y) and wasGeneratedBy(gx,x,a1) and wasGeneratedBy(gy,y,a2) then gy precedes gx
  • If specializationOf(x,y) and wasInvalidatedBy(ix,x,a1) and wasInvalidatedBy(iy,y,a2) then ix precedes iy
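
Reading the event-ordering rules as interval containment (the less specific entity y is generated no later than x and invalidated no earlier), a check could look like the sketch below. It assumes events carry comparable numeric timestamps, treats "precedes" as non-strict, and skips any rule whose events are unknown. All of these are assumptions for illustration rather than part of the proposal.

```python
from typing import Optional


def event_ordering_ok(
    gen_x: Optional[float],
    inv_x: Optional[float],
    gen_y: Optional[float],
    inv_y: Optional[float],
) -> bool:
    """Check the ordering rules for specializationOf(x, y).

    `None` means the event was not recorded; a rule applies only when
    both of the events it mentions are known.
    """
    generation_ok = gen_x is None or gen_y is None or gen_y <= gen_x
    invalidation_ok = inv_x is None or inv_y is None or inv_x <= inv_y
    return generation_ok and invalidation_ok


# Hypothetical timestamps: y spans 1990-2010, x spans 2001-2002.
print(event_ordering_ok(2001, 2002, 1990, 2010))
```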

Shared attributes of overlapping alternates are equal:

  • If alternateOf(x,y) and entity(x,attrs1) and entity(y,attrs2) and wasGeneratedBy(gx,x,a1) and wasInvalidatedBy(iy,y,a2) and gx precedes iy, then if attr=val1 is in attrs1 and attr=val2 is in attrs2 then val1=val2.
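
The shared-attribute rule can be sketched as below. The sketch assumes that "overlapping" means each entity is generated before the other is invalidated, which is a symmetric strengthening of the single `gx precedes iy` condition written above. `EntityRecord` and its fields are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    generated: float      # timestamp of the generation event
    invalidated: float    # timestamp of the invalidation event
    attrs: dict = field(default_factory=dict)


def overlaps(x: EntityRecord, y: EntityRecord) -> bool:
    """True if the two lifetimes share at least one instant."""
    return x.generated <= y.invalidated and y.generated <= x.invalidated


def alternates_consistent(x: EntityRecord, y: EntityRecord) -> bool:
    """Overlapping alternates must agree on every attribute they both carry."""
    if not overlaps(x, y):
        return True
    shared = x.attrs.keys() & y.attrs.keys()
    return all(x.attrs[key] == y.attrs[key] for key in shared)


# Hypothetical alternates with overlapping lifetimes and one shared attribute.
x = EntityRecord(1990, 2010, {"haircolor": "blue", "name": "James"})
y = EntityRecord(2001, 2002, {"haircolor": "blue"})
print(alternates_consistent(x, y))
```

Attributes present on only one of the two entities are ignored, since the rule constrains shared attributes only.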

Attributes of less specific aspects are shared by their specializations:

  • If specializationOf(x,y) and entity(x,attrs1) and entity(y,attrs2) and attr=val is in attrs2 then attr=val is also in attrs1.


--James


Definitions 10 (Tom summing up)

  • An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects. Entities may be real or imaginary.
  • An entity is a specialization of another entity if the former shares all aspects of the latter, and additionally, the former entity provides more specific aspects of the same thing as the latter.
  • Two entities are alternates if they specialize a common entity.

Or a bit broader:

  • Two entities are alternates if they share some aspects of the same thing.

(I have no particular preference about which is better)

CONSTRAINTS

  • The situation in the world of an entity is described by a set of attribute-value pairs. An entity's attribute-value pairs are established when the entity description is created and remain unchanged.
  • When an entity is a specialization of another entity, the interval of the more specific entity SHOULD be contained in the interval of the less specific one, and none of the attributes of the more specific entity SHOULD conflict with those of the less specific one.
  • When two entities are alternates of each other and their intervals overlap, any shared attributes of the two entities SHOULD be equal.

Definitions 11

(#10 modified to address James' concerns)

  • An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects. Entities may be real or imaginary.
  • Two alternate entities have equal or different aspects of the same thing.
  • An entity that is a specialization of another entity shares all aspects of the latter, and additionally provides more specific aspects of the same thing as the latter.

Definitions 12

  • An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects. Entities may be real or imaginary.
  • Two alternate entities have equal or different aspects of the same thing.
  • An entity that is a specialization of another entity shares all aspects of the latter, and additionally has more specific aspects of the same thing as the latter.

Definitions 13 (refinement of 12)

The following are (I think) equivalent to the above but hopefully a little clearer, particularly how the lifetime "aspects" are related.

  • An entity is a physical, digital, conceptual, or other kind of thing with some fixed aspects. Entities may be real or imaginary. (same as above)
  • Two alternate entities present aspects of the same thing. These aspects may be the same or different, and the alternate entities may or may not overlap in time. (changed "has" to "presents"; rearranged & added time)
  • An entity that is a specialization of another entity shares all aspects of the latter, and additionally presents more specific aspects of the same thing as the latter. In particular, the lifetime of the entity being specialized contains that of any of its specializations. (changed "has" to "presents"; rearranged & added time)