This wiki has been archived and is now read-only.


From Provenance WG Wiki



ISSUE-532 (Role)

ISSUE-525 (Specialization/Alternate)

ISSUE-507 (Inverse Relations)

ISSUE-504 (collection/bundle)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0094.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/504
  • Group Response
    • It is correct that: A bundle is a named set of provenance descriptions (2.2.2). It is also correct that section 2.2.3 indicates that many types of collections exist, including sets. However, section 2.2.3 states: A collection is an entity that provides a structure to some constituents, which are themselves entities.
    • In PROV, provenance descriptions are not given identifiers and are not regarded as entities. Identifiers occurring in provenance descriptions denote "things in the world" (resources).
    • To be able to talk about the provenance of PROV descriptions, the bundle construct allows a set of descriptions to be named, and become a "thing in the world". Such a bundle is an entity whose provenance can then be described.
    • In conclusion, a PROV bundle is not a PROV collection.
    • In response to the follow-on message from the reviewer, there is no support in the Working Group for adding identifiers to individual PROV statements, since this would result in a proliferation of identifiers. In practice, a given asserter would typically assert multiple statements: subject, relation, object. It feels more appropriate to give all these statements an id (by means of a bundle), and express their provenance.

ISSUE-503 (adopt plan)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0093.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/503
  • Group Response
    • The expanded relationship wasAssociatedWith allows for plan to be specified (plan attribute http://www.w3.org/TR/prov-dm/#association.plan).
    • It is not entirely clear what the semantics of the suggested wasAdoptedBy would be:
      • If it is a form of influence by which an agent was influenced by a plan, this can be expressed by a subtype of derivation wasDerivedFrom(ag,pl)
      • Alternatively, if it is an influence of the plan by the agent, this can be expressed by a subtype of attribution wasAttributedTo(pl,ag)
      • If it is not an influence, a given application could define, in OWL terminology, a property chain wasAdoptedBy=agent o inverse(hadPlan)
    • The above discussion shows that PROV provides core building blocks that allow a relation such as wasAdoptedBy to be defined.
    • Hence, there is no need for a separate wasAdoptedBy relation.
    • Following the reviewer's follow-on message, the group has been very careful about introducing relationships that are not influence. Specialization/Alternate/Membership are special cases given their prevalence in provenance.
    • Other relations along the lines of "adopting a plan" could potentially be considered, such as "rejecting a plan" or "abandoning a plan". It feels that such a relation is not primitive but could be expressed in terms of an activity (to adopt, to reject, to abandon) and a used plan.
  • Suggested change: Replace "To illustrate expanded relations, we consider the concept of association, described in Section 2.1.3." by "To illustrate expanded relations, we revisit the concept of association, introduced in Section 2.1.3 (full definition of the expanded association can be found in section 5.3.3)."
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0027.html
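The property chain mentioned in the response above can be sketched in a few lines of Python. This is only an illustration and not part of PROV: the `Association` record and the `was_adopted_by` helper are hypothetical names, and the chain is read as "a plan was adopted by an agent if some association names both".

```python
from typing import Iterable, NamedTuple, Optional, Set, Tuple


class Association(NamedTuple):
    """Hypothetical record for an expanded wasAssociatedWith(a, ag, pl)."""
    activity: str
    agent: str
    plan: Optional[str] = None  # the plan is optional in an association


def was_adopted_by(associations: Iterable[Association]) -> Set[Tuple[str, str]]:
    """Derive (plan, agent) pairs, i.e. agent o inverse(hadPlan)."""
    return {(s.plan, s.agent) for s in associations if s.plan is not None}


associations = [
    Association("ex:a1", "ex:alice", "ex:recipe"),
    Association("ex:a2", "ex:bob"),  # no plan, so nothing is derived
]
adopted = was_adopted_by(associations)
print(adopted)
```

Under this reading no new primitive is needed: the derived pairs are computed entirely from existing associations.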

ISSUE-447 (subactivity)

ISSUE-492 (typo in example)

ISSUE-500 (activity hierarchy)

ISSUE-505 (prov-n notation)

ISSUE-508 (Table 5)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0098.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/508
  • Group Response
    • The text indeed required clarification: "core structures have their names and parameters highlighted in bold in the second column (prov-n representation); expanded structures are not represented with a bold font."
    • Indentation of subconcepts had been considered by the editors. While it appears beneficial to see Revision, Quotation, and Primary Source indented below Derivation, this would lead to confusion elsewhere in the table:
      • Plans (in component 3) are subtype of Entity, but entities belong to component 1. Indenting Plan under another concept would therefore be misleading.
      • Person/Organization/SoftwareAgent could be indented below agents. However, our preference is to list core structures first, before expanded structures.
      • Finally, Influence could be seen as a super-relation of many relations, but, again, they are spread across components, and Influence is regarded as an expanded structure.
    • Overall, there are multiple, conflicting ways of organizing table 5. We feel that this order of structures allows components to be exposed and core structures to be presented first, without attempting to expose a hierarchy of types, which would require an entirely different layout.
    • PROV-DM follows the syntax specified by PROV-N. Regarding the style of encoding of attributes, this issue is already raised against the PROV-N document (issue-533).
  • References:
  • Implemented changes: http://dvcs.w3.org/hg/prov/diff/47d79e48cb4c/model/prov-dm.html
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0006.html

ISSUE-531 (Multiple location)

ISSUE-528 (MentionOf)

ISSUE-517 (Revision/Quotation)

ISSUE-501 (DrivingACarToBoston)

ISSUE-516 (DerivationAsBundle)

ISSUE-514 (Starter/EnderActivity)

ISSUE-513 (StartSubActivity)

ISSUE-511 (UsageSubActivity)

ISSUE-510 (GenerationSubActivity)

ISSUE-512 (FinePayingExample)

ISSUE-497 (Figure 1)

ISSUE-515 (Invalidation)

entity(e1, [ex:available="yes"])
entity(e2,  [ex:available="yes"])
    • The above example shows that e has some aspects that remain constant during its lifetime (e.g. its identity), but is also allowed to have other aspects that change over time. These changing aspects cannot be expressed as attributes.
    • There is no requirement for asserters to assert invalidation of entities
    • Given this, the Working Group feels that the concern raised by the author is not applicable. Entities may have long lifespan, provided that they have some aspects, represented as attributes, that do not change over that lifespan. Other aspects are allowed to change. As a minimum, an entity must have a fixed identity during its lifetime.
    • As far as a new section on state is concerned, the Working Group has made a decision to leave this kind of material outside the prov-dm document. Some of this is actually covered in prov-constraints.
  • In the follow-on message, the reviewer discusses the traffic light example. As the light changes from red to green, the red traffic light is invalidated and the green traffic light is generated. Both are specializations of the traffic-light, which continues its existence across this change of state, since color is not one of its attributes.
entity(ex:green-traffic-light, [ex:color="green"])
entity(ex:red-traffic-light, [ex:color="red"])
specializationOf(ex:green-traffic-light, ex:traffic-light)
specializationOf(ex:red-traffic-light, ex:traffic-light)

ISSUE-530 (attributes)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0120.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/530
  • Group Response
    • The group has given careful consideration to attributes in prov-dm, specifically time, location and role.
    • The group could not reach consensus to allow these attributes to apply to more concepts of the data model. The challenge is not to add the attribute to a concept, but to find an interpretation of that attribute, which fits the rest of the model.
    • Role:
      • We have already elaborated on roles in our response to ISSUE-532.
    • Location:
      • While a notion of location is fairly intuitive for an activity or entity, it is less intuitive for associations for instance. In an association, the activity may have a location, and the agent may have a location. It is however unclear what the location of the association itself may be.
    • Time:
      • The same comments apply for time. However, in this case, the constraints document explains what kind of ordering constraints exist, between an agent and activity, for instance.
      • Furthermore, as explained in detail in prov-constraints, time information is connected to a unique event. The Working Group has not defined, for instance, an event for the start of an association, and an event for its end. It is not clear why such event types would be required, when activity start and end could be used to that end, and the association be represented by an activity, holding for some time interval.
    • So overall, the group could not find consensus to broaden these attributes to other relations in a meaningful manner. Particular implementations, using the PROV extension mechanism, are however able to add similar attributes for their specific needs.
    • In response to the follow-on message, the Working Group, as it wraps up its activities, will consider follow-on activities, and mechanisms for the community to share information. The Semantic Web wiki is a starting point.
  • References:
  • Proposed changes: none
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0006.html

ISSUE-520 (Person/Organization/SoftwareAgent)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0110.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/520
  • Group Response:
    • The reason why the WG introduced agents in the PROV model is to be able to assign responsibility for an activity taking place, for the existence of an entity, or for another agent's activity.
    • For interoperability reasons, the WG also believed it is useful to define commonly encountered types of agents: Person, SoftwareAgent, and Organization. Agents of type prov:Person are people responsible for something; agents of type prov:SoftwareAgent are running software responsible for something; etc.
    • The reason why an instance of prov:Agent is allowed to be also a prov:Entity is because we may want to talk about its provenance, how it was generated or derived, etc.
    • Given this:
      • it is not appropriate to make Person/SoftwareAgent/Organization subtypes of Entity in PROV, since entities by default do not bear responsibility in the PROV model. It is the notion of prov:Agent that carries responsibility, in PROV
      • it is possible to define an instance as both a prov:Person and a prov:Entity, when we want to express it is responsible for something, and we want to express its provenance.
    • If one wishes to introduce a type of person, as an entity, without associating any responsibility, then there are ontologies, outside PROV, which allow for that. FOAF concepts such as foaf:Person, foaf:Organization may be relevant. With these, one can write entity(e, [prov:type='foaf:Person'])
  • References:
  • Proposed changes:
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0006.html

ISSUE-522 (Activity Delegation)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0112.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/522
  • Group Response
    • Our response to ISSUE-521 partly addresses this issue.
    • PROV delegations are not temporal relations. Instead, prov-constraints defines ordering constraints that are implied by delegations: the responsible agent has to precede or have some overlap with the subordinate agent.
    • If, in an application, it is necessary to express that a delegation takes place over an interval (evt1-evt2) and is followed by a delegation during the interval (evt2-evt3), a possible way to model this in PROV is as follows:
      • One may model this scenario with two activities, one for the first interval and one for the second interval, and two relations actedOnBehalfOf, one for each activity.
    • It is true that, in a delegation, activity is optional. The reviewer suggests "Therefore, it is possible to state that one agent is the delegate of another, irrespective of any activity. This delegation likely is not indefinite, however, and is bounded by some context (e.g., time, role within an organization, etc). It should be possible to describe the bounds of the delegation.". But it is not the intended semantics:
      • PROV constraints defines the semantics of optional arguments, and specifically, in Table 3, explains that activity in delegation is expandable.
      • It means that an absent activity can be replaced by an existential variable. Hence,
      • actedOnBehalfOf(ag2,ag1) really means that agent ag2 acted on behalf of agent ag1 in the context of some unspecified activity. Some activity, not all activity.
      • This (unspecified) activity defines the bounds of the delegation. If these bounds need to be made explicit, then an activity also needs to be made explicit.
  • References:
  • Changes to the document: none, but issue raised against prov-constraints
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0027.html
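The expansion of an absent activity into an existential variable, as described in the response above, can be sketched as follows. This is a simplified illustration: the `expand_delegation` helper and the `_:aN` spelling of existential variables are assumptions made for this example, not notation defined by PROV.

```python
import itertools

# Counter used to mint fresh, distinct existential variables.
_fresh = itertools.count(1)


def expand_delegation(delegate, responsible, activity=None):
    """Return a delegation triple, filling an absent activity with a
    fresh existential variable (hypothetical '_:aN' syntax)."""
    if activity is None:
        activity = f"_:a{next(_fresh)}"
    return (delegate, responsible, activity)


# Two delegations without an activity get two distinct unspecified activities.
first = expand_delegation("ex:ag2", "ex:ag1")
second = expand_delegation("ex:ag2", "ex:ag1")
# An explicit activity is kept as-is and makes the bounds explicit.
explicit = expand_delegation("ex:ag2", "ex:ag1", "ex:act1")
print(first, second, explicit)
```

Each omitted activity stands for some activity, not all activities, which is why each expansion receives its own variable.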

ISSUE-509 (AttributesInUML)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0099.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/509
  • Group Response
    • First, let us note the non-normative nature of the UML diagrams. They are here to inform readers, and convey the intuition of the data model.
    • The UML diagram actually represents all the information present in relations such as WasStartedBy.
      • PROV Id and PROV attributes are explicitly listed as UML attributes in the association class
      • The started activity and the trigger entity are source and destination of the association edge
      • The starter activity is present with the starter edge
      • Time is also present through the time edge
    • With UML diagrams, we can take a full object oriented view or a more relational view of the data model. The former lists all attributes, whereas the latter highlights the relations. We opted for the latter approach.
    • Hence, what the UML diagram does not explicitly represent is the actual names of all attributes of a relation. That is covered by the normative text.
    • It is correct that Time is a primitive datatype, and marked as such. Given the importance of time and events in the model, it is considered pedagogical to keep it in Figure 5. We note that Figure 1, the much simplified version, doesn't show it.
    • Finally, it's correct that we use names such as Start, but the UML diagram contains relation label WasStartedBy. This has now been fixed for all introductory paragraphs.
  • References:
  • Implemented changes: http://dvcs.w3.org/hg/prov/diff/817b3b917afe/model/prov-dm.html
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0006.html

ISSUE-526 (Alternate)

ISSUE-502 (Derivation)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0092.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/502
  • Group Response:
    • The focus of derivation is on connecting a generated entity to a used entity. Hence, the transformation of an entity into another, or the updating of an entity from another, is an appropriate focus for this definition.
    • One should note that the focus is not on the creation of the entity, since we already have the notion of generation for that.
    • Given an entity that was generated, the concept of derivation allows us to express dependencies on entities that have influenced that entity. As the author suggests, it could be argued that most entities can be said to be derived from other entities.
    • In PROV, the creation of an entity, referred to as generation, is the point after which it becomes available for usage. Before generation, the entity cannot be used.
    • The document gives the example of a car, moved from Boston to Cambridge (see example 5, in editor's draft). For this car, we identify multiple entities exposing various facets of the thing: Joe's car, Joe's car in Boston, and Joe's car in Cambridge.
entity(joe-car-boston, [prov:location="boston"])
entity(joe-car-cambridge, [prov:location="cambridge"])
specializationOf(joe-car-cambridge, joe-car)
specializationOf(joe-car-boston, joe-car)

ISSUE-524 (Bundle/Collection)

ISSUE-519 and ISSUE-523 (Influence Inheritance)

IF wasGeneratedBy(id; e,a,_t,attrs) THEN wasInfluencedBy(id; e, a, attrs).
    • Whatever appears as id/attributes in wasGeneratedBy becomes also id/attributes in wasInfluencedBy
    • Whatever appears as entity (e) in wasGeneratedBy becomes influencee in wasInfluencedBy
    • Whatever appears as activity (a) in wasGeneratedBy becomes influencer in wasInfluencedBy
    • Given this, prov-dm should define the minimalist characteristics for wasInfluencedBy in a technology agnostic way.
    • Inheritance is a way of implementing Inference 15 of prov-constraints (and this approach was successfully followed by prov-o), but it does not have to be implemented that way. For instance, a rule-based system could simply implement Inference 15 without requiring inheritance. The current prov-xml schema does not define WasGeneratedBy as an extension of Influence. A record-based system may not rely on inheritance.
    • As the author suggests, inheritance would imply that attributes are inherited by the child relations. It is not the case that wasGeneratedBy has influencer/influencee attributes, but instead, we want to show that they correspond to activity/entity in that case.
    • Given this, the document should be changed as follows:
      • The UML diagram in Figure 8 should not show a Generalization association between WasGeneratedBy (and others) and WasInfluencedBy.
      • A table should be introduced showing which attributes in Generation/Usage/etc are influencer or influencee.
    • With these changes, the issue raised by the author is no longer applicable: it is no longer the case that wasGeneratedBy etc can be used anywhere between agent/activity/entity.
    • For the comment "The notion of influence is useful for the PROV model, but it is unclear whether this is intended to represent an extension point for adopters of the spec. How should it be implemented?", we have shown with prov-o, prov-n, and prov-xml various ways of implementing Influence. According to Section 6, Influence is not seen as an extensibility point of the model, instead, it is seen as a means to express influence in PROV without being specific about its nature. We note the following, quoted from the specification:
      • It is recommended to adopt these more specific relations when writing provenance descriptions. It is anticipated that the Influence relation may be useful to express queries over provenance information.
  • References:
  • Implemented changes:
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0028.html
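The rule-based reading of the inference quoted above (from wasGeneratedBy to wasInfluencedBy) can be sketched without any inheritance between record types. This is only an illustration: the `Generation` and `Influence` records and the `infer_influence` function are hypothetical names chosen for this example.

```python
from typing import NamedTuple, Optional, Tuple


class Generation(NamedTuple):
    """Hypothetical record for wasGeneratedBy(id; e, a, t, attrs)."""
    id: Optional[str]
    entity: str
    activity: str
    time: Optional[str]
    attrs: Tuple[Tuple[str, str], ...]


class Influence(NamedTuple):
    """Hypothetical record for wasInfluencedBy(id; influencee, influencer, attrs)."""
    id: Optional[str]
    influencee: str
    influencer: str
    attrs: Tuple[Tuple[str, str], ...]


def infer_influence(gen: Generation) -> Influence:
    """Apply the rule: id and attrs carry over, the entity becomes the
    influencee, the activity becomes the influencer, time is dropped."""
    return Influence(gen.id, gen.entity, gen.activity, gen.attrs)


gen = Generation("ex:g1", "ex:e1", "ex:a1", "2012-10-01T10:00:00",
                 (("ex:port", "p1"),))
inf = infer_influence(gen)
print(inf)
```

The two record types are unrelated classes, so nothing is inherited; the correspondence between positions is stated only by the rule.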

ISSUE-521 (Responsibility Activity)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0111.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/521
  • Group Response
    • PROV agents bear responsibility for activities taking place, entities being generated, and other agents.
    • PROV agents MAY be entities or activities
    • Given this, it is legal to write the following, in which a2 acted on behalf of a1, where a2 and a1 are activities, but the type of a2 and a1 can also be inferred to be agent. Hence, the response to the author's question "Can activities be responsible for other activities" is yes, as illustrated by the example.

ISSUE-450 (Incomplete Table)

ISSUE-482 (Bundles and IDs)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Aug/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/482
  • Group Response:
    • PROV specifications define a notion of bundle, but do not define operations on bundles such as merge. The definition of such operations is left to implementations.
    • The prov-constraints document defines a notion of validity in the presence of bundles. Validity is determined by checking validity of bundles, individually, irrespective of other existing bundles. For instance, the following document, containing two bundles is valid.
 prefix ex <http://example.org/>
 bundle ex:b1

 bundle ex:b2
    • Other specifications may provide some guidance regarding this issue. For instance, the Architecture of the World Wide Web, Volume One, provides principles, constraints, and good practice notes about the use of IRIs.
    • Given the above, PROV by itself does not require IDs to be unique in a bundle, but one may have to ensure this in order to perform certain operations on the PROV data or to meet other best practice.
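The per-bundle reading of validity can be sketched as follows. This is a toy model under stated assumptions: a single check (an identifier may not be typed as both entity and activity within one bundle) stands in for the full set of prov-constraints rules, and the function names and the `(kind, id)` statement encoding are hypothetical.

```python
def bundle_is_valid(statements):
    """Toy validity check for one bundle: no identifier is declared
    as both an entity and an activity."""
    kinds = {}
    for kind, ident in statements:
        kinds.setdefault(ident, set()).add(kind)
    return all(not {"entity", "activity"} <= ks for ks in kinds.values())


def document_is_valid(bundles):
    """Each bundle is checked on its own, irrespective of other bundles."""
    return all(bundle_is_valid(stmts) for stmts in bundles.values())


bundles = {
    "ex:b1": [("entity", "ex:x")],
    "ex:b2": [("activity", "ex:x")],  # same id, used differently
}
per_bundle = document_is_valid(bundles)
# A naive merge of both bundles is an operation PROV does not define;
# under this toy check it would no longer be valid.
merged = bundle_is_valid(bundles["ex:b1"] + bundles["ex:b2"])
print(per_bundle, merged)
```

This is the kind of situation in which an implementation performing a merge may need to ensure that identifiers are used consistently across bundles.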

ISSUE-518 (PrimarySource)

ISSUE-499 (Generation vs Activity)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0089.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/499
  • Group Response
      • The author states: "It is not clear why it is necessary to define terms for discrete points in time within the PROV model. If activities already have start and end times, isn't that sufficient?"
      • As indicated in prov-constraints, PROV is implicitly based on a notion of instantaneous events. Five of them are identified, start/end/generation/usage/invalidation.
      • These events are of interest because they mark a "change of state" in the world: an activity is started/ended, an entity is generated/used/invalidated. These types of events matter because they enable or disable the occurrence of further events. For instance, before generation, an entity cannot be used, but it can after its generation, ... until its invalidation.
      • Those events always involve an activity and an entity:
        • start and end of an activity with respect to a trigger
        • generation/usage/invalidation of an entity by an activity.
    • Each type of event enables or disables the occurrence of specific types of events:
      • Start of a:
        • No event with a can precede start of a, event with a can follow start of a
      • End of a:
        • Event with a can precede end of a, event with a cannot follow end of a
      • Generation of e:
        • Event with e cannot precede generation of e, event with e can follow generation of e
      • Invalidation of e:
        • Event with e can precede invalidation of e, event with e cannot follow invalidation of e
      • Usage of e by a:
        • "influence" of e can "show" after usage by a, but cannot "show" before usage
    • Given the different types of events, it is not sufficient to have just start and end events, as suggested by the author.
    • In PROV activities "occur". They do "stuff". They act upon and with entities. The activities are involved in the generation and usage of entities: as indicated above, an event always occurs in the context of an activity.
    • If, for some application, it is useful to see the creation of entities as having a duration, this indeed can be modelled by an activity with a duration. But what we care about, from a provenance viewpoint, is when the entity is actually created, which we then refer to as generation. This cannot be modelled by an activity. The generation (event) is in the model the relation between an activity and an entity.
    • To avoid potential confusion between activity and start/end/generation/usage/invalidation, we now make explicit that start/end/generation/usage/invalidation are instantaneous.
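The way generation and invalidation enable and disable usage, as listed above, can be sketched with a small check. This is a simplification: prov-constraints orders events rather than comparing timestamps, whereas this sketch assumes every event carries a comparable numeric time; the function name is hypothetical.

```python
def usage_is_consistent(generated_at, used_at, invalidated_at=None):
    """Check that a usage event falls within the lifetime of the entity:
    not before its generation and not after its invalidation."""
    if used_at < generated_at:
        return False  # an entity cannot be used before it is generated
    if invalidated_at is not None and used_at > invalidated_at:
        return False  # an entity cannot be used after it is invalidated
    return True


ok = usage_is_consistent(generated_at=1, used_at=2, invalidated_at=3)
too_early = usage_is_consistent(generated_at=2, used_at=1)
too_late = usage_is_consistent(generated_at=1, used_at=4, invalidated_at=3)
print(ok, too_early, too_late)
```

The check needs the instants of generation and invalidation of the entity, which the start and end of an activity alone do not provide.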

ISSUE-529 (Empty Collection)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0119.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/529
  • Group Response:
    • In an open world context, absence of the relation hadMember(c,e) does not imply that a collection c is empty. Hence, the group introduced a class EmptyCollection to indicate when a collection is empty.
    • Figure 11, like all UML diagrams, is informative. It shows that Collection and EmptyCollection are linked with Entity, by means of a Generalization association. Therefore, a Collection and EmptyCollection are also entities with an id and attributes.
    • Concretely, prov-dm (prov-n) sees all the sub-types (e.g. prov:type='prov:Collection' ) as type information that is expressed by the prov:type attribute.
    • The handling of these subtypes is consistent with other subtypes in the model, e.g. revision, softwareAgent, etc
    • Prov-dm, as a conceptual model, leaves the implementation of these inherited types to concrete serializations.
    • As to the question of why PROV-DM doesn't have a list of members as an attribute of Collections, the design of prov-dm makes all associations between PROV entities relations. In effect, this allows us to understand the structure of a provenance graph, just by looking at the relations, without having to process attributes of entities. A given implementation may also decide to represent collection members as attributes if it finds it convenient.
  • References:
  • Implemented changes:
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0028.html
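The open-world argument above can be sketched with a three-valued emptiness check. This is only an illustration: the `is_empty` helper and the dictionary encoding of types and hadMember relations are assumptions made for this example.

```python
def is_empty(collection, types, members):
    """Return True if the collection is asserted to be empty, False if at
    least one member is known, and None when nothing can be concluded."""
    if "prov:EmptyCollection" in types.get(collection, set()):
        return True
    if members.get(collection):
        return False
    return None  # open world: no hadMember statement does not mean empty


types = {
    "ex:c1": {"prov:Collection"},
    "ex:c2": {"prov:EmptyCollection"},
    "ex:c3": {"prov:Collection"},
}
members = {"ex:c3": {"ex:e1"}}  # hadMember(ex:c3, ex:e1)

print(is_empty("ex:c1", types, members))
print(is_empty("ex:c2", types, members))
print(is_empty("ex:c3", types, members))
```

Only the explicit type allows the conclusion that a collection is empty; for `ex:c1` the answer stays unknown.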

ISSUE-449 (prov:value)

ISSUE-462 (Definition of Entity)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Jul/0009.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/462
  • Group Response:
    • The term 'entity' is intentionally defined in a liberal manner to avoid restricting users' expressivity. Obviously, it shouldn't be too liberal, otherwise it would be all-encompassing, without clear semantics.
    • The term 'entity' (and associated notions such as 'alternate', 'specialization') have been the subject of intense debate by the Working Group, and the definition reflects the compromise reached by the Working Group.
    • The term 'aspect' is not used here with a technical meaning and should be understood with its dictionary meaning 'A particular part or feature of something'.
    • PROV-Constraints, in its rationale section, expands on the notion of entity.
    • While an object/thing may change over time, an entity fixes some aspects of that thing for a period of time (in between its generation and invalidation). As opposed to other models of provenance (such as OPM), an entity is not an absolute state snapshot. Instead, it is a kind of partial state, just fixing some aspects. The rationale for this design decision is that it is quite challenging to find absolute state snapshots that do not change: the location of a file on a cloud changes, the footer of this Web page changes (as more people access it), etc. Hence, by allowing some aspects (as opposed to all) to be fixed, the PROV concept of 'entity' is easy to use.
    • We distinguish an 'aspect' from an 'attribute'. An attribute-value pair represents additional information about an entity (or activity, agent, usage, etc). In the case of an entity, attribute-value pairs provide descriptions of fixed aspects. So, the term 'aspect' refers to properties of the thing, whereas the term 'attribute' refers to its description in PROV.
    • PROV does *NOT* assume that all fixed aspects are described by attribute-value pairs. So, there may be some fixed aspects that have not been described. Obviously, without description, it's difficult to query or search over them.
    • According to PROV Constraint key-object (constraint 23), an entity has a set of attributes given by taking the union of all the attributes found in all descriptions of that entity. In other words, PROV does not allow for different attribute-value pairs to hold in different intervals for a given entity.
    • The attribute-value pairs of an entity provide information for some of the fixed aspects of an entity.
      • This point may not have been clear, and requires text modification. (see below)
    • A specific attribute of an entity is its identity. It is also assumed that it holds for the duration of the entity lifetime.
      • This point may not have been clear, and requires text modification. (see below)
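The union of attributes described in the response above can be sketched as follows. This is a minimal illustration, assuming descriptions are given as `(id, attribute-value pairs)` tuples; the `merge_descriptions` helper is a hypothetical name.

```python
def merge_descriptions(descriptions):
    """Collect, for each entity id, the union of the attribute-value pairs
    found in all descriptions of that entity."""
    merged = {}
    for ident, attrs in descriptions:
        merged.setdefault(ident, set()).update(attrs)
    return merged


descriptions = [
    ("ex:e1", {("ex:color", "red")}),
    ("ex:e1", {("ex:size", "large")}),
    ("ex:e2", {("ex:color", "green")}),
]
merged = merge_descriptions(descriptions)
print(merged["ex:e1"])
```

All pairs collected for `ex:e1` hold together for that entity; a value that changes over time would call for a separate entity rather than a second description of the same one.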

ISSUE-498 (Relation terminology)

ISSUE-569 (Mutable resources)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0001.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/569
  • Group Response:
    • PROV supports the case you describe using the prov:specializationOf relation to connect a mutable resource URI to entities representing each revision over time. The latter don't have to exist already in Callimachus, but may be created with unique IDs specifically to model the provenance.
    • If a change in a resource's state is something to be documented in the provenance, then that requires multiple entities. PROV entities are allowed to be mutable, but the purpose of this is to hide information that is unimportant, i.e. that you do not want to model in the provenance. As soon as the timeline of the resource is divided into relevantly different periods (e.g. before and after each contributor edited), then the mechanism to document this in PROV is to use multiple entities. If you have a single identifier (entity) for the mutable resource as it exists over time, through multiple revisions, this can be connected to the set of revision entities using the prov:specializationOf relation.
  • The flour and baking example is similar. If a change is to be documented in PROV, then multiple entities are used, e.g. the flour before and after baking. If it is not documented, then only one entity is required. There is no notion of a change which is "documented but not significant", because it is unclear what significance would be in general except for the decision to model/document it. As before, a general, mutable "flour" entity can exist that is connected to the flour before and after baking using prov:specializationOf. For example:
 ex:baked prov:used ex:flour1
 ex:flour2 prov:wasGeneratedBy ex:baked
 ex:flour2 prov:wasDerivedFrom ex:flour1
 ex:flour1 prov:specializationOf ex:flour
 ex:flour2 prov:specializationOf ex:flour

ISSUE-463 (Overall Feedback)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Jul/0010.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/463
  • Group Response:
    • The feedback was broken down into individual issues that were addressed separately on this page. The group thanks the reviewer for the extensive comments!
    • The group made changes based on the reviewer's feedback; please see each issue for the relevant change.
    • The UML diagrams in PROV-DM are informative. They are intended to illustrate concepts as best as possible. The normative material is found in the text. There may be alternative UML modelling of the same normative definitions.
    • Alternative UML diagrams were proposed by the author of this feedback. Individual issues have addressed these points, but below we provide specific feedback on some UML diagrams.
    • Some comments on the UML diagrams provided by the reviewer:
      • Entities.png: Organization, Person, Software are not entities (ISSUE-520), Bundles are not Collections (ISSUE-524), and membership is expressed as a relation and not an attribute (ISSUE-529)
      • Interactions.png: we did not find it suitable to introduce a role (generatedEntity/UsedEntity) since then we would have to introduce a different identifier for the entity in that role. This would result in very convoluted graphs, with lots of 'acts as' relations. There is no startTime/endTime for Invalidation, Usage, Generation, but simply a time. A strong desire has been to facilitate the assertions of provenance: ex:a2 prov:used <uri> and <uri> wasGeneratedBy ex:a1
      • Relations.png: Alternate/Specialization/Membership do not have id and attributes. Adoption is not a PROV relation. PROV does not define activity composition.
  • References:
  • Changes to the document:
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0028.html

ISSUE-475 (Mention)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Aug/0001.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/475
  • Group Response:
    • The reviewer suggests that the work to describe contextualized provenance should be deferred so that it can be aligned with ongoing W3C work on RDF datasets and their semantics. Since ISSUE-475 was submitted, the RDF working group has decided that it will not provide a formal semantics for RDF Datasets. This RDF resolution ensures that any semantics for bundle and/or mention is guaranteed not to be in conflict with the RDF semantics.
    • As PROV-Constraints section 6.2 clearly indicates, PROV-bundles validity is determined by examining bundles in isolation of each other. Our response to issue-482 also indicates that PROV itself does not set any constraints on how a given ID is being used across multiple bundles. Given this, mentionOf is a general relation which allows an entity to be linked to another entity described in another bundle.
    • The reviewer suggests that
        mentionOf(infra, supra, b)
      could simply be expressed as
        specializationOf(infra, supra)
        entity(infra, [mentionedIn=b])
    • This design was considered and rejected by the Working Group:
      • By design, relations between PROV objects are expressed by PROV relations (usage, generation, etc, mention), and are not expressed as PROV attributes. The suggested additional attribute mentionedIn would relate the entity infra with bundle b, and would go against this prov-dm design.
      • The interpretation of the attribute-value pair mentionedIn=b is somewhat difficult, because infra is not itself described in bundle b: supra is the entity described in bundle b. So, syntactically, mentionedIn=b may look like an attribute-value pair, but in reality, it can only be understood in the presence of specializationOf(infra, supra). Hence, the reason for introducing the ternary relation mentionOf.
    • The Working Group left it unspecified which new attributes could be inferred for infra, and in general what constraints apply to mentionOf. The reviewer is critical of this decision, arguing that nothing new can be inferred from mentionOf, and therefore mentionOf can be replaced by specializationOf. 'Under-specification' is a feature of PROV: what can be inferred from relations such as usage, derivation, alternate? The group recently acknowledged this for alternateOf and added a clarifying note in the text. This observation is applicable to further PROV concepts, such as Quotation, PrimarySource, SoftwareAgent, etc., which do not allow us to infer more than their parent concept would (Derivation, Agent). We are in the same situation with mentionOf. Further inferences are left to be specified by applications.
    • The reviewer's suggestion to address the use of Example 45 is to copy part of the referred bundle. By copying statements from the original context to the new context, we have lost the original context in which they occur (... their provenance!), and we have no way of expressing that wasAssociatedWith(ex:a1, ...) in the new context is a "kind of specialization" of wasAssociatedWith(ex:a1,...) in the original context, ... which is why mentionOf was introduced in the first place.
    • The reviewer also comments on the lack of information about 'Fixed aspects'. We refer to our response to ISSUE-462, and recent associated changes to the document.
  • At the fourth face-to-face meeting, the issue of Mention was discussed (see references below). There was consensus that linking across bundles seemed to be a useful concept, but consensus could not be reached on the exact wording to describe the construct. It was recognized that there was no prior art on this notion, and therefore the concept was more a research idea than a standardization outcome. It was decided that the material at risk related to mention would be removed from the specifications and incorporated in a separate note to be written.
  • References:
  • Changes to the document:
    • Feature at risk was removed from the specifications
  • Original author's acknowledgement:


ISSUE-541 (Optional Generation Time)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0135.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/541
  • Group Response:
    • It is correct that PROV-N uses the production timeOrMarker to encode time information and does not specify what generation time means.
    • It is the purpose of the prov-dm document to define what is meant by generation time.
    • We refrained from repeating such definition in the prov-n document, to avoid potential inconsistencies between normative documents.
    • Instead the production generationExpression is followed by a table, mapping prov-n arguments to the corresponding prov-dm attributes, as defined in the prov-dm document.
    • For instance, timeOrMarker is mapped to time, defined in prov-dm as time: an optional "generation time" (t), the time at which the entity was completely created;

ISSUE-542 (Optional Usage Time)

ISSUE-543 (Key-Value)

ISSUE-545 (Structure of document)

ISSUE-537 (Syntax of identifiers)

ISSUE-535 (Grammar notation)

ISSUE-534 (Example)

ISSUE-536 (Syntax Ambiguity and - marker)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0130.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/536
  • Group Response:
    • The comment refers to the previous working draft. In particular, the syntax ambiguity issue is superseded by the syntax published in the Last Call Working Draft.
    • Our response to ISSUE-537 explains how optional identifiers should be expressed in PROV-N.
    • The proper syntax for the suggested examples would be as follows:
      • wasDerivedFrom(d; e2, e1) // semi-colon separates derivation identifier from other arguments
      • wasDerivedFrom(e2, e1, a, -, -) // where absent usage and generation are marked with -
    • The author also queries the choice of the special marker '-'. We had to use a symbol that was not a qualified name: - is not allowed as a local name.
    • The author suggests using NULL, but this is a valid local name (and hence, a valid qualified name).
    • Our response to ISSUE-533 addresses the named attributes option.
  • References:
  • Changes to the document: none
  • Original author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0029.html
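
To make the two conventions in the response concrete (a semi-colon after an optional identifier, and '-' for an absent optional argument), here is a minimal Python sketch of a reader for simplified wasDerivedFrom expressions. It is an illustration only and not the PROV-N grammar: it assumes there is no attribute list, no comments and no nested parentheses. The function name `parse_derivation` is hypothetical, and the dictionary keys follow the PROV-DM derivation attribute names.

```python
# Positional argument names for a derivation, in PROV-N order.
DERIVATION_ARGS = ("generatedEntity", "usedEntity", "activity", "generation", "usage")


def parse_derivation(text):
    """Parse a simplified wasDerivedFrom(...) expression into a dict.

    Hypothetical helper: handles only the optional identifier and the
    '-' marker, not attribute lists or the full PROV-N grammar.
    """
    name, _, rest = text.strip().partition("(")
    if name.strip() != "wasDerivedFrom" or not rest.endswith(")"):
        raise ValueError("not a wasDerivedFrom expression: %r" % text)
    body = rest[:-1]

    # An optional identifier is separated from the other arguments by ';'.
    ident = None
    if ";" in body:
        ident_part, body = body.split(";", 1)
        ident = ident_part.strip()
        if ident == "-":
            ident = None

    # '-' marks an absent optional argument; trailing ones may be omitted.
    args = [a.strip() for a in body.split(",")]
    if not 2 <= len(args) <= len(DERIVATION_ARGS):
        raise ValueError("unexpected number of arguments: %r" % text)
    args = [None if a == "-" else a for a in args]
    args += [None] * (len(DERIVATION_ARGS) - len(args))

    result = {"id": ident}
    result.update(zip(DERIVATION_ARGS, args))
    return result


if __name__ == "__main__":
    print(parse_derivation("wasDerivedFrom(d; e2, e1)"))
    print(parse_derivation("wasDerivedFrom(e2, e1, a, -, -)"))
```

Because '-' cannot be a local name, the reader can test for it directly without risk of confusing it with an identifier, which would not be true of a marker such as NULL.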

ISSUE-538 (Rephrasing)

ISSUE-533 (Named Attributes)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-wg/2012Sep/0127.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/533
  • Group Response:
    • There is no right or wrong approach; there are essentially two different philosophies. Either we adopt a named attribute approach as suggested in the feedback, or we go for a positional attribute solution.
    • As suggested by the author, it becomes a choice between:
      • wasDerivedFrom( derivation = $d, drv_entity = $e2, src_entity = $e1, activity = $a, generation = $g2, usage = $u1, [ optional_attributes] )
      • wasDerivedFrom(d; e2, e1, a, g2, u1, attrs)
    • The Working Group opted for the positional argument approach for the following reasons:
      • It is commonly used in programming languages and logic; it is also the approach used in OWL functional syntax
      • It is more concise as the above example illustrates
      • This latter point is particularly important when we write inferences (see prov-constraints). For example, the following inference is much more readable using positional notation.
        • IF wasGeneratedBy(id; e,a,_t,attrs) THEN wasInfluencedBy(id; e, a, attrs).
      • Other serializations produced by the Working Group and elsewhere adopt a named attribute approach (e.g. PROV-XML and PROV-JSON).
    • As far as the optional attributes were concerned, the requirements were different:
      • They are optional;
      • A given attribute may occur multiple times with different values;
      • They can be application specific.
    • For these, the positional solution was not suitable, but the named attribute solution was good.
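
The split described above (positional for the fixed arguments, named for the optional attributes) can be sketched in Python. This is an illustration only and not taken from any PROV specification text: the class and function names are hypothetical, and attributes are modelled as a tuple of (name, value) pairs so that the same attribute can occur several times with different values.

```python
from typing import NamedTuple, Optional, Tuple

# Optional attributes: named, repeatable, possibly application specific.
Attrs = Tuple[Tuple[str, str], ...]


class WasGeneratedBy(NamedTuple):
    id: Optional[str]
    entity: str
    activity: Optional[str]
    time: Optional[str]
    attrs: Attrs = ()


class WasInfluencedBy(NamedTuple):
    id: Optional[str]
    influencee: str
    influencer: Optional[str]
    attrs: Attrs = ()


def infer_influence(generation):
    """IF wasGeneratedBy(id; e,a,_t,attrs) THEN wasInfluencedBy(id; e, a, attrs)."""
    ident, e, a, _t, attrs = generation  # positional unpacking; time is dropped
    return WasInfluencedBy(ident, e, a, attrs)


if __name__ == "__main__":
    g = WasGeneratedBy("ex:g1", "ex:e", "ex:a", "2012-09-01T00:00:00",
                       (("ex:k", "v1"), ("ex:k", "v2")))
    print(infer_influence(g))
```

The body of `infer_influence` is a single positional unpacking, which mirrors how compactly the rule reads in positional notation.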

ISSUE-546 (Encoding)

ISSUE-540 (Production Documentation)

ISSUE-539 (Production Documentation)

ISSUE-544 (Change section title)


ISSUE-446 (prov:involvee not documented in PROV-O)

ISSUE-476 (hadOriginalSource)

ISSUE-552 (Influence subclasses)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Sep/0000.html
  • Tracker: https://www.w3.org/2011/prov/track/issues/552
  • Group Response:
    • On "subclassing Influence":
      • The WG agrees with the suggestion that the phrase "a particular case of derivation" should be expressed using rdfs:subClassOf.
      • Since the prov-dm definitions for revision, quotation, and primary source mention that they are "particular case[s] of derivation", it follows that each should be a subclass of Derivation in the PROV-O encoding. We changed PROV-O to include these three classes as subclasses of Derivation.
      • The WG agrees with the reviewer that "a kind of" is a more natural phrasing than "a particular case", and so we have adopted it as suggested.
    • On the phrasing of definitions:
      • It was pointed out that the definitions for "{Entity,Agent,Activity}Influence" are inconsistent with that of their parent class "Influence".
      • The source of this inconsistency is that {Entity,Agent,Activity}Influence are not defined by prov-dm, but by prov-o as artifacts of encoding prov-dm's model into the paradigm of OWL (i.e., the use of the qualification pattern to describe binary relations).
      • The inconsistent definitions were "demoted" to rdfs:comments because they focus too heavily on the RDF and OWL paradigm and not enough on how they are expressing the abstract model of prov-dm.
      • New definitions were created to align with their parent class, with a focus on how the classes are expressing the abstract model of prov-dm.
    • On the inconsistency of subclasses according to "general understanding of the english terms":
      • The reviewer points out that the definitions of Influence, EntityInfluence, and Start illustrate an inconsistency: "influence is a capacity, an entity influence is a provider (of descriptions) and a start is a "when" (a time)".
      • The WG acknowledges that the definitions as shown support this concern.
      • The inconsistency between Influence and its immediate subclasses {Entity,Agent,Activity}Influence is addressed by the response to the earlier comment ("phrasing of definitions").
      • To address the inconsistency between {Influence, {Entity,Agent,Activity}Influence} and {Start,End}, PROV-DM updated the definitions for Start and End:
        • Start is when an activity is deemed to have been started by an entity, known as trigger. The activity did not exist before its start. Any usage, generation, or invalidation involving an activity follows the activity's start. A start may refer to a trigger entity that set off the activity, or to an activity, known as starter, that generated the trigger. ref
        • End is when an activity is deemed to have been ended by an entity, known as trigger. The activity no longer exists after its end. Any usage, generation, or invalidation involving an activity precedes the activity's end. An end may refer to a trigger entity that terminated the activity, or to an activity, known as ender, that generated the trigger. ref
  • References:
  • Changes to the document:
    • prov-dm updated the definitions for revision, quotation, and primary source to reinforce that each is a relation.
    • prov-o changed to add axioms:
      • prov:Revision rdfs:subClassOf prov:Derivation .
      • prov:PrimarySource rdfs:subClassOf prov:Derivation .
      • prov:Quotation rdfs:subClassOf prov:Derivation .
    • prov-o "demoted" the original definitions of {Entity,Agent,Activity}Influence to rdfs:comments.
    • prov-o created new definitions for {Entity,Agent,Activity}Influence to align with their parent class definition.
    • prov-o removed existing comments on {Entity,Agent,Activity}Influence that were very similar to the new "prov-dm centric" definitions. The removed comments had more of an OWL flavor to them instead of an abstract flavor. For example, the following comment was removed:
      • "ActivityInfluence is intended to be a general subclass of Influence of an Activity. It is a superclass for more specific kinds of Influences (e.g. Generation, Communication, and Invalidation)." in favor of the definition "ActivityInfluence is the capacity of an activity to have an effect on the character, development, or behavior of another by means of generation, invalidation, communication, or other."
    • The latest draft of the PROV-O html document reflects the definitions changed in the PROV-O OWL file:
    • PROV-DM's new definition for Start -> PROV-O's new definition for Start
    • PROV-DM's new definition for End -> PROV-O's new definition for End
  • Request for author's acknowledgement: http://lists.w3.org/Archives/Public/public-prov-comments/2012Nov/0023.html
  • Original author's acknowledgement: NONE
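
The practical effect of the three rdfs:subClassOf axioms listed above is that anything typed as a Revision, PrimarySource, or Quotation can also be treated as a Derivation. A minimal Python sketch of that entailment follows. It is not an RDFS reasoner: it only follows single-parent subclass links, and the names `SUBCLASS_OF`, `superclasses` and `is_instance_of` are hypothetical.

```python
# The three axioms added to PROV-O, as child -> parent links.
SUBCLASS_OF = {
    "prov:Revision": "prov:Derivation",
    "prov:PrimarySource": "prov:Derivation",
    "prov:Quotation": "prov:Derivation",
}


def superclasses(cls, axioms):
    """Follow subclass links upward and return the classes reached, in order."""
    seen = []
    while cls in axioms and axioms[cls] not in seen:
        cls = axioms[cls]
        seen.append(cls)
    return seen


def is_instance_of(asserted_type, target, axioms):
    """True if something typed `asserted_type` is also a `target`."""
    return asserted_type == target or target in superclasses(asserted_type, axioms)


if __name__ == "__main__":
    print(is_instance_of("prov:Revision", "prov:Derivation", SUBCLASS_OF))
    print(is_instance_of("prov:Derivation", "prov:Revision", SUBCLASS_OF))
```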

ISSUE-491 (prov:agent)

ISSUE-479 (citing Trig)

ISSUE-592 (wasInformedBy confusing with wasInfluencedBy)


ISSUE-556 (time-qualification)

ISSUE-576 (logical definition and comments on prov-constraints)

ISSUE-582 (document-instance)

ISSUE-586 (toplevel-bundle-description)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/586
  • Summary: The description of 'toplevel instance' as 'set of statements not appearing in a bundle' is unclear
  • Group response:
    • This is not a formal constraint; this description is potentially misleading, since it is allowed for multiple copies of the same statement to appear in toplevel instance and bundles.
  • References:
  • Changes to the document:
    • Clarify description of "toplevel instance" to just say that there is a toplevel instance and possibly some named instances, called bundles, and they are all treated independently for the purpose of validity checking (so presence or absence of statements in one instance never affects the validity of another).
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-587 (rdf-analogies)

ISSUE-588 (strictly-precedes-irreflexive)

ISSUE-584 (merging)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/584
  • Summary: The nonstandard/procedurally defined term 'merging'
  • Group response:
    • For terms, "merging" is exactly unification in the usual first-order logic / logic programming sense, as we state in a remark. For predicates that carry attribute lists, things are more complicated since key constraints require the attribute lists be combined, not unified in the usual sense.
  • References:
  • Changes to the document:
    • Use "unification" for "merging" at the level of terms
    • Declaratively describe unification as producing "either failure or a substitution that makes both sides equal", as well as giving the (standard) algorithm
    • Retain "merging" for the nonstandard operation on predicates that unifies the term arguments and concatenates the lists of attributes.
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr
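
The distinction drawn above between unification (on terms) and merging (on statements with attribute lists) can be sketched in Python. This is a simplified illustration based only on the wording of this response, not the normative algorithm: terms are either constants or variables (no compound terms), a statement is a hypothetical (relation, arguments, attributes) triple, and all names are invented for the example.

```python
class Var:
    """A variable; any other Python value is treated as a constant."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "_" + self.name


def resolve(term, subst):
    """Follow variable bindings in `subst` until reaching an unbound term."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def unify(t1, t2, subst=None):
    """Return a substitution that makes t1 and t2 equal, or None on failure."""
    subst = dict(subst or {})
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 is t2:
        return subst
    if isinstance(t1, Var):
        subst[t1] = t2
        return subst
    if isinstance(t2, Var):
        subst[t2] = t1
        return subst
    return subst if t1 == t2 else None


def merge(stmt1, stmt2):
    """Unify the term arguments pairwise and concatenate the attribute lists."""
    rel1, args1, attrs1 = stmt1
    rel2, args2, attrs2 = stmt2
    if rel1 != rel2 or len(args1) != len(args2):
        return None
    subst = {}
    for a, b in zip(args1, args2):
        subst = unify(a, b, subst)
        if subst is None:
            return None  # term arguments clash, so merging fails
    merged_args = tuple(resolve(a, subst) for a in args1)
    return (rel1, merged_args, attrs1 + attrs2)


if __name__ == "__main__":
    t = Var("t")
    s1 = ("wasGeneratedBy", ("ex:g1", "ex:e", "ex:a", t), [("ex:k", "v1")])
    s2 = ("wasGeneratedBy", ("ex:g1", "ex:e", "ex:a", "2012-09-01"), [("ex:k", "v2")])
    print(merge(s1, s2))
```

The attribute lists are combined rather than unified, which is why two statements with different values for the same attribute can still merge.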

ISSUE-579 (declarative-fol-specification)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/579
  • Summary: Suggestion to replace procedural specification with (equivalent, but shorter and less prescriptive) declarative theory in First-Order Logic
  • Group response:
    • PROV-CONSTRAINTS intentionally reuses as many standard techniques from logic, and particularly database theory, as possible. However, our audience (as reflected by the composition of the WG) is not expected to be familiar already with first-order logic, so we felt it was important to elaborate upon these concepts sufficiently that someone without a background in these areas can implement them.
    • Moreover, writing an arbitrary FOL axiomatization has its own problems: since there is currently no standard way to do this we would have to restate a lot of the standard definitions in order to make the specification self-contained (as we have already done). In addition, an arbitrary FOL theory is not guaranteed to be decidable, even over finite models. We resolved that the constraints document had to demonstrate decidability/computability, as a basic prerequisite for implementability. Simply giving a set of FOL axioms on its own would not be enough to do this, and would leave (the vast majority of) implementors not familiar with FOL theorem proving/databases/constraint solving at sea with respect to implementation.
    • Thus, this issue is deferred to the planned PROV-SEM note.
  • References:
  • Changes to the document:
    • PROV-CONSTRAINTS updated to clarify that a declarative alternative is deferred to PROV-SEM
    • Add non-normative material to PROV-SEM giving a FOL axiomatization, proof of soundness/completeness with respect to the algorithm in the spec and soundness with respect to the draft model-theory in the current draft of PROV-SEM.
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-585 (applying-satisfying-constraints)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/585
  • Summary: Suggestion to avoid discussing how to 'apply' definitions, inferences and constraints; the term 'satisfies' is not adequately defined in the context of PROV-CONSTRAINTS
  • Group response:
    • As noted in the response to ISSUE-579, we disagree that rewriting everything in terms of pure first-order logic would lead to a satisfactory specification (as opposed to a satisfactory research paper, say). The goal of the non-normative section here is essentially to link the (implicit) declarative semantics of the first-order theory, which we described informally earlier, with the procedural way in which normalization handles this behavior. This is exactly analogous with an operational, or proof-theoretic approach to the semantics of logic programming, which is equally correct compared with a declarative, denotational semantics; we simply chose to present the approach that lends itself more immediately to efficient implementation.
    • We inadvertently used "satisfies" as shorthand for "passes all constraint checks without generating INVALID". This will be clarified.
  • References:
  • Changes to the document:
    • Clarify that "applying" is one of many ways of "checking" constraints
    • Clarify meaning of "satisfies" in definition of validity
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-583 (equivalent-instances-in-bundles)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/583
  • Summary: Questions concerning what it means for applications to treat equivalent instances 'in the same way', particularly in bundles.
  • Group response:
    • Since validity and equivalence are optional, this is not a formal requirement, but a guideline; what it means for an application to treat equivalent instances/documents "in the same way" is application specific, and there are natural settings where it makes sense for an application (even one that cares about validity) to have different behavior for equivalent documents. We give one example of formatting/pretty printing. You give some additional examples; digital signing is a third. Because we have no way of circumscribing what applications might do or what it means for an application to treat documents "in the same way", we just leave this as a guideline.
  • References:
  • Changes to the document:
    • Clarify that the suggestion that applications SHOULD treat equivalent instances 'in the same way' is a guideline, and depends on what 'in the same way' means for a given application.
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-580 (drop-syntactic-sugar-definitions)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/580
  • Summary: Suggestion to drop definitions in section 4.1 since they are not needed if the semantics is defined more abstractly
  • Group response:
    • This is actually an orthogonal issue to the style of semantics; PROV-DM and PROV-N nowhere specify how missing arguments are to be expanded to the "PROV-DM abstract syntax" (which itself is not explicitly specified in PROV-DM). You're correct that Definition 1 (which expands short forms) is in a sense implicit in PROV-DM, which only discusses the long forms and their optional arguments, but it isn't said explicitly in either PROV-DM or PROV-N how the PROV-N short forms are to be expanded to PROV-DM. Furthermore, Def. 2-4 deal with special cases concerning optional/implicit parameters which are not explained anywhere else. We recognize that there is a certain amount of PROV-N centrism in these definitions, but since PROV-N is formally specified and the abstract syntax is not, we feel it's important to make fully clear how arbitrary PROV-N can be translated to the subset of PROV-N that corresponds to the abstract syntax of PROV-DM. This is to ensure that there is no room for misinterpretation among multiple readers, who may expect different conventions for expansion/implicit parameters (even if the rules we specified seem "obvious").
  • References:
  • Changes to the document:
    • Add a note clarifying the relationship between PROV-DM "abstract syntax" and PROV-N, and why the definitions are needed for this mapping.
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-577 (valid-vs-consistent)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/577
  • Summary: 'Valid' is used differently from its usual meaning in logic; 'consistent' would be a better term
  • Group response:
    • We would like to clarify that we are not attempting to define a semantics (in the sense of model theory or programming language semantics) for PROV in PROV-CONSTRAINTS. We may do this in a future version of PROV-SEM, by giving a first-order axiomatization that is sound with respect to the model theory that is in the current draft of PROV-SEM.
    • PROV-CONSTRAINTS defines a subset of PROV documents, currently called "valid", by analogy with the notion of validity in other Web standards such as XML, CSS, and so on. While concepts from logic are used, it is not intended as a logic or semantics.
    • We agree that it would be preferable to avoid redefining standard terminology from logic in nonstandard ways, and you are correct that "valid" means something different in logic than the sense in which it is usually used in Web standards. However, since we expect our audience to consist of implementors and not logicians, on reflection we prefer the terminology "valid"/"validation" over "consistent"/"consistency checking".
  • References:
  • Changes to the document:
    • Clarify (sec. 1.2) that our notion of "valid" is named by analogy to other W3C standards, such as CSS and XML, and that in logical terms it is "consistency"
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-578 (equivalence)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/578
  • Summary: Use of "equivalent"; incompatibility with common uses of the term in logic/mathematics
  • Group response:
    • This issue was discussed within the group already, and we could not come to an agreement on how equivalence should behave on invalid instances. Therefore, we decided not to define equivalence on invalid instances.
    • From a mathematical point of view, we only define equivalence as a relation over valid documents/instances, not all instances. This avoids the problem of deciding what to do with equivalence for invalid instances.
    • By analogy consider a typed programming language. An expression 2 + "foo" is not well-typed; technically one could consider a notion of equivalence on such expressions, so that for example, 2 + "foo" would be equivalent to (1 + 1) + "foo". But these ill-typed expressions are (by the definition of the language) not allowed. Similarly, for applications that care about validity, invalid PROV documents can be ignored, so (to us) there seems to be no negative consequence to defining equivalence to hold only on this subset of documents, or to defining all invalid documents to be equivalent (as would follow from the logical definition of equivalence).
    • However, for other applications, such as information retrieval, it is not safe to assume that an invalid instance is equivalent to "false"; we can imagine scenarios where an application wants to search for documents similar to an existing (possibly invalid) document. If the definition of equivalence considers all invalid documents equivalent, then there will be a huge number of matches that have no (intuitive) similarity to the query document.
    • We also plan to augment PROV-SEM with a logical formalization that will be related to both the model theory proposed there and the procedural specification in PROV-CONSTRAINTS. For this formalization, logical equivalence will be the same as PROV-equivalence on valid instances. (For invalid instances, logical equivalence requires making all invalid instances equivalent, which we prefer not to require.)
  • References:
  • Changes to the document:
    • explicitly defined isomorphism in normative section 6.1
    • specify that equivalence is an equivalence relation on *all* documents
    • specify that no invalid document is equivalent to a valid one
    • specify equivalence between valid documents as already done
    • leave it up to implementations how (if at all) to test equivalence on different invalid documents.
    • relating PROV-equivalence with logical equivalence is deferred to PROV-SEM
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

ISSUE-581 (avoid-specifying-algorithm)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Oct/0004.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/581
  • Summary: Suggestion to avoid wording that 'almost requires' using normalization to implement constraints
  • Group response:
    • Just saying that we *define* validity and equivalence in terms of a normalization procedure that *can* be used to check these properties does not require that all implementations explicitly perform normalization. We discussed this issue extensively, and one consequence of this is that the implementation criteria for the constraints document will only test the extensional behavior of validity/equivalence checks; implementations only need to classify documents as valid/invalid/equivalent etc. in the same way as the reference implementation, they do not have to "be" the reference implementation.
    • However, this issue arose relatively late in the process and we did miss some places where the document gives a misleading impression that normalization is required to implement the spec.
    • Nevertheless, as written, it is difficult to see how else one could implement the specification. In fact, you are correct that there is a simple, declarative specification via a FOL theory of what the normalization algorithm does, which could be used as a starting point for people with a formal background or those who wish to implement the specification in some other way. However, we disagree that it would improve the specification to adopt the declarative view as normative.
    • Making the document smaller and simpler in this way would detract from its usefulness to implementors who are not already experts in computational logic. In other words, we recognize that some implementors may want to check the constraints in other ways, but we believe that the algorithm we used to specify validity and equivalence is a good default, because it sits within a well-understood formalism known from database constraints and finite first-order model theory.
    • The normal forms are essentially "universal instances" in the sense of Fagin et al. 2005, and the algorithm we outline is easily seen to be in polynomial time; in contrast, simply giving a FOL theory on its own gives no guarantee of efficiency or even decidability.
    • We intend to incorporate this theory and formalize the link between the procedural and declarative specifications in PROV-SEM. Although PROV-SEM will not be normative, any implementation that correctly implements the declarative specification given there will be correct.
    • We will also take greater care to explain that the procedural approach to specification is just one of many possible ways to implement constraint checking (though the group as a whole feels that it is a good default approach for implementors seeking a shortest path to compliance).
  • References:
  • Changes to the document:
    • Revise all parts of the document that may currently convey the impression that the normalization algorithm is a REQUIRED implementation strategy, to ensure that it is clear that this is one approach (among possibly many) that implementations MAY employ. PROV-SEM will present a declarative specification that may serve as a better starting point for alternative implementations.
    • Added a paragraph to the beginning of section 2 that specifically addresses this
  • Original author's acknowledgement: http://www.w3.org/mid/5092BBE0.1080403%2540emse.fr

PROV Primer

ISSUE-561 (Primer Section 2 figure)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Jul/0010.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/561
  • Group Response
    • Since (and partly prompted by) the reviewer comment, the Working Group has discussed the best form for the primer overview diagram.
    • It was decided that the overview image used by the primer would no longer be a copy of the one from PROV-DM. This is because the intention is different: the primer aims to present just a few concepts and relations to give an intuition ahead of the rest of the introduction.
    • The figure has been changed to be a reduced version of the one used in the PROV-O specification, and no link between the diagrams in specs is now claimed.
  • References:
  • Changes to the document:
    • Removed the claim in the primer text that the image is the same as the one in PROV-DM.
    • Changed the primer key concepts (overview) image to be one with a reduced set of concepts and relations giving an introductory intuition.
  • Original author's acknowledgement:

ISSUE-562, ISSUE-563, ISSUE-564 (Specialization and alternates)

  • Original email: http://lists.w3.org/Archives/Public/public-prov-comments/2012Jul/0010.html
  • Tracker: http://www.w3.org/2011/prov/track/issues/562, http://www.w3.org/2011/prov/track/issues/563, http://www.w3.org/2011/prov/track/issues/564
  • Group Response
    • In ISSUE-562 and ISSUE-563, the reviewer comments that the primer text implies particular things that the reviewer believes to be untrue. These implications are in fact correct.
    • First, it is correct that specialization implies that the child entity inherits all of the attributes of the parent entity. It is the reviewer's counter-example that is an incorrect use of PROV: the "parent" entity of one version of a document is not the prior version of the document, but the document in general, i.e. independent of version. All versions of a document share the attributes of the document in general.
    • Second, the fact that two specializations of a single general entity are alternates of each other is a common case that fits the PROV definition of "alternate", and the implication is again correct.
    • The fact that the reviewer believed the implications to be incorrect suggests that the primer did not adequately explain the concepts.
    • ISSUE-564 relates to the reviewer finding the listed possible uses of the alternate relation confusingly distinct. Again, this is probably due to an inadequate explanation of the alternate and specialization relations.
    • The conclusion of the group is that the previous explanation of the concepts was not adequately clear.
  • References:
  • Changes to the document:
    • The intuitive introduction to the specialization and alternate relations, Section 2.9, has been completely rewritten around a few use cases, each described in more detail than before. Specialization is introduced before alternate, as it more clearly gives the overall motivation for the relations. We believe this gives a clearer indication of what the relations mean, and in what cases they should be used.
    • See http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html#alternate-entities-and-specialization
  • Original author's acknowledgement:
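
The two implications discussed in the group response to ISSUE-562 and ISSUE-563 can be illustrated with a small Python sketch. It is not part of any PROV specification or library: the identifiers (ex:doc, ex:doc-v1, ex:doc-v2), the attribute names and both helper functions are hypothetical, and attributes are simplified to sets of (name, value) pairs. It assumes the specialization statements contain no cycles.

```python
from itertools import combinations

# Hypothetical toy data: a document in general and two of its versions.
attributes = {
    "ex:doc": {("ex:title", "Annual Report")},
    "ex:doc-v1": {("ex:version", "1")},
    "ex:doc-v2": {("ex:version", "2")},
}

# Each pair reads (specific entity, general entity).
specialization_of = {("ex:doc-v1", "ex:doc"), ("ex:doc-v2", "ex:doc")}


def inherited_attributes(entity, attributes, specialization_of):
    """Return the entity's own attributes plus those of every entity it specializes."""
    result = set(attributes.get(entity, set()))
    for specific, general in specialization_of:
        if specific == entity:
            # Assumes no cycles, so the recursion terminates.
            result |= inherited_attributes(general, attributes, specialization_of)
    return result


def sibling_alternates(specialization_of):
    """Return pairs of distinct entities that specialize the same general entity."""
    by_general = {}
    for specific, general in specialization_of:
        by_general.setdefault(general, set()).add(specific)
    pairs = set()
    for specifics in by_general.values():
        for a, b in combinations(sorted(specifics), 2):
            pairs.add((a, b))
            pairs.add((b, a))  # the alternate relation is symmetric
    return pairs
```

In this toy data, each version inherits the title of the document in general while keeping its own version attribute, and the two versions come out as alternates of each other. The prior version is deliberately not modelled as the "parent" of the later one, matching the group response above.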