The PROV provenance model – Last Call

The Provenance Working Group has released a last call working draft of its data model (PROV-DM). This is the fifth draft of the document; here I summarize the changes since the previous version.

From a presentation viewpoint, the Working Group has worked hard to identify a set of Core constructs. The core of PROV focuses on essential structures commonly found in provenance descriptions. It is centered around the concepts of entity, activity, and agent, and seven binary associations between these (generation, usage, communication, derivation, attribution, association, and delegation). Beyond the PROV core, extended structures are designed to support more advanced uses of provenance.
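To make the shape of the core concrete, here is a minimal Python sketch of the three concepts and seven relations as a small typed vocabulary. It is not an official PROV API: the names `Node`, `Statement`, `CORE_RELATIONS`, and `is_well_typed`, and the `ex:` identifiers, are hypothetical, and the strict type check is a simplification (in PROV an agent may itself also be an entity or an activity).

```python
from dataclasses import dataclass

# The three core concepts named in the text.
ENTITY, ACTIVITY, AGENT = "entity", "activity", "agent"

# The seven core relations, each mapped to (subject kind, object kind).
# Directions follow the usual PROV reading, e.g. an entity
# "was generated by" an activity, and an activity "used" an entity.
CORE_RELATIONS = {
    "generation": (ENTITY, ACTIVITY),
    "usage": (ACTIVITY, ENTITY),
    "communication": (ACTIVITY, ACTIVITY),
    "derivation": (ENTITY, ENTITY),
    "attribution": (ENTITY, AGENT),
    "association": (ACTIVITY, AGENT),
    "delegation": (AGENT, AGENT),
}


@dataclass(frozen=True)
class Node:
    identifier: str  # hypothetical identifier, e.g. "ex:article"
    kind: str        # one of ENTITY, ACTIVITY, AGENT


@dataclass(frozen=True)
class Statement:
    relation: str
    subject: Node
    object: Node


def is_well_typed(stmt):
    """Return True if the statement uses a core relation with matching kinds."""
    expected = CORE_RELATIONS.get(stmt.relation)
    return expected == (stmt.subject.kind, stmt.object.kind)


# Illustrative (made-up) nodes: an article, the writing of it, and its author.
article = Node("ex:article", ENTITY)
writing = Node("ex:writing", ACTIVITY)
author = Node("ex:author", AGENT)

generated = Statement("generation", article, writing)
attributed = Statement("attribution", article, author)
```

Here `is_well_typed(generated)` holds, while swapping subject and object in the generation statement would fail the check.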

Furthermore, the group has confirmed the structure of PROV in terms of components dealing with various facets of provenance. Core concepts are defined in the first three components.

  • Component 1 is concerned with entities, activities, and time.
  • Component 2 is about derivations and derivation subtypes.
  • Component 3 deals with agents, responsibility, and influence.

Extended concepts are found in the remaining components.

  • Component 4 is about bundles, a mechanism to support provenance of provenance.
  • Component 5 consists of relations linking entities referring to the same thing.
  • Component 6 is concerned with collections.

A number of technical changes occurred in this version of the document.

1. The Working Group decided to limit the scope of Component 6 to abstract collections and a simple membership relation. All the concepts related to dictionaries and operations over dictionaries are being moved to a separate note.

2. The Working Group has finalized a mechanism by which provenance of provenance can be expressed. In the PROV data model, individual provenance statements are not identifiable. However, an asserter can bundle up a set of statements, give the set a name, and express its provenance using PROV, since the resulting bundle is itself an entity. The construct introduced to support this is a bundle.

3. The Working Group has finalized definitions of two relations linking entities. An entity is a specialization of another if it shares all aspects of the latter, and additionally presents more specific aspects of the same thing as the latter. In contrast, an entity is an alternate of another if both present aspects of the same thing.

For instance, a BBC News page for desktop devices, http://www.bbc.co.uk/news/science-environment-17526723, is an alternate of the corresponding page for mobile devices, http://m.bbc.co.uk/news/science-environment-17526723. The BBC News home page on a given day (say 2012-03-23) is a specialization of the BBC News home page in general.
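The BBC example can be written down as data. The following Python sketch is only an illustration: the two page URLs come from the example above, while the `ex:` identifiers for the home page and the helper functions are hypothetical. Treating alternate as symmetric is an assumption of this sketch, suggested by the wording of the definition.

```python
# URLs taken from the example in the text.
DESKTOP = "http://www.bbc.co.uk/news/science-environment-17526723"
MOBILE = "http://m.bbc.co.uk/news/science-environment-17526723"

# Hypothetical identifiers: the text gives no URLs for these two entities.
NEWS_HOME = "ex:bbc-news-home"
NEWS_HOME_ON_DAY = "ex:bbc-news-home-2012-03-23"

# Alternates are stored as unordered pairs (assumed symmetric);
# specializations are stored as ordered (specific, general) pairs.
alternates = set()
specializations = set()


def add_alternate(a, b):
    alternates.add(frozenset((a, b)))


def add_specialization(specific, general):
    specializations.add((specific, general))


def is_alternate(a, b):
    return frozenset((a, b)) in alternates


def is_specialization(specific, general):
    return (specific, general) in specializations


add_alternate(DESKTOP, MOBILE)
add_specialization(NEWS_HOME_ON_DAY, NEWS_HOME)
```

With these statements recorded, `is_alternate(MOBILE, DESKTOP)` holds in either order, whereas `is_specialization(NEWS_HOME, NEWS_HOME_ON_DAY)` does not, because specialization is directional.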

These relations are important for provenance, since they relate entities that are about the same thing, even though those entities may be described by different provenance statements.
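The bundle mechanism from item 2 can be sketched in the same spirit. This is a simplified Python illustration, not the PROV-DM serialization: the `Bundle` class, the tuple layout `(relation, subject, object)`, the `provenance_of` helper, and all `ex:` identifiers are hypothetical. The point it shows is that the statements inside a bundle carry no identifiers of their own, while the bundle's name can appear as the subject of further statements.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    identifier: str  # the name given to the set of statements
    statements: list = field(default_factory=list)


# A named set of provenance statements. Each statement is a plain
# (relation, subject, object) tuple and is not individually identifiable.
bundle = Bundle(
    "ex:bundle1",
    [
        ("generation", "ex:article", "ex:writing"),
        ("attribution", "ex:article", "ex:author"),
    ],
)

# Provenance of provenance: the bundle is itself an entity, so its
# identifier is used as the subject of ordinary statements.
meta_statements = [
    ("attribution", bundle.identifier, "ex:asserter"),
    ("generation", bundle.identifier, "ex:publishing"),
]


def provenance_of(identifier, statements):
    """Return the statements whose subject is the given identifier."""
    return [s for s in statements if s[1] == identifier]


bundle_provenance = provenance_of(bundle.identifier, meta_statements)
```

Here `bundle_provenance` holds the two statements about `ex:bundle1`, kept separate from the statements the bundle contains.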

The PROV-WG is releasing this fifth version of the PROV data model as a LAST CALL working draft. This means that the design is not expected to change significantly, going forward, and now is the key time for external review. If you wish to make comments regarding this document, please send them to public-prov-comments@w3.org. The Last Call period ends 18 September 2012. All feedback is welcome.
