Feedback Welcome: An Overview of the Provenance (PROV) family of specs


Knowing how, where, when and why content was produced is an important part of making the Web trustworthy. However, it is often difficult to interchange this provenance information between systems. For example, it is often hard to locate the provenance information for a web page. Even when it can be located, it is frequently available only as free text, or, if it is structured, it does not use a common terminology, which makes it difficult to create software that can leverage this information.

The Provenance Working Group was chartered to help address these limitations. The group has been working diligently to create a family of specifications (called PROV) that allow for the interchange of provenance. The group is looking for your feedback. This post provides an overview of the various working drafts that have been published and should help you find your way around.

The set of specs at this point addresses two aspects of provenance interoperability introduced above:

  • provenance access
  • provenance representation

PROV-AQ: Provenance Access and Query addresses both how to make provenance information available and how to retrieve it for Web resources. The document specifies how to use existing Web technologies such as HTTP, link headers, and SPARQL to accomplish this. Where possible, the specification attempts to be agnostic to the format of the provenance being accessed.
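
To make the link-header idea concrete, here is a minimal sketch in Python of how a client might pull provenance links out of an HTTP Link header. The relation name `http://www.w3.org/ns/prov#has_provenance`, the function name `find_links` and the example.org URLs are assumptions made for illustration; the exact relation name depends on the version of PROV-AQ you are reading, so check the draft before relying on it.

```python
import re

# Assumed relation type; the exact name depends on the PROV-AQ draft in use.
PROVENANCE_REL = "http://www.w3.org/ns/prov#has_provenance"

# One link-value: <uri> followed by zero or more ;name=value parameters.
LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[^,;=\s]+\s*=\s*(?:"[^"]*"|[^,;\s]*))*)'
)
# The rel parameter, quoted or unquoted.
REL_PARAM = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^,;\s]*))', re.IGNORECASE)


def find_links(link_header, rel):
    """Return the target URIs in a Link header value whose rel includes `rel`."""
    targets = []
    for match in LINK_VALUE.finditer(link_header):
        uri, params = match.group(1), match.group(2)
        rel_match = REL_PARAM.search(params)
        if rel_match is None:
            continue
        # A rel value may list several space-separated relation types.
        rels = (rel_match.group(1) or rel_match.group(2) or "").split()
        if rel in rels:
            targets.append(uri)
    return targets


# Hypothetical header, as a server might send it alongside a document.
example_header = (
    '<http://example.org/prov/doc1>; '
    'rel="http://www.w3.org/ns/prov#has_provenance"; '
    'anchor="http://example.org/doc1", '
    '<http://example.org/style.css>; rel="stylesheet"'
)
provenance_uris = find_links(example_header, PROVENANCE_REL)
```

The sketch only parses a header value that has already been received; fetching the resource and dereferencing the provenance URI are left to whatever HTTP client you already use.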

Once some provenance is obtained, it is important for the information to be understandable in a machine-interpretable fashion. The Working Group has defined a data model (PROV-DM) that provides facilities for representing the entities, people and activities involved in producing a piece of data or a thing in the world. The data model is domain-agnostic and has well-defined extensibility points. Importantly, the data model has a corresponding OWL ontology (PROV-O) that encodes PROV-DM. PROV-O is envisioned to specify the serialization for exchanging provenance information.
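
As a rough illustration of what such a description could look like, the Python sketch below records one entity, one activity and one person as RDF-style triples and prints them as N-Triples. The class and property names (`Entity`, `Activity`, `Agent`, `wasGeneratedBy`, `wasAssociatedWith`, `wasAttributedTo`) and the `http://www.w3.org/ns/prov#` namespace are assumptions that may not match the working drafts discussed here; the example.org resources and the helper names are hypothetical.

```python
# Assumed namespace and term names; check them against the PROV-O draft you use.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the example resources


def describe_report():
    """Return triples saying that Alice produced a report by writing it."""
    report = EX + "report"
    writing = EX + "writing"
    alice = EX + "alice"
    return [
        (report, RDF_TYPE, PROV + "Entity"),            # the thing produced
        (writing, RDF_TYPE, PROV + "Activity"),         # how it was produced
        (alice, RDF_TYPE, PROV + "Agent"),              # who was involved
        (report, PROV + "wasGeneratedBy", writing),
        (writing, PROV + "wasAssociatedWith", alice),
        (report, PROV + "wasAttributedTo", alice),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = describe_report()
print(to_ntriples(triples))
```

Plain tuples are used here only to keep the sketch free of dependencies; in practice an RDF library would handle literals, blank nodes and other serializations.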

To help orient users of PROV-O and PROV-DM, the working group has developed a primer (PROV-Primer) that introduces the core constructs of the data model and provides examples using PROV-O. It is recommended that users and reviewers of the specification begin with the primer before moving to the ontology or data model.

The group is looking for feedback of all types: Would you expose provenance using PROV-AQ? Can you represent your provenance information using the PROV-DM data model? Does PROV-O integrate well with your Linked Data or other Semantic Web infrastructure?

Let us know what you think.

The PROV family of specifications:

Paul Groth and Luc Moreau on behalf of the PROV-WG
