PROV – A Framework for Provenance Interchange

Last week, the W3C Provenance Working Group released 13 documents simultaneously that together define a framework for interchanging provenance on the Web. We are really excited about this release, as it is a complete and stable definition of PROV and includes 4 Proposed Recommendations.

While 13 documents is a lot, this is because we have broken PROV down into chunks designed for particular communities and usages. As a user of PROV, you won’t have to focus on the entire framework, just the parts that you need. For an overview of this family of documents and their intended audiences, check out the PROV Overview.

Here, I want to give you a bit of a guide to the PROV framework and the role of the various documents.

The Core: A Data Model

At the center of PROV is a data model, PROV-DM, that defines a vocabulary for describing provenance. These terms allow for the description of provenance from data, process and agent perspectives. PROV-DM can be written down in multiple serialization technologies. PROV defines 3 serializations:

  1. PROV-O is a lightweight OWL2 ontology designed for Linked Data and Semantic Web applications.
  2. PROV-N is a compact syntax aimed at human consumption.
  3. PROV-XML is a native XML schema specifically designed for the XML community.

Using these serializations, applications can expose and interchange provenance. PROV-DM and its serializations have specifically been designed with extensibility in mind. We already have several extensions of PROV-O designed for specific communities.
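To make the idea of one data model with several written forms a little more concrete, here is a minimal Python sketch that records a few statements (entities, an activity, an agent and the relations between them) and writes them out in PROV-N style. The `ProvDocument` class, its method names and the `ex:` identifiers are hypothetical and exist only for this illustration; the output follows the PROV-N notation as I understand it, so check the PROV-N document before relying on the exact syntax.

```python
from dataclasses import dataclass, field


@dataclass
class ProvDocument:
    """Hypothetical in-memory holder for a handful of PROV-DM statements."""

    prefixes: dict = field(default_factory=dict)
    statements: list = field(default_factory=list)

    def entity(self, ident):
        self.statements.append(f"entity({ident})")

    def activity(self, ident):
        self.statements.append(f"activity({ident})")

    def agent(self, ident):
        self.statements.append(f"agent({ident})")

    def used(self, activity, entity):
        # "-" marks an optional argument (here the time) left unspecified
        self.statements.append(f"used({activity}, {entity}, -)")

    def was_generated_by(self, entity, activity):
        self.statements.append(f"wasGeneratedBy({entity}, {activity}, -)")

    def was_associated_with(self, activity, agent):
        # "-" here stands for the optional plan, which we do not record
        self.statements.append(f"wasAssociatedWith({activity}, {agent}, -)")

    def to_provn(self):
        """Render the recorded statements as a PROV-N style document."""
        lines = ["document"]
        for prefix, uri in self.prefixes.items():
            lines.append(f"  prefix {prefix} <{uri}>")
        lines.extend(f"  {statement}" for statement in self.statements)
        lines.append("endDocument")
        return "\n".join(lines)


# The three perspectives: data (entities), process (activity), agent.
doc = ProvDocument(prefixes={"ex": "http://example.org/"})
doc.entity("ex:dataset")
doc.entity("ex:chart")
doc.activity("ex:plotting")
doc.agent("ex:alice")
doc.used("ex:plotting", "ex:dataset")
doc.was_generated_by("ex:chart", "ex:plotting")
doc.was_associated_with("ex:plotting", "ex:alice")

provn_text = doc.to_provn()
print(provn_text)
```

The same statements could just as well be written as PROV-O triples or PROV-XML elements; only the rendering step would change.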

Supporting Validation

PROV-DM and the associated serializations were purposely designed to allow for flexibility in writing provenance. We wanted to make it as easy as possible to get started and to allow for adaptability as PROV is increasingly used. However, we also realized that some users want a way to ensure that their provenance is consistent. Just as there are HTML validators, we wanted to make PROV validators possible. PROV-Constraints defines a set of constraints that can be used to implement validators. PROV-Constraints is backed by a formal semantics defined in PROV-Sem.
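To give a flavor of what a validator does, here is a small Python sketch of a single consistency check: an activity should not end before it starts. This is only an analogy. As I understand it, PROV-Constraints reasons about the ordering of events rather than comparing clock times, and it defines many more constraints than this one. The function name `check_activity_times` and the input shape are assumptions made for this example.

```python
from datetime import datetime


def check_activity_times(activities):
    """Return a list of human-readable problems; an empty list means none found.

    `activities` maps an activity id to a (start, end) pair of datetimes.
    Either value may be None when it is unknown, in which case nothing
    can be checked for that activity.
    """
    problems = []
    for ident, (start, end) in activities.items():
        if start is not None and end is not None and start > end:
            problems.append(
                f"{ident}: starts at {start.isoformat()} "
                f"but ends earlier, at {end.isoformat()}"
            )
    return problems


activities = {
    "ex:plotting": (datetime(2013, 3, 12, 10, 0), datetime(2013, 3, 12, 11, 0)),
    "ex:cleaning": (datetime(2013, 3, 12, 9, 0), datetime(2013, 3, 12, 8, 0)),
    "ex:review": (datetime(2013, 3, 12, 12, 0), None),  # end time unknown
}

problems = check_activity_times(activities)
for problem in problems:
    print(problem)
```

Provenance that fails a check like this is still perfectly expressible in PROV-DM; the constraints are an optional layer for those who want the extra assurance.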

Data Model Extensions

Two use cases for modeling provenance show up in multiple applications. One is aggregating information into collection or dictionary type structures (e.g. a folder with files); the other is connecting multiple provenance traces together. PROV-Dictionary and PROV-Links define constructs to help model these cases.

Accessing Provenance

Finally, once you’ve modeled your provenance, you want to be able to easily expose it. PROV-AQ defines how to use already existing web mechanisms, like link headers, to make provenance available. A key part of the design of PROV-AQ was to make it independent of any serialization format, so you can use whatever best fits your needs.
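As a rough illustration of the link header approach, the Python sketch below pulls provenance links out of an HTTP `Link` header value. It assumes the link relation `http://www.w3.org/ns/prov#has_provenance` and the optional `anchor` parameter, which is how I recall PROV-AQ describing it; please confirm against the PROV-AQ document. The function name and example URIs are made up, and the parser is deliberately simplified (it expects quoted parameter values and is not a full Link header parser).

```python
import re

# Link relation assumed from PROV-AQ; verify against the specification.
HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"


def provenance_links(link_header):
    """Return (provenance_uri, anchor_or_None) pairs found in a Link header."""
    links = []
    # Each link-value looks like: <uri>; param="value"; param="value"
    for match in re.finditer(r'<([^>]*)>([^<]*)', link_header):
        uri, raw_params = match.group(1), match.group(2)
        params = dict(re.findall(r';\s*([\w-]+)\s*=\s*"([^"]*)"', raw_params))
        # A rel parameter may list several relation types separated by spaces.
        if HAS_PROVENANCE in params.get("rel", "").split():
            links.append((uri, params.get("anchor")))
    return links


header = (
    '<http://example.org/report/provenance>; '
    'rel="http://www.w3.org/ns/prov#has_provenance"; '
    'anchor="http://example.org/report", '
    '<http://example.org/style.css>; rel="stylesheet"'
)

found = provenance_links(header)
print(found)
```

Note that nothing here depends on how the provenance itself is serialized: the header only says where to find it.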

Dublin Core

Dublin Core is one of the most widely published vocabularies, and many of its terms are associated with provenance. Working with the DC community, we’ve defined a mapping between Dublin Core and PROV-O (PROV-DC). This means that applications that support PROV can easily consume provenance already exposed as Dublin Core.
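Here is a small Python sketch of what consuming Dublin Core as PROV might look like: statements using a few Dublin Core terms are re-expressed with PROV-O properties. The table below is an illustrative subset written from memory, not the mapping itself, so treat PROV-DC as the authority on which terms map where. The `dc_to_prov` function, the tuple representation of statements and the `ex:` identifiers are assumptions for this example.

```python
# Illustrative subset only; consult PROV-DC for the actual mapping.
DC_TO_PROV = {
    "dct:creator": "prov:wasAttributedTo",
    "dct:contributor": "prov:wasAttributedTo",
    "dct:publisher": "prov:wasAttributedTo",
    "dct:source": "prov:wasDerivedFrom",
}


def dc_to_prov(triples):
    """Return PROV statements for the Dublin Core triples we know how to map.

    `triples` is an iterable of (subject, predicate, object) tuples using
    prefixed names. Triples with unmapped predicates are skipped.
    """
    mapped = []
    for subject, predicate, obj in triples:
        prov_predicate = DC_TO_PROV.get(predicate)
        if prov_predicate is not None:
            mapped.append((subject, prov_predicate, obj))
    return mapped


dublin_core = [
    ("ex:report", "dct:creator", "ex:alice"),
    ("ex:report", "dct:source", "ex:dataset"),
    ("ex:report", "dct:title", "Quarterly report"),  # no provenance meaning here
]

prov_statements = dc_to_prov(dublin_core)
for statement in prov_statements:
    print(statement)
```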

Summary

PROV provides a framework for writing down, validating and exchanging provenance information in an interoperable way. Already over 60 implementations support PROV and we expect more in the future. If you have an implementation, there’s still time to register yours using one of our surveys. See the Call for Implementations page for more information. PROV contains both recommendations and notes. The classification was primarily based on the amount of prior work and implementation experience the specification has.

What you can do

We are still looking for feedback on the documents: PROV-Primer, PROV-XML, PROV-DC, PROV-Dictionary, PROV-Links, PROV-Sem. You can also report your implementation. If you have questions or comments, please contact public-prov-comments@w3.org.

Finally, if you’re a W3C member and think that PROV should be a final Recommendation of the W3C, encourage your AC Representative to vote for the specification.
