Team Comment on “The PROV-JSON Serialization” Submission


The W3C is pleased to receive the “PROV-JSON Serialization” submission from the University of Southampton, UK, edited by Trung Dong Huynh, Michael O. Jewell, Amir Sezavar Keshavarz, Danius T. Michaelides, Huanjia Yang, and Luc Moreau.

The W3C Provenance Working Group published the PROV “family” of documents on the 30th of April, 2013: a set of W3C Recommendations together with a number of associated Working Group Notes. The core document of the family is the “PROV-DM: The PROV Data Model” Recommendation, which defines an abstract view of provenance information that can be exchanged among applications. The “PROV-N: The Provenance Notation” Recommendation defines a way to serialize this abstract information for human consumption; the “PROV-O: The PROV Ontology” Recommendation maps the general model onto OWL 2, which makes the PROV vocabulary usable on the Semantic Web. The resulting RDF can be serialized using any of the RDF serialization formats, such as RDF/XML, Turtle, RDFa, or JSON-LD.

The generality of the PROV Data Model makes it possible to use the model and the vocabulary in applications that do not necessarily use the Semantic Web. However, when provenance information is used in a purely XML environment, for example, it is more succinct and more manageable to serialize the same information directly into XML. To this end, the Working Group has also published the “PROV-XML: The PROV XML Schema” document, albeit only as a Working Group Note, which defines this direct mapping. Similarly, there are applications on the Web that rely essentially on JSON; for those, a JSON serialization of the provenance information would be useful. Unfortunately, that work item was not part of the Working Group’s original charter, and the group is soon to close. Consequently, this particular serialization could not be defined by the Working Group. This is exactly what the Submission covers: it defines a way to encode provenance information in JSON. By doing so, it makes the generic PROV model usable for another large family of applications on the Web.

One may wonder whether it is worth defining such a serialization. After all, it would be technically possible to map the provenance data into RDF using PROV-O and then serialize the resulting RDF graph using JSON-LD. However, the outcome would be more complicated than necessary. Both of these steps use structures required by the general case they aim for (and are necessary to use provenance information with Linked Data), but the price is greater complexity in the result. Just as there is a need for PROV-XML (although the same argument could be made there for using PROV-O and RDF/XML instead), there is also a need for a direct JSON encoding, and this justifies a separate serialization definition.
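To give a rough feel for what a direct JSON encoding looks like, here is a minimal Python sketch that builds and prints one small provenance document. The top-level layout (a "prefix" map, one object per PROV record type, relations keyed by an identifier) reflects our reading of the Submission and should be checked against it; the "ex" namespace, the identifiers, and the helper function names are hypothetical and chosen only for illustration.

```python
import json


def build_document():
    """Return a tiny provenance document as a plain Python dict.

    Assumption: the key layout follows the PROV-JSON Submission as we
    understand it. The "ex:" identifiers are made up for this example.
    """
    return {
        # Namespace prefixes used by the qualified names below.
        "prefix": {"ex": "http://example.org/"},
        # One entity and one activity, with no extra attributes.
        "entity": {"ex:report": {}},
        "activity": {"ex:compile": {}},
        # A generation relation linking the entity to the activity.
        "wasGeneratedBy": {
            "_:gen1": {
                "prov:entity": "ex:report",
                "prov:activity": "ex:compile",
            }
        },
    }


def to_json(doc):
    """Serialize the document with the standard json module."""
    return json.dumps(doc, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(to_json(build_document()))
```

The document is an ordinary JSON object, so an application that already works with JSON can read it without an RDF toolchain.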

Next Steps

As of today, the Provenance Working Group is closing down, having successfully completed its chartered work. However, if a new Provenance Working Group is chartered in the future, it would be of interest to take up this Submission and turn it into a Working Group Note or a Recommendation. Also, if other Working Groups are created that concentrate on the more general issue of Data on the Web, and they also address provenance, this serialization may well be a starting point for further work.

We encourage people interested in this work to discuss it on the semantic-web@w3.org [public archive] or the public-provenance-comments@w3.org [public archive] mailing lists.

Ivan Herman, Semantic Web Activity Lead <ivan@w3.org>
