PROV examples

From Provenance WG Wiki
Revision as of 16:01, 28 February 2012 by Tlebo

This page is an initial proposal for how to organize concrete PROV examples. The examples are intended to guide discussions of mappings between PROV serializations, support automated verification, and provide test cases for PROV tools.


PROV examples have been kept in the following places:

Example root

The collection is maintained in the prov-wg mercurial repository at

Facets of organization

Some materials that we might want to organize around each example:

  • Example name ("example 1", "khalid at restaurant", etc.)
  • Original Format (ASN, XML, RDF, JSON)
  • Derived Formats (ASN, XML, RDF, JSON)
  • Testing materials (??, XQUERY, SPARQL, ??)
  • Test targets (i.e., the expected outputs to compare)


The first level of organization is the original format:

Within each of these directories can be a stub template. For example, the rdf stub defines some common prefixes.

These "original format" directories contain a directory for each example. The example directory is named according to the pattern:


The title is optional. Use dashes to separate words; avoid underscores and spaces. The title need not be unique, since the number will be.

For example, the following two directories contain the materials for the first two examples. The first one is untitled, while the second one is titled.

<number> is determined by counting the number of examples in the current directory and adding one.
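The numbering rule can be sketched as a small helper. This is only an illustration: the directory pattern `eg-<number>` with an optional `-<title>` suffix is an assumption inferred from the `eg-1` names used elsewhere on this page, and the function name `next_example_name` is hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern: "eg-<number>", optionally followed by "-<title>".
EXAMPLE_DIR = re.compile(r"^eg-\d+(?:-.+)?$")


def next_example_name(format_dir: Path, title: str = "") -> str:
    """Return the directory name for the next example inside format_dir."""
    # Count only directories that look like examples, so a stub template
    # or other material in the same directory does not affect the number.
    existing = [p for p in format_dir.iterdir()
                if p.is_dir() and EXAMPLE_DIR.match(p.name)]
    number = len(existing) + 1  # count the current examples and add one

    # Separate the words of the title with dashes; no underscores or spaces.
    words = [w for w in re.split(r"[\s_]+", title) if w]
    slug = "-".join(words)
    return f"eg-{number}-{slug}" if slug else f"eg-{number}"
```

Because the count is taken per directory, calling the helper on two different format directories can return the same number for unrelated examples, which matches the rule that numbering is local to the current directory.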

The numbering is determined within the current directory, not across all directories. So, the following two examples have nothing to do with each other:

Name the file containing the example exactly the same as the directory. So,

If the example was converted, place the result in a directory named after the resulting format. So, if these rdf examples were converted to ASN, the result would be stored at:

The structure of the testing materials is still undecided and will likely vary with each test format. Whatever structure is used in one format directory (e.g. examples/asn/) should be used in all format directories (e.g. examples/rdf/eg-1/convert/asn/).
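As a rough sketch of the layout described so far, the two helpers below build the path of an example file and of a conversion directory. The function names are hypothetical, and the file extension is passed in explicitly because this page only mentions `.ttl` for the rdf examples.

```python
from pathlib import PurePosixPath


def example_file(root: str, fmt: str, name: str, ext: str) -> PurePosixPath:
    """Path of the example file, named exactly like its directory."""
    return PurePosixPath(root) / fmt / name / f"{name}.{ext}"


def conversion_dir(root: str, original_fmt: str, name: str,
                   target_fmt: str) -> PurePosixPath:
    """Directory holding the result of converting an example to target_fmt."""
    return PurePosixPath(root) / original_fmt / name / "convert" / target_fmt
```

For instance, `example_file("examples", "rdf", "eg-1", "ttl")` gives `examples/rdf/eg-1/eg-1.ttl`, and `conversion_dir("examples", "rdf", "eg-1", "asn")` gives `examples/rdf/eg-1/convert/asn`.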

If queries can be applied to the example file (e.g. eg-1.ttl), then put the queries in a directory named query. Name the queries following the same pattern used for naming examples:


For example, the first query was not titled, but the second one was:

The query should be applied to the union of the example files (e.g. for eg-1, eg-1/eg-1.ttl is the only file).

The expected output for a query should have the same name as the query, but be placed in the compare directory:
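One way to check this convention automatically is sketched below. It assumes that the query and compare directories both sit inside the example directory and that a query and its expected output share a base name while their extensions may differ; both points are assumptions, since the testing structure is still undecided. The function name is hypothetical.

```python
from pathlib import Path
from typing import List


def queries_missing_expected_output(example_dir: Path) -> List[str]:
    """List the query files that have no same-named file in compare/."""
    query_dir = example_dir / "query"
    compare_dir = example_dir / "compare"

    # Base names (without extension) of the expected outputs that exist.
    expected = set()
    if compare_dir.is_dir():
        expected = {p.stem for p in compare_dir.iterdir() if p.is_file()}

    queries = []
    if query_dir.is_dir():
        queries = sorted(p for p in query_dir.iterdir() if p.is_file())

    # A query is covered when some expected output shares its base name.
    return [q.name for q in queries if q.stem not in expected]
```

An empty result means every query in the example has a test target to compare against.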


  • remove first level format layer, bringing example to top
  • add place to link to homepage for example. (The examples I have in mind 1) identify the problem, 2) the use of provenance, 3) the PROV example, and 4) how PROV is accessed and queried. I'm thinking 1 to 1 1/2 pages per example.)

  • add template on wiki to document example.