PROV examples - directory conventions

From Provenance WG Wiki

Author: Tim Lebo, with many good suggestions from prov-wg

This page is a proposal for how to organize concrete PROV examples. The examples are intended to guide discussions of mappings between PROV serializations, support automated verification, and provide test cases for PROV tools.

Background

PROV examples were previously kept in the following places:

This design is a fresh start.

Example root

The collection is maintained in the prov-wg mercurial repository at

http://dvcs.w3.org/hg/prov/file/tip/examples/

See MercurialRepository and Mercurial_repository for instructions on how to clone, add to, and commit to the prov-wg repository.

Facets of organization

The following are materials that we might want to organize around each example:

  • Example name ("painting flying to boston", "khalid at restaurant", etc.)
  • Application domain (e.g. Journalism, Life Sciences, Financial and Legal Audit) (TODO)
  • Narrative and discussion (e.g. 1-2 pages in wiki format)
  • Original encoding Format (ASN, XML, RDF, JSON)
  • Derived encoding Formats (ASN, XML, RDF, JSON)
  • Testing materials (??, XQUERY, SPARQL, ??)
  • Test targets (i.e., the expected outputs to compare)

Directory organization and naming conventions

Directory for each example

A directory is created for each example. The example's directory is named according to the pattern:

'eg-'<number>['-'title]

The title is optional and need not be unique. Use dashes to separate words; avoid underscores and spaces. Once the number is assigned, it may not be changed.

For example, the following two directories contain the materials for the first two examples. The first one is not titled, while the second one is.

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait

<number> is determined by counting the existing example directories and adding one.
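The naming rule above can be sketched in Python. This is only an illustration: the function names and the regular expression are assumptions made here, not part of the prov-wg tooling.

```python
import re
from pathlib import Path

# Pattern from this page: 'eg-'<number>['-'title].
# The title uses dashes between words, never underscores or spaces.
EXAMPLE_DIR = re.compile(r"^eg-(\d+)(?:-([A-Za-z0-9]+(?:-[A-Za-z0-9]+)*))?$")


def example_dir_name(number, title=None):
    """Build a directory name such as 'eg-1' or 'eg-2-self-portrait'."""
    if title is None:
        return f"eg-{number}"
    # Turn runs of spaces and underscores into single dashes.
    slug = re.sub(r"[\s_]+", "-", title.strip())
    return f"eg-{number}-{slug}"


def next_example_number(examples_root):
    """Count the existing example directories and add one."""
    root = Path(examples_root)
    # Only directories that follow the eg- pattern are counted, so that
    # other directories under the root do not shift the numbering.
    existing = [p for p in root.iterdir() if p.is_dir() and EXAMPLE_DIR.match(p.name)]
    return len(existing) + 1
```

Because a number may not change once assigned, a helper like this would be run once, when the directory is first created.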

Files

Place the file containing the example in a directory that corresponds to its format. Name the file to match the name of the example directory. Use a file extension to indicate which format the example uses (e.g. ttl, rdf, nt for PROV-O, asn for PROV-DM, xml, etc.). For example, the following two examples encode RDF using Turtle:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/eg-1.ttl
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/rdf/eg-2-self-portrait.ttl

The following format directory names may be used:

  • rdf for PROV-O. This is for the rdf/xml, turtle, ntriples, rdfa, trig, trix, etc. formats.
  • asn for PROV-DM.
  • xml for PROV-XML.
  • json
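As a rough sketch, the file placement rule can be written as a small Python helper. The function name and the extension table are assumptions for illustration; the page names ttl, rdf, nt, asn, and xml only as examples, and the json extension is a guess.

```python
from pathlib import PurePosixPath

# Format directory for each file extension. The directory names come from
# the list above; the extension-to-directory mapping is an assumption.
FORMAT_DIR_BY_EXTENSION = {
    "ttl": "rdf",
    "rdf": "rdf",
    "nt": "rdf",
    "asn": "asn",
    "xml": "xml",
    "json": "json",
}


def example_file_path(example, extension):
    """Return examples/<example>/<format dir>/<example>.<extension>."""
    format_dir = FORMAT_DIR_BY_EXTENSION[extension]
    return PurePosixPath("examples") / example / format_dir / f"{example}.{extension}"
```

For instance, example_file_path("eg-2-self-portrait", "ttl") yields examples/eg-2-self-portrait/rdf/eg-2-self-portrait.ttl, matching the second Turtle file listed earlier.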

Illustrating Example files

Place illustrations of an example file in a directory named illustrate. For example:
./eg-11-w3c-publication/asn/convert/rdf/illustrate/eg-11-w3c-publication.n3.rdf.graffle.png

Automated creation

NOTE: prov-wg has not reviewed this section 

If the example can be created automatically, place the code that creates it at:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-9-provrdf-owl-coverage/rdf/create/

The results should be stored at:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-9-provrdf-owl-coverage/rdf/create/rdf/eg-9-provrdf-owl-coverage.ttl

Transcriptions

If the original example was transcribed (either manually or automatically), place the result under a convert directory, in a subdirectory named after the resulting format. So, if the RDF examples above were converted to ASN, the results would be stored at:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/convert/asn/eg-1.asn
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/rdf/convert/asn/eg-2-self-portrait.asn

Note that these locations differ from the following, which lack rdf/convert/. In the cases above, the ASN was converted from RDF. In the cases below, the ASN was the original format.

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/asn/eg-1.asn
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/asn/eg-2-self-portrait.asn

Taking this a step further, a round trip (RDF to ASN and back to RDF) is stored at:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/convert/asn/convert/rdf/eg-1.ttl
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/rdf/convert/asn/convert/rdf/eg-2-self-portrait.ttl
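The nesting of convert directories can be expressed as a chain of formats. The sketch below assumes hypothetical helper names (encoding_dir, encoding_file) and repository-relative paths; it only illustrates the pattern described in this section.

```python
from pathlib import PurePosixPath


def encoding_dir(example, formats):
    """Directory for an encoding reached through a chain of formats.

    formats[0] is the original format; each later entry is the target
    of one conversion step and adds a 'convert/<target>' level.
    """
    if not formats:
        raise ValueError("at least the original format is required")
    path = PurePosixPath("examples") / example / formats[0]
    for target in formats[1:]:
        path = path / "convert" / target
    return path


def encoding_file(example, formats, extension):
    """File for the final encoding in the chain, named after the example."""
    return encoding_dir(example, formats) / f"{example}.{extension}"
```

Under these assumptions, encoding_file("eg-1", ["rdf", "asn", "rdf"], "ttl") gives the round-trip location examples/eg-1/rdf/convert/asn/convert/rdf/eg-1.ttl, while encoding_file("eg-1", ["asn"], "asn") gives the location of an original ASN encoding.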

Testing

The structure of the testing materials is still undecided. It will likely change with each test format. Whatever structure is used in one format directory (e.g. examples/$eg/asn/) should be used in all format directories (e.g. examples/$eg/$format/convert/asn/).

Queries

If queries can be applied to the example file (e.g. eg-1.ttl), then put the queries in a directory named query. Follow the same pattern for naming queries as we used for naming examples:

'query-'<number>['-'title]

For example, the first query is not titled, but the second one is:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/query/query-1.rq
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/query/query-2-who-to-blame.rq

The query should be applied to the union of the example files in the current directory (in this example, examples/eg-1/rdf/eg-1.ttl is the only file).

The file holding the expected output for a query should match the name of the query, but be placed in the compare directory:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/query/compare/query-1.out
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/query/compare/query-2-who-to-blame.out
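A test harness could derive the expected-output location from the query location. The following is a minimal sketch; the function name is hypothetical, and the .out extension is taken from the two paths above.

```python
from pathlib import PurePosixPath


def expected_output_path(query_path):
    """Map .../query/<name>.rq to .../query/compare/<name>.out."""
    query = PurePosixPath(query_path)
    if query.parent.name != "query":
        raise ValueError(f"not inside a query directory: {query_path}")
    # Keep the query's base name and swap the extension for .out.
    return query.parent / "compare" / (query.stem + ".out")
```

A harness built on this would run the query against the union of the example files in the format directory and compare the result with the contents of the returned file.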

Queries on Round-trips

Queries that apply to earlier encodings of the same format can also be applied to the round-tripped results, without duplicating the queries. Queries that should be applied ONLY to the round-tripped results may be placed in the appropriate location (by following the overlay of the two design patterns):

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/convert/asn/convert/rdf/query/query-1.rq
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/rdf/convert/asn/convert/rdf/query/query-1.rq

Note that the following two queries are distinct. The former applies to both RDF encodings, but the latter applies only to the round-tripped RDF encoding.

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/query/query-1.rq
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/convert/asn/convert/rdf/query/query-1.rq
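One way to read this overlay: queries stored with an earlier encoding of the same format also apply to the round-tripped result, while queries stored with the round-tripped result apply to it alone. The sketch below lists the query directories that apply to the last encoding in a conversion chain. The function name and the chain representation are assumptions for illustration.

```python
from pathlib import PurePosixPath


def applicable_query_dirs(example, formats):
    """Query directories that apply to the final encoding in `formats`.

    formats[0] is the original format; each later entry is a conversion
    target. A query directory applies when it belongs to an encoding in
    the chain that has the same format as the final encoding.
    """
    final_format = formats[-1]
    dirs = []
    path = PurePosixPath("examples") / example
    for index, fmt in enumerate(formats):
        # The first format sits directly under the example directory;
        # every later one sits under an extra 'convert' level.
        path = path / fmt if index == 0 else path / "convert" / fmt
        if fmt == final_format:
            dirs.append(path / "query")
    return dirs
```

For the chain ["rdf", "asn", "rdf"] on eg-1 this returns both examples/eg-1/rdf/query and examples/eg-1/rdf/convert/asn/convert/rdf/query; for the chain ["rdf"] it returns only the first.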

Inference

This section is a sketch and needs to be annealed with an example.

If the example uses inference, the ontology or rules can be stored in the infer directory. The inferences should be applied to the union of the files in the example.

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/infer/my-property-chain.owl

The expected output is stored in the compare directory.

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/infer/compare/my-property-chain.ttl

If the inference should be queried, the query pattern in the previous section can be applied:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/infer/compare/query/query-1.rq
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/infer/compare/query/compare/query-1.out

Documentation

Use a directory named document at any level within the example directory structure to document the corresponding level. For example, the following text files describe the purpose of the overall example.

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/document/readme.txt
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/document/readme.txt

Within a document directory, a file homepage may contain one or more URLs that describe this example:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/document/homepage
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-2-self-portrait/document/homepage

The PROV-O encoding can be documented here:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/document/

The ASN conversion activity can be documented here:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/convert/document/

The ASN output can be documented here:

http://dvcs.w3.org/hg/prov/file/tip/examples/eg-1/rdf/convert/asn/document/

(and so on)

Reserved namespace for wiki documentation

The namespace http://www.w3.org/2011/prov/wiki/eg- is reserved for wiki page documentation for all examples in http://dvcs.w3.org/hg/prov/file/tip/examples/. The following mapping must be followed:

http://dvcs.w3.org/hg/prov/file/tip/examples/<eg-name> <=> http://www.w3.org/2011/prov/wiki/<eg-name>

See How to document a PROV example
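The mapping between repository directories and wiki pages is mechanical, so it could be checked or generated by a script. The following is a small sketch; the two prefixes are taken from this section, while the function names are hypothetical.

```python
REPO_PREFIX = "http://dvcs.w3.org/hg/prov/file/tip/examples/"
WIKI_PREFIX = "http://www.w3.org/2011/prov/wiki/"


def repo_to_wiki(repo_url):
    """Map an example's repository URL (or any URL below it) to its wiki page."""
    if not repo_url.startswith(REPO_PREFIX):
        raise ValueError(f"not under the examples root: {repo_url}")
    # The example name is the first path segment after the examples root.
    eg_name = repo_url[len(REPO_PREFIX):].split("/", 1)[0]
    if not eg_name.startswith("eg-"):
        raise ValueError(f"not an example directory: {repo_url}")
    return WIKI_PREFIX + eg_name


def wiki_to_repo(wiki_url):
    """Map a wiki page in the reserved eg- namespace to its repository directory."""
    if not wiki_url.startswith(WIKI_PREFIX + "eg-"):
        raise ValueError(f"not in the reserved eg- namespace: {wiki_url}")
    return REPO_PREFIX + wiki_url[len(WIKI_PREFIX):]
```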

Additional Links

TODO

  • We may want to add examples for the top 10 application domains that WG participants consider relevant, and organize this page by domain (via tagging or structure). To foster adoption of the standard, we need to show people who are looking for solutions to specific problems, more so than tool builders, how PROV can solve those problems.
  • DONE remove first level format layer, bringing example to top
  • DONE add place to link to homepage for example. (document/homepage described on this page)
  • DONE (The examples I have in mind 1) identify the problem 2) the use of provenance 3) the PROV example 4) how PROV is accessed and queried? I'm thinking 1-1 1/2 pages per each example.) How to document a PROV example
  • DONE add template on wiki to document example.
  • DONE Where to put a stub template? For example, the rdf stub defines some common prefixes. http://dvcs.w3.org/hg/prov/file/tip/examples/rdf/prefixes.ttl (the first four examples are stubs)