F2F1 Access and Query Proposal

From Provenance WG Wiki
Revision as of 12:27, 13 June 2011 by Smiles

Access and Query Proposal for the F2F1

This page concerns provenance access and query in preparation for the first F2F meeting, focusing on a limited range of aims, proposing solutions and presenting open issues regarding those aims. The wider objectives of the access and query task force will be clarified in discussion after the meeting.


  1. By 9/Jun: Agree on scope for F2F1 (mailing list)
  2. By 23/Jun: Populate proposals for solutions to the scope questions (wiki)
  3. By 30/Jun: Identify issues with the solutions proposed (wiki, mailing list)
  4. By 30/Jun: Prepare draft for discussion at F2F1 (wiki)

Steps 2 and 3 may be interleaved. Step 4 may simply be the outcome of that process, with some editorial clean-up.


Consider the following scenario. A user gains access to an online resource by browsing the web and downloading it, by receiving it by email, by transferring it via FTP, or through some other protocol. The client software (browser, email client etc.) offers an "Oh yeah?" button, which retrieves and displays the provenance of the resource. What does the client do when the button is clicked, what information does it need in order to perform the retrieval, and where does that information come from?

To meet time constraints before the F2F meeting, this document will concern two questions only:

  1. Given the identity, I, of a resource state representation and a location, L, from which to retrieve provenance, how do we obtain the provenance of the representation from the location?
  2. How can a browser find I and L (as above) for an HTML document that was downloaded, so that its provenance may be retrieved?

Retrieving the provenance of a document from its identifier

Proposals to meet the first aim

Proposal: Use HTTP Link to find identifier

(proposal by GK - 2011-06-09)

For a document accessible using HTTP, POWDER describes a mechanism (http://www.w3.org/TR/2009/REC-powder-dr-20090901/#httplink) for associating a POWDER description with the resource by adding an HTTP Link header field to the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here). Since the POWDER specification was published, the HTTP linking draft has been approved by the IETF as RFC 5988 (http://tools.ietf.org/html/rfc5988).

Link: <provenance-URI>; rel="describedby"

Open considerations:

  1. Use the POWDER describedby link relation type, or register a new one? I tend to favour the latter because, while POWDER and provenance information are both descriptions of a resource, their intent and coverage differ. Thus, rel="describedby" in the above example might become rel="provenance", subject to registration of the new link relation type (cf. http://tools.ietf.org/html/rfc5988#section-6.2.1).
  2. I do not include the type= parameter that appears in the POWDER example. This is in keeping with the notion that the mechanism for accessing provenance should be independent of its format.
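
As an illustration of how a client might act on this proposal, the following minimal Python sketch extracts the target URIs from the value of a Link header field for a given relation type. It assumes the RFC 5988 syntax (target URI in angle brackets, followed by name=value parameters) and is not a full parser: parameters without a value are not handled. The function name find_links, the example.org URIs and the "provenance" relation type are assumptions made for illustration; "provenance" is not a registered relation type at the time of writing.

```python
import re

# One link-value: <URI-Reference> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[^=;,\s]+\s*=\s*(?:"[^"]*"|[^;,\s"]+))*)'
)
# One parameter: the value is either a quoted string or a bare token.
_PARAM = re.compile(r';\s*([^=;,\s]+)\s*=\s*(?:"([^"]*)"|([^;,\s"]+))')


def find_links(header_value, rel):
    """Return the target URIs whose rel parameter includes `rel`."""
    wanted = rel.lower()
    targets = []
    for match in _LINK_VALUE.finditer(header_value):
        uri, params = match.group(1), match.group(2)
        for name, quoted, token in _PARAM.findall(params):
            if name.lower() != "rel":
                continue
            # rel may list several space-separated relation types,
            # which are compared case-insensitively.
            if wanted in (quoted or token).lower().split():
                targets.append(uri)
            break  # only the first rel parameter of a link-value counts
    return targets


# Hypothetical header value, as it might appear in a response to GET or HEAD.
header = (
    '<http://example.org/prov/doc1>; rel="provenance", '
    '<http://example.org/powder.xml>; rel="describedby"; '
    'type="application/powder+xml"'
)
provenance_uris = find_links(header, "provenance")
```

Because the relation type is passed in as an argument, the same function works whichever of the two options in consideration 1 is eventually chosen.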

Proposal: Use HTTP to retrieve provenance

(proposal by GK - 2011-06-09)

A general presumption is that provenance information is accessed in the same way as any web resource. Typically, this will be via HTTP. Thus, any given provenance information will be associated with a URI, and may be accessed by dereferencing that URI using normal Web mechanisms.

The problem of accessing required provenance information then reduces to the problem of finding its URI.
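
A minimal sketch of this dereferencing step, using only the Python standard library, might look as follows. The function names, the default Accept value and the example.org URI are assumptions made for illustration, not part of the proposal. The response body is returned unparsed, together with its media type, so that retrieval stays separate from interpretation.

```python
import urllib.request


def build_provenance_request(provenance_uri, accept="*/*"):
    """Build an ordinary HTTP GET request for the provenance URI."""
    return urllib.request.Request(
        provenance_uri, headers={"Accept": accept}, method="GET"
    )


def fetch_provenance(provenance_uri, timeout=10):
    """Dereference the URI and return (media type, raw bytes), unparsed."""
    request = build_provenance_request(provenance_uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get_content_type(), response.read()


# Building the request needs no network access; fetch_provenance() does.
request = build_provenance_request("http://example.org/prov/doc1")
```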

Embedding information necessary to retrieve provenance in an HTML document

Proposals to meet the second aim

The identity of the HTML document and the location from which to download its provenance can be embedded in the HTML itself.

Proposal: Use HTML Link

(proposal by GK - 2011-06-09)

For a document presented as HTML or XHTML, without regard for how it has been obtained, POWDER describes a mechanism (http://www.w3.org/TR/2009/REC-powder-dr-20090901/#assoc-markup) for associating a POWDER description with the document by adding a <link> element to the HTML <head> section.

I propose an adaptation of this mechanism, probably using a different link rel= value. The POWDER specification makes no explicit reference to the RFC 5988 registry of link relations, but I expect the registered relation types could also apply to HTML <link> elements.

Example (adapted from POWDER):

 <html xmlns="http://www.w3.org/1999/xhtml">
    <head profile="http://www.w3.org/2007/11/powder-profile">
       <meta name="wdr.issuedby" content="http://authority.example.org/company.rdf#me"/>
       <link rel="describedby" href="provenanceURI"/>
       <title>Welcome to example.com </title>

The POWDER specification also adds:

  • Documents MAY also include any of the attribution data from the POWDER document in meta tags. In particular, the issuedby field is likely to be useful to user agents deciding whether or not to fetch the full POWDER document. Any attribution data encoded in meta tags within an HTML document should be the same as that in the POWDER document. In case of discrepancy, the POWDER document should be taken as more authoritative.
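
To show how a browser-like client might locate the provenance URI under this proposal, the sketch below scans a downloaded HTML document for <link> elements with a given rel value, using the Python standard library's html.parser module. The class and function names and the example.org URI are assumptions made for illustration. For brevity the sketch does not check that the <link> element appears inside <head>.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collect href values of <link> elements carrying a given rel."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel.lower()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated relation types.
        rels = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if self.rel in rels and href:
            self.targets.append(href)


def find_html_links(html_text, rel):
    """Return the href of every <link> element whose rel includes `rel`."""
    finder = LinkFinder(rel)
    finder.feed(html_text)
    finder.close()
    return finder.targets


# Hypothetical document, modelled on the example adapted from POWDER.
document = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head>'
    '<link rel="describedby" href="http://example.org/prov/doc1"/>'
    "<title>Welcome to example.com</title></head><body></body></html>"
)
provenance_uris = find_html_links(document, "describedby")
```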

Issues regarding the above

Open discussions, informative notes, criticisms, etc. regarding the proposals above. These would ideally be articulated as short specific questions which can be discussed individually at the F2F meeting.

Issue: Access mechanism should be independent of provenance format

(by GK - 2011-06-09)

The mechanisms for accessing provenance are independent of the format of the provenance information itself. While there is an expectation that provenance information will be returned as RDF or RDFa, other formats are possible and the access mechanisms should as far as possible avoid any dependence on the format of information received.
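
One way to picture this separation is a client in which the retrieval step passes on only the Content-Type value and the raw bytes, and a separate registry decides how, or whether, to interpret them. The sketch below is purely illustrative: the registry, the function names and the placeholder handler are hypothetical, and no particular provenance format is implied.

```python
# Hypothetical registry: media type -> function that interprets the bytes.
HANDLERS = {}


def register(media_type, handler):
    """Associate a media type with a function that interprets the body."""
    HANDLERS[media_type.lower()] = handler


def interpret(content_type, body):
    """Choose a handler from the Content-Type value alone."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    media_type = content_type.split(";", 1)[0].strip().lower()
    handler = HANDLERS.get(media_type)
    if handler is None:
        # Unknown formats are passed through untouched, not rejected.
        return ("unrecognised", media_type, body)
    return handler(body)


# Placeholder handler: decodes plain text and does nothing else.
register("text/plain", lambda body: body.decode("utf-8"))
result = interpret("text/plain; charset=utf-8", b"example provenance")
```

Supporting a new provenance format then means registering another handler; the access mechanism itself is unchanged.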

Issues beyond scope

Issues about provenance access and query that go beyond the two specific aims above. These will either be discussed at the F2F meeting if there is time, or postponed until afterwards.