F2F1 Access and Query Proposal

From Provenance WG Wiki
Revision as of 15:03, 29 June 2011 by Smiles

Access and Query Proposal for the F2F1

This page concerns provenance access and query in preparation for the first F2F meeting, focusing on a limited range of aims, proposing solutions and presenting open issues regarding those aims. The wider objectives of the Access and Query Task Force will be clarified in discussion after the meeting.

Plan

  1. By 9/Jun: Agree on scope for F2F1 (mailing list)
  2. By 23/Jun: Populate proposals for solutions to the scope questions (wiki)
  3. By 30/Jun: Identify issues with the solutions proposed (wiki, mailing list)
  4. By 30/Jun: Prepare draft for discussion at F2F1 (wiki)

Steps 2 and 3 may be interleaved. Step 4 may simply be the outcome of the process, with some editorial clean-up.

Scope

Consider the following scenario. A user gains access to an online resource by browsing the web and downloading it, receiving it by email, transferring it via FTP, or using some other protocol. The client software (browser, email client, etc.) offers an "Oh yeah?" button, by which the provenance of the resource will be retrieved and displayed. What does the client do when the button is clicked, what information does it need in order to perform the retrieval, and where does that information come from?

To meet time constraints before the F2F meeting, this document will concern two questions only:

  1. Given the identity, I, of a resource state representation and a location, L, from which to retrieve provenance, how do we obtain the provenance of the representation from the location?
  2. How can a browser find I and L (as above) for an HTML document that was downloaded, so that its provenance may be retrieved?

Initial Points for Decision

Below are some initial, limited points on which we may be able to decide at the F2F1 before opening up discussion about the proposals.

P1. There may be data regarding the provenance of a thing accessible from multiple sources.

P2. The information required to obtain access to some provenance of a thing may be supplied in many different ways, and we do not aim to enumerate them all.

P3. The WG effort will concern how the provider of a thing can supply information required to obtain access to some provenance of that thing (which may, as a side effect, include recommendations on how others can do the same).

P4. Regarding provenance data obtained by dereferencing a provenance URI, calling a provenance service, or some other means, which of the following is true?

  • (a) It is apparent from the data itself what single thing it describes the provenance of.
  • (b) Provenance data documents a set of things and how they are related by past occurrences, so to extract the provenance of any one thing requires knowing how it is identified in the data.
  • (c) Something else.

P5. Regarding provenance data obtained by dereferencing a provenance URI, calling a provenance service, or some other means, which of the following is true?

  • (a) To meet the standard, it should be immutable.
  • (b) It can change over time without restriction.
  • (c) To meet the standard, there are particular ways in which it should not change, e.g. any one account should remain as it was.

Retrieving the provenance of a document from its identifier

Proposals to meet the first aim

Proposal: Use HTTP Link to find identifier

(proposal by GK - 2011-06-09)

For a document accessible using HTTP, POWDER describes a mechanism (http://www.w3.org/TR/2009/REC-powder-dr-20090901/#httplink) for associating a POWDER description with the resource by adding an HTTP Link header field to the HTTP response to a GET or HEAD operation (other HTTP operations are not excluded, but are not considered here). Since the POWDER specification was published, the HTTP linking draft has been approved by the IETF as RFC 5988 (http://tools.ietf.org/html/rfc5988).

Link: <provenance-URI>; rel="describedby"

Open considerations:

  1. Use the POWDER describedby link relation type, or register a new one? I tend to favour the latter as, while POWDER and provenance information are both descriptions of a resource, the intent and coverage are different. Thus, rel="describedby" in the above example might become rel="provenance", subject to registration of the new link relation type (cf. http://tools.ietf.org/html/rfc5988#section-6.2.1).
  2. I do not include the type= option in this example as shown in the POWDER example. This is in keeping with the notion that the mechanism for accessing provenance should be independent of its format.
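
A client-side reading of this proposal can be sketched as follows. This is a minimal sketch, not part of the proposal: the function name find_link_targets is hypothetical, the "provenance" relation type is the not-yet-registered value discussed above, and the parsing is deliberately simplified (it does not handle commas or semicolons inside quoted parameter values).

```python
import re


def find_link_targets(link_header, wanted_rels=("provenance", "describedby")):
    """Return the target URIs in a Link header value whose rel matches wanted_rels.

    Simplified parser: assumes parameter values contain no ',' or ';'.
    """
    targets = []
    # Each link-value looks like: <uri>; param="value"; param=value
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)', link_header):
        uri, params = match.group(1), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";,]*)"?', params, re.IGNORECASE)
        if rel is None:
            continue
        # rel may carry several space-separated relation types; compare case-insensitively.
        rel_types = {r.lower() for r in rel.group(1).split()}
        if rel_types & set(wanted_rels):
            targets.append(uri)
    return targets


if __name__ == "__main__":
    header = ('<http://example.org/prov/doc1>; rel="provenance", '
              '<http://example.org/style.css>; rel="stylesheet"')
    print(find_link_targets(header))
```

Returning a list rather than a single URI leaves room for the case, raised in the comments below, where a response carries more than one provenance link.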


Comments

  • as indicated below for HTML Link, I don't think it's necessary to refer to POWDER, but we can take the same approach as POWDER. Indeed, I don't think that the relationship "describedby" is appropriate to link to provenance. Also, we need to have a reference to a provenance service (see below).
    • I agree we can use POWDER approach without referencing POWDER. Not sure why we need a reference to a provenance service. -- GK
  • this approach is good if we can't embed provenance information in the document itself
    • embedded provenance is not excluded - your phrasing implies it would be preferred, but I think the non-embedded case is more fundamental. --GK
      • phrasing was not intended to express a preference. It is an observation. --Luc Moreau 23:23, 17 June 2011 (UTC)
  • could also apply to RSS feeds
    • trivially true to the extent it could apply to any HTTP retrieval. I wouldn't make explicit mention of RSS.
      • apologies, I meant ATOM feed (as done in POWDER) --Luc Moreau 23:23, 17 June 2011 (UTC)
  • so this is an alternative to the HTML link solution
    • this and HTML link are indeed alternatives, not all-encompassing and not mutually exclusive (though I would tend to discourage using both together) -- GK

--Luc Moreau 07:48, 17 June 2011 (UTC)

  • What is provenance-uri? is it unique? is provenance found at one and only one place? is there one and only one authority to provide provenance information about something? -- Luc
    • Provenance can be found in many places - but the provenance linked from the resource itself can often be seen as more authoritative. I think I would combine this approach with the Provenance Service - you should be able to find and provide provenance without knowing of or having to set up any provenance service. -- Stian
  • Can more than one "provenance" Link be provided? Imagine the resource knows of some of the (possibly conflicting) accounts of the provenance, distributed across different web resources. -- or would the link go to a meta-provenance that describes the other links? --Stian

Proposal: Use HTTP to retrieve provenance

(proposal by GK - 2011-06-09)

A general presumption is that provenance information is accessed in the same way as any web resource. Typically, this will be via HTTP. Thus, any given provenance information will be associated with a URI, and may be accessed by dereferencing that URI using normal Web mechanisms.

The problem of accessing required provenance information then reduces to the problem of finding its URI.
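
Under this presumption, retrieval is an ordinary HTTP GET on the provenance URI. The following is a minimal sketch using only the Python standard library; the function names and the default media type are assumptions made for illustration, and the proposal itself does not mandate any particular format.

```python
import urllib.request


def build_provenance_request(provenance_uri, accept="application/rdf+xml"):
    """Prepare a plain GET for a provenance URI (the Accept default is an assumption)."""
    request = urllib.request.Request(provenance_uri, method="GET")
    request.add_header("Accept", accept)
    return request


def fetch_provenance(provenance_uri, accept="application/rdf+xml"):
    """Dereference the provenance URI; returns (media type, raw bytes)."""
    request = build_provenance_request(provenance_uri, accept)
    with urllib.request.urlopen(request) as response:
        return response.headers.get_content_type(), response.read()
```

Nothing here is specific to provenance, which is the point of the proposal: once the provenance URI is known, no special protocol is needed.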

Comments

This assumes that the URI is dereferenceable, which may not always be the case. Also, one needs to distinguish the URL where a resource is located from the identity (probably expressed as a URI) of the state representation.

  • True, this assumes the provenance URI is dereferenceable. I think this should be considered the default case. I'd want to see clear use-cases before considering alternatives. -- GK
    • Your notion of provenance-uri is not clearly defined. Is it a URI indicating where provenance can be found, or is it a URI assigned to the entity for the purpose of tracking its provenance? If the latter, your own email http://lists.w3.org/Archives/Public/public-prov-wg/2011May/0131.html makes it clear that it could be a non-dereferenceable URI (e.g. a UUID URN) --Luc Moreau 23:57, 17 June 2011 (UTC)
  • I strongly disagree about any need to distinguish between URL and URI. There be dragons. Just say nothing about this. -- GK

--Luc Moreau 07:45, 17 June 2011 (UTC)

I think that there is a fundamental difference between our approaches, which I summarise as follows.

  • Your approach seems to associate a provenance-uri with every representation returned (it can be passed in the message header or in the representation itself). I cannot ascertain whether there is one provenance-uri returned, or whether there can be multiple.
  • My approach is to:
    • uniquely name a 'thing' (I) (which may be different from the resource URL in the case of a stateful resource)
    • provide a location (L) where provenance can be retrieved
    • so, I think L combined with I would give a provenance-uri
    • this approach is more flexible, since it allows us to ask other services, do you know about thing I?
  • Could you not just say that the provenance URI identifies 'a' provenance resource (dereferenceable or not) which *should* provide a minimal provenance of the Thing - but might also give provenance of many other things and links to deeper provenance about the Thing? --Stian
  • To locate statements about the Thing in the provenance, while avoiding separate provenance identifiers for each Thing, you need to carry over the identifier - are you suggesting that, as the Thing is not necessarily the current Resource State, you also need a "provenance-about" identifier in the link? --Stian

--Luc Moreau 23:38, 17 June 2011 (UTC)

Proposal: HTTP Protocol to retrieve provenance from a provenance service

There are two constraints to consider:

  1. The incubator group identified the importance of being able to retrieve provenance from third party "provenance services".
  2. Furthermore, a document or resource state representation may be identified by a URI that is dereferenceable (e.g. http URL) or not (UUID URI).

So, one could adopt a protocol similar to the SPARQL 1.1 Graph Store HTTP Protocol, and adapted for the purpose of provenance (http://www.w3.org/TR/2011/WD-sparql11-http-rdf-update-20110512/#http-get)

The following operation allows us to retrieve the provenance of something, identified by URI, from provenanceservice.com. The URI is not required to be dereferenceable.

GET /provenance/service?entity=.. uri ..  HTTP/1.1
Host: provenanceservice.com
Accept: application/rdf+xml

This allows us to use HTTP content negotiation and return serializations of the provenance in multiple formats (here RDF/XML).

When the URI is dereferenceable, there is also the possibility that the provenance service is on the same host as the one that delivered the document/resource state representation. In that case, the entity parameter can be left without a value, defaulting to the URI of the request target:

GET uri_path?entity  HTTP/1.1
Host: uri_host
Accept: application/rdf+xml

Note that the original proposal used ?provenance, but that made it unclear that this is the same mechanism, with the entity parameter simply taking a default value.
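
The two request forms above can be sketched as URL-building helpers. This is an illustrative sketch only: the function names are hypothetical, the service path /provenance/service is the one from the example, and appending "entity" to an existing query string is an assumption that the proposal does not address.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def service_request_target(service_base, entity_uri):
    """URL asking a third-party service for the provenance of entity_uri.

    entity_uri need not be dereferenceable (e.g. a UUID URN).
    """
    return f"{service_base}?{urlencode({'entity': entity_uri})}"


def same_host_target(document_url):
    """URL for the default case: the entity parameter carries no value.

    Assumption: if the document URL already has a query, 'entity' is appended to it.
    """
    parts = urlsplit(document_url)
    query = f"{parts.query}&entity" if parts.query else "entity"
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(service_request_target(
        "http://provenanceservice.com/provenance/service",
        "urn:uuid:6e8bc430-9c3a-11d9-9669-0800200c9a66"))
    print(same_host_target("http://example.com/docs/report.html"))
```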

Comments

-- I don't think we should go here. It adds complication for which there is no clear need. And I think your proposed use of HTTP host may be wrong in any case (but I haven't checked it carefully). There may be a variety of ways to pull information from a third party service. I don't think we should begin to define a mechanism until we have a clear requirement. If there is a mechanism to specify, the charter suggests a SPARQL query, IIRC -- GK, 2011-06-17

--Luc Moreau 23:45, 17 June 2011 (UTC)

  • It's because it leads to the introduction of "requirements" like this that I was concerned about the XG pronouncements on access and query. The user requirement as stated is fair enough, but the assumption of a new specification to achieve it is not. I think this is introducing a range of complexities for which there is no clear need of a standard. I strongly hold the view that provenance should be treated as a resource like any other web resource, and the introduction of special mechanisms for provenance should be kept to an absolute minimum. You make reference to "protocol similar to the SPARQL" -- I say that "similar to" is not good enough: it should *be* SPARQL. I see no compelling case that SPARQL cannot do any reasonably required job. -- GK.

--Luc Moreau 13:37, 13 June 2011 (UTC)

-- Question/Comment by Daniel G (14:34, 16 June 2011): Is this supposed to support queries for the provenance of the provenance of the resource (like uri_path?provenance?provenance)?

The provenance of a resource is itself a representation of a resource, which has an identity and can have a URI. So the same approach applies, using this URI.

--Luc Moreau 14:25, 16 June 2011 (UTC)

--Comment by Daniel G (18:17, 16 June 2011). Of course, but my concern was whether this design will involve some automatic recursive redirection from the server instead of doing it manually. For example, if I type uri_path?provenance?provenance?provenance in the browser, will it take me directly to the provenance information of the provenance information of the provenance info of the uri_path? Or do I have to access those URIs myself step by step?

Your browser should do this for you.


This said, it does not make much sense:

  • the result of a GET for uri_path?provenance may vary over time, since more provenance information (from multiple observers) can become available
  • hence, each result of a GET for uri_path?provenance needs to have its own identity (which could be embedded in the result, or returned in the http header, as described above)
  • it is this second identity that needs to be used to obtain the provenance of the provenance
  • uri_path?provenance?provenance would fail to identify which provenance information we want to obtain provenance for.

--Luc Moreau 07:41, 17 June 2011 (UTC)

You are right. I had not considered multiple provenance providers for the same resource. --DanielG 2011-06-23


-- I don't particularly like this approach - I would very much like to see a way to have a standalone Provenance Service similar to how SPARQL endpoints and search engines exist today - but I don't like the ?provenance or fake-host approach. A RESTful approach would be to have a single resource which describes how to search for the provenance from the service, a "fill in the blanks" approach. In (X)HTML this would be a <form> with the field "URI" - but multiple other parameters might be desired, such as author, time, source. (I'll put this in as a separate proposal.) Is it within the scope of this WG to define such a service, or can we just define a minimal provenance service which could be extended depending on capabilities? --Stian, 2011-06-20

--Luc Moreau 22:54, 20 June 2011 (UTC)

Proposal: RESTful provenance service, by Stian

A RESTful approach where a resource will present a kind of form for the client to fill in to perform a search for provenance information. Different representational formats might provide different fields and mechanisms for how to perform the search. It is out of the scope of this WG to define all such formats and possible extensions, but we probably want to suggest a minimal standard representation.

The main idea is to follow "hypermedia as the engine of application state" - and so the representation should say how to perform the search, instead of the WG trying to formalize URI patterns or HTTP mechanisms.

XHTML representation

The following example shows how an XHTML representation with an XML microformat can provide a form which can be programmatically recognized as a way to access a Provenance Service.

  GET http://example.com/some/provenance/endpoint HTTP/1.1
  Accept: application/xhtml+xml

  200 OK
  Content-Type: application/xhtml+xml
  Content-Length: 1337

  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
  <html version="-//W3C//DTD XHTML 1.1//EN"
      xmlns="http://www.w3.org/1999/xhtml" 
      xmlns:prov="http://www.w3.org/2011/prov/serviceExample"
      xml:lang="en">    
    <body>
      <form action="/some/search" method="POST" prov:searchType="form">
         <input name="uri" prov:thingURI="uri" />
         <input type="submit" />
      </form>
    </body>
  </html>
    

The XHTML might contain other formatting and other input fields which the provenance service client might or might not choose to interpret.

The POST request uses regular application/x-www-form-urlencoded encoding:

     
  POST http://example.com/some/search HTTP/1.1
  Content-Type: application/x-www-form-urlencoded
  Content-Length: 67
  Accept: application/xhtml+xml
  
  uri=http%3A%2F%2Fother%2Eexample%2Eorg%2Fsome%2Dresource
    

To enable linking to the results, the service responds with a 303 See Other redirect - but it could also have responded directly with a 200 OK.

     
  303 See Other
  Location: http://example.com/some/search/results/7227
 


Retrieving the results gives the list of known provenance resources matching the search:

  GET http://example.com/some/search/results/7227 HTTP/1.1
  Accept: application/xhtml+xml
    
   
  200 OK
  Content-Type: application/xhtml+xml
  Content-Length: 1337

  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
    "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
  <html version="-//W3C//DTD XHTML 1.1//EN"
      xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"
      xmlns:prov="http://www.w3.org/2011/prov/serviceExample">    
    <body>
      <ul prov:results="3" prov:thingURI="http://other.example.org/some/resource" >
        <li><a href="/provenance/34" prov:source="internal" /></li>
        <li><a href="/uploaded/provenance/12336" prov:source="external" /></li>
        <li><a href="http://example.net/stuff-provenance/47372" prov:source="external" /></li>
      </ul>
    </body>
  </html>

  

In this example the form has been annotated with microdata using the prov namespace which the WG can define for the minimal properties required. In this case:

  • prov:searchType="form" means this <form> describes how to access a Provenance Service.
  • prov:thingURI="uri" means the given <input> field can be filled in to search for the provenance of the Thing identified with the given URI. (The field-name can be chosen by the service)
  • Other (extensible) fields might be present - but not required if prov:thingURI is filled in.
  • The search-form is just a regular application/x-www-form-urlencoded form (this can be overridden by <form enctype>) with the given field 'uri' set to the escaped version of the URI. If the form is using GET instead of POST, or returns with a redirect to a Search Result resource (as above), then you can also search for the provenance of a search result.
  • prov:results="3" means the service knows of 3 provenance resources matching the search parameters. These might or might not be dereferenceable or still exist.
  • Search parameters used can optionally be repeated, here prov:thingURI is shown.
  • prov:source="internal" means that the <a href> is a link to a provenance resource, and the provenance is considered "internal" to this service
  • prov:source="external" means that the <a href> is a link to a provenance resource, and the provenance is considered "external" to this service - so the service takes no responsibility for the correctness of said provenance.
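
How a client might recognize the annotated form can be sketched as follows. This is a minimal sketch, not a definitive client: it assumes the example prov namespace shown above, the function names are hypothetical, and it only looks at the two annotations prov:searchType and prov:thingURI.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

XHTML = "http://www.w3.org/1999/xhtml"
PROV = "http://www.w3.org/2011/prov/serviceExample"  # example namespace from this page


def find_search_form(xhtml_text):
    """Locate the form annotated as a provenance search and its thing-URI field."""
    root = ET.fromstring(xhtml_text)
    for form in root.iter(f"{{{XHTML}}}form"):
        if form.get(f"{{{PROV}}}searchType") != "form":
            continue
        for field in form.iter(f"{{{XHTML}}}input"):
            if field.get(f"{{{PROV}}}thingURI") is not None:
                return {
                    "action": form.get("action"),
                    "method": form.get("method", "GET").upper(),
                    "field": field.get("name"),  # the field name is chosen by the service
                }
    return None


def build_search_body(form, thing_uri):
    """Encode the thing URI as application/x-www-form-urlencoded."""
    return urlencode({form["field"]: thing_uri})
```

The client never hard-codes the field name or the action URL; both are read from the representation, which is the "fill in the blanks" idea behind this proposal.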

JSON representation

The following example shows how a JSON representation can provide a URI template which can be programmatically recognized as a way to access a Provenance Service.

  GET http://example.com/some/provenance/endpoint HTTP/1.1
  Accept: application/json

  200 OK
  Content-Type: application/json
  Content-Length: 1337

  {
     "org.w3.www.2011.prov.service": {
         "search-templates": ["http://example.com/some/search?uri={thing-uri}"]
     }
  } 
   

The JSON might contain other keys beyond "org.w3.www.2011.prov.service" which the client might or might not understand.

     
  GET http://example.com/some/search?uri=http%3A%2F%2Fother%2Eexample%2Eorg%2Fsome%2Dresource HTTP/1.1
  Accept: application/json
  
  200 OK
  Content-Type: application/json
  Content-Length: 1337

  {
     "org.w3.www.2011.prov.service": {
         "search": {"thing-uri": "http://other.example.org/some/resource"},
         "result-count": 3,
         "results": [
           {"provenance-uri": "/provenance/34", "source": "internal"},
           {"provenance-uri": "/uploaded/provenance/12336", "source": "external"},
           {"provenance-uri": "http://example.net/stuff-provenance/47372", "source": "external"}
         ]
     }
  } 

The returned JSON might contain other keys than "org.w3.www.2011.prov.service" which the client might or might not understand.

In this example the initial 'form' JSON contains a URI template where the client can fill in the URI for the thing it wants the provenance of, and then perform a GET. Other URI templates might be available under other keys, and might have additional parameters.

  • "org.w3.www.2011.prov.service" means that keys under here are following the WG service specifications. JSON does not have namespaces or schemas, so this is just an attempt to allow mixing of standardised data and third-party data.
  • "search-templates" contains a list of URI templates for performing provenance search. The client can choose any template whose {parameters} it fully understands. There should be exactly one search template which requires only the parameter {thing-uri}.
  • In the result, "search" repeats the parameters of the search.
  • "result-count" shows the (possibly approximate) total number of results found.
  • "results" lists the found provenance resources. Each is described with a relative or absolute "provenance-uri" reference (similar to <a href>) and optionally the "source" to say if this is an "internal" or "external" provenance.
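
The JSON interaction can be sketched in the same spirit. This is a minimal sketch under stated assumptions: the key names are those of the example above, the function names are hypothetical, and template expansion is reduced to simple substitution of {thing-uri} with a percent-encoded value rather than a full URI template implementation.

```python
import re
from urllib.parse import quote, urljoin

SERVICE_KEY = "org.w3.www.2011.prov.service"


def pick_template(service_doc):
    """Return the template that requires only {thing-uri}, or None."""
    for template in service_doc[SERVICE_KEY]["search-templates"]:
        if re.findall(r"\{([^}]*)\}", template) == ["thing-uri"]:
            return template
    return None


def expand(template, thing_uri):
    """Substitute the percent-encoded thing URI into the template."""
    return template.replace("{thing-uri}", quote(thing_uri, safe=""))


def provenance_uris(result_doc, base_url):
    """Resolve each (possibly relative) provenance-uri against the search URL."""
    results = result_doc[SERVICE_KEY]["results"]
    return [urljoin(base_url, entry["provenance-uri"]) for entry in results]
```

Resolving relative references against the URL that returned the results mirrors how <a href> is resolved in the XHTML variant.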

Comments

  • I quite like the way your provenance service allows for multiple provenances (or more precisely provenance-uris) to be returned. The provenance service I described above could only return one provenance representation.
  • Looking at a REST view of a provenance service, isn't a POST supposed to create a new resource? It seems that this interaction is not intended to create anything: instead can we use a GET?
    • Indeed it does create a new resource, http://example.com/some/search/results/7227 - but the XHTML form could also have method="GET". I mainly wanted to show how the format defines the interaction, and HTML is fairly well understood, although we should recommend a subset to avoid clients being complete browsers! --Stian
      • OK, is the resource created by the POST, or was it created when a new provenance-uri was registered in the provenance service for this thing-uri? all things being equal, would a new resource be created for another POST (with same parameters?) --Luc
        • Good question, but REST does not guarantee you that the resource did not exist before the POST. (Use POST for creation != POST means creation). This is just the regular Post/Redirect/Get pattern - I can change it to do a GET submission instead, but I wanted to avoid that, because it will look too much like you can only do URI patterns. --Stian

--Luc Moreau 09:07, 21 June 2011 (UTC)

We seem to be using uris for many different purposes, and I am not sure we are using terminology consistently. Maybe, some definitions:

  • thing-uri: the uri identifying the thing we are interested in
    • Agreed - fixed examples above. --Stian
  • provenance-uri: the uri from which the representation of some (thing's) provenance can be retrieved
  • provenance-service-uri: the uri of a service from which provenance-uris can be retrieved for a given thing-uri

--Luc Moreau 09:13, 21 June 2011 (UTC)

  • Note: I only added prov:source="internal" to tag the <a> link (instead of prov:hit="hit") - I don't think the internal/external distinction is particularly important. -- Stian
  • I like this approach, it seems to cover all the problems when accessing the provenance of a resource. But what if we have several RESTful services, each providing different provenance? (I don't believe that we assume to have every provenance source in one centralized service, am I right?) Would we have to include these services in the HTML document instead of the URIs directly? (for example using another of the approaches proposed in this page) -- Daniel G 06/23/2011, 13:58.

Embedding information necessary to retrieve provenance in an HTML document

Proposals to meet the second aim

The identity of the HTML document and the location from which to download its provenance can be embedded in the HTML itself.

Proposal: Use HTML Link

(proposal by GK - 2011-06-09)

For a document presented as HTML or XHTML, without regard for how it has been obtained, POWDER describes a mechanism (http://www.w3.org/TR/2009/REC-powder-dr-20090901/#assoc-markup) for associating a POWDER description with the document by adding a <link> element to the HTML <head> section.

I propose an adaptation of this mechanism, probably using a different link rel= value. The POWDER specification makes no explicit reference to the RFC 5988 registry of link relations, but I expect they could apply also to HTML header <link> elements.

Example (adapted from POWDER):

<html xmlns="http://www.w3.org/1999/xhtml">
   <head profile="http://www.w3.org/2007/11/powder-profile">
      <meta name="wdr.issuedby" content="http://authority.example.org/company.rdf#me"/>
      <link rel="describedby" href="provenanceURI"/>
      <title>Welcome to example.com </title>
   </head>
   <body>
      ...
   </body>
</html>

The POWDER specification also adds:

  • Documents MAY also include any of the attribution data from the POWDER document in meta tags. In particular, the issuedby field is likely to be useful to user agents deciding whether or not to fetch the full POWDER document. Any attribution data encoded in meta tags within an HTML document should be the same as that in the POWDER document. In case of discrepancy, the POWDER document should be taken as more authoritative.

Proposal: Use HTML Link (without POWDER)

Same as above example, but using a provenance profile since POWDER does not include any provenance-related metadata, per se.

<html xmlns="http://www.w3.org/1999/xhtml">
   <head profile="http://www.w3.org/.../provenance">
      <link rel="identity" href="thing-uri: a URI for this resource state representation"/>
      <link rel="service" href="provenance-service-uri: a URI for a provenance service"/>
      <title>Welcome to example.com </title>
   </head>
   <body>
      ...
   </body>
</html>

--Luc Moreau 13:59, 13 June 2011 (UTC)
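
How a browser might recover I and L from such a document can be sketched with the standard-library HTML parser. This is a minimal sketch, assuming the rel values "identity" and "service" from the example above; the class and function names are hypothetical, and the parser does not check that the link elements actually sit inside <head>.

```python
from html.parser import HTMLParser


class ProvenanceLinkParser(HTMLParser):
    """Collect href values of <link> elements, grouped by relation type."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel, href = attributes.get("rel"), attributes.get("href")
        if rel and href:
            # rel may list several space-separated relation types.
            for rel_type in rel.lower().split():
                self.links.setdefault(rel_type, []).append(href)


def find_identity_and_service(html_text):
    """Return (I, L): the first 'identity' and 'service' link targets, or None."""
    parser = ProvenanceLinkParser()
    parser.feed(html_text)
    identity = parser.links.get("identity", [None])[0]
    service = parser.links.get("service", [None])[0]
    return identity, service
```

The same parser would also pick up a single rel="provenance" or rel="describedby" link, so it does not prejudge which of the HTML Link proposals is adopted.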

Comment

Why? I don't see the added value in the profile -- GK, 2011-06-17

Proposal: Use POWDER to provide a default provenance service

POWDER allows for some metadata to be associated with all resources within a host (or sub-path within). The following example illustrates a powder.xml file that associates provenanceservice.com as the provenance service for all resources in example.com.

<powder xmlns:pil="http://www.w3.org/ ... provenance"
        xmlns="http://www.w3.org/2007/05/powder#">


  <dr>
    <iriset>
      <includehosts>example.com</includehosts>
    </iriset>

    <descriptorset>
      <pil:service>provenanceservice.com</pil:service>
    </descriptorset>
  </dr>

</powder>

We then use http://www.w3.org/TR/powder-dr/#assoc-linking to associate the current resource representation with a powder.xml document as above.

--Luc Moreau 13:52, 13 June 2011 (UTC)
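
A client-side lookup against such a file could be sketched as follows. This is only a sketch under several assumptions: the pil namespace is elided in the example above, so the value used here is a hypothetical placeholder; the function name is hypothetical; and host matching is simplified to "same host or a subdomain of a listed host", which is how includehosts is understood here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

POWDER = "http://www.w3.org/2007/05/powder#"
PIL = "http://example.org/ns/pil#"  # hypothetical placeholder; the real namespace is elided on this page


def default_service(powder_text, resource_url):
    """Return the provenance service declared for the resource's host, or None."""
    host = (urlsplit(resource_url).hostname or "").lower()
    root = ET.fromstring(powder_text)
    for dr in root.iter(f"{{{POWDER}}}dr"):
        # includehosts holds a white-space separated list of host names.
        listed = " ".join(
            element.text or "" for element in dr.iter(f"{{{POWDER}}}includehosts")
        ).lower().split()
        if any(host == h or host.endswith("." + h) for h in listed):
            service = dr.find(f"{{{POWDER}}}descriptorset/{{{PIL}}}service")
            if service is not None and service.text:
                return service.text.strip()
    return None
```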

Comments

I don't see the point in this. This looks to me like added complexity without any clear use-case. I would suggest focusing first on the simple cases and then figuring out where the important gaps are. If you want to have the group complete on an aggressive schedule, introducing complex mechanisms without a clear need isn't going to help. GK, 2011-06-17

Proposal: Use of the <meta> HTML tag or microdata format

(By Khalid Belhajjame)

As a proposal for the second question, the identity "I" and the location of the provenance "L" can be embedded in the HTML file. This can be achieved using the "meta" element, e.g.:

<html xmlns="http://www.w3.org/1999/xhtml">
 <head>
  <meta name="IVPT Identifier" content="I"/>
  <meta name="Provenance location" content="L"/>
  ...
 </head>
 ...
</html>

Or by using microdata. For example, the following HTML document uses the properties "identifiedBy" and "provenance" to give the identity of the HTML document and the location where its provenance can be found.


<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
  ...
  </head>
  <body>
    <div itemprop="identifiedBy">I</div>
    <div itemprop="provenance">L</div>
    ...
  </body>
 </html>


Comments

Who can explain when we should use the meta tag and when we should use the link tag? We seem to have encoded the same information with two different tags. Which one is the most appropriate?

--Luc Moreau 07:53, 17 June 2011 (UTC)

I agree with the thrust of Luc's comment. I see no need for link and meta as alternatives - pick one. Personally, I think the link tag approach is cleaner and more appropriate, and more widely applicable, and can be introduced more easily using existing mechanisms (via the link tag registry). -- GK, 2011-06-17

At this time, I think Microdata is controversial. Cf. http://lists.w3.org/Archives/Member/tag/2011Jun/0021.html under consideration by W3C TAG (member-only-link at this time). I don't discount its consideration at a later date, but I wouldn't expend effort on it right now. -- GK, 2011-06-17

Issues regarding the above

Open discussions, informative notes, criticisms, etc. regarding the proposals above. These would ideally be articulated as short specific questions which can be discussed individually at the F2F meeting.

Issue: Access mechanism should be independent of provenance format

(by GK - 2011-06-09)

The mechanisms for accessing provenance are independent of the format of the provenance information itself. While there is an expectation that provenance information will be returned as RDF or RDFa, other formats are possible and the access mechanisms should as far as possible avoid any dependence on the format of information received.


That's why we can use content negotiation when retrieving provenance: for instance, using the HTTP Accept header field to indicate whether we want JSON, RDF, or another representation.

--Luc Moreau 07:55, 17 June 2011 (UTC)
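
As a small illustration of the content negotiation point, a client can state its format preferences in a single Accept header value. This is a minimal sketch: the function name and the particular media types and weights are assumptions chosen for the example, not something agreed on this page.

```python
def accept_header(preferences):
    """Build an Accept header value from (media type, quality) pairs.

    A quality of 1.0 is the default and is therefore omitted.
    """
    parts = []
    for media_type, quality in preferences:
        if quality >= 1.0:
            parts.append(media_type)
        else:
            parts.append(f"{media_type};q={quality:.1f}")
    return ", ".join(parts)


if __name__ == "__main__":
    # Prefer RDF/XML, accept JSON as a fallback.
    print(accept_header([("application/rdf+xml", 1.0), ("application/json", 0.8)]))
```

The access mechanism stays the same whatever the server returns; only the header value changes.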

Issues beyond scope

Issues about provenance access and query that go beyond the two specific aims above. These will either be discussed at the F2F meeting if there is time, or else postponed until after.

  • Comment by Daniel G (16-06-2011): Guidelines for provenance publishers, based on the consensus reached in the F2F meeting. (How can I publish my content in order to make it accessible to anyone else?)


  • How do we address situations where: the location L does not have the requested provenance information for I? The provenance has "moved" to a different location L2?
  • How do we specify multiple provenance locations, L1 and L2, in the downloaded document?

(By Yogesh S. 2011-06-16)

  • Some scientists that I have talked to are worried about the concept of having provenance separate from their data. They believe it is a recipe for disaster. They want to make sure the data and any provenance info are within the same file, so that it is always traveling together as it moves from hand to hand. (by Yolanda G., 2011-06-23)
    • I agree this is a concern, but one that I think is part of a larger problem. There are any number of composite package formats that can achieve this. In the Wf4Ever project, we are exploring the notion of research object to compose data, provenance and much much more in the context of workflow preservation. -- GK.