Re: PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology]

Luc,

Considering your example of l-value and r-value, I think it's the implication of 
dereferenceability and updatability that comes with a container that I feel is 
over-constraining.  I don't want to prohibit containers or modifiable entities 
as resources (with provenance), I just don't think that all resources with 
provenance are necessarily containers in this sense.
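
For illustration only, the l-value/r-value reading can be sketched in Python 
as below.  The names (ContainerResource, Snapshot, to_json_text, to_csv_text) 
and the example URI are hypothetical and do not come from this thread.  The 
sketch covers only the updatable-container case, which is the case argued 
above to be too narrow as a general definition of a resource.

```python
from dataclasses import dataclass
import json


@dataclass(frozen=True)
class Snapshot:
    """Immutable state of a resource at one instant (the 'r-value')."""
    rows: tuple


class ContainerResource:
    """An identifiable, updatable container (the 'l-value'). Hypothetical."""

    def __init__(self, uri, rows=()):
        self.uri = uri
        self._rows = list(rows)

    def update(self, row):
        # Changing the container does not affect snapshots already taken.
        self._rows.append(row)

    def snapshot(self):
        # Copy the current state into an immutable value.
        return Snapshot(rows=tuple(self._rows))


def to_json_text(snap):
    """One possible r-text: a JSON serialization of the snapshot."""
    return json.dumps(list(snap.rows))


def to_csv_text(snap):
    """Another r-text of the same snapshot: one row per line."""
    return "\n".join(snap.rows)


r1 = ContainerResource("http://example.org/r1", ["a", "b"])
s1 = r1.snapshot()   # state before the update
r1.update("c")
s2 = r1.snapshot()   # state after the update
```

Here s1 and s2 are two snapshots of the one resource r1, and to_json_text(s1) 
and to_csv_text(s1) are two r-texts of the same snapshot.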

#g
--


Luc Moreau wrote:
> Nothing in the example is restricted to rdf or triple stores.
> It also applies to a table in a relational database (and its xml 
> serialization),
> or an excel spreadsheet (and a csv representation).
> 
> The relational database/table and the spreadsheet can be seen as 
> containers, since
> they can be updated.
> 
> The reason why it is important is that we need to consider stateful 
> resources (well,
> I think so, don't you?).
> 
> An alternative way of looking at it, adopting some old programming language
> terminology, is this:
> 
> a resource is like an l-value
> a snapshot is like an r-value
> an r-text is like a representation of an r-value
> 
> Luc
> 
> On 05/25/2011 09:53 AM, Graham Klyne wrote:
>> I have a problem with resource-as-container.  I think it's too 
>> constraining.  My zebra example wouldn't comply.
>>
>> As for the distinction between f1 and r1 per your example, I think 
>> this is rather broadening the discussion - which I'm not sure is 
>> necessary or helpful.
>>
>> I would say that in this case, r1 is a service resource.  And as such, 
>> I don't think it makes sense to download a service.  E.g. what do you 
>> receive if you do a simple HTTP GET on a SPARQL endpoint URI?  I think 
>> it's typically some kind of intro page that explains how to use the 
>> service (e.g. http://data.clarosnet.org/sparql/).  The URIs that may 
>> be used to download *content* from the triple store are different 
>> (e.g. URI-encoded SPARQL queries, or constructed LDAPI URIs).
>>
>> So, for the purposes of this example, we need to be clearer about what 
>> we mean when saying "analyst (alice) downloads a turtle serialization 
>> (lcp1) of the resource (r1) from government portal" - in this context, 
>> I don't think it makes sense as it stands.
>>
>> I also note that once you introduce a triple store into the mix, while 
>> we can expect it to contain information that has been loaded into it, 
>> when retrieving information, we have no a priori way to claim that the 
>> information subsequently retrieved has to do with the original 
>> resource.  The best we can say is that if the entire *content* of the 
>> resource "r1" is downloaded, then that content should contain as a 
>> subset the RDF that was loaded.  But even this isn't clear-cut - if 
>> the triple store supports named graphs (which most do), then there's 
>> no way to represent its entire content in a single Turtle download.
>>
>> In summary, I think the introduction of containers and triple stores 
>> is mixing mechanism with essential provenance concepts here, and I 
>> think we need to get the latter straight before we can explain what 
>> happens when more complex mechanisms are introduced.  The scenario as 
>> described could play perfectly well without mention of a triple store.
>>
>> #g
>> -- 
>>
>>
>> Luc Moreau wrote:
>>> Hi Paul,
>>> Yesterday, I also began drafting some definition. We need 
>>> representations in here too. I am not sure about
>>> your illustrations.  Here is my take on it:
>>>
>>>
>>>
>>>
>>> From a provenance viewpoint, we seem to discuss several concepts
>>> related to resources.  Some terminology is required to disambiguate
>>> concepts.  It is inspired by terminology developed by the rdf working
>>> group (thanks to Sandro for drafting the original email!)
>>>
>>>
>>> 1. A "resource" is a container, whose contents may vary over time.
>>>    Its content may be structured in many different ways (hierarchical
>>>    XML tree, RDF arcs, etc).
>>>
>>> 2. An "r-snapshot" is a state of a resource, or a snapshot of that
>>>    resource at a specific instant.  An r-snapshot is immutable. From a
>>>    resource that changes over time, one can obtain multiple
>>>    r-snapshots.
>>>
>>> 3. An "r-text" is a particular sequence of characters or bytes which
>>>    conveys a particular r-snapshot in some language.  If you can parse
>>>    an r-text, you know what is in the r-snapshot it conveys.  You can
>>>    tell someone exactly what is in a particular resource at some
>>>    instant by sending them an r-text.  (You send them the r-text which
>>>    conveys the r-snapshot which is the current state of that resource.)
>>>
>>>
>>>
>>> In some cases, some resources do not vary over time, which means that
>>> there is a single r-snapshot for them, and some may even have a 
>>> single r-text
>>> (no content negotiation).  In such a specific case (static resources 
>>> on the web),
>>> the three concepts collapse into a single one.
>>>
>>> The challenge is to deal with dynamic contents.
>>>
>>>
>>>
>>> Illustration inspired by the example.
>>>
>>> - government (gov) converts data (d1) to RDF file (f1) at time (t1) 
>>> using an XSLT transform
>>> - government (gov) uploads RDF data (f1) into a triple store, exposed 
>>> as a Web resource (r1)
>>> - analyst (alice) downloads a turtle serialization (lcp1) of the 
>>> resource (r1) from government portal
>>>
>>> Illustrations:
>>> - r1: is a resource: it's the triple store, it's a container, its 
>>> content can vary over time
>>> - lcp1: is an r-text (turtle serialization) of a given snapshot 
>>> (created by, or available at the time of, download)
>>> - f1 is a local file: it can be seen as a stateless anonymous 
>>> resource, with a single r-text.
>>>
>>> If in addition:
>>> - analyst (alice) downloads an RDF/XML serialization (lcp2) of the 
>>> resource (r1)
>>>
>>> If the content of r1 has not changed, then lcp2 and lcp1 are both 
>>> r-texts of the same r-snapshot.
>>>
>>> Note that this is not limited to RDF (as Graham mentioned)
>>>
>>> - newspaper (news), uses a CMS to publish the incidence map (map1), 
>>> chart (c1) and
>>>   the image (img1) within a document (art1) written by (joe) using
>>>   license (li2)
>>> - newspaper (news), updates art1, adding a correction following a 
>>> complaint from a reader
>>>
>>> Illustrations:
>>> - art1 is also a resource, with two r-snapshots (before and after 
>>> correction)
>>> - with content negotiation, an HTTP client can download HTML and 
>>> XHTML representations (i.e., r-texts) of the article
>>>
>>>
>>>
>>> What do you think?
>>> Cheers,
>>> Luc
>>>
>>>
>>> On 05/25/2011 06:49 AM, Paul Groth wrote:
>>>> Hi,
>>>>
>>>> To throw out some, perhaps simpler, definitions into the mix that I 
>>>> think follow along the lines of what's being discussed.
>>>>
>>>> Resource - something that can be identified
>>>>
>>>> Snapshot - the state of a resource at a particular point in time
>>>>
>>>> In the Data Journalism Scenario, a 'resource' would be the web page and 
>>>> a 'snapshot' would be the web page before publication.
>>>>
>>>> cheers,
>>>> Paul
>>>>
>>>> Note: Similar concepts are found within many provenance models that 
>>>> I know of; if it's helpful, I can list those out
>>>>
>>>
>>
> 

Received on Wednesday, 25 May 2011 12:40:33 UTC