RE: updates to PAQ doc for discussion

Graham, I'd like to have something like you describe, but I don't think you can really make the inferences you want to for the general case, and don't necessarily see why version helps versus view. I'm not opposed to having a subtype of view for version, but I'm not sure how to make it rigorous...

Taking those parts in order:

1) The problems I have with inferring authorship/editorship have to do with the fact that not all edits are equal. Someone who just fixes grammar might not be an author/editor in the final doc. Changes that are added in one version and removed in another may or may not indicate that an intellectual contribution has been made (my text doesn't make the final version, but other text is added in some other part of the doc because of what I contributed... am I an author/editor or not?). To me these types of issues really indicate that a document is not just a more flexible version of a file/file-like version, that edit operations aren't really occurring on the same type of thing that editorial/intellectual contributions are made on, etc. So we really have the IVPof/view case, although we try to pretend that it is really hierarchical and just a matter of more/less constrained versions of the same thing.

2) Regarding the question of why version does something better than view - if X is a view of A and Y is another view of A, why wouldn't I think inferring creatorship/editorship is OK? (I'm claiming above that inferring is probably not valid in some cases - here I'm asking whether version does a better job of cutting down on those cases versus view.) I.e. if you wrote the bits to a section of a disk that is an IVPof/view of a file, why wouldn't it be just as valid or invalid as trying to make that inference between a doc and a version of it? How does a hierarchical meaning help? (I guess I'm assuming that IVPof is one-way like version, and for my disk versus file use case here I would actually assert IVPof in both directions so I could infer the file creator also wrote the bits to the disk and vice versa, whereas with your doc/version case the IVPof relationship would go one way. So, rephrasing the question here - I'd agree that inference should only go in the direction of the relationship, but if there are relationships in both directions, wouldn't inferencing be just as valid for that case?)

3) I would tie the use cases together and rather than looking to infer authorship/editorship from view or version relationships, I would see any differences in who's listed for the doc and the aggregate list from each version as an indication that there's been an error, a lie, or the provenance is just not complete (intellectual contributions haven't been separated from text/file-level edits, one version isn't really 'derivedfrom' another when I look at more granularity in the files or processes, etc.)
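
A minimal sketch of that kind of check, in Python. The function name, the `doc_editors` set and the per-version `version_editors` map are all made up for illustration; nothing here is defined by the model. It only reports differences between the two lists and does not try to decide which one is right:

```python
def editor_discrepancies(doc_editors, version_editors):
    """Compare the editors asserted for the doc with the union of the
    editors asserted for each version (hypothetical data shapes)."""
    aggregate = set().union(*version_editors.values())
    return {
        # listed on some version but not on the doc as a whole
        "only_on_versions": aggregate - doc_editors,
        # listed on the doc but not on any version
        "only_on_doc": doc_editors - aggregate,
    }


# Illustrative data: carol only fixed grammar in v2, bob contributed
# ideas but never committed an edit himself.
doc_editors = {"alice", "bob"}
version_editors = {
    "v1": {"alice"},
    "v2": {"alice", "carol"},
}

report = editor_discrepancies(doc_editors, version_editors)
print(report)
```

Either non-empty set is a prompt to look closer (an error, a lie, or incomplete provenance), not an inference about who the editors really are.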

A version relationship may still be useful, particularly if we agree that it allows inferencing as you want (i.e. you only use version instead of view when you want people to infer authorship/editorship/(what else can I infer?)). View shouldn't work that way; version could, though there would be cases where the English language meaning and this technical definition would be at odds (the examples I've given).

If we do that, I think version would have to only be valid within an account - i.e. the notion of version is an indication that, for the set of processes being reported, the asserter believes one can consider the view relationships hierarchical/transitive/version-like and inferencing is OK. If I take two accounts that use version and merge them, I may find that the set of processes they describe will break versioning - versions might have to be interpreted as views because of the additional info. (Perhaps this example works: if you use version to indicate text changes in a doc and I use version to describe multiple copies (file versions) of one logical file (one of your versions of a doc), I think both might be internally consistent, but together they'd imply that every person who copied a version of your doc was an author/editor, which is not what you intended.) Perhaps version being account-limited is still OK - PIL is an assertion language and so an asserter may be wrong, and it may be possible that they are wrong about a version relationship while still being right about there being a view relationship...
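
A rough Python sketch of that merge problem, under the assumption that version is treated as transitive and that editorship is inferred upward along it. The function, the entity names and the two accounts are invented for illustration and are not part of PIL:

```python
def inferred_editors(target, version_of, edited_by):
    """Collect the agents attached to `target` and to everything that is
    (transitively) asserted to be a version of it.

    version_of: set of (child, parent) pairs, read as "child is a version of parent"
    edited_by:  dict mapping an entity to the set of agents who acted on it
    """
    editors = set(edited_by.get(target, ()))
    seen = {target}
    stack = [target]
    while stack:
        current = stack.pop()
        for child, parent in version_of:
            if parent == current and child not in seen:
                seen.add(child)
                editors |= set(edited_by.get(child, ()))
                stack.append(child)
    return editors


# Account 1: "version" means a text revision of the doc.
account1_versions = {("rev1", "doc"), ("rev2", "doc")}
account1_agents = {"rev1": {"alice"}, "rev2": {"bob"}}

# Account 2: "version" means a file copy of one revision.
account2_versions = {("copy_a", "rev2")}
account2_agents = {"copy_a": {"dave"}}  # dave only copied the file

alone = inferred_editors("doc", account1_versions, account1_agents)
merged = inferred_editors(
    "doc",
    account1_versions | account2_versions,
    {**account1_agents, **account2_agents},
)
print(sorted(alone), sorted(merged))
```

Within the first account the inference gives only the revision editors; after the merge, the person who merely copied a file shows up as an inferred editor of the doc, which is the unintended result described above.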

--  Jim


________________________________________
From: Graham Klyne [Graham.Klyne@zoo.ox.ac.uk]
Sent: Saturday, August 20, 2011 4:49 AM
To: Myers, Jim
Cc: Khalid Belhajjame; Paul Groth; public-prov-wg@w3.org
Subject: Re: updates to PAQ doc for discussion

Jim,

Thanks for the example - it clarifies the kind of use-case you are considering.
  If I may paraphrase, I think you are interested here in reconciling different
accounts to help understand any unexpected differences that may show up - and I
can see that a fairly "loose" notion of "view" may be needed here.

The use cases I'm (implicitly) considering are more closely related to
generation and initial interpretation.  A premise I have here is that provenance
information MUST be easy to generate at the point that provenance is instigated;
e.g. workflow instrumentation for provenance capture must be lightweight; this
implies (for me) that provenance information will be commonly expressed using
local contextual information - in particular, using a local context-URI.  As a
result, we could end up with provenance about some resource expressed with
respect to very localized "views", some of which may be more globally
applicable, and should be interpretable as such.

Specifically, returning to my example of a W3C specification production process,
the editor for each revision is captured in the version management system.  What
relationship between each revision and the overall specification allows me to
infer that the individual revision editors are also editors of the overall
document?  Does the more general notion of "view" allow this?  I don't see how,
but I'm not certain.

Consider also a slightly more complex example, where the specification is
branched in the repository to allow different editors to work on different
aspects, then subsequently merged.  Each of the branches has its own revision
history with editors for each revision, and is also a view of the final
document.  Can I infer the branch editors are all editors of the final document
without knowing there is a transitive notion of views in play?

#g
--


Myers, Jim wrote:
> I agree that versioning/hierarchical constraints are a major use case.  But:
>
> There are use cases beyond that: a location on disk and a file coincide for a while, but both have an independent provenance - neither is a more constrained view of the other.
>
> If I can generalize that - I think many of the interesting questions provenance will be able to help with arise when a) someone has made an assumption that nothing happened to their favorite entity except via process executions they know about, b) there's another witness who sees the world differently (different entities, different processes of interest), and c) I find an IVP-of style link.  If I've been watching a disk and see lots of misreads in one area, my provenance about that becomes very interesting to you when we realize that your file was recorded in that area at one point.  In that sense, I think the non-hierarchical relationships may be more important in trying to interpret multiple accounts (or for someone trying to assert something synthesized from multiple observers/sensors). If we control the processing, we like to make things nice and hierarchical, if we don't control it, we get non-hierarchical relationships and we need to use them for debugging/resolving what appear to be paradoxes.
>
> I could see calling out 'versionOf' as a subtype of 'IVP/ViewOf' as a convenience. (Not sure without thinking further how much that helps me as an interpreter of provenance, but I don't know that it causes any trouble either and may make the 80% case clearer...).
>
>  Jim
>
>> -----Original Message-----
>> From: Graham Klyne [mailto:Graham.Klyne@zoo.ox.ac.uk]
>> Sent: Friday, August 19, 2011 4:08 PM
>> To: Myers, Jim
>> Cc: Khalid Belhajjame; Paul Groth; public-prov-wg@w3.org
>> Subject: Re: updates to PAQ doc for discussion
>>
>> Myers, Jim wrote:
>>>> Jim,
>>>>
>>>> In:
>>>> http://dvcs.w3.org/hg/prov/raw-
>>>> file/default/model/ProvenanceModel.html#concept-IVP-of
>>>> (I note the section anchor still retains the old name :)
>>>>
>>>> I see:
>>>> "we say that A is-complement-of B, and B is-complement-of A, in a
>>>> symmetrical fashion"
>>> But the next section says asymmetric is OK too...
>>>
>>>> By my understanding the original IVPof was not symmetrical.
>>> I would use the open world to claim it was - the existence of some
>> properties where A is more immutable than B doesn't stop the opposite
>> from being true as well...
>>
>> I think that depends on how they are derived.  My working assumption has
>> been that one has a dynamic resource, but to meaningfully express
>> provenance about that resource one has to adopt a constrained view.
>>
>> For example, a W3C specification undergoes a number of revisions, but they
>> are all identified with the same latest-version link. This suggests the
>> specification through its lifetime as a dynamic resource, and particular
>> revisions as constrained views of that resource.  Provenance might be
>> applied to either; e.g. if the creator of the overall resource is Tom, then Tom
>> is also the creator of the various revisions, but the most-recent editor is static
>> only for given revisions.
>>
>> This kind of constraining relation seems useful and natural to me - even in an
>> open world - and reasonably easy to reason about to boot.  But I can't see
>> what useful actionable purpose is served by a relationship like
>> complementOf.  For me, the complexity and impenetrability of the
>> description is indicative of a problem here.
>>
>>>> In the example given there, I think it is claiming that, say, views
>>>> M3 and L2 are complementOf, but I'd say they are not in any IVPof
>> relation.
>>> I would have said IVPof fits this too, but I'm not sure that opinion
>>> was shared. I think the broader, potentially symmetric definition is
>>> what we need,
>> regardless of what the answer was.
>>
>> Interesting - almost the opposite of what I just wrote :)
>>
>> What (practical) *use* do you see in the complementOf relation?  Why might
>> a developer of provenance-handling software care about it?
>>
>> Cheers,
>>
>> #g
>> --
>>
>>
>>>> Myers, Jim wrote:
>>>>> I hadn't interpreted the name change and lifting of the property
>>>> restrictions as changing the definition as you do it below. Is that
>>>> what is being proposed? To limit complementOf to ~peer relations
>>>> versus simply being a drop-in replacement for IVPof with less
>>>> definition of how properties might relate?
>>>>>  Jim
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Graham Klyne [mailto:Graham.Klyne@zoo.ox.ac.uk]
>>>>>> Sent: Friday, August 19, 2011 7:22 AM
>>>>>> To: Myers, Jim
>>>>>> Cc: Khalid Belhajjame; Paul Groth; public-prov-wg@w3.org
>>>>>> Subject: Re: updates to PAQ doc for discussion
>>>>>>
>>>>>> I too find the name unhelpful.  But I'm also concerned about the
>>>>>> form of the definition.  I'm not sure "generality" is the right
>>>>>> aspect, though, as in some ways I see IVPof (to use the old name)
>>>>>> as being more general than complementOf.
>>>>>>
>>>>>> Why:
>>>>>>
>>>>>> Roughly, using SPARQL, I can use IVPof to locate instances of
>>>> complementOf.
>>>>>> But I can't see how to do the other way.
>>>>>>
>>>>>> e.g.
>>>>>>
>>>>>> [[
>>>>>> CONSTRUCT
>>>>>>     { ?v1 complementOf ?v2 }
>>>>>> WHERE
>>>>>>     { ?v1 IVPof ?r . ?v2 IVPof ?r } ]]
>>>>>>
>>>>>> So from this operational perspective, IVPof is more generally
>> applicable.
>>>>>> (But from another perspective, this is possible because IVPof is
>>>>>> more constraining - less general - than complementOf.  Hence my
>>>>>> comment about generality not necessarily being a helpful
>>>>>> criterion.)
>>>>>>
>>>>>> I find that when I think about provenance being related to an
>>>>>> invariant or less variant view of a resource (e.g. see the
>>>>>> discussion at
>>>>>> http://dvcs.w3.org/hg/prov/raw-file/tip/paq/provenance-
>>>>>> access.html#provenance--context-and-resources),
>>>>>> the notion of IVP is useful.  I have not yet found a case where
>>>>>> talking/thinking about complementOf is useful to me.  For this
>>>>>> reason, I prefer having IVPof (or viewOf, or some other name) to
>>>> complementOf.
>>>>>> #g
>>>>>> --
>>>>>>
>>>>>>
>>>>>> Myers, Jim wrote:
>>>>>>> I'm complaining about the name 'complement' not the generality of
>>>>>>> the definition. Complementary angles are not different
>>>>>>> characterizations of the same angle, they are different angles
>>>>>>> that create a whole. A wine complements food. Some other term
>> with
>>>>>>> the broader definition would be fine. (BTW: I am beginning to
>>>>>>> think that being able to associate a time interval with the
>>>>>>> relationship would be useful...)
>>>>>>>
>>>>>>>  Jim
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Khalid Belhajjame [mailto:Khalid.Belhajjame@cs.man.ac.uk]
>>>>>>>> Sent: Wednesday, August 17, 2011 1:31 PM
>>>>>>>> To: Myers, Jim
>>>>>>>> Cc: Paul Groth; Graham Klyne; public-prov-wg@w3.org
>>>>>>>> Subject: Re: updates to PAQ doc for discussion
>>>>>>>>
>>>>>>>> Hi Jim
>>>>>>>>
>>>>>>>> On 16/08/2011 13:45, Myers, Jim wrote:
>>>>>>>>> As for complementOf - since complement means 'counterpart' and
>>>> has
>>>>>>>>> the
>>>>>>>> notion of not being the same thing - being separate and adding to
>>>>>>>> the thing, I don't think it works as a replacement for IVPof -
>>>>>>>> viewOf doesn't capture everything but would be better than
>>>>>>>> complement in that its English meaning does not conflict ...
>>>>>>>>
>>>>>>>> I am not sure I understand what you mean. Could you please
>>>> elaborate?
>>>>>>>> The way isComplementOf is defined seems to me more general than
>>>>>>>> IVPof and also more natural. While IVPof requires that all the
>>>>>>>> immutable attributes of one characterization are subset of the
>>>>>>>> immutable attributes of the other characterization,
>>>>>>>> isComplementOf does not pose this constraint, which is
>>>>>>>> plausible: in practice, when we have two characterizations of an
>>>>>>>> entity, these characterizations are likely to use different set
>>>>>>>> of attributes depending on the observer, and the likelihood that
>>>>>>>> the immutable attributes of one are subset of the immutable
>>>>>>>> attributes of the
>>>>>> second is small.
>>>>>>>> Khalid
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>   Jim
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Paul Groth [mailto:p.t.groth@vu.nl]
>>>>>>>>>> Sent: Tuesday, August 16, 2011 1:21 AM
>>>>>>>>>> To: Myers, Jim
>>>>>>>>>> Cc: Graham Klyne; public-prov-wg@w3.org
>>>>>>>>>> Subject: Re: updates to PAQ doc for discussion
>>>>>>>>>>
>>>>>>>>>> Hi Jim
>>>>>>>>>>
>>>>>>>>>> I think <link> elements in PAQ serve a different purpose: the
>>>>>>>>>> semantics is "here's how you find me (the resource) in
>>>>>>>>>> provenance information".
>>>>>>>>>> ComplementOf has a much more constrained meaning.
>>>>>>>>>>
>>>>>>>>>> Paul
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Aug 16, 2011, at 3:01, "Myers, Jim"<MYERSJ4@rpi.edu>  wrote:
>>>>>>>>>>
>>>>>>>>>>> But, having introduced the definition in this way, other uses
>>>>>>>>>>> are possible.  The example I've started thinking about is that
>>>>>>>>>>> multiple <link>  elements might indicate different URIs
>>>>>>>>>>> denoting different levels of
>>>>>>>>>> invariance.
>>>>>>>>>>> - why aren't these just IVPof relationships? (I'm not arguing
>>>>>>>>>>> against encoding pil relationships as links, just against
>>>>>>>>>>> adding a
>>>> 'target'
>>>>>>>>>>> concept that duplicates other relationships in the model.)
>>>>>>>>>>>
>>>>>>>>>>> Jim
>>>>>>>>>>> ________________________________________
>>>>>>>>>>> From: Graham Klyne [graham.klyne@zoo.ox.ac.uk]
>>>>>>>>>>> Sent: Monday, August 15, 2011 5:38 PM
>>>>>>>>>>> To: Myers, Jim
>>>>>>>>>>> Cc: Paul Groth; public-prov-wg@w3.org
>>>>>>>>>>> Subject: Re: updates to PAQ doc for discussion
>>>>>>>>>>>
>>>>>>>>>>> Myers, Jim wrote:
>>>>>>>>>>>>> In Issue 46 (http://www.w3.org/2011/prov/track/issues/46),
>>>>>>>>>>>>> Luc raised the point that the scenario we had agreed to
>>>>>>>>>>>>> address included a case where the recipient of a resource
>>>>>>>>>>>>> representation had no way to know its URI for the purposes
>>>>>>>>>>>>> of provenance discovery.  After short discussion, my
>>>>>>>>>>>>> response to this issue was to introduce a new link relation
>>>>>>>>>>>>> type (currently called
>>>>>>>>>>>>> "target") to allow this URI to be encoded
>>>>>>>>>> in the header of an HTML document.
>>>>>>>>>>>>> Does this help?
>>>>>>>>>>>> So this is only used inside an HTML entity?
>>>>>>>>>>> That was the compelling use-case, but once defined, other uses
>>>>>>>>>>> are not
>>>>>>>>>> excluded.
>>>>>>>>>>>> ... I.e. it is not a relationship between two entities, but
>>>>>>>>>>>> is a means to embed an identifier in an entity (for HTML)?
>>>>>>>>>>> Interesting take.  Practically, in the HTML use case, I think
>>>>>>>>>>> I'd have to
>>>>>>>> agree.
>>>>>>>>>>> But I think it is still technically a relation in the same way
>>>>>>>>>>> that owl:sameAs is a relation, even though its semantics tell
>>>>>>>>>>> us that the related RDF nodes denote the same thing.  Like all
>>>>>>>>>>> HTML<link> elements, it defines a relation between the
>>>>>>>>>>> resource of which the containing document is a representation
>>>>>>>>>>> and a resource denoted by the given
>>>>>>>>>> URI.  They may both be the same resource.
>>>>>>>>>>> But, having introduced the definition in this way, other uses
>>>>>>>>>>> are possible.  The example I've started thinking about is that
>>>>>>>>>>> multiple <link>  elements might indicate different URIs
>>>>>>>>>>> denoting different levels of invariance.  If the HTML is a
>>>>>>>>>>> document in a source code management system, one such URI
>>>>>>>>>>> might denote a specific version, and another might denote the
>> "current"
>>>>>>>>>>> version, both of which might reasonably
>>>>>>>>>> be the referent for provenance assertions.
>>>>>>>>>>> These other uses are not reasons that the proposal was
>>>>>>>>>>> introduced, but are just consequences of not placing
>>>>>>>>>>> unnecessary constraints on the use of the existing<link>  feature
>> as defined.
>>>>>>>>>>>> An "ID card" mechanism that would allow me to keep my
>>>>>>>>>>>> rdf:resource URL
>>>>>>>>>> on my physical body so you could connect me to my online
>>>>>>>>>> identity is the same type of thing?
>>>>>>>>>>> Hmmm... I suppose you might think of it like that, but I'm
>>>>>>>>>>> wary of adopting that view as it tends to arbitrarily exclude
>>>>>>>>>>> other possibilities that arguably should flow from this use of
>>>>>>>>>>> the<link>
>>>>>> element.
>>>>>>>>>>> #g
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>
>>>
>>>
>
>


Received on Saturday, 20 August 2011 19:27:37 UTC