Re: PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology]

Hello Martin,

I wouldn't disagree with what you say about physical and information
resources, but I would take a different perspective on what needs
defining.

The view taken in OPM (and, before that, the PASOA project and
others) is that to talk unambiguously about a "thing's" provenance,
the "thing" you are talking about must be immutable. The provenance of
a mutable thing would depend on its state when you ask the question,
and this might not even be the state you think it is (it may have
changed since you last looked). This is not to say that it is
unreasonable for anyone to ask for the provenance of something
mutable, only that this should be answered by drawing on the more
tangible provenance of immutable things.

Either a "physical object" or an "information object" can be viewed
in a single state and context, i.e. as it stands, unchanging, at a
given time and (for information objects) in a given replica. This
state is immutable (once the object changes, it is in a new state) and
has a well-defined provenance.

I agree that the provenance of the state of a physical object may look
different from the provenance of a state of an information object, in
that prior states of the same object will be linear for the physical
object, but may not be for the information object. However, I'm not
convinced that this is actually important. The provenance of anything
will surely be non-linear, as what it is now will derive from not only
prior versions of itself, but also whatever else caused it to change
or be created (e.g. I am a product of my earlier self, but also of
your emails :-)).

So, my personal inclination would be to start by defining a concept
for something immutable, specify what the provenance of such a
thing would be, and then express the provenance of something
mutable in terms of that.
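
To make this layering concrete, here is a minimal Python sketch. It is
illustrative only: the names State, MutableThing, provenance and
provenance_at are hypothetical and are not taken from OPM or from any
vocabulary under discussion. It assumes that an immutable state records
what it was directly derived from, and that a mutable thing is just a
time-ordered series of such states.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class State:
    """An immutable snapshot of some thing (hypothetical name)."""
    label: str
    # Direct causes: prior states of the same thing and any other inputs.
    derived_from: tuple[State, ...] = ()


def provenance(state: State) -> set[str]:
    """Labels of every state the given state derives from, transitively."""
    seen: set[str] = set()
    stack = list(state.derived_from)
    while stack:
        s = stack.pop()
        if s.label not in seen:
            seen.add(s.label)
            stack.extend(s.derived_from)
    return seen


@dataclass
class MutableThing:
    """A mutable thing modelled as (time, State) pairs in time order."""
    history: list[tuple[int, State]] = field(default_factory=list)

    def state_at(self, time: int) -> Optional[State]:
        """The latest state recorded at or before the given time, if any."""
        current = None
        for t, s in self.history:
            if t <= time:
                current = s
        return current

    def provenance_at(self, time: int) -> set[str]:
        """Provenance of the mutable thing, answered via its state at a time."""
        s = self.state_at(time)
        return provenance(s) if s is not None else set()


# Assumed example data: a later state derives from an earlier state of
# the same thing and from an unrelated input (non-linear provenance).
email = State("martin-email")
simon_v1 = State("simon@t0")
simon_v2 = State("simon@t1", derived_from=(simon_v1, email))
simon = MutableThing([(0, simon_v1), (1, simon_v2)])
```

In this sketch, asking for the provenance of the mutable thing always
requires a time, which selects one immutable state; the answer is then
that state's provenance.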

I would also personally not predefine what an immutable thing could
be. For example, one webpage contained in one HTTP response may appear
to be an immutable state, but could be transformed by proxies in
communication or by the browser after download or have copies cached,
while on the other hand, we may be so certain the batch of items from
a production line are identical and unchanged through transit that we
can unambiguously talk about them as one immutable state even though
it spans space (multiple objects) and time (time in transit). Another
example is that "reading" a file will not change its contents, but may
change a timestamp which may later be relevant to how it is used.
Immutability seems critical to defining provenance, but anything is
immutable only by ignoring changes you consider irrelevant.
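
The file-reading example can be sketched in Python as follows. This is
a hedged illustration, not a proposed definition: the function state_id,
the attribute names and the sample values are all hypothetical. It
assumes a state is identified by hashing only the attributes that are
considered relevant, so the same two observations count as one state or
as two depending on that choice.

```python
import hashlib
import json


def state_id(attributes: dict, relevant: set[str]) -> str:
    """Identify a state by hashing only the attributes deemed relevant."""
    kept = {k: attributes[k] for k in sorted(relevant) if k in attributes}
    payload = json.dumps(kept, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Assumed observations of one file before and after it is read:
# the content is unchanged, but the access timestamp differs.
before_read = {"content": "hello", "last_access": "t0"}
after_read = {"content": "hello", "last_access": "t1"}

# Ignoring the timestamp, both observations are the same immutable state.
same_by_content = (
    state_id(before_read, {"content"}) == state_id(after_read, {"content"})
)

# Treating the timestamp as relevant, they are two different states.
same_with_timestamp = (
    state_id(before_read, {"content", "last_access"})
    == state_id(after_read, {"content", "last_access"})
)
```

Here same_by_content is True and same_with_timestamp is False: nothing
about the file differs between the two comparisons, only the decision
about which changes matter.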

Thanks,
Simon

On 29 May 2011 14:02, martin <martin@ics.forth.gr> wrote:
> Dear All,
>
> Following the phone-conference on May 26, let me repeat some thoughts:
>
> The definition of a Resource that has the potential to have a provenance
> (following Guarino, Gruber, ontologies describe possible states of affairs as precisely as possible)
> in a Semantic Web relevant way, should be specific enough, so that we can clearly
> identify a set of properties that are relevant and connect in a relevant way to
> answer provenance queries.
>
> From our background I suggest that the distinction of a Physical Resource consisting of matter
> (for instance crm:E18 Physical Thing) from an InformationResource (irw:InformationResource
> or crm:E73 Information) is necessary and fundamental, because
>
> 1) a physical thing undergoes a linear sequence of states and changes, because any change destroys
>  the previous form. It can only be at one place at a time. From this we infer most of our
> common sense logic of provenance and identity. Even splitting or merging an object destroys all
> predecessors. Identity can be based on continuity of custody (sequence of all ID cards), or
> essential properties (fingerprints etc).
>
> 2) An Information Object can reside on multiple carriers (or "realizations", "copies", "items") at the same time.
> The state of change of any of the copies cannot be related without complete world knowledge to that of
> other copies, because we cannot know what may happen on the other side of the world.
> Therefore the Information Object itself has no well-defined or verifiable states of change in its nature as data.
> Therefore changes of Information Objects are better described as creations of new ones for any minimal change.
> Identity can be based on content, for provenance reasoning best on a bit or character identity.
>
> As a consequence, analogue photographic material in film industry etc. is better traced as material objects,
> because there is no convention to define identity of content for different copies of analogue photographs.
>
> Using provenance for authenticity reasoning on information objects will rely, besides others, on the fate of multiple copies.
> Not being able to distinguish the behavior of carriers from the actual data would be prohibitive to
> such reasoning.
>
> Further, universals ("Concept", crm:E55 Type), such as "man", "dog" behave again differently, because the
> IsA relations and often fuzzy boundaries of concepts create again different identity conditions and much
> more confused states. I propose to exclude provenance of universals from the discussion until we have understood
> the other two.
>
> I maintain that no more distinctions need be made for this PROVENANCE discussion.
>
> FRBR entities have been mentioned in the discussion. In the CRM-FRBR Harmonization Group we
> concluded (http://www.cidoc-crm.org/docs/frbr_oo/frbr_docs/FRBRoo_V1.0.1.doc)
> together with the IFLA FRBR Review Group that the identification of Work, Expression, Manifestation
> is in practice done by selecting "representative" existing realizations, which have a clear identity by content,
> be it fragments or copies of copies of lost works. Therefore the "conceptual nature" of a Work should not confuse
> us. The provenance would still be based on realizations.
>
> Best,
>
> Martin
>
>
> --
>
> --------------------------------------------------------------
>  Dr. Martin Doerr              |  Vox:+30(2810)391625        |
>  Research Director             |  Fax:+30(2810)391638        |
>                                |  Email: martin@ics.forth.gr |
>                                                              |
>                Center for Cultural Informatics               |
>                Information Systems Laboratory                |
>                 Institute of Computer Science                |
>    Foundation for Research and Technology - Hellas (FORTH)   |
>                                                              |
>  Vassilika Vouton,P.O.Box1385,GR71110 Heraklion,Crete,Greece |
>                                                              |
>          Web-site: http://www.ics.forth.gr/isl               |
> --------------------------------------------------------------
>



-- 
Dr Simon Miles
Lecturer, Department of Informatics
King's College London, WC2R 2LS, UK
+44 (0)20 7848 1166

Received on Sunday, 29 May 2011 17:47:02 UTC