Additional thoughts on IVPT

From Provenance WG Wiki

Following are some personal notes that I would like to submit to the discussion as additional input (Paolo)

These notes are an attempt to reconcile a number of proposals on the concept of "Invariant view or perspective on thing (IVPT)".

As a baseline, I will use the most recent definition, found here, which seems to summarize the preceding discussions. The proposal that follows is an attempt to consolidate, extend, and in part rephrase that definition.

Note. I have replaced the term IVPT with just "perspective", to make it more neutral with respect to previous nomenclature, but one can see that the two are very close. One key difference is that in this formulation "all perspectives are created equal", i.e., they are all perspectives on some abstract entity, rather than being subordinate to one another (whereas "B is a view of A" is anti-symmetric in the original formulation). This is up for discussion but is a core distinction.

Our goal is to define the subject of provenance, that is, a model for a universe of objects (in the sense of sets of property-value pairs) to which provenance can be meaningfully associated.

The over-arching principles that govern the model are as follows.

Abstract entities and their perspectives:

  • We assume a relativistic view of the world. In this view, abstract entities are primitive and have an ontological value (in the original philosophical sense), but they can only be described through observations, and these observations are generally subjective and relative to an observer. Thus, the observer cannot describe the entity per se, but can only describe a perspective on it.
  • A perspective consists of a set of properties.
  • Observers are also known as asserters, as they are responsible for asserting the properties of a perspective.
    • Example: I see a tree in my garden. It is the manifestation of some "Tree", which I can describe in terms of colour, size, and shape of the leaves.
  • There may be multiple observers, each providing a (possibly) different perspective on some entity.
    • Example: my botanist friend describes a tree in my garden, in terms of species and age, in addition to size.
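
As a minimal sketch of the notions above, a perspective can be modelled as a set of property-value pairs asserted by one observer about some abstract entity. The class and field names (`Perspective`, `asserter`, `entity`) and all property values are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Perspective:
    """One observer's description of an abstract entity (hypothetical model)."""
    asserter: str   # the observer who asserts the properties
    entity: str     # label for the abstract entity being observed
    properties: dict = field(default_factory=dict)  # property name -> value


# Two observers describe the same tree with different property sets.
# All values below are made up for illustration.
mine = Perspective("me", "tree",
                   {"colour": "green", "size_m": 4.0, "leaf_shape": "lobed"})
hers = Perspective("botanist", "tree",
                   {"species": "oak", "age_years": 30, "size_m": 4.1})

# Properties that only the botanist's perspective has a value for.
only_hers = set(hers.properties) - set(mine.properties)
```

Neither object is "the tree"; both are perspectives on it, and neither is subordinate to the other.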

Mappings across perspectives:

  • Different observers may agree that their perspectives are different manifestations of the same abstract entity. Such agreement may be stipulated in terms of mappings, or correspondences, across sets of properties that characterize each of the different perspectives.
    • Example: The botanist and I agree that we are really describing two manifestations of the same tree, as we find there is sufficient evidence of that (ultimately, we trust our measurements, or our senses). The botanist will even be able to describe my property "shape of leaves" in terms of her more appropriate "species" property (assuming that species determines shape of leaves!). We also agree on size, possibly by mapping across two different ways of measuring it, and within some tolerance.
  • Mappings can be partial.
    • Example: The botanist can tell the age of the tree, while I can't. This is not to say that age is not a plausible property for me, but I am unable to associate a value to it from my perspective.
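
The tree example can be sketched as a partial mapping between two property sets. This is only one possible encoding: the property names, the values, the species-to-leaf-shape table, and the `agrees` helper are all hypothetical.

```python
# Hypothetical property values for the two perspectives on the tree.
my_view = {"leaf_shape": "lobed", "size_m": 4.0}
botanist_view = {"species": "oak", "size_cm": 410, "age_years": 30}

# Illustrative assumption: species determines leaf shape.
LEAF_SHAPE_BY_SPECIES = {"oak": "lobed", "willow": "lanceolate"}

# Each entry translates the botanist's properties into one of mine.
# The mapping is partial: nothing on my side corresponds to "age_years".
mapping = {
    "leaf_shape": lambda b: LEAF_SHAPE_BY_SPECIES[b["species"]],
    "size_m": lambda b: b["size_cm"] / 100,
}


def agrees(mine, theirs, mapping, tolerance=0.2):
    """True if every mapped property matches (numbers within tolerance)."""
    for name, translate in mapping.items():
        expected, actual = translate(theirs), mine[name]
        if isinstance(actual, (int, float)):
            if abs(expected - actual) > tolerance:
                return False
        elif expected != actual:
            return False
    return True


same_tree = agrees(my_view, botanist_view, mapping)
```

Agreement here is a stipulation by the observers, so the tolerance is whatever they are willing to accept, not a property of the tree.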

Evolving state of perspectives:

  • The state of a perspective P at time t consists of the values of the properties that characterize P at time t.
  • The state of P may change over time. Sometimes changes can be ascribed to activities and events that some of the observers can perceive. This means that some events may cause a particular perspective to change. More commonly, an event may result in changes in multiple perspectives, and only some of the observers may be able to relate the event to the change.
  • When observers can make causal connections, they may use relations such as "derivation", "generation", and others, which are being discussed elsewhere, to characterize the change.
    • Example: the tree is pruned overnight. We both acknowledge the change; in addition, however, the botanist witnessed the process. From her perspective, there is a causal relation between the process and the fact that the tree has changed. I, on the other hand, can do no better than recording that some properties of the tree have changed. However, I may rely on the botanist's observations to conclude that the cause of the tree having a different shape is the process that she observed. This inference relies on our prior agreement that we have been observing the same tree.
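
A minimal sketch of the state of a perspective at time t, and of borrowing a causal assertion from another observer, might look as follows. The timestamps, property values, and helper names (`state_at`, `changed`) are assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical history of my perspective: (time, property values), in time order.
history = [
    (0, {"shape": "bushy", "size_m": 4.0}),
    (10, {"shape": "trimmed", "size_m": 3.2}),  # recorded after the pruning
]


def state_at(history, t):
    """Property values in force at time t, or None before the first record."""
    times = [time for time, _ in history]
    i = bisect_right(times, t)
    return history[i - 1][1] if i else None


def changed(history, t0, t1):
    """Names of properties whose values differ between times t0 and t1."""
    before, after = state_at(history, t0), state_at(history, t1)
    return {k for k in before.keys() | after.keys() if before.get(k) != after.get(k)}


# The botanist witnessed the process, so she can assert a cause.
botanist_assertion = {"cause": "pruning", "time": 9}

# I can only record that properties changed. The cause is borrowed from her
# perspective, which is justified only by our prior agreement that we are
# observing the same tree.
my_record = {"changed": changed(history, 5, 12),
             "cause": botanist_assertion["cause"]}
```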

Key known issues (leading to the nature of provenance)

It is not the goal of this section to define provenance. However, an intuitive notion of what provenance should be able to express has been driving the discussion on invariant properties, so it must be mentioned.

Identity, or: "The artist formerly known as Prince"...

Sometimes state changes are so drastic that one observer may no longer recognize the same underlying entity after the change.

  • Example: a block of ice is sculpted into the shape of a tree. One observer (the only one, for simplicity) agrees that it is no longer just a block of ice, and she also agrees on assertions that describe the provenance of the new shape (these include the raw block, the process of sculpting, and possibly more). Later, the ice melts, leaving a pool of water. The observer agrees that it is no longer a tree sculpture. The best she can do at this point is to acknowledge the new observable (the pool of water), possibly make additional causal assertions having to do with an increase in temperature, and make assertions that link it to the previous observable, the sculpture. So she has a new perspective, on something she describes as "a pool of water", and there is no expectation that this new perspective is "the same as" the sculpture, or the raw block. They are different, they have different properties, but they are part of one evolutionary process: provenance, as perceived by this observer, connects them.

Do we need invariant properties? I don't know. The previous note seems to indicate that we don't, but this is open to discussion.

Provenance of the state of a perspective, and across multiple perspectives: In this model, we have:

  • The provenance of a perspective at a given point in time, which consists of a collection of assertions about that perspective, accumulated over time.
  • The provenance assertions of different perspectives may be related to one another. Such relationships are induced by (that is, they follow) the mappings across perspectives that observers have agreed upon. Thus, provenance is not one single provenance graph; rather, it consists of multiple graphs, interconnected, in principle at least, by correspondences that map across each other's properties.

The File/Document/Printout example.

Example found [here]

There are three observers (they may be the same, but logically they are distinct) and three perspectives:

  • a Document d
  • a File f
  • a Printout p

These perspectives have some corresponding properties, including Author (for Document), Owner (for File and Printout), Content (for all of them), Creator, etc. In each perspective the meaning may be different, but this is fine, as long as there is agreement on partial mappings across them (or agreement that mappings don't exist).

Each perspective may evolve along its own Events line (sort of a Timeline but with implicit time). Events are naturally interleaved, for instance:

  • d events: Paolo edits d, then Stian edits d
  • f events: Stian creates f, then Paolo writes f, then Stian reads f
  • p events: Stian prints p

Each events line leads to a perspective-specific provenance graph.

Not only are the events temporally interleaved, but a logical relationship may sometimes be detected. This is only really necessary when we attempt to link up the individual provenance graphs into an inter-perspective provenance. In this case, mappings provide "bridges" across the different provenance graphs. For instance, file f at time t corresponds to document d after Paolo's edits and before Stian's edits, etc.
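
The events lines and bridges can be sketched as follows. The graph encoding (states numbered along each events line) and the helper names are hypothetical, and the choice of which state of f corresponds to d is an assumption, since the text only says "at time t".

```python
# Events lines from the example. Order within each line is given;
# absolute time is implicit.
events = {
    "d": ["Paolo edits d", "Stian edits d"],
    "f": ["Stian creates f", "Paolo writes f", "Stian reads f"],
    "p": ["Stian prints p"],
}


def provenance_graph(name, line):
    """Chain the states of one perspective: (name, i) --event--> (name, i + 1)."""
    return [((name, i), event, (name, i + 1)) for i, event in enumerate(line)]


# One provenance graph per perspective.
graphs = {name: provenance_graph(name, line) for name, line in events.items()}

# A bridge records that two states in different graphs correspond.
# Assumption for illustration: f after "Paolo writes f" (state 2) corresponds
# to d after Paolo's edits and before Stian's edits (state 1).
bridges = [(("f", 2), ("d", 1))]


def corresponding(node, bridges):
    """States in other perspectives that the given state is bridged to."""
    out = [b for a, b in bridges if a == node]
    out += [a for a, b in bridges if b == node]
    return out
```

Keeping the bridges separate from the per-perspective graphs reflects the point that each graph stands on its own, and that links across them exist only where observers have agreed on a mapping.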