Provenance Working Group Teleconference

06 Jul 2011


See also: IRC log


Present: +1.617.715.aaaa, Meeting_Room, stain, zednik, GK, [ISI], Lena, +1.561.216.aadd, +1.858.210.aaee
Chair: Luc Moreau
Scribe: Simon Miles, ericstephan, JimMcCusker, Tim Lebo


<trackbot> Date: 06 July 2011

<Luc> Scribe: Simon Miles

<Luc> conference code 77681#

<stain> mr. conference is not listening to my code

<stain> it's restricted at this time

<stain> what code is it?

<pgroth> hmm, it should be the same one, right?

<Luc> conference code 77681#

<stain> ah.. with a 1 in the end

<stain> no, it's not valid

<stain> The conference is restricted at this time for 7768# - not valid for 77681#

<zednik> I just got on with 77681#

<stain> hurray

<stain> Zakim: +44.789.470.aacc is me

<stain> (and my mobile number recognized from Skype)

<stain> is there a ppt or video link?

<stain> Zakim: +??P9 is me

<zednik> will there be any screen sharing? webex or gotomeeting?

<zednik> + +1.518.633.aabb

<zednik> Zakim: +1.518.633.aabb is me

<stain> Zakim: ??P9 is me

<smiles> Luc: a round of introductions...

<smiles> Luc: I am a co-chair of the WG

<stain> we only hear fragments as the conference telephone is muting you too eagerly

<tlebo> we will be louder

<smiles> All: introduce themselves

<smiles> Luc: 4 sessions today, 4 tomorrow; finish at 5pm on the dot tomorrow, maybe later today

<Luc> PROPOSED to accept the minutes of 30 Jun telecon

<smiles> +1

<khalidbelhajjame> +1

<stain> +1

<jcheney> +1

<Luc> ACCEPTED minutes of 30 Jun telecon

<zednik> +1

<smiles> Luc: Action review - no actions

<smiles> Luc: Meeting objectives: slides available from agenda page

<stain> http://www.w3.org/2011/prov/wiki/File:F2FObjectives.pdf

<smiles> Luc: 7 deliverables and timetable to produce them are in the charter

<smiles> ... first draft of conceptual and formal models due in 3 months time

<smiles> ... What would we like to release by 6 months deadline?

<stain> zednik: are you able to hear this..?

<smiles> ... aspire to define *core* concepts and resolve most issues for these concepts

<stain> both my skype and voip connection are fragmenting a lot.. "that's the minimal. We need the inspir... ahsl ... got some agreements

<smiles> Deborah: Are the core concepts documented somewhere?

<khalidbelhajjame> http://www.w3.org/2011/prov/wiki/ProvenanceConcepts

<smiles> Luc: for formal model first draft, have lightweight model using semweb technologies, have resolved issues related to that model

<zednik> stain: the audio is quiet but followable for me

<smiles> Luc: access and query TF, could aim to produce draft regarding access only by 6 months deadline

<smiles> ... issues related to the proposals resolved by first draft

<smiles> Luc: any comments on first draft aims?

<smiles> Paulo: in incubator group, we identified core concepts which we now use in WG, but can see some redundancy and overlapping in them

<qwebirc856316> so the ProvenanceConcepts link above by khalidbelhajjame i think is a set of proposed core ; is there a similar list for other concepts that may or may not be included?

<smiles> Luc: agreed that need to avoid overlap/ambiguity

<qwebirc856316> (sorry - qwebirc856316 is Deborah - i named myself but irc did not take it)

<IlkayAltintas> +q

<smiles> ... shows slide proposing process for next 3 months

<stain> GK, the sound might drop if the meeting goes quiet - as long as someone keeps making noise or talking it's OK :)

<smiles> ... aspiration to define all the core concepts in the charter as identified by model TF

<stain> GK: we're on http://www.w3.org/2011/prov/wiki/File:F2FObjectives.pdf

<sandro> WEBCAM IS UP. http://www.w3.org/People/Sandro/webcam

<sandro> (Sorry for low contrast on slides... the room is fairly bright.)

<stain> sandro: thanks, it's quite all right

<smiles> ... as soon as F2F1 over, want to produce draft of deliverables in W3C style, including schema (formal model)

<GK> @sandro, looks pretty useful, tx

<smiles> ... then review period, using W3C tools; it is here that we raise issues of overlap, redundancy etc.

<smiles> ... use telecons to discuss and resolve, prioritised by how much traffic on mailing list

<smiles> ... iterate for each issue, resolve by vote; last 2 weeks to finalise documents

<GK> @smiles, ReSpec makes it v. easy to make W3C style docs - http://dev.w3.org/2009/dap/ReSpec.js/documentation.html

<smiles> Khalid: two deliverables are due at same time, but D2 (formal model) dependent on D1 (conceptual model)

<smiles> Luc: have to do in parallel, co-evolve; people will be working on both

<smiles> Ilkay: confusion between formal model and formal semantics

<smiles> Paul: formal model is instantiation of model in semweb technology; (formal model is bad name); formal semantics is mathematical definition

<Luc> ack

<zednik> very quiet right now

<qwebirc413501> perhaps we should use another name rather than formal model - i think it is confusing - perhaps schema model

<smiles> jcheney: ambiguity in term formalisation, could mean mathematics or schema

<smiles> Paolo: note that D3 (formal semantics) is optional

<smiles> Luc: specified optional because we weren't sure if there would be critical mass in the WG; it seems that there is

<smiles> Deborah: terms may confuse readers

<GK> I think there is a danger that formal semantics makes a spec *less* useful if it's over-specified / over-constrained.

<smiles> Paul: mean "schema"

<smiles> Deborah: we need 1 schema

<GK> I'm not speaking against formal semantics, but think it needs to be approached lightly.

<smiles> Luc: for first draft, we are suggesting lightweight (e.g. RDFS) schema

<smiles> (note for minuting: qwebirc = Deborah)

<smiles> Luc: objectives for this meeting:

<smiles> ... gain further agreement on concept definitions

<smiles> ... solve some issues in concept definitions; some will be left to those defining schema

<smiles> ... describe journalism example using concepts

<smiles> ... discuss possible graphical notation

<qwebirc413501> Just for the record, I would like to get an RDFS as well as an OWL encoding (luc thought an owl encoding may take too much time - I think we can get a lightweight one out)

<smiles> ... gain agreement on provenance access, decide document structure, decide tech, resolve some issues

<smiles> ... for other two TFs, decide where we are going to go next, what test cases are and what we will do with them; identify responsibilities, ownership of documents

<smiles> Luc: anything else?

<smiles> Paolo: are we happy with the journalism example?

<smiles> pgroth: example can change, but agreed on that as basis

<smiles> Luc: good to adapt to expose problems of change

<smiles> jcheney: need other examples also so that others see connection with their domains

<smiles> pgroth: for illustration purposes, nice to have one

<smiles> Luc: Move onto next topic: Model TF

<stain> is it http://lists.w3.org/Archives/Public/public-prov-wg/2011Jul/att-0017/ModelTaskForce_F2F1.pptx ?

<smiles> Paolo: introduces TF members

<smiles> Paolo: overall objective of TF to define provenance model

<smiles> ... starting points: incubator group report, journalism example

<smiles> ... initially articulate concepts independently of semweb, then connect and define schema after and provide semantics

<smiles> ... for F2F1, tried to consolidate effort on mailing list, Wiki around key concepts discussed

<smiles> ... these are the consolidated concepts

<khalidbelhajjame> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts

<smiles> ... some came up recently (e.g. time) so not discussed much prior but considered important by WG

<smiles> Khalid: some can be seen as "concepts", some "relations between concepts"

<smiles> Paolo: looking at Thing definition, we have definition, examples in journalism use cases plus others

<tlebo> BTW, I'm tagging the wiki with categories http://www.w3.org/2011/prov/wiki/Category:Discussed_at_F2F1

<smiles> ... followed by issues for discussion, these are from the WG mailing list/telecon discussions

<smiles> ... we need to finalise definitions, evolve towards the deliverable document

<smiles> pgroth: in consolidated concepts, there are links to concepts that have been discussed, but there are others identified in charter but not discussed (e.g. collection)

<smiles> Paolo: also need to coordinate with access and query TF, to say how you obtain assertions in model

<smiles> ... as a WG, we have agreed on some points (see slides/Wiki for exact wording of points)

<pgroth> +q

<smiles> ... there are outstanding issues which need to be addressed

<smiles> ... next steps: formalise prioritised provenance concepts, map to journalism example and extend to account for agreed concepts

<smiles> ... example comes with some sample queries, which we need to try to express using our concepts

<smiles> ... also need a primer in natural language for those outside WG

<smiles> Deborah: primer also has examples of use?

<smiles> Paolo: yes

<smiles> pgroth: there is a separate primer for all of WG, but this comes later

<smiles> Paolo: being able to express example queries and write primer are tests of model

<smiles> pgroth: over dinner, ask us to come up with better names than PIL

<smiles> Luc: questions on Paolo presentation?

<smiles> Paulo: Was derivation discussed in a telecon?

<smiles> Luc: yes

<smiles> Paulo: do we need this concept at all?

<tlebo> where is the page listing suggested names for PIL?

<smiles> Luc: Derivation will be discussed in one of the F2F1 sessions

<smiles> Paulo: we will eventually need a "theory of provenance", founded on the model, combining formal semantics and model

<GK1> This talk of *a* theory of provenance makes me feel deeply uneasy. I think we need to put some vocabulary out there that developers can use.

<GK1> Also, there may be different theories applicable to different situations.

<smiles> ... looking at current discussions, looks like provenance theory would be based partially on proof theory, part on assertion theory

<smiles> ... would like WG to connect model with proof theory, as part of activity on formal semantics

<smiles> Luc: not yet discussed how formal semantics will be developed, happy for Paulo to put forward suggestions

<GK1> This is a standardization working group, not an academic research project. It's fair to note that there may be existing theories, and point them out, but I would worry if our work is committed to one that isn't *widely* recognized - and I'm not aware that such a thing exists.

<smiles> Paolo: see it as, if we can formalise model in, for example, proof theory, then this is welcome

<smiles> jcheney: waiting for informal definition process to converge before formalising

<smiles> Luc: it is clear that at this table there are those keen to provide formal semantics; want to get started after F2F1, but focus now is on natural language definitions

<qwebirc413501> +1 to getting a formalization discussion going (and acknowledge that it follows at least some consensus on some core from the model task force)

<smiles> Luc: we spent a long time talking about resources before we made some decisions - separate model from web architecture, then find some adequate definitions (thing, IVP of)

<GK1> We may have stopped talking about "resources", but AFAICT, a "thing" is described as exactly what is called a "resource" in web architecture.

<smiles> Luc: now want open discussion on these two concepts: thing and IPVT

<qwebirc413501> now looking at http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Thing

<smiles> Paolo: we now have "stuff", "state of stuff", "thing", "properties"

<smiles> Paolo: thing as defined has identity, invariant properties, mutable properties

<qwebirc413501> do we also have a distinction between stuff and thing? i am not sure of the need for "stuff"

<GK1> Paolo interesting example of ICE -> sculpture -> pool of water.

<smiles> ... talk about identity, and what changes mean a change in identity

<smiles> ... invariance is relative to a context/scope

<GK1> I agree that invariance is relative.

<smiles> ... therefore, mutable is also relative

<JimMcCusker> +1 that invariance is relative.

<smiles> Luc: Sandro came new to this; yesterday Paul and Luc discussed

<stain> +1 as well

<stain> qwebirc413501: I did previously suggest 'turtles all the way' so that there are no 'stuff' - but I guess the stuff is useful because it's the real thing behind a certain thing (which is just an interpretation)

<smiles> Sandro: first problem I had was "thing", which I assumed was the subject of provenance, but it is actually a characterisation of that subject

<stain> but it's still outside our vocabulary - we're not going to say anything about the stuff

<smiles> ... saw no place for variant properties

<smiles> Khalid: from provenance point of view, only describing invariant properties

<smiles> Paulo: may be more abstract or concrete things (e.g. sculpture vs water)

<GK1> I don't see more or less abstraction in sculpture vs water.

<zednik> GK: I agree

<smiles> ... don't think variance (IVP of) and abstraction are the same thing

<smiles> Paolo: agreed that abstractions give different assertions of provenance of same thing, but all boils down to properties

<smiles> zednik: can get into morass when talking about abstraction; all we talk about are abstractions

<qwebirc413501> +1 to not including more or less mutable or more less abstract

<qwebirc413501> +q (deborah)

<GK1> @zednik: +1. I'm thinking that this talk of "invariance" is really constraining to a context, such that provenance assertions we can make *are* invariant within that context.

<zednik> @GK: I completely agree

<smiles> Paolo: need to know scope to know what invariance is relative to

<stain> @Paolo: Very good description

<smiles> smiles: the identity of the thing could be the scope

<IlkayAltintas> +q

<stain> you can have abstract properties such as "the materials that make up the shape of a shirt"

<stain> it doesn't have to be a measurement

<smiles> Paulo: by abstract/concrete, see thing as concept over which reason, provenance as metadata to concept

<smiles> Luc: WG agreed that this is an assertion language

<JimMcCusker> +q

<smiles> Paulo: it is "description of thing" we care about

<smiles> SamCoppens: need to distinguish information resource and physical resource

<smiles> Luc: do not use the word "resource"

<Luc> ack (deborah)

<zednik> @deborah: please speak louder

<GK1> @samcoppens: I don't think distinguishing physical and info resources is helpful

<smiles> Deborah: don't think "stuff" is a good thing to introduce

<smiles> ... also not sure need to distinguish invariant and variant

<smiles> Luc: what is meant by not using "stuff"?

<smiles> Luc: "thing" is what is in assertion language, "stuff" is what it refers to in the world

<GK1> Re. Deborah's comment, I think provenance is (mainly) intended to describe instances, not classes

<GK1> (I think that's part of what the "in the past" discussion is trying to nail.)

<sandro> deb: PML used "IdentifiedThing"

<smiles> Deborah: in PML, stuff is merely the instance of the IdentifiedThing

<smiles> Luc: it is not just stuff identified, but state of stuff

<zednik> thing is state of stuff?

<zednik> cannot hear current speaker

<JimMcCusker> I would argue that in what we're talking about, thing is an observation of stuff.

<smiles> Ilkay: if when you change some property of a thing and it becomes a different thing, then it is an invariant property

<zednik> still cannot follow speaker

<GK1> FWIW, in Web Arch, a "resource" is something that *can be* identified. To the extent that "state of stuff" can be identified, it's also a resource in that sense.

<smiles> zednik: distinction between abstract and concrete not important or strong, what matters is what we can assert about

<GK1> @zednik: +1

<stain> @zednik: +1

<pgroth> close the queue

<sandro> "observation" for "thing"

<smiles> JimMcCusker: if a thing is a set of properties observed/asserted, then call invariant properties "observations"

<sandro> luc: but some things are not observed, thus "characterization".

<smiles> Luc: but also want to talk about things not observed

<tlebo> then the subjects of two disparate "observations" can or cannot be inferred to be identical.

<GK1> I'm not fully convinced by observations and things. Consider a stock ticker: a reasonable provenance assertion is that it's 15 minutes later than the "real" market data, IMO.

<sandro> (I wonder about "fingerprint")

<GK1> That's an invariant that survives any single observation.

<tlebo> fingerprint fits well with Jim's "observation".

<stain> the asserter might not just observe, also interpret, reason and.. guess

<smiles> Paulo: in response to Deborah, distinction between invariant and variant is often of interest; for example, in versions what we care about is what has changed versus the stable identity

<JimMcCusker> True. I guess "Assertion" would be the most general, with a particular plan/recipe/whatever that describes how the assertion is being made.

<sandro> Paulo: Provenance implies continuity and observation

<smiles> Paolo: more important that observed change than that change happened, and infer that process occurred to make that change

<sandro> paolo: process is also a key to provenance

<JimMcCusker> An assertion that has a creator who has the observer role is considered an observation.

<sandro> (I'm thinking it's not about mutablity, but about chaining from one snapshot to the next.)

<smiles> Luc: close this session for a break

<pgroth> hi all, we're breaking for 15 minutes

<sandro> restart at 11:05

<Luc> Chair: Paul Groth

<ericstephan> scribe: ericstephan

Paul: talk about some of the other concepts

Luc: we can raise issues but we also need to be pragmatic in terms of our time. Agreeing to disagree.

<sandro> ericstephan, use ":" after person's name

Jim: What we are calling a thing is an observation or assertion (or composite of assertions). It is an information artifact about a thing in the world. The assertion is something that is invariant.

<qwebirc413501> ? shall we mention states in this discussion?

Jim: the state of the thing in the world changes through time. If we assume that any worldly thing is variant and the assertion is invariant, we can make the distinction between the two concepts.

Paul: Suggest we propose definitions like Jim's and modify them.

<Paolo-2> Looks like we are going to project the irc window here so we are all on the same page regardless of location

<Luc> A thing is an information artifact about a subject in the world.

<tlebo> http://www.w3.org/2011/prov/wiki/ConceptThing

<smiles> Thing: "things" represent real-world stuffs and have properties modeling aspects of stuff states. Things have: an identity, a set of invariant (== immutable) properties, a set of mutable properties

<Paolo-2> For reference, above is the current proposal for thing

<GK1> It seems to me that the "invariance" is captured by saying that we can make certain enduring assertions about it.

<tlebo> Observer, ObservationalContext, SubjectOfObservation ?

Jim: The assertion describes the state as asserted by a particular entity.

<zednik> the characterization of a thing in a provenance assertion is invariant for the scope of the provenance assertion

Jim: The subject that is being described is always variant. The description stays the same at a particular point by a particular entity.

Tim: Descriptions of subjects do not exist outside an observation?

Luc: It's in the modeling that you talk about particular properties

<tlebo> Observations renamed to Descriptions.

<tlebo> Subjects are the things described by Descriptions.

Luc: I'd like to come back to the word description. When we had the word thing, the process execution used things. If you replace the word thing by description...

<zednik> What about Characterization?

Paul: it sounds like you need to do all of this in terms of description. Something in the world describes a particular state.

<tlebo> State of a Subject is captured within its Description.

Satya: How do you describe the characteristics of a process?

Jim: A process is a kind of thing therefore it is an entity in the world.

Satya: need to distinguish between Occurrence and Continual

<GK1> I think Satya is talking about "Occurrent" vs "continuant"

Paul: Rephrased generation describes a subject in the world described by a description (sorry if I munged this - Eric)

<Luc> A Description is an information artifact about a subject in the world. A Description is an invariant assertion, made at a particular point. (A Description could be made by guessing, lying, observing, ...) A Description is an Assertion about a subject that is variant in the world. A Description consists of invariant characteristics.
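
One way to read the proposed definition above is as a small data model. The following is only a minimal sketch under stated assumptions, not anything the WG agreed: the class names mirror the terms under discussion, while the field names (`asserter`, `made_at`, `characteristics`) and the helper `describe` are hypothetical and chosen purely for illustration. It shows the subject in the world remaining variant while a Description, once made, stays invariant.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from types import MappingProxyType
from typing import Mapping


class Entity:
    """A subject in the world; its properties may vary over time."""

    def __init__(self, name: str, **properties: str) -> None:
        self.name = name
        self.properties = dict(properties)


@dataclass(frozen=True)
class Description:
    """An invariant assertion about a subject, made at a particular point.

    All field names here are illustrative assumptions.
    """

    subject: Entity
    asserter: str          # who makes the claim (could be guessing, lying, observing)
    made_at: datetime      # the point at which the claim was made
    characteristics: Mapping[str, str]  # read-only copy of the asserted characteristics


def describe(entity: Entity, asserter: str, made_at: datetime) -> Description:
    """Hypothetical helper: freeze the entity's current properties into a Description."""
    frozen = MappingProxyType(dict(entity.properties))
    return Description(entity, asserter, made_at, frozen)


if __name__ == "__main__":
    shirt = Entity("shirt", colour="blue")
    claim = describe(shirt, "some-asserter", datetime(2011, 7, 6, tzinfo=timezone.utc))
    shirt.properties["colour"] = "black"    # the subject in the world changes
    print(claim.characteristics["colour"])  # the earlier Description does not
```

The frozen dataclass and the read-only mapping are just one design choice for expressing "invariant"; the sketch says nothing about truth, only that the claim itself does not change after it is made.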

<qwebirc413501> +q

<sandro> GK just type it

<GK1> Iack GK1

<GK1> My question is in the log, should show if you ack me

<Zakim> GK, you wanted to ask if my example of a 15 minute delayed stock ticker would be regarded as a reasonable provenance assertion. If so, I think description as observation doesn't

Paulo: Problem why we moved from observation to description?

<khalidbelhajjame> +q

Deborah: wanted to bring up the lack of provenance in state. Describing something in a moment. It could be a long period of time. Were we working with a state centric view but not discussing it?

<qwebirc413501> and further that possibly this new way of discussing it with descriptions might work

Paulo da Silva: Adding Subject Assertion to Thing Description.

<Luc> http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions

Luc: Revised definitions on the wiki

<satya> Is description a form of narration? (derived from Luc's definition)

Simon: Not clear about the later definition and what was being defined by Jim.

Jim: Description is always invariant
... Just because the description is invariant it doesn't mean the entire entity is invariant

<GK1> @jim: Don't we want to say the "Description" has enduring truth?

<GK1> ... (for "Description" as a provenance assertion)

Sandro: Put in a little vote for observation, description isn't bad but has many different types of meanings.

Luc: Can you have an observation that is not observed?

<Paolo-2> I like observation as it implies it is relative - to an observer. Of which there can be multiple

Paulo: make note of what Graham is trying to say.

<tlebo> (paolo - if you reload http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions the image will be smaller)

<GK1> yes .. the perspective/context

Jim: it's a claim, not an enduring truth

<GK1> OK "truth" is problematic

Jim: It's a piece of information that is enduring, but not sure about the truth bit.

<qwebirc413501> +1 to not using the word truth

<GK1> The nature is that the truth or otherwise of the claim doesn't change

<zednik> +1 to not using truth

<sandro> "invariant claim" maybe

<Zakim> zednik, you wanted to ask Observation has defined semantics in science

<zednik> an act of observing a property or phenomenon, with the goal of producing an estimate of the value of the property. A specialized event whose result is a data value.

Stephan: Within science observation has a different definition than the way we are using it.

<qwebirc413501> +1 to not using the word observation

Stephan: Avoid the term observation.

Paul: It is reasonable to replace the verbiage; who has the most votes for each term on the whiteboard?

<sandro> webcam folks, working? reload?

<GK1> WebCam OK

Vote on stuff, subject, thing, entity, and something in the world. Which one is your favorite?

<zednik> webcam is back up for me

<GK1> Are we voting on terms to appear in the actual spec?

<stain> Are we using AV or first past the post?

<stain> +1 stuff

<stain> I'm confused by the process.. can't see the hands and the video is out of sync

<GK1> stuff:-1, thing:0, entity:0, somethinginworld:-1,subject:OK,object:0

<zednik> +1 for entity

<stain> stuff:-1 thing:+1 entity:+1 somethingintheworld:-1 subject:1 object: 0

<Paolo> @stian just having fun

Paul: Restart the vote; rejection is the goal

<GK1> Webcam is a bit high on whiteboard, can't see bottom

Stuff rejected

<GK1> Resource:+1

<GK1> :)

Something in the world rejected

object and resource rejected

<tlebo> Subject ~= Thing ~= Entity

<Paolo> @GK you are then /rejecting/ resource, right?

<GK1> No, vote FOR. In the final analysis, I think what we want to capture is exactly the notion of a web resource.

<qwebirc413501> yes - rejecting Stuff, something in the world, object, and resource

<stain> Derivation as subject and object

<stain> has

Satya: Subject can be confusing from RDF perspective

<GK1> Subject and Object are confusing terms in RDF, but it's what we're stuck with.

<stain> luckily "stuff" is just as blurry everywhere else it's used!

Sandro: Suggest item

<zednik> Item: -1

Paul: We already made this decision: we cannot use resource.

<qwebirc413501> remaining terms - subject, thing, entity (and possibly item)

Sandro: We need to be clear on why we rejected resource

<GK1> Is this terminology fixed for the final spec? I'm happy to continue for now.

<GK1> There's no real discussion about *what* a *web resource* is -- the main discussion is about distinguishing different kinds of resource.

<Paolo> @gk not final, but we are trying to replace "stuff" and "thing" for the purpose of the next draft

Deborah: Entity is decently defined in some knowledge sources.

Tim: Of the three, thing and entity are not oriented toward being observed. We should give something of what we are talking about.

<GK1> @paolo - I'm content to continue for now with ¬resource, but I'd like to keep an option to revisit later

Paul: Can we just take a vote now?

Sandro: Unless anyone strongly rejects it may be reasonable to vote.

James: Just to put it in context, this vote is for the next draft

<pgroth> straw poll - choice between subject, thing and entity

<GK1> (IETF does "humming")

<pgroth> subject:

<GK1> subject:+1 (of the three)

<tlebo> +1 for subject

<qwebirc413501> Deborah votes for entity

<sandro> entity, because of rdf:subject

<satya> entity

<stain> +1 for entity

<Paolo> +1

<pgroth> subject

Paul: Reset

<Paulo> +1

<satya> -1

<GK1> +1

<tlebo> +1 for subject

<StephenCresswell> +1 for subject

<IlkayAltintas> -1

<pgroth> All those in favor of subject

(Reset again)

<satya> -1

<GK1> subject:+1

<tlebo> +1 for subject

<Paulo> +1

<StephenCresswell> +1

<Paolo> +1

<sandro> Jim: +1 subject

<pgroth> All those in favor of Thing

<khalidbelhajjame> +1

<satya> +1

<Vinh> +1

<qwebirc413501> Deborah +1 for entity


<pgroth> All those in favor of Entity

<RyanGolden> +1

<sandro> +1 entity

<stain> +1

<smiles> +1

<zednik> +1 for entity

<IlkayAltintas> +1

<jcheney> +1 entity

<SamCoppens> +1 for entity

<satya> +1

<sandro> No one was in favor of THING. khalidbelhajjame was about ENTITY

<Vinh> +1

Deborah: Khalid and Satya voted for Entity

<pgroth> decision entity

<sandro> PROPOSED: For the first draft, we'll use "ENTITY" instead of "stuff"....

<qwebirc413501> Second sandro's proposal

<jcheney> +1

<sandro> +1


<Paolo> +1

<smiles> +1

<qwebirc413501> +1

<Vinh> +1

<RyanGolden> +1

<satya> +1

<tlebo> +1

<SamCoppens> +1

<stain> +1

<IlkayAltintas> +1

<GK1> 0

<khalidbelhajjame> +1

<zednik> +1

<sandro> RESOLVED: For the first draft, we'll use "ENTITY" instead of "stuff"....

<stain> does it have to be <!ENTITY caps?

<sandro> NOT caps!

Paul: Vote for new names for Thing

<GK1> Snapshot:-1, Fingerprint:-1

<GK1> I need to break off, have (infrequent) train to catch.

<pgroth> thanks GK

<stain> View, Perspective, Interpretation

<zednik> lost call - calling back in

Luc: If you go back to the original definition of thing, we were identifying the state, not the stuff. By the same token, the "thing" has an identity, but not an entity in the world.

<stain> exactly!

<stain> when we give something an identity, we are implying a 'thing'

<stain> hence an interpretation/perspective/selection of the entity

Jim: The point of this is that from the set of entities you should be able to identify which entity you are talking about.

<tlebo> (my vote no): characterize: describe the distinctive nature or features of

Paul: We will do the speaker queue and then go through the rejections.

<tlebo> when we author OWL axioms, the owl:Class is the pil:Entity that we are creating pil:Descriptions of ?

<tlebo> pil: Descriptions

Satya: You need to have enough to properly distinguish between two things (black shirt and blue shirt)

<IlkayAltintas> +q

<tlebo> +1 to not needing to name the global thing and being PERMITTED to use your own name for the thing you are describing.

<tlebo> we don't need to name entities to describe them.

<zednik> audio is breaking up

<tlebo> feedback on phone: please mute yourself.

<tlebo> thanks!

Ilkay: We are trying to define too many things with one word.

<stain> Stian: the thing IS identifying the entity - we don't need to worry about how it identifies the entity

<pgroth> stain - yes

<zednik> audio is better now

<pgroth> stian, i think that's exactly the point

Paul: Let's try to reject some of the words

<zednik> vote to reject Observation

<zednik> vote to reject Assertion

<stain> @pgroth, yes, the 'thing' is a contextualised way to talk about the entity, like our blue shirt in the office

<zednik> vote to reject Entity Assertion

<zednik> vote to reject Fingerprint

<zednik> vote to reject Snapshot

<stain> -1 snapshot

<zednik> half vote on Representation?

<stain> Representation is good - but 'taken' already by HTTP

<stain> Description - does it imply that you need to include the description? (ie. some properties)

<satya> I agree with James - state description

Simon: The concept definitions from the conceptdefinitions page seem all very different than in the original definition.

Is description too general?

<stain> +1 too general

majority raised hands at f2f1

<zednik> +1 'just' Description is too general

<stain> no, not a state, a view or understanding of the entity

<tlebo> anti "characterization": b/c describe the ___distinctive nature___ or features of

<tlebo> pro "characterization": b/c describe the distinctive nature or ___features of___

<stain> the distinctiveness is key

<scribe> new edited definitions: http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions

<stain> http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions

Tim: Concerned about state in the definition.
... Do I need two descriptions of temperature if the temperature changed over two hours?

Paulo: The task modeling approach doesn't need to know all the intermediate states. My concern is not states but state transition.

Tim: is proposing to eliminate state at the top level.

Deborah: A weakness of PML looking back is that we didn't have a top level concept state. If you added granularity to it, how would you describe state?

<sandro> luc: temp drop example is like a car with a known velocity and unknown location.

James: Why not have state description and change description?

state description doesn't change, change description does change

Yolanda: When I think about state I think about the state of the world, not of a particular entity.

<Zakim> sandro, you wanted to ask when state changes we have two SDs of one entity, or two entities.

<qwebirc413501> i like james' slant with having both description and state description..... we just encourage state descriptions as we get more information

Paul: I don't think anyone wants to demand which level of abstraction. State has implications

<YolandaGil> So I think we need to constrain ourselves to descriptions of the entity we are describing the provenance of. "State" often refers to state of the world and the context of that entity, so I'd recommend not to use the term "state"

<YolandaGil> I agree with whoever said that whether an entity is the same or not is a domain-dependent decision

<Paolo> ...and I agree with Yolanda

<stain> how is it spelt?

Paul: Recommend using Bob as a placeholder until we find a replacement for thing

<Paolo> Bob? as "bob"

<stain> Bob! ihi

<YolandaGil> as in "thingama-bob"?

<stain> I heard 'bofh'

<stain> YolandaGil: oh no, stuffama-bob!

<pgroth> we are breaking for lunch

:-) Yolanda

<pgroth> we'll start again in half an hour (1:30pm)

<stain> when is it back from lunch?

<stain> ok

<stain> thanks

<stain> time for dinner here :)

<Luc> helena, stephan who is presenting?

Connection TF & Implementation TF

<Luc> SCRIBE: JimMcCusker

<pgroth> we are starting again

<tlebo> I spoke with Simon at lunch. "State" is not constrained to a single moment in time. So I am comfortable with "State", but still not convinced it is necessary as part of the Concept term names.

<qwebirc413501> if remote people dropped off, now is a good time to call back in

<pgroth> cool tle

<pgroth> cool tlebo

<pgroth> i think there may be another name

<pgroth> or a better name than state

Likewise, I think I'm more comfortable with State as opposed to Description, but we need to be clear that it's a contextualized state, and is intended as assertions about an entity as described by an agent.

<smiles> @JimMcCusker agreed

<qwebirc413501> we are getting the presentation up....

ericstephan: Connection TF
... Open Brainstorming to identify different sorts of connections, define "connectivity"
... Member contributions: DCMI, DataONE WG on Provenance, HCLS BioRDF TF, HCLS Sci Discourse IG, and more.
... Next Steps: reach out to other connections? Quantify, Assess, Filter results. Identify linkage points for PIL, potential gaps between PIL and the connection.

<Luc> Chair: Luc Moreau

<qwebirc413501> link to analytic provenance community eric mentioned - http://vacommunity.org/AnalyticProvenanceWorkshop


ericstephan: Balancing Act: Lots of provenance activities to reach out to, small group with which to do it. Don't want to bias.
... Coordination: What do other task forces need from us? When should the task forces meet with us?

smiles: Conflict between adoption and implementation specific issues. The Conn. TF is there to define the relationships.
... Concept of profiles, but maybe that's too heavyweight?

ericstephan: Communities need to be able to explain in their own language, but who does the formal connections?

<pgroth> +q

ericstephan: Analytic Provenance still in early stages, but maybe they might be a first adopter.

smiles: The bridge is being made by the WG TF?

<qwebirc413501> +q

<tlebo> BTW, the vocab mappings is in a google spreadsheet and pdfs at http://inference-web.org/wiki/Review_of_prov-xg%27s_Provenance_Vocabulary_Mappings

<qwebirc413501> whops +q (deborah)

paolo: is this where we talk about extension mechanisms?

ericstephan: extension or mapping into PIL.

pgroth: We established the TF to make sure we get wide adoption.
... three levels: adoption, mapping, and extension.
... initial steps are to establish links to other groups for feedback.

Luc: There will be a question by the public: DC provenance vs. W3C provenance? And how can we work with both?
... Some goals include, hopefully, providing mappings to standards like DC.

<tlebo> best link for DC's provenance definitions?

<tlebo> dublin core's

<khalidbelhajjame> +q

ericstephan: prov-xg did an excellent job identifying existing provenance.

<IlkayAltintas> +q

<satya> I agree with Luc - mappings and extensions are not in the scope of the WG

Luc: It's not the responsibility of the WG to map to PML, OPM, Provenir, etc.

pgroth: one example I find interesting is Creative Commons. The connection task force can show a way to link PIL to CC licensing standards.

deborah: Plea to start the mapping. The xg identified a number of issues that were found late in the game.

ericstephan: start with the uncontroversial mappings to experiment.

paulo: Working with scientists on cyber-infrastructure. NSF uses this on a domain basis. 500 or so cyber-infrastructures that come and go.
... many existing concepts in e-science are already provenance.


<Luc> D6. PIL Best Practice Cookbook (W3C Note). This document includes a limited set of best practice profiles that link with other relevant models, such as Dublin Core provenance related concepts, licensing in Creative Commons, and the OpenId identity mechanism for people.

<YolandaGil> Luc: Thanks for bringing up D6. I agree with the 3 categories: 1) licensing and CC, 2) preservation (DC, Premis, InterPARES), 3) authentication (openID and digital signatures)


khalidbelhajjame: Mappings could help us identify issues in modeling.

IlkayAltintas: Is the goal of the mapping to become inclusive of all other efforts?

Luc: To some extent, we will do this.

<pgroth> where is this idea of mappings coming from?

YolandaGil: Likes the 3 categories of D6, and need to be driven by those sort of tasks.
... we will fail the group if we don't link to other groups within W3C.
... in science communities, questions about what scientifically driven folks are participating.

<Zakim> satya, you wanted to Deborah's point

satya: We won't have time to address all concerns of communities.

<YolandaGil> Doing mappings to other vocabularies is a lot of work, for the XG our mappings were an order of magnitude more work than we originally expected.

satya: on mappings, there might be complex mappings that might not get finished.

<zednik> file at http://www.w3.org/2011/prov/wiki/File:ITCTF_F2F1.pdf

<qwebirc413501> just to be clear - I suggested starting the mappings to some key targets... I realize that the complete mapping is potentially time consuming but i think at least getting some initial thinking about the mapping needs to be done (from deborah)

<YolandaGil> I think rather than mappings we need to start with an informal report of how our goals relate to other activities. Then engage other communities if we decide to do certain mappings, but doing the mappings ourselves and as an initial goal will be too hard.

Impl Questionnaire URL: http://goo.gl/rHxAg

zednik: What did I mean by Plain HTML?

<satya> @Deborah: I agree to your point that other standards should inform our work, but creating explicit mapping will be difficult (even for something like DC - which does not have formal/mathematical definitions)

Most interest in toolkits in Java

<stain> <div class="provenance"> !


<stain> .. but note that almost 60% are using something else than Java (as well)

JimMcCusker: I will reach out to caBIG for additional feedback using questionnaire.

Luc: What's next?

<YolandaGil> what is the third level? it's sooo hard to hear...

<pgroth> scientific communities

<YolandaGil> ah, yes, got it all!

<YolandaGil> thanks!

Luc: First Level: Licensing, etc., Second Level: W3C communities, Third level: scientific communities

<Lena> @sandro any chance that the quality of the sound in the room is improved? the mic wakes up in the middle of sentences, so we are missing some parts of what people are saying

<pgroth> +q

Luc: what sort of coordination is expected?

sandro: whatever is in the charter.

deborah: What about open govt data?

sandro: their charter mentions prov-wg, so there is a connection.

Luc: JimMcCusker, Lena, and satya should provide interfaces with HCLS.

<YolandaGil> I also mentioned the geospatial group at W3C, my understanding is that they are focused on ISO 19115 -- that is a very high impact area!

<sandro> +1 liaisons using drafts as way to communicate.

pgroth: only 3 months until we have a first draft. Maybe outreach should happen once we have something to show.

<Luc> we also have a rep of the OGC consortium in the WG

<YolandaGil> yes, Carl Reed

pgroth: Until then, we make sure we have the right framework in place to introduce the PIL.

<YolandaGil> I agree with Paul, start with an informal report of how our goals relate to other activities. An initial report in 3 months makes sense too.

ericstephan: Would cataloging possible early adopters be a useful product?

<Lena> (I am trying to convince HCLS to leave the prov work to the prov wg ;) )

Luc: how do we go about producing the report?

<pgroth> go lena!

<YolandaGil> Lena: that's a great goal, but they have many additional requirements that might be too much to cover for us :)

<scribe> ACTION: ericstephan to create a plan to deliver a connection report. Plan will include a timetable, a list of connections, and individuals who will deliver to the connection. [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action01]

<trackbot> Created ACTION-13 - Create a plan to deliver a connection report. Plan will include a timetable, a list of connections, and individuals who will deliver to the connection. [on Eric Stephan - due 2011-07-13].

<ericstephan> Action Plan to deliver the connection report and the plan will include a timetable and a list of connections and individuals who will contribute a description of their connection.

<trackbot> Sorry, couldn't find user - Plan

Note: ACTION-13 should have due date of 2011-07-14.

<YolandaGil> Eric: I will absolutely help with the report, though I have very limited availability until Aug 15 unfortunately

Paulo: It would be good to use direct liaisons to communities and working groups.

<tlebo> http://www.w3.org/2011/prov/wiki/Connection_Task_Force#Connections

Paulo: please take note of community milestones.

<YolandaGil> Paolo: I agree. I'd suggest that the WG develops one slide with an overview/wiki pointer/POC that we can all use when we go present our stuff or attend meetings!

<Lena> (need to include countries in the questionnaire also)

zednik: Tasks the Implementation TF should do: catalog stakeholders, put out a second version of the questionnaire.

<Lena> (goal of the survey: if people are able to express their opinion, they will more likely adopt the product of the wg)

<Lena> (since some of them have offered contact information and interest in developing toolkits, we can contact them once we have a product)

<zednik> gather implementation requirements - touches upon access and connection TF as well

<zednik> audio is really breaking up for me right now

<scribe> ACTION: zednik to create a plan for a implementation report [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action02]

<trackbot> Created ACTION-14 - Create a plan for a implementation report [on Stephan Zednik - due 2011-07-13].

Note: actual due date for ACTION-14 is 2011-07-14.

<scribe> ACTION: zednik to write second iteration of the questionnaire. [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action03]

<trackbot> Created ACTION-15 - Write second iteration of the questionnaire. [on Stephan Zednik - due 2011-07-13].

Luc: Test cases and use cases.

<zednik> example of test cases from W3C process - http://www.w3.org/TR/rdf-testcases/

<Lena> those in the room, PLEASE scribe the questions directed to me or stephan - we REALLY are having a hard time hearing what's going on in the room!

jcheney: split use cases into generating and storing use cases?

smiles: test cases need to be implementation-specific.

<IlkayAltintas> +q

zednik: test cases must be machine processable as well as implementation-specific.
... therefore, we need a formal schema.
... how are these test cases different from other kinds of test cases?

<satya> We need to consider that the test cases are part of the W3C recommendation process - notionally demonstrates that our work is practical/implementable

Luc: The idea of a validator isn't bad. We may come up with additional constraints that aren't syntactic.

pgroth: We all agree we need test cases, but it's too early to figure out what those test cases should be yet.

sandro: test cases were used along the way to record decisions in other groups like OWL.

Luc: we're not going to have test cases for a while, around T+7.

<qwebirc413501> (from deborah) we have a integrity constraint-based validator model for PML (my student Jiao Tao's phd work is on this). just mentioning it for the notes since we may want to come back and look at this model

pgroth: we can talk about this in 2 months' time and still have test cases in time.
... coming up with test cases is easier with a draft document to work against.

zednik: we do need feedback from implementers on what test cases they would like to see.

IlkayAltintas: What about backwards compatibility?

sandro: this isn't an issue until we get to candidate recommendation.

<zednik> audio is breaking up

<sandro> zednik, James just talks very softly.

jcheney: It seems that as we work on the model, there will be decision points, and each of those points should be recorded as a test case.

<Lena> (heya ericP!)

sandro: Introductions of Eric Prud'hommeaux

(ericP to the rest of us).

<stain> what's going on.. is it still the break?

<stain> I heard Luc and Satya and started paying attention

<zednik> according to my calendar we should have another 15 minutes of break

<IlkayAltintas> t

<Luc> Scribe: Tim Lebo

<tlebo> scribe: me

session 4: Model TF

<Luc> SUBTOPIC: Process Execution

<tlebo> http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions

<tlebo> BOB - the stand-in name for Description/Characterization/Thing/EntityDescription/StateDescription

<qwebirc413501> restaurant - http://www.tommydoyles.com/ - 1 Kendall Square Cambridge, MA 02138 617-225-0888 (right sandro?)

<tlebo> we are NOT defining BOB in this session

<sandro> right, qwebirc413501

<sandro> reservation is under "W3C" for 21 people (18 of us, and 3 additional family members)

<tlebo> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts

For what it's worth, my original idea about Bob was something like datum and datasets in Information Artifact Ontology: http://code.google.com/p/information-artifact-ontology/

<tlebo> rephrased definition of process execution: A process execution is an activity that uses (zero or more) entities in specific states, described by BOBs, performs a piece of work, and generates (zero or more) new entities in specific states, described by BOBS.
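
The rephrased definition above can be read as a small data structure. The following is only an illustrative sketch: the class and field names (Entity, Bob, ProcessExecution, used, generated) are hypothetical and are not agreed WG vocabulary, and it takes the reading in which inputs and outputs are entities described by BOBs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Entity:
    """Something whose provenance is being described (hypothetical name)."""
    name: str


@dataclass(frozen=True)
class Bob:
    """Placeholder 'BOB': a description of an entity in a specific state."""
    entity: Entity
    description: str


@dataclass
class ProcessExecution:
    """An activity that uses and generates zero or more described entities."""
    label: str
    used: List[Bob] = field(default_factory=list)
    generated: List[Bob] = field(default_factory=list)


report = Entity("report")
draft = Bob(report, "draft, before review")
final = Bob(report, "final, after review")

# One entity, two BOBs: the review uses one description and generates another.
review = ProcessExecution("review", used=[draft], generated=[final])

# Zero inputs and zero outputs are allowed by the definition.
idle = ProcessExecution("idle")
```

Here the same entity appears under two BOBs, which matches the "(zero or more) entities in specific states" wording without deciding what BOB will finally be called.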

<tlebo> jim: can we NOT imply agency in the process?

<tlebo> paulo: fundamental issues. e.g. "generate" making new entities w/o specifying the process (recipe?) used.

<tlebo> paulo: process of asserting or deriving or both or neither?

<tlebo> I am trying to track provenance of the pages discussing concepts at http://www.w3.org/2011/prov/wiki/Model_Task_Force#Materials_discussing_Concepts

<tlebo> zednik: ask to clarify producing 0 or more entities' states (new BOBs describing a previous Entity)

<zednik> new bobs?

<GK> I think entity::BOB relationship is n::m

<stain> yes

<satya> a general comment (following on Stephan's comment): Do we lose any information if we remove the "state" and "Bob" from the current definition?

<GK> Or may be

<Zakim> tlebo, you wanted to ask about managing our page creation

<tlebo> http://www.w3.org/2011/prov/wiki/Model_Task_Force#Materials_discussing_Concepts

<tlebo> existing issue 1 - It should be understood that, in the definition, use, perform a piece of work, and generate do not have to be performed sequentially, e.g. some generate can happen before some use

<stain> slightly louder please :)

<tlebo> we will be louder

<tlebo> ordering of use and generation - any order is acceptable?

<tlebo> stain: compound processes - this is needed.

<tlebo> we need to compose (and abstract) processes.

<tlebo> satya: orig def included state as part of the Stuff. Now that we have Entities described by BOBs. BOBs are not changing.

<tlebo> satya: just leave it at generating BOBs?

<stain> yes - it uses an entity (in such a state) as described by the BOB

<stain> but just talking about BOBs avoids us having to disassemble the BOBs every time its used

<zednik> +1 to BOBS as input/output

<tlebo> luc: processes do not generate BOBs; they generate entities that are described by BOBs.

<stain> perhaps the BOB is more like a proxy than a description

<GK> I've lost the plot: how can BOBs be input?

<stain> like a smart query in itunes

<tlebo> -1 BOBs at I/O - I/O is Entities that can be described by BOBs.

<stain> if Bob is to be useful it needs to be standing instead of the entity - otherwise everything is just "entity as described by a bob"

<zednik> if entities is I/O, then why even have BOB?

<GK> @tlebo: +1 (BOBs not I/O of process execution?)

<tlebo> paulo: Recipe. Process Execution is an execution of a Recipe.

<stain> @tlebo - when I summarised process execution I said 0-or-more both for inputs and outputs

<tlebo> Recipe vs. Reproducible

<stain> (the process might act as an agent instead, or just be very lonely)

<stain> what is the decision?

<tlebo> accepted: issue 1 is done.

<tlebo> proposed issue 2 - A process execution should be associated with an actor. (Proposed by Jun on 2011-05-31)

<tlebo> proposed: Process Execution issue 3 - A process specification can be either pre-defined or not. (Proposed by Khalid on 2011-05-31)

<GK> Issue 3: why does this matter?

<tlebo> paulo: predefined recipe vs. unspecified recipe

<tlebo> recipe is nameable/unnamed, repeatable/unrepeatable, specified/unspecified.

<tlebo> jimmcusker: recipes are specified as a Recipe role of a process execution.

<GK> Why do we need recipe in our vocabulary?

<tlebo> luc: revisiting - distinction between process execution and process specification

<tlebo> luc: specifying a recipe is out of scope for wg (recipe ~= process specification)

<Paolo_> @gk we don't. We are pointing out that it is out of scope of the wg

<tlebo> paulo: this will make it harder to formalize

<GK> "We describe process executions independently of how the process is specified" - what more is needed?

<Paolo_> I mean, it is a sort of undefined term for us. A placeholder that will not be resolved...?

<tlebo> paulo: need to define work, activity, recipe. specification of process execution is in terms of recipe.

<stain> @GK - agreed

<tlebo> paulo: rebuilding what was done.

<tlebo> luc: workflow script is a kind of recipe.

<Paolo_> For those back home: Luc just went for a quick jog around the room...

<stain> ;-))

<tlebo> we are trying to distinguish 1) process execution and 2) process specification

<tlebo> ACTION: Paulo to document definition of "process execution" and "recipe" and provide to group. [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action04]

<trackbot> Created ACTION-16 - Document definition of "process execution" and "recipe" and provide to group. [on Paulo Pinheiro da Silva - due 2011-07-13].

<tlebo> tlebo: concerned about "pre-defined" - can the recipe be described after the process execution has occurred and been described?

<stain> what if we just say 'defined' ?

<GK> As stated, issue 3 looks like a content-free assertion. I'm not sure what value it adds.

<tlebo> accepted: issue three. A process specification can be either pre-defined or not. (Proposed by Khalid on 2011-05-31)

<tlebo> (my post-description concern is handled in "or not" situation)

<tlebo> resolved: issue 4 is N/A A process execution may consume and/or generate IVPTs. (Proposed by Paolo on 2011-05-20)

<tlebo> proposed: issue #5 A process execution represents a specific data processing activity in which all inputs and outputs are fully determined. (Proposed by Graham and curated by Jun on 2011-06-20)

<stain> A process execution represents a specific data processing activity in which all inputs and outputs are fully determined. (Proposed by Graham and curated by Jun on 2011-06-20)

<tlebo> gk: getting around everything in the past.

<tlebo> jimmcusker: process execution is a closed world. no other inputs/outputs can be added.

<khalidbelhajjame> +q

<tlebo> group: no!

<tlebo> gk: it is not going to CHANGE (in/outs)

<tlebo> group is not comfortable with all in/outs being fully specified.

<GK> We had said in/out s are fully known, but that didn't work...

<tlebo> @gk - you're typing

<stain> GK - mute

<GK> ... hence tried "determined"

<stain> we can hear your typing mood :)

<GK> muted

<tlebo> gk: most of time saying in past is OK. but may lead to issues.

<GK> I'm uneasy about forcing process execution into the past ... think it could trip us up, not sure why.

<tlebo> khalidbelhajjame: a currently running process execution may continue to take new inputs and produce new outputs after a user asks about it.

<GK> "A potential future event is not an occurrent" - is this true?

<zednik> definition I used: "actually occurring or observable, not potential or hypothetical."

<tlebo> zednik: if process execution is an occurrent, then it must have started (but not nec. finished). must NOT be planned for future.

<zednik> from new oxford american dictionary

<tlebo> paulo: we've learned that many restrictions are relaxed as a language is applied to other situations.

<tlebo> paulo: e.g. provenance of greek vase

<tlebo> paulo: what about an unknown process that we still want to describe?

<tlebo> group disagrees with "fully determined"

<GK> I also think "fully determined" doesn't cut it. So +1

<tlebo> smiles: do we lose anything by NOT saying that it has to be in the past?

<tlebo> satya: MUST be in past.

+1 on Stephan's proposal: Process Execution is an occurrent, and therefore must have started in the past relative to the provenance assertion.

<tlebo> smiles: putting it into the definition limits us. leave it for the primer "provenance is about things in the past"

<tlebo> satya: provenance metadata vs. other types of metadata. only distinction is that provenance is past. Must put it into definition of process execution.

<tlebo> smiles: then put "past" into all definitions?

<Luc> all assertions in PIL have to be interpreted as something that has happened

<tlebo> resolved: GK's phrasing of process execution not satisfactory. change 1, change 2 (zednik) must not be planned for future, must have started. change 3 (luc et al.)

<tlebo> pgroth: yo .... dude ...

<tlebo> zednik: by using occurrent - then it may not be finished that makes output in real time that we want to encode. we can't say outputs are fully determined.

<tlebo> pgroth: occurrent approach or provenance "has happened, in past"

<tlebo> zednik: occurrent is not too constraining

<satya> @stephan - can you please confirm that occurrent definition as you described is from oxford dictionary - since the common interpretation of occurrent in philosophical ontology work - BFO and DOLCE does not specify anything regarding it being in the past

<tlebo> luc: "occurrent" is very technical.

<zednik> occurrent |əˈkərənt|

<zednik> adjective

<zednik> actually occurring or observable, not potential or hypothetical.

<GK> A process execution has is associated with specific (but maybe unknown) inputs and outputs. Alternative inputs and outputs are not an option. ??

<tlebo> luc: we are failing by not keeping the term simple.

<tlebo> luc: push "occurrent" further down in the definition.

<zednik> has or is

<tlebo> satya: use definition of occurrent instead of stating "occurrent" in the definition.

<GK> Usually, I think provenance *is* about things that *have* happened, but I worry about formalizing that intent.

<tlebo> pgroth: "happened in the past" is a given in what we are describing.

<tlebo> proposed: add statement "provenance describes things that happened in the past. this is assumed for all remaining definitions."

<tlebo> accepted: issue 5 is subsumed by statement "provenance describes things that happened in the past. this is assumed for all remaining definitions."

<tlebo> satya: getting incorrect inferences.

<tlebo> pgroth: constraints can be imposed in the semantics.

<GK> E.g. A test suite for a provenance generating system must necessarily contain statements of provenance about things that will be computed in the future.

<stain> so I would not be allowed to 'fake-run' a workflow and generate a PIL provenance trace of what the provenance would look like? The asserter is here not observing, but predicting. (It might still be to guess what the non-recorded provenance of a previously run workflow was)

<tlebo> proposed: issue 6 If we adopt an “OS Style” process model, then a distinction needs to be made between process specification, process, which is an instance of a process specification, and process execution, which is the state of a process within a time interval, when the activities specified in the process specification take place. This may have been resolved by the agreement above, where the distinction is partially made (process spec vs process exec), and it was decided that process spec == recipe is out of scope. I will not insist on process (Paolo)

<GK> What is "OS level provenance"?

<tlebo> what is the OS provenance group's name?

<tlebo> at Harvard?

<smiles> @tlebo PASS

<IlkayAltintas> http://www.eecs.harvard.edu/syrah/pass/

<tlebo> resolved: issue 6 is dropped.

<smiles> @GK how files are created, used etc. by OS processes

<tlebo> subtopic: Generation


<GK> @smiles OK, thanks.

<tlebo> satya: did not include "modification" in process execution.

<stain> a modification is generating a new bob

<tlebo> Generation is the action/transition/event by which a process execution creates a new entity state.

<tlebo> proposed: Generation is the action/transition/event by which a process execution creates a new entity state.

<GK> s/entity state/entity/

<tlebo> luc: the only way to describe entity state is via BOBs

<tlebo> khalidbelhajjame: process execution without entity states? group - yes. 0 or more.

<tlebo> paulo: database queried that does not modify database.

<GK> We have a definition here that mentions "entity states", but I don't know what that is distinct from an "entity"

<stain> then database was used, and query result was generated

<stain> @GK I think entity state means Bob - but not sure

<stain> I think that might be our bob

<tlebo> BOB is a placeholder for how we are describing entity states.

<zednik> @stain, I think so too. entity state is our BOB

<GK> @stain: that doesn't work: BOB is a description of an entity

<tlebo> (was called Thing before today, which described Stuffs)

<stain> ARE YOU HIM?

<GK> I understood BOB to describe *entities*

<tlebo> proposal: generation issue # 1 - Whether generation should be modelled as a concept itself or as a relationship between concepts, such as a process execution and a thing. This issue is raised based on the initial definitions raised by Jun. However, Luc did raise that "Whether this is a concept or a relationship seems to me more relevant to the formalization o

<zednik> if it is a concept itself, what does it entail? what are the properties/relationships associated with a Generation concept?

<stain> @GK: *I* think BOB is what allows us to talk about a certain entity state. So it's more like a proxy, symlink, smart query, view - when we say "BOB x is blah" it means "Entity e, within the description of BOB (the blue shirt in the office) - is blah"

<stain> @zednik Agent, Process Execution, BOB

<tlebo> BOB a placeholder for Thing/Description/Characterization/EntityDescription/StateDescription

<tlebo> luc: issues on definitions are First In First Out.

<qwebirc413501> +q

<tlebo> ericp: getting to new stateS. why plural?

<Zakim> ericP, you wanted to ask why multiple states

<tlebo> ericp: a process can influence the states of one or more thing.

<tlebo> smiles: should be a relation, not a concept.

<tlebo> smiles: generation should be defined in terms of process execution

<tlebo> luc: want to relate events to one another.

<stain> it was raised on the mailing list that one want to stop somewhere. I don't want to specify how my file system driver found the right sectors on the disk - but might want to talk about what was generated in the end.

<tlebo> Generation is a time constraint on process execution/ activity.

<tlebo> (events and activities? are these synonyms for the True concepts?)

<tlebo> smiles: temporal ordering of process executions helps avoid Activities Generation Events.

<tlebo> zednik: repeating smiles Generation overlapping with Process Execution.

<tlebo> deborah: where do we place new issues? place it onto http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions after/during F2F meeting.

<tlebo> pgroth: relationship in natural language vs. formal languages.

<GK> I think the formal term here is "relation"

<stain> +1 - we don't want to say it's NOT a concept.

<tlebo> pgroth: conceptual relationship vs. modeling it as a Concept/Relationship in a logical model of your choice.

<zednik> conceptual relationship does not entail rdf:Property

<stain> I think if we agree on this, then we can say it's a relationship

<GK> (In RDF formal semantics, a property has an associated relation over pairs of concepts from the domain of discourse.)

<tlebo> luc: what is a conceptual relationship?

<stain> @Paolo: Exactly - so it can't just be dangling alone which is my worry

<tlebo> paolo: relationship depends on other primary concepts. (mathematically)

<tlebo> (rdf:Property is a good smiley)

<Zakim> pgroth, you wanted to make the point that there's a difference between relationship and RELATIONSHIP

<tlebo> paolo: relation does not stand on its own.

<tlebo> smiles: does not use Event/Transition; just describe relationships among the entities.

<GK> Generation can quite reasonably be a relation between process executions and entities.

<GK> But *instances* of a relation can be reified.

<tlebo> pgroth: Generation as a proxy for Event.

<GK> ... as members of a class that might be called "Events". I think there is no dichotomy here.
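
GK's point that generation can be a relation whose instances are reified can be illustrated as follows. This is a sketch under assumptions: the names generated_by, GenerationEvent, reify and as_relation are hypothetical, and strings stand in for process executions and entities.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Generation as a plain relation: a set of (process execution, entity) pairs.
generated_by: Set[Tuple[str, str]] = {
    ("compile", "binary"),
    ("compile", "build log"),
}


@dataclass(frozen=True)
class GenerationEvent:
    """One reified instance of the relation, able to carry extra facts."""
    process_execution: str
    entity: str
    time: Optional[float] = None  # only expressible on the reified form


def reify(relation: Set[Tuple[str, str]]) -> List[GenerationEvent]:
    """Turn each pair of the relation into an event object."""
    return [GenerationEvent(pe, entity) for pe, entity in sorted(relation)]


def as_relation(events: List[GenerationEvent]) -> Set[Tuple[str, str]]:
    """Project events back to the underlying relation."""
    return {(e.process_execution, e.entity) for e in events}


events = reify(generated_by)
```

Projecting the events back gives the original relation, which is one way to read "there is no dichotomy here": the two views carry the same pairs, and the reified view only adds room for properties such as a time.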

<tlebo> pgroth: main concern of group is that Generation is being confused with Process Execution.


<GK> I think the definitions are dual.

<stain> I don't see why two definitions can't refer to each-other.. in fact if they don't, then you might wonder what makes PIL a model/language instead of just a vocabulary

<smiles> @stain I agree there's no absolute reason why not, but still it can make the definitions simpler to refer to less other things that need to be looked up

<Paolo_> @stian they seem to be redundant in the way they overlap....

<stain> strip one of them down then

<Paolo_> New version just came up on the page. Still under discussion

<stain> which page?

<smiles> @stain http://www.w3.org/2011/prov/wiki/F2F1ConceptDefinitions

<tlebo> process execution: 0..* ins, a middle, 0..* outs

<tlebo> generation: a middle and 1 out

<tlebo> (will get to) use: 1 in and a middle
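
The three shapes just listed (process execution: 0..* ins, a middle, 0..* outs; generation: a middle and 1 out; use: 1 in and a middle) can be sketched as below. The names Use, Generation and decompose are hypothetical, and plain strings stand in for entity states and process executions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Use:
    """1 in and a middle."""
    used: str
    process_execution: str


@dataclass(frozen=True)
class Generation:
    """A middle and 1 out."""
    process_execution: str
    output: str


def decompose(
    process_execution: str, ins: List[str], outs: List[str]
) -> Tuple[List[Use], List[Generation]]:
    """Split a 0..* ins / 0..* outs process execution into single-edge records."""
    uses = [Use(i, process_execution) for i in ins]
    generations = [Generation(process_execution, o) for o in outs]
    return uses, generations


uses, generations = decompose("print", ["pdf"], ["paper copy", "log entry"])
```

Seen this way, each Use or Generation is a process execution with all but one edge left out. Note that the decomposition does not record which input led to which output.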

<tlebo> why isn't a "use" and "generation" a process execution missing some bits?


<stain> @tlebo +1

<stain> I think we might need to talk about Collections or composition when talking about multiple processes generating one entity state

<tlebo> resolved: issue 1; we have new definitions

<tlebo> resolved: all generation issues.

<tlebo> proposed: use issue # 1: For a thing X to be used by a process execution P, the following must hold (see discussion): X was generated before its use; Use occurs after P's beginning and before P's end
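
The two conditions in use issue # 1 can be written as a simple check. This is a sketch only: numeric timestamps, strict inequalities, and the names Interval and use_is_consistent are all assumptions made for illustration, since the proposal does not say how time is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Beginning and end of a process execution P (assumed numeric times)."""
    start: float
    end: float


def use_is_consistent(generated_at: float, used_at: float, p: Interval) -> bool:
    """Check the two proposed conditions for X being used by P."""
    generated_before_use = generated_at < used_at
    use_within_execution = p.start < used_at < p.end
    return generated_before_use and use_within_execution


p = Interval(start=2.0, end=10.0)
ok = use_is_consistent(generated_at=1.0, used_at=5.0, p=p)
too_early = use_is_consistent(generated_at=6.0, used_at=5.0, p=p)
outside = use_is_consistent(generated_at=1.0, used_at=11.0, p=p)
```

The check says nothing about whether X still exists after the use, so it does not settle the "consumption" question discussed next.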

<tlebo> use: Use is the consumption of an entity state by a process execution.

<tlebo> can we consume entityStates multiple times?

<tlebo> PDFs don't get consumed by being printed.

<tlebo> "involved"

<tlebo> ?

<zednik> does consumption imply "using up" or destroying the BOB?

<stain> 'consumed' also sounds like it's the whole thing.. so what about Paolo's database example?

<tlebo> (btw, BOB is leading to be renamed EntityState)

<stain> tlebo: YAAY

<zednik> consumption |kənˈsəm(p) sh ən|

<zednik> noun

<zednik> 1 the using up of a resource

<stain> what if we say something like "utilised (as e.g. input, source) by process execution"

<stain> @tlebo - sounds very active - like the agent

<stain> who is involved with the PE

<tlebo> not sure why "involves" implies agency.

<tlebo> my keyboard involves this text string.

<tlebo> (other way around)

<tlebo> this text string involves my keyboard.

<stain> do you involve your car when going to work? It's not like you ask if it wants to come along.

<tlebo> I'd say "use" implies too much agency.

I think we made "participate" the top level relation, which subsumes "use" and "consume".

<stain> @tlebo - oo.. I.. agree

<zednik> @JimMcCusker participate was top level for agents



<Zakim> tlebo, you wanted to suggest "involves" instead of use/consume. and to ask that we use the q

<zednik> @JimMcCusker consumed was specialization of used (and implied destruction)

<zednik> @JimMcCusker we also had influenced...

<stain> @tlebo 'involves' would allow for a process execution to also involve a recipe/perl script, etc. (might be good - but less specific than use)

<tlebo> satya: EntityState to exist for Process Execution to happen?

<IlkayAltintas> +q

<stain> +1 +1 +1

<stain> we're recording what DID happen

<tlebo> what is @stain +1'ing?

<tlebo> which entity was used to generate which entity is lost.

<tlebo> issue: we lose which entity was used to generate which entity.

<trackbot> Created ISSUE-22 - We lose which entity was used to generate which entity. ; please complete additional details at http://www.w3.org/2011/prov/track/issues/22/edit .

<tlebo> issue: create definition of involve to replace Use

<trackbot> Created ISSUE-23 - Create definition of involve to replace Use ; please complete additional details at http://www.w3.org/2011/prov/track/issues/23/edit .

<tlebo> paolo: we should have a collection of Use, Involves - not a replacement.

<tlebo> proposed: Generation issue # 2 - Should we also mention in the definition that, for a thing X to be generated by a process execution P, the following must hold (see discussion): X must be something that did not exist before generation time (this means that nothing had the thing's identity before that time) generation occurs after P's beginning and be

<tlebo> (how did we get back to Generation issues? I thought we skipped them intentionally)

<tlebo> paolo: functional flavor of Generation issue 2.3 P and things used by P determine the values of X's invariant properties, but not the values of variant properties (too(?) strict)

<tlebo> resolved: Generation 2.3 is too strong: "P and things used by P determine the values of X's invariant properties, but not the values of variant properties (too(?) strict)"

<GK> I need to break off now. See/hear you tomorrow.

<smiles> @tlebo we intended to skip just subpoints 2.1 and 2.2 (not 2.3 and 2.4)

<stain> GK - nightie!

<Luc> thanks Graham

<tlebo> proposed: Generation issue 2.4

<tlebo> P and things used by P determine values of some of X's invariant properties (less strict)

<tlebo> luc: process execution or entities that went into process execution can be used to understand aspect of an entitystate output

<tlebo> open world assumption of describing the process execution or inputs.

<tlebo> issue: semantic document address "P and things used by P determine values of some of X's invariant properties (less strict)"

<trackbot> Created ISSUE-24 - Semantic document address "P and things used by P determine values of some of X's invariant properties (less strict)" ; please complete additional details at http://www.w3.org/2011/prov/track/issues/24/edit .

<tlebo> proposed: Derivation expresses that some entity is transformed from, created from, or affected by another entity. An entity state B is derived from an entity state A if the values of some properties of B are at least partially determined by the values of some properties of A.

<tlebo> smiles: some connection needs to be there.

<tlebo> jcheney: SOME values need to overlap across EntityStates connected with a Derivation.

<tlebo> issue: semantics group to incorporate ""derivation" or "partially determined by" relationship could be subjective or context-dependent assertion, not an objectively true or false statement." Derivation issue # 2

<trackbot> Created ISSUE-25 - Semantics group to incorporate ""derivation" or "partially determined by" relationship could be subjective or context-dependent assertion, not an objectively true or false statement." Derivation issue # 2 ; please complete additional details at http://www.w3.org/2011/prov/track/issues/25/edit .

<tlebo> proposed: Derivation issue # 3 Does derivation include control dependency? If so, is this reflected in this definition

<tlebo> khalidbelhajjame: A, B, threshold example.

<tlebo> luc: division example.

<tlebo> determined by the presence of a value that does NOT affect its result

<tlebo> triggered execution but not involved (did not influence its result other than starting it)

<tlebo> "triggering" is a kind of "use"

<tlebo> smiles: represent it with an invariant property

<tlebo> ice sculpture example.

<tlebo> photo of ice sculpture

<tlebo> ice sculpture does not exist, but is relevant to the derivation of the photo.

<tlebo> paulo: redundancy of something-already-used. derived from something can be inferred from knowing the process execution (~)

<tlebo> smiles derived from Louis XIV

<tlebo> luc: derivation is trying to describe the info flow within the black box of the process execution.

<tlebo> luc: an output may have been created before the input to the process was used.

<tlebo> satya: apples and oranges. some want to describe the same thing from either of two perspectives.

<tlebo> knowing the relationship between inputs and outputs VS NOT knowing the relationship.

<tlebo> zednik: example - sci process that reads docs in a directory, creating a new file for each one found and an archive file of all files it read. 500 files at a time.

<tlebo> model 500 or model 1

<tlebo> scientists don't care about 500 process executions.

<tlebo> "procedure they understand" 500 in 500 out

<tlebo> but we lose the derivation from one of the 500 to one of the 500.

<tlebo> paulo: figuring out what went wrong when it went wrong.

<tlebo> pgroth: people kind of like derivation notions (we've seen from experience)

FYI, it's 6:15

<tlebo> pgroth: some people like talking about process executions (a different perspective)

<tlebo> +1 6:15

<tlebo> deborah: w.r.t paulo's derivation issue. we don't need any particular granularity. we should permit any granularity. people want to see the provenance at differing granularities.

<tlebo> luc: @paulo re redundancy.

<tlebo> paulo: use and Generates are not nec. the way they are b/c they need more specific meanings towards Derivation.

<tlebo> luc: some have a process view of the world, some have a derivation view of the world.

<tlebo> use/generation is the process view.

<tlebo> derivation connects the data

<tlebo> luc: knowing inputs and outputs DOES NOT imply derivation.

<zednik> +1 knowing inputs/outputs does not imply derivation nor causality

+1 knowing inputs/outputs does not imply derivation nor causality

<tlebo> paulo: scientists do not know the process of how things are created, but they want to have process about the data.

<tlebo> people like processes, some like data views

<tlebo> jcheney: children building rockets need calculus - if they want to learn rocket building learn calculus.

<tlebo> luc: PASS (Harvard) knows the processes and inputs but does NOT know the derivation among the ins and outs.

<tlebo> derivation is defined independently of inputs and outputs (by design)

<pgroth> ack (deborah)

If you want to find out what process was used to derive b from a, given that b derived from a, look for a process that has a as an input and b as an output.
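
That lookup can be sketched as follows. The record layout and all names here are hypothetical; the point is only that derivation is asserted separately, and that shared inputs and outputs yield candidate process executions rather than proof of derivation.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessExecution:
    name: str
    used: set = field(default_factory=set)       # entity states taken as inputs
    generated: set = field(default_factory=set)  # entity states produced as outputs


def candidate_processes(a, b, derivations, executions):
    """Return executions that used a and generated b, given b was derived from a.

    Derivation must already be asserted in `derivations` as (derived, source)
    pairs; it is never inferred from inputs and outputs alone.
    """
    if (b, a) not in derivations:
        return []
    return [pe for pe in executions if a in pe.used and b in pe.generated]


executions = [
    ProcessExecution("archive-run", used={"doc1", "doc2"}, generated={"out1", "out2"}),
    ProcessExecution("other-run", used={"doc3"}, generated={"out3"}),
]
# Asserted derivations as (derived, source) pairs.
derivations = {("out1", "doc1")}

print([pe.name for pe in candidate_processes("doc1", "out1", derivations, executions)])
# ['archive-run']
print([pe.name for pe in candidate_processes("doc2", "out1", derivations, executions)])
# []  (doc2 was an input of the same run, but no derivation was asserted)
```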

<tlebo> paulo: dataset interpolated to get uniform distribution of the data.

<tlebo> a parameter is used and affects the interpolation

<zednik> definition of dataset is not consistent among science communities

<tlebo> process view does NOT say output depends on inputs. THEN assert derivation associating the interpolation to the input data and the parameter.

<tlebo> (is there a Recipe on a Derivation?)

no, but you can look up what recipe was used as such in a process that has the inputs and outputs that were derived from each other.

<Luc> Time to go to the restaurant!!!

<Luc> topic of discussion: http://www.w3.org/2011/prov/wiki/NameSuggestions

<tlebo> trackbot, end telcon

Summary of Action Items

[NEW] ACTION: ericstephan to create a plan to deliver a connection report. Plan will include a timetable, a list of connections, and individuals who will deliver to the connection. [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action01]
[NEW] ACTION: Paulo to document definition of "process execution" and "recipe" and provide to group. [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action04]
[NEW] ACTION: zednik to create a plan for an implementation report [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action02]
[NEW] ACTION: zednik to write second iteration of the questionnaire. [recorded in http://www.w3.org/2011/07/06-prov-minutes.html#action03]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2011/07/06 22:31:21 $

Agenda: http://www.w3.org/2011/prov/wiki/Meetings:F2F1Timetable