W3C

- DRAFT -

Provenance Working Group Teleconference

09 Aug 2012

Agenda

See also: IRC log

Attendees

Present
Regrets
Paul_Groth, Stephan_Zednik
Chair
Luc Moreau
Scribe
gk, gk1


<trackbot> Date: 09 August 2012

<Luc_> i am looking for a scribe

<gk> Luc, I'll scribe when I get online. Still bringing up apps.

<Luc_> thanks

<Luc_> scribe: gk

<Luc_> @gk, everything is set up for you

Admin

<Luc_> @gk can you let me know when ready?

<Luc_> scribe: gk1

<stain> I can scribe until GK is connected to the cloud

<stain> ah

<Luc_> http://www.w3.org/2011/prov/meeting/2012-08-02

<Luc_> proposed: to approve last week's teleconference minutes

0 (not present)

<Curt> +1

<ivan> +1

<TomDN> +1

<SamCoppens> +1

<CraigTrim> +1

<jcheney> +1

<smiles> +1

<stain> +1

<Luc_> accepted: last week's teleconference minutes

prov-constraints

<scribe> scribe: gk

PROV-constraints document

James: update on situation... Stian's review received Monday, identified things needing discussion, most have been resolved, 2-3 outstanding
... have tried to address points in the draft, some ongoing discussion of resolution with Simon
... should we try and resolve outstanding issues now?

Luc: we could review each outstanding issue now...

<jcheney> http://www.w3.org/2011/prov/track/products/12

<jcheney> http://www.w3.org/2011/prov/track/issues/467

James: re: http://www.w3.org/2011/prov/track/issues/467 should we infer existence of trigger?

Stian: if an activity starts and ends, do we require/assume existence of a trigger? Seems odd as it may not apply to all activities.
... can lead to chicken-and-egg - where do triggers come from?

<stain> To be clear, is it correct to say that the options are:

<stain> 1. [status quo] - allow expanding the trigger parameter to an existential variable denoting an unknown (but definite) trigger entity

<stain> 2. change the trigger parameter to be non-expandable, so that "-" means "absent trigger", as with plan and other non-expandables.

Stian: not entirely sure which way this should be resolved; two options (1) trigger always exists and may be undefined, or (2) trigger may not exist. leaning to (2).
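
The difference between the two options can be illustrated with a minimal Python sketch. It assumes a toy representation in which a start statement is a dict and "-" is modelled as `None`; the names `expand_placeholders`, `EXPANDABLE_OPTION_1` and `EXPANDABLE_OPTION_2` are hypothetical and are not taken from PROV-Constraints.

```python
import itertools

# Hypothetical: which parameters may be expanded to an existential
# variable when written as "-" (modelled here as None).
EXPANDABLE_OPTION_1 = {"trigger"}  # option 1 (status quo): trigger is expandable
EXPANDABLE_OPTION_2 = set()        # option 2: trigger is non-expandable

_fresh = itertools.count(1)


def expand_placeholders(statement, expandable):
    """Return a copy in which each missing expandable parameter becomes a
    fresh existential variable; non-expandable ones stay None ("absent")."""
    expanded = dict(statement)
    for name, value in statement.items():
        if value is None and name in expandable:
            expanded[name] = f"_:x{next(_fresh)}"
    return expanded


start = {"activity": "a1", "trigger": None}
option1 = expand_placeholders(start, EXPANDABLE_OPTION_1)  # trigger: unknown but definite
option2 = expand_placeholders(start, EXPANDABLE_OPTION_2)  # trigger: absent
```

Under option 1 the trigger slot is filled with a fresh variable standing for some unnamed entity; under option 2 it stays empty.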

<Luc_> http://lists.w3.org/Archives/Public/public-prov-wg/2012Aug/0076.html

<Luc_> http://www.w3.org/2011/prov/track/issues/311

<Luc_> http://dvcs.w3.org/hg/prov/raw-file/b9d2157889f7/model/optional.html

Luc: constraint document is as it is... regarding 3.1.1, and meaning of optional arguments
... 3rd of above links says existence is implied
... so this seems like reopening an issue previously closed? is there new information?

Stian: this is different because it's about identifying optional arguments; what are the consequences of all these things existing? This is clearer now we have constraints document.

<Zakim> GK, you wanted to comment about real numbers - many exist that are not named

<stain> @GK - right, PROV-Constraint don't force them to be named, just to exist

GK: lots of real numbers exist for which there are no names... is this a similar issue?

<TomDN> +q

<stain> I suggested a strawman poll

jcheney: if the trigger parameter can denote something is absent... four other things like that (plan and 3 others) ... not implied if not specified.

<TomDN> -q

<Luc_> in effect, it's like having two relation startWithTrigger and startWithoutTrigger

jcheney: all of these introduce a slight (formal?) complication needing to be specific when mentioning a constraint/association, can parameter be a null placeholder; needs additional editing of inferences.

<stain> @jcheney I agree #2 does not make it prettier :'(

jcheney: trigger inferrable if activity is specified an option, ... all this doable but may have unanticipated consequences.

<stain> but I think the concern is what the model should allow, not how easy to read the (already hard) PROV-Constraint document is

Luc: also consider purpose of inferences... for validating provenance, not necessarily used outside.

Stian: are there too many inferences? some of them always make sense. PROV-constraints says...

GK: would prefer more compact option

<stain> my question is semantically - is there a problem with enforcing the existence of triggers for every activity start and end?

Luc: concern is with always assuming the existence of a trigger.

Stian: don't see any real complications, but does it reflect the boundaries of the PROV model?
... e.g. queue of cars on motorway, what triggered this? Does it make sense, philosophically, for these to be part of the model?

<stain> @Luc right, like an optional parameter would be done in Java and wasDerivedFrom(e2,e1,-,-,-)

<stain> without usage/gen/act

<Luc_> @gk, the trigger is the combination of all them

<stain> @GK, the trigger entity could be a collection though ;)

<jcheney> but there is no uniqueness constraint on triggers currently

<TomDN> @GK: just use a collection of cars as trigger?

<TomDN> @stain: you beat me to it ;)

<stain> @jcheney yes, by the merging rules

@stian: what about weather factor?

<Luc_> option 1: is to keep current formalization, assuming existence of trigger for any activity

<stain> 0

+1

<smiles> +1

<TomDN> +1

<ivan> 0

<SamCoppens> +1

<jun> 0

<hook> 0

<jcheney> @stian: ah yes, uniqueness of start events + key constraint does it

<Curt> +1

<stain> @gk you'll need entity(theworld) as trigger then..

<satya> 0.5

<jcheney> 0 (either way fine)

@stian - OK, why not?

<Luc_> option 2: change current formalization, do not assume existence of a trigger for activity

<stain> +1

0

<ivan> 0

<smiles> 0

<SamCoppens> 0

<hook> 0

<Curt> 0

<TomDN> 0

<jcheney> -0 (don't wanna do it but not going to block it)

<satya> 0

<jun> +0 (This is really philosophical)

Yeah, -0

<Luc_> proposed: to keep current formalization, assuming existence of trigger for any activity

<jcheney> +1

Straw poll indicates staying with current formalization; Stian is OK with this

<TomDN> +1

<SamCoppens> +1

<smiles> +1

<Curt> +1

<jun> +1

<stain> +1

+1

<hook> +1

<ivan> 0

<satya> +1

<Luc_> accepted: to keep current formalization, assuming existence of trigger for any activity

<stain> I have closed ISSUE-467

<jcheney> http://www.w3.org/2011/prov/track/issues/452

Next issue

<Luc_> subtopic: issue 452

<stain> 15/16 missing (but has remark)

<stain> example: http://dvcs.w3.org/hg/prov/raw-file/default/model/prov-constraints.html#wasAssociatedWith-ordering_text (47)

jcheney: "what is plan association inference" - where there are "non-expandable things" in inferences, ... (missed detail) ... does anyone have a problem?

<Luc_> ackgk

<Zakim> GK, you wanted to say the bigger problem for me is that there is A unique trigger

stian: not a blocking thing, inferences 15, 16 different.

jcheney: will fix this, send email, and hopefully we'll close

<Luc_> accepted: close issue 452

Next issue

<jcheney> http://www.w3.org/2011/prov/track/issues/459

jcheney: catch-all for reviews... asking for feedback from reviewers. Some reviewers on holiday, but apart from Simon don't think there are any blocking issues remaining.

<satya> sorry, have to leave now

jcheney: some things in reviews have not been fixed yet - some figures, non-technical text; happy to leave it pending review with no technical issues outstanding

Luc: we'll leave it at that

<jcheney> http://www.w3.org/2011/prov/track/issues/473

Next issue

<Luc_> subtopic: issue 473

jcheney: simon ..., is everyone happy with unique generation event for each entity; then the associated activity is also unique?

<stain> (this is also related to activities part-of activities)

Simon: original concern was that the text mismatched the constraints, but that is resolved - every entity has a unique generating activity, not just unique event.

<dgarijo> @stain: I am late, so I wouldn't want to raise things that you have already discussed. Have you discussed the entity being generated by 2 activities at 2 levels of granularity?

problem: requiring an entity's generation to come from one single activity would make the primer example invalid, as multiple levels of granularity are expressed.

<stain> @dgarijo - no, put yourself on the queue!

scribe: could have two entities, related as specialized, linked to different levels of granularity.

<stain> in workflow export we ended up with alternateOf for this, e1a, e1b, e1c which are generated by nested activities A, B and C (which made queries hard)

scribe: implications of this constraint need justifying

jcheney: easiest resolution would be to remove the constraint.

Luc: concerned that if removed, some inferences around derivation may be no longer valid

jcheney: taking away an inference won't make other inferences incorrect, but may make them non-derivable

Luc: the uniqueness constraint is effectively saying "don't mix levels of abstraction"

Luc: don't think we limit expressiveness, just providing structure - good for "proper" provenance? Does it matter if the primer has "not proper" provenance?

Simon: does this need different instances for different levels of abstraction? Can live with that, but it seems surprising (?)

<Luc_> q/

dgarijo: other places this happens - entities in DC mapping - all the steps that comprise the production of an entity
... in scientific workflows, complicated to query model if upper level activities generate different entities than lower level activities.

<dgarijo> @gk: thx

Stian: ended up creating multiple entities corresponding to appearance at "different doors" in a workflow

<dgarijo> @jcheney: +1!

jcheney: seems to me that people want a validity checker - what's more useful is catching things that are definitely nonsense, or probably indicative of problem. Uniqueness seems to be in the latter category. We should be focusing on catching nonsense rather than limiting what people can do.

<dgarijo> I really like the Ok vs Warning vs Invalid.

jcheney: what are the consequences of being invalid?

<stain> I supposed it would be to drop the KEY property of wasGeneratedBy

Straw poll... status quo vs dropping unique generation requirement?

<stain> Dropping key property: so that wasGeneratedBy(id1; e1, a1, t1) wasGeneratedBy(id2; e1, a2, t2) would be allowed. (a2 <> a1, t1 <> t2 ?)

<dgarijo> If we detect that an entity is being generated by 2 activities, we could generate a warning, as James proposed.
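
The "Ok vs Warning vs Invalid" idea can be sketched as a small Python check. This is only an illustration under stated assumptions: generation statements are reduced to hypothetical `(entity, activity)` pairs, and `check_generation` with its `unique_is_error` flag is an invented helper, not part of any PROV validator.

```python
from collections import defaultdict


def check_generation(statements, unique_is_error=False):
    """statements: iterable of (entity, activity) pairs taken from
    wasGeneratedBy statements.
    Returns {entity: "ok" | "warning" | "invalid"}."""
    activities = defaultdict(set)
    for entity, activity in statements:
        activities[entity].add(activity)

    report = {}
    for entity, acts in activities.items():
        if len(acts) <= 1:
            report[entity] = "ok"
        else:
            # Status quo (key property kept): several generating activities
            # are invalid. Relaxed variant: only flag them as a warning.
            report[entity] = "invalid" if unique_is_error else "warning"
    return report


gens = [("e1", "a1"), ("e1", "a2"), ("e2", "a1")]
strict = check_generation(gens, unique_is_error=True)
relaxed = check_generation(gens)
```

Here `strict` corresponds to keeping the key property on generation, while `relaxed` corresponds to reporting a second generating activity as a warning only.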

<stain> for instance "When are you born?" - it depends on how you measure which activity

jcheney: need to think about consequences of options

<Luc_> option 1: keep generation unique and key property on generation

<smiles> -1

<jcheney> 0

<dgarijo> -1

<jun> -1

<stephenc> -1

<stain> 0

<SamCoppens> 0

-0

<Curt> 0

<hook> 0

<TomDN> +0

<ivan> 0

<Luc_> option 2: design a solution that relaxes uniqueness of generation

<smiles> +1

<dgarijo> +1

<jun> +1

<jcheney> 0

<TomDN> +1

<stephenc> +1

<Curt> +1

<SamCoppens> +1

+0

<hook> +0

<stain> +1

<dgarijo> yes

Luc: Indication that we need to think about option to relax uniqueness of generation

<dgarijo> like generating a warning.

<smiles> yes - but could be connected to levels of abstraction

<jcheney> http://www.w3.org/2011/prov/track/issues/474

hmmm... I don't think we should get too far into implementations

<stain> I think adding something about activity abstractions would help

<Luc_> subtopic: issue 474

smiles: lack of clarity what is link between bundles and instances?
... text seems to make different assumptions in different places

<Luc_> can somebody take over from gk?

jcheney: text here was written at last minute, may need clarification

<TomDN> (for reference, our definition of instance in 1.2: A PROV instance is a set of PROV statements, possibly including bundles, or named sets of statements. For example, such a PROV instance could be a .provn document, the result of a query, a triple store containing PROV statements in RDF, etc.)

jcheney: an instance may contain bundles
... initially, talk about instances that are single (unnamed) bundles
... deal with named bundles independently, in same way
... the statements inside a bundle are an instance
... collection of statements without identifier - currently calling this a top level bundle

<TomDN> Would (b) be solved by calling it the "toplevel instance"?

<Luc_> prov-n has a top level bundle, i think that's what influenced this design

smiles: per DM, bundle has identifier so we can express provenance-of-provenance ...

jcheney: three things - blob of provenance (in whatever form); a named set of statements (bundle); a set of statements. have used two terms for these three things, probably confuses.
... will try to come up with less confusing terminology
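
One way to picture the distinction between a named set of statements and an unnamed one is the following Python sketch. It is a hypothetical model only: the class names `Bundle` and `Instance`, their fields, and the use of plain strings for statements are assumptions made for illustration, and the serialized "blob" form (for example a .provn document) is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Bundle:
    """A named set of statements: it carries an identifier."""
    identifier: str
    statements: frozenset


@dataclass
class Instance:
    """A set of statements without an identifier, possibly including bundles."""
    statements: frozenset = frozenset()
    bundles: dict = field(default_factory=dict)

    def add_bundle(self, bundle):
        self.bundles[bundle.identifier] = bundle

    def bundle_as_instance(self, identifier):
        # The statements inside a bundle are treated as an instance
        # in their own right, handled in the same way as the top level.
        return Instance(statements=self.bundles[identifier].statements)


top = Instance(statements=frozenset({"entity(e1)"}))
top.add_bundle(Bundle("b1", frozenset({"entity(e2)"})))
inner = top.bundle_as_instance("b1")
```

In this sketch only `Bundle` has an identifier, so only bundles could be the subject of provenance-of-provenance statements.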

Luc: summary - it appears editors have homework to do on two issues - bundles and generation uniqueness. Can't really proceed to a vote yet.

<smiles> I don't think the bundles thing is a blocking issue

jcheney: could release as *a* working draft, it's been over 3 months.

Luc: getting LC draft ready is more important.
... hoping for LC vote early September - is there any point in producing a WD now, then LC draft in September?
... given lots of people are on vacation

Sandro: concurs

Tele-Conference Schedule over summer

luc: Paul and Luc away for rest of August.

<stain> we can argue about wasGeneratedBy... ;)

GK: I probably can't make next 2 weeks anyway.

<jcheney> i think the remaining issues can get done over email...

<Curt> I talked with Stephan yesterday (he's on a plane right now). We might want to have an informal XML call, but it needn't be a formal working group teleconference..

<jun> We will still have separate prov-o calls, in which implementation of constraints can be discussed

Curt(?): happy to chair informal meeting on 23rd; would help if Luc and/or Paul can send out agenda

Luc: will circulate agenda for 23rd.

<Curt> [not curt]

Luc: we'll speak again formally in September

@Luc - I'm guessing you'll take it from here?

<Luc_> @GK, yes, thanks a lot for scribing!

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/08/09 16:17:38 $

Agenda: http://www.w3.org/2011/prov/wiki/Meetings:Telecon2012.08.09