Provenance Incubator Group Teleconference

04 Dec 2009

See also: IRC log


Present: +1.518.276.aaaa, +1.860.995.aabb, +1.781.646.aacc, +49.302.093.aadd, +0238059aaff, +1.609.462.aagg, Ivan, +1.915.747.aahh, YolandaG, ssahoo2, pgroth, [IPcaller], +1.530.554.aajj, jmccuske, michaelp




<trackbot> Date: 04 December 2009

<Deborah> +q - plans for calls for the rest of the month

<ivan> sigh...

<michaelp> +1.614.592.aaee is really michaelp

<jmccuske> +1.860.995.aaii is really me

<Paolo> ok no problem that's a good answer :-)

<jmccuske> +1.860.995.aaii is really jmccuske

yolanda: discuss use cases
... There are several use cases on e-science, and several aspects of the dimensions do not yet show up in the use cases. People have offered to add use cases in the next few days.
... Add comments to the discussion pages of the wiki. Maybe have use case editors work with each proposer to refine the use case, make sure there is enough detail, or merge two related use cases.
... any volunteers for use case editors?

<Luc> I am not sure where we're supposed to comment? on each use case page itself?

yolanda: we need a glossary to keep track of the terms that we use
... going through each use case.
... 1. Simple Trustworthiness Assessment - Olaf is proposer

<pgroth> luc, i think, if you have comments on a use case you can do that in the discussion section of the use case

<Deborah> http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Dimensions

Olaf: Alice publishes Bob's and Carol's data; she is not the provider of the data. So there are different trust relations that the consumer/user of the data, Eve, uses to decide whether or not to use the data. Dimensions: Attribution (Alice provides and manipulates data; Bob and Carol provide the data), Publication (how to publish this provenance info), Use (Understanding?), Trust (application filters data based on trust).

Luc: the use case should be expressed without the concept of provenance.
... this use case is good because it's connected to different dimensions and is strongly related to Linked Data. There is a lot of interest in the RDF community in associating triples with provenance, and I wonder if this use case is influenced by that. Should we try to express use cases without such a dependence on technology and without assuming the provenance concept?

(Luc, I'm not sure I captured all your comments/questions, pls add anything I missed)

Simon: One of the purposes of a use case is to find technical challenges. It would be good if the owners of each use case added the technical challenges for that use case.

jcheney: I agree that provenance shouldn't be assumed in the use case. Is there a difference between trust and filtering in this use case?

olaf: consumer makes the judgement whether to filter or not but uses information related to provenance to make that decision

yolanda: maybe the use case could talk about trustworthiness assessment and then talk about how it would apply to different domains - LOD, e-science etc.

olaf: Linked Data Timeliness. A dataset that changes frequently. Two publishers use this data to make and publish their own dataset.
... An application accesses data provided by these publishers and uses provenance such as time to decide whether to use this data. Dimensions: process, creation time, data access time (a new one: how and when data was accessed, including when the publisher accessed the data), republishing, trust (filters data based on timeliness).

<pgroth> olaf, why doesn't this timeliness problem fall under the updates category?

Paulo: +1 Luc. The use case does not necessarily highlight problems.

<ssahoo2> +1 Paulo

Paulo, I missed that, please could you add it.

<pgroth> +q

Paolo: Strong focus on LOD, which may not be required. We should generalize the use case.

Paul: Olaf, you want to introduce another dimension of data access. Why does update not include this?

Olaf: The provider of the data may not be the creator. Update is related to creation, but the provider is not creating the data, only accessing it for publishing.

Yolanda: Time is only one aspect of this; there are other aspects.

<Paulo> understanding trust itself may be a distraction for the group to understand the role of provenance in support of trust

michaelp: the use case is about a reliability assumption and not so much about timeliness.

Yolanda: We need a glossary of terms. Reliability means something different to me. Time is crucial to provenance and we need to highlight it somehow.

Olaf: I agree with Michael. As in the trust case, where we had a model to calculate trust, in this case we have a model to calculate timeliness.


Simon: Result Differences use case. Process is the important thing. Management issue - large historic data, imperfection and debugging.

yolanda: is there a use case where comparing different results would matter
... in a different domain?

<pgroth> +q

<Paolo> :-)

pgroth: Blogosphere use case. Aggregating blog posts. Question: when you aggregate different posts and discussions, how do you trace back to where the original data came from?

Yolanda: content evolution, republishes, who is responsible for publishing, is it a requirement, can someone reconstruct the provenance.

pgroth: republishing, interoperability

Mark: Re cross-references: these are mostly circular references. I'll post a related paper.
... I'll add this use case

Yolanda: timeliness is not a matter of publishing but of the use itself: how provenance is used by the consumer.

<ssahoo2> http://www.w3.org/2005/Incubator/prov/wiki/Talk:Provenance_Dimensions

ssahoo: For dimensions I had suggested agents and adding temporal and spatial dimensions.

Paolo: Track how a community came to a scientific conclusion. Should there be another use case?

pgroth: A scientific aggregation use case would be good to have. Provenance of the blogosphere, tweets, and scientific aggregation are related and might lead to a common/general use case.

Yolanda: volunteers for use case editors?

<ssahoo2> I will be happy to help in editing the use cases

Simon: yes

Paolo: yes

ssahoo: will we be discussing dimensions?

Yolanda: let's work on use cases first and then refine dimensions

<pgroth> can we talk about the dimensions on the email list?

Yolanda: glossary editor?

<Luc> yolanda: should we raise the issue of a face 2 face meeting?

<ssahoo2> Paul: we are posting comments on the dimensions in the discussion tab

<pgroth> that's true

<pgroth> so a good place to do it

<ssahoo2> we can always have complementary discussions on the mailing list

<Deborah> bye

Yolanda, should I create the minutes and put them on the wiki?

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2009/12/04 17:09:16 $

Scribe (inferred by scribe.perl): lkagal