See also: IRC log
<trackbot> Date: 18 December 2009
trackbot, prepare telcon
<trackbot> Meeting: Provenance Incubator Group Teleconference
Discussion of new batch of use cases (led by Simon Miles and Satya Sahoo)
Coverage of provenance dimensions by current use cases (led by Simon Miles and Yolanda Gil)
Planning for next meeting, agenda and scribe (led by Yolanda Gil)
Review of action items (by scribe)
<YolandaG> Thanks for doing this Irini!!
<crunnega> Christine on the phone too
yolanda: Look at the provenance dimensions and the use cases, and at how to organize both
Satya will be covering 2 use cases
<mccuskej> same here
Use case inspired from experiments. Combine data from different sources and databases
Manual Extraction and NLP techniques
Basic issue is whether a particular instrument has been used.
Interpretation of query and experiment results.
Types of data: curated data of high quality. But data from prediction algorithms does not have the same quality as human-curated data.
Examples/sets of goals in the use case: when exchanging data between groups, it is essential to understand the process and the instruments used.
Get administrative data (instruments etc.)
Standard queries in provenance scenario to be answered.
Important to add the information needed to understand and interpret results
Storing and querying provenance information efficiently is a big issue
yolanda thinks that a general problem is that experimental data exists without provenance, and such data has limited use.
Question: how do we capture and represent provenance information to be used later on.
yolanda thinks that there is a more general problem that is important.
yolanda's question: in terms of provenance, does this mean there is a provenance query engine that searches the web for all experimental data with provenance and returns those results?
satya's answer: information is linked to the experimental results and the results are tracked back (provenance within a lab)
<JimM> a way to register to get updates to prov would address this - a trackback service
yolanda: what happens in a data exchange or data integration scenario? what is the scale?
Satya: scale of provenance information increases
<JimM> (there was an IEEE Escience 2009 presentation doing this for citations)
James: results in social sciences are used in policy decisions.
... do regulations exist that must be satisfied in the biomedicine domain?
<JimM> Pharma and analytical chemistry labs would be under FDA and other regulations
Satya: no legal requirements except the fact that journals want to have the dataset used in the papers published
Yolanda: good practices exist but not in the form of regulations
<JimM> legally acceptable records were an interest expressed via censa.org in the context of e-notebooks
Satya: argument from the community is that they want to maximize the publications before releasing the dataset
Yolanda: another argument is that it is too much work to capture all the information
... as a group, can we facilitate the production of provenance information?
Satya: 2nd Use case
Use Case from Paolo.
They want to enhance the provenance information from a workflow environment
highlight of the use case: domain-specific metadata for provenance
provenance trail from workflow must be extended with provenance annotations
specific challenge: how best to associate unstructured provenance with domain-specific provenance.
<JimM> the key issue with annotations is that they need to be part of the account structure, i.e. they are things being asserted
Satya: workflow based infrastructure associated with the domain specific vocabularies
can domain specific ontologies be used to annotate the trail of workflow process?
JimM: we need to be able to have an assertion structure for provenance metadata
... in a provenance discussion we need to deal with named graphs, reasoning, in order to be able to answer questions related to implicit information
JimM: we need to be able to make assertions across sources
Luc: not sure he would describe those as provenance use cases. To Luc, a provenance use case should answer a query of the user.
... the use cases state that users want to query the provenance, but not why.
... 2nd Use case: not a functional requirement for provenance
... Use case should not be defined in terms of provenance
<JimM> i'd be curious to hear more about why named graphs are insufficient...
<ivan> I guess this is the paper Irini referred to: Fundulaki, Irini, Vassilis Christophides, Giorgos Flouris, and Panagiotis Pediaditis. "On Explicit Provenance Management in RDF/S Graphs." In First Workshop on the theory and practice of provenance, TaPP'09, San Francisco, CA, James Cheney. San Francisco, CA, 2009. http://www.usenix.org/events/tapp09/tech/full_papers/pediaditis/pediaditis_html/.
Yes, thanks Ivan.
Satya: 3rd Use case
Luc: 3rd Use case [Use of private data]
Regulations for the use of private data, data protection acts
the use case refers to the provenance dimensions for accountability
processes use information compatible with rules/regulations
able to audit systems that process private information.
check whether the use of data was legal
whether the collection of data was lawful
problems with the scenario: metadata representation (all possible notions: tasks, obligations, etc.); SW technologies for this
another problem: provenance management: processing has to be documented so there is the need for a common documentation and provenance models (interoperability issue)
auditing the provenance in order to perform the auditing task
the results of the audit can be trusted if the provenance can be trusted
cryptographic hashes as part of provenance
checking the provenance against rules; this is a provenance use issue
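To illustrate the point about hashes as part of provenance, here is a minimal sketch, not something presented on the call. It assumes each processing step is documented as a small record and that each record's hash covers the previous record's hash, so an auditor can detect later alteration. The record fields and the function names (record_hash, build_chain, verify_chain) are hypothetical.

```python
import hashlib
import json


def record_hash(record, prev_hash):
    """Hash a provenance record together with the hash of its predecessor."""
    # sort_keys gives a stable serialization, so the same record always hashes the same.
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records in order; each entry stores the previous hash and its own hash."""
    chain = []
    prev = ""  # assumption: the first record has an empty predecessor hash
    for rec in records:
        h = record_hash(rec, prev)
        chain.append({"record": rec, "prev": prev, "hash": h})
        prev = h
    return chain


def verify_chain(chain):
    """Return True only if every link and every stored hash is consistent."""
    prev = ""
    for entry in chain:
        if entry["prev"] != prev:
            return False
        if record_hash(entry["record"], prev) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True
```

An auditor holding such a chain could call verify_chain before checking the records against rules. This only detects alteration of the documented trail; it does not show that the documentation was truthful in the first place.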
JimM: the audit can be done only if provenance is reconstructed
trail is going to be broken by the different players
<crunnega> There may be a business advantage in being able to reassure customers that their privacy policies and practices can be verified
provenance could give some hints on the problem but not an explanation of what has happened.
partial provenance could nail down where the leak has happened
Yolanda thinks it is very controversial to create a use case to highlight compliance
Luc: the primary dimension is accountability, which is not necessarily compliance.
Do not want to enforce compliance, just to be able to have accountability
A different use case: compliance to processes
crunnega: number of use case scenarios for privacy that could use provenance
Personal Data/Private Data equivalent terms.
<jcheney> = confidential data?
Yolanda takes the floor:
Yolanda plans to talk to Simon to go through provenance dimensions and use cases
Invitation to members to join and see the coverage of use cases
Missing half of the expected set
F2F meeting: most popular venue WWW, 2nd Meeting in NYC
Considering both venues WWW, IPAW
<mccuskej> I can't log into that page.
<JimM> +1 for two mtgs
Possibility to join on phone.
<ivan> i do
end of April will be reasonable. IPAW could be a good idea.
Next Meeting, January 8th
trackbot, end telcon
Present: Irini, Luc, Ivan, satya, Betty, YolandaG, +49.308.937.aabb, jcheney, olaf, +1.217.417.aacc, mccuskej, JimM, [IPcaller], lkagal