<trackbot> Date: 03 June 2011
<sandro> scribe: tlebo
<GK_> Having trouble with conference passcode again
yolanda: notes the final report.
<dgarijo> the link to the final report: http://www.w3.org/2005/Incubator/prov/XGR-prov-20101214/
trust, what things are and what they mean, how it was collected. CLOSED SYSTEM - we know it all and trust it.
<GK_> Provenance: needed for operating in an open information system. Make implicit expectations of closed system explicit.
contrast with OPEN SYSTEM - harder to use it because many contribute that you do not know.
consumer: how can I trust what I see?
<GK_> (Slide 3)^
Yolanda listing examples of multiple sources from which we collect evidence. who created it, who is responsible, whom do I attribute?
how old, who is managing repository? how can we verify these aspects?
in business - how do we ensure compliance with processes. e.g., outsourcing and getting results.
in science - how are results obtained? papers can get retracted.
in news -
<GK_> Wondering how much interaction there is between work on provenance and work on trust in open systems (e.g. trust conferences, etc.)
in law and IP - who owns or has released document with what permissions?
TBL's "Oh, yeah?" button quote (1997)
trust at the top of the layer cake.
provenance need quotes.
John Sheridan, UK National Archives, data.gov.uk: "Provenance is the number one issue that we face when publishing government data in data.gov.uk"
being able to qualify what the data means.
provenance in science. not being able to reproduce results.
research forensics - people that dissect publications failing to reproduce results. e.g. clinical trials being done are based on false results.
e.g. Nobel prize winner's paper was retracted because it couldn't be reproduced (not the prize paper)
some think "provenance is a no brainer; just do it :-)"
work done in incubator group
<GK_> IMO, if we can't make it (nearly) a no-brainer for developers, we'll struggle to make it happen
people don't know how to approach provenance.
linked data community is facing the problem - querying the linked data and getting triples that don't make sense. what text extraction tools produced them?
scattered terminology, confounded with "trust"
<GK_> Before "provenance", there was a fair amount of SemWeb interest in "Context"
increased interest in provenance: Luc claims 1/2 of provenance papers published in last two years.
incubator group: state of art and develop road map
shared definition done at VERY END of group's work.
summarized 30 use cases by using 3 flagship scenarios
reviewed existing provenance vocabularies.
numbers (11/15) are dates
(slide assumes audience knows period of activity)
<GK_> I'd quite like to take this definition, and notes, into the WG work
provenance is the infrastructure that provides the BASIS to decide trust, verification, etc.
trust algorithms operate over provenance records.
provenance assertions of provenance assertions
inference to handle incompleteness and errors.
different accounts for same resource.
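(Editorial sketch, not from the talk: one way to picture the points above, with trust computed over provenance records, assertions about assertions, and several accounts of one resource. The `Assertion` fields, the `accounts_for` and `trust_score` helpers, the averaging rule, and the sample data are all hypothetical, chosen only for illustration.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One provenance assertion. All field names are hypothetical."""
    id: str           # identifier, so other assertions can refer to this one
    subject: str      # id of the resource (or assertion) being described
    claim: str        # free-form claim, e.g. "createdBy:zoo"
    asserted_by: str  # source that made the assertion


def accounts_for(subject, assertions):
    """Group the assertions about one subject by source: one account per source."""
    accounts = {}
    for a in assertions:
        if a.subject == subject:
            accounts.setdefault(a.asserted_by, []).append(a)
    return accounts


def trust_score(subject, assertions, source_trust, default=0.0):
    """Toy trust algorithm: average the prior trust of every source that
    gives an account of the subject. Unknown sources and subjects with no
    account fall back to `default` (a crude way to handle incompleteness)."""
    accounts = accounts_for(subject, assertions)
    if not accounts:
        return default
    return sum(source_trust.get(s, default) for s in accounts) / len(accounts)


# Hypothetical data: two accounts of the same image, plus an assertion
# whose subject is another assertion (provenance of provenance).
records = [
    Assertion("a1", "panda.jpg", "createdBy:zoo", "zoo"),
    Assertion("a2", "panda.jpg", "modifiedBy:tweeter", "blog"),
    Assertion("a3", "a2", "basedOn:image-diff", "auditor"),
]
priors = {"zoo": 1.0, "blog": 0.5}

image_score = trust_score("panda.jpg", records, priors)
claim_score = trust_score("a2", records, priors)  # "auditor" has no prior
```

The trust function only reads the records and never changes them, which matches the note that provenance is the basis for trust decisions rather than the decision itself.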
Three major dimensions to use to think about provenance.
Dimension 1 - content = what are we representing?
(5 types of Dimension 1, Content: attribution, process, evolution and versioning, justification for decisions, and entailment)
Dimension 2 - Management
<GK_> @tlebo, still talking to (1) content, I think
(4 types of Dimension 2, Management: publication, access, dissemination control, scale)
(@GK_ sorry, I confounded Data Access and Access)
I know 2) Management - Access as "Discoverability and Accessibility"
Dimension 3 - Use includes (Understanding, interoperability, comparison, accountability, trust, imperfections, debugging)
<paolo> just muted myself, sorry
3 Dimensions are a framework to think about provenance issues.
30 use cases from the community
<GK_> I've wrestled with these 3 dimensions; still not completely sure, but seems to be (1) what does provenance consist of; (2) how make provenance available; (3) what can I do with provenance once I get it?
spent a lot of time defining how to structure use cases.
3 flagship scenarios
blogging news company needs to produce truthful and quality reports.
tweets of panda, NYTimes journalist - all different sources that the blogging news company can use.
<jcheney> By the way Yolanda there are slides for the Disease Outbreak scenario at: http://www.w3.org/2005/Incubator/prov/wiki/Analysis_of_Disease_Outbreak_Scenario
did the tweeter modify the image of the panda?
<jorn> "without getting sued" :)
manage heterogeneous provenance records. how to present them, how to expose more details.
different communities analyzing the outbreak
business scenario - how does a company show that they complied with a contract? letting the consumer run verification procedures.
keeping some processes proprietary, but not breaking the verification.
state of the art report
areas of research and application for provenance
(I organized the mappings at https://spreadsheets.google.com/spreadsheet/ccc?key=0ArTeDpS4-nUDdFBrQ3ZJMXROUHh4SmxRUVE5V0QwbVE&hl=en_US#gid=0)
yolanda enumerating the provenance vocabularies
<jorn> provenance surveys in literature: http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Survey
original mappings that Yolanda mentioned: http://www.w3.org/2005/Incubator/prov/wiki/Provenance_Vocabulary_Mappings#Mappings
short vs longer term recommendations for next steps.
reproducibility should be longer term
GK_: relationships to other work? Trust in open systems. Has provenance work interacted with work on trust in open systems and the Trust Conferences?
Yolanda: published survey of Trust in CS and semweb 3/4 years ago. on prov-xg wiki state of the art report.
trust: can you trust a certain entity. Can I authenticate to give access. Develop algorithms where I trust you and you trust another (transfer of trust). PLENTY of work on this.
LESS work on "can I trust this content" (as opposed to "can I trust this entity")
trust you on movie recommendation or using one road over another.
content-based trust research is quite narrow.
trusting agents vs. trusting content.
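(Editorial sketch, not from the talk: "transfer of trust" between agents is often modelled as propagation over a graph of direct trust values. The function below assumes one simple rule, multiplying trust values in [0, 1] along a path and keeping the best path. The rule, the `transferred_trust` name, and the sample graph are illustrative assumptions, not a method named in the discussion.)

```python
import heapq


def transferred_trust(direct, source, target):
    """Best-path trust from source to target.

    `direct` maps each agent to {neighbor: trust}, with trust in [0, 1].
    Trust along a path is the product of its edges; the highest product
    over all paths is returned, or 0.0 when no path exists.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated scores
    while heap:
        negated, node = heapq.heappop(heap)
        score = -negated
        if node == target:
            return score
        if score < best.get(node, 0.0):
            continue  # stale heap entry, a better path was already found
        for neighbor, weight in direct.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0


# Hypothetical agents: alice trusts bob, bob trusts carol fully, and
# alice's own direct trust in carol is lower than the path through bob.
direct_trust = {
    "alice": {"bob": 0.5, "carol": 0.25},
    "bob": {"carol": 1.0},
}

via_bob = transferred_trust(direct_trust, "alice", "carol")
no_path = transferred_trust(direct_trust, "alice", "dave")
```

This covers only the "can I trust this entity" side. Content-based trust, which the talk describes as less studied, would need the provenance of the content itself as input.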
Yolanda: many say doing provenance is easy, just make a schema and do it.
the content in the provenance record is one thing, but how do you access, manage, and use those records?
it requires many considerations.
need for standards - many systems that track provenance by themselves, but how can other systems get, read and use those records?
need provenance in an open system where you don't have full control.
<Zakim> GK_, you wanted to test understanding of dimensions
<dgarijo> not only that, but provide guidelines for publishing provenance should be important too
GK_: how do the 3 dimensions apply to doing a user requirements analysis? is "what, how, and why" a fair reflection?
tlebo: scientific apps? observation and measurements?
yolanda: use case 2, but there are MANY sociological aspects within that scientific process.
tlebo: is there a nugget of observation and measurement within the disease outbreak flagship scenario?
pgroth: notion of objects
<Luc> thanks Yolanda!
<jun> thank you very much Yolanda!
+1 for Yolanda being helpful!
<pgroth> +1 thanks
<jorn> yupp, thanks a lot :)
<GK_> Thank you Yolanda.
<paolo> thank you once again, Yolanda!
<dgarijo> thank Yolanda!