W3C

Scientific Discourse

20 Feb 2012

Attendees

Present
+1.619.252.aaaa, pgroth, EricP, RichBoyce, +1.310.279.aabb, [IPcaller], Yolanda, Jodi, +44.777.500.aacc, +1.617.947.aadd
Regrets
Chair
Anita
Scribe
ericP

<Anita> Did you dial into the UK or France? My colleagues are having problems dialing in

<pgroth> helena deus?

<pgroth> http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf

<boycer> please repost link to slides for those of us who came in late to the chat

<Anita> prov-overview-update-pg.pdf [View next to chat]

<jodi> http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf

<boycer> thanks!

[slide 2]

<Anita> Will also post to meeting wiki at http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/meetings/20120220

<scribe> scribenick: ericP

[slide 3]

<Anita> ericP In 'note that the topic has the link to the slides' - where/what is this 'topic' that you speak of?

<Gully> repost slide location?

pgroth: Science needs repeatability and reproducibility

<pgroth> http://www.few.vu.nl/~pgroth/prov-overview-update-pg.pdf

[slide 4]

pgroth: [timbl's quote about the "Oh yeah?" button] [actually comes from Dan Connolly, I believe]

[slide 5]

pgroth: many folks want prov representations immediately
... also, many implementations

<YolandaGil> We took Tim's quote from his book, as I recall

<Anita> ericP do you have the details on the 'other' way to dial in?

pgroth: the incubator group produced a literature review and a WG

[slide 8]

<pgroth> slide 8

pgroth: wide membership: industry, government, science

[slide 9]

pgroth: at the top of slide 8 are serializations of e.g. OWL encoding (Prov-O), XML (Prov-XML), JSON, ...
... these express the provenance data model (Prov-DM)
... one can access the DM via Prov-AQ

<Anita> Prov AQ = Provenance Accessing Query - how do I go from resource to Provenance

pgroth: we also have a provenance primer

[slide 10]

pgroth: DM has a small set of classes and relationships

[slide 11 - provenance example]

pgroth: [re: slide 11] we need a process to describe the chart which Alice produced

[slide 12]

pgroth: "Entity" describes a chart, CSV file, car, ...

[slide 13]

[slide 14]

pgroth: Prov helps us know who did what

<Anita> Agent takes an active role in an activity

<Anita> (Small thought: you could model the entire process of doing science in this model!)

pgroth: Prov talks about Entities, Agents and Activities

[slide 16]

pgroth: Activities can produce Entities

[slide 17]

<Anita> 'Generation is the production of a new entity by an activity' (by an agent, correct?)

pgroth: Usage is when an Activity consumes an entity

[slide 18]

pgroth: sometimes you want to describe the derivation without describing the exact process.
... e.g. "Chart1 was derived from Entity1" without going into details

[slide 19]

pgroth: want to link an Activity to an Agent
... e.g. "Alice is responsible for the Excel analysis"

[slide 20]

pgroth: responsibility of Alice. analysis used a CSV file, chart was derived from the CSV file
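
The slide 20 example can be sketched as plain subject-predicate-object triples. This is a minimal illustration, not something shown on the call: the ex: identifiers are hypothetical, and the property names follow the later PROV-O Recommendation, which may differ from the draft under discussion.

```python
# Sketch of the slide 20 example as (subject, predicate, object) triples.
# All ex: identifiers are hypothetical; prov: names are taken from the later
# PROV-O Recommendation and are an assumption with respect to the 2012 draft.
triples = [
    ("ex:csvFile", "rdf:type", "prov:Entity"),
    ("ex:chart1", "rdf:type", "prov:Entity"),
    ("ex:analysis", "rdf:type", "prov:Activity"),
    ("ex:alice", "rdf:type", "prov:Agent"),
    ("ex:analysis", "prov:used", "ex:csvFile"),              # Usage
    ("ex:chart1", "prov:wasGeneratedBy", "ex:analysis"),     # Generation
    ("ex:chart1", "prov:wasDerivedFrom", "ex:csvFile"),      # Derivation
    ("ex:analysis", "prov:wasAssociatedWith", "ex:alice"),   # Responsibility
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def responsible_agents(entity):
    """Follow entity -> generating activity -> associated agent."""
    agents = []
    for activity in objects(entity, "prov:wasGeneratedBy"):
        agents.extend(objects(activity, "prov:wasAssociatedWith"))
    return agents


print(responsible_agents("ex:chart1"))
```

Walking from the chart to its generating activity and then to the associated agent answers "who did what" without any chart-specific logic.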

@@1: when you write out workflows for e.g. experiments, you have lots of intermediate activities

scribe: i found having an activity-to-activity relationship is useful
... do you have a representation of that intermediate structure. otherwise representation gets laborious

[slide 21]

pgroth: i think you're describing something we call wasInformedBy
... wasAttributedTo is fundamental
... we offer extra constructs.
... there may be many constructs which you want, but we need a small set to achieve interop
... that's a trade-off frequently discussed in the group
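
One way to read the activity-to-activity shortcut: if one activity used an entity that another activity generated, the first can be said to have been informed by the second. The sketch below derives such pairs from usage and generation records. The workflow step names are hypothetical, and the derivation rule is an assumption based on later PROV documents rather than anything stated on the call.

```python
# Hypothetical workflow: sequencing -> alignment -> plotting.
# generated_by maps each entity to the activity that produced it.
generated_by = {
    "ex:rawReads": "ex:sequencing",
    "ex:alignedReads": "ex:alignment",
}

# used lists (activity, entity) pairs: the activity consumed the entity.
used = [
    ("ex:alignment", "ex:rawReads"),
    ("ex:plotting", "ex:alignedReads"),
]


def informed_by(used, generated_by):
    """Derive (informed, informant) activity pairs from shared entities."""
    pairs = []
    for activity, entity in used:
        producer = generated_by.get(entity)
        if producer is not None:
            pairs.append((activity, producer))
    return pairs


for informed, informant in informed_by(used, generated_by):
    print(f"{informed} prov:wasInformedBy {informant}")
```

Recording only the derived pairs keeps a long experimental workflow compact, at the cost of hiding the intermediate entities.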

[slide 22]

pgroth: it's easy to write this in Turtle
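
As a rough illustration of how little Turtle is needed, the following sketch groups triples by subject and prints them in Turtle syntax. The ex: names and the http://example.org/ namespace are hypothetical, the property names come from the later PROV-O Recommendation, and the serializer is deliberately naive (prefixed names only, no literals or escaping).

```python
# Naive Turtle writer for prefixed-name triples; a sketch, not a full serializer.
PREFIXES = {
    "prov": "http://www.w3.org/ns/prov#",
    "ex": "http://example.org/",  # hypothetical example namespace
}

# "a" is Turtle shorthand for rdf:type.
triples = [
    ("ex:chart1", "a", "prov:Entity"),
    ("ex:chart1", "prov:wasGeneratedBy", "ex:analysis"),
    ("ex:chart1", "prov:wasDerivedFrom", "ex:csvFile"),
    ("ex:analysis", "a", "prov:Activity"),
    ("ex:analysis", "prov:used", "ex:csvFile"),
    ("ex:analysis", "prov:wasAssociatedWith", "ex:alice"),
]


def to_turtle(triples):
    """Serialize triples as Turtle, one block per subject."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in PREFIXES.items()]
    lines.append("")
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    for s, pairs in by_subject.items():
        body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
        lines.append(f"{s} {body} .")
    return "\n".join(lines)


turtle = to_turtle(triples)
print(turtle)
```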

[slide 23]

[slide 24]

pgroth: we're working on simplifying the explanations
... released a draft last year. got the feedback that we need a simpler explanation
... Ontology is available but volatile at this point
... we're working with Dublin Core on a document describing the relationship between Prov and DC
... preparing for deeper community feedback
... aiming for Rec by the end of 2012

[slide 25]

pgroth: want to hear how in HCLS you need to extend this model
... there are some implementations, and if you want to implement, we're anxious to work with you

<Anita> This is a good time to give feedback to the PROV model

pgroth: feedback is useful now

1+

<pgroth> go ahead

Tim: please use a transcript of this presentation as documentation as it was very clear

<Anita> DavidShotton did you want to be added to the speaker queue?

Tim: Roles would be useful

<pgroth> err… actually...

<Anita> Tim: Can you add roles to the (core) model?

Tim: I realize that Roles don't belong in the Core
... e.g. role as a presenter, convener, etc.
... need it for a project

<Anita> Tim: I have a project I need that in

<DavidShotton> We are developing role ontologies

<Anita> pgroth: roles are in the spec

pgroth: we do have roles. i should add to this explanation
... we have a placeholder for entities with respect to activities

<Anita> pgroth: have a placeholder - entities wrt activity: e.g. chart plays-role-of output

<Anita> pgroth: but we don't define any roles

pgroth: we have a construct called prov:Role, but don't supply any kinds of Role

<pgroth> http://dvcs.w3.org/hg/prov/raw-file/default/primer/Primer.html

<DavidShotton> I would distinguish the STATUS of a document as an output from the ROLE of an AGENT

pgroth: you'll find Roles in the primer

Tim

<Anita> Tim: implementations: we started working with Susanna Sansone on an OWL model to encapsulate ISA-TAB data for experiments;

Tim: re: implementations and use, we're working with Susanna S on the ISA representation of genetic experiments
... we started saying "we need to subclass OBI" but migrated to Prov

<Anita> Tim: incorporating provenance as a motivator for that; much more useful to take Provenance as a motivator for experiments

<Zakim> ericP, you wanted to ask about Prov-AQ vs. SPARQL over the RDF representation

<Anita> So - if you have a question please type in 'q+' and then your question

pgroth: Prov-AQ tells you how to go from a resource to its associated provenance
... we have a number of ways

<Anita> EricP asks question about Prov-AQ

pgroth: .. what metadata you need to embed in HTML to get to the provenance
... .. use of HTTP protocol to look up a prov service or set of data

<Anita> Are there other questions?

pgroth: another part of Prov-AQ is a query endpoint
... .. you POST or GET and we give you back some prov info
... also discusses SPARQL patterns
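
The HTML-embedding route can be sketched as a page that carries a link element pointing at its provenance, which a client then extracts. In the sketch below the page content and the example.org URL are hypothetical, and the link relation URI is the one used in the later PROV-AQ Note; the draft current at the time of this call may have used a different name.

```python
from html.parser import HTMLParser

# Link relation from the later PROV-AQ Note (an assumption for the 2012 draft).
HAS_PROVENANCE = "http://www.w3.org/ns/prov#has_provenance"


class ProvenanceLinkFinder(HTMLParser):
    """Collect href values of <link> elements that point at provenance."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated relation values, or be absent.
        rels = (attributes.get("rel") or "").split()
        if HAS_PROVENANCE in rels and attributes.get("href"):
            self.links.append(attributes["href"])


def find_provenance_links(html_text):
    """Return the provenance URIs advertised by an HTML document."""
    finder = ProvenanceLinkFinder()
    finder.feed(html_text)
    return finder.links


# Hypothetical page advertising where its provenance can be retrieved.
page = """<html><head>
<link rel="http://www.w3.org/ns/prov#has_provenance" href="http://example.org/provenance/chart1">
<link rel="stylesheet" href="style.css">
</head><body>chart</body></html>"""

print(find_provenance_links(page))
```

A client would then dereference each returned URI with an ordinary HTTP GET; that step is left out to keep the sketch offline.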

DavidShotton: excellent work. the simplicity is the key thing
... people have tried to capture e.g. time-dependent Roles, but that complicates things

pgroth: we've seen two broad uses:
... .. i've got a web doc or publication and i want to assign some prov info.

<Anita> paulG 1) web document, want to assign Provenance information; need simple document

pgroth: .. i want to track in a detailed fashion provenance in an automated system

<Anita> paulG: 2) You want to track in lots of detail provenance in automated system, fixed or time-dependent things, Paul on Feb 20th at 16:43 in Amsterdam

pgroth: in latter, i want to talk about e.g. "Paul at 4:21 while he's in Europe"
... supporting both of these views has been a challenge to the group
... we think we've got it with this model of starting with a simple model but having additions you can use

<Zakim> Anita, you wanted to say current implementations anywhere? Also: publishers can apply somehow?

Anita: there might be provenance stored in workflow tools
... could the prov model expose that in e.g. supplementary material in a publication?

<Anita> Sorry really two questions: first what are implementations? Second can we store workflow data in publication as Prov?

pgroth: Workflow4Ever (Taverna) provenance info is exposed via the Prov model
... there are some tools for mapping from OPM to Prov

<DavidShotton> For encoding time-related roles, see http://imageweb.zoo.ox.ac.uk/pub/2012/cerif/Shotton&Peroni_PRO-and-PSO.ppt

pgroth: not many implementations yet, still in a bit of flux

<pgroth> http://www.w3.org/2011/prov/wiki/TavernaProvenance

<pgroth> ack, I forgot about wings!

Anita: we've been talking with many folks about capturing workflows

<pgroth> sorry yolanda :-)

<david_r_newman> http://www.wf4ever-project.org/

Anita: and presenting in a way which can be easily used

<DavidShotton> Workflows 4 Ever http://www.wf4ever-project.org/

Anita: prov model is generic enough that it could be used in many domains

Tim: i expected another use case:
... .. when you guys publish, there's workflow applied to treating the manuscript

Anita: good point, we don't have good ways of capturing that

<pgroth> https://github.com/lucmoreau/ProvToolbox

Anita: could start earlier at e.g. figures and charts
... i don't know of any publisher who has that

<DavidShotton> Simple post hoc capture of publishing workflows: The Publishing Workflows Ontology http://purl.org/spar/pwo/

Tim: this is a goal of ISA-RDF
... .. define metadata in a way which standardizes workflow and provenance

<pgroth> https://github.com/INCF/ProvenanceLibrary

<Anita> Tim: say if you publish something and you have a figure - be able to click on it and go to a data repository that has primary data, steps and provenance

Tim: we're hoping that, with a standard model, you would be able to e.g. click on a heatmap and get back to the source data

YolandaGil: i thought Tim was discussing the e.g. review process
... there's also the processing to generate a certain figure

<Anita> YolandaGil: Tim is capturing review process, is one workflow; other one is kind of processing that took place to generate a certain figure

YolandaGil: or the lab experiments and steps you took to obtain the data in the first place
... these are different steps, but they are connected

<Anita> Or link to gully's work: steps taken in the lab, observational assertions, interpretational assertions...

YolandaGil: reviewers often don't have a good way to check the work that they are reviewing
... this could help the reviewer inspect and detect errors

<Anita> YolandaGil: easier to inspect errors and insufficiencies in the papers; helps us review (and reproduce! - Anita) what was done

YolandaGil: the issue of credit is crucial to scientists

<Anita> YolandaGil credit is intertwined with provenance - different people do different things: who did what?

YolandaGil: with today's level of collaboration/re-use, they want to assign credit with precise detail

<Anita> ORCID is interested in developing a taxonomy of microcredit attribution

YolandaGil: if we were able to easily record prov on a dataset, samples, etc, we'd end up with extensive provenance records
... gives us the movie-level credits

<Anita> It would be interesting to link this - Amy Brand at Harvard is leading this, my colleague Mike Taylor is working on it in practice

Anita: ORCID is working on a model for micro-credit for authors working on a paper
... use case came from Provost, the evaluation side

<Anita> http://about.orcid.org/

YolandaGil: you may not be able to anticipate the credits you want to include
... prov model would provide a substrate

Tim: there's a little workshop on this

<Anita> http://about.orcid.org/civicrm/event/info?id=4&reset=1 ORCID workshop

<Anita> Tim: Yolanda can you attend the ORCID event?

<jodi> definitely a good time to hear about this!

Anita: many thanks. you'll hear from us

<boycer> one sec

<YolandaGil> Thank you for inviting us to talk about provenance!

Use Case 1

<pgroth> thanks everyone

boycer: on track wrt milestones

<Anita> Use case 1: developing demo: https://docs.google.com/document/d/1QpW-axtGL7Tuhd_Zcaf30a4-s4S_lveIJtpKL5QBLfQ/edit?hl=en_US

boycer: Anita, Joey and Anita, we've developed a proof-of-concept of the product inserts

<pgroth> If you have any questions, let us know. We really want feedback

boycer: linked to claims at ClinicalTrials.gov, medical pathways, and drug-drug interactions

Use Case 2

<Anita> Use case 2: Boycer is presenting at C-SHALS this week about Use Case 1 - http://www.iscb.org/cshals2012-program/

DavidShotton: no progress since last meeting

<Anita> Sorry that's Use case 1: Boycer is presenting at C-SHALS this week about Use Case 1 - http://www.iscb.org/cshals2012-program/

<Anita> DavidShotton: no news on UC 2, need to talk to Tim

<Anita> Tim: we have a model we've been developing with classes and object properties, not yet datatype props; can get started on discussions, happy to chat

Tim: i can share with you the latest and greatest, added object properties and starting on data properties

Use Case 3

Anita: joanne and I have been developing UC3 for a class which Deb McGuinness is teaching
... students will provide a portal to link adolescent antidepressants to drug interaction and some proprietary Elsevier data

DavidShotton: there are two meetings in Cambridge, MA related to Roles:
... Wellcome Trust and then ORCID the next day
... 16 and 17 May

Tim: they are specifically joined

Anita: next joint meeting in 4 weeks
... soliciting thoughts for presentations

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/02/20 16:17:07 $