#p3yl [PROV: Three Years Later]
[11:52] == LucMoreau [~LucMoreau@public.cloak] has joined #p3yl
[11:53] <@LucMoreau> RRSAgent, make logs public
[11:54] <@LucMoreau> present +LucMoreau
[11:56] == LucMoreau changed the topic of #p3yl to: PROV: Three Years Later
[11:58] <@LucMoreau> Programme: http://provenanceweek.org/2016/p3yl/programme.html
[12:12] == LucMoreau_ [~LucMoreau@public.cloak] has joined #p3yl
[12:22] == Paolo [~Paolo@public.cloak] has joined #p3yl
[12:29] <@LucMoreau> present +LucMoreau, Paolo Missier
[12:37] == Paolo [~Paolo@public.cloak] has quit [Ping timeout: 180 seconds]
[13:05] == Yuankai [~Yuankai@public.cloak] has joined #p3yl
[13:05] == Yuankai [~Yuankai@public.cloak] has quit ["Page closed"]
[13:15] == CurtTilmes [~CurtTilmes@public.cloak] has joined #p3yl
[13:17] == Paolo [~Paolo@public.cloak] has joined #p3yl
[13:19] == ericstephan [~ericstephan@public.cloak] has joined #p3yl
[13:19] == ericstephan [~ericstephan@public.cloak] has quit ["Page closed"]
[13:19] == tdn [~tdn@public.cloak] has joined #p3yl
[13:19] == Tom_Bytheway [~Tom_Bytheway@public.cloak] has joined #p3yl
[13:20] == Schmendrick [~Schmendrick@public.cloak] has joined #p3yl
[13:20] == ericstephan [~ericstephan@public.cloak] has joined #p3yl
[13:20] == NickCar [~NickCar@public.cloak] has joined #p3yl
[13:20] == onyame [~onyame@public.cloak] has joined #p3yl
[13:23] <@LucMoreau> present +LucMoreau, Paolo Missier, Curt Tilmes, Eric Stephan, Tom Bytheway, Nick Car, onyame, tsn, Schmendrick
[13:24] == pgroth [~pgroth@public.cloak] has joined #p3yl
[13:25] agenda: http://provenanceweek.org/2016/p3yl/programme.html
[13:25] Session 1: Experience and Impact
[13:30] <@LucMoreau> present +LucMoreau, Paolo Missier, Curt Tilmes, Eric Stephan, Tom Bytheway, Nick Car, onyame, tdn, Schmendrick
[13:32] == DeborahLNichols [~DeborahLNichols@public.cloak] has joined #p3yl
[13:33] <@LucMoreau> Nick, do you have a URL describing your application with the Australian government?
[13:33] NickCar need for "serious URI"
[13:36] http://www.bristol.ac.uk/cmm/software/statjr/
[13:37] == NickCar_ [~NickCar@public.cloak] has joined #p3yl
[13:41] "serious URI" = "Cool URIs" (https://www.w3.org/TR/cooluris/)?
[13:43] Yeah but it's the social/governance arrangements around the CoolURIs that are hard. Making them 'cool' is relatively easy. Keeping the domain account entries paid and keeping proxy servers online over time is hard
[13:44] lots of nice applications that use prov
[13:45] now we need users who use prov applications
[13:50] here is a link to the ProvONE demo: http://dataoneorg.github.io/provweek2016-demo/
[13:52] We have a shedload of PROV documents here (66,822 documents and growing). They're a little repetitive, but may be useful. https://provenance.ecs.soton.ac.uk/store/public/
[13:53] http://lov.okfn.org/dataset/lov/vocabs/prov shows 39 vocabs reusing prov-o
[13:55] <@LucMoreau> advertise your prov data sets/tools at https://www.w3.org/2001/sw/wiki/PROV
[13:56] == NickCar [~NickCar@public.cloak] has quit [Ping timeout: 180 seconds]
[13:56] <@LucMoreau> ProvStore has also got lots of provenance documents: see https://provenance.ecs.soton.ac.uk/store/
[13:58] Could this be fixed by adding a pingback link to the provstore query service to every LOD endpoint?
;-)
[13:59] there is LOD and Web provenance, and there is deep provenance
[13:59] == ajax [~ajax@public.cloak] has joined #p3yl
[13:59] deep provenance may be where much of the real impact happens
[13:59] <@LucMoreau> Action: document the impact of PROV
[14:00] +1 pgroth
[14:00] == pedwards [~pedwards@public.cloak] has joined #p3yl
[14:01] <@LucMoreau> present +LucMoreau, Paolo Missier, Curt Tilmes, Eric Stephan, Tom Bytheway, Nick Car, onyame, tdn, Schmendrick, Peter Edwards
[14:03] provenance querying and analysis to understand the impact of changes in the data used in analytics applications: https://www.usenix.org/conference/tapp16/workshop-program/presentation/missier
[14:03] == Alban_Gaignard [~Alban_Gaignard@public.cloak] has joined #p3yl
[14:03] <@LucMoreau> Curt: PROV awareness has greatly increased
[14:03] A few other PROV use cases: PROV for food safety compliance monitoring (with UK Food Standards Agency)
[14:04] Also many more PROV papers appearing at more generic conferences
[14:04] or at least papers where PROV is used as part of a process/workflow
[14:04] Another PROV use case: PROV to aid data quality assessment from crowd reports (rural public transport)
[14:04] <@LucMoreau> Data on the Web Best Practices https://www.w3.org/TR/dwbp/#provenance
[14:06] Another use case: PROV templates to aid social science researchers to document social media data analytics (see poster at IPAW)
[14:06] A work-in-progress on consuming PROV to generate linked experiment reports https://www.usenix.org/conference/tapp16/workshop-program/presentation/gaignard
[14:06] <@LucMoreau> Domain specific provenance needed for auditing, private data, ...
[14:06] <@LucMoreau> present +LucMoreau, Paolo Missier, Curt Tilmes, Eric Stephan, Tom Bytheway, Nick Car, onyame, tdn, Schmendrick, Peter Edwards, Alban Gaignard
[14:08] Best practices in life science datasets: http://www.w3.org/TR/hcls-dataset/#s6_4
[14:08] Schmendrick [~Schmendrick@public.cloak] requested CTCP VERSION from LucMoreau:
[14:10] <@LucMoreau> Challenge in the legal community: how do you assign the cost of producing and authenticating records?
[14:10] <@LucMoreau> Can we better quantify the benefits of using provenance in specific (business) contexts?
[14:12] <@LucMoreau> alternative set of use cases being suggested ....
[14:12] <@LucMoreau> .... scientists can't replicate their own results
[14:12] DOE use cases: reproducibility
[14:13] <@LucMoreau> .... government putting pressure to share data to validate results
[14:13] Should we launch FOIs designed to show up the lack of reproducibility?
[14:13] is results validation the killer app for provenance? (low complexity / high impact)
[14:13] UK social science community have recognised that most research using social media data is poorly documented/not reproducible.
[14:13] <@LucMoreau> ... design systems to capture provenance and automated reasoning over provenance
[14:14] <@LucMoreau> Does provenance make a difference in the climate assessment report?
[14:15] Use case for Government: evidence-based policy? But beware: as much of what actually happens is policy-based evidence!
[14:19] Australia has used the climate assessment report as an example of transparent science best practice
[14:20] Is anybody storing the provenance of provenance? Do we know how much of the PROV on the web has been manually curated versus automatically created?
[14:22] I think 'REFable' statements of success could be an output of the Research Data Alliance's Research Data Provenance Interest Group (https://rd-alliance.org/groups/research-data-provenance.html)
[14:25] <@LucMoreau> Need better way to communicate our success stories
[14:25] The American and European Geophysical Unions are also very nice communities to evangelize
[14:26] == ajax [~ajax@public.cloak] has quit [Ping timeout: 180 seconds]
[14:26] Communities looking for traceability solutions: primary food producers, makers, crafters ... Not just scientific community.
[14:27] Some crappy old provenance use cases http://promsns.org/uc/usecases/
[14:31] == Alban_Gaignard [~Alban_Gaignard@public.cloak] has quit [Ping timeout: 180 seconds]
[14:32] == pgroth [~pgroth@public.cloak] has quit [Ping timeout: 180 seconds]
[14:32] == ericstephan [~ericstephan@public.cloak] has quit [Ping timeout: 180 seconds]
[14:35] == Schmendrick [~Schmendrick@public.cloak] has quit [Client closed connection]
[14:39] == Schmendrick [~Schmendrick@public.cloak] has joined #p3yl
[14:39] == Alban_Gaignard [~Alban_Gaignard@public.cloak] has joined #p3yl
[14:40] == pgroth [~pgroth@public.cloak] has joined #p3yl
[14:40] Session 2: Inter-operability Issues and Gaps
[14:40] 1) Make a standard provenance system
[14:40] == ericstephan [~ericstephan@public.cloak] has joined #p3yl
[14:40] issues of software reuse
[14:43] nidm.nidash.org - domain specific extension of prov
[14:43] == RDGelzer [~RDGelzer@public.cloak] has joined #p3yl
[14:43] issues of round trip conversion between different serializations
[14:43] interesting domain-specific extension in the area of brain imaging: http://provenanceweek.org/2016/p3yl/slides/slide_90.pdf shortcoming of the prov specs to enable interoperability at tool level?
[14:44] hard to write queries over prov
[14:44] == RDGelzer [~RDGelzer@public.cloak] has quit ["Page closed"]
[14:44] @paolo it's really a problem of writing SPARQL over qualified patterns
[14:45] == RDGelzer [~RDGelzer@public.cloak] has joined #p3yl
[14:48] == RDGelzer [~RDGelzer@public.cloak] has quit ["Page closed"]
[14:48] == RDGelzer [~RDGelzer@public.cloak] has joined #p3yl
[14:53] @LucMoreau Some work on access control to provenance that may be of interest: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=7232963
[14:55] <@LucMoreau> Paul: do we need more common libraries?
[14:58] <@LucMoreau> If you use ProvToolbox or ProvPy, can you post a link to your app here?
[14:58] <@LucMoreau> All those extensions of PROV (dataone, provone) can be out of control?
[15:04] <@LucMoreau> Tom: it would be useful to have a PROV validator at W3C?
[15:09] <@LucMoreau> Paolo: Should we have a service that provides signed provenance documents if they validate?
[15:14] <@LucMoreau> present +LucMoreau, Paolo Missier, Curt Tilmes, Eric Stephan, Tom Bytheway, Nick Car, onyame, tdn, Schmendrick, Peter Edwards, Alban Gaignard, RDGelzer
[15:15] <@LucMoreau> present +LucMoreau, Paolo Missier, Curt Tilmes, Eric Stephan, Tom Bytheway, Nick Car, onyame, tdn, Schmendrick, Peter Edwards, Alban Gaignard, RDGelzer, DeborahLNichols, pgroth
[15:16] == pgroth [~pgroth@public.cloak] has quit [Ping timeout: 180 seconds]
[15:16] == NickCar_ [~NickCar@public.cloak] has quit ["Page closed"]
[15:16] <@LucMoreau> many communities have not embraced data as a first-class citizen yet, so no surprise they struggle with provenance!
[15:23] == RDGelzer [~RDGelzer@public.cloak] has quit [Ping timeout: 180 seconds]
[15:30] == LucMoreau_ [~LucMoreau@public.cloak] has quit [Ping timeout: 180 seconds]
[15:40] == tdn [~tdn@public.cloak] has quit [Ping timeout: 180 seconds]
[15:40] <@LucMoreau> test
[15:41] This is what a lot of people see when they Google provenance: https://www.provenance.org/
[15:41] == LucMoreau_ [~LucMoreau@public.cloak] has joined #p3yl
[15:41] == NickCar [~NickCar@public.cloak] has joined #p3yl
[15:41] Vasa: need an informative landing page with success stories that newcomers and interested potential up-takers would find useful (and appealing)
[15:41] <@LucMoreau> Vasa: no proper community portal, no proper case study
[15:42] == BertramLudaescher [~BertramLudaescher@public.cloak] has joined #p3yl
[15:44] <@LucMoreau> Eric: advertise to the provenance community conferences that make use of provenance
[15:44] Is there a provenance mailing list?
[15:44] <@LucMoreau> Need educational section, tutorial material, small examples
[15:44] <@LucMoreau> Paul: impact of PROV paper/web page
[15:45] <@LucMoreau> Lucy: organised a provenance workshop at a supercomputing conference
[15:46] <@LucMoreau> Nick: add provenance modules to existing data set
[15:47] <@LucMoreau> Luc: Quantifying the impact/benefits of provenance
[15:47] <@LucMoreau> Paolo: community portal where we can advertise what we do. An active vibrant community needs a place to go to.
[15:48] <@LucMoreau> Bertram: what is the best practice in provenance queries
[15:49] <@LucMoreau> Curt: SPARQL is not novice friendly, needs more abstract layer
[15:50] == dakoop [~dakoop@public.cloak] has joined #p3yl
[15:50] Luc: should we restart the interop provenance challenges?
[15:51] == tdn [~tdn@public.cloak] has joined #p3yl
[15:51] <@LucMoreau> Eric: github, tools, api
[15:51] <@LucMoreau> Luc: provenance challenge integrating tools with queries, and quantifying benefits
[15:52] <@LucMoreau> Paolo: we had provbench, can we also measure performance in that context? performance of queries, for instance.
[15:54] <@LucMoreau> Paolo/Eric: discoverability of provenance, how do we do it?
[15:55] <@LucMoreau> Paul: integrating prov software into Docker, Jupyter notebook, etc
[15:56] <@LucMoreau> Luc: reconvening WG or Community group?
[15:57] Paul: add PROV tooling to infrastructure for reproducibility (see previous point)
[15:57] <@LucMoreau> Paul: why isn't ping back a recommendation?
[15:57] s/Paul/Tom
[15:57] <@LucMoreau> Nick: why can't ping back allow us to submit a bunch of provenance
[15:58] <@LucMoreau> Eric: Maybe Interest group rather than Community group
[15:59] <@LucMoreau> Paul: would like to see software showing that there is a concrete list of inter-operability issues
[16:00] <@LucMoreau> Reed: is there a forum to discuss use cases/scenarios?
[16:03] <@LucMoreau> Nick/James: RDA is useful for scientific data. What is the right place for legal/medical?
[16:04] <@LucMoreau> Vasa: what is the funding scheme for this? network grants?
[16:08] == RDGelzer [~RDGelzer@public.cloak] has joined #p3yl
[16:09] <@LucMoreau> Age: clear statement of why it is worth paying the money for? what's the highest impact the community can aim for? what is the nugget to chase?
[16:10] <@LucMoreau> Paul: EBM semantic banking are using prov, but not well known to us
[16:10] <@LucMoreau> Paul: schema.org is a place where we start putting prov, and get developers' attention
[16:12] <@LucMoreau> Paolo: more coordination in terms of extracting provenance for R, Python,
[16:13] <@LucMoreau> Marta: Spark community claims to have provenance, but it is not PROV
[16:13] == dakoop [~dakoop@public.cloak] has quit [Ping timeout: 180 seconds]
[16:15] <@LucMoreau> Who would like to contribute to a community portal, in some way to be determined?
[16:15] +1
[16:15] +1
[16:15] <@LucMoreau> +1
[16:15] +1
[16:15] +!
[16:15] back to good old +1!
[16:15] +1
[16:15] +pgroth
[16:15] +/- 1
[16:15] +1
[16:16] in the name of mankind, +1
[16:17] <@LucMoreau> Luc: Should we have a special issue on provenance impact?
[16:19] <@LucMoreau> Vasa: this does not preclude a community portal
[16:19] <@LucMoreau> Paul: wouldn't it be better to have a common paper?
[16:20] Luc wrote "Foundations of Provenance on the Web", perhaps it's time for something like "Impact of Provenance on the Web"?
[16:21] <@LucMoreau> James: maybe better to have papers on provenance in domain specific journals (e.g. the Nature paper by Curt and team)
[16:21] more science papers on provenance?
[16:23] == age [~age@public.cloak] has joined #p3yl
[16:23] <@LucMoreau> Nick: more certification in standardisation bodies, e.g. OGC
[16:25] <@LucMoreau> Nick: data fitness assessment ... why not use provenance?
[16:26] <@LucMoreau> Nick: international ping back exercise
[16:27] <@LucMoreau> who would like to participate in an inter-operability exercise (data, ping back, etc ...)?
[16:27] +1
[16:27] +1
[16:27] +1
[16:27] +pgroth
[16:27] +1
[16:27] +1
[16:27] +1
[16:27] +1 if it included complexity and/or at web scale
[16:28] == vlad [~vlad@public.cloak] has joined #p3yl
[16:28] == VasaCurcin [~VasaCurcin@public.cloak] has joined #p3yl
[16:28] +1
[16:28] +1 on community portal, +1 on interoperability exercise
[16:29] <@LucMoreau> action on luc: check prov-comments@w3.org is working
[16:29] <@LucMoreau> present: +vlad
[16:31] <@LucMoreau> So three outcomes: community portal, special issue, some interoperability challenge
[16:33] <@LucMoreau> We are closing the proceedings, thanks to all
[16:33] == Schmendrick [~Schmendrick@public.cloak] has left #p3yl []
[16:34] <@LucMoreau> RRSAgent, draft minutes.
[16:34] == tdn [~tdn@public.cloak] has quit ["Page closed"]
[16:34] == CurtTilmes [~CurtTilmes@public.cloak] has quit ["Page closed"]
[16:34] == NickCar [~NickCar@public.cloak] has quit ["Page closed"]
[16:35] <@LucMoreau> RRSAgent, draft minutes
[16:38] == Paolo [~Paolo@public.cloak] has quit [Ping timeout: 180 seconds]
[16:38] == age [~age@public.cloak] has quit [Ping timeout: 180 seconds]
[16:38] == Alban_Gaignard [~Alban_Gaignard@public.cloak] has quit [Ping timeout: 180 seconds]
[16:38] == ericstephan [~ericstephan@public.cloak] has quit [Ping timeout: 180 seconds]