14:48:56 RRSAgent has joined #prov-xg
14:48:56 logging to http://www.w3.org/2009/12/18-prov-xg-irc
14:48:58 RRSAgent, make logs world
14:48:58 Zakim has joined #prov-xg
14:49:00 Zakim, this will be 98765
14:49:00 I do not see a conference matching that name scheduled within the next hour, trackbot
14:49:01 Meeting: Provenance Incubator Group Teleconference
14:49:01 Date: 18 December 2009
15:42:07 ssahoo2 has joined #prov-xg
15:49:35 Irini has joined #prov-xg
15:49:50 trackbot, prepare telcon
15:49:52 RRSAgent, make logs world
15:49:54 Zakim, this will be 98765
15:49:54 ok, trackbot; I see INC_PROVXG()11:00AM scheduled to start in 11 minutes
15:49:55 Meeting: Provenance Incubator Group Teleconference
15:49:55 Date: 18 December 2009
15:50:40 zakim, who is here?
15:50:40 INC_PROVXG()11:00AM has not yet started, Irini
15:50:41 On IRC I see Irini, ssahoo2, Zakim, RRSAgent, ivan, trackbot
15:52:01 YolandaG has joined #prov-xg
15:52:04 zakim, agenda?
15:52:04 I see nothing on the agenda
15:52:29 agenda+ Welcome, review of agenda (by Yolanda Gil)
15:52:29 Discussion of new batch of use cases (led by Simon Miles and Satya Sahoo)
15:52:29 - http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_1
15:52:29 - http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_2
15:52:30 - http://www.w3.org/2005/Incubator/prov/wiki/Use_Case_private_data_use
15:52:30 Coverage of provenance dimensions by current use cases (led by Simon Miles and Yolanda Gil)
15:52:31 Planning for next meeting, agenda and scribe (led by Yolanda Gil)
15:52:34 Review of action items (by scribe)
15:53:05 agenda+ Welcome, review of agenda (by Yolanda Gil)
15:53:17 agenda+ Discussion of new batch of use cases (led by Simon Miles and Satya Sahoo)
15:53:29 agenda+ http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_1
15:53:37 Thanks for doing this Irini!!
15:53:40 agenda+ http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_2
15:53:52 agenda+ Coverage of provenance dimensions by current use cases (led by Simon Miles and Yolanda Gil)
15:54:03 agenda+ Planning for next meeting, agenda and scribe (led by Yolanda Gil)
15:54:15 agenda+ Review of action items (by scribe)
15:54:22 zakim, agenda?
15:54:22 I see 8 items remaining on the agenda:
15:54:23 1. Welcome, review of agenda (by Yolanda Gil) [from Irini]
15:54:26 2. Welcome, review of agenda (by Yolanda Gil) [from Irini]
15:54:28 3. Discussion of new batch of use cases (led by Simon Miles and Satya Sahoo) [from Irini]
15:54:30 4. http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_1 [from Irini]
15:54:32 5. http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_2 [from Irini]
15:54:35 6. Coverage of provenance dimensions by current use cases (led by Simon Miles and Yolanda Gil) [from Irini]
15:54:37 7. Planning for next meeting, agenda and scribe (led by Yolanda Gil) [from Irini]
15:54:38 8. Review of action items (by scribe) [from Irini]
15:56:16 INC_PROVXG()11:00AM has now started
15:56:23 +Irini
15:56:25 zakim, who is here?
15:56:25 On the phone I see Irini
15:56:26 On IRC I see YolandaG, Irini, ssahoo2, Zakim, RRSAgent, ivan, trackbot
15:56:51 crunnega has joined #prov-xg
15:57:47 +??P1
15:57:50 Luc has joined #prov-xg
15:58:10 zakim, +??P1 is Luc
15:58:10 sorry, Irini, I do not recognize a party named '+??P1'
15:58:16 olaf has joined #prov-xg
15:58:32 zakim, ??P1 is Luc
15:58:32 +Luc; got it
15:58:42 zakim, who is here?
15:58:42 On the phone I see Irini, Luc
15:58:44 On IRC I see olaf, Luc, crunnega, YolandaG, Irini, ssahoo2, Zakim, RRSAgent, ivan, trackbot
15:58:48 +Prateek
15:59:17 zakim, dial ivan-voip
15:59:17 ok, ivan; the call is being made
15:59:18 +Ivan
15:59:21 zakim, prateek is satya
15:59:22 + +1.860.673.aaaa
15:59:22 +satya; got it
15:59:59 +Betty
16:00:24 jcheney has joined #prov-xg
16:00:24 zakim, who is here?
16:00:24 On the phone I see Irini, Luc, satya, Ivan, +1.860.673.aaaa, Betty
16:00:25 On IRC I see jcheney, olaf, Luc, crunnega, YolandaG, Irini, ssahoo2, Zakim, RRSAgent, ivan, trackbot
16:00:36 +Jerry_Hobbs
16:00:41 mccuskej has joined #prov-xg
16:00:46 Christine on the phone too
16:00:48 zakim, Jerry_Hobbs is really me
16:00:48 +YolandaG; got it
16:01:11 + +49.308.937.aabb
16:01:32 mccuskej has joined #prov-xg
16:02:01 JimM has joined #prov-xg
16:02:08 +??P11
16:02:34 zakim, ??P11 is really me
16:02:34 +jcheney; got it
16:02:56 zakim, aabb is olaf
16:02:56 +olaf; got it
16:02:59 +??P12
16:03:10 yolanda: Look at the provenance dimensions and use cases, and at how to organize them
16:03:37 + +1.217.417.aacc
16:03:40 Satya will be covering 2 use cases
16:03:46 Biomedical one.
16:03:57 http://www.w3.org/2005/Incubator/prov/wiki/Domain_Specific_Provenance_2
16:04:29 same here
16:04:51 afreitas has joined #prov-xg
16:05:09 zakim, who is on the phone?
16:05:09 On the phone I see Irini, Luc, satya, Ivan, +1.860.673.aaaa, Betty, YolandaG, olaf, jcheney, ??P12, +1.217.417.aacc
16:05:25 Use case inspired by experiments. Combine data from different sources and databases
16:05:37 Manual extraction and NLP techniques
16:05:40 zakim, +1.860.673.aaaa is really me
16:05:40 +mccuskej; got it
16:05:51 Basic issue is whether a particular instrument has been used.
16:05:54 zakim, ??P12 is really me
16:05:54 +JimM; got it
16:06:17 Interpretation of query and experimentation results.
16:06:52 Types of data: curated data with high quality.
But data from prediction algorithms does not have the same quality as the human-curated data.
16:07:24 Aleksey has joined #prov-xg
16:07:41 Examples/sets of goals in the use case: exchanging data between groups; essential to understand the process and the instruments used.
16:08:03 Get administrative data (instruments etc.)
16:08:14 lkagal has joined #prov-xg
16:08:19 +[IPcaller]
16:08:57 +Marisol
16:09:02 Standard queries in provenance scenario to be answered.
16:09:23 Important to add the information needed to understand and interpret results
16:09:28 zakim, +Marisol is me
16:09:28 sorry, lkagal, I do not recognize a party named '+Marisol'
16:09:36 q+
16:09:39 q+
16:09:46 Storing and querying provenance information efficiently is a big issue
16:10:14 yolanda:
16:10:57 yolanda thinks that a general problem is the presence of experimental data with no provenance; such data has limited use.
16:11:09 zakim, Marisol is me
16:11:09 +lkagal; got it
16:11:40 Question: how do we capture and represent provenance information so that it can be used later on?
16:12:20 yolanda thinks that there is a more general problem that is important.
16:13:11 yolanda's question: in terms of provenance, does this mean there is a provenance query engine that searches the web for all experimental data with provenance and returns these results?
16:13:52 satya's answer: information is linked to the experimental results and the results are tracked back (provenance within a lab)
16:14:27 a way to register to get updates to prov would address this - a trackback service
16:14:35 yolanda: what happens in a data exchange or data integration scenario? what is the scale?
16:14:59 Satya: scale of provenance information increases
16:15:33 (there was an IEEE Escience 2009 presentation doing this for citations)
16:15:40 James: results in social sciences used in policy decisions.
16:16:11 James: do regulations exist that must be satisfied in the biomedicine domain?
16:16:37 Pharma and analytical chemistry labs would be under FDA and other regulations
16:16:43 Satya: no legal requirements, except that journals want to have the datasets used in the papers they publish
16:17:17 Yolanda: good practices exist but not in the form of regulations
16:17:56 legally acceptable records were an interest expressed via censa.org in the context of e-notebooks
16:18:17 Satya: argument from the community is that they want to maximize the publications before releasing the dataset
16:18:34 Yolanda: another argument is that it is too much work to capture all the information
16:19:19 Yolanda: as a group, can we facilitate the production of provenance information?
16:19:32 Satya: 2nd Use case
16:19:40 Use Case from Paolo.
16:20:04 They want to enhance the provenance information from a workflow environment
16:21:19 highlight from the use case: domain-specific metadata for provenance
16:21:32 provenance trail from workflow must be extended with provenance annotations
16:22:09 specific challenge: how best to associate unstructured provenance with domain-specific provenance.
16:22:27 the key issue with annotations is that they need to be part of the account structure, i.e. they are things being asserted
16:22:58 q+
16:23:17 Satya: workflow-based infrastructure associated with the domain-specific vocabularies
16:23:41 can domain-specific ontologies be used to annotate the trail of the workflow process?
16:23:41 q+
16:24:00 q-
16:24:48 JimM: we need to be able to have an assertion structure for provenance metadata
16:25:45 JimM: in a provenance discussion we need to deal with named graphs and reasoning in order to be able to answer questions related to implicit information
16:26:03 +q
16:28:14 JimM: we need to be able to make assertions across sources
16:28:58 Luc: not sure he would describe those as provenance use cases. To Luc, a provenance use case should solve a query of the user.
16:29:29 Luc: the use cases state that the users want to query the provenance, but not why.
16:30:34 Luc: 2nd Use case: not a functional requirement for provenance
16:31:41 Luc: Use case should not be defined in terms of provenance
16:32:27 +q
16:33:36 q-
16:33:46 -q
16:34:06 -q JimM
16:34:12 -q Luc
16:34:39 i'd be curious to hear more about why named graphs are insufficient...
16:36:06 q+
16:36:59 q-
16:37:39 -Irini
16:37:59 I guess this is the paper Irini referred to: Fundulaki, Irini, Vassilis Christophides, Giorgos Flouris, and Panagiotis Pediaditis. "On Explicit Provenance Management in RDF/S Graphs." In First Workshop on the theory and practice of provenance, TaPP'09, San Francisco, CA, James Cheney. San Francisco, CA, 2009. http://www.usenix.org/events/tapp09/tech/full_papers/pediaditis/pediaditis_html/.
16:38:13 Yes, thanks Ivan.
16:39:48 q-
16:40:03 Satya: 3rd Use case
16:40:14 http://www.w3.org/2005/Incubator/prov/wiki/Use_Case_private_data_use
16:40:28 Luc: 3rd Use Case [Use of private data]
16:41:13 Regulations for the use of private data, data protection acts
16:41:51 the use case refers to the provenance dimensions for accountability
16:42:13 processes use information compatible with rules/regulations
16:42:25 able to audit systems that process private information.
16:42:47 check whether the use of data was legal
16:43:09 whether the collection of data was lawful
16:44:24 -[IPcaller]
16:45:08 the problems with the scenario: metadata representation (all possible notions: tasks, obligations, etc.)
16:45:13 for this, SW technologies
16:45:58 another problem: provenance management: processing has to be documented, so there is a need for common documentation and provenance models (interoperability issue)
16:46:15 auditing the provenance in order to perform the auditing task
16:46:27 q+
16:46:30 the results of the audit can be trusted if the provenance can be trusted
16:46:32 -Irini
16:46:34 -q
16:46:50 q+
16:46:51 cryptographic hashes as part of provenance
16:47:10 checking the provenance against rules, and this is a provenance use issue
16:47:15 +q
16:48:24 JimM: the audit can be done only if provenance is reconstructed
16:48:45 trail is going to be broken by the different players
16:48:54 zakim, who is making noise?
16:48:58 There may be a business advantage in being able to reassure customers that their privacy policies and practices can be verified
16:49:05 Irini, listening for 10 seconds I heard sound from the following: Luc (4%)
16:49:39 q+
16:50:19 provenance could give some hints on the problem but not an explanation of what has happened.
16:50:46 -mccuskej
16:51:29 partial provenance could nail down where the leak has happened
16:51:42 (from JimM)
16:52:52 Yolanda thinks it is very controversial to create a use case to highlight compliance
16:53:47 Luc: the primary dimension is accountability, which is not necessarily compliance.
16:54:04 q+
16:54:19 Do not want to enforce compliance, just be able to have accountability
16:54:52 A different use case: compliance to processes
16:55:27 pgroth has joined #prov-xg
16:56:18 crunnega: a number of use case scenarios for privacy that could use provenance
16:58:01 Personal Data/Private Data equivalent terms.
16:58:12 q-
16:58:16 = confidential data?
16:58:19 q-
16:59:34 q-
16:59:40 Yolanda takes the floor:
16:59:41 q-
17:00:00 Yolanda plans to talk to Simon to go through provenance dimensions and use cases
17:00:20 Invitation to members to join and see the coverage of use cases
17:00:30 Missing half of the expected set
17:01:52 F2F meeting: most popular venue WWW, 2nd Meeting in NYC
17:01:52 http://www.w3.org/2002/09/wbs/43897/FindingTimeforF2F/results
17:02:39 Considering both venues WWW, IPAW
17:03:34 I can't log into that page.
17:03:36 +1 for two mtgs
17:03:43 Possibility to join on phone.
17:04:18 i do
17:06:22 end of April will be reasonable. IPAW could be a good idea.
17:06:39 Next Meeting, January 8th
17:07:05 -satya
17:07:07 -JimM
17:07:07 - +1.217.417.aacc
17:07:08 -olaf
17:07:08 -lkagal
17:07:10 -Irini
17:07:10 -Betty
17:07:11 -jcheney
17:07:12 -Ivan
17:07:17 lkagal has left #prov-xg
17:07:18 mccuskej has left #prov-xg
17:07:33 -Luc
17:07:51 trackbot, end telcon
17:07:51 Zakim, list attendees
17:07:51 As of this point the attendees have been Irini, Luc, Ivan, satya, Betty, YolandaG, +49.308.937.aabb, jcheney, olaf, +1.217.417.aacc, mccuskej, JimM, [IPcaller], lkagal
17:07:52 RRSAgent, please draft minutes
17:07:52 I have made the request to generate http://www.w3.org/2009/12/18-prov-xg-minutes.html trackbot
17:07:53 RRSAgent, bye
17:07:53 I see no action items