08:14:39 RRSAgent has joined #dxwg 08:14:39 logging to http://www.w3.org/2017/07/17-dxwg-irc 08:15:39 Caroline_ has joined #DXWG 08:15:50 Present+ 08:16:16 LarsG has joined #dxwg 08:16:22 present+ 08:16:34 Introductions... 08:16:41 roba has joined #dxwg 08:16:47 newton has joined #dxwg 08:17:58 Can't hear :-( 08:18:28 AndreaPerego has joined #dxwg 08:18:33 rrsagent, set logs public 08:18:38 present+ AndreaPerego 08:18:43 present+ 08:18:48 present+ 08:18:52 meeting: DXWG Oxford Face to Face 08:19:26 Can hear you reasonably well Karen, but it dropped out past danbri 08:19:43 annette_g has joined #dxwg 08:19:43 chair: Karen 08:19:50 Jaroslav_Pullmann has joined #dxwg 08:19:55 present+ 08:19:55 present+ Dave_Raggett 08:20:04 Present+ annette_g 08:20:18 LuizBonino has joined #dxwg 08:20:29 present+ 08:20:37 scribenick: caroline_ 08:20:42 * agenda says check IRC for pw ... IRC says see todays agenda... 08:21:17 Scribe: Caroline_ 08:21:52 Topic: Introductions 08:21:52 kcoyle: the main goal for our F2F is to discuss the UCR 08:22:18 present Rob Atkinson 08:22:44 s/present/present+/ 08:22:45 ... Caroline and I tried to categorize them. If we get to a Use Case and think it is in another category we just move it 08:22:48 present+ Rob_Atkinson 08:23:31 ... the idea is to get through all of them even though if we don't get resolutions about all we have listed 08:23:48 https://www.w3.org/2017/dxwg/wiki/Main_Page#Working_Documents -> https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space & https://www.w3.org/2017/dxwg/wiki/Use_Cases_and_Requirements 08:23:56 ... if we need we may finish some of them afterwords 08:24:07 Keith has joined #dxwg 08:24:13 s/afterwords/afterwards/ 08:24:34 Topic: DCAT and "dataset" 08:24:49 q+ 08:24:51 kcoyle: the first one we are going to discuss is https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID8 08:24:53 q? 08:25:11 ack SimonCox 08:25:48 SimonCox: looking the version one of DCAT there is no ?? 08:26:14 ... 
to the extended DCAT part of what we are looking at is part of dublincore 08:26:21 https://www.w3.org/TR/vocab-dcat/#introduction """Data can come in many formats, ranging from spreadsheets over XML and RDF to various speciality formats. DCAT does not make any assumptions about the format of the datasets described in a catalog. Other, complementary vocabularies may be used together with DCAT to provide more detailed format-specific information.""" 08:27:20 ... what is the scope for DCAT descriptions? 08:27:22 ... also dataset 08:27:43 ... we recommend the use of existing DCAT recommendations 08:27:51 DCAT alludes to http://dublincore.org/documents/2003/02/12/dcmi-type-vocabulary/ """(Dataset) A dataset is information encoded in a defined structure (for example, lists, tables, and databases), intended to be useful for direct machine processing.""" 08:28:15 ... the original dublicore metadata 08:28:32 q? 08:28:38 ... the description of the use case is above 08:29:07 ... it is clear as well as the requirements "Guidance on use of dc:type or similar for DCAT records. Recommendation on content-type vocabularies." 08:29:19 annette_g has joined #dxwg 08:29:24 q+ 08:30:04 Jaroslav_Pullmann: Is this still a dataset or is any resource which is not anymore a dataset? 08:30:13 ... I support the dataset 08:30:19 About the different resource types in different metadata standards, I prepared a summary table (incomplete): https://docs.google.com/spreadsheets/d/1nlAgLUGQcBe40oTk5WNCVz-6rud1JtLwjoYyyqAT45U/edit?usp=sharing 08:30:24 q? 08:30:24 ... it should be more than separetaly 08:30:41 s/separetaly/separately/ 08:30:47 ack Makx 08:30:48 s/separetaly/separately 08:31:01 Q+ 08:31:38 Makx: I am against of limiting the scope of what DCAT dataset is 08:31:39 q? 08:31:42 q+ 08:31:42 +1 to Makx 08:31:43 q+ 08:31:46 ack annette_g 08:31:49 ... 
I am in favor of using vocab to say what dataset is 08:32:09 annette_g: I think the use case approach should come down to actual use cases 08:32:24 s/here/hear 08:32:25 ... some of the use cases are questions 08:32:41 q? 08:32:48 ... we may consider those as separate questions 08:32:53 q+ 08:33:06 LuizBonino: I like the idea to be able to describe diferent types of information as assets 08:33:20 q? 08:33:26 ack LuizBonino 08:33:29 ack antoine 08:34:10 antoine: it seems this use case is to describe what is the dataset but it can also be understood about the context 08:34:19 q? 08:34:21 q+ to suggest that ANY collection of 0s and 1s (including empty collection) can be treated as a dataset; "dataset" is about how the data is handled/treated/managed, not an intrinsic property. 08:34:25 ack alejandra 08:34:40 alejandra: I think it is important to discuss the scope of the use cases 08:34:50 ... make sure that we provide guidance on the type 08:35:03 ... I agree with the Use Case and I think we need to consider it 08:35:06 q+ 08:35:08 the problem with using 'type' is that 'type' may be made up of many different attributes 08:35:23 q? 08:35:26 Keith++ 08:35:48 q? 08:35:52 ack danbri 08:35:52 danbri, you wanted to suggest that ANY collection of 0s and 1s (including empty collection) can be treated as a dataset; "dataset" is about how the data is handled/treated/managed, 08:35:55 ... not an intrinsic property. 08:35:59 q? 08:36:07 ack Makx 08:36:19 Makx: the definition of dataset 08:36:37 ... has to be cured 08:36:42 curated 08:36:42 seems to me the main thing is not to try to define it now - but to decide if we will maintain (or adopt) a list of types 08:36:44 s/cured/curated/ 08:36:58 ... I think it is important to clear it up 08:37:00 q? 08:37:04 danbri: I think we agree 08:37:13 ... 
is about the curation of the process around data 08:37:19 +1 for makx and dan 08:37:24 maybe this is useful: software vs data https://github.com/danielskatz/software-vs-data 08:37:24 accept 08:37:41 [I agree with Makx that being a dataset is around the social context surrounding data, not the data itself] 08:37:41 kcoyle: can we accept the use case ID8 as it is? 08:38:02 q+ 08:38:25 annette_g has joined #dxwg 08:38:27 Jaroslav_Pullmann: we can just accept it 08:38:40 +1 to Jaroslav 08:38:45 ... there are questions that are not stated on the use case 08:39:05 q? 08:39:08 S/can/can't/ 08:39:18 ... maybe we could check others use case related to see the requirements and descriptions to see if they complete themselves 08:40:00 kcoyle: let's check the use case ID20 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID20 08:40:09 kcoyle: we ante to be able to specify a type 08:40:24 ... we are probably going to have to point to a small number of recommended vocabs 08:40:38 q? 08:40:49 ... given that, could we vote on the ID8 and ID20 at the same time? 08:40:52 The link to parse.insight in the use-case description was unhelpful - I've corrected it 08:40:54 antoine: I think it would be all right 08:41:08 ... maybe SimonCox could explain 08:41:38 PWinstanley has joined #dxwg 08:41:51 SimonCox: there are a lot of diferent file types 08:41:55 present+ PWinstanley 08:42:07 ... they call content type 08:42:11 dataset type != encoding type - dataset may be exposed in many encodings 08:42:21 ... 
there are diferent formats of media type 08:42:46 antoine: on the web context content type uses media type 08:42:46 [media type could be .Z (application/x-compress, LZW) in the case of the Web History collection https://www.w3.org/History/1992/timbl-floppies/TimBerners-Lee_CERN/hype.tar.Z ] 08:42:48 a problem with the concept of dataset concerns streaming data because of its continuity: is the dataset the whole thing or a defined 'window' 08:42:49 s/diferent/different 08:42:55 SimonCox: I am talking about semantic oriented 08:43:08 q? 08:43:11 ... the language chosen is certain conflicted 08:43:14 ack an 08:43:20 ... talking about content type 08:43:22 should definitely change "content-type" wording in Use Case 08:43:51 we are talking about the range of dc:type 08:43:51 Is it the "nature" of the dataset instead of how it is serialised, right? 08:44:08 Right; that's how I perceive it also 08:44:09 ... the dublincore descriptions from 20 years ago recognize datasets chich are images, maps, etc 08:44:24 s/etc/spreadsheets, et 08:44:27 s/et/etc 08:44:31 q? 08:44:40 ... there is a strong sense the images are different 08:44:40 s/datasets chich/datasets which/ 08:44:52 antoine: I accept it 08:45:01 q+ 08:45:02 https://en.wikipedia.org/wiki/List_of_HTTP_header_fields 08:45:10 ...as someone suggested to put a small note saying it 08:46:03 q+ 08:46:09 kcoyle: we have mentioned something that was not discussed on the use cases 08:46:17 SimonCox: it says that in the use case 08:47:28 q+ 08:47:29 kcoyle: are we at a point that caould we vote on this 08:47:35 Jaroslav_Pullmann: we should merge them 08:47:41 s/caould/could/ 08:47:42 +1 to merge them 08:48:10 Sorry, merge what? 08:48:12 Makx: reminded us that we could merge only the requirements 08:48:16 +1 to merging reqs 08:48:24 the uses cases ID8 and ID20, AndreaPerego 08:48:30 +1 to keeping the use case separated (they were contributed separately) but having the requirements consolidated. 
08:48:37 q- 08:48:41 +1 08:48:51 q+ 08:49:04 ack Jaroslav_Pullmann 08:49:14 +1 to merge reqs - this will drive DCAT 1.x - keep use cases separate for record keeping 08:49:27 Jaroslav_Pullmann: if we are looking for audiences we have differents 08:49:43 ... they were not in the discussions. That was my motivation to merge them 08:49:56 q+ to ask if it's just about to accept or decline use cases 08:50:00 ... it might be interesting for researchers to see them merged 08:50:27 ... if we talk about access the question is if are we talking about datasets 08:50:32 Ine_ has joined #dxwg 08:50:47 ... we should be talking always about digital access resources 08:51:00 ... the access would be only by protocols 08:51:32 ... the definition of data maybe also about non digital data. It can be anything. So we must be sure to be talking about data accessible 08:51:38 ack Thomas 08:51:48 [is there anything DCAT can't describe? :] 08:51:53 Thomas: these two use cases coul be anout anything 08:52:18 ... the discussion about content type and so on is part of content negotiation 08:52:18 s/coul be anout /could be about / 08:52:31 q? 08:52:53 ... agree with Jaroslav_Pullmann to merge the requirements 08:52:55 +1 danbri 08:53:22 Jaroslav_Pullmann: is the purpose is to have a history we should merge only the requirements 08:53:32 q+ 08:54:02 Jaroslav_Pullmann: sometimes the use cases are very valueable 08:54:22 s/valueable/valuable/ 08:54:23 ... it is important to have reports of what we are missing 08:54:35 kcoyle: if you feel there is a use case missing, please create it 08:54:38 ack AndreaPerego 08:55:08 annette_g_ has joined #dxwg 08:56:38 AndreaPerego: we should consider include descriptions or resources that are not data 08:56:43 q+ 08:56:45 ack LarsG 08:56:45 LarsG, you wanted to ask if it's just about to accept or decline use cases 08:57:08 LarsG: I have a metaquestion. are we discussin the merging and how to proceed? 08:57:10 q- 08:57:37 ... 
we discussed that in a call and agreed to keep the use cases separated and merge the requirements 08:57:57 ... also a catalogue should be considered 08:58:36 Proposal will follow here 08:58:56 q+ 08:58:59 I agree that ID20 partly elaborates ID8, but it is only the requirements arising from these that matter in the end! 08:59:16 PROPOSAL: to accept the use cases ID8 and ID20 as they are 08:59:24 q- 08:59:36 The use-cases stay on the books so that we can check at the end if the products solve the use-cases 08:59:40 Q+ 08:59:48 +1 o Simon 08:59:58 s/o/to 09:00:17 +1 09:00:32 kcoyle: it is up to the group to drive requirements 09:00:43 PROPOSAL: to accept the use cases ID8 and ID20 as they are 09:00:45 +1 09:00:45 +1 09:00:45 +1 09:00:47 +1 09:00:47 +1 09:00:47 +1 09:00:48 +1 09:00:48 -! 09:00:49 +1 09:00:49 +1 09:00:50 +1 09:00:50 +1 09:00:51 +1 09:00:51 +1 09:00:52 +1 09:00:52 +1 09:00:53 -1 09:00:53 +1 09:00:56 +1 09:00:56 with or without the requirement part? 09:00:57 +1 09:00:57 +1 09:01:14 antoine without for now 09:01:20 annette_g_: I still have a concern about the ID8 being a use case 09:01:21 ok then +1 09:01:26 ... it is too general 09:01:28 Philippe has joined #dxwg 09:01:35 ... I feel the use cases should be concrete 09:02:09 kcoyle: annette_g_ do you volunteer to rewrite it?
09:02:21 SimonCox: I agree that annette_g_ do it 09:02:57 PROPOSAL: to accept the use cases ID8 with edits that annette_g_ will provide and ID20 as it is 09:03:09 +1 09:03:10 +1 09:03:12 +1 09:03:12 +1 09:03:12 +1 09:03:13 +1 09:03:13 +1 09:03:14 +1 09:03:14 +1 09:03:15 +1 09:03:15 +1 09:03:15 +1 09:03:16 +1 09:03:17 +1 09:03:17 +1 09:03:19 +1 09:03:19 +1 09:03:20 present+philippe_roccaserra 09:03:24 +1 09:03:32 +1 09:03:38 +0 09:03:39 RESOLVED: to accept the use cases ID8 with edits that annette_g_ will provide and ID20 as it is 09:03:56 philippe keep the space after + 09:04:05 sorry; it works 09:04:17 (still getting used to IRC) 09:04:25 IMO we should be quite generous in accepting use-cases, since these exemplify concerns in the community. The more challenging part is distilling the _requirements_ and consolidating these where they overlap or duplicate. The requirements will drive the design of the products. 09:04:25 sorry I've abstained only because I've missed the explanation of how annette_g_ wanted to make the UC more concrete. 09:04:31 RRSAgent, draft minutes v2 09:04:31 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 09:04:39 THE USE CASE ID36 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID36 09:04:56 s/THE USE CASE/ the use case/ 09:05:43 q+ 09:05:47 q? 09:05:51 Makx: Cross-vocabulary relationships is about the need that might be in the dcat about those other type of datasets 09:05:58 ack Jaroslav_Pullmann 09:05:58 +1 to Makx ( probably is just a matter of providing some examples..) 09:06:00 agree with Simon, accept all use cases and get on with the work of distilling requirements 09:06:04 [q: couldn't I distribute my qb:DataSet in either Turtle or RDF/XML syntaxes, each being a Distribution?] 09:06:11 Q- 09:06:13 q? 09:06:17 Jaroslav_Pullmann: I can reffer to the wikipage 09:06:41 ... Makx is right. 
Some schema.org consider the data being abstract 09:06:51 s/reffer/refer 09:06:58 q+ 09:07:17 ack roba 09:07:30 roba: I think it is an important use case 09:07:35 ... it is not just a distribuition 09:07:48 q+ to mention CSVW too 09:08:10 ... we should just double check that we create a situtation that can't be a dcat 09:08:13 +1 to roba 09:08:22 Makx: it is a litle bit more complicate than that 09:08:34 ... if you have a dataset as a datacube 09:08:47 ... the concept is almost the same, but now you have 2 implementation 09:08:51 q+ 09:09:24 ... one part would be of what dcat call a dataset 09:09:44 roba: I was saying that description can be a distribution 09:09:49 q- 09:09:55 q+ 09:10:24 ... just we don't get confused on describing data 09:10:30 ack danbri 09:10:30 danbri, you wanted to mention CSVW too 09:10:36 danbri: it is a very important problem 09:10:49 dataset/distribution: the problem is DCAT does not use the concepts conceptual, logical, physical - this would help 09:11:11 ... we have the choice of going of very specific things 09:11:29 ... seems that we have agreed with evey domain 09:11:33 ... we have to be pragmatic 09:11:49 s/evey /every / 09:12:04 ... if we are describing as a distribution then describe it as a distribution 09:12:15 q? 09:12:19 ... there is no right answer, but having concrete use cases might help 09:12:37 ack antoine 09:12:48 q? 09:12:51 ack Jaroslav_Pullmann 09:13:09 Jaroslav_Pullmann: if this would modif dcat standard 09:13:19 ... concepts of what this dataset is 09:13:34 ... if we agree that the dataset is abstract 09:13:52 ... with this notion in mind we should compare with other standards 09:14:03 ... these are the differences 09:14:10 ... 
comparing to shema 09:14:16 s/situtation/situation/ 09:14:24 [Dublin Core is scruffy and pragmatic where https://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records#FRBR_entities is overly prescriptive; even scoped to libraries, having 4 mutually exclusive types has been hard. It feels like there's a lesson for describing data here.] 09:14:30 s/comparing to shema/comparing to schema/ 09:14:33 kcoyle: is this a use case we want to address? 09:14:48 PROPOSAL: accept the use case ID36 09:14:49 +1 09:14:51 +1 09:14:52 +1 09:14:52 +1 09:14:52 +1 of course 09:14:54 +1 09:14:54 +1 09:14:54 +1 09:14:55 +1 09:14:56 +1 09:14:57 +1 09:14:59 +1 09:14:59 +1 09:14:59 +1 09:15:00 +1 09:15:00 +1 09:15:01 +1 09:15:02 +1 09:15:04 +1 09:15:05 +1 09:15:05 +1 09:15:07 +2 09:15:15 RESOLVED: accept the use case ID36 09:15:30 scribe: DaveBrowning 09:15:50 I vote to accept all use cases. But then we will need to distill, and collate, the *requirements* implied by the use cases. 09:15:58 scribenick: dsr 09:15:58 +1 09:16:02 scribe: Dave_Raggett 09:16:37 Topic: DCAT data elements 09:16:59 We start with ID9, seehttps://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID9 09:17:12 which talks about Common requirements for scientific data 09:17:57 Andrea: this is a use case based upon experience at JRC 09:18:19 s/Andrea/AndreaPerego 09:18:38 We need to verify requirements for multidisciplinary scientific data 09:18:49 q+ 09:19:00 q+ 09:19:20 annette_g_ has joined #dxwg 09:19:23 Q+ 09:20:37 we want to be able to describe the context, inclding authors, lineage, usage, links to publications about the dataset and links to input data 09:21:08 s/inclding/including/ 09:22:16 ack PWinstanley 09:22:23 we should start with a link to the context, and later work on what we can describe in the context 09:22:35 q+ 09:23:00 ack Keith 09:23:13 PWinstanley: I would be very hesitant to distinguish scientific in the requirements, although its fine as a use case 09:23:24 +1 to Peter's concern 
about distinguishing "science" from non 09:23:27 +q 09:23:40 q+ 09:23:56 Keith: I would like to go further with a complex set of role bound properties 09:24:00 q+ to comment on the use of "scientific" in the use case 09:24:24 We need this additional layer if intelligent software is to make use if it effectively 09:24:54 ack annette_g_ 09:25:25 ack Jaroslav_Pullmann 09:25:27 q+ to note that data citation metadata is critical for data(set) discovery (+maybe "scholarly" can substitute for "science" in some places?) 09:25:31 Annette will extend the use case 09:25:51 Keith will generate an extended use case referencing ID9 emphasising relationships of dataset to many other entities with role and temporal limits 09:26:34 q+ 09:26:39 Jaroslav: for scientific datasets, there will be an appropriate set of metadata 09:27:05 ack alejandra 09:27:14 ACTION: Keith to generate an extended use case referencing ID9 emphasising relationships of dataset to many other entities with role and temporal limits 09:27:14 Error finding 'Keith'. You can review and register nicknames at . 09:27:27 q? 09:27:39 ack tho 09:28:31 Thomas: we don’t want to scare people off with long lists of metadata which may be optional 09:28:34 ack AndreaPerego 09:28:34 AndreaPerego, you wanted to comment on the use of "scientific" in the use case 09:29:02 Data lineage: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#Modeling_data_lineage 09:29:10 annette_g_ has joined #dxwg 09:29:31 ACTION: annette_g_ to make the UC ID8 more concrete 09:29:31 Error finding 'annette_g_'. You can review and register nicknames at . 09:29:36 [what is the unique content of this usecase, beyond those it covers? e.g. 
I just noticed data citation is also in UC10 also from AndreaPerego] 09:30:03 S/annette_g_/annette_g/ 09:30:21 AndreaPerego: you put a link to the original dataset, but it could also be interesting to describe the processing involved in the lineage 09:30:24 ACTION: annette_g to make the UC ID8 more concrete 09:30:25 Created ACTION-14 - Make the uc id8 more concrete [on Annette Greiner - due 2017-07-24]. 09:30:43 It would be very useful to have two levels 09:30:50 [Do use cases need to be unique? I thought we just decided that they are only there to be hooks for unique _requirements_] 09:32:07 AndreaPerego: it is useful to have a link to a specific community where the metadata is relevant 09:32:12 ack danbri 09:32:12 danbri, you wanted to note that data citation metadata is critical for data(set) discovery (+maybe "scholarly" can substitute for "science" in some places?) 09:32:24 q+ to ask if scientific/scholarly dataset description cannot modeled as a specific profile extending DCAT-AP 09:32:49 danbri: I wanted to speak up for search indexing, we love text rather than numbers 09:32:52 q+ 09:32:57 ack LuizBonino 09:33:22 AndreaPerego, is there anything unique in UC9 not in your other related UCs? 09:33:38 q- 09:33:41 LuizBonino talks about different roles of authors 09:34:01 danbri, it's more a "meta" use case, giving the general context 09:34:09 +1 to danbri "what is the extra requirement from this use case" - we should spend our time on extracting requirements. There will be a lot of overlaps, but I'm not sure that chugging through votes on each uc is good use of time? 09:34:42 q? 
09:34:46 Karen: can we vote on ID9, and how it relates to profiles 09:34:47 +0 then (it seems a useful aggregation of the others, but if it has no unique content, seems an administrative/editorial matter) 09:34:57 PROPOSAL: accept id9, and consider this also when we discuss profiles 09:35:07 +1 09:35:12 +1 09:35:13 +1 09:35:16 +1 09:35:18 +1 09:35:19 +1 09:35:19 +1 09:35:19 +1 09:35:19 +1 09:35:20 +1 09:35:24 +1 09:35:24 +1 accept all use cases and get on with requirements 09:35:26 +1 09:35:27 +1 09:35:28 +1 09:35:36 +1 but with the caveat that we remove 'scientific' from it 09:35:39 +1 and +2 to Keith 09:35:40 +1 09:35:47 +1 09:35:53 +1 09:36:06 RESOLVED: accept id9, and consider this also when we discuss profiles 09:36:31 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID10 09:36:45 We look at ID10 Requirements for data citation 09:36:49 q+ 09:37:06 Karen invites Andrea to introduce ID10 09:38:10 AndreaPerego: this is about being able to cite bibliographic information and to associated related resources with persistent identifiers 09:38:20 Q? 09:38:29 q+ 09:38:30 "Being able to specify the basic mandatory information for data citation" suggests a relation to using SHACL/SHEX or similar c.f. https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID41 . I don't see the word mandatory in https://www.w3.org/TR/vocab-dcat/ 09:38:53 DCAT doesn’t yet provide the means to distinguish these identifiers sufficiently 09:40:28 q+ 09:40:42 Karen summarises the wording in the DXWG charter. We want to describe what is meant by an application profile, but not to define domain specific vocabularies for profiles 09:41:31 q+ 09:41:34 AndreaPerego: the use case is not about that as such, but rather about enabling citations 09:42:00 q? 09:42:01 so ID10 is about DCAT rather than app profiles 09:42:15 +q 09:42:27 q- 09:43:00 Thomas: the title of this use case is a little misleading 09:43:12 [i.e. 
If a DCAT-based description is going to be useful for data citation, we'll need to at least show how it would be modeled i.e. in terms of vocabulary not mandatory-ness. Someone else's problem to represent that profile using shex/shacl/etc.] 09:43:27 ack Jaroslav_Pullmann 09:43:31 ack Thomas 09:43:52 ack Keith 09:44:03 ack LarsG 09:44:10 Keith: one of the big thing with citation is being able to reference a specific version and section of a data set, and this is best handled in terms of a query expression 09:44:39 q+ 09:45:10 LarsG asks for clarification about the target of the use case 09:45:13 ack alejandra 09:45:17 AndreaPerego: this is about DCAT 09:46:12 alejandra: I think this is an important use case for DCAT, and we need to clarify the requirements as it overlaps with ID9 09:46:17 ack Jaroslav_Pullmann 09:46:22 q+ 09:47:08 Jaroslav: we need to consider the query parameters for referencing the distribution 09:47:36 Keith: this can get really complicated with some data stores 09:48:10 LarsG: I am still not sure if this is about DCAT or DCAT-AP 09:48:20 q+ 09:48:38 ack LarsG 09:48:45 ack Jaroslav_Pullmann 09:48:51 are we here to extend DCAT or to support some form of profile of DCAT usage 09:48:52 +q 09:48:55 q+ 09:49:02 annette_g_ has joined #dxwg 09:49:29 Jaroslav: we seem to missing a use case on data identification 09:50:13 I wonder whether "data identification" could not be too abstract. I see it more as a requirement. 09:50:29 isn't that kind of described in ID11? 09:51:01 ack alejandra 09:51:10 ACTION: Jaroslav_Pullmann to work with Keith on a use case on data identification 09:51:11 Created ACTION-15 - Work with keith on a use case on data identification [on Jaroslav Pullmann - due 2017-07-24]. 
09:51:32 ack Andr 09:52:08 PROPOSAL: accept ID10 09:52:13 +1 09:52:16 +1 09:52:17 +1 09:52:18 +1 09:52:19 +1 09:52:20 +1 09:52:24 +1 09:52:24 +1 09:52:29 +1 09:52:30 +1 09:52:34 Q+ 09:52:35 Meaning some core citation info is essential for DCAT VOC 09:52:39 +1 09:52:40 +1 09:52:40 q+ 09:52:41 +1 09:53:45 +1 09:53:57 annette_g: every scientific domain has its own list of metadata for its data sets 09:54:20 q? 09:54:39 what is the level that a profile sits at? 09:55:09 +1 to Karen 09:55:33 Karen: we will define how to express a dataset profile, but we won’t work on specific profiles which will be left to the relevant communities 09:56:30 annette_g: we do need to provide guidance to communities as to what we’re expecting them to do 09:57:47 XXX talks about how to define profiles 09:58:12 how to provide a consistent set of extensions 09:58:23 s/XXX/Thomas 09:58:40 Karen: we can look at how Dublic Core tacked this 09:58:45 +1 09:58:47 s/tacked/tackled/ 09:58:57 +1 09:58:58 +1 09:59:23 RESOLVED: accept ID10 09:59:25 +1 to accept all use cases ... 09:59:34 +1 09:59:36 +1 09:59:40 +1 09:59:42 +1 09:59:43 ]1 09:59:49 +1 09:59:50 +1 09:59:51 +1 09:59:52 +1 accept use cases and get to requirements 09:59:52 +1 10:00:02 s/]1// 10:00:46 q? 10:00:47 We move onto ID11 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID11 Modeling identifiers and making them actionable 10:00:53 +1 10:00:55 ack annette_g_ 10:01:08 +1 10:01:10 Karen: this is similar to others, can we just vote on accepting it? 
10:01:19 q+ 10:01:30 ack Jaroslav_Pullmann 10:01:35 dsr has left #dxwg 10:01:41 ack Keith 10:01:56 dsr has joined #dxwg 10:01:56 alejandra_ has joined #dxwg 10:02:31 Keith: many identifiers are role based, and we need to be general in supporting them 10:02:46 Q+ 10:03:32 Karen: you need to say what kind of identifier it is to enable search 10:03:36 ack annette_g_ 10:03:59 q+ 10:04:17 it also says alternative identifiers 10:04:24 Annette_G: we need to avoid limiting people to a single de-referenceable link 10:04:33 ack LuizBonino 10:05:09 LuizBonino: people need to state their identifier and the schema it belongs to 10:05:20 PROPOSED: Accept ID11 10:05:24 +1 10:05:25 +1 10:05:26 +1 10:05:27 +1 10:05:27 +1 10:05:28 +1 10:05:29 +1 10:05:29 +1 10:05:29 +1 10:05:30 +1 10:05:30 +1 10:05:32 +1 10:05:32 +1 10:05:33 +1 10:05:34 +1 10:05:40 +1 10:05:52 +1 10:05:53 RESOLVED: Accept ID11 10:05:55 +1 10:05:55 +1 10:05:57 +1 10:06:14 q+ to propose we accept all usecases 10:06:19 the question should be: is the UC clear? 10:06:32 PROPOSAL: We accept all use cases. 10:06:34 +1 10:06:39 +1 10:06:39 +1 10:06:40 accept them unless someone objects 10:06:45 No objections so far 10:06:46 +1 10:06:53 +1 10:06:53 danbri proposes we accept all of the use cases so we can discuss requirements 10:06:54 ack me 10:06:54 danbri, you wanted to propose we accept all usecases 10:07:07 -1 10:07:08 +1 10:07:34 antoine, can you repeat? 10:07:51 notuc = no objection to unanimous consent 10:08:13 I am co-chairing a group where people actually submitted UCs that were out of scope 10:08:19 so I have to flag this 10:08:41 That said, I will not strongly object if the WG here decides to just move on!
10:09:02 Maybe this is a better group :-) 10:09:45 Karen: we want to decide whether the use cases are in or out of scope 10:10:19 danbri: has anyone some ideas as to which use cases should be out of scope 10:11:06 ID9 needs some rewriting to avoid being specific to scientific datasets 10:11:52 Out of scope means that we won’t address the use case in DCAT 10:11:55 q+ about ID9 10:12:16 ack about 10:12:20 ack ID9 10:12:36 +1 to AndreaPerego 10:13:15 AndreaPerego: it is better to be concrete in the use cases and then generalise the requirements 10:13:31 Karen agrees 10:14:14 Do we want to go through each of the use cases to resolve whether they are in or out of scope? 10:14:39 We won’t be able to get a full list of requirements from the use cases today. 10:14:56 q+ 10:14:56 what time come back? 10:14:58 Our purpose today is to determine what is the scope of DCAT 1.1 10:15:08 [Maybe we can get away with a 'bulk' resolution that we believe all UCs submitted are in-scope to *consider* as reasonable asks of DCAT 1.1] 10:15:09 annette_g_ has joined #dxwg 10:15:09 … we break for 15 minutes … 10:15:09 in 15min we will come back 10:15:27 I won't come back. Getting late here and I'm still nursing pneumonia 10:15:47 hope you get better soon SimonCox 10:16:00 goodnight, SimonCox! 10:17:01 resolved: SimonCox to get well soon 10:17:11 :-) 10:28:56 newton has joined #dxwg 10:37:24 on a pause here, makx/andrea 10:37:34 starting within a few minutes 10:37:42 (I think) 10:37:56 alejandra has joined #dxwg 10:38:22 My apologies, will have to disconnect at 13:00 my time/noon Oxford for another call.
10:38:58 Hope we'll get through Quality by the top of the hour 10:40:01 annette_g has joined #dxwg 10:40:08 newton has joined #dxwg 10:40:18 scribe Thomas 10:40:52 scribenick: Thomas 10:41:08 rrsagent, make minutes 10:41:08 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html dsr 10:41:10 kcoyle has joined #dxwg 10:41:44 kcoyle: next two use cases - similar to identifiers 10:42:15 id28 & id29 parallel 10:42:17 q? 10:42:18 Caroline_ has joined #DXWG 10:42:21 are these out of scope? 10:42:23 Present+ Caroline_ 10:42:30 q+ 10:42:31 present+ 10:42:54 present+ 10:43:25 q+ 10:43:27 q- 10:43:29 ack Si 10:43:44 AndreaPerego: relationship id28-29 10:43:49 both about spatial aspect of data 10:43:52 not related otherwise 10:44:25 28 is about specifying reference systems that the data use (coordinate systems) 10:44:42 29 is about the spatial coverage 10:45:00 present+ 10:45:05 29 is more literal by nature 10:45:08 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID28 10:46:31 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID38 10:46:35 kcoyle: talking about 28 and 38 10:46:38 29 later 10:47:35 these are both cases of ID26 10:48:10 looking at id28 and id38 now 10:48:21 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID28 10:48:26 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID38 10:48:29 any objections against 28? 10:48:38 28 and 27 are basically the same pattern 10:48:40 q? 10:48:48 ack AndreaPerego 10:48:58 q+ 10:49:00 q+ 10:49:20 ack roba 10:49:30 Noting that https://www.w3.org/TR/owl-time/ is in Candidate Recommendation review re time-related aspects in ID38 10:49:34 Again, it is seems that both 28 and 38 are suitable for extensions/profiles 10:49:39 can we update the agenda to have the correct name if ID28? 
10:49:55 roba: 27 and 28 are alike and go about the modelling aspects of semantics 10:50:09 q+ 10:50:20 general set of requirements will have to be extracted from there 10:50:26 and then discussed 10:50:29 I would say UC27 (temporal coverage) is related to UC29 (spatial coverage), not UC28 (reference systems). 10:50:41 a lot of people are using these things differently 10:50:45 AndreaPerego++ 10:50:59 kcoyle: grandfather in related aspects to 26 10:51:48 q+ 10:51:49 Jaroslav_Pullmann: we're going to look into requirements - we have to ask if the requirement in UC28 is enough? 10:52:05 kcoyle: are there other requirements - they can be added to the UC 10:52:19 ack Jaroslav_Pullmann 10:52:25 ack PWinstanley 10:52:53 PWinstanley: spatial and temporal - have a 'scaling'-property 10:53:07 q+ to ask whether time series period/schedule is covered (e.g. IoT carbon monoxide sensor readings) 10:53:53 Maybe some 'superclass'-object for scaling? 10:54:59 Describe a reference system where the scaling info originates from 10:55:04 some ontology etc 10:55:32 q? 10:55:38 q? 10:55:40 q+ 10:55:49 UC1 is such an even more general UC in https://www.w3.org/2017/dxwg/wiki/Use_Cases_and_Requirements 10:55:59 dsr: what are we up to here? 10:56:35 describing conventions or validating consistency to a reference system 10:57:18 q- in the interests of time 10:57:21 PWinstanley: do we need a place for this in DCAT? We shouldn't miss out on flexibility 10:57:21 q- 10:57:41 q+ 10:57:44 PWinstanley: keep a future-proof architecture 10:58:29 dsr: do we need to check on integrity? 10:58:43 PWinstanley: yes, but we need a chunk to make that possible 10:58:54 kcoyle: can we make a use case for this? 10:59:07 annette_g has joined #dxwg 10:59:08 PWinstanley: action: Peter W will do a use case for this 10:59:30 q?
10:59:33 q- 10:59:42 Present+ annette_g 11:00:23 Philippe has joined #dxwg 11:00:29 https://www.w3.org/2017/dxwg/wiki/Use_Cases_and_Requirements#UC1_Consistent_use_of_summary_properties_and_extension_points_with_more_detailed_domain_specific_information_models 11:00:31 roba: general use case-attempt for that one 11:00:46 feel free to edit this one 11:00:55 ack roba 11:00:57 q? 11:01:28 q+ to point also to UC14 and ID16 11:01:49 roba: job of just describing a reference is really not easy 11:02:04 a simple property defining the reference is just not enough 11:02:22 especially within spatial world 11:02:40 s/to point also to UC14 and ID16/to point also to UC14 and UC16/ 11:02:52 look at the spatial data on the web WG for that 11:02:58 not overspecify within DCAT 11:02:59 ack Keith 11:03:23 Keith: don't forget astronomical and microscopical coordinate systems 11:03:53 AndreaPerego: in the UC there is a reference to the SDW-WG 11:04:11 review the work from there and follow the best practices might be an option 11:04:12 ack AndreaPerego 11:04:12 AndreaPerego, you wanted to point also to UC14 and ID16 11:04:47 AndreaPerego: related to UC14 and UC 16 11:05:21 kcoyle: let's stick with the spatial and temporal for the time being 11:05:26 objections for having these? 11:05:46 PROPOSED: Accept ID28 and ID38 11:05:47 Relevant SDWBP, linked from UC28: https://www.w3.org/TR/sdw-bp/#bp-crs 11:06:06 +1 11:06:07 +1 11:06:09 +1 11:06:10 +1 11:06:12 +1 11:06:12 +1 11:06:13 +1 11:06:14 +1 11:06:14 +1 11:06:14 +1 11:06:14 +1 11:06:14 +1 11:06:15 +1 11:06:16 +1 11:06:17 +1 11:06:18 +1 11:06:18 +1 11:06:21 +1 11:06:27 RESOLVED: Accept ID28 and ID38 11:06:58 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID14 11:06:59 kcoyle: data quality; UC14-15 11:07:12 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID15 11:07:36 ID14 is related to ID43 11:07:45 I note that XBRL supports hypercubes as an abstract coordinate space for financial reporting data 11:07:54 Q? 
11:08:10 kcoyle: 14 about data quality and 15 about precision and accuracy 11:08:16 q? 11:08:18 q+ 11:08:19 alejandra, yep, it is. 11:08:28 ack PWinstanley 11:08:33 q+ to ask what "Provide patterns for " means here; e.g. is it showing some examples using other vocabs? 11:08:34 q+ 11:08:46 PWinstanley: 15 is a subset of 14 11:08:58 q+ 11:09:00 And constraints on usability come into that 11:09:32 q+ 11:09:41 PWinstanley: What are the things that restrict the re-use of the data? 11:09:55 Partly the data-quality and partly the collection-process 11:11:05 Maybe we should use another pluggable container like reference systems 11:11:31 q? 11:12:09 PWinstanley: we shouldn't focus on the things that we directly see at hand 11:12:12 ack danbri 11:12:12 danbri, you wanted to ask what "Provide patterns for " means here; e.g. is it showing some examples using other vocabs? 11:12:12 all the time 11:12:55 danbri: should we provide examples of other vocabularies 11:13:42 AndreaPerego: what was missing was a recommendation to follow 11:13:48 ack Thomas 11:14:24 Thomas: what I saw in these 2 use cases is another case of using reference systems. We shouldn't make lots and lots of reference system classes, but have some common modelling structure. 11:14:45 ... we shouldn't go further than that in defining DCAT. Anything going deeper is up to the profiles. 11:14:48 q? 11:15:16 ack Jaroslav_Pullmann 11:15:51 Jaroslav_Pullmann: from a commercial PoV, in order to express quality of service levels, we need that information also 11:16:10 Jaroslav_Pullmann: they ought to be valid use cases for us 11:16:23 Can we enable profiles with annotations covering: 11:16:23 * Where the estimates of precision/accuracy come from 11:16:24 * When a data point has been interpolated (e.g. lost data, broken sensor) 11:16:25 ack dsr 11:16:25 * When a sensor is no longer trusted, despite what it says 11:16:34 [ AndreaPerego - do you think the topic of modeling caveats would fit in these UCs? 
https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Jul/0041.html ] 11:17:15 [ danbri, possibly, but need to look at it more closely ] 11:17:15 q? 11:17:25 annette_g has joined #dxwg 11:17:38 dsr: a question is how to enable profiles to allow the reason/connotation on data quality issues 11:17:47 PROPOSED: Accept ID14 and ID15 11:17:51 broken sensors vs fawl data 11:17:56 +1 11:17:58 +! 11:17:59 +1 11:17:59 +1 11:17:59 +1 11:17:59 +1 11:18:01 +1 11:18:01 +1 11:18:02 +1 11:18:03 +1 11:18:03 +1 11:18:05 +1 11:18:05 +1 11:18:07 +1 11:18:07 +1 11:18:07 +1 11:18:09 _1 11:18:11 +1 11:18:11 +1 11:18:13 +1 11:18:21 s/!/+1 11:18:39 s/_1/+1 11:18:43 RESOLVED: Accept ID14 and ID15 11:19:02 kcoyle: go to ID12 and then have lunch 11:19:32 kcoyle: ID12 - data lineage 11:19:44 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID12 11:20:12 kcoyle: we probably need a place to denote the source of the data 11:20:13 does data lineage = provenance? 11:20:16 q+ 11:20:21 ack Jaroslav_Pullmann 11:20:27 Is PROV an option? 11:20:40 Jaroslav_Pullmann: provenance is important 11:20:50 Textual or structured? 11:20:59 q? 11:21:05 we need a structured, machine-readable way for this 11:21:07 q? 11:21:21 Keith, indeed, they may be related (lineage / provenance), but it depends on the definition of provenance. 11:21:36 Thomas, PROV is mentioned as an option in the UC. 11:21:43 q+ 11:21:47 alejandra: is'nt this too generic 11:21:49 thx, andrea 11:21:57 q+ 11:22:05 s/is'nt isn't 11:22:07 ack Keith 11:22:30 s/is'nt/isn't 11:22:34 Keith: vertical vs horizontical provenance including the relationship between those two 11:22:41 goes beyond PROV VOC 11:22:43 s/s/is'nt isn't/ 11:23:02 ack AndreaPerego 11:23:07 AndreaPerego: comment on provenance lineage 11:23:16 sometimes 'who did the job'? 
11:23:17 s/horizontical/horizontal 11:23:29 Keith: the ability to reconstruct the state at a specified time 11:23:31 q+ 11:24:15 you have provenance on the dataset-level and provenance on the agent roles, workload, ... 11:24:27 q+ 11:24:27 ack PWinstanley 11:24:30 s/you AndreaPerego :/ 11:24:56 PWinstanley: you have also instances where the data is being used etc 11:25:10 don't strictly belong to the provenance of the dataset 11:25:16 Quoting from the DC definition of dct:provenance: "A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation." 11:25:57 kcoyle: we don't have anything that goes beyond the strict provenance 11:26:44 PWinstanley: when a dataset is used in different contexts, the meaning/nature of the dataset might change but the dataset itself isn't 11:27:07 q+ 11:27:27 PWinstanley: we could have an 'event' and a 'transition' (transition = change; event = not changed) 11:27:40 all can adhere to 'provenance' 11:28:14 kcoyle: are we describing another requirement 11:28:24 q+ 11:28:25 danbri_ has joined #dxwg 11:28:46 PWinstanley: want to leave that to be decided 11:28:47 ack Jaroslav_Pullmann 11:29:26 Jaroslav_Pullmann: UC 'funding sources' is another related aspect - that UC is linked to this one 11:29:36 also very strong related to versioning 11:29:55 ack roba 11:30:03 ACTION: Jaroslav_Pullmann will link these 11:30:03 Created ACTION-16 - Will link these [on Jaroslav Pullmann - due 2017-07-24]. 11:30:54 roba: what is the goal for bringing extra properties into DCAT? 
11:31:14 ack AndreaPerego 11:32:26 PROPOSED: Accept ID12 11:32:31 annette_g has joined #dxwg 11:32:40 AndreaPerego: we should also take into account the goal to which the dataset should be used 11:32:51 (andrea: correct me if I'm wrong please) 11:33:17 Jaroslav_Pullmann: we should understand why provenance should be modeled 11:33:30 it isn't clear this time 11:33:47 kcoyle: that's what happens when we pull the requirements out of the use cases 11:33:49 Thanks, Thomas. The 2 purposes I see are: data reproducibility and fitness for purpose. 11:34:01 thx 11:34:12 +1 11:34:12 +1 11:34:15 +1 11:34:16 +1 11:34:17 +1 11:34:17 +1 11:34:17 +1 11:34:17 +1 11:34:18 +1 11:34:19 +1 11:34:20 +1 11:34:21 +1 11:34:21 +1 11:34:22 +1 11:34:23 +1 11:34:36 +1 11:34:49 RESOLVED: Accept ID12 11:34:58 kcoyle: declares lunch 11:35:00 +1 11:35:01 resume in one hour 11:35:05 +! 11:35:12 s/! 1/ 11:35:15 s/+!/+1/ 11:35:19 bye 11:35:22 Bye. 11:36:44 have other commitments tomorrow eve - so will be joining later in your morning for a bit. 11:37:21 newton has joined #dxwg 11:42:31 Makx has joined #dxwg 11:47:00 newton has joined #dxwg 12:26:38 present+ Makx 12:29:44 Where are we on the agenda? 12:30:32 LarsG has joined #dxwg 12:31:10 OK so I just missed the whole item on Quality. Pity. 12:31:38 dsr has joined #dxwg 12:33:25 kcoyle has joined #dxwg 12:33:29 danbri has joined #dxwg 12:34:12 newton has joined #dxwg 12:35:13 annette_g has joined #dxwg 12:35:32 Caroline_ has joined #DXWG 12:35:37 chair: Caroline_ 12:37:10 LuizBonino has joined #dxwg 12:37:11 Topic: Data Quality 12:37:14 Topic: data quality 12:37:23 Use Case https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID16 12:37:24 ID16: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID16 12:38:27 is it a duplicate of ID43? 12:38:35 q+ 12:38:59 (we break for lunch) 12:39:13 q? 
12:39:24 ack antoine 12:39:37 s/antoine/AndreaPerego 12:39:47 ack Andr 12:39:49 antoine: wondering if Andrea was joking, there was a lot of discussion about this use case 12:39:52 s/AndreaPerego/antoine 12:39:55 Thomas has joined #dxwg 12:39:56 Jaroslav_Pullmann has joined #dxwg 12:39:59 ... in the DWBP WG 12:40:18 ... test whether a dataset complies with a given standard, one wants to record this 12:40:27 ... this was a use case in the data quality vocabulary 12:40:37 ... we ended up not being able to implement what Andrea wanted 12:41:12 ... should I reformulate the discussion or confirm that the use case is still relevant? 12:41:24 q? 12:41:25 Caroline_: do people think it is out of scope or relevant? 12:41:28 q+ 12:41:35 ack Jaroslav_Pullmann 12:41:52 q+ 12:41:53 Jaroslav_Pullmann: expression of quality is hard to express or assess without reference to evaluation criteria 12:41:54 q+ 12:42:06 kcoyle: same as ID14 12:42:07 q? 12:42:11 ack LarsG 12:42:25 LarsG: same question about providing a hook in DCAT core to provide these things? 12:42:37 ... or is it outside of DCAT core? 12:42:47 ack LuizBonino 12:42:51 Q+ 12:43:06 LuizBonino: my understanding is about compliance with something (standard, quality parameter, etc) 12:43:15 ... then there will be a validator to check the compliance 12:43:23 ... compliance is context-dependent 12:43:23 q+ to note that lots of UCs have this structure - a reasonable usecase that may likely be beyond core and addressed by dcat + another vocab. Will we make a collection of the other vocabs? 12:43:24 For the record here is the part about it in DQV, with a note about our resolution in DWBP: https://www.w3.org/TR/vocab-dqv/#ExpressConformanceWithStandard 12:43:27 ack annette_g 12:43:31 annette_g: 12:43:43 annette_g: there is a lot of overlap with the data quality vocabulary 12:43:45 Thanks, Antoine. 
12:44:10 q+ 12:44:13 ack danbri 12:44:13 danbri, you wanted to note that lots of UCs have this structure - a reasonable usecase that may likely be beyond core and addressed by dcat + another vocab. Will we make a 12:44:14 ... maybe we need to discuss how to address that with DCAT and at this point, not how to deal with it 12:44:16 ... collection of the other vocabs? 12:44:19 q+ to explain about the specificity of this UC 12:44:32 q? 12:44:36 danbri: a lot of the UCs indicate that we can grow the DCAT core very quickly 12:44:53 q+ 12:44:53 ... should we collect a list of useful extras? 12:44:56 +q 12:45:02 ack kcoyle 12:45:13 can we get requirements from UCs and then decide what is core and what not? 12:45:27 -q 12:45:55 kcoyle: we need a document advising about what fits in 12:46:05 alejandra, ... sorry I meant to say the opposite. Rather that we keep getting UCs where we could make a small change to the core but not address the full use case. 12:46:15 sorry! 12:46:19 Also for the record, I could dig the issue about Andrea's suggestions: https://www.w3.org/2013/dwbp/track/issues/202 12:46:24 ack AndreaPerego 12:46:24 AndreaPerego, you wanted to explain about the specificity of this UC 12:46:28 The charter makes provision for work on a primer 12:46:40 q+ 12:46:41 ... and that we don't have a repeatable answer, such as "we'll add this to our 'useful multi-vocabularies cookbook' page/document" 12:47:00 https://lists.w3.org/Archives/Public/public.../att.../DCAT-APimplementationguide.pdf 12:47:04 AndreaPerego: this UC is not included in data quality 1 because we came across it when using DCAT for spatial metadata 12:47:36 ... you should be able to express conformity and non-conformity 12:47:43 ... important for discovery purposes 12:47:58 ... what are the data that need to be modified to be conformant 12:48:15 ... 
general UC wasn't explaining these specific issues 12:48:47 https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Jul/att-0010/DCAT-APimplementationguide.pdf 12:48:50 ... we identified this in the implementation of DCAT 12:48:58 ... in some cases, we found a solution, not in others 12:49:08 ... 90% of these use cases are meta-UCs 12:49:17 ... use of DCAT for supporting cross-domain interoperability 12:49:24 ... there is always a reference to other standards 12:49:36 ... we want to support interoperability across metadata standards 12:49:50 ... we had to address this problem on how data standards are modeling things 12:49:58 ack LuizBonino 12:50:17 LuizBonino: the majority of the use cases we discussed seem suitable for extensions around datasets 12:50:27 ... other parts may interfere on the structure of DCAT as it is now 12:50:49 ... if you have an approach for versioning, you have a version and a distribution, the distribution should not be attached to the dataset anymore but to the version 12:50:49 @PWinstanley https://joinup.ec.europa.eu/asset/dcat-ap_implementation_guidelines/description are based on actual problems brought forward by implementers. 12:51:18 ack kcoyle 12:51:24 ... we need to define the profile description method to define how people are going to use this 12:51:42 kcoyle: I edited some UCs related to this 12:51:57 ... e.g. in ID42 12:52:02 RRSAgent, draft minutes v2 12:52:02 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 12:52:10 ... this is about the dataset itself and not about the data itself 12:52:22 q? 12:52:26 ... I don't know if it needs to be brought to the level of DCAT 12:52:41 LuizBonino: we have the dataset and the distribution, and we have the metadata about the semantics 12:53:14 ... each distribution matches to the generic concept 12:54:07 ... the constraints on what you have to provide is what I would consider a profile 12:54:22 kcoyle: a picture would be good 12:54:30 q? 
12:56:33 PROPOSAL: accept ID16 12:56:43 +1 12:56:44 +1 12:56:46 +1 12:56:46 +1 12:56:46 +1 12:56:48 +1 12:56:48 +1 12:56:50 +1 12:56:53 +1 12:56:57 +1 12:56:58 +1 12:57:06 +1 12:57:09 +1 12:57:10 +1 12:57:17 +1 - yes to in scope as a use case 12:57:24 I agree it is very similar to 43 12:57:25 is this overlapping with ID43? 12:57:32 +1 12:57:50 @antoine French headphone level limit? 12:57:51 Overlap is okay, no? 12:58:02 Yes 12:58:15 +1 12:58:22 LuizBonino: explaining diagram 13:00:05 diagram here also useful: https://www.w3.org/TR/hcls-dataset/ 13:00:49 see https://twitter.com/danbri/status/886933651178573824 13:00:58 can't you share a screen on Webex? 13:01:39 LuizBonino: dataset can be extended with any profile 13:01:54 ... distinction between dataset, version/release, distribution 13:01:54 q+ 13:02:09 ack Makx 13:02:39 Makx: this is a substantial model change to DCAT 13:02:50 kcoyle: this is LuizBonino's current model 13:02:58 ... it doesn't mean that we will follow this model 13:03:09 Makx: we need to be very careful in making substantial model changes 13:03:12 q? 13:03:19 ... as it can break current implementations 13:03:24 q+ 13:03:29 +1 to Makx 13:03:36 RESOLVED: accepted ID 16 13:03:45 +1 13:04:05 q? 13:04:06 Caroline_: now discussing ID23: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID23 13:04:11 ack riccardoAlbertoni 13:04:34 riccardoAlbertoni: I'll give some context to this UC that we collected from the DQV 13:04:55 ... Antoine, other people and I contributed 13:05:10 ... meta-use case, data quality is very important for any reuse of data 13:05:20 ... data collected in the past is considered also in this group 13:05:38 ... it seems that there is some overlap in the use cases that Andrea proposed 13:05:51 ... even though from a different perspective 13:07:03 ... some concrete case studies, how to identify integrity constraints (e.g.) 13:07:30 ... 
depending on how far we want to go in data quality within DCAT, there is some DQV housekeeping 13:07:38 ... DQV was released last December 13:07:57 ... it would be great for us to have the possibility to make small changes 13:08:01 q? 13:08:39 antoine: general question on what should be the position of the DQV WG in terms of the core and profiles for DCAT 13:08:44 q+ 13:08:53 riccardoAlbertoni: it is quite difficult to define how far we have to go in data quality 13:09:03 s/DQV WG/DQV 13:09:13 ... this discussion should consider the UCs presented by AndreaPerego 13:09:24 ack Makx 13:09:52 Makx: two comments on DQV 13:10:06 ... it makes sense to use UCs as a good point to see how DQV can be attached to DCAT 13:10:27 ... the one that danbri came up in the last couple of days w.r.t caveats on statistical data 13:10:35 Caveats discussion: https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Jul/0041.html (caveat/footnotes even at data item level) 13:10:42 ... I've got a use case I forgot to put it 13:10:56 statDCAT AP, https://joinup.ec.europa.eu/node/147940 13:10:59 ... people have included annotations of DQV to datasets 13:11:07 ... I will write that UC 13:11:16 ... and danbri can write the other UC 13:11:29 Caroline_: yes, please write more use cases 13:11:43 annette_g has joined #dxwg 13:11:50 Just to note that another case for the use of DQV in DCAT is UC15, where DQV is indicated in the existing approaches, and mentioned also in SDWBP: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID15 13:11:57 q? 13:12:02 action: danbri write UC for data-item level caveat annotations 13:12:03 Created ACTION-17 - Write uc for data-item level caveat annotations [on Dan Brickley - due 2017-07-24]. 13:12:28 PROPOSAL: accept ID23 (https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID23) 13:12:30 +1 13:12:38 +! 
13:12:38 +1 13:12:40 +1 13:12:41 +1 13:12:44 +1 13:12:45 +1 13:12:48 s/+!/+1/ 13:12:49 +1 13:12:52 +1 13:12:52 will you give me an action for DQV annotiation? 13:12:58 +1 13:12:59 +1 13:12:59 +1 13:13:02 kcoyle: there will be a need to tease out lots of requirements 13:13:09 +1 13:13:13 +1 13:13:29 +1 13:13:33 +1 13:13:36 ACTION: Makx to create a UC for DQV annotation 13:13:36 Created ACTION-18 - Create a uc for dqv annotation [on Makx Dekkers - due 2017-07-24]. 13:14:03 RESOLVED: accepted ID23 13:14:27 q? 13:14:29 Next UC ID19: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID19 13:15:04 AndreaPerego: this is a general or meta UC 13:15:17 ... modelling different types of information 13:15:40 ... e.g. input data 13:15:56 ... property linking a dataset with the publisher and the author 13:16:08 ... you may need to attach to these relationships some additional information 13:16:15 ... such as the temporal context 13:17:44 ... general use case where we need some guidance on how to provide this information 13:17:59 ... it can be applied for any type of information to be attached to a relationship 13:18:31 q? 13:18:37 q+ 13:18:40 ack kcoyle 13:19:08 kcoyle: in our UCs we have mixed up UCs about the dataset and the data in the dataset 13:19:15 ... we need to tease those apart 13:19:24 ... as we may want to address them differently 13:19:47 ... there may be statements that we may want to make about the data / data semantics 13:19:59 q? 13:20:01 q+ 13:20:04 ... we haven't made that distinction in the discussion 13:20:30 AndreaPerego: when we talk about dataset and when we talk about the data itself 13:20:42 ... do we have this in DCAT already? DataRecord and Dataset 13:20:55 s/DataRecord/CatalogRecord/ 13:21:09 kcoyle: 13:21:20 LuizBonino_ has joined #dxwg 13:21:28 kcoyle: are we ok in mixing those or do we need to keep them separately? 13:21:47 ... 
if we need to say something about quality, we need to say quality about what 13:22:05 AndreaPerego: we had these discussions in the DQV and we concluded that in most cases we are talking about data 13:22:14 ... we can use the same approach in data or metadata 13:22:24 ... this is a topic that may need further discussion 13:22:27 ack Jaroslav_Pullmann 13:22:47 annette_g has joined #dxwg 13:22:56 Jaroslav_Pullmann: I didn't make this difference 13:23:03 ... dataset is about whatever data is behind 13:23:12 ... I think this is related to what Rob was proposing 13:23:13 q+ 13:23:30 ... atomic properties and specific descriptions 13:23:52 ... I think this is a general approach of modelling 13:24:13 ... atomic properties and complex descriptions, which the UC calls qualified descriptions 13:24:23 ... what does qualified form mean? 13:24:30 AndreaPerego: this is from PROV-O 13:25:12 ... different ways of representing the same information: the core, the extended, and the qualified 13:25:22 ... reified representation of a relationship 13:25:33 ... where you can attach additional information to a relationship 13:25:45 PROV properties with qualified forms: https://www.w3.org/TR/prov-o/#inverse-names-table 13:26:20 ... e.g. prov:qualifiedAssociation 13:26:47 ... I don't know if there is a better term, but the idea is to have another relationship to add more information 13:27:01 ... bridge between the dataset and the source data 13:27:16 ... where you can attach more information on when the data was processed and so on 13:27:31 Jaroslav_Pullmann: simple atomic properties and qualified properties 13:27:31 q? 13:27:42 ... is there a concrete suggestion on when this pattern applies? 13:27:52 ... e.g. for quality, accuracy 13:28:05 ... how would you restrict the application of this pattern? 13:28:16 Q- 13:28:26 AndreaPerego: personally I would stick to what we define as concrete requirements 13:28:40 ... 
I would rely on what the community used in DCAT to identify what is relevant 13:29:26 ... the point is to have concrete use cases where people want to specify concrete information and either they can't do it or they do it differently every time 13:29:39 Jaroslav_Pullmann: would this be an extensibility pattern for DCAT? 13:30:05 AndreaPerego: yes, I think so - but you cannot be sure if the proposal is universally applicable 13:30:12 ... unless you collect use cases 13:30:32 q? 13:31:01 PROPOSAL: accept ID19 as relevant use case https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID19 13:31:09 +1 13:31:11 +1 13:31:14 +1 13:31:15 +1 13:31:17 +1 13:31:18 +1 13:31:19 +1 13:31:19 +1 13:31:19 +1 13:31:19 +1 13:31:19 +1 13:31:20 +1 13:31:21 +1 13:31:25 +1 13:31:25 +1 13:31:56 +1 13:31:59 RESOLVED: ID19 accepted 13:31:59 +1 13:32:15 Topic: DCAT general 13:32:52 Next UC: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID26 13:32:57 ID26 13:33:23 ack Jaroslav_Pullmann 13:33:38 Jaroslav_Pullmann: this is a meta UC 13:34:06 ... I consider DCAT as a core where one would attach properties with specific standards/vocabularies 13:34:06 q? 13:34:18 annette_g has joined #dxwg 13:34:33 UTC :-) 13:34:55 q+ 13:35:00 Jaroslav_Pullmann: identify what are the specific aspects that need an extension 13:35:05 ack kcoyle 13:35:07 ... and identify a property for each of them 13:35:33 kcoyle: are you anticipating that there will be particular vocabularies that DCAT needs to consider? 13:35:41 q+ to suggest UC bakes in a specific kind of solution 13:35:57 kcoyle: how open should we be? 13:36:11 ... should we decide for each element whether there are recommended vocabularies? 13:36:24 +q 13:36:31 q+ 13:36:32 Jaroslav_Pullmann: what should happen is an analysis of what is there 13:36:45 ... and have simple properties for describing simple stuff 13:36:46 q? 13:36:53 ... 
and attach further descriptions 13:37:08 ack danbri 13:37:08 danbri, you wanted to suggest UC bakes in a specific kind of solution 13:37:20 danbri: I like the general intent 13:37:34 ... the current formulation of the UC assumes a specific technical approach 13:37:51 ... LuizBonino diagram includes a release structure 13:38:10 ... Makx's pointed out on being careful on changing the structure 13:38:22 ... maybe it would be good to rephrase not to consider specific implementation 13:38:38 ... we shouldn't assume that a specific property is the solution 13:38:38 ack Makx 13:38:40 q? 13:39:02 Makx: I have difficulties understanding how this would work 13:39:22 ... danbri indicates that we shouldn't mention properties, but in an RDF world we need to speak about properties 13:39:35 RRSAgent, draft minutes v2 13:39:35 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 13:39:38 ... these 6 bullet points seem to imply that there are separate properties for separate extensions 13:39:46 ... but we already have a potential provenance one 13:40:00 ... we already have specific ones for temporal and spatial coverage 13:40:03 q+ 13:40:30 Makx, my point was that in some extreme/complex important cases the WG might actually restructure DCAT's overall pattern with new types (e.g. Release/Version as in Luiz's diagram) 13:40:30 ... I'm not quite sure on what the proposal is 13:40:38 ... if a catch-all approach 13:40:44 ... with loose semantics 13:40:52 ... or identifying what extensions are needed 13:41:01 ... I prefer the latter 13:41:05 ack LarsG 13:41:23 LarsG: I'm pro having extension points in DCAT, I don't think we should mandate which vocabularies to use 13:41:31 +q 13:41:38 ... here you can put provenance, you may want to use PROV 13:41:42 Q+ 13:41:58 ack Jaroslav_Pullmann 13:42:09 ... 
but we shouldn't say 'you must use PROV', as this is getting into the area of profiles 13:42:27 Jaroslav_Pullmann: benefit of this use case is to identify the important aspects that should not be forgotten 13:42:39 q+ 13:43:01 q+ 13:43:21 ... there should be a dedicated property that relates to whatever specification of this aspect 13:43:26 ack Makx 13:43:30 Makx: I wanted to react to what LarsG was saying 13:43:44 ... we are absolutely not in a position to recommend vocabularies 13:44:06 ... we can only provide a property where people can put whatever they think it is relevant 13:44:08 ack annette_g 13:44:23 annette_g: I agree with LarsG, it would be the role of a profile to define what extensions to use 13:44:40 ... we can say in a profile 'we will use PROV-O' 13:44:43 ... but not in the core 13:45:13 +q 13:45:17 kcoyle: we can provide guidance 13:45:28 s/Topic: Data Quality/scribe: alejandra/ 13:45:30 RRSAgent, draft minutes v2 13:45:30 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 13:45:31 -q 13:45:34 ack alejandra 13:45:57 alejandra: for each element listed in the UC there are specific use cases 13:45:57 alejandra: there are other use cases that are not specific 13:46:08 s/alejandra: there are other use cases that are not specific/ 13:46:35 ... there are areas that might not be relevant for DCAT but for profiles 13:46:52 q? 13:46:59 Jaroslav_Pullmann: it's about grouping the other UCs 13:47:11 ... we have extension points where we can hook that in 13:47:21 so, this UC is for grouping the other UCs 13:47:23 s/scribe: alejandra/scribenick: alejandra/ 13:47:25 RRSAgent, draft minutes v2 13:47:25 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 13:47:31 scribe: LarsG 13:47:35 Thomas: It would be useful for clients to know how to handle that 13:47:43 ack antoine 13:47:56 antoine: wants to continue on alejandra 's point 13:48:11 ... 
UCs are not very specific, but more meta 13:48:38 ... every property in DCAT can be seen as an extension point 13:48:40 ack Jaroslav_Pullmann 13:48:56 Jaroslav_Pullmann: it wasn't ment to be implemented in the model 13:48:59 good point Antoine 13:49:04 q+ 13:49:11 ... more a hint that we need to consider this in the model 13:49:14 s/ment to/meant to/ 13:49:32 ... it's a meta use case listing what I think is important 13:49:39 ack antoine 13:49:55 antoine: so it's more like a design principle 13:50:08 +q 13:50:14 ack Jaroslav_Pullmann 13:50:19 ... "if I want to extend DCAT this is what I should consider" 13:50:49 Jaroslav_Pullmann: if we agree on accepting this UC, Jaroslav_Pullmann would have a task 13:50:51 q+ 13:50:53 ack Makx 13:50:55 ... to think about this 13:51:31 Makx: when we worked on the European profile this is exactly what we did (describing extension points) 13:51:53 ... if there is a large group of people that want the same thing, we could go back to DCAT and add it 13:52:29 ... so the six bullet points in the UC, there might be 200, but in the end we need to figure 13:52:41 q+ 13:52:50 ack antoine 13:52:50 ... out which ones we want: what goes into DCAT core and what is profile 13:53:07 antoine: agrees with Makx, would be in favour of accepting the UC 13:53:19 ... suggests that Jaroslav_Pullmann focuses on the meta aspect 13:53:33 ... should phrase it as a methodological point 13:53:50 +100 to antoine 13:53:53 ack Jaroslav_Pullmann 13:53:54 ... "if you have own needs, we have a methodology for creating profiles" 13:54:17 Jaroslav_Pullmann: The UC was a first shot. Compared to the ISO standards there is an aspect 13:54:26 ... of maintenance that isn't covered in DCAT 13:54:49 ... DCAT needs a reference to that 13:55:06 ... those are aspects that are usually covered by specific vocabularies 13:55:38 PROPOSED: Accept ID26 13:55:44 +1 13:55:45 +1 13:55:47 +1 13:55:49 +1 13:55:53 with editing! 
13:55:54 +1 13:55:57 +1 13:56:02 +1 13:56:04 with editing 13:56:08 +1 13:56:11 PROPOSED: Accept ID26 with editing by Jaroslav_Pullmann 13:56:11 +1 13:56:13 +1 subject to editing the text 13:56:14 +1 with editing 13:56:15 W/e 13:56:17 +1 13:56:19 q+ 13:56:24 +1 13:56:32 ack Keith 13:56:34 Jaroslav_Pullman, suggested edit: "extension points (properties) " -> "extension points (typically properties)" 13:56:36 +1 13:57:10 s/scribe: LarsG/scribenick: LarsG/ 13:57:13 RRSAgent, draft minutes v2 13:57:13 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 13:57:29 q? 13:57:30 ACTION: Jaroslav_Pullmann to edit ID26 13:57:33 Created ACTION-19 - Edit id26 [on Jaroslav Pullmann - due 2017-07-24]. 13:58:09 [discussion about how to get from Use Cases to Requirements...] 13:58:34 q? 13:58:48 RESOLVED: Accept ID26 with editing by Jaroslav_Pullmann 13:59:18 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID33 13:59:20 ack alejandra 13:59:46 alejandra: we need a way to provide an overview of data 13:59:49 q+ 13:59:53 ack kcoyle 13:59:56 ... could be statistics 14:00:08 .. and might go into a profile 14:00:12 Ixchel has joined #dxwg 14:00:27 https://www.w3.org/TR/hcls-dataset/#s6_6 (mentioned by alejandra) seems to use VoID for rdf triple stats 14:00:29 kcoyle: danbri said that search engines are better at text than at numbers 14:00:44 +q 14:00:56 ... so like an abstract in a paper, an overview could improve discovery 14:01:00 ... can DCAT help there 14:01:05 ack alejandra 14:01:17 (text and also entities that can be found via textual queries, rather than raw decontextualized numbers) 14:01:26 alejandra: it's more than just a description, but telling potential users of how much data to expect 14:01:32 ... ten patients or 1000 14:01:47 ... also how many triples etc 14:01:53 +q 14:02:01 ... 
but hard to generalise so might go into a profile 14:02:12 PWinstanley: so it won't be text 14:02:29 kcoyle: if it's not in a particular format, is it for dieplay? 14:02:40 LuizBonino: it might be for validation 14:02:57 ... if you use a profile you want to check 14:03:11 s/dieplay/display 14:03:16 q+ 14:03:26 ... in the profile you need to attach the vocabulary that describes how many patients 14:03:46 ack Makx 14:03:52 ... and then you can validate: does the metadata contain this statistical information? 14:04:36 Makx: we tried some of that. DCAT only has byteSize (not clear to everybody). When you start talking about how many things are in your DS, there are many 14:04:56 ... different ways to define that and that is specific to the use of the DC 14:05:02 s/DC/DS/ 14:05:30 ... so this is community-specific and can hardly be generalised => profile 14:05:31 ack antoine 14:05:44 I agree Makx - maybe to consider for AP guidelines 14:05:48 annette_g has joined #dxwg 14:05:59 https://www.w3.org/2013/dwbp/track/issues/164 14:06:04 https://www.w3.org/2013/dwbp/track/issues/189 14:06:05 antoine: sends around a couple of pointers from the data quality work. 14:06:14 ... there statistical information was very important 14:06:33 ... they point to initiatives about statistics that were considered relevant 14:06:37 riccardoAlbertoni has joined #dxwg 14:06:49 ... agrees with Makx that counting is very difficult 14:06:50 https://www.w3.org/TR/void/#statistics 14:06:53 present+ 14:07:16 +q 14:07:23 ... void has some counting properties, and even if we don't incorporate 14:07:28 ack alejandra 14:07:41 ... void into DCAT there are similarities that might satisfy this DCAT 14:07:59 alejandra: void is specific to RDF so could be more a guidance 14:08:06 q+ 14:08:19 q+ 14:08:29 ... 
but this UC could be about a generic pattern how to count things 14:08:51 ack antoine 14:08:55 Thomas: if you leave semantics behind you can count anything, so we need to stay within the domain 14:09:17 q+ 14:09:35 q+ to report a bioschemas discussion on "Data record" structures that relates 14:09:41 ack PWinstanley 14:09:42 antoine: we should at least be able to say why we didn't look at these issues 14:10:25 ack Jaroslav_Pullmann 14:10:34 PWinstanley: if we were to put the summary as an XMLLiteral the search engines could pick that up and leave a hook for people to provide structured data 14:11:07 q? 14:11:20 Jaroslav_Pullmann: usually it's important to provide the range of a property to give hints: Do we plan to do that in our ontologies? 14:11:40 q+ 14:11:42 PWinstanley: we're not obliged to, it's an additional layer of modelling 14:11:55 ack danbri 14:11:55 danbri, you wanted to report a bioschemas discussion on "Data record" structures that relates 14:11:56 Jaroslav_Pullmann: but that's an important part of modelling 14:12:44 danbri: It depends on how static your ontology is. schema keeps changing so they are very conservative with domains and ranges 14:13:12 ack kcoyle 14:13:23 http://bioschemas.org/ 14:13:34 ... Bioschemas do much typing (rows/columns) and it would be good if DXWG do the same 14:14:06 kcoyle: domains and ranges could be part of a profile, not necessarily DCAT 14:14:23 ... we saw value in having a DataRecord view into contents of a dataset, but even a simple multi-table dataset has two obvious representations as a set of records (1. entities 2. table rows). Either or both may be useful. 14:14:26 q? 14:14:29 ... profiles can not only add new elements but also add constraints to existing ones 14:15:38 alejandra: much of it could be put into a profile 14:15:59 ...
healthcare data is important because it's often not freely available 14:16:50 PROPOSED: Accept ID33 14:16:56 +1 14:16:57 +1 14:16:59 +1 14:17:00 +1 14:17:01 +1 14:17:02 +1 14:17:02 +1 14:17:02 +1 14:17:02 +1 14:17:02 +1 14:17:04 +1 14:17:07 +1 14:17:09 +1 14:17:12 +1; curious about the requirement here 14:17:23 RESOLVED: Accept ID33 14:17:27 +1 14:17:41 +1 14:17:53 next: UC35 14:18:10 ack Makx 14:18:20 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID35 14:19:04 Makx: we have had this discussion before: There is a dataset and a description of it but no designated catalogue (or no catalogue at all) 14:19:18 ... like people creating datasets of their own. 14:19:21 +q 14:19:22 q? 14:19:25 q+ to suggest RDF vocabs don't do 'mandatory' 14:19:26 q+ 14:19:27 q+ 14:19:30 ack alejandra 14:19:40 q+ 14:19:43 ... Is DCAT the data _catalogue_ or about datasets, too 14:19:46 q+ 14:19:47 ack danbri 14:19:49 danbri, you wanted to suggest RDF vocabs don't do 'mandatory' 14:20:21 ... you could use DCAT to describe a dataset before it's made part of a catalogue 14:20:43 danbri: doesn't see a big problem. Vocabularies just provide useful terms 14:20:56 ack Keith 14:21:02 ... can provide some statistics from google 14:21:31 ack LuizBonino 14:21:31 Keith: catalogues are created by manual creation or by harvesting from other catalogues, so it's not a problem 14:21:52 LuizBonino: the focus of DCAT is the dataset and not the catalogue 14:22:16 q+ 14:22:17 q- 14:22:26 ... the model isn't clear, though. It should be possible to have datasets without a catalogue, so we need to fix the cardinality 14:22:28 ack Jaroslav_Pullmann 14:23:07 ack Thomas 14:23:39 Jaroslav_Pullmann: one issue could be that the concepts/topics are part of the catalogue and would be lost for datasets without it 14:24:12 dcat:Dataset definition: A collection of data, published or curated by a single agent, and available for access or download in one or more formats.
14:24:19 and of course the catalogs are themselves all datasets 14:24:25 doesn't mention catalogue 14:24:46 First sentence from DCAT does: "DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web." 14:24:47 PROPOSED: Accept ID35 14:24:50 +1 14:24:53 +1 14:24:53 +1 14:24:56 +1 14:24:56 +1 14:24:56 +1 14:24:56 +1 14:24:57 +1 14:24:57 +1 14:24:58 annette_g has joined #dxwg 14:24:58 +1 14:24:58 +1 14:24:58 +1 14:24:58 +1 14:24:58 +1 14:25:01 +1 14:25:01 Interesting - https://www.w3.org/TR/vocab-dcat/#class-dataset is quite restrictive. Whereas https://www.w3.org/TR/vocab-dcat/#introduction is quite general about data. 14:25:02 +1 14:25:05 +0 14:25:13 +1 14:25:36 Section 1 is non-normative 14:27:27 kcoyle: Yes, since it's called IFLA-LRM now ;-) 14:27:45 What does "published or curated by a single agent" mean? If two people publish something together must we treat them as an Organization to meet this semantic? 14:27:53 s/kcoyle: Yes, since it's called IFLA-LRM now ;-)// 14:28:22 RESOLVED: Accept ID35 14:28:23 I agree, Lars 14:29:14 OK 14:29:22 Caroline_: we're discussing more than the chairs had planned 14:29:44 q+ 14:29:55 LuizBonino: What about the people who participate remotely in specific timeslots?
14:30:32 ack antoine 14:30:52 kcoyle: it's specifically about profile negotiation where Ruben wanted to call in, but LarsG is here to cover that 14:31:26 antoine: would want to have ID37 moved up since he can only join until 3pm 14:31:47 Coffee break until 4pm 14:31:59 RRSAgent, draft minutes v2 14:31:59 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 14:32:25 ^^ 5PM CEST 14:32:30 when will you be back from break 14:32:38 in 30 minutes 14:32:54 scribe: alejandra 14:32:58 scribe: LarsG 14:33:03 RRSAgent, draft minutes v2 14:33:03 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 14:36:47 newton has joined #dxwg 14:39:23 newton has joined #dxwg 14:48:16 newton has joined #dxwg 14:51:45 LarsG has joined #dxwg 14:52:42 annette_g has joined #dxwg 15:01:54 newton has joined #dxwg 15:04:10 Caroline has joined #DXWG 15:04:46 annette_g has joined #dxwg 15:04:49 kcoyle has joined #dxwg 15:04:53 Present+ 15:04:57 newton has joined #dxwg 15:04:59 present+ 15:05:18 Jaroslav_Pullmann has joined #dxwg 15:05:26 present+ 15:07:23 Topic: https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID40 15:07:31 Present+ annette_g 15:07:37 scribe: Jaroslav_Pullmann 15:07:44 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID40 15:07:46 scribenick: Jaroslav_Pullmann 15:07:56 present+ 15:08:06 present+ 15:08:42 reading the use case ID40 15:10:19 q? 15:10:25 q+ 15:10:33 ack kcoyle 15:10:48 q+ 15:10:53 PWinstanley has joined #dxwg 15:11:05 kcoyle: what part of dcat should be aligned with Schema.org? 15:12:01 kcoyle, I made a comment on this topic here http://lists.w3.org/Archives/Public/public-dxwg-wg/2017Jul/0052.html 15:12:12 danbri has joined #dxwg 15:12:19 annette_g has joined #dxwg 15:12:27 q+ 15:13:03 danbri: summarizing about the evolution approach of Schema.org 15:13:59 kcoyle: the question remains - how do we get both aligned (via sameAs etc.)?
15:14:17 can we please respect speaker queue? 15:15:08 kcoyle: all the properties are in dcat/dct namespace .. 15:15:09 kcoyle was on her turn :) 15:15:14 ack Keith 15:15:59 view-source:http://schema.org/Dataset 15:16:07 15:16:10 q+ 15:16:36 ack Keith 15:16:40 ack Makx 15:17:20 Keith: suggestion to use a Schema-annotated HTML page to make catalog/datasets accessible (~ landing page) 15:17:35 How are data catalogues discovered? One idea is to embed schema.org tags in web pages as a means to discover catalogues, and then use the DCAT vocabulary for further queries 15:18:24 q? 15:18:55 q+ 15:18:56 Makx: rewrite the use case to exemplify the publishing process 15:18:59 ack danbri 15:19:08 Keith: we need a way to expose DCAT to schema.org 15:20:10 RRSAgent, draft minutes v2 15:20:10 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 15:20:37 danbri: there are multiple classes in Schema.org that might fit the individual Dataset notion like Product, CreativeWork 15:21:26 q? 15:21:37 q+ 15:22:46 are we talking about the use case or arguing dumping DCAT in favour of schema.org? 15:22:59 danbri: describes how the metadata is being extracted and processed out of the web pages, there should be ampping to this Schema.org subset from DCAT 15:24:09 s/ampping/mapping/ 15:24:46 ack kcoyle 15:24:50 q+ 15:25:08 see https://developers.google.com/search/docs/data-types/datasets and associated blog post 15:25:30 i.e. https://research.googleblog.com/2017/01/facilitating-discovery-of-public.html 15:25:43 +1 to kcoyle 15:25:46 q? 15:26:00 kcoyle: approach to be best found by search engines via web pages annotated with Schema.org metadata? 15:26:29 q+ 15:27:39 q+ 15:27:45 ack Jaroslav_Pullmann 15:28:39 Jaroslav_Pullmann: in the level of a dataset which may be dynamic we don't need a API endpoint 15:29:05 we can annotate with some of the schema.org elements 15:30:27 which of these properties could be exported to schema.org?
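The question of which properties could be exported to schema.org can be illustrated with a minimal sketch. It assumes a flat, dictionary-style DCAT description and a small term mapping chosen only for illustration; the mapping table and the function names `to_schema_org` and `as_script_tag` are hypothetical and are not an alignment agreed by the group.

```python
import json

# Illustrative term mapping only (an assumption for this sketch,
# not an alignment agreed by the working group).
DCAT_TO_SCHEMA = {
    "dct:title": "name",
    "dct:description": "description",
    "dcat:keyword": "keywords",
    "dct:license": "license",
    "dcat:landingPage": "url",
}


def to_schema_org(dcat_dataset: dict) -> dict:
    """Build a schema.org Dataset JSON-LD object from a flat DCAT-style dict.

    Terms without an entry in DCAT_TO_SCHEMA are silently dropped.
    """
    out = {"@context": "https://schema.org/", "@type": "Dataset"}
    for dcat_term, schema_term in DCAT_TO_SCHEMA.items():
        if dcat_term in dcat_dataset:
            out[schema_term] = dcat_dataset[dcat_term]
    return out


def as_script_tag(jsonld: dict) -> str:
    """Wrap JSON-LD in a script element suitable for a dataset landing page."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


example = {
    "dct:title": "Air pollution monitoring data",
    "dcat:keyword": ["air quality", "monitoring"],
    "dcat:byteSize": 1024,  # no mapping in this sketch, so it is dropped
}
print(as_script_tag(to_schema_org(example)))
```

Unmapped terms such as dcat:byteSize are dropped here, which is one way a landing page could carry a schema.org subset while the full DCAT record stays available elsewhere.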
15:30:36 annette_g has joined #dxwg 15:30:48 danbri: shared the documentation see https://developers.google.com/search/docs/data-types/datasets and associated blog post i.e. https://research.googleblog.com/2017/01/facilitating-discovery-of-public.html 15:30:51 q? 15:30:59 ack dsr 15:31:08 https://www.w3.org/TR/2016/NOTE-csvw-html-20160225/ 15:31:19 ... is the CSVW WG's note on JSON-LD in HTML 15:32:32 q? 15:32:33 dsr: asking for use cases for searching of particular type of resources (datasets, services) .. 15:32:40 ack LuizBonino 15:33:00 LuizBonino: 2 different ways considered 15:33:12 can LuizBonino speak louder please 15:33:38 is it better Makx ? 15:33:53 1) we generate HTML pages annotated by Schema.org metadata for dataset 15:33:56 a bit better yes 15:34:43 users typing informal queries to discover catalogues and data sets, intent based search involving hidden APIs for quick added value results, or back end use cases where a service initiates a query and generates a composition of services as a design for later instantiation or a dynamically instantiated composition for immediate use. 15:34:52 the dcat:Dataset is described as schema:Dataset as well 15:35:36 LuizBonino was showing on the figure he drew what he is explaining. He is talking about DCAT:dataset in the figure 15:35:59 s/and data sets/and data sets and linking to pages offering a richer more structured search/ 15:36:37 q? 15:36:40 ack alejandra 15:36:49 2) the findability is further supported by indicating a dcat:theme 15:39:02 q? 15:39:05 alejandra: might Google support DCAT natively, in contrast to mapping to Schema.org? 15:39:23 rrsagent, make minutes 15:39:23 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html dsr 15:40:01 q+ 15:40:17 ack Jaroslav_Pullmann 15:41:06 European Data Portal 750.000, data.gov 160.000 data sets 15:41:42 Jaroslav_Pullmann: What woud be the taret of such an indexing by search engines?
15:42:14 s/taret/target 15:43:00 s/woud/would/ 15:43:01 Jaroslav_Pullmann: what to do if you are searching for data and are given 200.000 datasets? 15:43:56 Thomas has joined #dxwg 15:44:06 +q 15:44:38 ack Makx 15:45:00 Makx: coming back to the use case.. 15:46:22 Q+ 15:46:22 q? 15:46:23 +1 15:46:28 ack annette_g 15:46:29 assuming the approach to define such a landing page of a dataset, what is the guidance of how to expose it in terms of Schema.org annotation? 15:46:58 q? 15:47:32 cookbooks are good 15:47:47 +1 15:47:55 cookbook feels the right level to me; DCAT-shaped structures... 15:48:00 +1 15:48:02 PWinstanley: suggesting a cook book with examples 15:48:11 ... with extras from other vocabs, and in schema case maybe mappings 15:49:14 q? 15:49:28 +1 to separate cookbook + alignment on terms where possible 15:49:32 (prefer "cookbook" to "best practice" given that these things are still in flux) 15:49:37 danbri: in effect, this is mostly about mapping DC terms to Schema, which has been done already 15:50:28 see also http://wiki.dublincore.org/index.php/Schema.org_Alignment/MappingIssues 15:50:33 danbri would the cookbook be like a primer? 15:52:29 PWinstanley: what is the (tool) support for creating these annotations? 15:52:41 Philippe has joined #dxwg 15:53:58 dsr: what is our advice on choosing and using such tools? 15:54:49 q+ 15:55:37 ack Jaroslav_Pullmann 15:56:34 PWinstanley: there is a commercial potential for creation and provision of such tools and services 15:56:35 q?
15:57:24 +1 15:57:28 PROPOSED: accept ID40 as a use case for a non-normative document 15:57:35 +1 15:57:35 +1 15:57:36 +1 15:57:37 +1 15:57:37 +1 15:57:38 +1 15:57:38 +1 15:57:39 +1 15:57:39 +1 15:57:40 +1 15:57:41 +1 15:57:43 +1 15:57:43 +1 15:57:50 +1 15:58:25 RESOLVED: accept ID40 as a use case for a non-normative document 15:58:49 We should consider how open source projects could help with building both DCAT and schema.org markup 15:59:02 belated +1 15:59:10 /me sorry I have to leave.. Thanks for the interesting discussion, See you tomorrow! 15:59:24 bye 15:59:37 thank you for participating riccardoAlbertoni 15:59:55 [PWinstanley talking about https://twitter.com/nwplanet ] 16:01:04 dsr, I did have a conversation with someone in CKAN community about getting schema.org dataset markup into CKAN per-dataset landing pages. Idea would be to improve and publicise the existing DCAT addon rather than make a rival addon. 16:01:10 PWinstanley: suggesting to create a wiki on tooling support 16:01:21 q+ 16:01:48 ack AndreaPerego 16:02:17 Caroline: we will create an informal document on topic (cook book) 16:02:46 +1 for discussing it now. 16:03:27 annette_g has joined #dxwg 16:03:33 scribenick: kcoyle 16:03:39 https://www.w3.org/2017/dxwg/wiki/Use_Case_Working_Space#ID18 16:03:43 scribe: kcoyle 16:03:57 andrea: ID18; there are three other use cases that mention same problem 16:04:18 ... problem is that in many cases your distribution is not a direct file download 16:04:24 ... could be an api or another service 16:04:36 [q: (Makx?) how would I find DCAT from a page like view-source:https://www.europeandataportal.eu/data/en/dataset/air-pollution-monitoring-data-dublin-city ?] 16:04:49 ... the issue is that both machines and humans what happens when you follow the link 16:05:04 ... the response from the api may be an error or doesn't make sense to you 16:05:10 q? 16:06:09 ... this is a main issue left open by DCAT 1.0 16:06:32 ... 
there was once a subclass of distribution -> service, but this was dropped 16:06:56 ... this is a big problem that users have - when they don't get the data back they are confused 16:07:22 ... a sparql endpoint, they get multiple datasets back 16:07:55 dsr: this is about what people expect from a search 16:08:00 q? 16:08:01 q+ 16:08:08 ack Jaroslav_Pullmann 16:08:52 Jaroslav_Pullmann: dynamic distribution - let the data be pushed 16:09:21 q? 16:09:33 s/distribution -> service/dcat:Distribution -> dcat:WebService/ 16:09:39 danbri: does the group consider finding commercial datasets? find out that it exists and how much you pay for it 16:10:20 q+ 16:10:23 Jaroslav_Pullmann: there could be domain-specific solutions 16:10:27 ... or templated urls 16:10:37 ack LarsG 16:10:47 q+ to say that it may be worth finding a domain-independent solution 16:10:55 where we need to give the url parameters some kind of semantics 16:10:58 LarsG: We are not limiting ourselves only to open datasets; it's about finding 16:11:08 q+ 16:11:20 ... this is how Europeana works; you can find things but they may be behind a firewall 16:11:23 +1 for Lars 16:11:25 q+ 16:11:27 q+ 16:11:45 ack antoine 16:11:58 s/antoine/AndreaPerego 16:12:20 q- 16:12:23 AndreaPerego: find a way to model the info in a domain-independent way, a minimal set 16:12:53 ... 2 main things: 1) distribute is not direct, uses API / service 16:13:15 ... 2) type of service - specify with a code type of endpoint/service 16:13:34 ... even this small amount of info would be helpful to people 16:13:46 q? 16:13:49 ... and could be used by software engines if know the code 16:13:53 +1 for Andrea 16:13:54 ack AndreaPerego 16:13:54 AndreaPerego, you wanted to say that it may be worth finding a domain-independent solution 16:14:13 ... what is missing is that minimal info 16:14:15 ack Keith 16:14:31 Keith: 1) searching an individual dataset 16:14:46 q+ 16:14:46 ... worst case 2) complex API with distributed data 16:14:58 ... 
are we going to describe APIs or datasets? 16:15:07 Yep, the API description is the complex bit. 16:15:12 ack dsr 16:15:33 suggesting that https://www.w3.org/TR/vocab-data-cube/ covers some of Keith's (1.). 16:15:37 q+ to say that antoine was accidentally kicked out of the queue... 16:15:43 dsr: links to goals of WoT in W3C - links to general services and domain-specific situations 16:16:01 ... dcat needs to say - the type of this is an api. beyond that is outside of dcat 16:16:08 annette_g has joined #dxwg 16:16:12 q? 16:16:23 q+ antoine 16:16:31 LuizBonino: in health area, data is electronic, access process is offline 16:16:36 ack LarsG 16:16:36 LarsG, you wanted to say that antoine was accidentally kicked out of the queue... 16:16:39 so sorry antoine 16:17:10 ack antoine 16:17:45 antoine: asking Andrea if his use case includes sparql endpoints, because dcat has that solution 16:17:59 AndreaPerego: ? dcat has a solution for sparql end points? 16:18:09 q+ 16:18:14 antoine: there is a dcat access url that could be used for sparql endpoints 16:18:50 ack Makx 16:19:16 Makx: it's true that dcat says that this could be used with sparql end points but never says how 16:19:17 DCAT should provide information about where to get further information about an API and if this is machine interpretable, what formats are supported, e.g. thing descriptions for the Web of Things, or schema languages for RESTful APIs 16:19:28 https://www.w3.org/TR/vocab-dcat/ , search for 'SPARQL' and this eventually gives dcat:accessURL 16:19:32 q- 16:19:34 dct:WebService: https://www.w3.org/TR/2012/WD-vocab-dcat-20120405/#Class:_WebService 16:19:38 q+ 16:20:06 AndreaPerego: dct:WebService was dropped from the document 16:20:35 ack antoine 16:20:50 s/dct:WebService/dcat:WebService/ 16:20:55 nearby, sparql, void etc: https://www.w3.org/TR/void/#sparql-sd 16:20:58 antoine: is sparql included in your use case? 16:21:38 AndreaPerego: no, not mentioned 16:21:41 ... 
and then there is some literature around SPARQL as interface to data cubes e.g. https://jbiomedsem.biomedcentral.com/articles/10.1186/s13326-017-0112-6 16:22:28 ACTION: Andrea will add SPARQL endpoint to ID18 16:22:28 Created ACTION-20 - Will add sparql endpoint to id18 [on Andrea Perego - due 2017-07-24]. 16:23:41 FWIW in last week's IoT/WoT discussions, RAML, Swagger (https://swagger.io/specification/) and JSON-Schema came up a lot. 16:23:57 annette_g has joined #dxwg 16:24:19 Peter: some don't handle both GET and POST params 16:24:41 Q+ 16:25:18 ack annette_g 16:25:45 dsr: as above, what comes back: file or msg? what service/more info do you get. 16:25:51 ... is it machine-readable? 16:25:53 RRSAgent, draft minutes v2 16:25:53 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html AndreaPerego 16:26:27 annette_g: what you get back ... can differ 16:26:44 s/dct:WebService/dcat:WebService/ 16:27:26 +1 16:27:27 PROPOSE: Accept use case ID18 as in scope 16:27:29 +1 16:27:31 +1 16:27:32 +1 16:27:32 +1 16:27:32 +1 16:27:33 +1 16:27:33 +1 16:27:34 +1 16:27:35 +1 16:27:38 +1 16:27:39 +1 16:27:40 +1 16:27:43 +1 16:27:43 +1 16:27:48 +1 16:28:10 RESOLVED: Accept use case ID18 as in scope 16:28:29 rrsagent, make minutes 16:28:29 I have made the request to generate http://www.w3.org/2017/07/17-dxwg-minutes.html dsr 16:28:53 OK see you tomorrow, signing off -- was a very good meeting, thanks 16:29:22 Thanks, enjoy your dinner! Meet you tomorrow. 16:32:34 annette_g has joined #dxwg
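The minimal information discussed for ID18 (whether a distribution is a direct download or sits behind an API or service, plus a code for the type of service) can be sketched as follows. This is an illustration under stated assumptions: `access_url` and `download_url` stand for dcat:accessURL and dcat:downloadURL, while `service_type` and its code values are hypothetical and not part of DCAT 1.0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Distribution:
    access_url: str                      # stands for dcat:accessURL
    download_url: Optional[str] = None   # stands for dcat:downloadURL
    service_type: Optional[str] = None   # hypothetical code, e.g. "sparql"


def describe_access(dist: Distribution) -> str:
    """Tell a human or a client what to expect when following the link."""
    if dist.download_url:
        return "direct download"
    if dist.service_type:
        return f"service ({dist.service_type})"
    # Without either hint the client cannot tell a file from an API.
    return "unknown: inspect the access URL manually"


# Example URLs are placeholders.
file_dist = Distribution(
    access_url="https://example.org/data",
    download_url="https://example.org/data.csv",
)
endpoint_dist = Distribution(
    access_url="https://example.org/sparql",
    service_type="sparql",
)

print(describe_access(file_dist))
print(describe_access(endpoint_dist))
```

The third branch corresponds to the situation described in the use case, where following the link returns something the user did not expect and nothing in the metadata warned them.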