14:42:23 RRSAgent has joined #csvw
14:42:23 logging to http://www.w3.org/2015/02/04-csvw-irc
14:42:25 RRSAgent, make logs public
14:42:25 Zakim has joined #csvw
14:42:27 Zakim, this will be CSVW
14:42:27 ok, trackbot; I see DATA_CSVWG()10:00AM scheduled to start in 18 minutes
14:42:28 Meeting: CSV on the Web Working Group Teleconference
14:42:28 Date: 04 February 2015
14:59:30 JeniT has joined #csvw
14:59:45 RRSAgent, make logs public
14:59:47 Zakim, this will be CSVW
14:59:47 ok, trackbot; I see DATA_CSVWG()10:00AM scheduled to start in 1 minute
14:59:48 Meeting: CSV on the Web Working Group Teleconference
14:59:48 Date: 04 February 2015
14:59:54 gkellogg has joined #csvw
15:00:15 zakim, dial ivan-voip
15:00:15 ok, ivan; the call is being made
15:00:16 DATA_CSVWG()10:00AM has now started
15:00:19 +Ivan
15:00:51 zakim, code?
15:00:51 the conference code is 2789 (tel:+1.617.761.6200 sip:zakim@voip.w3.org), gkellogg
15:01:14 +[IPcaller]
15:01:48 shall i scribe?
15:01:51 jumbrich has joined #csvw
15:02:08 +??P8
15:02:12 zakim, I am ??P8
15:02:12 +gkellogg; got it
15:02:31 + +44.207.346.aaaa
15:02:37 +??P16
15:02:46 zakim, I am ??P16
15:02:46 +jumbrich; got it
15:02:56 zakim, who is here?
15:02:57 On the phone I see Ivan, JeniT, gkellogg, danbri, jumbrich
15:02:58 On IRC I see jumbrich, gkellogg, JeniT, Zakim, RRSAgent, danbri, ivan, trackbot
15:03:06 scribenick: danbri
15:03:20 Regrets: jtandy
15:03:47 gregg's suggestions: https://lists.w3.org/Archives/Public/public-csv-wg/2015Feb/0004.html
15:03:57 jenit: let's start w/ f2f and topics for that, next thu/fri
15:04:05 +1
15:04:35 jenit: i've been going through issues clustering them, … pulled out only 4 areas
15:04:42 (I don't see in https://lists.w3.org/Archives/Public/public-csv-wg/2015Feb/ … you mean github notifications?)
15:04:52 … there's a set around disconnect between CSV files and tables
15:05:11 one set around primary key + refs between multiple files
15:05:26 a couple around annotations and how annotations reference other things, how they are incorp'd in conversions
15:05:38 and a few around language, and how lang is handled in mapping esp for common properties
15:05:45 those are the areas where I found clusters of issues
15:06:05 also we should spend a fairly high % of time looking at the conversions
15:06:12 as they need more attention and have most issues
15:06:19 any other thoughts on f2f priorities?
15:06:27 (+1 from me on conversions getting attention -dan)
15:06:36 ivan: we have currently 75 open issues, which is quite a number
15:06:48 but there may be quite a lot of these which are really editorial things
15:07:04 jenit: yes, quite a few are labelled as resolved, just require editor action but not needing more discussion
15:07:06 https://github.com/w3c/csvw/issues
15:07:25 https://github.com/w3c/csvw/issues?q=is%3Aopen+is%3Aissue+label%3AResolved shows 9 'resolved' but open.
15:07:38 jenit: requires people to go through some other issues
15:07:54 … if an issue is neither resolved or requires discussion/decision then it is in a limbo and needs looking at
15:08:07 ivan: we have been a bit chaotic with issues. After f2f we have made over 100
15:08:11 … but we'll find a way
15:08:15 jenit: yes
15:08:28 ivan: What is the goal we set ourselves for the f2f?
15:09:15 ivan: My dream would be that after f2f, we have to make a bunch of editorial work, but after that round i.e. approx end march, we're in position to issue what would've once been called a Last Call.
15:09:26 …. "maybe I'm a dreamer…"
15:09:32 q+
15:09:37 ack gkellogg
15:10:03 dan: we might not close all issues at f2f but can leave it with a clear owner on anything left open
15:10:07 gregg: this should be achievable
15:10:14 where i am with my impl it looks pretty solid
15:10:24 i don't think we have normative text on extracting embedded metadata on CSVs
15:10:32 perhaps we need reqs around what other formats need to find
15:10:39 s/find/provide/
15:10:41 e.g. TSVs are simple
15:10:45 ivan: eh?
15:11:06 gregg: we have some description on how to extract CSVs, illustrative text, gives a flavour for how other formats might be handled
15:11:13 this should be normative
15:11:22 q+
15:11:31 if we're opening door for other formats, need to be clearer on this and use some hypothetical format at least to be testable
15:11:36 ack ivan
15:11:38 re csv mappings and our existing examples they are pretty solid
15:11:58 ivan: to other formats I think that we agreed at some point in time, that we have the conceptual thing which is our model
15:12:09 … and we should not get into any other syntax variation on how these can be mapped on our model
15:12:13 i don't think that we should go there
15:12:30 i fully agree that the default metadata issue is still open and needs to be defined somewhere
15:12:45 we agreed at some point… that if metadata is part of the orig csv file, we do not define the format of that
15:12:59 in a sense the only thing that we will have somehow in our default metadata is when the 1st row are the col names
15:13:02 that's a kind of metadata
15:13:15 any other variation on metadata within the csv file we decided to be out of scope for now
15:13:24 gregg: I think you're right, not trying to solve that here
15:13:34 … just that i think the results we have are [good]
15:14:09 ivan: back to orig thing, i'm playing with an impl as well. Although there are many things I have not done, I have same feeling as you. Where we seem to be the weakest, … the usage of foreign keys and how diff tables are related to each other
15:14:15 in our current docs that area is weakest
15:14:25 can we spend time reviewing that?
15:14:39 other issues are smaller details
15:14:45 good as we have captured big picture
15:14:58 to be less of a dreamer, it is realistic to expect "last call" sometime spring
15:15:08 we aim for intermediate draft publication end of march
15:15:14 may not be last call
15:15:21 aim for lc by early june
15:15:25 that is realistic
15:15:43 jenit: maybe we can work it out at f2f, ...
15:15:51 … may simply be useful for us to all sit in a room and type
15:15:58 … actually get some stuff done or on a screen
15:16:01 sharing in smaller groups
15:16:08 actually do some work, not just discussion
15:16:20 action: dan check beamer setup for non google staff
15:16:20 Created ACTION-62 - Check beamer setup for non google staff [on Dan Brickley - due 2015-02-11].
15:17:17 jenit: Rufus may or may not come for some portion of the time
15:17:38 ivan: same for phil archer
15:17:39 -jumbrich
15:17:44 … they're not on list but might come over
15:17:52 for a day maybe
15:18:18 DavideCeolin has joined #csvw
15:18:22 +??P16
15:18:29 zakim ??P16 is me
15:18:46 zakim, I am ??P16
15:18:46 +jumbrich; got it
15:18:48 jenit: let's try to eat together somewhere around victoria, plenty of places
15:18:57 ivan, can you paste the f2f url?
15:19:04 https://www.w3.org/2013/csvw/wiki/F2F_Agenda_2015-02
15:19:06 https://www.w3.org/2013/csvw/wiki/F2F_Agenda_2015-02
15:19:09 thx thx
15:20:02 gregg: we're approaching time when counting implementations matters
15:20:14 jenit: that's it for f2f
15:20:39 +[IPcaller]
15:20:51 zakim, [IPcaller] is me
15:20:51 +DavideCeolin; got it
15:21:03 hi Davide!
15:21:37 davide: I may be able to attend f2f, will try
15:22:04 https://lists.w3.org/Archives/Public/public-csv-wg/2015Feb/0004.html
15:22:05 topic: Gregg's Big List Of Issues
15:22:18 Issue #96 (PR #187): Use of aboutUrl, propertyUrl and valueUrl. (should be able to close #101).
15:22:23 https://github.com/w3c/csvw/issues/96
15:22:24 oops, bot clash
15:23:03 gregg: #96 this was as discussed/resolved last week, change urlTemplate into aboutUrl, propertyUrl to be a uritemplate property, and add a value url and make them all inherited properties
15:23:13 (don't take my spelling as canonical --scribe)
15:23:17 https://github.com/w3c/csvw/pull/187
15:23:18 … see a PR in github
15:23:20 tx
15:23:49 … desc in metadata doc, … property can be an array i.e. repeatable, that is challenging w.r.t. spec text how to handle/define this.
15:23:52 q+
15:23:54 that's the primary thing I noticed
15:23:56 ack ivan
15:24:31 ivan: there was another issue, whether the property uri can be an array, … q was whether the about uri if it inherited down to a column level, then the about uri can change from one col to another
15:24:33 that's a question
15:24:42 (did i capture that? is there an issue # for this?)
15:24:55 gregg: that's implication and intention for making it inheritable
15:25:07 point was to allow diff columns in rdf to be diff entities
15:25:14 not clear what this means in javascript
15:25:27 (javascript meaning json)
15:25:37 gregg: json properties typically based on column name not the about url
15:25:47 if you just used col names there would be nothing to distinguish from the single json object
15:25:58 we'd need to decide how we'd reflect them
15:26:00 [if at all]
15:26:05 ivan: let's separate the 2 issues
15:26:10 property names being an array vs about URLs
15:26:15 diff numbers?
15:26:22 https://github.com/w3c/csvw/issues/186
15:26:27 (+1 on allowing different entities for diff columns in the rdf)
15:26:52 gregg: the other impl of property uri being a template is that it could vary per row
15:27:03 … some discussion on not including col metadata
15:27:10 jenit: re #186 and multiple property names
15:27:12 https://github.com/w3c/csvw/issues/186#issuecomment-72842231
15:27:34 my comment (see link) i found it useful when mapping data into rdf, you sometimes want dc:title AND rdfs:label
15:27:42 but for simplicity sake let's have it be only a single value
15:27:46 +1 for jeni's proposal
15:27:51 q+
15:27:59 ack danbri
15:28:35 PROPOSED RESOLUTION: #186 propertyUrl only has one value
15:28:39 +1
15:28:39 +1
15:28:40 danbri: in rdfa/microdata multiple arcs to a value are graceful but horrible in json, support skipping
15:28:42 +1
15:28:45 +1
15:29:03 0
15:29:13 0
15:29:14 jenit: jumbrich and davide -- if you don't feel you have enough info to vote you can write 0
15:29:19 RESOLUTION: #186 propertyUrl only has one value
15:29:25 (i.e. it helps to indicate that you don't object)
15:29:33 ivan: shall i close it directly?
15:29:42 gregg: let's keep it open until text is updated and committed.
15:29:47 ivan: i'll mark it as editorial
15:29:54 (done)
15:30:04 jenit: i suggest we don't go into full complexity of #187
15:30:24 gregg: i thought if we could resolve this we'd be in a less complex situation
15:30:32 re pull requests, dependencies, ...
15:30:35 jenit: ok
15:30:48 ivan: #187 makes all 3 properties inheritable
15:30:57 -JeniT
15:31:04 https://github.com/w3c/csvw/pull/187
15:31:18 gregg: ivan if you're not comfortable with this let's wait til f2f
15:31:21 ivan: I'm not comfortable
15:31:25 gregg: let's just move on
15:31:36 ivan: yes, f2f.
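A rough sketch of the point made above that a templated aboutUrl or propertyUrl can yield a different URL for each row. This is only an illustration: `expand_template`, the example.org URLs and the column names `id` and `kind` are assumptions made up for the sketch, and the substitution handles only plain `{name}` placeholders, not the full URI Template syntax or whatever the metadata document ends up specifying.

```python
import re
from urllib.parse import quote


def expand_template(template: str, row: dict) -> str:
    """Replace each {name} placeholder with the percent-encoded cell value
    of the column called `name` (hypothetical helper, simple expansion only)."""
    def substitute(match: re.Match) -> str:
        return quote(str(row[match.group(1)]), safe="")
    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


# Two rows of an imaginary table; column names are assumptions.
rows = [
    {"id": "1", "kind": "title"},
    {"id": "2", "kind": "label"},
]

about_url = "http://example.org/item/{id}"        # one value per table or column
property_url = "http://example.org/vocab#{kind}"  # a single value, as resolved for #186

# Because both are templates, subject and predicate can differ from row to row.
subjects = [expand_template(about_url, row) for row in rows]
predicates = [expand_template(property_url, row) for row in rows]
```

With a single templated propertyUrl each row still gets exactly one predicate, which is the case the resolution above keeps.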
15:31:37 +[IPcaller]
15:31:42 gregg: a couple of other pull requests
15:31:56 ivan: [as above]
15:32:12 … consequences of having all 3 properties inherited are … potentially complex
15:32:19 not a prob with property uri, but with the about uri
15:32:24 jenit: yes
15:32:30 gregg: not a problem for rdf
15:32:39 ivan: even in rdf, now we have a structure with the row property
15:32:50 … what is , … how do i bind the triple to the rest of the structure, so to say?
15:32:56 gregg: up to creative use of the metadata
15:33:05 normally you'd […] for the row
15:33:17 if someone assigns diff abouturi to diff rows, complex things become possible
15:33:21 jenit: timeout
15:33:24 -jumbrich
15:33:30 … we'll do better going through this at f2f with examples
15:33:34 https://github.com/w3c/csvw/pull/185
15:33:43 +??P6
15:33:47 (re examples - reminder that https://github.com/w3c/csvw/tree/gh-pages/examples/tests/scenarios/chinook is a nice dataset with links and multiple entities)
15:33:50 #185
15:33:51 zakim, I am ??P6
15:33:51 +jumbrich; got it
15:33:57 ivan: bunch of editorial things in pull req
15:34:17 gregg: back to orig intention, you can have embedded csv without lang in it but matching title in asserted metadata that does have a language
15:34:39 gregg: i had made a change some time ago, meaning that lang of metadata needed to be applied when you created the embedded metadata
15:34:50 you'd end up with title having same lang as the embedded metadata and they'd match
15:35:04 this change allows a title with no lang or lang=UND i.e. undefined to match a title in any other language
15:35:13 meaning that we do not need to arbitrarily apply a lang to embedded metadata
15:35:20 and we can apply as intended to [missed]
15:35:23 i tried!
15:36:14 gregg: e.g. tree metadata
15:36:23 … lang for metadata terms in the context is asserted as english
15:36:30 the csv itself when you extract metadata from it
15:36:33 default metadata
15:36:36 something that's empty
15:36:44 creates meta without a default lang
15:36:50 no name field only title assumed from col names
15:37:11 in order to allow the title from the embedded metadata to match the title from the found metadata which is in english, we need to allow it to match
15:37:18 ivan: i ran into this problem exactly
15:37:43 gregg: there is a further conseq which is that if you were to merge these two metadata you would get descriptions the same except for language spec
15:37:51 ivan: hence should be considered as the same
15:37:58 gregg: yes, as being the one with the lang (defined)
15:38:03 ivan: i agree
15:38:17 jenit: i have a comment on it, but it is captured elsewhere, happy just merging
15:38:21 gregg: great
15:38:31 jenit: done
15:38:40 that was #184
15:38:52 skipping #64 as related to phantom col thing, which we need to work through.
15:38:54 #170
15:39:02 https://github.com/w3c/csvw/pull/170
15:39:02 https://github.com/w3c/csvw/issues/170
15:39:08 gregg: relates to rdf/json mappings, 3 sections
15:39:14 core table, table from metadata, table group
15:39:28 as i went through it, to specify it to ignore metadata would seem to need processor action
15:39:38 and wouldn't provide anything that you'd get from just using default extracted metadata
15:39:47 q+
15:39:49 i did not see a reasonable need to have a different processing form for core tabular data
15:39:50 ack ivan
15:39:56 suggestion to remove those
15:39:59 sections
15:40:11 ivan: fundamentally agree. 2 areas where a simplification on the conv docs can be done -
15:40:25 … by pushing some of the things into a common place, which is probably the metadata doc
15:40:29 one is the one just said
15:40:36 conversion doc works only with annotated table
15:40:47 as one of the specs says what the default metadata is
15:40:58 i think it makes a lot of sense to have default metadata specified somewhere anyway
15:41:26 other thing related but diff, is that there are a lot of words in these docs on how the cell values should be converted and also how the inheritance of properties work, down to a cell level
15:41:37 i think this is something that will eventually move to the metadata doc too as it is a general principle
15:41:42 gregg: i believe most of it is already there
15:41:54 ivan: that is great, means these 2 will make the conv docs very much simpler
15:42:00 this is reflected in my impl experience
15:42:06 most of the sweat is on the metadata
15:42:14 once it is there, … generation is of rdf or json
15:42:18 comparatively simple
15:42:19 http://htmlpreview.github.io/?https://github.com/w3c/csvw/blob/gk-transformations/csv2rdf/index.html
15:42:31 gregg: i did my own version of the rdf transform doc -^ url
15:42:38 takes this approach, is more template like
15:42:44 prescribes triples to generate
15:43:11 ivan: q is whether making the core tabular data model disappear is a major change
15:43:18 jenit: bbiab
15:44:05 gregg: doc has instead of SHALL specific triples given for output
15:44:13 does include specific metadata eg on cols
15:44:19 fundamentally it outputs triple info
15:44:28 for each row, it uses all the lang from the metadata doc
15:44:39 ivan: that's more or less what i do as well
15:44:57 what i do, once all the metadata are merged, i create for every row on the fly, i make a cell level with all the transformations etc
15:45:04 then use the structure to issue either rdf or json
15:45:17 gregg: my version is extremely sparse
15:45:23 simply describes what's emitted
15:45:50 dan q re testing
15:45:55 gregg: i created a no. of tests
15:46:03 could make more, paused as we were changing things
15:46:12 i think we have perhaps 30 or 40 tests for both rdf and json
15:46:52 I asked about testing ivan's impl w.r.t. the tests gregg created
15:47:44 jenit: we were going to get rid of the idea of the core tabular data model, there's no value add for it, ...
15:47:51 i don't think there are any implications of that
15:47:55 mark this as resolved
15:48:01 ie. in #170
15:48:15 PROPOSED RESOLUTION: We remove ‘Core Tabular Data Model’ and define everything in terms of ‘Annotated Tabular Data Model’ #170
15:48:19 +1
15:48:21 +1
15:48:29 +1
15:48:36 we will spend time at F2F going through conversion documents
15:48:55 RESOLUTION: We remove ‘Core Tabular Data Model’ and define everything in terms of ‘Annotated Tabular Data Model’ #170
15:49:07 -jumbrich
15:49:15 https://github.com/w3c/csvw/issues/175
15:49:22 +??P0
15:49:28 zakim, I am ??P0
15:49:28 +jumbrich; got it
15:49:28 gregg: this was basically "what is the default metadata ?"
15:49:35 what's expected to be provided by it?
15:49:47 gregg: primary diff from core metadata, … is that core does not assume that 1st row is header
15:49:58 instead we had col=123 etc for headings
15:50:16 my take is that typical way you expect to find a csv is that 1st row is a col titles
15:50:35 … so using that you can get a reasonable mapping
15:50:46 so table group with empty resources [scribe not capturing detail]
15:51:07 gregg: some discussion on http content type param, header absent, ...
15:51:17 to get back to orig approach of col=1 etc instead of an initial header
15:51:26 jenit: I don't really understand this w.r.t. defaulting
15:51:39 even with an indiv csv file you get the metadata you can guess at from that csv file
15:51:49 gregg: you need a context from which to extract the dialect info
15:51:53 q+
15:51:59 having simply an empty metadata estabs that context
15:52:05 could be same as saying that there is none
15:52:20 gregg: my processor looks at all the metadata, merges, uses that then re-merges everything
15:52:26 ack ivan
15:52:30 some concept of having metadata when you extract embedded metadata is required
15:52:47 ivan: there are 2 things, i still have to understand whether essentially saying,… forgetting table groups, i have only one table
15:53:04 … if you have no metadata at all, all the processing we have on metadata will ultimately generate [whatever we need]
15:53:06 not sure if this true
15:53:12 gregg: it generates a reasonable output
15:53:36 ivan: wait, no… not whether it generates reasonable output. In the prev discussion, we do not need the core tabular model, because we have only got annotated tabular models now
15:54:03 … we need to show clearly that if there is really nothing in the current terminology -core tab data - process of merging, defaulting etc, will give us a reasonable annotated model.
15:54:14 gregg: to achieve this we could change defaults for skip rows and header
15:54:26 if you say header: false, you'd get to what is currently in the core metadata extraction
15:54:34 jenit: feels like a real technicality
15:54:46 … without examples that show diffs it is hard to decide
15:54:53 ivan: agree, one of the issues for next week
15:54:59 … another q for what gregg said
15:55:41 ivan: I had this discussion with Ivan, … the dialect info that we introduced, if the processing model of the whole thing is such that the dialect is used to control the parsing, with prescriptive stuff, … or vs is it info i get after parser?
15:55:45 you seem to go the 1st way, ...
15:56:07 gregg: dialect tells you what the col separator is, e.g. ",". In order to parse the doc, you'd need to know that
15:56:38 ivan: maybe worth putting into the doc somewhere, is some sort of an abstract processing model, if i have a conformant processor, what are the steps that it must take w.r.t. metadata, parsing, generation of the output
15:56:55 gregg: syntax doc, … locating the metadata, …
15:56:56 http://w3c.github.io/csvw/syntax/#parsing
15:57:09 ivan: not only the parsing but the merging of the metadata etc
15:57:15 … my impl right now does it the other way
15:57:44 … it has to be restructured. I start with the CSV file. I extract everything from there. But essentially I should be doing it the other way around, get metadata 1st then do parsing afterwards.
15:57:51 gregg: that was point of the default metadata yes
15:58:02 … in doc for parsing it has informative language towards what you want
15:58:05 jenit: t-2
15:58:14 … taking parsing discussion to f2f
15:58:27 … we have very strong direction originally that we couldn't talk about parsing, it was out of our remit
15:58:38 AOB pre f2f?
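A minimal sketch of the "get metadata 1st then do parsing afterwards" ordering discussed above, where the dialect is known before the file is read and controls the parse. Everything named here is an assumption for illustration: `parse_table`, `DEFAULT_DIALECT`, the keys `delimiter`, `header` and `skipRows`, and the `col=N` fallback titles loosely follow the wording used on the call and are not taken from the spec text.

```python
import csv
import io

# Assumed defaults: comma separated, first row holds column titles, nothing skipped.
DEFAULT_DIALECT = {"delimiter": ",", "header": True, "skipRows": 0}


def parse_table(text, dialect=None):
    """Parse CSV text using a dialect that was resolved *before* parsing.

    Returns (titles, data_rows). When the dialect says there is no header
    row, titles fall back to col=1, col=2, ... (hypothetical naming).
    """
    settings = {**DEFAULT_DIALECT, **(dialect or {})}
    reader = csv.reader(io.StringIO(text), delimiter=settings["delimiter"])
    records = list(reader)[settings["skipRows"]:]
    if settings["header"] and records:
        titles, data = records[0], records[1:]
    else:
        width = len(records[0]) if records else 0
        titles = [f"col={i}" for i in range(1, width + 1)]
        data = records
    return titles, data


# A tab-separated file: the separator has to be known up front to parse it.
titles, data = parse_table("name\tage\nAda\t36\n", {"delimiter": "\t"})

# header: false treats every row as data and synthesises the titles.
headless_titles, headless_data = parse_table("Ada,36\n", {"header": False})
```

Passing no dialect at all falls back to the defaults, which is roughly the "empty metadata establishes the context" case described on the call.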
15:58:54 gregg: suggest people examine the test cases that are in there as they explore many of the things in there
15:58:58 can go thru at f2f
15:59:08 describe additional testcase needs
15:59:21 jenit: i'll try to put a rough structure together for f2f but expect it to be fairly fluid
15:59:36 … let's try to get to a place where editors are equipped to take fwd drafts end march
15:59:51 -Ivan
15:59:52 -danbri
15:59:53 -gkellogg
15:59:55 -JeniT
15:59:58 -DavideCeolin
16:00:00 DATA_CSVWG()10:00AM has ended
16:00:00 Attendees were Ivan, JeniT, gkellogg, +44.207.346.aaaa, danbri, jumbrich, DavideCeolin
16:00:15 trackbot, end telcon
16:00:15 Zakim, list attendees
16:00:15 sorry, trackbot, I don't know what conference this is
16:00:23 RRSAgent, please draft minutes
16:00:23 I have made the request to generate http://www.w3.org/2015/02/04-csvw-minutes.html trackbot
16:00:24 RRSAgent, bye
16:00:24 I see 1 open action item saved in http://www.w3.org/2015/02/04-csvw-actions.rdf :
16:00:24 ACTION: dan check beamer setup for non google staff [1]
16:00:24 recorded in http://www.w3.org/2015/02/04-csvw-irc#T15-16-20