12:09:30 RRSAgent has joined #csvw
12:09:30 logging to http://www.w3.org/2014/03/26-csvw-irc
12:09:32 RRSAgent, make logs public
12:09:32 Zakim has joined #csvw
12:09:34 Zakim, this will be CSVW
12:09:34 ok, trackbot; I see DATA_CSVWG()9:00AM scheduled to start in 51 minutes
12:09:35 Meeting: CSV on the Web Working Group Teleconference
12:09:35 Date: 26 March 2014
12:10:12 ivan has changed the topic to: Meeting agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-03-26
12:10:16 Chair: Jeni
12:40:09 AndyS has joined #csvw
12:40:19 JeniT has joined #csvw
12:52:14 AndyS has joined #csvw
12:57:03 hi folks
12:57:35 DATA_CSVWG()9:00AM has now started
12:57:42 +[IPcaller]
12:57:48 zakim, ipcaller is me
12:57:48 +AndyS; got it
12:58:37 +[IPcaller]
12:59:50 +??P9
13:00:01 zakim, I am ??P9
13:00:01 +gkellogg; got it
13:00:02 zakim, dial ivan-voip
13:00:02 ok, ivan; the call is being made
13:00:03 +Ivan
13:00:05 DavideCeolin has joined #csvw
13:01:05 +??P10
13:01:06 yakovsh has joined #csvw
13:01:16 zakim, ??P10 is me
13:01:16 +DavideCeolin; got it
13:01:48 scribe: Gregg Kellogg
13:01:49 Chair: Jeni
13:01:53 scribenick: gkellogg
13:01:59 +yakovsh
13:03:20 Regrets: Jeremy Tandy, Tim Finin, Eric Stephan, Axel Polleres
13:03:49 +??P14
13:04:15 fonso has joined #csvw
13:04:30 Agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-03-26
13:04:46 topic: previous minutes
13:04:47 http://www.w3.org/2014/03/19-csvw-minutes.html
13:05:06 zakim, mute me
13:05:06 Ivan should now be muted
13:05:07 RESOLVED: approve previous minutes
13:05:20 I'll note that I requested mon/tuesday TPAC per my action.
13:05:29 topic: Model for tabular data on the web
13:05:41 http://w3c.github.io/csvw/syntax/#metadata
13:06:16 jenit: this section is a sketch of different methods of finding a metadata document that provides metadata about a CSV, or finding it within the CSV itself.
13:06:43 ... The metadata document can tell an application how to deal with that file, in particular, how to transform into different formats.
13:06:55 ... In that document, there are five different methods listed with issues.
13:07:17 ... 3.5, use a standard path
13:07:18 http://w3c.github.io/csvw/syntax/#standard-path
13:07:28 -JeniT
13:07:36 ergh
13:07:49 gives us time to read up
13:07:50 give you time to read ;)
13:08:03 chairs are of one mind on this.
13:08:04 +[IPcaller]
13:08:06 q+
13:08:19 q+
13:08:19 q?
13:08:29 -[IPcaller]
13:08:42 we heard you twice saying 'can you hear me'
13:08:48 then you should have replied!
13:08:52 we did!
13:08:53 ack yakovsh
13:08:55 ack yakov
13:09:10 yakovsh: is 3.5 specifically when used with HTTP?
13:09:13 +[IPcaller]
13:09:31 http not https, ftp, gopher, … ?
13:09:38 +??P17
13:09:41 ... If so, why is the standard name considered? If using HTTP, then the Link header can describe it.
13:09:49 and "file:"
13:10:01 zakim, this is +??P17
13:10:01 sorry, fonso, I do not see a conference named '+??P17' in progress or scheduled at this time
13:10:17 jenit: yes, and 3.4 talks about using the Link header. When we discussed on the list, people felt that having a standard location relative to the CSV would be easier than controlling the Link header.
13:10:22 zakim, ??P17 is fonso
13:10:22 +fonso; got it
13:10:33 "When retrieving a CSV file via HTTP, the default location for a metadata file that describes that CSV file is set to csv-metadata in the same directory. If this metadata file does not explicitly point to the relevant CSV file then it must be ignored."
13:10:43 yakovsh: Can 3.5 also be used when files are on disk? Why HTTP only?
13:10:59 jenit: No particular reason, and that's a good point.
13:11:16 q?
13:11:17 thanks andy
13:11:22 ack AndyS
13:11:23 nearby: http://tools.ietf.org/html/draft-nottingham-site-meta-05
13:12:00 andys: I think we also need to be aware when we're dealing with packages of CSV files, in which case a package description file will be needed. Something to address that will be needed.
13:12:39 ... When I mentioned being able to work out a file given a CSV file, I was thinking of one per CSV, such as given foo.csv, it might be foo.csvm.
13:12:50 q+ to ask about "/.well-known/" ("http://tools.ietf.org/html/draft-nottingham-site-meta-05#appendix-B.4. Why aren't per-directory well-known locations defined?")
13:12:53 a bit of an echo
13:12:57 jenit: something about being in a similar directory
13:13:31 zakim, mute me
13:13:31 Ivan was already muted, ivan
13:13:37 ... Where I've seen metadata files used with CSV, such as Simple Data Format, or Google's, the metadata file has always been describing several related CSV files.
13:13:53 (this? https://developers.google.com/public-data/ -> DSPL )
13:13:55 ... I took it as a strength that a metadata file would describe several CSV files, as that matched current usage.
13:14:00 danbri: yes
13:14:10 for favicon, here is the link to the w3c doc: http://www.w3.org/2005/10/howto-favicon
13:14:23 andys: that's good when there's one publisher, but CSV files may come from a number of different publishers, and the publisher is just mechanically moving them into place.
13:14:30 q?
13:14:56 "the final publisher" putting up the files on the directory.
13:15:48 q+ sitemaps
13:15:51 erg
13:15:53 q- sitemaps
13:16:05 andys: In a lot of environments, it's either impossible to control, or very difficult to control in terms of technology
13:16:09 q+ to suggests sitemaps.org
13:16:38 jenit: what about if you use a suffix on a file name; if you want to use it on all files in a directory, use a suffix on the directory name.
13:17:17 andys: perhaps we document both, and in an issue say that the WG is likely to pick one, so people have a warning. It should depend on actual user experience.
13:17:57 jenit: let's change the document to cover both cases. I think it's reasonable for both to be possible: somewhere you look for an individual CSV file, and another default location.
13:18:02 +1 to Jeni, we should have several documents in a priority order
13:18:34 makes sense avoiding .xyz
13:18:55 +1 to danbri + jeni, too
13:19:02 ... I'm inclined to use a suffix that doesn't look like ".foo", as those are associated with different formats.
13:19:11 +q
13:19:25 jenit: do we anticipate a single mime type for CSV metadata, or not? We'll take this to the list.
13:19:41 ack danbri
13:19:41 danbri, you wanted to ask about "/.well-known/" ("http://tools.ietf.org/html/draft-nottingham-site-meta-05#appendix-B.4. Why aren't per-directory well-known locations defined?")
13:19:43 ack danbri
13:19:44 ... and to suggests sitemaps.org
13:20:07 its an rfc: http://tools.ietf.org/html/rfc5785
13:20:17 https://www.iana.org/assignments/well-known-uris/well-known-uris.xhtml
13:20:22 the actual registry
13:20:25 q+
13:20:32 danbri: There's an IETF draft from mnot and friends. As I understand it, it's really one place per site. I wonder if we could consider extending it to be per-directory.
13:20:51 ... Personally, I'm not excited about well known paths, but we should look at site-map files.
13:21:15 jenit: yes, .well-known is one-per-site.
13:21:33 q-
13:21:41 ... Given we're trying to do something really easy, I think it's unlikely they could access either .well-known or site-map.
13:21:44 ack yakovsh
13:22:11 yakovsh: Are we sure that every OS uses file extensions? I think MacOS uses something in the file itself.
13:22:15 OS X uses extensions I believe
13:22:16 osx is hybrid now
13:22:37 jenit: I think Mac uses a combination of both. When we're talking about a default method, I don't think that's relevant.
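The two lookup cases discussed above (a metadata file for an individual CSV file, plus a default location in the same directory) could be sketched roughly as follows. This is only an illustration, not anything the WG agreed: the directory-level name csv-metadata comes from the draft text quoted in the log, while the per-file suffix "-metadata", the function name, and the "most specific first" ordering are assumptions made for the example.

```python
from urllib.parse import urljoin


def candidate_metadata_urls(csv_url, per_file_suffix="-metadata",
                            directory_name="csv-metadata"):
    """Return metadata URLs to try for csv_url, most specific first.

    per_file_suffix is a hypothetical convention (the group had not
    picked one); directory_name follows the draft's "csv-metadata".
    """
    # Per-file candidate: append a suffix to the CSV's own URL.
    per_file = csv_url + per_file_suffix
    # Per-directory candidate: a sibling resource in the same directory.
    per_directory = urljoin(csv_url, directory_name)
    return [per_file, per_directory]


if __name__ == "__main__":
    for url in candidate_metadata_urls("http://example.org/data/foo.csv"):
        print(url)
```

A consumer would try each candidate in turn and, per the quoted draft text, ignore any metadata file that does not explicitly point back to the CSV file. The sketch does not handle URLs with query strings or fragments.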
13:23:01 q+
13:23:24 yakovsh: regarding .well-known URIs, it's tied into AWWW. It might be prudent to reach out to mnot. It's not clear how widely it's used, such as robots.txt
13:23:46 q-
13:23:49 jenit: some of these (e.g. robots.txt) came before .well-known. I'm not sure it's a relevant notion.
13:24:13 even if something's not in the .well-known/ registry, it can still provide a safe sub-namespace to put such names where they'll only clash with other would-be-well-known names, and not with publisher names
13:24:15 http://tools.ietf.org/html/draft-nottingham-uri-get-off-my-lawn-02
13:25:21 jenit: we'll consider a standard path and a backup, possibly using a file extension.
13:25:32 http://w3c.github.io/csvw/syntax/#link-header
13:25:33 ... Moving on to 3.4, I think this is fairly straightforward.
13:25:44 Aside: Dan Connolly "Get off my lawn!" quote, http://lists.w3.org/Archives/Public/www-rdf-interest/2000Sep/0162.html
13:25:54 ... Just to be sure rel=describedby is the right header
13:26:25 andys: we have the two cases again, a description per CSV, or one for a group.
13:26:32 q+
13:26:40 http://www.w3.org/TR/powder-dr/#assoc-linking
13:26:51 jenit: It doesn't make a case, as the Link header describes the resource (which could be multi-part?)
13:26:56 zakim, unmute me
13:26:56 Ivan should no longer be muted
13:27:01 describedby is registered in https://www.iana.org/assignments/link-relations/link-relations.xhtml
13:27:19 ... Perhaps we can assume that we always have a package description.
13:27:41 andys: if multiple people are dropping files into a directory, this might not be a good assumption.
13:28:00 I think it's me too :(
13:28:36 +1 for one type of metadata file
13:28:43 we can have conventions evolve over time
13:28:48 jenit: Andy seemed to be saying there would be two different types of files (packages, and individual). I'm suggesting there should be just one, but sometimes the package might just have one file in it.
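For the Link header route in section 3.4, a client needs to pull the rel=describedby targets out of one or more Link header values. The sketch below is a deliberately simplified illustration, assuming well-behaved header values; it is not a complete Web Linking parser (for instance, it does not cope with commas or semicolons inside quoted parameter values), and the function name is hypothetical.

```python
import re

# One link-value: <target> followed by zero or more ";param" parts.
_LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')
# The rel parameter, quoted or unquoted.
_REL_RE = re.compile(r'\brel\s*=\s*"?([^";]+)"?')


def describedby_targets(link_header_values):
    """Collect targets whose rel includes "describedby", in header order.

    link_header_values is a list of raw Link header strings, one per
    header line as received.
    """
    targets = []
    for value in link_header_values:
        for match in _LINK_RE.finditer(value):
            target, params = match.group(1), match.group(2)
            rel = _REL_RE.search(params)
            # rel may hold several space-separated relation types.
            if rel and "describedby" in rel.group(1).lower().split():
                targets.append(target)
    return targets


if __name__ == "__main__":
    headers = [
        '<http://example.org/data/csv-metadata>; rel="describedby"',
        '<http://example.org/next>; rel="next", <pkg.json>; rel=describedby',
    ]
    print(describedby_targets(headers))
```

Returning the targets in the order received leaves any priority or merging policy to the caller, which matches the fact that the group had not yet settled how multiple metadata sources combine.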
13:28:54 there might be a few of these that get composed
13:29:00 (i.e. merged)
13:29:42 andys: there might be one directory with mixed information. Perhaps it should be either one or the other, a package or an individual file. Going down every path might be exhausting
13:30:07 ivan: from a syntactic point, does describedby allow me to use a list of URIs or just one?
13:30:31 jenit: I think you can have multiple Link headers, with different types and locations.
13:31:02 ivan: that's also related to Andy's question: the various access methods. We have to allow for different routes to get metadata with a prioritization.
13:31:36 ... In this sense, if it's one link header with a list of references, they are in priority order, and if some are metadata for the package, and some individual, falls back to priority.
13:32:01 ... I can imagine a system setting up a standard describedby for all CSV files, and the user adds more metadata with a well-known URI.
13:32:08 http://w3c.github.io/csvw/syntax/#metadata
13:32:22 jenit: I tried to put in something about cascade in section 3. That might not satisfy your requirements.
13:32:55 ... for the Link header, we should say you can have multiple link headers and that they are merged with the one at the top being the highest priority.
13:33:39 ivan: the problem is, what does priority really mean? Suppose it's all in RDF. The "RDF way" would say that all statements are accumulated and do not hide each other. Other systems would do occlusion.
13:34:09 -danbri
13:34:12 oops!
13:34:35 3 pixel difference between selecting a browser tab vs closing it
13:34:46 jenit: it says if the same property is specified in two different locations, information closer to the document should override that which is further away.
13:34:50 JeniT has joined #csvw
13:35:05 +??P14
13:35:08 +q
13:35:16 ack ivan
13:35:20 ack yakovsh
13:36:06 yakovsh: In RFC4180 I started defining metadata as part of the mime type. Is the mime type a good place to stash metadata?
13:36:22 jenit: Probably not, as it gets lost when it moves around.
13:37:04 (isdescribedby seems ok to me.)
13:37:41 topic: Conversions
13:37:45 https://www.w3.org/2013/csvw/wiki/Conversions
13:37:49 Can we take 5 min later to address Ivan's q about i18n/l10n, arabic etc?
13:38:26 zakim, unmute me
13:38:26 danbri should no longer be muted
13:38:52 danbri: it seems everyone wants to talk about RDF mappings, but we've been putting that off.
13:39:07 ... Also, XML, JSON, ...
13:39:24 jenit: the best way to structure discussion is to have a spec to discuss and "kick".
13:39:51 ... I'd like to have people step forward to edit a document and have others contribute.
13:40:01 ... On CSV to RDF
13:40:01 q+
13:40:08 ack gkellogg
13:40:17 I'd like to help, and relay in some ideas from https://www.w3.org/wiki/WebSchemas/LookInside http://lists.w3.org/Archives/Public/public-vocabs/2013Aug/att-0061/Lookinginsidetables.html
13:40:57 We have two already -- https://www.w3.org/2013/csvw/wiki/CSV2RDF and https://www.w3.org/2013/csvw/wiki/CSV-LD
13:41:19 also we have a backwards sparql proof of concept, http://svn.foaf-project.org/foaftown/2010/lqraps/lqraps.html
13:41:38 q+ to suggest we pick some concrete CSV files (from the UC work) to focus the mapping design
13:42:05 ack danbri
13:42:06 q+
13:42:07 danbri, you wanted to suggest we pick some concrete CSV files (from the UC work) to focus the mapping design
13:42:10 jenit: I find it hard to be able to say that one direction is definitely the way to go. I think the next step is for someone to characterize the difference between the different approaches so we can have an educated discussion in order to make a decision.
13:42:19 zakim, unmute me
13:42:19 Ivan was not muted, ivan
13:42:46 danbri: I'm feeling a bit overwhelmed by the different threads using a set of CSV files. Then we can compare different designs.
13:43:14 andys: There already are examples in the different examples.
13:43:46 danbri: I think we should have some core examples.
13:44:53 what can we take from WD-csvw-ucr-20140327 ?
13:44:57 jenit: I think the first step is to focus on the direct mapping, i.e. with zero metadata. If we can get that down, we're in a good position.
13:45:05 q-
13:45:34 q+
13:45:36 ... Who'd like to take forward direct mapping for CSV to RDF, the possibilities and advantages/disadvantages with proper examples.
13:46:11 jenit: can Andys and gkellogg get together on this?
13:46:28 andys: I don't think this quite touches on the fundamental differences between the two approaches.
13:46:51 ... Gregg's very much based on JSON-LD, and I'm interested in a mapping to RDF triples.
13:46:59 action: danbri try expressing a direct mapping expressed using http://www.w3.org/TR/2013/PR-vocab-data-cube-20131217/
13:47:00 Created ACTION-10 - Try expressing a direct mapping expressed using http://www.w3.org/tr/2013/pr-vocab-data-cube-20131217/ [on Dan Brickley - due 2014-04-02].
13:48:25 http://lists.w3.org/Archives/Public/public-csv-wg/2014Mar/0140.html
13:49:21 andys: I found three classes of JSON-style output. I have no idea which are commonly used. I understand the first (one row to object), I understand column arrays, I don't know what the background is about turning everything into arrays without objects inside.
13:50:18 jenit: given that the main difference between the two approaches is about the syntax of the metadata document, I'd like to get something down as a starting point, being just a direct mapping. This would be really helpful.
13:50:41 ... Andy, if you could do this?
13:50:48 gkellogg: why don't we work together.
13:51:06 andys: I'd like feedback on what I've written.
13:51:20 jenit: something in ReSpec on GitHub; copy/paste is fine.
13:51:59 andys: I'm looking at a mapping to RDF, gregg's looks to both RDF and JSON through JSON-LD. When you compare and contrast, it might not be as useful.
13:53:02 ivan: In a way, the JSON vs RDF model is just one dimension of the differences. Another discussion is what level of complexity we want to allow and define within that.
13:53:27 q+
13:53:31 ... I'm a little concerned that we're having the same discussion as we had in RDB2RDF; I'm a bit worried we're just repeating the same arguments.
13:53:33 ack ivan
13:54:32 ... Before going beyond that, I'd like to have an understanding on how the RDF conversions are done in the use cases. There are only 2-3 that really rely on an RDF mapping. R2RML can be quite complex, with a full SQL language inside. It's a level of complexity I'm quite afraid of.
13:54:40 q+ to suggest that a table of timeseries/stats is more inherently tabular than an SQL dump, where we really want the relational structure exposed
13:54:48 ... It's a kind of difference between the proposals I'd like to examine.
13:55:11 ... Mappings of URIs and properties and much complexity.
13:55:47 +1 on defining the RDF output without dictating a particular serialisation of that RDF
13:57:04 I think there are layers: direct mapping using no metadata, mapping to RDF graph using metadata, mapping to RDF syntax using metadata
13:57:39 q-
13:58:09 jenit: a clear document that says we need a decision would help focus discussion.
13:58:20 topic: i18n
13:58:38 danbri: we should just say we're choosing one, say left to right, top to bottom.
13:58:52 jenit: also, commas are used as syntactic marker, and not as text.
13:59:38 ivan: I had a conversation with Richard, our i18n guy; the best way to do that would be to contact the i18n mailing list and ask them to look at use cases to see if there's something too latin-biased.
14:00:06 ... apart from that, we should try to collect use cases outside of the US-Europe world.
14:00:10 +q
14:00:36 ... I can try to reach out to Chinese colleagues, or Google has some Arabic people.
14:01:03 ack yakovsh
14:01:19 yakovsh: I'm a Hebrew speaker, but I've never seen a Hebrew CSV, but I'll poke around.
14:01:21 can we add "We particularly seek feedback and suggestions on the Internationalization aspects of this work" to the Status section?
14:01:54 ivan: next time, I don't want to touch it now, it's in the webmaster's control.
14:02:04 -danbri
14:02:05 -JeniT
14:02:05 -gkellogg
14:02:06 -Ivan
14:02:07 -AndyS
14:02:08 -DavideCeolin
14:02:08 -fonso
14:02:13 -yakovsh
14:02:14 DATA_CSVWG()9:00AM has ended
14:02:14 Attendees were AndyS, JeniT, gkellogg, Ivan, DavideCeolin, yakovsh, danbri, fonso
14:03:00 rrsagent, draft minutes
14:03:00 I have made the request to generate http://www.w3.org/2014/03/26-csvw-minutes.html ivan
14:03:11 trackbot, end telcon
14:03:11 Zakim, list attendees
14:03:11 sorry, trackbot, I don't know what conference this is
14:03:19 RRSAgent, please draft minutes
14:03:19 I have made the request to generate http://www.w3.org/2014/03/26-csvw-minutes.html trackbot
14:03:20 RRSAgent, bye
14:03:20 I see 1 open action item saved in http://www.w3.org/2014/03/26-csvw-actions.rdf :
14:03:20 ACTION: danbri try expressing a direct mapping expressed using http://www.w3.org/TR/2013/PR-vocab-data-cube-20131217/ [1]
14:03:20 recorded in http://www.w3.org/2014/03/26-csvw-irc#T13-46-59
14:03:25 thanks for scribing gkellogg!