See also: IRC log
<scribe> ScribeNick: JeniT
<scribe> Scribe: Jeni
RESOLUTION: Minutes accepted
<danbri> resolved: minutes are a fair record
<AndyS> That was March 5
<danbri> 'this morning i went through the actions from last week's call
<danbri> basically to add in a section that talks about the various methods of locating metadata
<danbri> about a csv file.
<danbri> that section is now …
<danbri> <- here
<danbri> 'it's very sparse, but with lots of issues highlighting places where more discussion/ work needed to resolve the details. but fine for FPWD.
<danbri> danbri: are you proposing that we publish this?
<danbri> jeni: it looks ready. changed short name as req'd. refined abstract. i think addresses concerns from ivan/ralph discussion.
<danbri> ivan: to be precise, ralph didn't object as such; i was trying to anticipate possible issues. i think it's fine.
ivan: we had some discussion about adding text to the status section
<danbri> … we had some discussion re adding text to SOTD
ivan: which we could add to make it clear what's happening in relation to IETF
<danbri> jeni: fine adding text that dan suggested
AndyS: suggest using 'tabular-data-model' rather than 'tabular-model' to make distinct from eg HTML
JeniT: happy to make that change
<danbri> ivan: let's concentrate on data rather than html tables
danbri: we might extend to HTML tables at
... propose we publish as FPWD
<ivan> PROPOSED: the tabular data model should be published as FPWD as 'tabular-data-model'
<danbri> +1 from Jeremy
DavideCeolin: I just sent an email about the issue about XML conversion
DavideCeolin: I added that issue, so as far as I'm concerned it's fine
danbri: we can always make another WD, do we have to resolve this before we publish?
ericstephan: we've put all the use cases together: 18 use cases
... if there isn't a use case that you submitted, it might have been combined with another
... we have a number of requirements too
... but I believe we should be good to go for FPWD
danbri: consensus from the editors
... is short name fixed?
<ivan> PROPOSED: the use case document should be published as FPWD as 'csvw-ucr'
<danbri> +1 relayed from Jeremy
AndyS: 'csvw-ucr' is not what's in the document
<AndyS> Changing to shorter is fine.
ivan: yes, I propose changing to 'csvw-ucr' as it's short
ivan: there is a bit of a process
... danbri & JeniT will have to make the request for these short names
... point to the editor drafts & say what they do
... it won't be out of the blue
<danbri> ivan: no formal template, but not a big deal. can cc chairs.
ivan: in parallel I'll contact web master
<danbri> … i can contact webmaster as a placeholder
ivan: I've already started checking the documents; I'll merge changes etc
... I propose publishing 27th March
danbri: ok, we'll get the request out
... when do we lose you?
ivan: early April, hence trying to publish before then
danbri: what about minutes publishing?
ivan: I'll take care of it, but probably with some delay
<danbri> "Sub-groups to explore RDF/JSON/XML mapping systems (that address our use cases)"
<danbri> jeni: dan and i propose subgroups on particular conversions, for rdf and json; and if a requirement, for xml as well.
<danbri> see wiki page ^—
<danbri> each group would look at what info is needed to convert something in tabular data model + annotations, into xyz format
<danbri> "idea is … if each group does this semi-independently, as a whole group we can look at overlaps
<danbri> e.g. if each group needs to know about datatypes for particular values, we can resolve what that looks like for whole group.
<danbri> "is this a reasonable way forward?"
ivan: I'm worried about strictly separating the various conversions
... we discussed that the JSON and RDF conversions may be part of the same thing
... ie using JSON-LD
... I'm worried that if we strictly separate them, we'd remove a possible synergy
<gkellogg> The same could be said about XML, if RDF/XML is a reasonable way to publish XML
<danbri> jeni: i'm strongly of opinion that we should not aim for those synergies too early
<AndyS> I was going to join both because ... err ... will need both.
<danbri> we shouldn't prematurely assume that json users want json-ld too early. we might miss something.
<danbri> it's fine for the rdf group to think about how that might be done using XML, json(-ld), RDFa, etc.
<danbri> …but i want to make sure that we don't force people who are primarily interested in the browser to go through an unwanted rdf step
<Zakim> AndyS, you wanted to ask about a "direct mapping" style
AndyS: a common framework is to think of CSV as a big array of fields
... if that's how people are used to using it in the browser, the JSON conversion should factor that in
... we're looking very much at the annotated version of the data model
... I was wondering about the direct mapping, without annotations
<danbri> jeni: I think in all the cases we should be looking at a conversion, … a default conversion unguided by annotations; and that the annotations then are tweaks over the top
<Zakim> danbri, you wanted to suggest that someone can simultaneously work on json and rdf
AndyS: ok, we do have to factor in the direct mapping
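The "direct mapping" and "annotations as tweaks over the top" ideas discussed above can be illustrated with a minimal Python sketch. It is not a group decision or a specified algorithm: the function names, the sample data, and the shape of the `annotations` argument (column index mapped to a conversion function) are all assumptions made for illustration.

```python
import csv
import io


def direct_mapping(csv_text):
    """Parse CSV text into a big array of fields: a list of rows of strings."""
    return [row for row in csv.reader(io.StringIO(csv_text))]


def apply_annotations(rows, annotations=None):
    """Tweak the direct mapping with optional per-column converters.

    `annotations` is a hypothetical dict of column index -> callable;
    columns without an entry keep their string values.
    """
    annotations = annotations or {}
    header, body = rows[0], rows[1:]
    converted = [
        [annotations.get(i, str)(field) for i, field in enumerate(row)]
        for row in body
    ]
    return [header] + converted


sample = "name,population\nOslo,634293\nBergen,271949\n"
rows = direct_mapping(sample)             # no annotations: every field is a string
typed = apply_annotations(rows, {1: int})  # annotation: column 1 holds integers
```

With no annotations the result is exactly the array of string fields; the annotation only changes how the second column's values are represented.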
danbri: we should avoid being tribal here, eg words like 'RDF people' or 'XML people'
... this group is full of pragmatists who use a bunch of technologies
<AndyS> PS We did briefly mention doing RDF via the abstract data model. Should all work out with direct to JSON-LD.
danbri: ivan's concerns that we fork too early won't apply because we'll all be watching the mailing list
... we shouldn't rule out JSON-LD, but we shouldn't assume it
... the use cases should keep us on track
yakovsh: I agree with danbri that groups with subgroups tend to fracture
... also, are the conversions well-defined formats?
... do all three need their own media types?
danbri: I don't think we need media types
... the RDF group in particular, don't need them
yakovsh: there are JSON & XML media types that are separate
<danbri> jeni: yes, as discussed on the list, depends on how the conversion is done, e.g. you could map into JSON that explicitly says 'there is a table with these columns', …
<danbri> … json properties for table/column/row
<danbri> … which could be an explicit media type for tabular data in json
<danbri> (and same in xml, <row>, <column>, <field>, ...)
<danbri> ….but i think more likely our mappings will be based on particular csv original documents
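As a rough illustration of the "explicit tabular data in JSON" option mentioned above, the following Python sketch wraps a CSV file in a generic table structure. The property names `table`, `columns`, `rows` and `fields` are hypothetical and chosen only for this example; nothing here has been agreed by the group.

```python
import csv
import io
import json


def table_to_json(csv_text):
    """Build a generic, format-level JSON description of a CSV table."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # first row is assumed to hold the column names
    return {
        "table": {
            "columns": [{"name": name} for name in header],
            "rows": [{"fields": row} for row in reader],
        }
    }


doc = table_to_json("name,population\nOslo,634293\n")
print(json.dumps(doc, indent=2))
```

A structure like this says "there is a table with these columns" regardless of what the data is about, which is what would make a dedicated media type plausible; a mapping driven by a particular CSV document would instead produce domain-specific property names.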
gkellogg: I'd suggest that a direct mapping to JSON, based on column headings, is compatible with JSON-LD
... with zero edits, plus the context
... doesn't mean it solves every desire to convert the data
... I think a direct mapping plus a JSON-LD context does quite a lot
... and obviously gets us part way to RDF
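A minimal Python sketch of the idea of a direct mapping keyed on column headings, with a JSON-LD context supplied separately. The `http://example.org/vocab#` IRIs, the sample data, and the use of `@graph` to hold the row objects are assumptions for illustration; no JSON-LD processor is invoked, the code only assembles the JSON.

```python
import csv
import io
import json

# Hypothetical context: maps column headings to made-up vocabulary IRIs.
CONTEXT = {
    "name": "http://example.org/vocab#name",
    "population": "http://example.org/vocab#population",
}


def csv_to_objects(csv_text):
    """Direct mapping: one JSON object per row, keyed by column heading."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


def with_context(objects, context):
    """Wrap the unchanged row objects with a JSON-LD context."""
    return {"@context": context, "@graph": objects}


objects = csv_to_objects("name,population\nOslo,634293\n")
doc = with_context(objects, CONTEXT)
print(json.dumps(doc, indent=2))
```

The row objects are not edited at all; only the context is added. A consumer that ignores `@context` still sees plain JSON objects keyed by column heading.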
ivan: one more question from yakovsh was about well-defined format
... we need more than an example: a clear formal specification of a mapping
... a clear standard mapping definition
... also I wanted to add: I haven't looked at the details of the use cases in terms of these mappings
... but it would be good to look at the practice on how the mapping to JSON happens in those examples
... what is the usage of that in the real world
... and adding use cases on the conversion side rather than the structure of the CSV file
... those use cases & requirements might be useful
... there's no point having a specification that's completely different from what's done out there
<Zakim> danbri, you wanted to ask if we should be targeting existing XML and JSON idioms (e.g. SportsML for a sports CSV)
<danbri> eg. ical https://tools.ietf.org/html/rfc6321
danbri: as we map into particular XML languages etc: should we have a goal of mapping into fixed formats?
<yakovsh> open document xml
danbri: eg ical, SportsML
... in XML we might use XSLT to map into one of these formats
... but is that our aim?
<danbri> ack ivan?
ivan: I think it's somewhere in between: I think if we say we should be able to map a CSV file to any XML schema or any RDF vocabulary out there, we will have to define something fairly
... it's equivalent to what the RML work did, on converting relational databases to RDF
... that said, if there's something in the middle, in simple cases we might add something in the metadata
<AndyS> To gregg -- /me worried about putting JSON-LD algorithms on RDF path.
ivan: to help with the conversion, to produce an RDF closer to what's desired
... I wouldn't want to be able to do it automatically with any vocabulary out there
<danbri> ack jenit?
<danbri> jenit: in other places, you can hand off to another system for the conversion
<andimou> *equivalent to what R2RML does with Relational Databases, RML extends R2RML and maps CSV/XML/JSON to RDF
<danbri> eg. grddl turns normal XMLs into RDF via XSLT files
<danbri> we might want to consider that we do need to discuss on the list, whether we want to have the option to bug out to such languages
<ivan> +1 to Jeni
<danbri> …eg. ptr to an xslt file over a standard mapping into a specific format
<danbri> or construct
<danbri> jeni: using those kinds of langs might be worthwhile
danbri: an easy win is to reuse the hard work of other groups
... eg the relational-to-RDF mapping work
... Barry Norton has mapped musicbrainz data into linked data
... he's dropping down into SQL all the time
... in some use cases, that's just right: in other cases XSLT might give us that flexibility
yakovsh: we're talking about conversion into JSON/XML/RDF, very removed from what users see
<danbri> musicbrainz example: https://github.com/LinkedBrainz/MusicBrainz-R2RML/blob/master/mappings/artist.ttl
yakovsh: what about open document XML and open XML, the document formats?
yakovsh: if we're talking about conversions into formats, those are probably more common than anything else
... getting a spreadsheet out of CSV is going to be very common
danbri: that might be an unarticulated aspect of one of our existing use cases, we should take a look
<danbri> ACTION: danbri scan use cases to see if http://en.wikipedia.org/wiki/Comparison_of_Office_Open_XML_and_OpenDocument are mentioned/implied [recorded in http://www.w3.org/2014/03/19-csvw-minutes.html#action01]
<trackbot> Created ACTION-8 - Scan use cases to see if http://en.wikipedia.org/wiki/comparison_of_office_open_xml_and_opendocument are mentioned/implied [on Dan Brickley - due 2014-03-26].
ivan: if I look at open document format, if I converted to that, it's down to Excel or OpenOffice, but these systems can do it directly from CSV
... so there's no need to convert
... is there a significant use case of systems other than traditional spreadsheet programs
... that want to manipulate another format because they can't use the comma-separated version
danbri: last time I looked at the open office format, it was a container that had lots of extensibility points
... even within XML languages, some are more specific than others
yakovsh: the point might be that these programs already handle CSV conversion, so no point because it's already built in
ivan: yes, I want to see if there are uses outside traditional spreadsheets
<danbri> jenit: also considering importing into a relational database
<danbri> … also can apply to spreadsheet formats
<danbri> … would be useful to know types of columns, to format; to create database structures etc.
<danbri> … maybe we need a kind of focus in this area, to make sure we collect useful metadata
<danbri> jenit: do we need a focus group on reading csv into spreadsheets, relational databases
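To make the point about column types concrete, here is a small Python sketch that uses per-column type metadata to create a database structure before loading a CSV. The `column_types` dict stands in for whatever metadata format the group eventually defines and is purely hypothetical, as are the table and column names; SQLite is used only because it ships with Python.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text, column_types):
    """Create a table from the CSV header plus type metadata, then load rows.

    `column_types` is a hypothetical mapping of column name -> SQL type;
    unlisted columns default to TEXT. Names are assumed to be trusted.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    cols = ", ".join(
        f'"{name}" {column_types.get(name, "TEXT")}' for name in header
    )
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    return header


conn = sqlite3.connect(":memory:")
load_csv(
    conn,
    "cities",
    "name,population\nOslo,634293\nBergen,271949\n",
    {"population": "INTEGER"},
)
total = conn.execute('SELECT SUM("population") FROM "cities"').fetchone()[0]
```

Without the type metadata every column would be created as TEXT; with it, the `population` column is declared INTEGER and can be aggregated as a number.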
danbri: why not MatLab or R etc?
JeniT: I think we should be looking at those: that's what data scientists use
danbri: we should make some initial forays and see where we get
<danbri> jenit: maybe given earlier discussion, instead of 'sub-groups', think of them as products we're aiming at
<danbri> so we'd need a lead editor on each, with co-editors etc
<danbri> … and yes let's have a 4th, csv reading into tabular data stores of various kinds
<danbri> (dan: I'd say structures/frameworks not stores)
yakovsh: I believe Part 9 of SQL discusses how they load CSVs
... we should look at that
JeniT: yes, all of the conversions are going to have to take into account existing work
yakovsh: Postgres is the only one I think that has a full implementation
danbri: musicbrainz data is shared as a Postgres dump, which isn't a good way of sharing data
... maybe they could look at CSV support from Postgres
<Zakim> AndyS, you wanted to ask about "sub group"
AndyS: what does 'sub group' mean? are we discussing on the main mailing list?
danbri: we'll focus on products: all discussion on main mailing list, but having these as focused efforts & documents
danbri: anyone have a clear view on which days they want to meet? Mon/Tue or Thur/Fri?
ivan: I'd like Mon/Tue because I have to be in another group Thur/Fri
<danbri> ivan: pref mon/tues
<yakovsh> no particular pref
danbri: any objections to mon/tue?
<danbri> ACTION: danbri to request mon/tues tpac meeting [recorded in http://www.w3.org/2014/03/19-csvw-minutes.html#action02]
<trackbot> Created ACTION-9 - Request mon/tues tpac meeting [on Dan Brickley - due 2014-03-26].
danbri: please someone volunteer to scribe next week
<danbri> thanks all