CSV on the Web Working Group Teleconference -- 12 Mar 2014

<danbri> https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-03-12

<danbri> http://www.w3.org/2014/03/05-csvw-minutes.html

seeking approval for minutes?

no objections

minutes approved

<AndyS> Spec looks OK to me.

<AndyS> Indeed, too good

Model for Tabular Data and Metadata on the Web

<danbri> http://w3c.github.io/csvw/syntax/

AndyS: current document is perhaps too polished for FPWD?
... it might be nice to include some technical requirements to improve potential engagement

danbri: so this looks too finished?

AndyS: but then again, let's publish

danbri: does anyone think we are / are not ready to submit this for FPWD?

ivan: missing the proposed "how to find metadata" section as from the email thread http://lists.w3.org/Archives/Public/public-csv-wg/2014Mar/0043.html

<AxelPolleres> maybe add a Note making that non-only-csv-scope clear?

fresco: nothing to add; note that the section on Parsing is work in progress ...

danbri: that's fine, it's FPWD

fresco: would like to edit some "wrong" parts in the algorithm
... should be able to do this this week

danbri: any open issues?
... let's step through the open issues

<danbri> looking, http://w3c.github.io/csvw/syntax/ "It might be useful to define annotated regions as follows"

danbri: issue 1, annotated regions
... issue 2, content types for CSV
... we're not getting far with issue by issue review

<danbri> see mention of 'mime' in http://www.w3.org/2014/03/05-csvw-minutes.html

ivan: regarding content type, we discussed this last week with Yakov ... the doc needs to be updated to reflect those discussions.

<danbri> ACTION: danbri make sure syntax doc updated in light of mar 5 mimetype discussion [recorded in http://www.w3.org/2014/03/12-csvw-minutes.html#action01]

<trackbot> Created ACTION-1 - Make sure syntax doc updated in light of mar 5 mimetype discussion [on Dan Brickley - due 2014-03-19].

<scribe> ACTION: danbri to make sure the syntax doc includes the discussion from the mailing list [recorded in http://www.w3.org/2014/03/12-csvw-minutes.html#action02]

<trackbot> Created ACTION-2 - Make sure the syntax doc includes the discussion from the mailing list [on Dan Brickley - due 2014-03-19].

use cases and requirements

jeremy: excellent progress on use cases doc, http://w3c.github.io/csvw/use-cases-and-requirements/

in terms of our status we have about 14 use cases already in the doc

eric is still in process of adding a few more use cases in

[eric confirms, 2 more to add.]

eric: those 2, … one is the city trees spreadsheet, city of palo alto,

jeremy: other is displaying care home locations on a map

… similar to tree one in that it is geospatial, but also has an interesting web widget aspect

jeremy: in terms of use cases that are already in the doc, …

eric has been getting set up w/ git.

re use cases 7, 12 and 17, they don't currently have any binding to requirements

eric: […] were combined, …

jeremy: it reads more smoothly now

… but look at uc 7 now, there is no ref to requirements currently within the text

compare for example w/ 8 from davide, which is clearer about its requirements.

we need to put in an explicit rel between the uc and the requirements that motivate it, based on existing examples

jeremy: if we find requirements we haven't covered yet, we can add more requirements

e.g. in 7 we might want to add something for substructure

ivan: e.g. dates?

jeremy: like a microsyntax, … in this case, the uc we're talking about, list of author names is a comma-delimited list
... we've talked about replicates on prev queues
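
To make the substructure point concrete, here is a minimal Python sketch of a cell that carries a comma-delimited list of author names. The sample data, the column names, and the helpers `split_cell` and `read_with_substructure` are all hypothetical and are not taken from the use case document; the sketch also assumes that no individual name contains a comma.

```python
import csv
import io

# Hypothetical sample: the quoted "authors" cell holds a comma-delimited list.
SAMPLE = 'title,authors\n"On Tables","Ada Lovelace, Alan Turing"\n'


def split_cell(value, sep=","):
    """Split a cell holding a delimited list into trimmed, non-empty items."""
    return [item.strip() for item in value.split(sep) if item.strip()]


def read_with_substructure(text, list_columns):
    """Parse CSV text and expand the named columns into Python lists."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for col in list_columns:
            row[col] = split_cell(row[col])
        rows.append(row)
    return rows


rows = read_with_substructure(SAMPLE, ["authors"])
print(rows[0]["authors"])  # ['Ada Lovelace', 'Alan Turing']
```

The CSV parser only sees one quoted string in the authors column; the inner list is recoverable only because the caller knows, out of band, which columns to split and on what separator. That out-of-band knowledge is the kind of thing a substructure requirement would have to capture.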

fresco: will send another example to the list

<jtandy> http://lists.w3.org/Archives/Public/public-csv-wg/2014Mar/0063.html

jeremy: also sent an email to public list on substructure <- url

other things before FPWD: we need to renumber the use cases, currently they reflect the numbers in our wiki page rather than seq for the doc

… remove the other usecases section

… there is a proposal from jeni to cluster the requirements

… currently they are all filed as 'proposed' vs 'accepted'

ivan: i think it's ok [to leave as proposed]

… this doc isn't planned as a REC

ivan: socially it's ok if we say 'proposed' as 1st draft

jeremy: yes, that allows wider engagement
... some more work to be done pulling out requirements

eric: one thing i was doing on the journal example, was going to the live web site, …

… as a matter of policy is it better to link to live websearch or a screenshot?

jeremy: what i've done is taken a snippet of the data, ...

danbri: ivan, any advice?

ivan: safer for final doc, for draft i don't mind so much

jeremy: [live sites can] change or ultimately disappear

ivan: if you think screenshot now is safer, go for it; if around for a year, we can worry later

jeremy: for me, so long as we capture the motivation and we have sufficient info captured, that's enough provenance to justify the spec, no need to completely reconstruct everything

danbri: can also just copy screenshot into wiki or github repo, even if not used ultimately in the doc

eric: also i found using the live website very useful

davide: just to confirm i am indeed reconstructing use case 20

… as j mentioned it does need some attention, and there may be more requirements found, or mapped, as i do this

… others are ok but open for comments

… working on those last 2 use cases

jeremy: use case no.20, representing entities and facts extracted from text

… reading it now, a typo in title

… but also note that there is potential for substructure in a given field

jeremy: other things: any latent requirements coming from csv-ld proposal?

i've not tried linking into that, or csv2rdf email thread

ivan: these are slightly two different things

the way i read gregg's mail is jumping ahead to possible solutions

… less on use cases, more on technology

[gregg, can you comment? I was going to say same as Ivan]

ivan: it would be good to have [the rdb2rdf stuff] in the doc somewhere

can add a placeholder saying 'here is an area where we need use cases'

jeremy: there are two use cases with a csv 2 rdf conversion

the land registry data, … and digital preservation use case

ivan: i'd still like to see the same type of use cases for json and xml

danbri: maybe nobody cares about those? :)

<scribe> ACTION: jeremy add an issue into UC doc about lack of json and xml conversion [recorded in http://www.w3.org/2014/03/12-csvw-minutes.html#action03]

<trackbot> Created ACTION-3 - Add an issue into uc doc about lack of json and xml conversion [on Jeremy Tandy - due 2014-03-19].

<AndyS> The charter has a link to CSV to XML

ivan: we should talk to rufus as he had csv 2 json centrally in their work

gregg: without going into details, csv-ld, it was a brain dump following jeni's earlier work

… thought it was worth capturing

<Zakim> gkellogg, you wanted to briefly discuss CSV-LD

worth comparing with [?] to see how it maps

<mielvds> technology wise there was also the RML mail, talking about json, xml and csv in an integrated manner: http://lists.w3.org/Archives/Public/public-csv-wg/2014Feb/0132.html

danbri: any motivations for csv-ld that aren't in our UC doc?

gregg: there are a couple in the doc

jtandy: to confirm, gregg will cross-ref the csv-ld and compare requirements, and identify anything missing

<scribe> ACTION: gregg cross-ref the csv-ld and compare requirements, and identify anything missing [recorded in http://www.w3.org/2014/03/12-csvw-minutes.html#action04]

<trackbot> Created ACTION-4 - Cross-ref the csv-ld and compare requirements, and identify anything missing [on Gregg Kellogg - due 2014-03-19].

jtandy: looking at outstanding issues in UC doc

1st issue: in dig preservation use case, waiting on feedback from original donor (adam retter(?))

<jtandy> http://w3c.github.io/csvw/use-cases-and-requirements/#UC-DigitalPreservationOfGovernmentRecords

jtandy: issue 2, related to pointing to regions within a csv; haven't seen a req to do this yet, so raised an issue.

ivan: also in syntax doc

jtandy: so we have consensus at least
... issue 3, the 0-edit compatibility requirement is currently orphaned

when i reorg'd, this was left dangling

issue 4: need def for data schema from jenit contrib'd uc on public roles, salaries

<scribe> ACTION: jenit provide data definition schema for public roles and salaries to jeremy for UC doc [recorded in http://www.w3.org/2014/03/12-csvw-minutes.html#action05]

<trackbot> Created ACTION-5 - Provide data definition schema for public roles and salaries to jeremy for uc doc [on Jeni Tennison - due 2014-03-19].

5th issue, clustering into sections.

6th issue, need correct ref for RDF. Is it 1.1?

danbri, ivan: yes

<trackbot> Created ACTION-6 - Mail jtandy correct reference for rdf1.1 [on Ivan Herman - due 2014-03-19].

more comments from eric, davide?

no, no.

<gkellogg> RDF 1.1 Concepts <http://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/>

ivan: one question ... use cases are structural in terms of the CSV file itself ... should we have use cases about the problems raised by size of CSV files
... these CSVs might be gigabytes long

<ericstephan> Nice point Ivan

ivan: example in the UK, national (?) service

<ericstephan> biologists are real "troublemakers" in this area ;-)

<danbri> in https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-03-12 we have "FPWD decision"

ivan: data is too big for most applications to manage

jtandy: there are several use cases that already reflect that size is an issue

ericstephan: there are other problems about "size" such as the number of columns

ivan: the problem I see relates to the number of rows

ericstephan: look for a requirement to stream arbitrary numbers of rows

<AndyS> and lots of cols and very large cell entries
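
As an illustration of the streaming requirement raised here, the following Python sketch processes rows one at a time, so memory use does not grow with the number of rows. The helpers `stream_rows`, `count_where` and `generated_lines`, and the `id`/`kind` columns, are hypothetical; the sketch assumes a header row and does not address very wide rows or very large cells.

```python
import csv


def stream_rows(lines):
    """Yield one parsed row at a time from any iterable of text lines."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    for record in reader:
        yield dict(zip(header, record))


def count_where(lines, column, value):
    """Count matching rows without holding the table in memory."""
    return sum(1 for row in stream_rows(lines) if row[column] == value)


def generated_lines(n):
    """Stand-in for a very large file: produces n data rows lazily."""
    yield "id,kind\n"
    for i in range(n):
        yield f"{i},{'tree' if i % 2 == 0 else 'home'}\n"


print(count_where(generated_lines(100000), "kind", "tree"))  # 50000
```

Only the header and the current row are held at any time, which is what makes an arbitrary number of rows tractable. The same approach works with an open file object in place of `generated_lines`.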

danbri: XML close tags make streaming difficult
... any use cases dealing with clustered computing / map reduce

jtandy: no

<Zakim> danbri, you wanted to ask if we have cluster-oriented (map reduce etc) scenarios

<danbri> danbri: jeremy davide eric, when are we ready for FPWD?

danbri: are we ready for FPWD on use cases? not yet ... how long

DavideCeolin: at most a couple of days for me

ericstephan: hoping to get done by friday

danbri: should have the doc ready to review for release as FPWD next week

ivan: from end of march, I will be unavailable for 2 weeks ... conferences and meetings

ivan: suggest that we aim to publish for w/c 24 March ...
... the process means we do this on tuesday or thursday
... either the 26th or 28th
... we need to send the docs to the publishers
... need formal resolution next week for _BOTH_ documents to allow time for the publishing process
... need a formal vote on next week's teleconf

danbri: allow 1 or 2 days for participants to review
... therefore get ready to share on Monday

AndyS: there's no direct reference to CSV+ ... do we need to clarify?

ivan: we do have an item in the charter on the metadata ... this is the core of the recommendation, everything else is non-normative
... certain work will be given to IETF to standardize CSV

AndyS: tabular data places additional rules on CSV to make it amenable ...
... if you're comfortable that we've got the issues covered then OK

<danbri> "The titles of the deliverables are not final; the Working Group will have to decide on the final titles as well as the structures of the documents. The Working Group may also decide to merge some deliverables into one document or produce several documents that together constitute one of the deliverables." http://www.w3.org/2013/05/lcsv-charter

danbri: there's enough wiggle room, but we should write down the mapping from doc to the charter

AndyS: lack of explicit naming can lengthen the publishing process (and has done in other WGs)

ivan: if we want to publish w/c 27th, we need a vote next week

danbri: Ivan - please have a chat with Ralph in parallel to group vote

ivan: let's do this with Ivan, Jeni and danbri

danbri: ok
... between the three of us

<trackbot> Created ACTION-7 - With jenit, ivan talk to ralph about fpwd doc names w.r.t. charter [on Dan Brickley - due 2014-03-19].

<danbri> jtandy: not available for next week's call

<danbri> will submit a 'postal vote'

<AxelPolleres> regrets for next week as well, but FWIW I (WU) am fine to publish FPWD, if it is allowed to pre-vote.

danbri: scribe volunteer for next week

[silence]

TPAC

danbri: face to face meeting at TPAC
... TPAC is great
... any objections to meeting at TPAC?

[silence]

ivan: nothing to add

<AxelPolleres> Santa Barbara?

<gkellogg> I believe TPAC is in Santa Clara again this year.

<danbri> http://www.w3.org/2014/11/TPAC/

danbri: aim for the monday / tuesday WG slot

<AndyS> 27-31 OCTOBER 2014 despite the 2014/11

danbri: aim to fill out form with Jeni later this week
... out of time
... thanks

<danbri> thanks all!

<danbri> TPAC is generally a much more interesting trip than travelling just to one WG meeting

<danbri> (but it takes a week out of your life)
