W3C

CSV on the Web Working Group Teleconference

05 Nov 2014

Agenda

See also: IRC log

Attendees

Present
Gregg Kellogg (gkellogg), Ivan Herman (Ivan), Dan Brickley (danbri), Eric Stephan (ericstephan), Jeni Tennison (JeniT), Bill Ingram (bill-ingram), Andy Seaborne (AndyS)
Regrets
Axel, Jeremy
Chair
Jeni
Scribe
danbri


<trackbot> Date: 05 November 2014

Review of F2F decisions and actions

<JeniT> agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-11-05

<ericstephan> I will be dropped off of IRC at half past the hour to take my daughter to school, I will remain on the phone.

jeni: we'll go over list of resolutions/actions from f2f; and then go through issues from github that are flagged as needing attention.

<JeniT> http://lists.w3.org/Archives/Public/public-csv-wg/2014Oct/0110.html

see ivan's super-helpful list at url —^^

jenit: regarding resolutions, we talked about defining title + language as our minimal metadata, which is now in the draft. As is resolution not to support geopoint; 3rd, not object array geojson; also 4th. Also syntax resolutions

…but I have not yet put anything in on the imports property yet, still todo

JeniT: Implemented "We drop the ability in the schema to specify metadata at the row or cell level"

<JeniT> sorry, Skype dropped

Ivan: some resolution for conversion doc. We have not yet touched the doc itself. Jeremy and I talked yesterday morning on resolutions there.

… I pushed them all onto the issue list

ivan: some issues that we pushed there, … and I think the resolutions are fine, of course there are issues that we have to solve
... Jeremy and I have other commitments so we won't change the actual doc until next week or so. There are a bunch of issues on github

… some are trivial issues but still need decisions

[missed]

[re needs teleconf discussion flag in github]

jenit: can we run through those?

(still going through those resolutions and actions)

jenit: URI Template piece, …

Regarding """We will use URI templates for generating URLs for objects created from rows, for mapping of cell values, and for predicate URIs""" … I have added some things to metadata doc for that

but not cell values, or uris [missed detail]

jenit: have created github issues for each issue in syntax and metadata docs

… have put the csv configuration [echoey noises] …

[strange noises like in a carpark stairwell]

<ericstephan> okay now

it comes and goes

you're fine now

the strangled dalek thing doesn't suit you

jenit: so i was asking if there was anything that anybody thought of around those resolutions. But we were all there.

<JeniT> “For now, and without limiting our future work, that we scope to the simplest possible thing that might work and therefore do not have multi-object per line or multiple columns per value"

yup, that's all i remember recording

ah CG chartering

ivan: we mentioned possibility of a per-column skip flag

… did we agree on this?

jenit: still pending

ivan: something to be recorded in issue list
... my proposal is to have it, for the following reasons: the URI template is rich enough, if my understanding is correct, that I can create an object URI that is put together based on several cells of the same row

… because a URI template may contain several templates, each can contain any col name, which means that combining … e.g. one col gets the day, another the month, then these can be combined via URI Templates, in which case one of the two cols becomes superfluous in terms of the mapping
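[Editorial illustration, not part of the discussion: a minimal sketch of how one URI Template could draw on several cells of the same row, as ivan describes. It handles only simple {name} expressions from RFC 6570; the helper name expand_simple, the column names and the example.org base URL are all assumptions made for illustration.]

```python
import re
from urllib.parse import quote


def expand_simple(template, row):
    """Expand {name} expressions using values from one row (simple expansion only)."""
    def substitute(match):
        # Percent-encode every character outside the unreserved set.
        return quote(str(row[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


# Hypothetical row in which the date is split over two columns.
row = {"month": "11", "day": "05", "event": "CSVW telecon"}
uri = expand_simple("http://example.org/events/{month}-{day}/{event}", row)
print(uri)  # http://example.org/events/11-05/CSVW%20telecon
```

[Here both the month and day columns end up inside the object URI, which is the situation that prompted the question of whether such columns should also be mapped as values.]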

+1 i'm persuaded.

jenit: my pushback would be that first, it may well be output specific. You might want to preserve it in RDF but not in JSON

ivan: that's true

-1 I'm now unpersuaded.

jenit: … basically if you're creating an URL, they're supposed to be opaque, … if they are being built from meaningful information you should be capturing that meaningful info elsewhere anyway

… postprocessor can always ignore data

<JeniT> PROPOSAL: We don’t add a flag that indicates that a column should be skipped when mapping to other formats, because we don’t want people to have to hack URLs to get hold of data

<Zakim> gkellogg, you wanted to ask about URI unencoding URI templates not used for URIs

gregg: talking about use of URI Templates for things other than URIs like dates

… discussed possibility of unencoding some of these outputs

(data: URIs too? --danbri)

gregg: you could use a URI Template, and emit spaces which become %20 escaped markup

… if you wanted a literal string value, unencoding the result of that template would let you get the result you wanted i.e. with real spaces
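[Editorial illustration: a small sketch of the round trip gregg describes, assuming the template expansion percent-encodes spaces. The cell value is hypothetical. The group resolves later in this meeting not to use templates for atomic values.]

```python
from urllib.parse import quote, unquote

value = "5 November 2014"         # hypothetical cell value
expanded = quote(value, safe="")  # what a template expansion would emit
literal = unquote(expanded)       # decoding restores the real spaces

print(expanded)  # 5%20November%202014
print(literal)   # 5 November 2014
```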

<JeniT> PROPOSAL: We don’t add a flag that indicates that a column should be skipped when mapping to other formats, because we don’t want people to have to hack URLs to get hold of data

<ivan> +1

ivan/jeni: lets come back to that later

<gkellogg> +1

<ericstephan> +1

<bill-ingram> +1

<JeniT> +1

<ivan> RESOLUTION: We don’t add a flag that indicates that a column should be skipped when mapping to other formats, because we don’t want people to have to hack URLs to get hold of data

+0.01

<ericstephan> its still positive

ivan (re gregg's point about URIs to strings): I am pretty opposed to using URI templates for anything but URIs

(I agree w/ Ivan, let's not endorse URI Templates as a replacement for Mustache/Django/XSLT/etc)

ivan: if I see a URI Template, what I'll make is a URI Reference, otherwise a literal

… so that means if somebody would unescape things that would be a wrong URI

… could v easily lead to problems, should be a URI

gregg: (or ericstephan?) … re simple mappings, only mechanism for multi-column combination is URI Templates

ivan: my example is dated URIs
... I've been convinced not to do that anyway

did JeniT rejoin?

ivan: URI Templates are (for) URI Templates

jenit: since we discussed it, and I don't want to come back on it again, it would be good to have it as a decided issue

… maybe I'll do the proposal for it

<JeniT> PROPOSAL: We only use url templates for url values, not for atomic values (eg dates)

<JeniT> +1

<gkellogg> +1

<ericstephan> +1

<ivan> +1

<ivan> RESOLUTION: We only use url templates for url values, not for atomic values (eg dates)

+1 (assuming url means URI/IRI/etc too)

<bill-ingram> +1

jenit: so we have that decision for reference

… moving on to list of post-f2f actions

first was for Dan to write something about sitemaps.

dan: continue please

jenit: had an action to do something to section 3.4, [DONE]

… one on axel who isn't here

… one on dan/jeni to talk to people who aren't active, needs doing

<AndyS> regrets for at least the next 2 telecons and may be more.

re dan to set up CG - will discuss later

one on ericstephan to talk to bernadette, … started but continue action please

one on gregg looking at github/respec linkage

ivan: gregg showed me what to do, i changed the conversion doc, it is not 100% but we can now link to the issue tracker

<ericstephan> dropping off of IRC, will stay on the phone with (509-554)

… i was thinking more how to link from issue tracker back to the relevant document area

ivan: in the doc you can link to an issue pretty easily, so copy/paste from the two conversion docs to get the markup. Works fine for that.

gregg: on reverse linkage it is possible to go back from individual commits to visibility within the issues, use the #-sign with the number and it will show up accordingly.

… doesn't take you to a place in a doc as such, but to the 'diff' view.

ivan: it is a good habit when we push something, that we list the issues that are relevant to the changes in the comments.

gregg: ideally a commit is relevant to a particular issue

ivan: great

draft https://gist.github.com/danbri/5593faaa79a9ef30c098

<gkellogg> If you include, say “issue #1” in a commit message, it will add a reference to that commit in the issue on GitHub.

AndyS: need to be a little bit careful, … since this WG isn't 100% clear on what it'll do

… maybe better to wait til we're clear on what we'll do here

jenit: i think we have a pretty clear consensus from last week about the limits of the mappings that we're considering

AndyS: dan's wording was "supplement, support and enrich" the WGs output

jenit: so your concern is that it might sound too close a rel

andys: maybe better to say that it goes beyond the work of the WG and goes into new areas
... that it doesn't draw the line between the two

jenit: maybe it should mention the extension mechanism

andys: good idea

<scribe> ACTION: danbri circulate revised Advanced Mappings CG draft to list, reflecting today's discussion [recorded in http://www.w3.org/2014/11/05-csvw-minutes.html#action01]

<trackbot> Created ACTION-56 - Circulate revised advanced mappings cg draft to list, reflecting today's discussion [on Dan Brickley - due 2014-11-12].

[dropping action on ivan to … something something url templates for predicates]

ivan can you clarify which action was dropped?

<JeniT> PROPOSAL: we will not use url templates for predicate urls (for RDF mapping)

<ivan> +1

<JeniT> +1

+0

<gkellogg> +0

<bill-ingram> +1

(my mood is: URI Templates are super powerful, and will need code library support, so if they're in our toolset why not exploit them, at least when generating URLs)

resolution?

<ivan> RESOLUTION: we will not use url templates for predicate urls (for RDF mapping)

(I think it is important that predicate URLs can be from well known vocabs, but there are various ways to achieve that)

jenit: action on me to create foreign key pattern, is in metadata spec based on uc 4 and other examples, would be good to get that review. link:

<JeniT> http://w3c.github.io/csvw/metadata/#examples

jenit: considering uc-4, at schema level, refs are between schemas. All instances of these schemas will be referencing each other, ...

… showing how that might work, whether it works

… borrowed pattern from the data package schema

ivan: i think that what we discussed is that the primary key would essentially disappear, but you reuse it in example 22, and i'm not sure what it means

… but it is certainly different than what we had in prev version

jenit: primary key refs a col or number of cols, which provide a primary key for each row

ivan: but why do you need it

… the ref in the other table refers to the col name in the other table

jenit: you don't need it for the foreign key bit to work

ivan: i think what we said is that the primary key is used so far to generate a URI for a row, …

… but that now has been exchanged against a URI Template for a row

… my understanding is that primary key as such disappears

jenit: there's a validation point around primaryKey which is that … e.g. primaryKey is family_name, given_name

… which is that you're asserting that comb of fields on each row is unique

… you don't get that from URI Template

… yet it is still a useful thing to say
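[Editorial illustration: a minimal sketch of the validation point jenit raises, i.e. checking that the combination of primary key columns is unique across rows. The function name duplicate_keys and the sample data are assumptions; only the family_name, given_name key comes from the discussion.]

```python
import csv
import io
from collections import Counter


def duplicate_keys(rows, primary_key):
    """Return the primary key tuples that occur on more than one row."""
    counts = Counter(tuple(row[col] for col in primary_key) for row in rows)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical table in which one name combination is repeated.
data = io.StringIO(
    "family_name,given_name,city\n"
    "Smith,Alice,Leeds\n"
    "Smith,Bob,York\n"
    "Smith,Alice,Bath\n"
)
rows = list(csv.DictReader(data))
dups = duplicate_keys(rows, ["family_name", "given_name"])
print(dups)  # [('Smith', 'Alice')]
```

[A URI Template alone would silently produce the same URI for both "Smith, Alice" rows; the uniqueness assertion is what lets a validator flag them.]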

ivan: I understand. But we must be clear that if primaryKey generates [missed]

jenit: i think separate. take your point that we need to be clear. will record issue.

<JeniT> https://github.com/w3c/csvw/issues/63

<Zakim> danbri, you wanted to note that Kingsley / OpenLink volunteered to review - is it ready?

(jenit will act on this after issues are clarified)

ivan: memory fading but … we got to a structure whereby it turns out that generating the proper refs and URIs etc is relatively easy, …

…when i look at what you have here it looks like it is not

ivan: how do i generate in 1st country slice, i have to generate a URI into the other one

… i think we said we'll use frag id to get into the other one

… but when i see just one row for e.g. Andorra

… I know from the metadata that there is a ref somewhere to the other stuff for the country column, but how do i find out which row i should refer to?

… let's say for Afghanistan you have 3 rows

ivan: or do we say that we refer to the column as a whole in terms of the fragment URI?

jenit: can we work through this in an issue rather than doing on phone?

ivan: we have an action with jeremy to work out what we have to do

… i realised/remembered we came with some simplification of the foreign key that made it easy,

jenit: I'll try to reconstruct that using eg. 22.

ivan: buffer overflowed

jenit: rest of actions —

were on jeremy and ivan, all deferred until roughly next week or after when Jeremy is back.

last 10 mins-

<JeniT> https://github.com/w3c/csvw/labels/Requires%20telcon%20discussion/decision

Github issues

Reviewing issues at https://github.com/w3c/csvw/labels/Requires%20telcon%20discussion/decision as things we can try to close or discuss.

jenit: any blocking ivan in particular?

ivan: not particularly. We should try to have some discussion of each on email, generally. These should have fairly quick resolutions.

jenit: let's go through them

<JeniT> https://github.com/w3c/csvw/issues/7

issue 7: q was whether generated rdf should carry some provenance info

<ivan> http://lists.w3.org/Archives/Public/public-csv-wg/2014May/0110.html

… he had a sketch of what might be put there, which comes from the PROV WG's vocab and could be added verbatim with few changes

issue is whether we should have something like this, vs not, with simple mapping

ivan: seems reasonable to have something on provenance in generated rdf, but might be overkill

jenit: so basically including provenance info, automatically, about the generation of the rdf, in the rdf that is generated

(feels like a 'MAY' not 'MUST' on tooling; —me)

ivan: there are two, simple vs complex, … the simple example seems pretty clear

… but i don't know whether it is useful

<gkellogg> Seems like a good idea.

ivan: I'm fine w/ dan's having it as MAY on tools, but would still need speccing

'informational' section of spec?

danbri: suggest giving it as an example of where tools can go beyond the core spec

ivan: should we mandate the minimal though?

… i'd be fine saying it is the minimal thing that ought to be there

jenit: github issues + their comments for majority of our technical discussion

… please flag any that you want discussed at telecon time

<ivan> PROPOSED: An RDF conversion processor MUST generate provenance information of the form of the first snippet in http://lists.w3.org/Archives/Public/public-csv-wg/2014May/0110.html, and MAY generate richer provenance information

… we'll use that as primary way we'll allocate our call time each week

jenit: ivan's proposal —^^

<gkellogg> +1

<JeniT> +1

<ivan> +1

+1

<bill-ingram> +1

<ivan> RESOLUTION: An RDF conversion processor MUST generate provenance information of the form of the first snippet in http://lists.w3.org/Archives/Public/public-csv-wg/2014May/0110.html, and MAY generate richer provenance information
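[Editorial illustration: a rough sketch of what minimal provenance from a converter might look like, written as Turtle using terms from the PROV-O vocabulary. This is not the snippet from the linked mail, which is not reproduced in these minutes; the function name, the choice of properties and the example.org URIs are assumptions.]

```python
from datetime import datetime, timezone


def minimal_provenance(output_uri, csv_uri, started, ended):
    """Return Turtle linking the generated RDF to a conversion activity and its CSV source."""
    return (
        "@prefix prov: <http://www.w3.org/ns/prov#> .\n"
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
        "\n"
        f"<{output_uri}> prov:wasGeneratedBy [\n"
        "    a prov:Activity ;\n"
        f'    prov:startedAtTime "{started.isoformat()}"^^xsd:dateTime ;\n'
        f'    prov:endedAtTime "{ended.isoformat()}"^^xsd:dateTime ;\n'
        f"    prov:used <{csv_uri}>\n"
        "] .\n"
    )


# Hypothetical conversion run.
start = datetime(2014, 11, 5, 16, 0, 0, tzinfo=timezone.utc)
end = datetime(2014, 11, 5, 16, 0, 2, tzinfo=timezone.utc)
ttl = minimal_provenance(
    "http://example.org/out.ttl", "http://example.org/data.csv", start, end
)
print(ttl)
```

[A processor following the resolution could emit richer provenance than this, but would be expected to emit at least the agreed minimal form.]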

ivan: i'll add to issue list and then close it

jenit: once you've done the draft

… when it's actually in the doc, rather than when decided

(yeah, let's write 'RESOLVED' into the still open issue)

jenit: kthxbye

Adjourned.

thanks all, thanks bots :)

<ivan> trackbot, end telcon

Summary of Action Items

[NEW] ACTION: danbri circulate revised Advanced Mappings CG draft to list, reflecting today's discussion [recorded in http://www.w3.org/2014/11/05-csvw-minutes.html#action01]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-11-05 16:30:33 $