CSV on the Web Working Group Teleconference

24 Sep 2014


See also: IRC log


Dan Brickley (danbri), Ivan Herman (Ivan), Eric Stephan (Eric), Jeni Tennison (JeniT), Andy Seaborne (AndyS), Bill Ingram (bill_ingram)


<trackbot> Date: 24 September 2014

<JeniT> Agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-09-24

jenit: 2 items on agenda

informal discussion as not quorate

1st is templating; 2nd is use of schema.org, dublin core or other metadata vocab, and building them into recognised terms within the metadata vocab


jeni: following on last week's discussion

<JeniT> http://lists.w3.org/Archives/Public/public-csv-wg/2014Sep/0072.html

if you count some weeks twice

jeni: from our discussion last week, we pulled out some of the issues

took a straw poll on options

Jeni: three lessons for me - 1st of all, the probable need to point to multiple templating languages *anyway*
... to have some extensibility there

2nd - andy's suggestion that perhaps creation of templ lang could be begun in a Community Group, or (re)chartered later

3rd - question of who would do the work

fact that we have limited numbers of people in the WG and stepping up to edit specs, which limits what we can take on as a group

thoughts on these?

ivan: In your mail you also referred to ... templating, not clear what we mean by it. Complete vs minimalistic.
... refs to others would be the most complete ones
... in some sense, that also directs the possible ways fwd
... I had impression that having a complete template language, and by complete I mean a lang that can solve all our usecases, ... I think it is absolutely unrealistic. And the no. of people on today's call provides another argument against that.
... when you speak of a templating language, what level exactly do you have in mind?

<JeniT> danbri: some question about whether a templating language might be able to take on basic default mappings

<JeniT> … ie a default template for a boring/mechanistic mapping

ivan: default in sense of conceptual mapping

(does it need to be expressible in a templ lang, or built-in)

ivan: should be minimal level that a templ lang should have.
... q is whether we have capacity to do even that minimal one. That's the real issue.

Jeni: I had a bit of a tinker last weekend, looking into what an XML version of the data from uc-4 might be, and how that shows up w.r.t. mapping rules.
... in doing that exercise, it highlights some of the issues. For example, looking in particular at URL generation stuff that Andy was talking about last week. For the example I tried, it was clear that you need some level of regex processing. Replacement, lowercasing, basic string manipulation, to create URLs.
... which is a basic thing that we do need to support.

(Asks AndyS re url gen)

AndyS: I haven't done any concrete experimentation, looking at what others here have been doing to create URLs.

But agree, broadly. ... People do seem to care quite a lot about presentation of URLs, ...

so point about regexs, cleanup etc to avoid nasty-looking urls, uc/lcase etc.

scribe: going from DB IDs that might have - or : etc in them, etc.
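
As a rough illustration of the kind of cleanup being described (lowercasing, and replacing characters such as ':' or spaces that make for nasty-looking URLs), here is a minimal Python sketch. The function name `clean_segment` and the sample identifiers are assumptions made for illustration; nothing like this was proposed on the call.

```python
import re


def clean_segment(raw_id: str) -> str:
    """Turn a database-style identifier into a tidy URL path segment.

    Hypothetical helper: the exact cleanup rules are an assumption.
    """
    lowered = raw_id.strip().lower()
    # Collapse any run of characters outside a-z and 0-9 into one hyphen.
    hyphenated = re.sub(r"[^a-z0-9]+", "-", lowered)
    # Drop hyphens left over at either end.
    return hyphenated.strip("-")


if __name__ == "__main__":
    print(clean_segment("HR:Dept 42"))
```

Whether cleanup of this sort belongs in a template language, in the metadata, or in a step before templating is exactly the open question in this discussion.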

ivan: we discussed this a little a few weeks ago

<ivan> https://github.com/w3c/csvw/tree/filtered-templates/experiments/simple-templates-jquery

I went ahead and played with what a minimal template should be [link]

this essentially takes Andy's filtering mechanism discussion, ...

scribe: this is something still relatively simple

ivan: I'm sure there are corner cases, but that stuff covers a lot of what you're saying

<AndyS> UC-4 -- http://www.w3.org/TR/csvw-ucr/#UC-OrganogramData

ivan: note that there are no conditionals, variables etc.
... that is there as a kind of a proposal but that's why I come back to the same issue. There in a rough level, needs packing, taking care of, ... all the escaping mechanisms, ... put into REC form etc.

even with that simple level like this one, there's a lot of work to be done

<Zakim> danbri, you wanted to note that normal Web sites have URL requirements too; any evidence from Mustache etc world?

scribe: I can definitely NOT take the responsibility to turn (something like) this into a fullblown REC all by myself - just not on the books

danbri: suspect there's a hunger out there in mustache etc world for url cleaning functions

ivan: having looked at mustache, ... we can't consider it simple. Complexity level is way beyond what we can do.
... for us to spec something it would need to be smaller

(I'd be happy with extensibility mechanism and a few conventions on top of non-w3c mustache, for now)

andys: for mustache, you build a map of values prior to templating

so that's where the mustache pipeline would put the work

<Zakim> JeniT, you wanted to make the point that templating URLs is a smaller problem than templating full mapping to other formats

<JeniT> https://tools.ietf.org/html/rfc6570

Jeni: agree with that. If you look at the URI template RFC, ... then it doesn't have any of that manip stuff at all, ... assumes those variables are set prior
... generation of urls is in some ways a diff problem, subset of, from generation of the entire data in a different format e.g. vcard

so i think we're likely to run into wanting to template URLs even if we decide not to go the full hog on the templating language
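
To make the "variables are set prior" point concrete, here is a small Python sketch of simple `{name}` expansion in the style of RFC 6570 Level 1. It covers only that simplest form, not the operators in the higher levels of the RFC, and the function name `expand_simple` and the example.org URLs are assumptions. Any lowercasing or other manipulation has to happen before the variables are passed in.

```python
import re
from urllib.parse import quote


def expand_simple(template: str, variables: dict) -> str:
    """Expand {name} expressions only (RFC 6570 Level 1 style)."""

    def substitute(match):
        value = variables.get(match.group(1))
        # Undefined variables expand to the empty string.
        if value is None:
            return ""
        # Percent-encode everything except unreserved characters.
        return quote(str(value), safe="")

    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


if __name__ == "__main__":
    # The caller prepares the value (here: lowercases it) beforehand.
    variables = {"acronym": "CO".lower()}
    print(expand_simple("http://example.org/id/department/{acronym}", variables))
```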

ivan: If I have a template, ...

[rummages for template]

<ivan> https://github.com/w3c/csvw/blob/filtered-templates/experiments/simple-templates-jquery/tree-ops/tree-ops-turtle.tmpl

scribe: I took your example and I turn it into a template in the version I have

which does include things about URIs

some of these things are part of the template

ivan: you really have to have templates in one specific case, for one specific value... other part can be put textually into template
... if you have a function (in my lang I call it a filter) which has simple ops, eg. encoding for %20 etc., I think on the templating level this is all you need

you don't build the whole thing from pure templating

ivan: so you'd need a number of simple filters, e.g. the various filters defining SPARQL
... you have to take part of those, they're relatively well spec'd
... which gives us more or less what you need.

<JeniT> https://github.com/w3c/csvw/blob/gh-pages/examples/tests/scenarios/uc-4/attempts/attempt-3/gov.uk/schema.json

jeni: kind of thing that I was working on, to demonstrate diff approach :---/^
... a metadata spec where I have a thing for creating some XML that looks like:

<JeniT> https://github.com/w3c/csvw/blob/gh-pages/examples/tests/scenarios/uc-4/attempts/attempt-3/output/output.xml

<JeniT> "rowURL": "http://reference.data.gov.uk/id/department/{lowercase(acronym)}",

ivan: apart from syntax, it's quite similar

jeni: it's different in that there's no template file that does whole thing

ivan: yes, it's all in the metadata here

<ivan> https://github.com/w3c/csvw/blob/filtered-templates/experiments/simple-templates-jquery/tree-ops/tree-ops-turtle.tmpl

<ivan> @id.lower

ivan: I think we're talking pretty much about same things

jeni: I don't!
... my point is that if mapping is done from metadata file, we might want basic URI templates like this, without full templating
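
A minimal Python sketch of how a metadata-level URI template with a function call, such as the `rowURL` value quoted earlier, might be expanded per row. The expression syntax handled here (`{column}` and `{filter(column)}`), the `FILTERS` registry and the function name `expand_row_url` are all assumptions for illustration; only `lowercase` appears in these minutes, and no syntax was agreed.

```python
import re
from urllib.parse import quote

# Hypothetical filter registry; "uppercase" is added purely for illustration.
FILTERS = {
    "lowercase": str.lower,
    "uppercase": str.upper,
}

# Matches either {filter(column)} or a plain {column}.
EXPR = re.compile(r"\{(?:(\w+)\((\w+)\)|(\w+))\}")


def expand_row_url(template: str, row: dict) -> str:
    """Fill a URI template from one CSV row, applying filters if present."""

    def substitute(match):
        func, arg, plain = match.groups()
        column = arg if func else plain
        value = str(row[column])
        if func:
            value = FILTERS[func](value)
        # Percent-encode the result so it is safe inside a URL path.
        return quote(value, safe="")

    return EXPR.sub(substitute, template)


if __name__ == "__main__":
    template = "http://reference.data.gov.uk/id/department/{lowercase(acronym)}"
    print(expand_row_url(template, {"acronym": "CO"}))
```

The sketch needs no separate template file: the template string sits in the metadata and is applied once per row.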

ivan: if we're having a default mapping, then everything else is outside of the mapping

<AndyS> +1 to URI templates in metadata.

ivan: pushing things into the metadata file is opening the floodgates

AndyS: [muted?]
... Agreeing with point about URI templates in metadata
... A speculative idea. That the WG just looks at URI templating for a while. Puts aside shaping the data. Just to address a smaller problem.

That will also get us all used to the capabilities of templating in general.

scribe: then come back to wider solutions

AndyS: Just floating the idea, not advocating for it.

Jeni: I'd be v happy moving fwd in that way

ivan: still not sure. then we have to have a separate def, regardless of the uri templating, of what the default mapping is
... and we don't have that
... we have to define the default mapping in general terms

and then we try to fold into this some kind of uri mapping

my own idea, if we had a v simple language for templating, then it's something i can describe at least conceptually even if an impl doesn't do it that way

<Zakim> JeniT, you wanted to say that we need a default mapping anyway, not least to determine the basic requirements from the templating language

jeni: my opinion, that we need to define those default mappings anyway

not least to gain an understanding anyway

so it seems to me that this is work that we need to do, that we're chartered to do, ...

need to do it anyway

need some kind of def of what happens if you don't have a template

ivan: agree it has to be done. Would go one step beyond, ... let's do a 1st draft of the 3 default mappings without the URI story.

then we can look at it again

<Zakim> danbri, you wanted to ask AndyS et al if there's impl feedback from http://www.w3.org/TR/rdf-sparql-json-res/ that's relevant here

AndyS: ... it also has the goal of being able to transfer rdf terms
... so doesn't have the json-friendliness of json-ld

speculating, it is the most common sparql results format

typically fed into rdf-sparql-json

faster parsing can matter for large resultsets

layering in xml, everyone finds consistently impacts clientside, across languages

scribe: some people use csv too

jenit: concrete suggestions: initially use default mapping extremely basic for target formats

i think we have some of that in wiki form in various places

scribe: matter of pulling that together
... look at then URL templating, and fuller language templating requirements

<ivan> https://github.com/w3c/csvw/blob/rdfconversion-ivan/csv2rdf/index.html

ivan: a long time ago I did this --//^^
... back in may. It was an attempt to document a default transformation to RDF as a model
... could be out-dated since metadata changed since
... available if anyone wants to pick it up.
... It was such that a version of this adapted to XML or JSON would be relatively easy, except that you can get into long existential debates on attrib vs element in XML
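
For readers unfamiliar with what a "boring/mechanistic" default mapping could look like, here is a deliberately naive Python sketch: one subject per row, one predicate per column, every cell a plain literal. This is NOT the mapping in Ivan's draft, which these minutes do not describe; the URI patterns (`#row=N`, `#column`), the function name `default_triples` and the sample data are all assumptions. It also assumes a rectangular CSV whose header names are safe to use inside a URI.

```python
import csv
import io


def default_triples(csv_text: str, base: str) -> list:
    """Emit one N-Triples-style line per non-empty cell (hypothetical scheme)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{base}#row={row_number}>"
        for column, cell in row.items():
            # Skip empty or missing cells rather than emit empty literals.
            if not cell:
                continue
            predicate = f"<{base}#{column}>"
            # Escape backslashes and quotes for the literal.
            literal = cell.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


if __name__ == "__main__":
    sample = "name,grade\nAlice,7\nBob,\n"
    for line in default_triples(sample, "http://example.org/staff.csv"):
        print(line)
```

Even a toy like this has to make choices about subject URIs, and that is where the URI templating question comes back in.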

Jeni: Any objections to next step being v strawman version going into, and coming out of, the f2f?


ivan: yes, good 1st step

<ericstephan> +1

jeni: 2nd q: is there anybody who feels able to take on strawman for RDF

ivan: [?]
... the one I did is the rdf default mapping

jeni: ... volunteering to take that fwd?
... we can bring this to the f2f, or give it a look over to update for latest metadata etc specs

ivan: i can do that

<JeniT> ACTION: ivan to bring RDF mapping up to date with other specs, for discussion at F2F [recorded in http://www.w3.org/2014/09/24-csvw-minutes.html#action01]

<trackbot> Created ACTION-30 - Bring rdf mapping up to date with other specs, for discussion at f2f [on Ivan Herman - due 2014-10-01].

jeni: any volunteers for a v rough json strawman?

ivan: the doc I wrote was fairly general, so that it could be easily adapted to json and xml
... having somebody doing it in parallel might lead to duplication
... so probably should be done once there's an rdf strawman
... and then look at it that way

jeni: OK, can you do something this week, to give others a chance to build on it pre-f2f?
... re spec, in gh-pages you get github.io pages

<Zakim> danbri, you wanted to note that TPAC hotels are filling up, apparently

Metadata vocabs: DC, Schema.org etc.

jeni: metadata vocabs - which to use?

[/me notes that JSON-LD's capabilities around @context are quite confusing, re mappings, defaults etc]

jenit: DC, schema.org which builds on DCAT/Void etc
... point being that we need a standard way within metadata to say - this is the 'license' for this dataset

so that tools can understand

and process based on metadata

jeni: I'm most interested in license, as that's key for open data,
... any suggestions?

"should we just adopt schema.org within the metadata?"

eric: I've been involved in data on the Web best practices
... seems that schema.org is at least referenced
... wondering if there is a particular approach that the W3C Data Activity advocates
... this WG is under that
... any cohesion on these things across Data Activity WGs?

ivan: I'm not following that WG directly, but ... if that WG is working on what metadata should be added to diff things on the Web, then yes we should have some coordination/centralization
... which is indep of whether we use schema.org
... expect that WG would say that license is something that's needed
... and we shouldn't do something different.
... obvious case for liaison is during TPAC. I see that Phil is also on our WG meeting, so it should be possible.
... so that's an answer to Eric here.

jeni: It would be useful to talk to Phil pre-TPAC re direction
... would i be right to say they're probably not citing it normatively
... ie. more like a recommendation rather than bringing it in as a spec

ivan: so this comes to the other issue that I did raise on the mailing list, and needs realising
... danbri don't take it personal
... having a normative ref to schema.org will raise all kinds of discussions and issues that are way more complicated than we think
... not sure that this is something that we want to go down
... big discussions on public-vocabs

<Zakim> JeniT, you wanted to suggest that we adopt the schema.org terms but reference the normative terms on which those terms are based

jeni: I had a suggestion - squaring that circle - which would be to adopt exactly same set of terms as in current schema.org but have their formal definitions from other sources

e.g. license could point to dc:license term definition

scribe: even if named the same
... so the mapping would be an implicit one

jeni: spec would say that the metadata terms you could use are catalogues, spatial, etc etc ... i.e. the properties that are defined on schema.org Dataset

<JeniT> http://schema.org/Dataset

but instead of defining them by ref to schema.org dataset, it would define them by ref to the orig vocabulary they arose from, when there is one

so e.g. spatial and temporal are dublin core terms

we could say they mean what they refer to

in dc

ivan: so what would be the purpose to use schema.org here, if we refer to DC?

jeni: It would mean we'd be compliant with, consistent with, schema.org Dataset, ... but normatively ref other vocab.

ivan: but if json-ld has a context, it would still refer normatively to the URIs in the schema.org vocabulary.

jeni: that's a good question - I don't know whether the json-ld context allows you to point to two things, or to say that they're equivalent.

AndyS ... can barely hear you

<JeniT> ergh, wifi problems

AndyS: W3C getting itself into knots about using something else that's on the Web,
... a shame that it ends up on this WG's plate

Ivan: referring to DC is ok

[ btw http://www.w3.org/2013/09/normative-references ]

scribe: referring to anything outside W3C requires stability around the technology. That there is a clear IPR situation around those terms. That there is a clear process on how those things evolve in time, now and in the future.

org is the issue i think - see http://www.w3.org/2013/09/normative-references#orgs

ivan: evolution of discussion around w3c may lead to clarifications

today these issues are still subject of lots of discussion

and that's why we'll run into those issues as well

the way schema.org operates is for many a question

witness the discussion going on right now

see lists.w3.org/Archives/Public/public-vocabs/

scribe: that's the unfortunate fact.
... we saw ... other WGs tried, successfully or not, to refer normatively to WHATWG docs
... all discussions that came up there

WHATWG generated much more discussion than schema.org ever would, hopefully

If we all decided that this is what we want, we can try going there, i'm just saying that this is a rocky start

<Zakim> danbri, you wanted to say that schema.org could add more equivalentProperty / Class mappings to its master file to support that

danbri: [says that.]

normative refs spec, please read :) http://www.w3.org/2013/09/normative-references#orgs

jeni: I think we'll need to return to this on the mailing list, [or] have it as a discussion item at the f2f.
... esp informed by what the other best practice recommendations are
... we'll consult with phil, and try to reach a decision based on that.

jenit: AOB?

reminder: Please register for TPAC, come to F2F [and book your hotel]. We'll get a phone for dial-in.

[or bring a tent]

ivan: see website for nearby hotels. don't leave til last minute!

<ericstephan> I have an RV :-)

Summary of Action Items

[NEW] ACTION: ivan to bring RDF mapping up to date with other specs, for discussion at F2F [recorded in http://www.w3.org/2014/09/24-csvw-minutes.html#action01]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.138 (CVS log)
$Date: 2014-09-24 13:27:05 $