See also: IRC log
<trackbot> Date: 03 September 2014
AndyS, would you like to talk us through http://jena.staging.apache.org/documentation/csv/ and any lessons learned / plans?
<AndyS> Is that better?
<AndyS> It's not that noisy here - the mic picks up what I can't hear!
:)
supersenses
<scribe> Agenda: https://www.w3.org/2013/csvw/wiki/Meeting_Agenda_2014-09-03
<AndyS> ... so remember, I didn't say it, and it wasn't me
<AndyS> Cheltenham is not far from here.
Ivan's template doc. https://www.w3.org/2013/csvw/wiki/CSVTemplating_status
ivan: Uni. Illinois, introduces new member
<AndyS> Hi waingram (Bill Ingram)
Bill Ingram, U Illinois. Repository developer, manage a team of devs on institutional repo, archives, ...
experience w/ a lot of CSV via research datasets, things attached to electronic theses, dissertations, RDF a lot, XML etc.
danbri: anything missing / interesting in use cases doc?
bill: did look, it's quite extensive; happy to contribute, but not sure if it's necessary, is probably covered
ivan: please look if the various features that would be in another use case are already covered. If already addressed, we're probably ok
bill: will keep in mind. i saw something around science data that seemed close, will look.
dan: can we take a start w/ https://www.w3.org/2013/csvw/wiki/CSVTemplating_status
ivan: looked at state of things yesterday
... latest status I could find
in addition, from JeniT: https://github.com/w3c/csvw/tree/testing-variations/examples/tests/scenarios/uc-4
ivan: summarizing, ... we are heading towards a structure that's inspired by template systems like Mustache, used as an example here, ...
with hope that we could have one system/structure defined that can be used essentially in an unchanged manner for the various output syntaxes that we have
i.e. RDF's various serializations + JSON(-LD), XML.
that's the hope. all the examples here are in Turtle. Could've used JSON but Turtle was in email.
this is one thing.
ivan: the very simple, basic approach that may cover several use cases, is to have a simple template like Mustache
where the template patterns are keys that identify the names of columns, and the template itself
in jeremy's example, is a file that can be referred to from the metadata file
in some cases could be inline even in the metadata
won't copy in from the page
that's probably where things are very simple and quite useful
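
A minimal sketch of the simple case described above, where template keys name CSV columns. This is a hand-rolled substitution written for illustration, not the Mustache library and not the syntax from the wiki page; the CSV data, the template and the `ex:` vocabulary are all hypothetical.

```python
import csv
import io
import re

# Hypothetical CSV data and template; neither is taken from the wiki page.
CSV_TEXT = "name,population\nSouthton,123000\nNorthville,654000\n"
TEMPLATE = "<http://example.org/town/{{name}}> ex:population {{population}} ."

# Matches {{key}}, allowing optional spaces around the key.
PLACEHOLDER = re.compile(r"\{\{\s*([^{}#/]+?)\s*\}\}")


def render_row(template, row):
    """Replace each {{column}} with the row's cell value.

    Keys that are not column names are left untouched (an assumption;
    the minutes leave the handling of unmatched keys open).
    """
    def substitute(match):
        return row.get(match.group(1), match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def render_csv(template, csv_text):
    """Apply the template once per data row; the header row supplies the keys."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [render_row(template, row) for row in reader]


if __name__ == "__main__":
    for line in render_csv(TEMPLATE, CSV_TEXT):
        print(line)
```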
where we got into complications: the world is not that simple,
we need some sort of "variable" structure, ...
... if exists, can be used for templating
e.g. each col you can have a number of variables, defined by a regex, named
what it means is that if i'm working on a specific cell
in a col, then i check the regex, if it matches, then the corresponding
variable is considered to be true/replaceable
... in the template itself i could then use the cell value for an output
(that's the 2nd example)
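
A rough sketch of the per-column variable idea described above: each column carries named regexes, and a variable counts as set for a cell when its regex matches. The column name, variable names and patterns are hypothetical, and requiring the regex to match the whole cell is an assumption, not something settled in the discussion.

```python
import re

# Hypothetical per-column definitions: column name -> {variable name: regex}.
COLUMN_VARIABLES = {
    "date": {
        "iso_date": r"\d{4}-\d{2}-\d{2}",
        "year_only": r"\d{4}",
    },
}


def variables_for_cell(column, cell):
    """Return the variables that are set for this cell, bound to the cell value.

    Assumption: a variable is set only if its regex matches the entire cell.
    """
    bound = {}
    for name, pattern in COLUMN_VARIABLES.get(column, {}).items():
        if re.fullmatch(pattern, cell):
            bound[name] = cell
    return bound


if __name__ == "__main__":
    print(variables_for_cell("date", "2014-09-03"))  # {'iso_date': '2014-09-03'}
    print(variables_for_cell("date", "2014"))        # {'year_only': '2014'}
    print(variables_for_cell("date", "n/a"))         # {}
```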
but likely we'll want conditionals of some sort
if-then-else
so you'll need a way to use the variables as a kind of branching mechanism
there were 2 approaches to that
jeremy had a structure that he put into the metadata, ... which essentially said that depending on a variable being true, ... acceptable, ...
then he branched into separate template files, that's how if/then/else was created
once you have the template it becomes very mechanical to generate the output
some risk of combinatorial explosion of template parts / components
so i went back and took a look at mustache's mechanism
trying to use here a template that uses # if ... and a variable name
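
A small sketch of the in-template branching idea, where a block is kept only if a named variable is set. The `{{#if name}} ... {{/if}}` spelling is assumed here for illustration; it is neither confirmed Mustache syntax nor the wiki page's exact form, and nesting and else-branches are not handled.

```python
import re

# Assumed block syntax: {{#if name}} body {{/if}} (no nesting in this sketch).
IF_BLOCK = re.compile(r"\{\{#if\s+(\w+)\s*\}\}(.*?)\{\{/if\}\}", re.DOTALL)
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, variables):
    """Keep an if-block's body only when its variable is set, then fill placeholders."""
    def keep_or_drop(match):
        name, body = match.group(1), match.group(2)
        return body if name in variables else ""

    without_blocks = IF_BLOCK.sub(keep_or_drop, template)
    # Placeholders with no matching variable are left as they are.
    return PLACEHOLDER.sub(
        lambda m: variables.get(m.group(1), m.group(0)), without_blocks
    )


if __name__ == "__main__":
    template = (
        "ex:event ex:when "
        '{{#if iso_date}}"{{iso_date}}"^^xsd:date{{/if}}'
        '{{#if year_only}}"{{year_only}}"^^xsd:gYear{{/if}} .'
    )
    print(render(template, {"iso_date": "2014-09-03"}))
    # ex:event ex:when "2014-09-03"^^xsd:date .
```

Keeping both branches in one template avoids one file per combination of variables, at the cost of a denser template.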
ivan: this was more or less where we got to in the discussion
some things weren't settled
i tried to list the points of disagreement, issues etc.
unclear what to do with unmatched templates
simplest is that nothing happens.
[...]
ivan: ... must be a place where i can put
global templates; things that appear only once. Typical case is that if
I want to generate a prefix statement in Turtle. Templates should have
global values taken only once.
... i've put there as a separate issue, ... a repeat structure, ...
anything outside that can either not be a template or ref to global
metadata keys
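
A sketch of the global-plus-repeat idea: text outside the repeat structure is emitted once and filled from global metadata keys, while the repeat structure is expanded once per row. The `{{#rows}} ... {{/rows}}` marker and the metadata key `base` are assumptions made for this illustration only.

```python
import re

# Assumed marker for the repeat structure; everything outside it is "global".
REPEAT = re.compile(r"\{\{#rows\}\}(.*?)\{\{/rows\}\}", re.DOTALL)
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill(text, values):
    """Fill {{key}} from values; unknown keys are left untouched."""
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), text)


def render_document(template, metadata, rows):
    """Emit global parts once (from metadata) and the repeat part once per row."""
    out, pos = [], 0
    for block in REPEAT.finditer(template):
        out.append(fill(template[pos:block.start()], metadata))  # global, once
        out.extend(fill(block.group(1), row) for row in rows)    # per row
        pos = block.end()
    out.append(fill(template[pos:], metadata))
    return "".join(out)


if __name__ == "__main__":
    template = (
        "@prefix ex: <{{base}}> .\n"
        '{{#rows}}ex:{{id}} ex:name "{{name}}" .\n{{/rows}}'
    )
    metadata = {"base": "http://example.org/"}
    rows = [{"id": "a", "name": "Alpha"}, {"id": "b", "name": "Beta"}]
    print(render_document(template, metadata, rows), end="")
```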
... also Datatypes
... in metadata doc, we have a number of datatypes for a cell
can define them per column, per row, per cell, ...
but those datatypes may not have an equiv in all the output systems
e.g. pure json, this doesn't have a direct match
in rdf or xml they're close due to xml datatypes
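
To illustrate the datatype mismatch: a Turtle literal can carry an XSD datatype IRI directly, whereas plain JSON only has strings, numbers, booleans and null, so something like a date has to fall back to a string. The function names and the small set of datatypes handled here are assumptions for the sketch, not part of the metadata document.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"


def to_turtle_literal(value, datatype):
    """Typed literal: the datatype survives as an XSD IRI."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{XSD}{datatype}>'


def to_json_value(value, datatype):
    """Map to a native JSON type where one exists, else keep the string."""
    if datatype == "integer":
        return int(value)
    if datatype == "double":
        return float(value)
    if datatype == "boolean":
        return value in ("true", "1")  # lexical forms of xsd:boolean true
    return value  # e.g. date, dateTime: no direct JSON equivalent


if __name__ == "__main__":
    print(to_turtle_literal("2014-09-03", "date"))
    print(repr(to_json_value("2014-09-03", "date")))  # stays a string
    print(repr(to_json_value("42", "integer")))       # becomes a number
```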
ivan: also the template syntax itself. needs
some care. e.g. '{{', ... does this work for Turtle? XML? probably. For
JSON? ugh etc.
... can this be parameterised?
... sure there are other issues here. that's where I got yesterday.
<AndyS> {{ is impossible JSON so that's OK?
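
A quick check of that point using Python's standard `json` module: `{{` cannot occur as JSON structure, so it cannot be confused with real JSON syntax, although it can still appear inside a JSON string value. The helper name is hypothetical.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    print(is_valid_json('{{"name": "x"}}'))       # False: '{{' is not JSON structure
    print(is_valid_json('{"name": {{name}}}'))    # False: a bare placeholder is not a value
    print(is_valid_json('{"name": "{{name}}"}'))  # True: '{{' inside a string is fine
```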
Dan asks AndyS for status / perspective
AndyS: Ivan didn't say anything much I
disagree with. Re templates, ...
... they might be developed by different parties, may be use cases where
diff templates make sense
there are technical things that can be done around sharing/embedding/nesting
alternative is one template with lots of conditionals
which is known to be problematic
<AndyS> "Liquid" - a different templating system - not/less HTML content focused - http://docs.shopify.com/themes/liquid-documentation/basics. Used by jekyll sitegenerator (and github.io).
AndyS: I'd also emphasise, that we need to be careful ... re Mustache etc, that if W3C is to produce a templating language within the standards world, ...
then I presume we would need to specify the templating language ourselves, even if it is a close match to external work, to give a standards process and control
scribe: that might be an issue from w3c point of view
AndyS: also reporting, Google Summer of Code student work
we got a great student
(Jena?)
will work on basis for potential w3c work, csv to rdf
mappings
... used a mechanistic mapping for now, as there was no proposal ready
to implement
from JeniT: https://github.com/w3c/csvw/tree/testing-variations/examples/tests/scenarios/uc-4
https://github.com/w3c/csvw/tree/testing-variations/examples/tests/scenarios/uc-4/attempts/attempt-1
danbri: a common repo structure should let us try out different design candidates
ivan: responding to AndyS; re multiple templates on same file, ... jeremy's clever trick, ... ref to the template is in the metadata
[train noises]
[steam train!]
ivan: part of the metadata for a specific
file, the way the model works, ... you can have metadata from diff
sources, you aggregate those to get the final metadata
... if someone wants his or her own, can use links/refs [...]
(handleable)
ivan: AndyS, you're absolutely right, in
that we can't make standard ref to the Mustache project as it may change
... so we need to specify it ourself in full detail
... to be v clear about it and address high-level question: i've written
down what the template language approach would mean
... and what we've realised here after all the discussion, is that we
cannot get away with something super-simple because we hit if-then-else
requirements very early
... AndyS emphasised this very early on.
... I'm not saying personally that the template mechanism, ... that this
is the ideal one, ... nor that this is THE solution that we must follow,
...
<ivan> https://github.com/w3c/csvw/tree/rdfconversion-ivan
Ivan: maybe by exploring this route, we
might need to step back from it. A while back I tried a purely
mechanistic way, ... using CSV + metadata (link above here)
... this essentially writes down the mechanical generation of RDF, same
could be used for others
we could say that this is what we define, where we can attach an XSLT script, RDF sparql transform, etc
that this might be as far as we go
... we have explored templating langs so far, but we need to consider
possibility that we back off, and use some simple structure + exploit
existing tools
... I'd be perfectly happy with that as well
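
A sketch of what a purely mechanistic CSV-to-RDF mapping could look like, with no template language at all. The choices here (one blank node per row, one predicate per column built as base IRI plus `#` plus column name, every cell a plain string literal, empty cells skipped) are assumptions for illustration and are not taken from the rdfconversion-ivan branch or from Jena.

```python
import csv
import io
from urllib.parse import quote


def csv_to_ntriples(csv_text, base):
    """Emit one N-Triples line per non-empty cell."""
    lines = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for number, row in enumerate(reader, start=1):
        subject = f"_:row{number}"  # one blank node per data row
        for column, cell in row.items():
            if column is None or not cell:
                continue  # assumption: ragged or empty cells produce no triple
            predicate = f"<{base}#{quote(column, safe='')}>"
            literal = (
                cell.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
            )
            lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


if __name__ == "__main__":
    data = "Town,Population\nSouthton,123000\n"
    for line in csv_to_ntriples(data, "http://example.org/data.csv"):
        print(line)
```

Output of this kind could then be handed to existing tools (a SPARQL update, an XSLT script over an XML equivalent) for anything beyond the mechanical step.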
AndyS: reiterating a point I've made before. By involving xslt, sparql, js, ... etc. We would disenfranchise all those people who don't use those tool chains.
ivan: i'm not saying you must use XSLT, but that we could/should/might provide a way to indicate XSLT
[...]
ivan: not just expressivity of lang ... but what we can reasonably define and get accepted by the community
AndyS: on that Q, has there been pushback?
danbri: not that i've heard
AndyS: [concerned by indecision]
ivan: ... "this is _a_ candidate road, but
what we need feedback on is not just the lang, but whether we need a
language in the 1st place"
... alternative is that we define a structure and mechanistic mapping
plus metadata for additional other mapping mechanisms
dan: how much work to take strawman through to a Tech Report, framed as a strawman
ivan: wiki page is close enough to First
Public WD territory, with medium amount of work
... but i don't have a lot of time to try
same q to Andy
AndyS: hearing contradictory msgs, ...
... catalog of examples with lots of diff formats
... ppl like jeremy have spent quite some time on an approach that works
for them
<AndyS> Charter?
<AndyS> http://www.w3.org/TR/csvw-ucr/#R-CsvToRdfTransformation -- published so expectation setting?
<AndyS> +JSON, +XML
timecheck
AndyS: [we don't normally have existential crises each week!]
sorry, kicked out of teleconf room