W3C

CSV on the Web Working Group Teleconference

07 Jan 2015

Agenda

See also: IRC log

Attendees

Present
Ivan Herman (Ivan), Gregg Kellogg (gkellogg), Bill Ingram (bill), Jeni Tennison (JeniT)
Regrets
Chair
Jeni
Scribe
JeniT, gkellogg

<trackbot> Date: 07 January 2015

<JeniT> sorry, just trying to dial

<bill> JeniT: lots of issues needing discussion; what should we focus on?

<bill> ivan: would like to see an example where metadata comes from different places

<bill> JeniT: examples don't [currently] represent that situation

<bill> …propose discussing examples then import/metadata issues

<bill> ivan: prefer to look at merge and import stuff first

<JeniT> http://w3c.github.io/csvw/syntax/#examples

<bill> gkellogg: trying to understand the merge order

<bill> …how to use data from meta file, e.g., skip columns, before ever seeing the csv itself

<JeniT> https://github.com/w3c/csvw/issues/145 is ivan’s point

<JeniT> ie how to merge metadata

<bill> … let's do combination first

<JeniT> https://github.com/w3c/csvw/issues/145#issuecomment-68766764

<bill> JeniT: point is to separate when we're talking about merged metadata document used to annotate tabular data model

<bill> …issues raised around how the metadata file is created from multiple metadata files — using imports

<bill> …why did we think we needed imports in the first place?

<JeniT> http://www.w3.org/2014/10/27-csvw-minutes.html#item09

<JeniT> “RESOLUTION: We use an ‘import’ property in the first metadata document found through the precedence hierarchy described in section 3 (but with inclusion of user-defined metadata); the merge is a depth first recursive inclusion”

<bill> …search for this text ^
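
As an illustration of the resolution quoted above, here is a minimal sketch of a depth-first recursive inclusion. It is not taken from the specification: the `import` key holding a list of URLs, the in-memory `DOCS` table standing in for HTTP GETs, and the choice to list the importing document before the documents it imports are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical metadata documents, keyed by URL. In a real processor each
# lookup would be an HTTP GET; a dict keeps the sketch self-contained.
DOCS: Dict[str, dict] = {
    "a.json": {"import": ["b.json", "c.json"], "title": "A"},
    "b.json": {"import": ["d.json"], "title": "B"},
    "c.json": {"title": "C"},
    "d.json": {"title": "D"},
}


def inclusion_order(url: str, docs: Dict[str, dict]) -> List[str]:
    """Return document URLs in depth-first inclusion order.

    Assumption: the importing document precedes its imports, and each
    import is fully expanded before the next sibling is visited.
    """
    order = [url]
    for imported in docs[url].get("import", []):
        order.extend(inclusion_order(imported, docs))
    return order
```

With the hypothetical documents above, `inclusion_order("a.json", DOCS)` visits `b.json` and everything it imports before moving on to `c.json`. The sketch has no guard against import cycles.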

<bill> ivan: I remember JeniT's comment that you have to do a bunch of GETs in any case

<bill> gkellogg: you don't want to repeat yourself; import mechanism solves that

<bill> …but seems relatively advanced to start with

<bill> …first need to figure out how to deal with multiple sources of metadata

<bill> ivan: we had a slightly different situation: the unnecessary GETs were just one of the issues

<bill> …also have the possibility to talk about several files, i.e., directory-level metadata

<bill> JeniT: still have to repeat the common stuff for each table

<bill> ivan: much more complicated in (e.g.) javascript because of async nature

<bill> …question is whether it is really worth it

<bill> gkellogg: complication is when promises are chained together

<bill> …if you encounter an import in there

<bill> ivan: right, you end up with recursive calls to promises — it's ugly

<bill> JeniT: let's move away from implementation in favor of focus on use cases

<bill> …do we need to merge metadata files at all — is that useful

<bill> ivan: wondering whether we can have a structure within the metadata file instead of relying on import

<bill> …some sort of a global structure that is conceptually copied into each

<bill> gkellogg: certainly the ability to have common table-level metadata is complicated by the fact that @id is required by table

<bill> …one way around that is to require @id *after* all processing is finished

<bill> JeniT: in that case schema is used in a slightly different way

<bill> gkellogg: schema is property of table group

<bill> JeniT: sounds like suggesting that some kind of table group property contains all these *global* properties

<bill> JeniT: propose we scrap imports in lieu of table group metadata

<bill> ivan: looking back at the structure from before the f2f

<bill> …you have a few ways to get the metadata; also can do more than one, in which case you'd have to merge

<JeniT> it says “Processors must attempt to locate a metadata document based on each of these locations in order, and use first metadata document that is successfully located in this way.”

<bill> gkellogg: suggest you take the first one and then stop

<bill> …merge issues still exist in cases of user-supplied metadata, etc.
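
The "take the first one and then stop" behaviour in the quoted text can be sketched as follows. This is only an illustration under assumptions: the function name `first_located`, the `fetch` callback that returns `None` when nothing is found, and the example location names are all hypothetical and not taken from the drafts.

```python
from typing import Callable, Iterable, Optional


def first_located(
    locations: Iterable[str],
    fetch: Callable[[str], Optional[dict]],
) -> Optional[dict]:
    """Try each location in order; return the first document found.

    Later locations are never fetched once one has succeeded.
    """
    for location in locations:
        document = fetch(location)
        if document is not None:
            return document
    return None


def make_fake_fetch(available: dict, tried: list) -> Callable[[str], Optional[dict]]:
    """Build a stand-in for an HTTP GET that records what was tried."""
    def fake_fetch(location: str) -> Optional[dict]:
        tried.append(location)
        return available.get(location)
    return fake_fetch
```

Recording the attempted locations in `tried` makes it easy to confirm that the search stops at the first hit. User-supplied metadata is outside this sketch, which is where the merge question remains.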

<JeniT> http://w3c.github.io/csvw/syntax/#using-overriding-metadata

<bill> JeniT: example 2 (?) handles user-supplied metadata ^

<bill> gkellogg: whatever user metadata is provided, process is to coerce it into consistent metadata

<bill> …consistent == if title is in one it must be the same as in the other

<bill> ivan: several titles are put into an array

<bill> …the beauty and complication of import mechanism is that it defines a semantic merge

<bill> …we have to know which has higher precedence

<bill> …my personal view is that we take everything and step by step merge them using import algorithm

<bill> gkellogg: would like to defer that until we've addressed merging in general

<bill> ivan: two things: 1.) import property, necessary or not 2.) import as an algorithm for merging

<bill> JeniT: to move forward, propose we take the language from the syntax document about merging, and from the metadata document about where to find metadata files and how to merge them

<bill> …we have an example there for dealing with that

<bill> …regarding the import statement itself, more discussion is needed

<bill> …leave it in for now

<bill> ivan: can we formally propose it and discuss it over e-mail and see where it goes

<bill> gkellogg: don't want to belabor but once we have the merge order rules, the import directive will be better understood

<bill> …proposed in my description a way in which it might work

<bill> …if we just think about title, and have several sources of title information

<bill> ivan: current merge algorithm handles this already

<bill> JeniT: name and title are handled very differently

<bill> gkellogg: i think name is set from title unless it is otherwise stated

<bill> ivan: titles can pile up in an array, but name must be the same
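
The rule ivan states here (titles accumulate, names must agree) could look roughly like the sketch below. It is a guess at the behaviour being discussed, not the merge algorithm from the metadata document: the function name `merge_column`, the de-duplication of titles, and the decision to accept a name that is present on only one side are assumptions.

```python
from typing import List


def merge_column(a: dict, b: dict) -> dict:
    """Merge two column descriptions: titles pile up, names must match."""
    name_a, name_b = a.get("name"), b.get("name")
    if name_a is not None and name_b is not None and name_a != name_b:
        raise ValueError(f"conflicting names: {name_a!r} vs {name_b!r}")

    merged: dict = {}
    # Assumption: a name given on only one side is carried over as-is.
    name = name_a if name_a is not None else name_b
    if name is not None:
        merged["name"] = name

    titles: List[str] = []
    for column in (a, b):
        value = column.get("title", [])
        values = value if isinstance(value, list) else [value]
        for title in values:
            if title not in titles:  # assumption: drop exact duplicates
                titles.append(title)
    if titles:
        merged["title"] = titles
    return merged
```

For example, merging `{"name": "pop", "title": "Population"}` with `{"name": "pop", "title": "Pop."}` keeps the single name and collects both titles in order, while two different names raise an error.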

<bill> JeniT: name is a required property on column desc

<bill> ivan: i need to have a name, else I cannot convert to RDF

<bill> …according to the current algorithm, titles may be different, but if (at the first step) i don't have the name, i don't have a column desc for the final metadata

<bill> …need to start somewhere

<bill> gkellogg: postpone validation until after merge has taken place

<bill> …what's the difference between an annotation and a property?

<bill> …title will always exist, name may not, so extract name from title

<bill> …allows space for other metadata to specify name
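
gkellogg's suggestion, that a name is derived from the title only when no metadata has supplied one, might be sketched like this. The minutes do not say how a title would be turned into a name, so the normalisation in `name_from_title` (runs of non-word characters replaced by underscores) is purely an assumption for illustration, as are both function names.

```python
import re


def name_from_title(title: str) -> str:
    """Hypothetical normalisation: non-word runs become underscores."""
    return re.sub(r"\W+", "_", title.strip()).strip("_")


def ensure_name(column: dict) -> dict:
    """Return a copy of the column with a name, derived from title if absent.

    Intended to run after merging, so that any metadata source has had
    the chance to specify the name explicitly first.
    """
    if "name" in column:
        return dict(column)
    titles = column["title"]
    first = titles[0] if isinstance(titles, list) else titles
    return {**column, "name": name_from_title(first)}
```

Running this step only after the merge matches the suggestion to postpone validation until merging has taken place.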

<bill> ivan: what i claim is that this does not work with the current merging algorithm

<bill> …if i have two metadata files with two arrays of column descriptions, if i can find common

<bill> …values, I can merge

<bill> …otherwise there is no way to find the merge

<bill> JeniT: reason is that current algorithm as described assumes every column will have a name

<bill> …since the creation of the metadata documents, we've introduced these issues

<bill> gkellogg: sounds like JeniT has a concept, which we'll need to work out

<bill> JeniT: I do have a concept for how this works

<bill> …specified either by the implementation or the specification, it's only when you get to the metadata documents where that merge might happen

<bill> …in other words, there's work to do in terms of describing that in terms of metadata documents

<bill> gkellogg: it might just be useful to walk through an example to see what the process would be

<JeniT> http://w3c.github.io/csvw/syntax/#annotated-tabular-data-model

<bill> JeniT: will try to make examples for further discussion

<JeniT> ScribeNick: JeniT

<gkellogg> ivan: the current schema is complicated as two merged schemas may have a different structure.

<bill> :) bye

<scribe> ScribeNick: gkellogg

… If we make use of the order and number of column descriptions, then it’s made easier by knowing they have the same number of columns.
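
The idea of aligning column descriptions by order and number, rather than by matching names, can be sketched as below. This is an illustration only: the function name `merge_by_position` and the rule that the first list wins on conflicting properties are assumptions, not something agreed on the call.

```python
from typing import List


def merge_by_position(cols_a: List[dict], cols_b: List[dict]) -> List[dict]:
    """Pair column descriptions by index and merge each pair.

    Requires both schemas to describe the same number of columns.
    Assumption: properties from cols_a take precedence over cols_b.
    """
    if len(cols_a) != len(cols_b):
        raise ValueError(
            f"column count mismatch: {len(cols_a)} vs {len(cols_b)}"
        )
    return [{**b, **a} for a, b in zip(cols_a, cols_b)]
```

Because pairing is positional, a column that has only a title in one source and only a name in the other can still be merged, which is the case that name-based matching cannot handle.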

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2015/01/07 16:04:46 $