See also: IRC log
see also http://lists.w3.org/Archives/Public/public-csv-wg/2014Nov/0029.html (link added to agenda wiki)
<danbri> http://lists.w3.org/Archives/Public/public-csv-wg/2014Nov/0029.html
<danbri> <- issue list
danbri: trying to be more driven by GitHub issues.
… We’ll postpone message issues until JeniT comes on.
<danbri> i'll look up the others later
jtandy: Ivan is en route to Australia.
… We can talk on teleconf when no progress made on GitHub.
<danbri> https://github.com/w3c/csvw/issues/62 Should the RDF/JSON
<danbri> transformation check the values?
<danbri> - CSV to JSON mapping, CSV to RDF mapping
… first: issue #62
… we thought the mapping issues could assume all parsing has already been done.
<danbri> (see github for detailed text, we are not reminuting it all)
… question is, if we can’t perform a transformation, do we proceed or die?
… My thought is that we can assume parsing has already thrown out bad data.
danbri: I think this touches on what is being defined: conformance criteria or behavior.
… If it’s a class of software, we should just do the easiest thing. Advanced implementations might do better without being non-conformant.
… A perfectly conformant processor could just pass everything through.
<danbri> eg. we might say "conformant processors are not required to …" so we only require pass through
<danbri> gregg: it adds complexity/branching to testing
jtandy: summarizing gkellogg, in order to give us a spec which is testable, we’ll go with the pass-through.
<danbri> dan proposing simple pass through
<danbri> gregg: advanced processing needs to be under control of a flag, so all processors can be guaranteed to give same output from same input
jtandy: I’ll take an action to update GitHub.
<danbri> I suggest a "Resolution (of those present): " since we have low turnout
RESOLUTION (of those present): different processor flags to control parsed or raw output.
… pass through literal values in conformant mode, and allow advanced processors to do some contextual parsing/fixing
<danbri> conformant mode passes through literal values, advanced processors may offer additional contextual checking/fixing (via flags)
<danbri> gregg: additionally, processors may support an advanced processing flag, which will allow us to test that advanced processors produce consistent output (if that's not over-constraining them)
<danbri> jtandy to capture this into github issue
jtandy: I’ll pass that through, and when we come to actual implementations, we’ll check back.
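The pass-through-by-default behaviour resolved above, with advanced processing gated behind a flag, could be sketched like this (function and flag names are illustrative, not from the spec):

```python
def map_cell(raw_value, datatype=None, advanced=False):
    """Map a parsed CSV cell to an output value.

    In conformant (default) mode the raw string is passed through
    verbatim, so every processor is guaranteed to give the same output
    from the same input. The hypothetical `advanced` flag gates any
    extra contextual checking or fixing.
    """
    if not advanced or datatype is None:
        return raw_value  # conformant mode: verbatim pass-through
    if datatype == "integer":
        try:
            return int(raw_value)
        except ValueError:
            return raw_value  # leave bad data for validators to flag
    return raw_value
```

With the flag off, even a typed cell is emitted untouched; the same input only produces richer output when the caller opts in.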
subtopic: issue 61: What should the mapping of an empty cell be for RDF and JSON?
danbri: issue 61 is not flagged for discussion.
<danbri> https://github.com/w3c/csvw/issues/59 How should ``language`` be
<danbri> used in RDF mapping?
<danbri> - CSV to RDF mapping
<danbri> Ivan wrote " • If the content of the cell is not datatyped, and is not a URI, and the language tag's value is not "en", then the generated object should be a language literal with the global language tag set."
jtandy: I’ll come back on that one.
… “How should the locale setting be used in the default mapping?”
… unless the data itself is saying the language of a cell is different, the language of the metadata should be applied to every literal.
<danbri> gregg: let's say default lang was english from default mapping, does the json now tag all its values with English, or is that assumed and untagged?
<danbri> (non-jsonld, normal colloquial non-rdfy json)
<danbri> jtandy: "plain old json"
<danbri> jtandy: suggesting … we transform verbatim and don't add locale info
… I proposed that for Plain Old JSON (“POJ”) we don’t put the default language mapping into the output.
… So, the information in the metadata says it’s German, but that is not reflected in the output.
… we assume people can determine this from the complementary language mapping.
<jtandy> PROPOSAL: for RDF mapping, apply locale / language tag from metadata to all literal values in output
jenit: I think it’s all string values, because the number 2 is a literal string value without language.
<danbri> +1
<bill-ingram> +1
<JeniT> +1
<jtandy> +1
+1
<danbri> revisiting https://github.com/w3c/csvw/labels/Requires%20telcon%20discussion/decision
RESOLUTION: for RDF mapping, apply locale / language tag from metadata to all literal string values in output
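The RDF-mapping resolution could be sketched as a small serialiser: the metadata's language tag is applied to every literal string value, unless the data itself declares a different language for the cell (names here are illustrative, not from the spec):

```python
def rdf_literal(value, metadata_lang=None, cell_lang=None):
    """Serialise a cell value as an N-Triples literal term.

    Per the resolution, the locale/language tag from the metadata is
    applied to all literal string values in the output; a language
    declared by the data itself for a particular cell takes precedence.
    """
    lang = cell_lang or metadata_lang
    escaped = value.replace('\\', '\\\\').replace('"', '\\"')
    if lang:
        return f'"{escaped}"@{lang}'
    return f'"{escaped}"'
```

So with a metadata-level language of `de`, a plain cell becomes `"Hallo"@de`, while a cell tagged `en` in the data keeps its own tag.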
<jtandy> PROPOSAL: for (plain old) JSON mapping, no locale information is added to the JSON output - we assume that people will look at the complementary metadata for locale information
<danbri> gregg: in our discussion we talked about locale coming from mapping info, as opposed to locale info that might come from the data itself
<danbri> e.g. use of a particular col or diff lang. Are you proposing that we drop such from the JSON output also?
<danbri> jtandy: not sure how i'd write locale info in plain json
<danbri> gregg: you could use JSON-LD ...
<danbri> …but simple needs to be simple; people can use json-ld etc if they're more ambitious
<danbri> jtandy: agree
jtandy: if they care about locale, they should use RDF mapping with JSON-LD serialization.
<danbri> jenit: I think the metadata will say what the lang cols are in,
jenit: they could always trace it back to the original data. I think in the JSON mapping, they’ll use a property name to indicate that, or it would otherwise be implicit.
<jtandy> (if you want to say a particular locale for plain old json, you might say "property_en" or "property_fr")
<JeniT> +1
<jtandy> (e.g. the property has a human readable hint)
<jtandy> +1
+1
<danbri> +1
<bill-ingram> +1
<jtandy> RESOLVED: for (plain old) JSON mapping, no locale information is added to the JSON output - we assume that people will look at the complementary metadata for locale information
subtopic: issue 39: What should be generated for a value with datatype in the case of JSON
<danbri> (where JSON is plain old JSON)
jenit: similar to language: how much structure to put in the output.
… I suggested recognizing boolean, numbers and null, and otherwise just map to a string.
<danbri> jenit: (in github), "Given that we're aiming for a simple JSON mapping for simple JSON users, I think the first option above is the right one: map to a simple string, number, boolean (or null) as appropriate for the datatype."
<danbri> proposal: re #39 for simple JSON we map to a simple string, number, boolean (or null) as appropriate for the datatype.
<danbri> +1
<jtandy> +1
<JeniT> +1
+1
<bill-ingram> +1
<danbri> resolved: re #39 for simple JSON we map to a simple string, number, boolean (or null) as appropriate for the datatype.
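The #39 resolution could be sketched as a cell-conversion function: recognized datatypes map to native JSON numbers and booleans, everything else stays a simple string (the datatype names are illustrative; empty-cell/null handling is issue 61 and still open):

```python
def cell_to_json(raw, datatype=None):
    """Map a CSV cell to a plain-JSON value per the #39 resolution:
    a simple string, number, or boolean as appropriate for the
    declared datatype; anything else maps to a plain string."""
    if datatype == "boolean":
        return raw.lower() in ("true", "1")
    if datatype == "integer":
        return int(raw)
    if datatype in ("number", "decimal", "double"):
        return float(raw)
    return raw  # no (or unrecognized) datatype: keep the string
```

For example, a row might then serialise as `{"name": "widget", "count": 3, "in_stock": true}` with no datatype annotations in the output.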
subtopic: issue 30: How to interpret fixed string type values ("Table", "Row",...)
jtandy: I assume they’ll figure this out based on context.
danbri: propose we just endorse the editorial decision.
jtandy: we’re trying to make JSON as “brutally simple” as possible.
<danbri> proposal: #30 aiming for json mapping to be super simple, we endorse the 2nd option as currently implemented by editors
+1
<bill-ingram> +1
<danbri> +1
jenit: I’d suggest the authors consider whether anything can be typed as tables or columns. I’m not sure where the typed table mapping applies.
<JeniT> +1
<jtandy> +1 ... noting that further thought is required about whether things should be declared @type
<danbri> yup
<danbri> https://github.com/w3c/csvw/issues/20 Is row by row processing sufficient?
subtopic: issue #20: Is row by row processing sufficient?
<jtandy> resolved: #30 aiming for json mapping to be super simple, we endorse the 2nd option as currently implemented by editors
danbri: I propose that, while we know many CSV files have interdependent rows, for now we go with row-by-row mapping.
<danbri> ie. we push work onto preprocessors etc
<danbri> … and advanced mappings
jtandy: I think we discussed different alternatives and agreed that the simple mapping is definitely row-by-row, but the templated mapping might want to consider holdover values from previous rows.
jenit: perhaps a flag in the metadata saying take it from the previous row seems relatively simple, but I’m happy to keep it super-simple for now.
jtandy: I’d suggest people just pre-process the CSV to populate those blanks.
<jtandy> proposal: keep things simple - row by row processing only
+1
<bill-ingram> +1
<jtandy> +1
<danbri> +1
<jtandy> resolved: keep things simple - row by row processing only
<JeniT> +1
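jtandy's suggestion to pre-process the CSV to populate blanks, so a strictly row-by-row mapper can still handle data that relied on carry-over values, might look like this sketch (not part of the mapping itself):

```python
import csv
import io

def fill_down(rows):
    """Pre-process rows so each empty cell inherits the value from the
    same column in the previous row. After this step, a row-by-row
    processor needs no cross-row state."""
    prev = []
    out = []
    for row in rows:
        filled = [cell if cell != "" else (prev[i] if i < len(prev) else "")
                  for i, cell in enumerate(row)]
        out.append(filled)
        prev = filled
    return out

data = "region,city\nnorth,Leeds\n,York\n"
rows = fill_down(list(csv.reader(io.StringIO(data))))
```

Here the blank `region` cell on the last row is filled with `north` before mapping begins.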
<danbri> revisiting https://github.com/w3c/csvw/labels/Requires%20telcon%20discussion/decision
jenit: can we take issues in reverse order?
<danbri> https://github.com/w3c/csvw/issues/23 CSV Dialect Description
subtopic: issue 23:
jenit: acting on the F2F discussion about trying to map over the flags we describe in the syntax doc into the dialect description within metadata.
… but, also trying to get consistency from the datapackage, such as header
<JeniT> http://w3c.github.io/csvw/metadata/#dialect-descriptions
<JeniT> PROPOSAL: we can close https://github.com/w3c/csvw/issues/23 as it’s sufficiently addressed by current draft
<danbri> +1
<bill-ingram> +1
+1
<jtandy> +1
<JeniT> +1
<JeniT> RESOLVED: we can close https://github.com/w3c/csvw/issues/23 as it’s sufficiently addressed by current draft
<danbri> Using JSON-LD for the metadata document
<danbri> oh sorry
<danbri> "Pattern string formats for parsing dates/numbers/durations", rather.
jenit: we discussed using pattern strings for parsing dates, numbers, durations, … in CSV files based on some kind of locale.
<JeniT> http://www.unicode.org/reports/tr35/
<danbri> see http://www.unicode.org/reports/tr35/
… The i18n guys pointed us at tr35, which describes the kind of format for pattern strings and their relation to different locales.
… having looked at it, it’s quite complicated. I think we could cleanly drop it as a 1.0 requirement and layer it on as something extra that implementations can experiment with during the 1.0 period, to be considered for 2.0.
… We had previously agreed to try to do this; there are strong requirements for parsing different dates and numbers, so I’m a bit uncomfortable dropping it.
jtandy: I’ve not looked at the ISO datetime standard, but I understand that it includes a number of structures for how dates and times are recognized. Perhaps that would be a place to start.
... I think the ISO standard allows things to be changed a bit compared to XSD, but simpler than TR35.
<danbri> there was also http://www.w3.org/TR/NOTE-datetime pre-xml-schema
jenit: I don’t think it does, but it could. It may be that there’s some flexibility.
<jtandy> ACTION: jtandy to review ISO 8601 to determine if it supports 'locale' type strings for date-times [recorded in http://www.w3.org/2014/11/12-csvw-minutes.html#action01]
jenit: the other one is number format, such as using “,” instead of “.” as decimal point.
<trackbot> Created ACTION-57 - Review iso 8601 to determine if it supports 'locale' type strings for date-times [on Jeremy Tandy - due 2014-11-19].
danbri: is that covered by the dialect spec?
jenit: no, it’s not parsing the CSV into values, but parsing the values themselves.
<danbri> "1000,00"
jtandy: this is about picking up a string, which might have something like “nnn nnn,nn”, where it might be a decimal, vs a typo.
danbri: we ran into this with schema.org, and settled on the western method.
<danbri> see http://schema.org/price
jtandy: problem is, people don’t publish their data that way.
… we have a number of parsing directives, and having one to indicate decimal separator might be helpful.
<Zakim> danbri, you wanted to suggest "The CSVW Working Group considered requiring an implementation of http://www.unicode.org/reports/tr35/ pattern string formats. Given the complexity of
jenit: I was trying to think of a way forward which would enable us to make a more informed decision. I think Jtandy looking at ISO8601 would be very useful.
… I’ll look at what it takes to do the number parsing, and see if that’s something we want to go forward with.
<JeniT> ACTION: jenit to investigate what number parsing would look like if done right [recorded in http://www.w3.org/2014/11/12-csvw-minutes.html#action02]
<trackbot> Created ACTION-58 - Investigate what number parsing would look like if done right [on Jeni Tennison - due 2014-11-19].
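The separator problem under discussion (“1 000,00” vs “1,000.00”) could be handled with per-column parsing directives far simpler than full TR35 patterns; `decimal_char` / `group_char` here are hypothetical names for such directives:

```python
def parse_number(text, decimal_char=".", group_char=","):
    """Parse a formatted number string given separator hints.

    A minimal sketch of the kind of parsing directive discussed
    (e.g. "," as decimal point); TR35 pattern strings cover far
    more: significant digits, padding, per-locale symbols, etc.
    """
    cleaned = text.replace(group_char, "")  # strip grouping separators
    if decimal_char != ".":
        cleaned = cleaned.replace(decimal_char, ".")
    return float(cleaned)
```

So `parse_number("1 000,00", decimal_char=",", group_char=" ")` and `parse_number("1,000.00")` both yield the same value.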
<danbri> "Using JSON-LD for the metadata document "
jenit: this is just a copy-over of the issue as it was in the document. I think it’s completely resolvable as saying “yes, we are using JSON-LD”.
… The only question is if we should rename some of the JSON-LD keywords, such as @id and @type, so they don’t stick out.
<danbri> gkellogg: some of discussions have been to serve doc as json, provide context via a header
<danbri> … aiming to look like JSON rather than JSON-LD
<danbri> … makes some sense, aliasing those keywords
<danbri> jtandy: to clarify, … it is possible to have a json-ld that replaces @id with something else and it will all work
jtandy: I want to clarify that it’s possible to have JSON-LD that replaces @id with something else and it will work.
<danbri> jenit: that's aliasing and it is fine
jenit: yes, that’s aliasing.
jtandy: we’ll always have a context?
jenit: we’ll publish one, and people point to a context.
<danbri> drafting a proposal: "close 48: we agree our metadata files are JSON-LD, and we are taking various measures to minimise associated syntax burdens"
<danbri> [I have a hard finish in 2 mins due to another meeting.]
jenit: I propose we split the issue into two: how json-ld processors can recognize this, and the other is aliasing of keywords.
<danbri> +1
+1
<bill-ingram> +1
<jtandy> +1
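The keyword aliasing discussed above could look like the following context fragment, so metadata documents read as ordinary JSON while remaining valid JSON-LD (the URL and type value are illustrative only):

```json
{
  "@context": {
    "id": "@id",
    "type": "@type"
  },
  "id": "http://example.org/table/1",
  "type": "Table"
}
```

A JSON-LD processor expanding this document treats `id` and `type` exactly as it would `@id` and `@type`.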
<danbri> Adjourned.
probably some mess-up with topic vs subtopic to be fixed in minutes.
<danbri> yeah, not sure how to edit those files but at least most of the right words are in the irc log
<danbri> thanks gregg for scribing!
<JeniT> ACTION: jenit to close https://github.com/w3c/csvw/issues/48 and to open new issues on (a) JSON-LD processors recognising metadata documents as JSON-LD and (b) aliasing of JSON-LD keywords [recorded in http://www.w3.org/2014/11/12-csvw-minutes.html#action03]
<trackbot> Created ACTION-59 - Close https://github.com/w3c/csvw/issues/48 and to open new issues on (a) json-ld processors recognising metadata documents as json-ld and (b) aliasing of json-ld keywords [on Jeni Tennison - due 2014-11-19].
<danbri> trackbot, meeting is closed
<trackbot> Sorry, danbri, I don't understand 'trackbot, meeting is closed'. Please refer to <http://www.w3.org/2005/06/tracker/irc> for help.
<danbri> trackbot, end meeting