W3C

- DRAFT -

CSV on the Web Working Group Teleconference

19 Nov 2014

See also: IRC log

Attendees

Present
+44.207.346.aaaa, Ivan, danbri, jtandy
Regrets
Chair
Scribe
danbri

<trackbot> Date: 19 November 2014

hi folks

ok per http://lists.w3.org/Archives/Public/public-csv-wg/2014Nov/0044.html my proposal is that we do not meet formally this week, but if anyone wants to follow Jeremy's suggestion to talk about dates/functions/operators, we can chat via Zakim.

regrets from Jeni

<ivan> Hi danbri, I come in a minute

likewise

too many tabs open!

<jtandy> #2789

<jtandy> sorry - fire alarm test

talking dates, informally

[agreed not a full WG meeting]

https://github.com/w3c/csvw/issues/54

jtandy: last week I said I'd look at ISO8601

it doesn't allow for pattern strings

gives a number of ways people can provide data

quite restricted

only uses gregorian calendar

wasn't quite what i expected of it

ivan: 8601 … some b/g ?
... ah, … the one always cited.

jtandy: it doesn't allow you to say that I'll give month first then days then years (per US convention), or whatever pattern

… so this doesn't work for us

ivan: at least for RDF this is what we're supposed to use at output

jtandy: for RDF and for XML and a number of other places, 8601 is what is used

… however CSV is not so restricted so wild CSV has other date formats

…therefore for mapping we need to consume a natural date format and then publish it as ISO8601, xsd:dateTime etc

… I think from F2F the I18N guys said take a look at TR35

ivan: which is quite terrifying :)
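
A minimal sketch of the mapping jtandy describes (consume a natural date format, publish ISO 8601), assuming Python's datetime module and an strptime-style format string. The helper name natural_to_iso and the sample value are made up for illustration; nothing here was agreed by the group.

```python
from datetime import datetime

def natural_to_iso(value, fmt):
    """Parse a date string with an strptime format and return an ISO 8601 date."""
    return datetime.strptime(value, fmt).date().isoformat()

# A US-style month/day/year cell, published as an ISO 8601 date.
print(natural_to_iso("11/19/2014", "%m/%d/%Y"))  # 2014-11-19
```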

<jtandy> http://www.w3.org/TR/xpath-functions-30/#formatting-dates-and-times

jtandy: I've also looked at xpath, xquery functions

<- for gregg

… what happens there (url above) is that they provide a v good spec on how to turn ISO8601 internal date format into a natural date string

… but it is really very clear about how to do so

… even though it goes in the opposite direction to what we do

jtandy: specifically in http://www.w3.org/TR/xpath-functions-30/#formatting-dates-and-times section 9

<jtandy> http://www.w3.org/TR/xpath-functions-30/#dates-times

thanks

jtandy: if you go to 9.2.1 you see some examples

<jtandy> http://www.w3.org/TR/xpath-functions-30/#formatting-dates-and-times

<jtandy> section 9.8

9.8 Formatting dates and times

danbri: i assume function can't be reversed as ambiguous (e.g. US mon/day day/mon order)
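
To illustrate the ambiguity danbri raises, the following sketch parses one made-up value under two day/month orders, using Python's strptime as a stand-in for whatever parser an implementation would use. Without a declared pattern there is no way to tell which reading is intended.

```python
from datetime import datetime

value = "03/04/2014"

# Same string, two conventions, two different dates.
us = datetime.strptime(value, "%m/%d/%Y").date()  # month first
eu = datetime.strptime(value, "%d/%m/%Y").date()  # day first

print(us.isoformat(), eu.isoformat())  # 2014-03-04 2014-04-03
```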

jtandy: see Picture String section

9.8.4.1 The picture string

ivan: various of these are implemented across multiple languages

…however they don't define how to use such picture string

jtandy: various discussion on correct picture string representation

ivan: see 9.8.5 section

http://www.w3.org/TR/xpath-functions-30/#date-time-examples

jtandy: examples of output

danbri: reminiscent of URI Templates

format-time($t, "[h]:[m01]:[s01] o'clock [PN] [ZN,*-3]", "en", (), ())

gives

3:58:45 o'clock PM PDT

jtandy: if we said "a CSV impl must support gregorian calendar, and it may support other calendars… conversion functionality is impl dependent"

<ivan> https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior

ivan: re issues for implementors

… this {url} is py version of the same thing

… much is similar, although they use a % sign instead of [ ]

… and then another one, ...

jtandy: yes, Moment JS lib

ivan: similar but not identical.

jtandy: and then Java SimpleDateFormat

… lots of places in same space

… if an impl was using e.g. py impl, they'd need to map from our picture string

… which is at least similar

… i.e. not having to do the entire job

… over time people might standardize on one picture string syntax
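
A rough sketch of the mapping jtandy mentions, for an implementation built on Python's strptime. It assumes a restricted, numeric-only subset of the XPath picture string syntax. The function name picture_to_strptime and the COMPONENTS table are hypothetical. Width modifiers are ignored, and anything that is not a plain numeric component is rejected rather than guessed at.

```python
import re
from datetime import datetime

# Hypothetical subset: only numeric Gregorian components are mapped.
COMPONENTS = {
    "Y": "%Y",  # year
    "M": "%m",  # month as a number
    "D": "%d",  # day of month
    "H": "%H",  # hour (24-hour clock)
    "m": "%M",  # minute
    "s": "%S",  # second
}
MARKER = re.compile(r"\[([A-Za-z])([^\]]*)\]")

def picture_to_strptime(picture):
    """Translate a restricted XPath-style picture string to an strptime format.

    Digit-only modifiers such as "01" are accepted but ignored; the "[[" and
    "]]" escapes of the XPath syntax are not handled.
    """
    parts = []
    pos = 0
    for match in MARKER.finditer(picture):
        # Literal text between markers passes through, with "%" escaped.
        parts.append(picture[pos:match.start()].replace("%", "%%"))
        specifier, modifier = match.group(1), match.group(2)
        if specifier not in COMPONENTS:
            raise ValueError("unsupported component: " + specifier)
        if modifier and not modifier.isdigit():
            raise ValueError("unsupported modifier: " + modifier)
        parts.append(COMPONENTS[specifier])
        pos = match.end()
    parts.append(picture[pos:].replace("%", "%%"))
    return "".join(parts)

fmt = picture_to_strptime("[M01]/[D01]/[Y0001]")
print(fmt)  # %m/%d/%Y
print(datetime.strptime("11/19/2014", fmt).date().isoformat())  # 2014-11-19
```

Note that strptime accepts both "1" and "01" for %m, so the width information in the picture string is lost in this direction.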

ivan: it's pretty much of a mess

jtandy: the general situation or this? :)

ivan: … both? :)

… if I look at the python one, and Moment likely is similar, they have diff chars for diff things, e.g. capital Y, … whereas if my understanding is correct, in xpath they define microsyntaxes

… which is the worst; converting from this to the py one is a pita

<jtandy> http://momentjs.com/docs/

danbri: which dir do these tools go in?

ivan: both
... am tempted to ignore the xsd stuff

… and pick one of the existing formats and use that

… which is probably simpler, we'd make at least some implementors happier

<jtandy> Year, month, and day tokens

jtandy: for example, on Moment docs, in section Year Month Day tokens

… lists the types of tokens that you can use

…. we should look across these for consistency?

ivan: there is no consistency in sense that we know they use diff picture strings

<jtandy> consistency in picture string format?

ivan: considering js ruby java c, …. is there one format that dominates? that we can therefore declare a loser/winner

jtandy: is there one pic string format that is more standardized than others?

my concern with using these language impl picture strings is that none of them has been defined to the same rigour as xpath/xquery

jtandy: the w3c stuff more likely to address edge cases, i18n etc

ivan: we can say our default requirement is to use the iso spec etc

… however we can choose to allow picture strings

… as they are popular

… we can say they assume at least gregorian

… but we do not think of trying to address all use cases

[missing details]

jtandy: i'd advise not to use ordinal values / nominal names

ivan: we define the weekdays as numbers

<jtandy> (another dateformat implementation: https://code.google.com/p/jaxson/source/browse/trunk/jaxson/WebRoot/jsSrc/org/jaxson/util/SimpleDateFormat.js?r=41)

… months as numbers

… years as numbers

the seconds etc

… only numeric

… and UTC offset etc

… that certainly helps

… considering our Use Cases, how does this map against the practicalities?

jtandy: UCs don't talk about using other calendars; generally assume gregorian / CE

… do particular CSV files use american date formats?

ivan: if this is just a matter of order, … do they handle that? do they use names of the months?

CSV report: http://lists.w3.org/Archives/Public/public-csv-wg/2014Nov/0043.html

http://lists.w3.org/Archives/Public/public-csv-wg/2014Nov/att-0043/csv_profiler_report_copy.pdf

jtandy: weather date formats can be odd

http://en.wikipedia.org/wiki/Stardate

"The first two digits of the stardate are always "41." The 4 stands for 24th century, the 1 indicates first http://en.wikipedia.org/wiki/Television_program#Seasons.2Fseries."

jtandy: [missed something that sounded important]

ivan: e.g. say implementations are required to do that

<jtandy> i said: picture strings + gregorian calendar + Common Era ... named values (days, months etc.) are not permitted

<ivan> I said: picture strings + gregorian calendar + common era, using only numbers is required, additional features (named values) are optional to the implementation
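
One way an implementation might enforce the "numbers only" baseline, sketched under the assumption that picture strings use Python's strptime directives (the group had not picked a syntax at this point). The function name uses_named_values and the directive set are illustrative only.

```python
import re

# strptime directives that produce names rather than numbers:
# weekday names, month names, and AM/PM.
NAMED_DIRECTIVES = {"a", "A", "b", "B", "p"}

def uses_named_values(fmt):
    """Return True if the format relies on named (non-numeric) values."""
    # "%%" is consumed as one match, so an escaped percent sign is not
    # mistaken for the start of a directive.
    directives = re.findall(r"%(.)", fmt)
    return any(d in NAMED_DIRECTIVES for d in directives)

print(uses_named_values("%m/%d/%Y"))  # False: required baseline
print(uses_named_values("%d %B %Y"))  # True: optional feature
```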

dan: how much could be done with regex?

<jtandy> agree with ivan

<jtandy> question becomes "what picture string 'standard' shall we use"

would it be lossy, e.g. '01' and '1' representing January turn into '1'?

<jtandy> (@danbri - the XPath spec deals with lossy transforms like you suggest)
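
A small sketch of the regex approach danbri asks about, assuming a US-style month/day/year layout. The pattern and the helper name parse_us_date are made up for illustration. It shows the lossiness in question: "01" and "1" both become the integer 1, so the original spelling cannot be recovered from the parsed value. The regex also does not check ranges (month 13 would match).

```python
import re

DATE = re.compile(r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})")

def parse_us_date(text):
    """Return (year, month, day) as integers, or raise ValueError."""
    match = DATE.fullmatch(text)
    if match is None:
        raise ValueError("not a month/day/year date: " + text)
    return (int(match["year"]), int(match["month"]), int(match["day"]))

# Zero-padded and unpadded inputs collapse to the same value.
print(parse_us_date("01/05/2014") == parse_us_date("1/5/2014"))  # True
```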

ivan: if we decide to go with something in the js direction, is Moment somehow the main tool?

see also https://github.com/trending?l=javascript

ivan: the python one is the closest to standard in its own way

or java

danbri: don't forget .NET! they did a ton of work mapping across langs

ivan: Dart, Go, …

… we're doomed? :)

jtandy: action then is to review the common langs looking for similarities

… we'd need to specify simplifying restrictions if we wanted to include something

ivan: also problems with numbers

<jtandy> http://www.w3.org/TR/xpath-functions-30/#formatting-numbers

——

ivan: can we be v loose and say that … we do not require anything in the specification sense. What we could do is something like … first we use the ISO date spec, …

… we use that, we already have issue of putting picture/format string into metadata somewhere

… we let impl decide which one they understand

… the way the metadata is defined they can give an array of picture strings for a column

… and the impl can choose which

… hope that at least one is understood

danbri: should we come up with short name codes for each pic string language

ivan: yeah
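
A sketch of how the array-of-picture-strings idea might look with short name codes per syntax. The codes ("xpath", "moment", "java", "strptime"), the column_formats structure and the choose_format helper are all hypothetical; no metadata vocabulary for this had been defined. This implementation only understands the strptime entry and picks it.

```python
from datetime import datetime

# Hypothetical metadata fragment: the same pattern in several syntaxes,
# each tagged with a short code naming the picture string language.
column_formats = [
    ("xpath", "[M01]/[D01]/[Y0001]"),
    ("moment", "MM/DD/YYYY"),
    ("java", "MM/dd/yyyy"),
    ("strptime", "%m/%d/%Y"),
]

# The syntaxes this particular implementation can interpret.
SUPPORTED = {"strptime"}

def choose_format(formats, supported):
    """Return the first (code, picture) pair the implementation understands."""
    for code, picture in formats:
        if code in supported:
            return code, picture
    return None

chosen = choose_format(column_formats, SUPPORTED)
if chosen is not None:
    _, picture = chosen
    print(datetime.strptime("11/19/2014", picture).date().isoformat())  # 2014-11-19
```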

jtandy: can you see any metadata publishers bothering with that?

ivan: decent people will use the iso format

er

i'll redial

<jtandy> where are you?

<jtandy> lost in space

<jtandy> danbri: proposes that "we strongly recommend for interoperability that you supplement your data with ISO 8601 datetime values"

<jtandy> ... when publishing CSV data on the web for general consumption

<jtandy> (so if it's not in ISO 8601 format, then we just treat it as a string for down stream resolution?)

<jtandy> ivan: if it's not in ISO 8601 format, we don't check - so it will be an error in the RDF
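
A sketch of the "treat it as a string" option from jtandy's question, assuming Python's datetime.fromisoformat as the ISO 8601 check. That function accepts only a subset of ISO 8601, and the subset varies by Python version, so this is an approximation. The function name to_xsd_datetime and the returned type labels are illustrative.

```python
from datetime import datetime

def to_xsd_datetime(cell):
    """Return ("dateTime", iso_value) if the cell parses, else ("string", cell)."""
    try:
        parsed = datetime.fromisoformat(cell)
    except ValueError:
        # Not recognised as ISO 8601: pass the raw text downstream.
        return ("string", cell)
    return ("dateTime", parsed.isoformat())

print(to_xsd_datetime("2014-11-19T15:59:55"))  # ('dateTime', '2014-11-19T15:59:55')
print(to_xsd_datetime("11/19/2014"))           # ('string', '11/19/2014')
```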

<jtandy> we discussed using a hash-table of common picture string formats for javascript, ruby, python etc.

<ivan> trackbot, end telcon

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2014/11/19 15:59:55 $
