W3C

- DRAFT -

RDB2RDF Working Group Teleconference

15 Nov 2011

Agenda

See also: IRC log

Attendees

Present
Ivan, +3539149aaaa, mhausenblas, Ashok_Malhotra, cygri, +1.781.273.aabb, MacTed, nunolopes, +1.603.897.aacc, Souri, +1.314.394.aadd, Michael, Richard, Ted, Ashok, Nuno, David, Marcelo, Eric, Seema
Regrets
Boris, Joerg
Chair
Michael
Scribe
mhausenblas

<trackbot> Date: 15 November 2011

<scribe> scribenick: mhausenblas

<scribe> Chair: Ashok

Admin

<Ashok> PROPOSAL: Accept the minutes of F2F meeting http://www.w3.org/2011/11/08-RDB2RDF-minutes.html

+1

RESOLUTION: Accept the minutes of F2F meeting http://www.w3.org/2011/11/08-RDB2RDF-minutes.html

LC Comments - DM

Ashok: http://www.w3.org/2001/sw/rdb2rdf/wiki/Last_Call#14_DM_and_R2RML_should_use_same_datatype_mapping and http://www.w3.org/2001/sw/rdb2rdf/wiki/Last_Call#15_DM:_PLUS_SIGN_character_in_value_of_a_pkey_column are open
... is this it?

<Zakim> cygri, you wanted to say it doesn't apply to R2RML

<cygri> http://www.w3.org/2001/sw/rdb2rdf/r2rml/#dfn-iri-safe

Richard: Re LC comment 15, I think this does not apply
... the issue pointed out by Souri is only relevant for DM

Marcelo: Not looked at the LC comments yet

<Souri> handling in R2RML: "R2RML always performs percent-encoding when IRIs are generated from string templates. If IRIs need to be generated without percent-encoding, then rr:column should be used instead of rr:template, with an R2RML view that performs the string concatenation."

Ashok: Juan looked at the DM LC comments and reported back to the WG that all but 14/15 have been minor
... Eric, can you confirm?

Eric: Looking at it now

<ivan> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0037.html

Ivan: this is the mail Juan sent earlier today

Eric: I'm looking at Juan's message now

<cygri> +1!

Ivan: Comment re Appendix A
... some small mistakes, confirmed to be fixed

Eric: Will send official response, yes

<cygri> +1 to Souri

Souri: R2RML also has the encoding issue (re LC comment 15)

Ashok: Eric and Marcelo - could you speak to this issue from the DM POV?

ericP: when creating the identifiers, someone parsing them needs to understand the escape characters, at least for boundaries
... the most aggressive escaping would result in all of the Unicode chars being encoded as two or more UTF-8 octets
... this is an argument against using the form-enc from HTML
... we take a more relaxed approach in the DM

<cygri> http://www.w3.org/2001/sw/rdb2rdf/r2rml/#dfn-iri-safe

Richard: this is what R2RML does ATM
... so, anything that is not allowed in an IRI, or is a boundary char, is percent-encoded
... fairly conservative encoding
... we don't encode Unicode char in general
... one difference between DM and R2RML is in the handling of spaces
... there are two questions to it: 1) which chars to encode and 2) how to handle spaces

<Souri> space => '+' in DM vs. space => '%20' in R2RML

Richard: I argued for %20 as it doesn't clash with +
... and '+' for space is also not intended as a general IRI escape
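
As a rough illustration of the two space conventions under discussion (not taken from either spec), Python's standard `urllib.parse` module offers both: `quote` percent-encodes a space as `%20`, while `quote_plus` follows the HTML form convention and emits `+`. The sample value is made up.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "C++ primer"  # made-up key value containing both '+' and a space

# Percent-encoding style (space => %20); safe="" means '/' is not exempted either
pct = quote(value, safe="")
# HTML form style (space => '+'), so a literal '+' has to become %2B
form = quote_plus(value)

print(pct)   # C%2B%2B%20primer
print(form)  # C%2B%2B+primer

# Each string round-trips only with the matching decoder
assert unquote(pct) == value
assert unquote_plus(form) == value

# Decoding the form-style string as plain percent-encoding keeps the '+',
# so the space is silently turned into a plus sign
misread = unquote(form)
print(misread)  # C+++primer
```

With `%20` throughout, a consumer needs only one decoding rule; with `+` for spaces it also has to know that the form convention was used.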

Eric: A counter use case would be a 'cheesy' DM browser built by leveraging an HTML form

Richard: Hmm ... how would you generate the URIs in this setup?

(Eric explains details via HTML form)

Richard: what exactly would this look like?

Eric: Guess there is no way getting around the '?', true

Richard: So, URI generation would be needed

Michael: see also http://xkr.us/articles/javascript/encode-compare/

<MacTed> +1 for percent encoding in all cases, losing the "+" space substitution special case

Ashok: Can we resolve this then?
... Eric, can you address LC comment 15 with this?

Michael: What's the resolution then?

Eric: Boundaries still unclear, spaces resolved, yes

Richard: IIRC we had an issue for this

<cygri> http://www.w3.org/2001/sw/rdb2rdf/track/issues/67

ISSUE-67?

<trackbot> ISSUE-67 -- Separation characters for reference IRIs and row IRIs -- closed

<trackbot> http://www.w3.org/2001/sw/rdb2rdf/track/issues/67

Richard: where is this addressed in the DM?

Eric: Will try to figure out now - we moved a bit around when addressing LC comments

<Ashok> Juan: What is the latest version of the DM document?

Richard: It was closed before LC, and there is an info box in the DM - marked as open question
... might make sense to re-open ISSUE-67, then?

<juansequeda> Ashok, it's http://www.w3.org/2001/sw/rdb2rdf/directMapping/. Looks like ericP updated it recently

<ericP> is that Overview.html or Overview.xml ?

<juansequeda> Overview.xml

<ericP> tx

<cygri> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0035.html

<ericP> so we're going to have to mix that with the LC 'cause Overview.xml is out of date with the LC

<ericP> (i think)

Richard: Re LC comment 14
... I'm quite happy with the R2RML setting
... DM should normatively reference it

<cygri> http://www.w3.org/2001/sw/rdb2rdf/r2rml/#datatype-conversions

Richard: re-worded it, still same behaviour
... makes it easier to reference it from DM
... and added a note that points to XML Schema 1.1
... addressed also limited precision issue
... I proposed a change in the DM in http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0035.html
... DM would essentially reference sec. 10.2

Ashok: Would the DM Editors be ok with Richard's proposal?

<ericP> http://localhost/2001/sw/rdb2rdf/directMapping/#defn-literal_map

<ericP> http://www.w3.org/2001/sw/rdb2rdf/directMapping/#defn-literal_map

Eric: I was thinking about something smaller, see URL above
... I agree that both should favour the canonical XML Schema/SQL form
... if we use cast-to-str approach it might introduce backward compatibility issues
... I'd prefer a simple, short table - would be a sub-set of Richard's section

<Souri> Aside: Please confirm: what would be the generated IRI for EMAIL="John.Smith" in both R2RML (with rr:template "http://SCOTT.EMP#EMAIL-{EMAIL}") and in DM: http://SCOTT.EMP#EMAIL-John%2ESmith ?
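
For the R2RML half of this aside, here is a minimal sketch. It assumes the iri-safe definition linked earlier means "percent-encode every character outside the `iunreserved` production of RFC 3987". The function names are hypothetical, template escaping is ignored, and DM behaviour is not covered.

```python
import re

def is_iunreserved(ch: str) -> bool:
    """True if ch matches iunreserved (ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar)."""
    if ch.isascii():
        return ch.isalnum() or ch in "-._~"
    cp = ord(ch)
    # ucschar ranges in the Basic Multilingual Plane
    if 0xA0 <= cp <= 0xD7FF or 0xF900 <= cp <= 0xFDCF or 0xFDF0 <= cp <= 0xFFEF:
        return True
    # ucschar ranges in planes 1 to 14
    plane, offset = divmod(cp, 0x10000)
    if 1 <= plane <= 13:
        return offset <= 0xFFFD
    return plane == 14 and 0x1000 <= offset <= 0xFFFD

def iri_safe(value: str) -> str:
    """Percent-encode the UTF-8 bytes of every character that is not iunreserved."""
    out = []
    for ch in value:
        if is_iunreserved(ch):
            out.append(ch)
        else:
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
    return "".join(out)

def expand_template(template: str, row: dict) -> str:
    """Replace each {COLUMN} with the iri-safe form of that column's value (simplified)."""
    return re.sub(r"\{([^{}]+)\}", lambda m: iri_safe(str(row[m.group(1)])), template)

print(expand_template("http://SCOTT.EMP#EMAIL-{EMAIL}", {"EMAIL": "John.Smith"}))
# http://SCOTT.EMP#EMAIL-John.Smith
print(iri_safe("Hello World!"))
# Hello%20World%21
```

Under that assumption the "." stays unencoded, because it is an unreserved character; only characters such as the space or "!" are escaped.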

Ashok: So, RDF uses the XML Schema form

Eric: yes, it maximises graph merges

Richard: re user defined types, the spec can't define behaviour beyond very generic things
... the 'try-to-cast-to-str' is more helpful than not saying anything
... Giving some default handling for this helps interop

Eric: All user defined types things should be notes/implementer hints then?

Ashok: yes

Richard: R2RML does use note re user defined types already, yes

Ashok: seems we're in agreement re user defined types
... what about canonical forms

Richard: Dunno difference between SQL and XSD canonical form
... not convinced we need canonical forms at all

Ashok: Good to use canonical form for interop reasons

Richard: If your processor is aware of datatypes, then no problem
... if not aware, then you'd get the identical string representation, but only for equality
... not a huge advantage IMO
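
The trade-off can be sketched with two lexical forms of the same decimal value. `normalise_decimal` is a hypothetical helper that only strips redundant zeros; it is not claimed to be the XSD or the SQL canonical form.

```python
from decimal import Decimal

def normalise_decimal(lexical: str) -> str:
    """Hypothetical normalisation: drop redundant leading and trailing zeros."""
    return format(Decimal(lexical).normalize(), "f")

a, b = "1.50", "01.5"  # made-up lexical forms of the same value

# A processor that is not datatype-aware compares the strings as written
plain_equal = (a == b)

# A datatype-aware processor compares the values
value_equal = (Decimal(a) == Decimal(b))

# After normalisation even plain string comparison succeeds,
# which helps equality tests but gives no ordering or arithmetic
normalised_equal = (normalise_decimal(a) == normalise_decimal(b))

print(plain_equal, value_equal, normalised_equal)  # False True True
```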

(Souri had to leave the telecon)

Richard: Still not clear what the XSD canonical forms are

Ashok: Wondering how we should advance then, here ...

Richard: Yes, it's not a huge advantage and it is expensive, sorta

Eric: Would help, for example, in SPARQL queries

Richard: IMO it would be better if the underlying RDF processor handles this rather than in our realm

Eric: I would like to see input normalisation as part of the spec/conformance

Richard: It's important to have a deterministic output, yes
... we need to pick some kind of normalization
... but we would also get it by applying SQL canonical form
... RDF also defines it, underlying software is expected to implement it

Eric: I think that a large number of SPARQL engines do only RDF entailment

Richard: True, but that's a general problem - doesn't improve if we address it in RDB2RDF
... makes our job harder where it really should be resolved on a larger scale

Eric: In SPARQL there is a distinction between FILTER and graph match
... we should ask what we expect people to do - should people write it in a canonical form or not?

Richard: RDF doesn't say how to canonicalise, but during query time

Ted: Level of canonicalisation varies a lot
... age, for example, does this mean years, etc?
... Too many special cases that we can't cater for

Ashok: we will carry on via email
... this issue and translate tables are the two big ones left

Michael: I'm on travel next week

Ashok: Will send out the agenda then

meeting adjourned

trackbot, end telecon

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2011/11/15 18:04:55 $

Scribe.perl diagnostic output

[Delete this section before finalizing the minutes.]
This is scribe.perl Revision: 1.136  of Date: 2011/05/12 12:01:43  
Check for newer version at http://dev.w3.org/cvsweb/~checkout~/2002/scribe/

Guessing input format: RRSAgent_Text_Format (score 1.00)

Found ScribeNick: mhausenblas
Inferring Scribes: mhausenblas
Agenda: http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Nov/0040.html
Found Date: 15 Nov 2011
Guessing minutes URL: http://www.w3.org/2011/11/15-RDB2RDF-minutes.html
People with action items: 

[End of scribe.perl diagnostic output]