W3C

- DRAFT -

RDB2RDF Working Group Teleconference

01 May 2012

Agenda: http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012May/0001.html

See also: IRC log

Attendees

Present
Ashok_Malhotra, mhausenblas, cygri, MacTed, dmcneil, Ivan, nuno, nunolopes, Souri, juansequeda, seema, EricP, Michael, Ashok, Ted, Richard, David, Nuno, Juan, Seema, Eric
Regrets
Joerg, Marcelo, Boris
Chair
Michael
Scribe
Ashok


<trackbot> Date: 01 May 2012

Sure!

<mhausenblas> Thanks!

<mhausenblas> scribenick: Ashok

Minutes of Last Meeting

PROPOSAL: Accept the minutes of last meeting http://www.w3.org/2012/04/24-RDB2RDF-minutes.html

<cygri> +1

Minutes accepted without objection

Fixing an omission in R2RML: syntax of blank node labels

<mhausenblas> PROPOSAL: Change Section 11.2 of R2RML (http://www.w3.org/TR/2012/CR-r2rml-20120223/#generated-rdf-term) from:

<mhausenblas> [[

<mhausenblas> If the term type is rr:BlankNode: Return a blank node whose blank node identifier is the natural RDF lexical form corresponding to value.

<mhausenblas> ]]

<mhausenblas> to:

<mhausenblas> [[

<mhausenblas> If the term type is rr:BlankNode: Return a blank node that is unique to the natural RDF lexical form corresponding to value.

<mhausenblas> NOTE: RDF syntaxes and RDF APIs generally represent blank nodes with blank node identifiers. But the characters allowed in blank node identifiers differ between syntaxes, and not all characters occurring in value may be allowed, so a bijective mapping function from values to valid blank node identifiers may be required. The details of this mapping function are implementation-dependent, and an R2RML processor may have to use different functions for different output syntaxes or access interfaces. Strings matching the regular expression [a-zA-Z_][a-zA-Z_0-9-]* are valid blank node identifiers in all W3C-recommended RDF syntaxes.

<mhausenblas> ]]

<dmcneil> +q

<MacTed> +1

RESOLUTION: Proposal accepted without change
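
Editorial illustration (not part of the call): the NOTE above leaves the mapping function to implementations. A minimal Python sketch of one possible function follows; the names blank_node_label and label_to_value are hypothetical, and hex-encoding the UTF-8 bytes behind a fixed letter prefix is just one assumption-laden choice that stays inside the quoted regular expression.

```python
import re

# Pattern quoted in the accepted NOTE: labels matching it are valid
# blank node identifiers in all W3C-recommended RDF syntaxes.
SAFE_LABEL = re.compile(r"[a-zA-Z_][a-zA-Z_0-9-]*")


def blank_node_label(value: str) -> str:
    """Map an arbitrary string to a label matching SAFE_LABEL.

    Hypothetical helper: prefixes 'b' and hex-encodes the UTF-8 bytes,
    so distinct values always yield distinct labels.
    """
    return "b" + value.encode("utf-8").hex()


def label_to_value(label: str) -> str:
    """Invert blank_node_label, showing the mapping loses no information."""
    return bytes.fromhex(label[1:]).decode("utf-8")


label = blank_node_label("Bob Smith/42")
print(label, bool(SAFE_LABEL.fullmatch(label)))
```

The labels are long but reversible; a processor targeting a single syntax could use a shorter encoding.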

<ivan> 1

Using non-existing column in mapping

Using non-existing column in mapping http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0018.html

<mhausenblas> Section 6 of the R2RML spec (http://www.w3.org/TR/2012/CR-r2rml-20120223/#triples-map) states:

<mhausenblas> [[

<mhausenblas> The referenced columns of all term maps of a triples map (subject map, predicate maps, object maps, graph maps) MUST be column names that exist in the term map's logical table.

<mhausenblas> ]]

<mhausenblas> PROPOSAL: Any referenced columns (that is, the columns mentioned in rr:column, rr:template, etc.) that don't exist in the rr:sqlQuery or table are treated simply as being NULL, rather than being considered an error. Perhaps the case could still be treated as a warning:

<mhausenblas> [[

<mhausenblas> Processors MAY warn mapping authors if a referenced column does not exist in the logical table.

<mhausenblas> ]]

<MacTed> I'd prefer SHOULD to MAY

<juansequeda> +1 to MacTed's comment

<Souri> -1 (prefer error over warning -- returning wrong value from a query could be disastrous)

<MacTed> SHOULD warn, MAY error?

<mhausenblas> PROPOSAL: Any referenced columns (that is, the columns mentioned in rr:column, rr:template, etc.) that don't exist in the rr:sqlQuery or table are treated simply as being NULL, rather than being considered a warning. Processors SHOULD warn mapping authors if a referenced column does not exist in the logical table.

<MacTed> PROPOSAL: Any referenced columns (that is, the columns mentioned in rr:column, rr:template, etc.) that don't exist in the rr:sqlQuery or table are treated simply as being NULL, rather than being considered a warning. Processors SHOULD error, MAY warn mapping authors if a referenced column does not exist in the logical table.

<dmcneil> +q

<MacTed> PROPOSAL: Processors SHOULD error if any referenced columns (that is, the columns mentioned in rr:column, rr:template, etc.) don't exist in the rr:sqlQuery or table. Processors MAY treat such columns as NULL but if so, SHOULD return a warning to mapping authors.

<Souri> +1 to David's comment

<cygri> Ashok++

<Souri> What exactly was the earlier statement about it?

<juansequeda> Can't we just choose one?

<juansequeda> PROPOSAL: Processors MUST error if any referenced columns (that is, the columns mentioned in rr:column, rr:template, etc.) don't exist in the rr:sqlQuery or table.

<Souri> +0 from me (as long as earlier statement said "error" is the norm)

RESOLUTION: WG decided not to change the spec in this regard
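
Editorial illustration (not part of the call): since the spec text stays as is, a referenced column that is missing from the logical table remains an error. A rough Python sketch of such a check is below. The names MappingError, template_columns and check_referenced_columns are hypothetical, and the sketch makes simplifying assumptions: it only skips backslash-escaped braces in rr:template strings and compares column names case-sensitively.

```python
import re


class MappingError(ValueError):
    """Hypothetical error type for an invalid mapping."""


def template_columns(template: str) -> list[str]:
    """Return column names referenced as {COLUMN} in an rr:template string.

    Simplification: braces escaped with a backslash are skipped, but no
    other part of the R2RML template syntax is handled.
    """
    return re.findall(r"(?<!\\)\{([^{}]+)\}", template)


def check_referenced_columns(referenced, logical_table_columns):
    """Raise MappingError if a referenced column is not in the logical table."""
    available = set(logical_table_columns)
    missing = [col for col in referenced if col not in available]
    if missing:
        raise MappingError(f"referenced columns not in logical table: {missing}")


cols = template_columns("http://data.example.com/employee/{EMPNO}/{DEPTNO}")
check_referenced_columns(cols, ["EMPNO", "ENAME", "DEPTNO"])  # passes silently
```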

Unnamed columns in rr:sqlQuery

<mhausenblas> Section 5.2 of the R2RML spec (http://www.w3.org/TR/2012/CR-r2rml-20120223/#r2rml-views) states:

<mhausenblas> [[

<mhausenblas> Any columns in the SELECT list derived by projecting an expression MUST be named.

<mhausenblas> ]]

<mhausenblas> PROPOSAL: Above to be replaced with an informative note:

<mhausenblas> [[

<mhausenblas> Note: Any columns in the SELECT list derived by projecting an expression should be explicitly named because otherwise they cannot be referenced in the rest of the mapping.

<mhausenblas> ]]

Unnamed columns in rr:sqlQuery http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0017.html

<Souri> +1

<juansequeda> +1

<Souri> do we need to add "unique" ?

<Souri> Any columns in the SELECT list derived by projecting an expression MUST be explicitly named for those column names to be referenced in the rest of the mapping.

<MacTed> PROPOSAL: Any columns in the SELECT list derived by projecting an expression SHOULD be named.

<ericP> what if you define the behavior in terms of a referenceable view?

<ericP> (SQL:referenceable view)

<ericP> that way the SQL spec defines it for you

<cygri> +1

RESOLUTION: Last proposal accepted by WG
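
Editorial illustration (not part of the call): the reason for naming projected expressions can be seen with any SQL engine. The Python sketch below uses the standard sqlite3 module with a made-up EMP table; select_column_names is a hypothetical helper. The name SQLite assigns to an unaliased expression is implementation-dependent, so only the aliased column has a name a mapping can rely on.

```python
import sqlite3


def select_column_names(query: str) -> list[str]:
    """Run a query against a throwaway in-memory table and return the
    column names reported for its result set."""
    conn = sqlite3.connect(":memory:")
    try:
        # Made-up table, used only for this illustration.
        conn.execute("CREATE TABLE EMP (fname TEXT, lname TEXT)")
        cursor = conn.execute(query)
        return [column[0] for column in cursor.description]
    finally:
        conn.close()


# With an alias, rr:column or rr:template can reference "full_name".
named = select_column_names("SELECT fname || ' ' || lname AS full_name FROM EMP")
# Without an alias, the name is whatever the engine chooses.
unnamed = select_column_names("SELECT fname || ' ' || lname FROM EMP")
print(named, unnamed)
```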

XSD mapping for binary columns (xsd:hexBinary vs. xsd:base64Binary)

<mhausenblas> PROPOSAL: Binary datatypes like BLOB and VARBINARY should be mapped to xsd:hexBinary instead of xsd:base64Binary.

<MacTed> +1

XSD mapping for binary columns (xsd:hexBinary vs. xsd:base64Binary) http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0020.html

<ericP> +1

<ivan> ±0

RESOLUTION: Proposal accepted

Implementability for tables w/o primary key

<mhausenblas> PROPOSAL: In the DM spec, replace the following text:

<mhausenblas> [[

<mhausenblas> If the table has no primary key, the row node is a fresh blank node that is unique to this row.

<mhausenblas> ]]

<mhausenblas> with this:

<mhausenblas> [[

<mhausenblas> If the table has no primary key, the row node is a blank node. Distinct blank nodes MUST be generated for rows with distinct column values. For duplicate rows with identical values, implementations SHOULD generate a fresh blank for each duplicate row (resulting in a non-lean RDF graph [RDF Semantics]). However, if the underlying database system does not provide any means to reliably differentiate among the rows, then implementations MAY re-use the same blank node for multiple duplicate rows (resulting in a lean RDF graph). Implementations SHOULD document and advertise their chosen behavior.

<mhausenblas> ]]

RESOLUTION: Binary datatypes like BLOB and VARBINARY should be mapped to xsd:hexBinary instead of xsd:base64Binary.
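
Editorial illustration (not part of the call): the two lexical forms named in this resolution differ only in encoding. A small Python sketch using the standard library follows; to_hex_binary and to_base64_binary are hypothetical names. The upper-case output reflects the canonical xsd:hexBinary form.

```python
import base64


def to_hex_binary(data: bytes) -> str:
    """Lexical form for xsd:hexBinary (upper-case hex digits)."""
    return data.hex().upper()


def to_base64_binary(data: bytes) -> str:
    """Lexical form for xsd:base64Binary, shown for comparison."""
    return base64.b64encode(data).decode("ascii")


blob = bytes([0x00, 0xFF, 0x10])
print(to_hex_binary(blob))     # 00FF10
print(to_base64_binary(blob))  # AP8Q
```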

<ericP> i think very few people are aware of lean graphs

<ericP> especially SPARQL users

<ivan> +1 to Eric

<ivan> I think it would just frighten people

<MacTed> +1 as written

Remove: (resulting in a lean RDF graph)

<mhausenblas> PROPOSAL: If the table has no primary key, the row node is a blank node. Distinct blank nodes MUST be generated for rows with distinct column values. For duplicate rows with identical values, implementations SHOULD generate a fresh blank for each duplicate row. However, if the underlying database system does not provide any means to reliably differentiate among the rows, then implementations MAY re-use the same blank node for multiple duplicate rows.

<mhausenblas> Implementations SHOULD document and advertise their chosen behavior.

<cygri> ericP: http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0036.html

<MacTed> possible informative note -- "The result of reusing the same blank node for multiple rows is a lean RDF graph [RDF Semantics]; generating fresh blank nodes for each row results in a non-lean RDF graph. One significant implication of a lean RDF graph is loss of cardinality."

<juansequeda> PROPOSAL: If the table has no primary key, the row node is a blank node. Distinct blank nodes MUST be generated for rows with distinct column values. For duplicate rows with identical values, implementations SHOULD generate a fresh blank for each duplicate row. However, if the underlying database system does not provide any means to reliably differentiate among the rows, then implementations MAY re-use the same blank node for multiple duplicate rows w

<juansequeda> implies loss of cardinality

<juansequeda> My prediction: most probably, implementors will not support SPARQL to SQL on the direct mapping on tables with no primary keys because 1) it's complicated to implement 2) corner case

<cygri> juansequeda++

<ericP> D012-2tables2duplicates0nulls

Ivan: Asks about our plans going forward

Michael: There are 2 open issues. We need to resolve these.

Ivan: So it looks like we are in second last call
... Because of the moratorium we need to publish before next Thursday.
... so we have some time pressure

<ivan> https://lists.w3.org/Archives/Member/chairs/2012AprJun/0043.html

Richard: The last 2 issues make the DM and R2RML mapping incompatible

<mhausenblas> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0070.html

DM cannot be implemented as an R2RML mapping (period encoding) http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0021.html

RESOLUTION: re DM cannot be implemented as an R2RML mapping (period encoding) the WG agrees to go with http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2012Apr/0070.html

<cygri> ericP, how about tomorrow after the rdf-wg call?

Request Eric and Richard discuss over email or on the phone.

We will resolve this issue on the May 8 call, one way or another.

<ivan> bye guys

<ivan> have fun:-)

ADJOURNED

<ericP> SELECT * { ?I <IOUs#fname> ?fname ; <IOUs#amount> ?amount . ?L <Lives#fname> ?fname ; <Lives#city> ?city }

<ericP> │ _:a │ _:d │ 3.0E1 │ "London" │ "Bob" │

<ericP> │ _:a │ _:f │ 3.0E1 │ "London" │ "Bob" │

<ericP> │ _:c │ _:d │ 3.0E1 │ "London" │ "Bob" │

<ericP> │ _:c │ _:f │ 3.0E1 │ "London" │ "Bob" │

<ericP> │ _:b │ _:e │ 2.0E1 │ "Madrid" │ "Sue" │

<ericP> SELECT * FROM IOUs INNER JOIN Lives ON IOUs.fname=Lives.fname;

<ericP> | Bob | Smith | 30 | Bob | Smith | London |

<ericP> | Bob | Smith | 30 | Bob | Smith | London |

<ericP> | Sue | Jones | 20 | Sue | Jones | Madrid |

<ericP> | Bob | Smith | 30 | Bob | Smith | London |

<ericP> | Bob | Smith | 30 | Bob | Smith | London |
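
Editorial illustration (not part of the call): the effect discussed under "Implementability for tables w/o primary key" can be reproduced with a small Python model. The table contents below are an assumption reconstructed from the result rows pasted above (two identical Bob rows in each table, one Sue row), and row_nodes and join_on_fname are hypothetical helpers, not part of any implementation mentioned in the call.

```python
from itertools import count

# Assumed table contents, inferred from the pasted query results.
IOUS = [("Bob", "Smith", 30), ("Sue", "Jones", 20), ("Bob", "Smith", 30)]
LIVES = [("Bob", "Smith", "London"), ("Sue", "Jones", "Madrid"),
         ("Bob", "Smith", "London")]


def row_nodes(rows, prefix, fresh_per_row):
    """Pair each row with a blank node label.

    fresh_per_row=True gives every row its own node, duplicates included;
    fresh_per_row=False re-uses one node for rows with identical values.
    """
    counter = count()
    shared = {}
    nodes = []
    for row in rows:
        if fresh_per_row:
            nodes.append((f"_:{prefix}{next(counter)}", row))
        else:
            if row not in shared:
                shared[row] = f"_:{prefix}{next(counter)}"
            nodes.append((shared[row], row))
    return nodes


def join_on_fname(ious, lives):
    """Distinct solutions of a join on fname. A set is used because an RDF
    graph is a set of triples, so re-used nodes yield identical solutions."""
    return {
        (i_node, l_node, i_row[2], l_row[2], i_row[0])
        for i_node, i_row in ious
        for l_node, l_row in lives
        if i_row[0] == l_row[0]
    }


fresh = join_on_fname(row_nodes(IOUS, "i", True), row_nodes(LIVES, "l", True))
reused = join_on_fname(row_nodes(IOUS, "i", False), row_nodes(LIVES, "l", False))
print(len(fresh), len(reused))  # 5 2
```

Under these assumptions, fresh blank nodes give five solutions, matching the five rows of the SQL join, while re-used blank nodes give two, which is the loss of cardinality mentioned during the call.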

<mhausenblas> trackbot, end telecon

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.136 (CVS log)
$Date: 2012/05/01 17:35:22 $
