RDB2RDF Working Group Teleconference

21 Sep 2010


See also: IRC log


Present: Souri, Richard, Juan, EricP, Michael, Ashok, Harry, Boris, Ivan, Ted, Nuno, Soeren
Regrets: Li_Ma, Wolfgang_Halb


<trackbot> Date: 21 September 2010

<Ashok> meeting: RDB2RDF

<scribe> scribenick: mhausenblas

Agenda is at http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Sep/0048.html


PROPOSAL: Accept the minutes of last meeting, see http://www.w3.org/2010/09/14-rdb2rdf-minutes.html

<hhalpin> +1

RESOLUTION: WG has accepted the minutes from last time

Change name of language to SQRL?

Michael: let's focus on FPWD for now

<hhalpin> my main comment is to leave that to the editors

Ashok: second that

Any comments on FPWD

<Souri> I have not had much time to work on it due to OOW

Ashok: Souri, Seema and Richard are working on it - any comments?

ericP: still some issues re CVS, we're working on it

hhalpin: do all editors have CVS working?

cygri: I believe it should be working now, yes - sent credentials to ericP

Michael: I can assist, yes

hhalpin: Souri and Seema?

<ericP> cygri, all three of you will get email when the sysfolks have dealt with the request

<hhalpin> straight XHTML

cygri: editors will sort out, yes

Souri: got hhalpin's mail but have not tested it yet

<hhalpin> next week is fine for me.

Ashok: so question is - are we on track for a FPWD by end of month?

cygri: how much content would you expect?

Ashok: a skeleton would be sufficient

<ericP> it's a balance between getting early feedback and taking advantage of an opportunity to make a splash

cygri: will have this, with a lot of details not yet nailed down

Ashok: completely fine if FPWD has questions/comments in there

hhalpin: would be good to have something out there by end of week (heartbeat requirement)
... gather early feedback

<hhalpin> heartbeat check/schedule by Sept 30th.

R2RML Semantics and direct mapping

Ashok: I'd prefer to have the semantics right after FPWD
... so this week we have ericP on the call

<juansequeda> Eric has been talking with Marcelo (... right?)

Ashok: did you and juansequeda agree on semantics, yet?

ericP: trying to balance against the Datalog one
... couple of issues (PK, FK to tuple, etc.)

Ashok: what doc are you referring to?

ericP: the maths for creating URI is plain English (?)

<hhalpin> I would like to see EricP list out these issues in IRC.

ericP: still believe the set notation would be more sound

<juansequeda> yes

<juansequeda> having problems with my phone

<juansequeda> can you hear me?

<juansequeda> I'm going to redial

hhalpin: would be good if we go with the approach the majority of the WG prefers
... can ericP list the issues?

<juansequeda> I talked to several people at Microsoft

<juansequeda> they said either Datalog or even relational algebra

juansequeda: ericP was working with Marcello

ericP: we exchanged a bunch of documents, yes

juansequeda: talked to MS people, for them Datalog is fine or relational algebra
... want to approach IBM people

soeren: think relational algebra would be easier to understand
... I already sent a draft

juansequeda: response I got was half half (between datalog and relational algebra)

hhalpin: more or less same in the DB group in Edinburgh

<hhalpin> the point is they can generally be made equivalent

juansequeda: seems to be two camps
... we can put them side by side

soeren: I like both but relational algebra would be easier to read

<hhalpin> These are the issues that need to be iterated.

ericP: I see quite some issues that might require a lot of hand waving (re bNodes, etc.)

<Souri> +1 to having both and then comparing side by side (taking into account EricP's comment about no-primary-key case handling)

juansequeda: need to think about the limitations mentioned by ericP
... would be good if ericP sends out a list
... of cases where he sees issues

MacTed: not sure it is worth determining whether it is readable

<ericP> no pk: http://www.w3.org/2001/sw/rdb2rdf/directGraph/#no-pk

<scribe> ACTION: Eric to list issues re Datalog approach re PK, FK, etc [recorded in http://www.w3.org/2010/09/21-rdb2rdf-minutes.html#action01]

<trackbot> Created ACTION-72 - List issues re Datalog approach re PK, FK, etc [on Eric Prud'hommeaux - due 2010-09-28].

<hhalpin> What would be the identifier then?

juansequeda: talked with Marcello; he was against using a bNode for this (see also RDF next steps workshop)

<MacTed> default primary key = ROWID, which is RDB implementation dependent

<MacTed> when there's no such, concatenation of all fields is sometimes (often) used as a fallback

ericP: I can give you examples where this is needed

<hhalpin> good point MacTed

<MacTed> if there's no unique on that concat, then problems arise ... but there are problems already
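(Editorial sketch, not part of the call: the fallback MacTed describes can be illustrated in Python. The function name `row_key`, the `|` separator, and the sample rows are assumptions made for illustration; no implementation was discussed on the call. The sketch uses the declared primary key when there is one and otherwise concatenates all fields, which produces identical keys for duplicate rows.)

```python
from typing import Mapping, Optional, Sequence


def row_key(row: Mapping[str, object],
            pk_cols: Optional[Sequence[str]] = None) -> str:
    """Build a string key for a row (hypothetical helper).

    Uses the primary-key columns when they are declared; otherwise falls
    back to concatenating every column, in sorted column order.
    """
    cols = list(pk_cols) if pk_cols else sorted(row)
    # The "|" separator is an assumption; real data could contain it.
    return "|".join(f"{c}={row[c]}" for c in cols)


# Two identical rows in a table without a primary key (made-up data).
rows = [
    {"fn": "Bob", "ln": "Smith", "amnt": 30},
    {"fn": "Bob", "ln": "Smith", "amnt": 30},
]

keys = [row_key(r) for r in rows]
# When the concatenation is not unique, distinct rows share one key.
collision = len(set(keys)) < len(keys)
print(keys, collision)
```

With a declared key, for example `row_key({"id": 7, "fn": "Bob"}, ["id"])`, only the key columns are used, so the collision does not arise.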

<Souri> we would need to generate a unique bNode label for each row (uniqueness is limited to the destination graph)

ericP: just to give a scope on the problem - assume the RDB is a data warehouse; in this multiset case you run into issues

Ashok: re MacTed's comment

<MacTed> "implementation dependent" :-)

Ashok: ROWID can change, hence not possible

Souri: basic problem is that we don't have a PK, yet we need to identify each row
... can be limited to destination graph
... just thinking aloud now - if we say the ROWID is the bNode ID
... so the row might have moved
... seems like it would work out for SPARQL query
... but not totally sure

juansequeda: if there is no PK, how do I get all the data?

Souri: can't get it from the RDB
... the bNode could cluster all the relevant data
... in subject position

<Zakim> ericP, you wanted to suggest how a blank node is interpreted in SQL

ericP: in SQL-land I can't identify a non-PK row
... same in RDF-land
... but in SPARQL/SQL update I can do it

juansequeda: a bNode identifies something, but in SQL we don't have the same

Souri: I disagree; the row boundary is there
... but we need a unique subject

juansequeda: I agree, yes - will check back with Marcello, as he has some reservation re this

<hhalpin> http://www.w3.org/2009/12/rdf-ws/papers/ws23

hhalpin: the semantics are a bit different though (PatH agrees with Marcello re this)
... so, whatever is the simplest solution is fine with me (sort of ignoring the original semantics)

<Souri> From a practitioner's point of view, the main difference between bNode and URI is that one has a local (graph) scope and the other has global scope

ericP: so, hhalpin, which semantics are we talking about, then?

hhalpin: we should use it in a practical way

<ericP> Debtors:

<ericP> Bob Smith $30

<ericP> Bob Smith $30

<ericP> Debtors: { [ :fn "Bob" ; :ln "Smith" ; :amnt 30 ] [ :fn "Bob" ; :ln "Smith" ; :amnt 30 ] }

juansequeda: agree. need to identify and discuss issues

<ericP> SELECT SUM(amnt) FROM Debtors WHERE fn = "Bob" AND ln = "Smith"

<ericP> SELECT (SUM(amnt) AS ?a) { [ :fn "Bob" ; :ln "Smith"] }
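(Editorial sketch, not part of the call: the Debtors example can be reproduced with Python's standard `sqlite3` module to show why duplicate rows matter. The helper names `triples` and `total`, the predicate names, and the two subject-naming schemes are assumptions for illustration. SQL keeps both rows, so the sum counts both; an RDF graph is a set of triples, so if both rows receive the same subject the duplicate triples collapse, while one blank node per row preserves the multiset.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No primary key, so identical rows are allowed.
conn.execute("CREATE TABLE Debtors (fn TEXT, ln TEXT, amnt INTEGER)")
conn.executemany(
    "INSERT INTO Debtors VALUES (?, ?, ?)",
    [("Bob", "Smith", 30), ("Bob", "Smith", 30)],
)

# SQL has multiset semantics: both rows contribute to the sum.
(sql_total,) = conn.execute(
    "SELECT SUM(amnt) FROM Debtors WHERE fn = 'Bob' AND ln = 'Smith'"
).fetchone()

rows = conn.execute("SELECT fn, ln, amnt FROM Debtors").fetchall()


def triples(rows, subject_for):
    """Map rows to a set of (subject, predicate, value) triples."""
    return {
        (subject_for(i, r), p, v)
        for i, r in enumerate(rows)
        for p, v in zip(("fn", "ln", "amnt"), r)
    }


def total(graph):
    """Sum the values of all 'amnt' triples in the graph."""
    return sum(v for (_, p, v) in graph if p == "amnt")


# Subject derived from the row's values: duplicate rows collapse.
shared = triples(rows, lambda i, r: "row:" + "|".join(map(str, r)))
# One blank node per row: duplicates stay distinct.
per_row = triples(rows, lambda i, r: f"_:b{i}")

print(sql_total, total(shared), total(per_row))
```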

<hhalpin> i.e. the existential variable interpretation is a bit silly, but the blank nodes are generally used for "grouping" data that doesn't otherwise have an identifier.

ericP: decide on issues/coverage on a per-use-case basis

<Souri> if we generate bNode, we need to ensure the uniqueness is maintained within each destination graph, if we use URIs uniqueness has to be global
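(Editorial sketch, not part of the call: one way to read the scoping difference Souri describes, in Python. The class `BNodeLabeler`, the function `row_uri`, and the base URI `http://example.org/db` are hypothetical. A blank-node label only has to be unique within its own destination graph, so two graphs can reuse the same labels; a URI has to be unique globally, so it carries a base that identifies the source.)

```python
import itertools


class BNodeLabeler:
    """Mint blank-node labels that are unique within one graph only."""

    def __init__(self) -> None:
        self._ids = itertools.count()

    def fresh(self) -> str:
        return f"_:r{next(self._ids)}"


def row_uri(base: str, table: str, n: int) -> str:
    """Build a globally scoped identifier (the URI layout is an assumption)."""
    return f"{base}/{table}/row/{n}"


# Each destination graph gets its own labeler.
graph_a, graph_b = BNodeLabeler(), BNodeLabeler()
labels_a = [graph_a.fresh() for _ in range(2)]
labels_b = [graph_b.fresh() for _ in range(2)]

# The same labels appear in both graphs without conflict, because
# blank-node labels are not visible outside their graph.
uri = row_uri("http://example.org/db", "Debtors", 0)
print(labels_a, labels_b, uri)
```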

<hhalpin> that's a good point Souri - i.e. why we need to consider the blank node semantically as basically a unique identifier...

<hhalpin> however, again, that brings up the issue with MacTed's concat common practice

Souri: from a practitioner's POV, URI is a bit more complex - all we need is to ensure the 'label' we produce is unique in the destination graph

(scribe missed the details of Souri's explanation)

ericP: I guess the most complex scenario is with FK
... FK have to reference candidate key
... could be a bNode
... so unclear how to deal with the materialisation

Souri: tough case indeed, need to think more about it

ericP: it's doable, just an extra step
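(Editorial sketch, not part of the call: a possible reading of the "extra step" for foreign keys, in Python. The tables, the predicate name `owedBy`, and the choice of `(fn, ln)` as a candidate key are all assumptions; the call did not specify the step. The idea is that when the referenced table has no primary key, each referenced row still gets a blank node, and an index from candidate-key values to those nodes lets a foreign key be resolved during materialisation.)

```python
# Referenced table without a primary key; (fn, ln) is assumed unique,
# i.e. it serves as a candidate key.
people = [("Bob", "Smith"), ("Sue", "Jones")]

# Referencing rows carry the candidate-key values as their foreign key.
debts = [("Bob", "Smith", 30), ("Sue", "Jones", 12)]

# Step 1: mint one blank node per referenced row, indexed by candidate key.
node_for_key = {key: f"_:p{i}" for i, key in enumerate(people)}

# Step 2 (the extra step): resolve each foreign key through the index,
# so the object of the triple is the referenced row's blank node.
triples = []
for i, (fn, ln, amnt) in enumerate(debts):
    subject = f"_:d{i}"
    triples.append((subject, "owedBy", node_for_key[(fn, ln)]))
    triples.append((subject, "amnt", amnt))

print(triples)
```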

Souri: I was thinking of having a join condition - if there is no corresponding PK, then I don't know how to generate it

juansequeda: need to distinguish default mapping and customisation

ericP: right, and we might write some UC out of the default mapping

Souri: we need to explicitly say what we can handle (the editors)

hhalpin: edge cases need to be addressed (test cases)
... need to highlight these cases (not only formally)
... agree with MacTed's point there

juansequeda: agree as well

<Souri> I agree with Harry about the edge cases

<hhalpin> not hide edge-cases in formal semantics

<hhalpin> even if they are there, we need to warn implementers about them

Souri: need to be explicit about the edge cases, yes

ericP: the scenario I just gave above is actually a combination of two more basic ones

<ericP> http://www.w3.org/2001/sw/rdb2rdf/directGraph/#no-pk

juansequeda: will send out paper about this and sync with ericP

(Souri explains another joint-condition example)

<ericP> "The columns in the referencing table must be the primary key or other candidate key in the referenced table." — http://en.wikipedia.org/wiki/Foreign_key

<ericP> ('cause I can't paste from Date)

Ashok: ok, thanks for all the input - we have our action items, will not be here next week

Michael: I'm around


trackbot, end telecon

Summary of Action Items

[NEW] ACTION: Eric to list issues re Datalog approach re PK, FK, etc [recorded in http://www.w3.org/2010/09/21-rdb2rdf-minutes.html#action01]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2010/09/21 17:02:22 $
