RDB2RDF Working Group Teleconference -- 30 Nov 2010

<trackbot> Date: 30 November 2010

<boris> I have a problem with the passcode

<boris> hi BTW

<boris> yes

<boris> I'll try it again

<mhausenblas> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Nov/0155.html

<MacTed> major deadline is hitting me here ... I may not make the call, and will be minimally attentive if I do.

<MacTed> my review of the Direct Mapping Editor's Draft is *not* complete, in no small part because I find it very hard to work with onscreen, and it fails to print usefully (many elements print off-the-page)

<boris> still with the passcode not valid

<boris> weird

<mhausenblas> scribenick: cygri

Admin

<boris> I'm calling again ... sorry

PROPOSAL: Accept the minutes of last meeting, see http://www.w3.org/2010/11/16-rdb2rdf-minutes.html

<Ashok> + PROPOSAL: Accept the minutes of last meeting, see http://www.w3.org/2010/11/23-rdb2rdf-minutes.html

<juansequeda> +1

<mhausenblas> +1

<ivan> 0 :-)

RESOLUTION: Accept the minutes of last meeting, see http://www.w3.org/2010/11/23-rdb2rdf-minutes.html

<mhausenblas> ACTION-79?

<trackbot> ACTION-79 -- Juan Sequeda to compile a list of speaking/panel opportunities for disseminating rdb2rdf work -- due 2010-11-30 -- OPEN

<trackbot> http://www.w3.org/2001/sw/rdb2rdf/track/actions/79

Review of R2RML Test Cases

<Ashok> 2. Review of R2RML Test Cases (Boris) ACTION-81 http://www.w3.org/2001/sw/rdb2rdf/wiki/R2RML_Test_Cases

ashok: we agreed to postpone first test case?

boris: yes

<boris> http://www.w3.org/2001/sw/rdb2rdf/wiki/R2RML_Test_Cases#R2RMLTC0002

boris: i'll start with test case R2RMLTC0003
... this is without primary key so URI is generated by concatenating values

ashok: are you speaking about default mapping?

boris: no
... syntax of the R2RML mapping is at end of test case
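A minimal sketch of the idea boris describes for R2RMLTC0003 (no primary key, so the subject URI is built by concatenating the row's values). The base IRI, table name, column names, and the `col=value;col=value` layout are hypothetical illustrations, not taken from the test case or from the R2RML draft.

```python
from urllib.parse import quote


def subject_iri(base, table, row):
    """Build a subject IRI by concatenating all column values of a row.

    Hypothetical layout: <base><table>/<col>=<val>;<col>=<val>
    Each piece is percent-encoded so spaces and reserved characters are safe.
    """
    parts = [
        f"{quote(col, safe='')}={quote(str(val), safe='')}"
        for col, val in row.items()  # dict keeps column order as inserted
    ]
    return f"{base}{quote(table, safe='')}/{';'.join(parts)}"


# Hypothetical row from a table that has no primary key
example = subject_iri(
    "http://example.com/",
    "Student",
    {"Name": "Venus Williams", "Sport": "Tennis"},
)
```

Two rows with identical values would get the same IRI under this scheme, which is one reason a key-less table is an interesting test input.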

ivan: i would consider the expected default mapping result as separate entry

ericP: the default mapping says, for any given schema+rows, here's the graph it entails

<Zakim> ericP, you wanted to argue with ivan

ericP: how to test that your impl works is up to you
... you can do that test given an rdf graph

ivan: it's a question of organizing the tests

<juansequeda> +1 to Ivan

ivan: for each test we have a small db, we have a graph it produces, and optionally we have an r2rml mapping and the graph it produces

<boris> I got disconnected

<boris> calling again, sorry

ashok: does the syntax look right?

<ericP> +1

<ericP> (to the notion that some tests will apply to either mapping)

<hhalpin> +1, yes otherwise the test-cases for direct graph are likely to become rather redundant

<hhalpin> +1 cygri's comment

<ericP> database [ directDB? ( r2rml r2rDB ) * ]

<boris> sorry, having problems again

cygri: there might be many test cases where there are multiple customized mappings for the same db

<Souri> In TC3, due to absence of primary key, the subject will be a bNode.

cygri: and in that case the direct graph would be same
... so it might be better to have one kind of test cases for direct, and another kind for r2rml

ericP: we would get lots of redundancy either way
... what would the reader find easiest
... databases are what we change the least
... proposed structure: have a db, then may or may not have direct mapping, then n pairs of r2rml and custom graph

<Souri> My above comment was in relation to direct mapping (not R2RML mapping).

<ivan> +1 to Ashok

ashok: why would we ever not have a direct mapping?

<Zakim> ericP, you wanted to note that many tests use the same input db

ericP: direct mapping might not be interesting for some dbs

<ivan> +1 to Eric, modulo the '?' mark :-)

+1 to Eric

<hhalpin> hmm

<hhalpin> mike not working

<hhalpin> but basically, my point is that a direct graph mapping

<hhalpin> should be the first part of the document

<hhalpin> for each database, like eric said

<hhalpin> and then the second part of the doc should cover

<hhalpin> the features of r2rml.

<hhalpin> Obviously, otherwise implementers

<hhalpin> may be confused

+1 to that

<juansequeda> +1

<hhalpin> it seems kinda silly to have multiple test-cases

<hhalpin> that have the same input-output

<hhalpin> that's redundant, and makes the document harder

ericP: do we put all the db-direct pairs in the first half of the document?

<hhalpin> It's just an organizational issue - but yes, we could do it by databases as well.

<ivan> +1 to Ashok and Eric

+1 to hhalpin

<hhalpin> However, the key is not to have any redundant test-cases, which the current approach would be absolutely full of.

<hhalpin> So I'm happy with Eric's proposal of doing it one database at a time, as long as there are no redundant tests.
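The organization ericP typed above, `database [ directDB? ( r2rml r2rDB ) * ]`, could be sketched as a small data structure. This is only an illustration of the grouping discussed on the call; the class and field names are hypothetical and the string contents are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class R2RMLCase:
    mapping: str         # an R2RML mapping document (placeholder text here)
    expected_graph: str  # the custom graph that mapping should produce


@dataclass
class DatabaseTests:
    database: str                       # the small input db, stated once
    direct_graph: Optional[str] = None  # "directDB?": may be absent
    r2rml_cases: List[R2RMLCase] = field(default_factory=list)  # "( r2rml r2rDB ) *"


def count_tests(suite):
    """Count test entries: one per direct graph plus one per R2RML pair."""
    return sum(
        (1 if t.direct_graph is not None else 0) + len(t.r2rml_cases)
        for t in suite
    )


suite = [
    DatabaseTests(
        database="CREATE TABLE emp (...)",  # placeholder, not a real test db
        direct_graph="<direct graph>",
        r2rml_cases=[R2RMLCase("<mapping A>", "<graph A>"),
                     R2RMLCase("<mapping B>", "<graph B>")],
    ),
    DatabaseTests(
        database="CREATE TABLE dept (...)",  # db with no direct-mapping entry
        r2rml_cases=[R2RMLCase("<mapping C>", "<graph C>")],
    ),
]
```

Grouping by database means each input db and its direct graph appear once, however many customized mappings are tested against it, which addresses the redundancy concern raised above.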

Implementations (Michael)

<mhausenblas> ACTION-82?

<trackbot> ACTION-82 -- Michael Hausenblas to create Wiki page with implementation status (and plans thereof) -- due 2010-11-30 -- OPEN

<trackbot> http://www.w3.org/2001/sw/rdb2rdf/track/actions/82

<mhausenblas> close ACTION-82

<trackbot> ACTION-82 Create Wiki page with implementation status (and plans thereof) closed

<ericP> I note that we still have a dissenting opinion from cygri

<ericP> do we want to resolve that?

<juansequeda> I'm adding myself there

mhausenblas: made wiki page

<ericP> (as in give him time to persuade us)

mhausenblas: http://www.w3.org/2001/sw/rdb2rdf/wiki/Implementations
... now lists D2RQ, FeDeRate, ODEMapster, Revelytix

ashok: Souri, will you add an oracle line?

Souri: we can't officially promise, so we'll implement it first and add it then

<juansequeda> now we have 5 lines :)

mhausenblas: it would be extremely beneficial if we had *something* from oracle

<ivan> +1 to Michael

mhausenblas: "we intend to ..."

Souri: will see what we can do

R2RML status

Souri: we are waiting for review comments
... interesting comment from Percy

<ivan> editors' draft of r2rml

Souri: about blank nodes

<Seema> comment from Percy on the r2rml spec

Souri: no one can conflict with a blank node in the same graph

<privera> I'm a Master Student at Pontifical Catholic University of Rio de Janeiro (PUC-Rio)

Souri: if we have stuff in different graphs, then we can't connect the triples because blank nodes in different graphs are always distinct
... we need to point that out in the doc
... my plan for solution: cautionary note

<hhalpin> +1 strong warning on blank nodes, or RDF supports Marcelo's suggested semantics for blank-nodes in RDF Next to avoid pathological blank node problems.

cygri: +1 to that, put a strong warning but if ppl want to use bnodes anyways why stop them

<privera> +1
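The blank-node caveat Souri describes can be illustrated with a small sketch. It models triples as plain tuples and treats blank-node labels as scoped to their graph, so the same label in two graphs denotes two different nodes. The graph names, predicates, and renaming scheme are hypothetical and not part of R2RML.

```python
def merge_graphs(graphs):
    """Merge named graphs, keeping blank nodes from different graphs apart.

    Blank-node labels ("_:x") are only meaningful inside one graph, so each
    label is renamed with its graph name before the triples are combined.
    """
    merged = set()
    for name, triples in graphs.items():
        def scope(term, name=name):
            # Rename blank nodes; leave IRIs and literals untouched.
            return f"_:{name}_{term[2:]}" if term.startswith("_:") else term

        for s, p, o in triples:
            merged.add((scope(s), p, scope(o)))
    return merged


# Two graphs that both happen to use the label _:b1 for an employee row
graphs = {
    "g1": {("_:b1", "ex:name", "Alice")},
    "g2": {("_:b1", "ex:dept", "R&D")},
}
merged = merge_graphs(graphs)
subjects = {s for s, _, _ in merged}
```

After merging there are two distinct subjects, so a query asking for the name and department of the same node finds no match. This is the situation a cautionary note in the document would warn about.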

ashok: you start by talking about how r2rml can be used

<boris> thanks to ericP I'm on the phone again

ashok: there are two different uses. you could materialize the graph. or you make it virtual and query with sparql
... there are words, but they are not focused
... i'll propose wording

<hhalpin> +1 just outlining options

ashok: the document is very dense
... you want more words
... always add one or two sentences: this bit of sql here does the following

ivan: we need some text somewhere that explains the relationship between r2rml and direct mapping
... i made a presentation, it went well, but there's a need for explanation here
... titles are very different from one another, etc. we have to make a link between them

ashok: have a small paragraph in both of them

ivan: have a separate W3C note that explains it. can be a one-pager
... we did that in RIF and OWL (where we had six or seven docs)
... that might be one way to do it

ashok: my instinct is to add one paragraph in each doc

<hhalpin> +1 one paragraph in each document rather than yet another note

<hhalpin> minimize the number of documents

+1 to ivan and hhalpin

Souri: want to talk about adding schema triples

<juansequeda> +q

Souri: should we always generate these triples? or make it an option that you can skip?

cygri: typically the schema triples would be defined in existing vocabulary documents
... and those might disagree with the r2rml-generated schema triples
... would like to understand the use case better

juansequeda: we should generate schema triples for direct mapping but not for r2rml

<Souri> Suppose a query says {?x rdf:type rdfs:Class} and it goes against RDF "generated" from an empty EMP table ... what should be the response: should it be {?x = ".../EMP"} ?

hhalpin: -1 to having optional stuff or gray areas

<hhalpin> ditto with metadata about triples in general

<juansequeda> Marcelo and I have already defined the "direct schema": http://www.w3.org/2001/sw/rdb2rdf/wiki/Default_Mapping_to_RDFS/OWL

<hhalpin> Have an option that tells you to generate it, and then we flip a coin to decide on the default.

<hhalpin> :)

<ivan> +1 to cygri

<hhalpin> so maybe the default should be not to have the schema triples by default?

<ericP> +1 to cygri

<juansequeda> +1 to cygri

<boris> +1 to cygri

<juansequeda> hhalpin, I agree. schema triples should be optional

<privera> +1 to member:cygri

Souri: if we have empty table, what triples do we generate?
... do we generate nothing?

<hhalpin> there's the "table triple" issue - there was a diff between Eric and J&M's draft there...

Souri: just making sure that we are ok with generating no triples from an empty table

hhalpin: might be useful to have explicit option for generating schema triples
... there was something about table/metadata triples

ericP: cygri wanted to be able to identify which nodes come from which table

<juansequeda> table triple: <emp#1> <ex:table> <emp>

ericP: there are several ways of doing this
... triple comes from a table? node comes from a table?

<privera> but in that case we are starting to talk about provenance

ericP: i'm in favour of saying "this predicate comes from this table"

<juansequeda> +1 to cygri

cygri: i want to be able to ask "what resources come from this table?" in a very simple way

<Souri> Just to confirm: I am assuming we are okay that a query {?x rdf:type rdfs:Class}, against RDF triples generated from an empty table (with no rr:class use in R2RML mapping), would return no solutions.

ericP: i want to make provenance ppl happy
... and then the sparql query would have two triple patterns instead of one
... this would be a leaner graph
... instead of 10000 arcs saying "protein x comes from table y", you have 10 arcs saying "column y.a comes from table y"

Souri, i'm ok with that

<ericP> <tableX> us:producedPredicate <predicateY> .

<Souri> "what resources come from this table?" Richard, how (i.e., using what SPARQL graph pattern) would you like to express it?

Direct mapping

marcelo: nothing new this week

<ericP> getting nodes from a table: SELECT ?s WHERE { <table> us:producedPredicate ?p . ?s ?p ?o }
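The trade-off discussed above (one "table triple" per row versus one `us:producedPredicate` arc per column) can be sketched with plain tuples. The row data is made up; the `<ex:table>` and `us:producedPredicate` terms are the ones typed in the chat, and the function mirrors the two triple patterns of ericP's query rather than running real SPARQL.

```python
TABLE = "<emp>"

# Hypothetical rows of an emp table: subject -> {predicate: value}
rows = {
    "<emp#1>": {"<emp#name>": "Alice", "<emp#dept>": "R&D"},
    "<emp#2>": {"<emp#name>": "Bob", "<emp#dept>": "Sales"},
    "<emp#3>": {"<emp#name>": "Carol", "<emp#dept>": "R&D"},
}

# Ordinary data triples generated from the rows
data = {(s, p, o) for s, cols in rows.items() for p, o in cols.items()}

# Option 1: one table triple per row, e.g. <emp#1> <ex:table> <emp>
per_row = {(s, "<ex:table>", TABLE) for s in rows}

# Option 2: one arc per column, e.g. <emp> us:producedPredicate <emp#name>
per_pred = {(TABLE, "us:producedPredicate", p)
            for cols in rows.values() for p in cols}


def nodes_from_table(triples, table):
    """Mirror of: SELECT ?s WHERE { <table> us:producedPredicate ?p . ?s ?p ?o }"""
    preds = {o for s, p, o in triples
             if s == table and p == "us:producedPredicate"}
    return {s for s, p, o in triples if p in preds}


found = nodes_from_table(data | per_pred, TABLE)
```

The number of per-row arcs grows with the table, while the number of per-predicate arcs depends only on the column count. The price is the second triple pattern in the query.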

ashok: thanks all

juansequeda: how far into december do we keep meeting?

<ivan> I am on vacation on both dates

<ericP> i intend to be lost in the desert on the 21st

ashok: telecons on 21st and 28th?

<mhausenblas> Michael: +1 for 21, but -1 for 28

<Seema> +1 for 21, no for 28

ashok: let's cancel both then

<privera> +1 ->28

<mhausenblas> [adjourned]

RESOLUTION: telecons for 21st and 28th are cancelled

<mhausenblas> trackbot, end telecon
