RDB2RDF Working Group Teleconference -- 31 Aug 2010

<trackbot> Date: 31 August 2010

<mhausenblas> scribenick: boris

<mhausenblas> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2010Aug/0060.html

Admin

<mhausenblas> PROPOSAL: Accept the minutes of last meeting, see

<mhausenblas> http://www.w3.org/2010/08/24-rdb2rdf-minutes.html

<juansequeda> +1

RESOLUTION: last minutes approved

Revised proposal from Souri

souri: don't have comments regarding the lookup table
... richard is not here ... so ...
... need to find an actual use case for this lookup table
... let's discuss this next week when richard will be here

Michael: ok

Test Cases

michael: we need to create test cases ....
... boris volunteered to take care of this subject

<mhausenblas> http://www.w3.org/2001/sw/rdb2rdf/wiki/R2RML_Test_Cases

michael: brief introduction for the test cases

<mhausenblas> http://www.w3.org/2001/sw/rdb2rdf/test-cases/

michael: currently the docs are in pdf
... this is what we have come up with for the test cases
... the input sql, the mapping file and the expected output
... the sparql query will check if the resulting rdf is ok
... questions?

someone: .... all the triples in the sparql query?
... no additional triples

soeren: it is not testing for conformance

<juansequeda> +q to ask what are we testing. Is it correctness and/or performance ?

soeren: test the triples, not the superset

michael: no test case for performance
... two fold
... check that the semantics we describe in the R2RML doc is what we mean
... during the phase of the development of the R2RML doc

<hhalpin> correctness testing
... next phase will be to test implementations

juan: we will manually provide the input and the output

<Souri> SPARQL working group has good and tested scheme for test specifications

michael: yes...
... it is useful to have a sparql query rather than the rdf triples, ... but we are open to discuss

<juansequeda> souri, can you send out the link to the sparql group's test cases
... we can have additional triples, provenance for example ....
... that's why it is good to have the sparql queries

<Zakim> juansequeda, you wanted to ask what are we testing. Is it correctness and/or performance ?

<Zakim> ericP, you wanted to second soundness

<hhalpin> soundness would be a subset of correctness testing

eric: we have a bunch of implementations with more liberal equality testing ...
... we know at least that it is true, ... we couldn't tell whether there are some erroneous values
... we can have some "extended" mode
... for the implementors
... completeness ericP

michael: we are open to discuss more on this ...
... if you think this makes more sense, that is ok

soeren: it is very important, really important, to check that nothing additional is included
... maybe add further sparql queries that include the number of triples
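
The two checks discussed here (expected triples are all present, versus nothing additional is included) can be sketched in a few lines. This is only an illustrative sketch, not the group's test harness: triples are modelled as plain Python tuples, and the function names, the `ex:` identifiers and the sample data are all assumptions made for the example.

```python
# Hypothetical sketch: triples are plain (subject, predicate, object) tuples.
# None of these names come from the R2RML test suite.

def is_sound(expected, actual):
    """Every expected triple is present; additional triples are tolerated."""
    return set(expected) <= set(actual)


def is_exact(expected, actual):
    """Expected triples are present and nothing additional is included."""
    return set(expected) == set(actual)


expected = {
    ("ex:course/1", "ex:name", "Databases"),
}
# An engine that also emits an extra (e.g. provenance) triple.
actual = expected | {
    ("ex:course/1", "ex:generatedBy", "ex:someEngine"),
}

print(is_sound(expected, actual))    # True: the expected triple is there
print(is_exact(expected, actual))    # False: one additional triple
print(len(actual) == len(expected))  # False: the triple counts differ
```

Comparing triple counts, as in the last line, is one simple way to turn the first check into the second without listing every triple twice.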

michael: it will take some time to set up completely ...

michael: we are going to work on test cases, and once we have something there ....

juan: one example database?

michael: one example database per test case rather than one big database

juan: we can have a non-normalized database

michael: think of very simple test cases ... later on we can have more complex test cases
... in the beginning only simple test cases

juan: test cases for RDF, or SPARQL?

michael: yes, they are similar, ....
... questions ...? ...

What are we going to do about the Default Mapping?

eric: status?
... Richard and Eric wrote some use cases, originally for simple scenarios
... single tables with primary key

<mhausenblas> ACTION: Michael to send sample TC to the group [recorded in http://www.w3.org/2010/08/31-rdb2rdf-minutes.html#action01]

<trackbot> Created ACTION-71 - Send sample TC to the group [on Michael Hausenblas - due 2010-09-07].

eric: rewrote it around the notion that you have relations with or without a primary key, keys not always composite, and FKs not composite

<hhalpin> note that I'm happy to help set up test-cases for you boris.

eric: there is an example of that in the document

<ericP> http://www.w3.org/2001/sw/rdb2rdf/directGraph/

<hhalpin> i.e. in W3C CVS etc.

<hhalpin> I'm happy to take that as an action.

<ericP> http://www.w3.org/2001/sw/rdb2rdf/directGraph/#multi-key

eric: multi-key example

juan: multi-attribute key?

eric: is describing the table ...
... hardest cases in the relational model ... a bunch of set definitions in section 5 ....
... relation definitions 4.1 to 4.2 ...

juan: discussed this with marcelo, worked together on this .... from the DB perspective ... it's a kind of BNF rule ...
... this is not the way you define a relational model
... suggest to put this in datalog?
... not use this BNF style
... don't see anything coming out of it, this is just a bunch of rules

cygri: doesn't agree to use the definition from the DB ?

<juansequeda> http://www.w3.org/2001/sw/rdb2rdf/wiki/Semantics_of_R2RML

juan: the semantics of datalog ....
... ... every predicate is a relation ... right there
... well defined semantics
... eric just has BNF rules and type of injectors

cygri: understandability is more important than being formal

<juansequeda> http://www.w3.org/2001/sw/rdb2rdf/wiki/Default_Mapping

<soeren> +q

cygri: insists on checking how the "text books" introduce the relational model

juan: link to the wiki; what we have in datalog is much more understandable than the thing from eric

cygri: not talking about eric's proposal
... just check the text books

dan: check the books and the chapters on datalog

soeren: relational algebra introduces the semantics of relational db
... datalog will make things more complicated
... let's focus only on the relational algebra

dan: let's look at the databases ... datalog = relational algebra
... a lot of people will find it more comfortable ... encourage people to check relational books
... what is commonly done

juan: sparql is datalog and datalog = relational algebra

<Zakim> ericP, you wanted to say that soeren's proposal would work for the doc which translates queries

eric: there is no notion of join (for example)
... relational algebra .... (joins, etc) ... not used in ... RDF

juan: only selection and projection ....

eric: datalog is horn clauses minus functions ... two issues

<juansequeda> DATALOG: course(id, name) = SQL: select id, name from course
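
The correspondence juan typed above can be illustrated with a small sketch: every row returned by the SQL query corresponds to one ground fact of the `course/2` predicate. The table contents and the way facts are printed are assumptions for illustration only; the snippet uses Python's built-in `sqlite3` module and is not taken from the wiki proposal.

```python
import sqlite3

# In-memory database with a hypothetical "course" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO course VALUES (?, ?)",
    [(1, "Databases"), (2, "Logic")],
)

# SQL side: select id, name from course
rows = conn.execute("SELECT id, name FROM course ORDER BY id").fetchall()

# Datalog side: each row is one ground fact course(id, name).
facts = [f"course({cid}, {name!r})." for cid, name in rows]
for fact in facts:
    print(fact)
# course(1, 'Databases').
# course(2, 'Logic').
```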

eric: 1. I will need prolog to do some operations

dan: commonly accepted to add this in datalog

juan: the semantics of this mapping language is based on datalog ...

eric: is it prolog?

dan: yes, datalog is a subset of prolog

<Zakim> mhausenblas, you wanted to ask who the intended audience of this document is

michael: who will the intended audience be? who is supposed to read this? ... and then we can follow some direction

eric: d2r folks

michael: people who write the engine

eric: those who write the engine implementations

juan: not only d2r, oracle, ibm

michael: people who write the engine ... is the primary audience

eric: a machine-executable transformation written in prolog? ... another issue is the set semantics .... in some cases you use cardinality
... adding aggregates has a cost ..... how do we preserve cardinality, set semantics

during the sql standardization process ... there wouldn't be aggregates? .... cardinality was expensive?

souri: practical to allow duplicates ...
... suppose someone defines a mapping ... the table does not have a primary key ... two rows get the same URI ... unless we use distinct we will have duplicates
... trying to suppress duplicates ... not easy
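
A small sketch of the situation souri describes: a table without a primary key holds two identical rows, a mapping derives the row URI from the column values, and both rows end up with the same URI unless the query uses DISTINCT. The table, its columns and the URI template are hypothetical; only the built-in `sqlite3` module is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with no primary key, holding two identical rows.
conn.execute("CREATE TABLE purchase (customer TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO purchase VALUES (?, ?)",
    [("bob", "book"), ("bob", "book")],
)


def row_uri(customer, item):
    # Hypothetical URI template built from the column values.
    return f"http://example.org/purchase/{customer}/{item}"


all_rows = conn.execute("SELECT customer, item FROM purchase").fetchall()
distinct_rows = conn.execute(
    "SELECT DISTINCT customer, item FROM purchase"
).fetchall()

uris = [row_uri(c, i) for c, i in all_rows]
print(len(uris))           # 2: one URI generated per row
print(len(set(uris)))      # 1: both rows map to the same URI
print(len(distinct_rows))  # 1: DISTINCT removes the duplicate in SQL
```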

<mhausenblas> s/soeren: supose/souri: suppose

eric: if there is no PK, expressing attributes in a tuple, ... blank node

souri: define a mapping, a URI corresponds to a row ... every instance ....

eric: project the uniqueness ...
... not necessary ...
... example of tweets .... no identifiers for the tweets? ... example in section 2.2
... non-existing PK
... preserves cardinality ...
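
One way to read eric's point is that giving each row without a primary key its own blank node keeps identical rows apart, so cardinality survives even in a set of triples. The sketch below only illustrates that idea; the blank node labels, the `ex:` predicates and the sample rows are assumptions, and it is not the algorithm from the direct mapping draft.

```python
from itertools import count

# Two identical rows from a hypothetical table without a primary key.
rows = [("bob", "hello"), ("bob", "hello")]

_ids = count(1)


def fresh_bnode():
    # Hypothetical blank node labels: _:row1, _:row2, ...
    return f"_:row{next(_ids)}"


triples = set()
for author, body in rows:
    node = fresh_bnode()  # one new blank node per row
    triples.add((node, "ex:author", author))
    triples.add((node, "ex:body", body))

subjects = {s for s, _, _ in triples}
print(len(subjects))  # 2: the identical rows stay distinct
print(len(triples))   # 4: two triples per row, none collapsed
```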

souri: basically again, best practice from his point of view ....
... implementation of bad cases ....

michael: we have one concrete proposal from eric ... and default mapping by Juan is an alternative proposal

juan: again everything in datalog

eric: is prolog

juan: if we present these things to ibm, it is not going to be easy

soeren: in the oracle docs, is there a lot of datalog?
... first relational algebra ... his impression ...
... or am I wrong?

souri: more sql, more practical people
... there are some things in relational algebra
... practitioners use tables, cols, not theory

<mhausenblas> http://www.w3.org/2001/sw/rdb2rdf/directGraph/

<hhalpin> +1 souri, and let's see what subset of it we can grab in Datalog/RIF.

michael: would it be good if you took eric's work and rewrote it into datalog?

juan: Marcelo and I did it already

souri: it is like a rule ... that is what we expect

juan: is a rule ... default mapping should have semantics
... BNF and the semantics are in datalog ... that is his concern

michael: can we expect something next week?

juan: ok

<Zakim> ericP, you wanted to ask if we can talk about prolog instead of datalog

ericP: prolog?

juan: datalog + built-in functions, not prolog

<mhausenblas> how about prolog--

<mhausenblas> :)

ericP: is in between ... ok

juan: the semantics of the language are equivalent to datalog

cygri: there are different ways to define the semantics of the language
... of course the semantics have to be written

<hhalpin> cygri - basically *any* mathematical structure can serve as semantics

<hhalpin> and we do not *have* to write them in my opinion.

<hhalpin> But often it does help us think through what's going on in the spec.

cygri: db community is not datalog ...
... any other arguments?

<hhalpin> Generally, I think we should do the semantics in a way that ideally would help a programmer understand what's going on.

dan: important who the audience is for the document?
... implementors of the engine

<hhalpin> Datalog, being so close to FOL, is something quite a few people understand.

<hhalpin> but we need to be clear the semantics are also subservient to the users.
... different types of database implementors

<hhalpin> i.e. ashoks' point was correct

<hhalpin> that we don't want to restrict users

juan: the reason for datalog is that SPARQL=SQL via datalog

<hhalpin> or make them learn a whole new language..

juan: semantics of the language = datalog ... here are the semantics

<hhalpin> my suggestion would be to go forward with merging eric's work and writing down souri's using a syntax

<hhalpin> and *then* work on the semantics on the side.

juan: we should learn from the issues about the semantics

cygri: we can write the semantics in different languages

<hhalpin> and the semantics might help us catch a bug.

<hhalpin> they tend to...

dan: including english, NL ... for example

michael: juan means formal semantics = semantics

<hhalpin> however, we do not want semantics to be a rathole, we need to get a FPWD out asap.

<ericP> 1+

soeren: formal semantics are very important .... translating SPARQL algebra into relational algebra ... it will be clear ...

<soeren> http://www.informatik.uni-leipzig.de/~auer/rdb2rdf/semantics.pdf

juan: he has to look at that ... sparql-to-sql translation

<hhalpin> let's get FPWD with syntax based on a combo of Souri's proposal and ericP's proposal out there.

<juansequeda> Semantics preserving SPARQL-to-SQL translation -> http://www.cs.wayne.edu/~shiyong/papers/dke09_artem.pdf

ericP: continue on the direct mapping ....
... RDB here is the RDF graph ... SPARQL query here is the underlying relational algebra
... first option

michael: next week about the syntax
... and the semantics, and juan let us know if he is ready or not

<mhausenblas> thanks for joining in cygri

<mhausenblas> best wishes to all at I-Semantics, cygri

<ericP> ditto

<mhausenblas> [adjourned]
