RDB2RDF Working Group Teleconference -- 08 Dec 2009

<trackbot> Date: 08 December 2009

<mhausenblas> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2009Dec/

<mhausenblas> batla are you scribing?

<hhalpin> If no one else can do it, I can scribe.

<soeren> I can not dial in today, but will observe proceedings here via IRC

<mhausenblas> scribenick: batla

Admin

batla will scribe

<Souri> do you have the url?

<jsequeda> it's this one, right? http://www.w3.org/2009/Talks/1215-SWObjects-egp/#(1)

<mhausenblas> RESOLVED: http://www.w3.org/2009/12/01-RDB2RDF-minutes.html accepted

Presentation of SWObjects (Eric Prud'hommeaux)

<mhausenblas> http://www.w3.org/2009/Talks/1215-SWObjects-egp

<hhalpin> ericP: this is work on SWObjects with HCLS IG

<hhalpin> ... main problem is sociopathic names in RDF

<hhalpin> ... so we have two graphs

<hhalpin> ... a stem graph

<hhalpin> ... and an interface graph that is actually queried.

<hhalpin> ... 3 approaches

<hhalpin> ... 1) We use DDL to create a triple view...

<MacTed> (slide 4 typo -- s/preciate/predicate/)

<hhalpin> 2) RDF Schema

<hhalpin> 3) Normative RDF Representation

<hhalpin> So what we do is use a stem graph representation and then use SPARQL CONSTRUCTs to create the representation data.

<hhalpin> Table and columns to types of predicates

<hhalpin> SQL to create interface views vs. SPARQL to create interface views.

<hhalpin> Goal: What we want is clear algebra for the production of a stem graph, and a clear algebra for a SPARQL to SQL translation, and a clear algebra from the SQL results to RDF nodes.

<hhalpin> ericP: Slide 6: SPARQL to SQL results

<hhalpin> ... a set of algebra functions

<hhalpin> ... a SQL algebra subset

<hhalpin> ... here is the subset of SQL that we can use for SPARQL.

<hhalpin> ... there are some issues, i.e. in particular when we are going to different data sources, how to express that in SQL?

<hhalpin> ... ASK and CONSTRUCT are a bit tricky, is ASK exists? What is the name of it, what is its relationship?

<hhalpin> ... we could easily do it, is it useful?

<hhalpin> ... Slide 7: Here is an algebra for the stem mapping

<hhalpin> ... the stem mapping is called that due to the fact that you have to provide a stem URI

<hhalpin> ... we need to create a stem graph that meets the same use cases as relational data

<hhalpin> ... if you view the log

<hhalpin> ... what queries could be easily transformed?
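The stem mapping on slide 7 can be pictured with a small sketch. It is only an illustration, not the SWObjects algorithm: the stem URI `http://example.org/db/`, the URI layout for rows and columns, and the `Person` table are all assumptions made for the example.

```python
import sqlite3

STEM = "http://example.org/db/"  # hypothetical stem URI supplied by the user


def stem_triples(conn, table, key):
    """Yield one (subject, predicate, object) triple per non-NULL cell of `table`."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    cols = [d[0] for d in cur.description]
    for row in cur:
        rec = dict(zip(cols, row))
        # Assumed URI scheme: the row is named after its table and key value.
        subject = f"{STEM}{table}/{key}.{rec[key]}#record"
        for col, value in rec.items():
            if value is None:
                continue  # a NULL cell simply produces no triple
            yield (subject, f"{STEM}{table}#{col}", value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT)")
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                 [(1, "Alice", "Smith"), (2, "Bob", None)])

triples = list(stem_triples(conn, "Person", "id"))
for t in triples:
    print(t)
```

Every predicate and subject in the output hangs off the one stem URI, which is why a single stem is all that has to be provided.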

<hhalpin> DanMiranker: Would like to investigate these mappings in detail

<hhalpin> Orri: If you stick to SPARQL 1.0 you can't transform everything, but with 1.1 features like aggregates, it should be possible to map all of SQL.

<hhalpin> ericP: Let's try to do 1.0 first, and then we can do the later versions later.

<hhalpin> ericP: Slide 8 how do we take this relational data and map it coherently

<hhalpin> ... can we get the naming right

<hhalpin> ... re linked data principles

<hhalpin> ... the same answers received by SPARQL should match SQL

<hhalpin> ... this algebra is very typical

<hhalpin> ... for any query language, projections and graph patterns

<hhalpin> ... union, disjunction, conjunctions, and optionals

<hhalpin> ... unlimited graph patterns

<hhalpin> ... majority of work is the same as queries over a triple view of SQL

<hhalpin> ... basic premise: table algebra is a set of constraints coupled with a subject, that identifies a table alias.

<hhalpin> ... I have done this in Perl a while ago

<hhalpin> ericP: Now in SQL

<hhalpin> ... slide 9

<hhalpin> ericP: the way you handle the graph patterns is pretty consistent

<hhalpin> ... what is in algebra?

<hhalpin> ... bgp mapping

<hhalpin> ... mapping between sparql and sql

<hhalpin> ... big difference is whether or not for triple stores we add a new table for new graph patterns

<hhalpin> ... null guards are completely different in SPARQL; will get to that later

<hhalpin> ... in my implementation, I disallowed variable predicates

<hhalpin> ... I think it's doable

<hhalpin> ... but if you disallow wildcard predicates you get something where you can compile the SPARQL in about the same time as the SQL, with identical time for the query.

<hhalpin> ... runtime
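A rough sketch of the basic graph pattern mapping described here: each subject variable becomes a table alias, and shared variables become join constraints. This is not ericP's implementation. The `Table.column` predicate spelling, the `id` primary-key convention, the function name `bgp_to_sql`, and the example tables are assumptions, subjects are assumed to be variables, and null guards are left out.

```python
import sqlite3


def bgp_to_sql(patterns, select):
    """Translate triple patterns (s, 'Table.column', o) into one SQL query."""
    aliases = {}    # subject variable -> (alias, table)
    bindings = {}   # variable -> SQL column expression that first bound it
    constraints, params = [], []

    def bind(var, expr):
        if var not in bindings:
            bindings[var] = expr
        elif bindings[var] != expr:
            # A variable seen twice turns into an equality (join) constraint.
            constraints.append(f"{bindings[var]} = {expr}")

    for s, p, o in patterns:
        if p.startswith("?"):
            raise ValueError("variable predicates are disallowed in this sketch")
        table, column = p.split(".")
        if s not in aliases:
            aliases[s] = (f"t{len(aliases)}", table)
            bind(s, f"{aliases[s][0]}.id")  # assumed primary key column
        alias, bound_table = aliases[s]
        if bound_table != table:
            raise ValueError(f"{s} is used with two different tables")
        expr = f"{alias}.{column}"
        if o.startswith("?"):
            bind(o, expr)
        else:
            constraints.append(f"{expr} = ?")
            params.append(o)

    from_clause = ", ".join(f"{t} AS {a}" for a, t in aliases.values())
    columns = ", ".join(f"{bindings[v]} AS {v[1:]}" for v in select)
    sql = f"SELECT {columns} FROM {from_clause}"
    if constraints:
        sql += " WHERE " + " AND ".join(constraints)
    return sql, params


sql, params = bgp_to_sql(
    [("?emp", "Employee.name", "?name"),
     ("?emp", "Employee.dept", "?d"),
     ("?d", "Department.title", "Sales")],
    select=["?name"],
)

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (id INTEGER PRIMARY KEY, name TEXT, dept INTEGER);
CREATE TABLE Department (id INTEGER PRIMARY KEY, title TEXT);
INSERT INTO Employee VALUES (1, 'Alice', 10), (2, 'Bob', 20);
INSERT INTO Department VALUES (10, 'Sales'), (20, 'Research');
""")
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Because every predicate is a constant, the table and column are known at compile time, so the generated query is an ordinary join with no scan over a generic triple table.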

<hhalpin> Orri: Allowing variables in SPARQL, we allow them in Virtuoso

<hhalpin> ... but we extend SQL in order to avoid complicated unions

<hhalpin> ... however, what ericP said is true, that without that we can get more or less same overhead

<hhalpin> ... we can compile complex unions

<hhalpin> ... but if your system knows it's looking at Virtuoso

<hhalpin> ... then if we can define the mapping in terms of algebra we can know we have done it correctly.

<hhalpin> ... conjunction is going to be the same

<hhalpin> ... but nulls are different

<hhalpin> ... a null in SPARQL is a missing triple

<hhalpin> ... in SQL, it's a special value that's not equal to itself.

<hhalpin> ... curlies inside of query.

<hhalpin> ... I wrote one down in terms of graph constraints

<hhalpin> ... since I want to eventually hint at what we could do with query constraints.

<hhalpin> Orri: You know that maps to unbound in SQL

<hhalpin> ericP: When I do a query for a first name and last name

<hhalpin> ... got a value for that

<hhalpin> ... sql you would not get a result

<hhalpin> ... sparql you would get an answer

<hhalpin> Orri: unbound is equal to anything, null is equal to nothing
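The contrast between SQL NULL and SPARQL unbound can be checked directly. The following is a small sketch using SQLite from Python. The `Person` table and the helper name `compatible` are assumptions for illustration; the compatibility test follows the usual SPARQL definition, under which two solutions agree if they match on every variable they share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT)")
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                 [(1, "Alice", "Smith"), (2, "Bob", None)])

# In SQL, NULL = NULL evaluates to NULL (unknown), never to true.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# So a self-join on a nullable column silently drops Bob.
sql_join = conn.execute(
    "SELECT a.fname FROM Person a JOIN Person b ON a.lname = b.lname ORDER BY a.id"
).fetchall()


def compatible(m1, m2):
    """SPARQL-style test: solutions agree on every variable they both bind."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


# An unbound variable is just absent, so it cannot conflict with any value.
unbound_joins = compatible({"fname": "Bob"}, {"lname": "Jones"})
bound_conflict = compatible({"lname": "Smith"}, {"lname": "Jones"})

print(null_eq_null, sql_join, unbound_joins, bound_conflict)
```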

<hhalpin> ericP: Slide 12

<hhalpin> ... we want to make sure with options that the entire optional graph is transparent

<hhalpin> ... the temptation is to do a left outer join on just one nested table

<hhalpin> ... but really you want to get all the tables joined.

<hhalpin> ... otherwise you get cardinality problems (see my paper)

<hhalpin> ... the other problem is leading optionals in SPARQL

<hhalpin> ... so we have to do a bit of trickery

<hhalpin> ... i.e. a one column table

<hhalpin> ... just make sure upstream processing throws that column away
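Both points about optionals can be sketched in SQLite. The tables, data, and the `_seed` column name are assumptions for illustration, and the example shows only one way a naive translation can go wrong; it is not taken from the slides or the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Address (person INTEGER, city TEXT);
CREATE TABLE City (name TEXT, country TEXT);
INSERT INTO Person VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO Address VALUES (1, 'Paris'), (2, 'Atlantis');
INSERT INTO City VALUES ('Paris', 'France');
""")

# Tempting but wrong: one LEFT OUTER JOIN per table lets the optional group
# match halfway (Bob gets a city but no country).
chained = conn.execute("""
    SELECT p.name, a.city, c.country
    FROM Person p
    LEFT OUTER JOIN Address a ON a.person = p.id
    LEFT OUTER JOIN City c ON c.name = a.city
    ORDER BY p.id
""").fetchall()

# Join all tables of the optional group first, then left-join the group as a unit.
grouped = conn.execute("""
    SELECT p.name, opt.city, opt.country
    FROM Person p
    LEFT OUTER JOIN (
        SELECT a.person AS person, a.city AS city, c.country AS country
        FROM Address a JOIN City c ON c.name = a.city
    ) AS opt ON opt.person = p.id
    ORDER BY p.id
""").fetchall()

# Leading optional: a one-column, one-row seed table gives the outer join
# something to attach to; the seed column is never selected.
leading = conn.execute("""
    SELECT p.id
    FROM (SELECT 1 AS _seed) AS seed
    LEFT OUTER JOIN Person AS p ON p.name = 'Zed'
""").fetchall()

print(chained)
print(grouped)
print(leading)
```

With the grouped form Bob's optional part is entirely unbound, and the leading optional still yields a single solution even though nobody is named Zed.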

<hhalpin> Slide 13: Things that are unbound in SPARQL

<hhalpin> ericP: again, we have to do a little trick

<hhalpin> ... a SQL query executes the same as a SPARQL query, a potentially null attribute

<hhalpin> ... have to make it explicit that it is not null

<hhalpin> ... it's just tedious but doable
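A minimal sketch of the explicit not-null guard, assuming a `Person` table with a nullable `lname` column; the helper name `guarded_select` is hypothetical. A required SPARQL pattern over `fname` and `lname` has no solution for a row whose `lname` is missing, so the SQL has to say so explicitly.

```python
import sqlite3


def guarded_select(table, columns):
    """Build a SELECT that only returns rows where every listed column is bound."""
    cols = ", ".join(columns)
    guards = " AND ".join(f"{c} IS NOT NULL" for c in columns)
    return f"SELECT {cols} FROM {table} WHERE {guards}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Person (id INTEGER PRIMARY KEY, fname TEXT, lname TEXT)")
conn.executemany("INSERT INTO Person VALUES (?, ?, ?)",
                 [(1, "Alice", "Smith"), (2, "Bob", None)])

# Without guards, SQL returns Bob with a NULL last name.
unguarded = conn.execute("SELECT fname, lname FROM Person ORDER BY id").fetchall()

# With guards, the answer matches what a required graph pattern would give.
guarded = conn.execute(
    guarded_select("Person", ["fname", "lname"]) + " ORDER BY id"
).fetchall()

print(unguarded)
print(guarded)
```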

<hhalpin> ... Slide 14

<hhalpin> ... all the algebraic operators in SPARQL

<hhalpin> ... like string URI all of these can be handled by schema if you have RDF schema

<hhalpin> ... data types we can just handle ISO ones

<hhalpin> ... when I asked for somebody's name, we don't use xsd:strings

<hhalpin> ... which way is the world better: lang tags or datatypes from XSD?

<hhalpin> ... now that we can query this kinda sociopathic representation of a RDB store

<hhalpin> ... how can we allow access using commonly known URIs like foaf:name

<hhalpin> ... nice thing is interface graph can entirely change shape

<hhalpin> ... can do anything, extra joins etc.

<hhalpin> ... as long as we can map it to stem graph via SPARQL construct

<hhalpin> ... magic stems

<hhalpin> ... creates virtual tables

<hhalpin> ... that can mix with concrete tables and result in more virtual tables

<hhalpin> ... but we can eventually resolve it to queries over concrete tables

<hhalpin> ... same approach re concrete constructs

<hhalpin> ... push query over virtual interface table to a query that can then be calculated over sociopathic original representation
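The idea of pushing a query over the interface graph down to the stem graph can be sketched as a pattern rewrite. This is a simplified illustration, not the SWObjects rewriter: each rule stands for one CONSTRUCT with a single head triple, rule heads are assumed to have variables as subject and object, only one rule per predicate is supported, and `ex:livesIn`, the `Person#...` and `Address#...` predicates, and the function name `rewrite` are hypothetical.

```python
import itertools

# Each rule is (head, body), read as: CONSTRUCT { head } WHERE { body }.
RULES = [
    (("?s", "foaf:name", "?n"),
     [("?s", "Person#fname", "?n")]),
    (("?s", "ex:livesIn", "?c"),
     [("?s", "Person#addr", "?a"), ("?a", "Address#city", "?c")]),
]


def is_var(term):
    return term.startswith("?")


def rewrite(query, rules):
    """Replace interface-graph patterns by the stem-graph patterns that produce them."""
    fresh = itertools.count()
    out = []
    for pattern in query:
        for head, body in rules:
            if head[1] == pattern[1]:  # predicates are constants
                # Map the head's variables onto the terms used in the query.
                subst = {head[0]: pattern[0], head[2]: pattern[2]}
                n = next(fresh)
                for triple in body:
                    # Body-only variables get a fresh suffix to avoid clashes.
                    out.append(tuple(
                        subst.get(t, f"{t}_{n}") if is_var(t) else t
                        for t in triple
                    ))
                break
        else:
            out.append(pattern)  # no rule matched: already in stem vocabulary
    return out


stem_query = rewrite(
    [("?who", "foaf:name", "?name"), ("?who", "ex:livesIn", "Austin")],
    RULES,
)
for triple in stem_query:
    print(triple)
```

The rewritten patterns mention only stem predicates, so they can be handed to whatever translates stem-graph queries to SQL; the extra join through `Person#addr` shows how the interface graph can change shape.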

<hhalpin> Souri: We have query rewriting systems in Oracle, very important technique

<hhalpin> ericP: Similar to goal flattening in rules

<hhalpin> DanielMiranker: No use of magic sets

<hhalpin> DanielMiranker: in literature yet

<hhalpin> ... we use this in Ultrawrap

<hhalpin> Souri: We support this, but just differently

<Souri> That was not me

<hhalpin> Souri: What is the power of that approach as regards mapping SQL

<hhalpin> ... for example, with analytical functions

<hhalpin> ... how could that be done with SPARQL, could this part of SQL be expressed?

<hhalpin> ericP: My goal was to stick to SPARQL 1.0 and eventually SPARQL 1.1

<hhalpin> Souri: Possible to transmit that as regards rewriting it over SQL, we can map that easy, the main issue is expressive power

<hhalpin> ... I am not interested in SQL tables as a RDF consumer

<hhalpin> ... so analytical functions would just get some overall aggregates that we could look at as RDF

<hhalpin> ... and we would just query that view looking at SQL view definitions

<hhalpin> Souri: We can aggregate relational views that are results of analytical functions

<hhalpin> ... the main difference between our approach and your approach

<hhalpin> ... is that your body is the SPARQL query

<hhalpin> ... again, my main question is power

<hhalpin> ... we want to look at view definition is completely in SQL; and then on top we map

<hhalpin> ... just to map the columns into predicates

<hhalpin> ... and defining some constraints

<hhalpin> ... the main difference is the expressive power of SQL vs. SPARQL

<hhalpin> ... for creating the views
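The alternative described here, where the view is defined entirely in SQL and a thin mapping turns its columns into predicates, might look roughly like the sketch below. It is not Oracle's mechanism: the `Sale` table, the `RegionTotal` view, the `ex:` predicate names, and `view_triples` are assumptions, and a plain aggregate stands in for the analytical functions mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sale (id INTEGER PRIMARY KEY, region TEXT, amount INTEGER);
INSERT INTO Sale VALUES (1, 'east', 10), (2, 'east', 30), (3, 'west', 5);
-- All the expressive work happens in SQL.
CREATE VIEW RegionTotal AS
    SELECT region, SUM(amount) AS total, COUNT(*) AS n
    FROM Sale GROUP BY region;
""")

# The mapping layer only names columns as predicates.
COLUMN_TO_PREDICATE = {"total": "ex:totalSales", "n": "ex:saleCount"}


def view_triples(conn, view, key, mapping):
    """Expose each row of `view` as triples, one per mapped non-NULL column."""
    cur = conn.execute(f"SELECT * FROM {view} ORDER BY {key}")
    cols = [d[0] for d in cur.description]
    out = []
    for row in cur:
        rec = dict(zip(cols, row))
        subject = f"ex:{view}/{rec[key]}"
        for col, pred in mapping.items():
            if rec[col] is not None:
                out.append((subject, pred, rec[col]))
    return out


region_triples = view_triples(conn, "RegionTotal", "region", COLUMN_TO_PREDICATE)
for t in region_triples:
    print(t)
```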

<hhalpin> Orri: Minus a few small differences, we can make them more or less equivalent

<hhalpin> Souri: Is it SPARQL we are talking about?

<hhalpin> ... or relational data?

<hhalpin> Ahmed: If you do the work in SQL that is easier, and if you do it outside you need to be careful

<hhalpin> ericP: But this approach should work over anything you can express in SQL

<hhalpin> ... we have to be careful with arbitrary number of extensions

<hhalpin> ... my approach hopes to be portable

<hhalpin> ... spreadsheets, nlp, etc.

<hhalpin> Souri: Read only approach, we can only create views on some kinds of tables,

<hhalpin> ... when we are talking about spreadsheets and etc.

<hhalpin> ... we are not as deep as part of relational data.

<hhalpin> ... the complexity of the domains are different, so I was thinking RDB2RDF should take into account complexity of relational data.

<hhalpin> ericP: performance

<hhalpin> ... LLR(1) with SPARQL and SQL

<hhalpin> ... slide 17

<hhalpin> integrity constraints

<hhalpin> ... we can push them into SQL in two ways

<hhalpin> ... every column that was mentioned in antecedent graph

<hhalpin> ... slide 18

<Ahmed> I have a conference call @10:00am sharp, I will split @9:59 - thanks.

<hhalpin> ... we may want to put an antecedent construct on everything in our attributes

<hhalpin> ... (goes through example)

<hhalpin> slide 19

<hhalpin> ericP: Integrity constraint options

<hhalpin> ... anything with a constraint

<hhalpin> ... shortest path through all those variables, the magic set view for graph connections

<hhalpin> ... all the paths for everything that is connected

<hhalpin> ... I have used syntactic hints

<hhalpin> ... had no integrity constraints

<hhalpin> ... it didn't matter what it was coming from

<hhalpin> ... so we did virtual views of constructs and role flattening

<hhalpin> ... slide 21

<hhalpin> ... hcls requirements

<hhalpin> ... sparql 1.0 term rewriting, everything is done in interface graph...

<hhalpin> ... rdf construct

<hhalpin> ... turn uris of one form into another re regular expressions of named patterns

<hhalpin> ... come through membership, see wiki for more details
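Turning URIs of one form into another with named patterns can be sketched with a regular expression using named groups. Both URI shapes below, and the names `PATTERN`, `TEMPLATE`, and `rewrite_uri`, are made up for the example and are not the HCLS ones.

```python
import re

# Hypothetical source form: http://db.example.org/<table>/<key>.<value>#record
PATTERN = re.compile(
    r"^http://db\.example\.org/(?P<table>\w+)/(?P<key>\w+)\.(?P<value>\w+)#record$"
)
# Hypothetical target form, built from the named groups.
TEMPLATE = "http://people.example.org/{table}/{value}"


def rewrite_uri(uri):
    """Rewrite a matching URI into the target form; leave others untouched."""
    m = PATTERN.match(uri)
    return TEMPLATE.format(**m.groupdict()) if m else uri


print(rewrite_uri("http://db.example.org/Person/id.7#record"))
print(rewrite_uri("http://xmlns.com/foaf/0.1/name"))
```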

<hhalpin> ericP: that's it for me.

<hhalpin> DanielMiranker: We are interested in this algebra, since we wanted to compile DDL as well

<hhalpin> ... we did all of that in Datalog

<hhalpin> ... looking at Datalog to relational algebra work.

<hhalpin> ericP: Would that handle everything like leading optionals, i.e. in Datalog

<hhalpin> DanielMiranker: Yes, I think it would

<hhalpin> Meeting Adjourned
