W3C

RDF Access to Relational Databases

25-26 Oct 2007

Attendees

Present
Chair: EricP
Scribe: ericP, LeeF, JimM_
raw IRC log

Day1

<ericP> LeeF, shellac, i need to put some slides together for a recently vacated 16:00 slot

<ericP> i want to compare/contrast different mapping languages

<ericP> so, good buddy...

<ericP> do you have expertise to offer?

<shellac> I think andy may have a slide on squirrels, which is super-simple

<shellac> squirrel's

<shellac> erm, you need d2rq example, and (I think most complex) viruoso

<shellac> virtuoso

<shellac> virtuoso's is like rdb views

<LeeF> ericP- why not a panel on mapping?

<LeeF> everyone's here

<LeeF> could pick a bunch of categories

<LeeF> and discuss w/ the panel

<ericP> panel, pretty much the info i wanted to get across

<LeeF> hmm?

<ericP> i wanted to say "given this relational schema, here's how you can set up your mapper to query it with that query"

<LeeF> i worry that that level of detail might not come across well in a verbal setting?

<ericP> so basically, each tech can present, but we have parallels between the presentations

<ericP> let's see if i can convince you or you me during the coffee break

<LeeF> OK

<ericP> note, you are already at a disadvantage as we have no chairs

<LeeF> I AM A CHAIR

<shellac> ...step away from the capslock...

<ericP> COTS?

<shellac> COTS Commercial Off-The-Shelf?

<ericP> sounds good

<shellac> http://www.openacademy.mindef.gov.sg/openacademy/ITseminar/cots.htm

<LeeF> ericP, http://www.w3.org/2007/03/RdfRDB/talks/BoeingW3CPresentation.ppt is 403 Forbidden

<ericP> should be fixed now

<ericP> tx

<LeeF> thx

<cygri> ericP: you're planning a comparison of mapping approaches over lunch?

<ericP> cygri, i'd like to start on it

<ericP> i'd like to do more on it during a break-out tomorrow at 9.30

<cygri> ok ... just let me know if you need any input from me

<ericP> cygri, yes please. let's scratch on napkins during lunch

<ericP> (i, for instance, have a Dell D820 napkin running Debian Sid)

<ericP> could someone sitting next to chrisB or cygri tell them that i have no slides for them?

<UldisB> ericP: done. told chrisB.

<ericP> tx kindly

<ericP> he's up first thing after lunch

<ericP> cygri, do you have your slides (you're first presenter after lunch)

<cygri> ericP: chrisB does the presentation

<cygri> think he has sent you the slides per email a few minutes ago

<ericP> ahh i think he is known as "[Moderator Action]"

<ericP> (just that the slides got blocked from going to the list)

<ericP> putting them in talks/ space now...

<ericP> welcome Katsu

<Katsu> Hi, ericP. Thank you for chairing. I've been on this channel except during lunch.

<ericP> oh good. glad the chairing is working

<ericP> cygri, do you have a MySQL5 server?

<ericP> i can give you AlzGene in one of their DB formats

<cygri> yes, works for me

<ericP> or i can do a mysqldump

<ericP> cool. binary first...

<ericP> cygri, can you ping 10.130.168.139 ?

<ericP> (so you can pick it up off the web)

<ericP> ((my web))

<cygri> yes

<ericP> ok, hacking apache conf

<ericP> hmm, better if i zip it so you don't have to wget a zillion files...

<ericP> cygri, does http://10.130.168.139/AlzGene.tzf work for you?

<cygri> trying ...

<cygri> ericP, mysql doesn't like the files, seems to be an issue with capitalization of the table names

<ericP> fussy bastard

<ericP> would a mysqldump help it along?

<ericP> i guess that means there aren't many names to hit in the dump

<ericP> cygri, try http://10.130.168.139/AlzGene.sql

<ericP> ivan, timbl, anyone else, could you please add topics for the discussion at 16.00 ?

<deanallemang> Paper by Suzette Stoutenberg and her team at Mitre where she uses an Ontology Architecture to have one ontology per database, but then a mediating ontology to merge them together

<deanallemang> http://www.mitre.org/work/tech_papers/tech_papers_04/04_1174/

<AshokMalhotra> c/stores/stored/

<deanallemang> Ashok: I don't think so - I think everyone can agree that circumstances will prevent us from storing data in some particular way, so we need to be flexible. Of course, once I say "everyone", I am sure there will be someone who doesn't

<AshokMalhotra> Right! I just wanted this on the table.

<AndyS> How much of OWL are people using in this problem space?

<deanallemang> Andy: Have a look at Suzette's paper - she uses intersections/unions, subclass/equivalentclass/subproperty, inverse, and I think hasValue

<AndyS> Thx: is that for the mapping between ontologies or modeling the DBs?

<deanallemang> Mapping - modeling of the ontologies is done pretty much the way the last speaker said - domain/range and owl:Class

<AndyS_> So there is scope to execute efficiently. Cool.

<ericP> cygri, any luck with http://10.130.168.139/AlzGene.sql ?

<eneumann> has the topic of locating reverse references been addressed yet?

<ericP> i don't believe so. could you explain/motivate it some?

<cygri> ericP: yes, i'm setting up a D2R Server for the DB ... can you send me the schema diagram too so i can see all the joins?

<eneumann> It goes something like this: "I would like to know everything linked to or said about this RDB entity, especially if it is outside its own table"?

<ericP> cygri, http://ashby.csail.mit.edu/web/Outlook.jpg

<ericP> bouncing you a mail now...

<ericP> email addr?

<kidehen> ericP: s3.amazonaws.com/kingsley/Browser_based_Visual_SQL_to_RDF_Mapping_Tool.ppt re. RDF Visual Mapping presentation

<kidehen> ericP: as in http://s3.amazonaws.com/kingsley/Browser_based_Visual_SQL_to_RDF_Mapping_Tool.ppt

<ericP> kidehen, this replaces the current link to http://www.w3.org/2007/03/RdfRDB/papers/erling.html ?

<ericP> (link from http://www.w3.org/2007/03/RdfRDB/program )

<ericP> eneumann, i'm trying to understand when i'd want reverse references, and how i'd implement it

<ericP> that is, beyond what [[SELECT ?referrer WHERE { ?referrer references <somePage> }]]

<ericP> ... answers

<AndyS_> even that's hard when <somepage> can be in any DB

<eneumann> many orgs want to know all annotations and links made about things that sit inside RDBs, but these links (triples) are highly distributed. There are a couple of ways to implement this, but most borrow strategies from Google's 'barrels'

<ericP> yeah, hence the "how i'd implement it"

<AndyS_> and references is DB spanning
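
A minimal sketch of the simplest case of eneumann's reverse-reference question: a single database whose foreign keys are declared in the schema. It uses Python's sqlite3 module; the gene/protein/study tables and their rows are made up for illustration and are not the AlzGene schema. It does not address the harder case AndyS_ raises, where references span databases.

```python
import sqlite3
from typing import List, Tuple


def reverse_references(conn: sqlite3.Connection, table: str,
                       key: object) -> List[Tuple[str, str, tuple]]:
    """Return (referring_table, referring_column, row) for rows pointing at table/key."""
    hits = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for other in tables:
        # PRAGMA foreign_key_list columns: id, seq, table, from, to, ...
        for fk in conn.execute(f'PRAGMA foreign_key_list("{other}")'):
            target_table, from_column = fk[2], fk[3]
            if target_table != table:
                continue
            for row in conn.execute(
                    f'SELECT * FROM "{other}" WHERE "{from_column}" = ?', (key,)):
                hits.append((other, from_column, tuple(row)))
    return hits


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gene (id INTEGER PRIMARY KEY, alias TEXT);
CREATE TABLE protein (id INTEGER PRIMARY KEY,
                      gene_id INTEGER REFERENCES gene(id), alias TEXT);
CREATE TABLE study (id INTEGER PRIMARY KEY,
                    gene_id INTEGER REFERENCES gene(id), pubmed TEXT);
INSERT INTO gene VALUES (1, 'CTSD'), (2, 'APOE');
INSERT INTO protein VALUES (10, 1, 'protein-a');
INSERT INTO study VALUES (20, 1, 'pm-1'), (21, 2, 'pm-2');
""")
refs = reverse_references(conn, "gene", 1)
```

Here refs collects the protein and study rows that point at gene 1. Undeclared references (for example, keys embedded in text columns) and links held in other databases would need a separate index, which is where the distributed strategies eneumann mentions come in.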

<cygri> ericP: work in progress here: http://10.130.168.109/

<ericP> cygri, cooool

<ericP> cygri, i bounced you a sample query. i might have abbreviated it correctly here:

<ericP> [[

<ericP> PREFIX ag: <http://localhost:8890/test/schemas/AlzGene#>

<ericP> SELECT ?paName ?locus ?ethn ?pubmed ?resl FROM <.example> WHERE {

<ericP> ?gene ag:geneAlias "CTSD" ; ag:locus ?locus .

<ericP> ?prot ag:geneRecord ?gene ; ag:proteinAlias ?paName .

<ericP> ?ethd ag:eth2geneRecord ?gene ; ag:studyRecord ?stud ; ag:ethnicity ?ethn ; ag:result ?resl .

<ericP> ?stud ag:pubmed ?pubmed ; ag:studyName ?stname }

<ericP> ]]

<ericP> http://10.130.168.109/sparql?query=PREFIX%20ag%3A%20%3Chttp%3A%2F%2Flocalhost%3A8890%2Ftest%2Fschemas%2FAlzGene%23%3E%0ASELECT%20%3FpaName%20%3Flocus%20%3Fethn%20%3Fpubmed%20%3Fresl%20WHERE%20%7B%0A%20%20%3Fgene%20ag%3AgeneAlias%20%22CTSD%22%20%3B%20ag%3Alocus%20%3Flocus%20.%0A%20%20%3Fprot%20ag%3AgeneRecord%20%3Fgene%20%3B%20ag%3AproteinAlias%20%3FpaName%20.%0A%20%20%3Fethd%20ag%3Aeth2geneRecord%20%3Fgene%20%3B%20ag%3AstudyRecord%20%3Fstud%20%3B%20ag%3Aethnicity%20%3Fethn%20%3B%20ag%3Aresult%20%3Fresl%20.%0A%20%20%3Fstud%20ag%3Apubmed%20%3Fpubmed%20%3B%20ag%3AstudyName%20%3Fstname%20%7D
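
A small sketch of how a GET URL like the one above can be produced: the SPARQL query is percent-encoded and passed in the `query` parameter. The endpoint address is copied from the log and was only reachable on the workshop network; the helper name `sparql_get_url` is an assumption made for this example.

```python
from urllib.parse import quote, unquote

# Endpoint copied from the log; assumed unreachable outside the workshop network.
ENDPOINT = "http://10.130.168.109/sparql"

QUERY = """PREFIX ag: <http://localhost:8890/test/schemas/AlzGene#>
SELECT ?paName ?locus ?ethn ?pubmed ?resl WHERE {
  ?gene ag:geneAlias "CTSD" ; ag:locus ?locus .
  ?prot ag:geneRecord ?gene ; ag:proteinAlias ?paName .
  ?ethd ag:eth2geneRecord ?gene ; ag:studyRecord ?stud ; ag:ethnicity ?ethn ; ag:result ?resl .
  ?stud ag:pubmed ?pubmed ; ag:studyName ?stname }"""


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL carrying the query in the `query` parameter."""
    # safe="" percent-encodes every reserved character, including "/" and ":".
    return f"{endpoint}?query={quote(query, safe='')}"


url = sparql_get_url(ENDPOINT, QUERY)
# Decoding the parameter again recovers the original query text.
decoded = unquote(url.split("?query=", 1)[1])
```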

<AndyS> So that use case was very reasonable. No large vendors are targeting it - maybe it is because it is not a single problem space when you push into the detail.

<deanallemang> Topic Proposal: Propagating updates from RDF views back to underlying datasources

<AndyS> If you missed the URL: http://www.mitre.org/work/tech_papers/tech_papers_04/04_1174/

<AndyS> Expectations of RDF Schema / OWL for constraints

<ericP> tx AndyS

<AndyS> EricNeumann: Closed world : exploit to produce efficient solutions

<AndyS> Orri: Whole semantic web - very large - federated query requires source description and discovery - meta info for query plans

<AndyS> Kingsley: what if the big players expose linked data? Suggest that the time is coming when that will happen.

<AndyS> Ashok: will it change to RDF?

<AndyS> Michael: specialised stores arise to meet these needs

<AndyS> ... Existing players suggest using current databases. Engines become (have become) very complex.

<AndyS> Ivan: == Benchmarks

<AndyS> ... community work in some form?

<AndyS> ... XG tomorrow.

<AndyS> THALIA: http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/THALIATestbed

Day2

<kidehen> ericP: no, please!!

<kidehen> ericP: remember I am presenting on behalf of Huajun Chen etc..

<kidehen> ericP: the slides are for tomorrows 11.25-11.55 session :-)

<LeeF> ericP, slides?

<kidehen> .w network

<ivan> List of converter tools

<ericP> [why should we go through XML?]

<X> Scribe: ericP

timbl: important message: XML is not a stepping stone to the semantic web

<kidehen> ericP: u processed my message re. ppt?

<X> Scribe: LeeF

brand: take away message - Semantic Web community needs to take government datasets and show what can be done with them

<X> Scribe: 999

<kidehen> http://colab.cim3.net/cgi-bin/wiki.pl?WikiHomePage has been updated with Linked Data Project URIs re. Govt. Data

<cygri> scribing the mapping breakout session

<cygri> "prefix mappings": generate class/property names from the schema, like foo:table.column

<cygri> "predicate mappings": assign existing properties to certain columns, e.g. foaf:name instead of foo:table.column
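
A rough sketch of the two mapping styles noted above. By default each column becomes a schema-derived property (foo:table.column); an optional override table assigns an existing property such as foaf:name to a chosen column. The subject naming scheme, the rdf:type triple, and the function name are assumptions made for this example and do not reflect how D2RQ or any other tool discussed here actually works.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def row_to_triples(
    table: str,
    key_column: str,
    row: Dict[str, object],
    predicate_map: Optional[Dict[str, str]] = None,
    prefix: str = "foo",
) -> List[Triple]:
    """Turn one relational row into (subject, predicate, object) triples."""
    predicate_map = predicate_map or {}
    # Assumed subject scheme: prefix:table/key
    subject = f"{prefix}:{table}/{row[key_column]}"
    triples: List[Triple] = [(subject, "rdf:type", f"{prefix}:{table}")]
    for column, value in row.items():
        if column == key_column or value is None:
            continue  # the key is already in the subject; NULLs yield no triple
        # A predicate mapping wins when present; otherwise fall back to the
        # schema-derived "prefix mapping" name, foo:table.column.
        predicate = predicate_map.get(
            f"{table}.{column}", f"{prefix}:{table}.{column}")
        triples.append((subject, predicate, str(value)))
    return triples


row = {"id": 7, "name": "Alice", "dept": None}  # made-up row
plain = row_to_triples("person", "id", row)
mapped = row_to_triples("person", "id", row, {"person.name": "foaf:name"})
```

With no overrides the name column yields foo:person.name; with the override it yields foaf:name, while the rest of the output is unchanged.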

<cygri> some things should be modelled as existentials in the RDF output, e.g. "this object has a resource that is a company and has a name IBM"

<cygri> issue: is the mapping between ontology and db schema directional or bidirectional?

<cygri> bidirectionality seems to be desirable, but not necessary for simple cases

<cygri> SPARQL features that are hard to translate, from D2RQ experience: LIMIT/OFFSET/ORDER BY, UNBOUND

<cygri> another mapping approach: rewriting a query into prefix mapped form using a CONSTRUCT-style pattern

<cygri> ericP, you got the notes?

<UldisB> would be good if breakout topic leaders 'd put a session summary on a wiki

are Vipul's slides online?

<timbl> http://www.rdfabout.com/rdf/usgov/geo/us/nj

<ericP> cygri, i just stuck them into the wiki

<ericP> i'll move them to a new page ('cause that page is for yesterday)

<timbl> http://dig.csail.mit.edu/2007/wiki/Projects.rdf#OpenLinkedDataProject

<AndyS> SPARQL algebra? :-)

<timbl> I have just demonstrated to Brand Nieman a Tabulator fly-through of US Census data linked with Geonames data, showing how just putting govt data as linked data out there and connecting it gives extra value and re-use.

<timbl> I hope he feels that the pilot study he wanted has already been done by the Open Linked data project, especially James Tauberer

<cygri> ericP: AlzGene mapping for D2RQ: http://richard.cyganiak.de/2007/10/alzgene/

http://s3.amazonaws.com/kingsley/Browser_based_Visual_SQL_to_RDF_Mapping_Tool.ppt

(AFAIK)

<shellac> why is kingsley out to get me?

<ericP> cygri, tx. could link that from the minutes

<ericP> (i still need to move the break-out notes to a better [-named] page)

<ericP> agenda ''

<ericP> how do i annotate my disfunctional properties?

<shellac> hey, this talk is IBM confidential

<shellac> oh no, just that slide

<ericP> strike it from your mind

<AndyS> I found: http://www.sigmod.org/sigmod/record/issues/0409/11.JimMelton.pdf

<JimM> Yeah, but that's only a presentation about the standard :)

<AndyS> Better than nothing!

<JimM> Try http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2fIEC+FCD+9075-14%3a2006

<JimM> US$30 (huge bargain by comparison)

<AndyS> I was hoping for an online draft :-)

<JimM> Unfortunately, some organizations charge (virtually) nothing to participate, but they have to support their infrastructure some how, and that means selling their product.

<JimM> For AndyS: Unfortunately, some organizations charge (virtually) nothing to participate, but they have to support their infrastructure some how, and that means selling their product.

<JimM> Next steps

<JimM> Standardizing relationships to certain objects (e.g., zip codes)

<timbl> _________________________________________________________________________

<AndyS> I understand - I was being hopeful about the existence of "early" drafts like some standards

<Andre1> The SQL to XML mapping was described in "SQL/XML is Making Good Progress, Andrew Eisenberg and Jim Melton, ACM SIGMOD Record, Vol. 31, No. 2, June 2002."

<Andre1> http://www.acm.org/sigmod/record/issues/0206/standard.pdf

<JimM> Starting an XG on benchmarking (Chris)

<JimM> Is it appropriate to have an XG on the mapping issues?

<X> Scribe: JimM_

<AndyS> XG : about a year : W3C infrastructure : no team support

<AndyS> Ivan: survey work for mapping XG

XG might start a survey of existing approaches, identify work that might justify establishing a WG

Vipul: Doesn't sense enough consensus on an approach to create a group
... GRDDL group maybe? Ivan: No, going out of business
... Scope of mapping language? Relational only? Web services, too? Other data sources?

Ivan: Don't try to solve everything at once; start off with well-defined data source (ex: relational), then later do a second one, etc.

Vipul: Define underlying framework that can be used across data sources

Eric: Best way to champion the effort is to propose a Framework in a Technical Note

Orri: Agree a standard is needed for mapping relational to RDF; don't bite off too much by mixing all data sources. Understand it's desirable to integrate them all, but the DBMS should do that sort of thing.
... DBMS might be right place for that. Keep momentum going, but don't expect too much too soon.

Ivan: Sees the charter of such an XG as this: at the end of the 1-year period it has a good picture of what is around, how it's used, and the pros and cons; it then reports either that there is a possibility for standardization and here's the direction, OR that there's no opportunity, so abandon the effort for now.

Vipul: If XG says "Yes, there is an opportunity", it might be able to produce that framework.

TimBL: If some group focused on problem of mapping SPARQL to SQL...

Ivan: There's mapping relational to RDF, and also mapping SPARQL syntax to SQL syntax.

TimBL: The XG should define the content of a mapping file.

Jim: Yes, but the problem of mapping SPARQL syntax to SQL syntax is very different from the problem of mapping from relational data (model) to RDF (model)
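
A minimal sketch of the query-translation half of Jim's distinction, assuming the naive data mapping in which every non-NULL cell of a row stands for one triple with predicate foo:table.column. Given that mapping, a one-subject pattern such as `?s foo:gene.geneAlias "CTSD" ; foo:gene.locus ?locus` can be rewritten into a single SELECT. The function name, table, and rows are hypothetical, and the sketch handles only this one pattern shape, not joins across subjects.

```python
import sqlite3
from typing import Dict, List, Tuple


def translate_star(table: str, bound: Dict[str, object],
                   wanted: List[str]) -> Tuple[str, list]:
    """Translate a one-subject pattern into SQL under the naive mapping.

    bound:  column -> constant, from patterns like  ?s foo:table.col "value"
    wanted: columns bound to variables, from       ?s foo:table.col ?v
    """
    if not wanted:
        raise ValueError("at least one variable column is required")
    select = ", ".join(f'"{c}"' for c in wanted)
    # A triple exists only for a non-NULL cell, so each wanted column
    # must be non-NULL for the pattern to match.
    conditions = [f'"{c}" = ?' for c in bound]
    conditions += [f'"{c}" IS NOT NULL' for c in wanted]
    sql = f'SELECT {select} FROM "{table}" WHERE ' + " AND ".join(conditions)
    return sql, list(bound.values())


# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gene (id INTEGER PRIMARY KEY, geneAlias TEXT, locus TEXT)")
conn.executemany("INSERT INTO gene VALUES (?, ?, ?)",
                 [(1, "CTSD", "L1"), (2, "APOE", None)])

sql, params = translate_star("gene", {"geneAlias": "CTSD"}, ["locus"])
rows = conn.execute(sql, params).fetchall()
```

The data mapping fixes what the rows mean as RDF; the translation step depends on it but is a separate problem, which is the point being made in the discussion.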

Brand: Most relational DBs are not semantically ready for this sort of mapping. Need to have tools to make them semantically ready (and to build them ready from the start)

Jim: Let's not solve the problem for the XG, but let them explore the space and report on their conclusions. My thought is to define a language in which the mappings can be specified and stored in the database's catalog itself.

Brand: Some gov't DBs have columns named "lead", others have columns named "Pb". Are those the same, nearly the same for some purpose, or totally different?
... They use wikis in which people associated with various databases can spell out what things mean, then they run a program to match info up.
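
A toy sketch of the matching step Brand describes, assuming the meanings people write down can be reduced to a glossary from (database, column) to a shared concept. The database names, concept identifiers, and the third entry (a sales "lead") are invented to show why name matching alone is not enough.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Column = Tuple[str, str]  # (database, column)

# Hypothetical, hand-maintained glossary: (database, column) -> shared concept.
GLOSSARY: Dict[Column, str] = {
    ("water_quality", "lead"): "chem:Lead",
    ("soil_samples", "Pb"): "chem:Lead",
    ("sales", "lead"): "crm:SalesLead",
}


def group_by_concept(glossary: Dict[Column, str]) -> Dict[str, List[Column]]:
    """Group columns that were declared to mean the same thing."""
    groups: Dict[str, List[Column]] = defaultdict(list)
    for column, concept in glossary.items():
        groups[concept].append(column)
    return {concept: sorted(cols) for concept, cols in groups.items()}


matches = group_by_concept(GLOSSARY)
```

Here "lead" and "Pb" end up in the same group because of what they were declared to mean, while the two columns that are both literally named "lead" do not.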

Krishna: What is benefit from standardizing the mapping, and who benefits?

Eric: (missed most of this). Further out, communities who have multiple data sources and want to explore relationships between them, discovering new knowledge.

Vipul: Consistency, quality of mapping, etc. Need standardized language and tools.

Brand: 2 things in parallel. Need something like Visicalc so people can develop data tables in a web environment that helps them produce semantics and produce RDF. Also need to have people who know how to do semantic harmonization of relational databases to get them ready for doing high-value mapping.

Eric: Banff demo helpful

Brand: Needs example

Ivan: Who would benefit? Good question. But maybe the big database vendors will decide that the quality of the mapping is so valuable to them in competition, that they might not want to standardize. That's why an XG is better than a WG (which presumes that a standard is going to be developed).

Krishna: Why do you need to know how I do the mapping? I'll just give it to you.

???: Most people prefer to use this approach instead of a hardwired mapping.

Merge disparate DBs through an interface that (medical) clinician, tech, etc. can use...

<AndyS> (??? = Eric Neumann)

Ivan2: Likely to be impossible to cut-and-paste a mapping from one DB to a second one. Too different. As soon as a move is impossible, then a language to describe the mapping will be necessary. SQL good not because it's compatible among databases, but because training is portable.

Andrew: If a standard mapping language exists, the benefits are that tool vendors can build tools on it, and that knowledge transfers as people move between projects and projects move between people.

Vipul: Agrees with Andrew. Also opens the possibility of re-use: copy-and-paste mappings, then edit them.

TimBL: Particularly interested in tools that give good views into DB, make metadata editable, clean up databases while doing the semantic mapping.

About 10 people interested in participating in such an XG

Jim: Suggests Ashok as leader

Send email to Eric if you're interested

Eric: OK, here's a third subject
... People interested in federation, at least in conjunction with mapping.

Phil: Benefit of mapping is getting relational data quickly into RDF world. Mapping without federation doesn't get the job done adequately.

Vipul: See 1:1 mapping between federation and mapping. At a minimum, they go together.

Orri: Even if we started working on them separately, they will have much overlap and should come together in the end.

Brand: Mapping done with semantic experts, while federating has been done by DB vendors/consultants.

<X> Scribe: ericP

JimM: fed and mapping are at different levels of maturity
... fed involves many more probs: e.g. security
... don't want mapping held up for federation
... start separately and join after mature

<X> Scribe: JimM_

Ivan: Agreed on two XGs. Should there be a third XG for federation?

Orri: Finite number of people, don't spread too thin. We believe that it's vital to make SPARQL "BI-capable". These other goals are nice, but need language support. (aggregation, subqueries, other SQL-like capabilities)

Andy: That comes back to question of mapping SPARQL to SQL.

???: Why worry about SPARQL, if translating to SQL which already has it?

Jim: But SPARQL has to have the expressibility in order to have something to map to SQL

Brand: Important to have a mapping from the language in which people write queries to the language in which the queries execute. E.g., transform from a natural language, using semantic ontologies, to ask user "is this what you mean by your question?", then refine into the target system's query language as appropriate.

Ivan: Orri and Ivan, how does this affect the proposed mapping XG? Should we do nothing at that level before extending SPARQL?

Orri: No, we will work on that in any case. That's not good enough for industry at large. W3C should do this, in a timely fashion. The mapping question is separate and should proceed in any case.

Ivan: SPARQL extension is totally separate (operationally) from mapping XG work.

TimBL: DAWG wants to finish and have a rest, even though some of us have already been using SPARQL Update.

Ivan: Let's be sure that we start gathering requirements/suggestions for SPARQL 1.x
... Need to involve the "small, insignificant" :) database community. How can we bring the two communities (DB and SW) together?

Melli: What do you mean by "connecting them"?

Ivan: Not necessarily into same group. But what % of DB community has even heard of semantic web? The value proposition is not entirely clear to the DB community yet. Need to make that widely known.

Does Oracle have Semantic Web interest group? What about Open World?

Melli: Finding something that will make DBAs say "Oh, here's something really hard/impossible in SQL, but I can do it w/Semantic Web technologies."

Vipul: Good demo gets people excited, even before value proposition is clear.

Eric: Would Banff demo be doable in SQL?

Yes, but it would be much harder and take much longer in SQL.

Eric: If we had an IG, or a TF within another IG (e.g., SWIG), who would be lead...

Send Eric email if interested in the federation subject

<kidehen> JimM: Are you game for a Linked Data demo?

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.128 (CVS log)
$Date: 2007/11/18 03:43:03 $