Cover page images (keys)

Questions (and Answers) on the Semantic Web

XML-Days, Berlin, Germany, 2006-09-27

Ivan Herman, W3C

We all know that, right?

WRONG!!!!

Goal of this presentation…

 

Is the Semantic Web AI on the Web?

No!

Picture of a hype article with a text on it saying 'Beware of the Hype'

So what is the Semantic Web?

Example: Automatic Airline Reservation

Example: data(base) integration

Example: data integration in life sciences

Left side: data silos, each with its own representation on a screen, with a scientist interpreting; right side: the same silos, converted to RDF and co., with the scientist working on the data right away.

And the problem is real

screen dump of three different Life Science databases with mutually different interfaces

So what is the Semantic Web?

The Semantic Web is… the Web of Data

And what is the relationship to AI?

 

All right, but what is RDF then?

RDF

RDF (cont.)

 

But isn’t RDF simply an (ugly) XML application?

RDF is a graph!

A Simple RDF Example

A Simple RDF Graph with full URI-s
<rdf:Description rdf:about="http://www.ivan-herman.net">
    <foaf:name>Ivan</foaf:name>
    <abc:myCalendar rdf:resource="http://…/myCalendar"/>
    <foaf:surname>Herman</foaf:surname>
</rdf:Description>
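
The description above is, underneath the XML, just a set of (subject, predicate, object) statements. A minimal Python sketch of that view, assuming plain tuples are good enough for illustration; the prefixed names and the elided calendar URI are kept exactly as on the slide, and the helper name `objects` is made up here:

```python
# Each RDF statement is a (subject, predicate, object) triple; a graph is a set of them.
# This toy model does not distinguish literals from resources.
SUBJECT = "http://www.ivan-herman.net"

graph = {
    (SUBJECT, "foaf:name", "Ivan"),
    (SUBJECT, "foaf:surname", "Herman"),
    (SUBJECT, "abc:myCalendar", "http://…/myCalendar"),  # URI elided on the slide
}

def objects(graph, subject, predicate):
    """Return every object reachable from `subject` along `predicate`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

surnames = objects(graph, SUBJECT, "foaf:surname")
```

The order of the statements does not matter, which is one way in which a graph differs from an XML tree.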

Yes, RDF/XML has its Problems

Use, e.g., Turtle if you prefer…

<http://www.ivan-herman.net>
  foaf:name "Ivan";
  abc:myCalendar <http://.../myCalendar>;
  foaf:surname "Herman".

 

But what has RDF to do with data integration?

Consider this (simplified) bookstore data set

ID                  Author  Title             Publisher  Year
ISBN 0-00-651409-X  id_xyz  The Glass Palace  id_qpr     2000

 

ID      Name          Home page
id_xyz  Amitav Ghosh  http://www.amitavghosh.com/

 

ID      Publisher Name  City
id_qpr  Harper Collins  London

Export your data as a set of relations…

The previous table in an RDF format
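
One possible way to sketch the export step in Python: each row becomes a node, each non-ID cell becomes a statement about it. The `a:` prefix and the predicate names derived from the column headers are assumptions made for illustration, not the vocabulary used in the talk.

```python
# Rows copied from the three bookstore tables.
books = [{"ID": "ISBN 0-00-651409-X", "Author": "id_xyz",
          "Title": "The Glass Palace", "Publisher": "id_qpr", "Year": "2000"}]
authors = [{"ID": "id_xyz", "Name": "Amitav Ghosh",
            "Home page": "http://www.amitavghosh.com/"}]
publishers = [{"ID": "id_qpr", "Publisher Name": "Harper Collins", "City": "London"}]

def rows_to_triples(rows, prefix="a:"):
    """The ID cell is the subject; every other column yields one triple.
    Predicate names are hypothetical, built from the column header."""
    triples = set()
    for row in rows:
        for column, value in row.items():
            if column != "ID":
                predicate = prefix + column.lower().replace(" ", "_")
                triples.add((row["ID"], predicate, value))
    return triples

graph = rows_to_triples(books) | rows_to_triples(authors) | rows_to_triples(publishers)

# Foreign keys become edges: follow the book's author link to the author's name.
author_id = next(o for s, p, o in graph if p == "a:author")
author_name = next(o for s, p, o in graph if s == author_id and p == "a:name")
```

The internal identifiers (`id_xyz`, `id_qpr`) only serve to connect rows; in the graph they are simply nodes that other statements point to.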

Add the data from another publisher…

The French and English data side by side

Start merging…

The merged data with nodes with identical URI-s pointed out

Simple integration…

The merged data with one of the nodes merged with common URI

Note the role of URI-s!
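
Why URIs matter for the merge can be sketched in a few lines of Python. All URIs and property names below are hypothetical stand-ins, and the French title is included only as illustrative data; the point is that a plain set union is enough once both sources use the same URI for the same book.

```python
BOOK = "urn:example:isbn-000651409X"     # hypothetical URI shared by both data sets
FR_EDITION = "urn:example:fr-edition"    # hypothetical node for the French edition

english = {
    (BOOK, "a:title", "The Glass Palace"),
    (BOOK, "a:year", "2000"),
}
french = {
    (FR_EDITION, "f:titre", "Le palais des miroirs"),
    (FR_EDITION, "f:original", BOOK),    # points at the very same URI
}

# Merging is set union: nodes with identical URIs collapse into one.
merged = english | french

def describe(graph, node):
    """All statements in which `node` appears as subject or object."""
    return sorted(t for t in graph if node in (t[0], t[2]))

about_book = describe(merged, BOOK)
```

After the union, `about_book` contains statements from both sources, although neither source was changed or made aware of the other.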

 

So what is then the role of ontologies and/or rules?

A possible short answer

This is where we are…

The merged data with one of the nodes merged with common URI

Our merge is not complete yet…

Better merge: richer queries are possible!

The merged data with extra nodes identified as a result of identifying same as properties

What we did: we used ontologies…
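
A much-simplified Python stand-in for what the extra "same as" knowledge buys us. The equivalences, URIs and property names are hypothetical, and a real ontology-aware reasoner does considerably more than rename predicates; this only illustrates why one query can then cover both data sets.

```python
# Hypothetical "same as" statements between the two vocabularies.
EQUIVALENT = {"f:auteur": "a:author", "f:nom": "a:name"}

def normalise(graph, equivalent):
    """Rewrite each predicate to its declared equivalent, if there is one."""
    return {(s, equivalent.get(p, p), o) for s, p, o in graph}

merged = {
    ("urn:example:book", "a:author", "urn:example:ghosh-en"),
    ("urn:example:ghosh-en", "a:name", "Amitav Ghosh"),
    ("urn:example:fr-edition", "f:auteur", "urn:example:ghosh-fr"),
    ("urn:example:ghosh-fr", "f:nom", "Amitav Ghosh"),
}

unified = normalise(merged, EQUIVALENT)

# A single query on a:author now finds both the English and the French edition.
authored = sorted(s for s, p, o in unified if p == "a:author")
```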

And then the merge may go on…

The merged data with a reference to a Wikipedia entry on the author

…and on…

The merged data with a reference to a Wikipedia entry on the author plus other books he wrote

…and on…

The merged data with a reference to a Wikipedia entry on the author plus other books he wrote plus a reference to Calcutta referring to the Google Maps entry

Is that surprising?

Important issue: “schema independence”

You remember this statement?

And this?

Tradeoffs

concentric arcs with RDF, RDFS, OWL Lite, DL, and Full

 

“One has to learn formal logic, knowledge representation techniques, description logic, etc.”

Not really…

 

Where do the data and ontologies come from?

(Should we really expect the author to type in all this data?)

Pure RDF data: not always a solution…

Data may be around already…

Data may be extracted (a.k.a. “scraped”)

Formalizing the scraper approach: GRDDL

<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Some Document</title>
    <link rel="transformation" href="http:…/dc-extract.xsl"/>
    <meta name="DC.Subject" content="Some subject"/>      
    ...
  </head>
  ...
  <span class="date">2006-01-02</span>
  ...
</html>

<rdf:Description rdf:about="…">
  <dc:subject>Some subject</dc:subject>
  <dc:date>2006-01-02</dc:date>
</rdf:Description>
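
In GRDDL the extraction is done by the transformation the document links to. The Python sketch below is not that XSLT; it is a hand-written stand-in, assuming a trimmed copy of the document above, that shows the kind of mapping such a transformation performs. The function name `extract` and the added `<body>` element are illustrative, and the subject stays elided as on the slide.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Trimmed copy of the slide's document (the "..." parts are left out).
DOC = f"""<html xmlns="{XHTML}">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Some Document</title>
    <meta name="DC.Subject" content="Some subject"/>
  </head>
  <body><span class="date">2006-01-02</span></body>
</html>"""

def extract(doc, about="…"):
    """Map the DC.Subject meta element and the 'date' span to triples."""
    root = ET.fromstring(doc)
    triples = []
    for meta in root.iter(f"{{{XHTML}}}meta"):
        if meta.get("name") == "DC.Subject":
            triples.append((about, "dc:subject", meta.get("content")))
    for span in root.iter(f"{{{XHTML}}}span"):
        if span.get("class") == "date":
            triples.append((about, "dc:date", span.text))
    return triples

triples = extract(DOC)
```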

GRDDL (cont)

Another Future Solution: RDFa

RDFa example

<div about="http://uri.to.newsitem">
  <span property="dc:date">March 23, 2004</span>
  <span property="dc:title">Rollers hit casino for £1.3m</span>
  By <span property="dc:creator">Steve Bird</span>. See
  <a href="http://www.a.b.c/d.avi" rel="dcmtype:MovingImage">
  also video footage</a>…
</div>

<http://uri.to.newsitem>
  dc:date             "March 23, 2004";
  dc:title            "Rollers hit casino for £1.3m";
  dc:creator          "Steve Bird";
  dcmtype:MovingImage <http://www.a.b.c/d.avi>.
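
How the attributes turn into statements can be sketched with Python's standard `html.parser`. This covers only the three attributes used in the example (`about`, `property`, `rel` with `href`) and is nowhere near a conforming RDFa processor; the class name `RDFaSketch` is made up for this illustration.

```python
from html.parser import HTMLParser

SNIPPET = """<div about="http://uri.to.newsitem">
  <span property="dc:date">March 23, 2004</span>
  <span property="dc:title">Rollers hit casino for £1.3m</span>
  By <span property="dc:creator">Steve Bird</span>. See
  <a href="http://www.a.b.c/d.avi" rel="dcmtype:MovingImage">
  also video footage</a>…
</div>"""

class RDFaSketch(HTMLParser):
    """Tiny subset of RDFa: `about`, `property` (literal text), `rel` + `href`."""

    def __init__(self):
        super().__init__()
        self.subject = None
        self.pending = None   # property still waiting for its text content
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            self.pending = attrs["property"]
        if "rel" in attrs and "href" in attrs:
            self.triples.append((self.subject, attrs["rel"], attrs["href"]))

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

parser = RDFaSketch()
parser.feed(SNIPPET)
parser.close()
triples = parser.triples
```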

Common in RDFa and GRDDL

Linking to SQL

And for Ontologies?

There are already ontologies around…

“Core” vocabularies

A mix of ontologies (a life science example)…

diagram showing a large number of HC related ontologies bound via an RDF-like graph

 

How do I extract triples from an RDF graph? I.e., how do I query an RDF graph?

Querying RDF graphs

Simple SPARQL Example

SELECT ?cat ?val # note: not ?x!
WHERE { ?x rdf:value ?val. ?x category ?cat }
A simple graph with two tree-like subgraphs; successive frames highlight the matching nodes, first in the left subgraph, then in the right one.
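
The query above is pattern matching on the graph: both patterns must hold for the same `?x`, and only `?cat` and `?val` are returned. A minimal Python sketch of that idea, with made-up data and the bare `category` predicate kept as on the slide; it is not a SPARQL engine, and the names `match` and `select` are illustrative.

```python
def match(pattern, triple, binding):
    """Unify one triple pattern with one triple. Terms starting with '?' are
    variables. Returns the extended binding, or None if they disagree."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding

def select(variables, patterns, graph):
    """Join the patterns one after the other, then project the variables."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for t in graph
                    if (b2 := match(pattern, t, b)) is not None]
    return sorted(tuple(b[v] for v in variables) for b in bindings)

# Hypothetical data: two small subgraphs, each with a value and a category.
graph = {
    ("_:a", "rdf:value", "42"), ("_:a", "category", "price"),
    ("_:b", "rdf:value", "7"),  ("_:b", "category", "weight"),
}

rows = select(["?cat", "?val"],
              [("?x", "rdf:value", "?val"), ("?x", "category", "?cat")],
              graph)
```

Each subgraph yields one row; `?x` takes part in the join but does not appear in the result because it is not selected.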

Other SPARQL features

SPARQL as a federating tool

diagram showing a sparql that can be connected to an rdf datafile, a document via grddl, and to a database via an sparql/sql bridge

 

Isn't This Research Only?

(or: does this have any industrial relevance whatsoever?)

Not any more…

Not any more… (cont)

Some RDF deployment areas (cont)

The “corporate” landscape is moving

Applications are not always very complex…

The Active Semantic Doc picture: a doctor's file with annotations

Data integration

Example: antibodies demo

Antibodies' demo screen dump

There has been lots of R&D

MuseoSuomi application dump
Traditional Chinese medicine example dump

Portals

Vodafone screen dump

Improved Search via Ontology: GoPubMed

GoPubMed Application dump

Summary

 

Thank you for your attention!

(Slides are available from: http://www.w3.org/2006/Talks/0927-Berlin-IH/)