Deploying Web-scale Mash-ups by Linking Microformats and the Semantic Web

Dan Connolly, W3C/MIT
Harry Halpin, <hhalpin -(at)- ibiblio.org>
16th International World Wide Web Conference
Banff, Alberta, Canada
April 2007

Overview, Part II

Spreadsheets as RDF authoring tools

spreadsheet screenshot

I'm not the worlds greatest fan of applying very general, very abstract models to day-to-day problems. ...

Then it hit me.

Spreadsheets:-) Think about it. Row 1 cells - subject. Col 1 cells predicate. Other cells: objects...

Sean McGrath, May 2005
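
A minimal sketch of the layout McGrath describes, assuming the sheet arrives as a list of rows and that the top-left corner cell is unused. The function name and the sample sheet are hypothetical, for illustration only:

```python
def sheet_to_triples(grid):
    """Read a grid where row 1 holds subjects, column 1 holds predicates,
    and every other cell is an object."""
    subjects = grid[0][1:]              # skip the unused corner cell
    triples = []
    for row in grid[1:]:
        predicate, objects = row[0], row[1:]
        for subject, obj in zip(subjects, objects):
            if obj not in ("", None):   # empty cells assert nothing
                triples.append((subject, predicate, obj))
    return triples


# Hypothetical sheet: two subjects described by two properties.
sheet = [
    ["",          ":dan",                   ":harry"],
    ["foaf:name", "Dan Connolly",           "Harry Halpin"],
    ["foaf:mbox", "mailto:connolly@w3.org", ""],
]

for triple in sheet_to_triples(sheet):
    print(triple)
```

Each non-empty cell becomes one (subject, predicate, object) triple, so a real tool would only need to add URI handling and RDF serialization on top.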

Apple plists: iTunes

"Show me songs ..."

Apple plists: Omnigraffle

TAG diagram

See "gleaning RDF from omnigraffle diagrams", Connolly to public-cwm-talk, 30 Sep 2004

Java Beans: Violet UML

<java version="1.5.0_04" class="java.beans.XMLDecoder"
   xmlns:grddl="http://www.w3.org/2003/g/data-view#"
   grddl:transformation="grokVioletUML.xsl">
 <object class="com.horstmann.violet.ClassDiagramGraph"> 
  <void method="addNode"> 
   ...
UML diagram with OWL formalization

GRDDL WG agenda

Eating our own cooking...

GRDDL Test suite

More GRDDL transformations and dialects

CustomRdfDialects is something of a directory of dialects, vocabularies, and GRDDL transformations:

Dialect    | Domain                        | Schema           | Mapping                     | Status
hCard      | Contact Information           | vcard ontology   | hcard2rdf.xsl, hcard profile | maintained by Walsh, Halpin, and Suda since Nov 2006; used in xtech schedule
hCalendar  | Calendars and Events          | RDF Calendar     | glean-hcal.xsl, hcal profile | maintained by DanConnolly since Dec 2002
hReview    | Opinions, Ratings and Reviews | RDF Review       | hreview2rdfxml.xsl          | in progress
relLicense | Licenses                      | Creative Commons | grokCC.xsl                  | e.g. Joe Lambda's homepage

Semantic Web Roadmap

Started in 1998; evolving slowly as we learn more...

semantic web layers

Semantic Web Deployment

advancing semantic web wave (Berners-Lee, Jan 2003)

Semantic Web Roadmap with GRDDL

semantic web layers + GRDDL

GRDDL: Not Just for HTML

Health Care Example

Kayode wants to query clinical data

patient files Data in RDF Schemas Reports

Kayode wants to write software components that extract RDF descriptions from XML HL7 CDA documents transmitted by various devices in a healthcare system. Using a clinical ontology, he can then merge clinical reports and use inference to detect possible problems.

Getting RDF out of XML: Before


<ClinicalDocument xmlns="urn:hl7-org:v3" xmlns:voc="urn:hl7-org:v3/voc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" templateId="2.16.840.1.113883.3.27.1776">
...
  <author>
    <time value="20000407"/>
    <assignedAuthor>
      <id extension="KP00017" root="2.16.840.1.113883.3.933"/>
      <assignedPerson>
        <name>
          <given>Robert</given>
          <family>Dolin</family>
          <suffix>MD</suffix>
        </name>
...

Details in: hl7-sample.xml

RDF Data from HL7 XML

Link transformation via an attribute on the root element:


<ClinicalDocument xmlns="urn:hl7-org:v3" xmlns:voc="urn:hl7-org:v3/voc" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" templateId="2.16.840.1.113883.3.27.1776" xmlns:grddl="http://www.w3.org/2003/g/data-view#" 
grddl:transformation="glean-HL7-CDA.xslt">
...
  <Observation>
    <id root="10.23.4573.15877"/>
    <code code="282290005" codeSystem="2.16.840.1.113883.6.96" codeSystemName="SNOMED CT" displayName="Imaging interpretation"/>
...

Details in: hl7-sample-grddl.xml
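
A GRDDL-aware agent starts by reading that attribute. The sketch below shows one way to do this with the Python standard library, assuming the attribute value is a whitespace-separated list of URI references that are resolved against the document's base URI. The function name and the example.org base URI are placeholders, not part of the HL7 sample:

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"


def find_transformations(xml_text, base_uri):
    """Return absolute URIs listed in grddl:transformation on the root element."""
    root = ET.fromstring(xml_text)
    value = root.get("{%s}transformation" % GRDDL_NS, "")
    # The attribute may name several transformations, separated by whitespace.
    return [urljoin(base_uri, ref) for ref in value.split()]


sample = """<ClinicalDocument xmlns="urn:hl7-org:v3"
    xmlns:grddl="http://www.w3.org/2003/g/data-view#"
    grddl:transformation="glean-HL7-CDA.xslt">
  <Observation/>
</ClinicalDocument>"""

# Placeholder base URI standing in for wherever the document was retrieved.
print(find_transformations(sample, "http://example.org/records/hl7-sample-grddl.xml"))
```

The agent would then fetch each transformation and apply it to the document to obtain RDF; that step needs an XSLT processor and is left out here.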

HL7 Data


[ a cpr:patient-record;
         dc:date "2000-04-07";
         edns:about [ a galen:Patient;
                      foaf:family_name "Levin";
                      foaf:firstName "Henry"];
         foaf:maker [ a foaf:Person;
                      foaf:family_name "Dolin";
                      foaf:firstName "Robert"]]


[ a cpr:clinical-description;
         cpr:description-of [ a cpr:screening-act;
                              edns:realizes [ a cpr:medical-sign;
                                              cpr:interpretant-of [ a foaf:Image;
                                                                    skos:prefLabel "Chest-X-ray"];
                                              skos:prefLabel "Chest hyperinflated"];
                              skos:prefLabel "Imaging interpretation"]]


RDFS and OWL

RDF Schema (RDFS) and the Web Ontology Language (OWL) correspond to UML notions such as subclass, domain, range, cardinality, ...

travel concepts schema

OWL identity reasoning

Premise:
# one-to-many
foaf:mbox a owl:InverseFunctionalProperty.
:dan foaf:mbox <mailto:connolly@w3.org>.
:dan foaf:name "Dan Connolly".

:daniel foaf:mbox <mailto:connolly@w3.org>.
:daniel foaf:name "Daniel W. Connolly".
Conclusion:
:daniel owl:sameAs :dan.
:daniel foaf:name "Dan Connolly".
:daniel foaf:name "Daniel W. Connolly".
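
The inference above can be sketched without an OWL reasoner. This is a simplified illustration, not how cwm or Pellet implement it: triples are plain tuples of prefixed names, only pairs of subjects that share a value are merged, and there is no transitive closure. The function name `smush` is hypothetical:

```python
IFP = "owl:InverseFunctionalProperty"


def smush(triples):
    """Infer owl:sameAs between subjects sharing a value of an inverse
    functional property, and copy statements across each such pair.
    Returns only the newly inferred triples."""
    triples = set(triples)
    ifps = {s for s, p, o in triples if p == "a" and o == IFP}
    owner = {}      # (property, value) -> first subject seen with that value
    same = set()    # pairs of subjects found to be identical
    for s, p, o in sorted(triples):
        if p in ifps:
            first = owner.setdefault((p, o), s)
            if first != s:
                same |= {(s, first), (first, s)}
    inferred = {(a, "owl:sameAs", b) for a, b in same}
    for a, b in same:
        # Everything said about b also holds for a.
        inferred |= {(a, p, o) for s, p, o in triples if s == b}
    return inferred - triples


premise = [
    ("foaf:mbox", "a", IFP),
    (":dan", "foaf:mbox", "mailto:connolly@w3.org"),
    (":dan", "foaf:name", "Dan Connolly"),
    (":daniel", "foaf:mbox", "mailto:connolly@w3.org"),
    (":daniel", "foaf:name", "Daniel W. Connolly"),
]

for triple in sorted(smush(premise)):
    print(triple)
```

Because `owl:sameAs` is symmetric, the sketch also reports the mirror-image statements about `:dan`.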

Finding Medical Problems with OWL


   @prefix : <http://www.w3.org/2002/07/owl#> .
   @prefix g: <http://www.example.org/grddl-primer#> .
   @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix foaf:  <http://xmlns.com/foaf/0.1/> .
   @prefix cpr: <http://purl.org/cpr/0.5#> .
    
   g:DiagnosingImage     a :Class;
     :intersectionOf  (
       foaf:Image
       [
         a :Restriction;
         :onProperty g:indicates;
         :someValuesFrom cpr:medical-problem ] ) .
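
Read as a membership test, the class says: a g:DiagnosingImage is a foaf:Image with at least one g:indicates value that is a cpr:medical-problem. The sketch below checks only that condition against explicitly stated triples; a real OWL reasoner such as Pellet does considerably more. The function name and the sample data are hypothetical:

```python
def diagnosing_images(triples):
    """Return subjects that are typed foaf:Image and have at least one
    g:indicates value typed cpr:medical-problem."""
    types = {}
    for s, p, o in triples:
        if p == "rdf:type":
            types.setdefault(s, set()).add(o)
    return {
        s for s, p, o in triples
        if p == "g:indicates"
        and "foaf:Image" in types.get(s, ())
        and "cpr:medical-problem" in types.get(o, ())
    }


# Hypothetical data: one image indicates a problem, the other does not.
data = [
    ("_:xray", "rdf:type", "foaf:Image"),
    ("_:xray", "g:indicates", "_:problem"),
    ("_:problem", "rdf:type", "cpr:medical-problem"),
    ("_:photo", "rdf:type", "foaf:Image"),
]

print(diagnosing_images(data))
```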

Discovering the Medical Problem!

image sign
_:x234 "Chest hyperinflated"

Some Semantic Web Software Tools

Platform  | Language   | Developers
Jena      | Java       | HP research
Pellet    | Java       | Parsia
redland   | C          | Beckett
rdflib    | python     | Kretch
SWAP      | python     | TimBL, Connolly (MIT DIG)
4suite    | python     | Fourthought
tabulator | javascript | TimBL (MIT DIG)

proof: cwm/N3
rules: Jena rules cwm/N3 Fuxi
OWL DL: Pellet
LinkedData query: semantic web client Y
GRDDL: Y Y * Y GRDDL.py outboard
query: ARQ roquet cwm some sparql
RDF parse, merge: Y Jena raptor Y cwm rdf.js

State of RDFa


more information at:

http://rdfa.info

start with the RDFa Primer!

Services: zero-install tools

Service       | Host        | Platform    | Language
RDF Validator | W3C         | Jigsaw+Jena | Java
XSLT servlet  | W3C         | Saxon       | Java
triplr        | Beckett     | redland     | C
XMLArmyKnife  | Leigh Dodds |             |
SPARQLer      | HP research | Jena/ARQ    | Java

XSLT: Y Y
SPARQL: Y Y
GRDDL: Y Y Y
convert to RDF/XML, turtle, json: Y
output triples: Y Y
graphs in SVG, PNG, ...: Y
validate/diagnose: Y Y Y
rdf parse: Y Y Y

OpenLink Virtuoso seems to support all of the above...

Ajax Query-Builder

openlink query builder
OpenLink Ajax SPARQL query builder

RDF browsing with GRDDL

openlink RDF browser

OpenLink RDF Browser

Eating our own cooking

Dan's homepage

Dan's travel schedule

Travel schedule with hCalendar/hCard:

RDF browsing with GRDDL

openlink RDF browser

RDF browser acts as GRDDL-aware agent and finds mash-up-ready data.

RDF Calendar on a timeline

timeline screenshot

RDF Calendar on a timeline

timeline close-up

Calendar mash-up:

Travel schedule on a map

map screenshot

Travel schedule on a map:

One Event in Particular

Banff screenshot

You are here ;-)

Linked Data Principles

  1. Everything should have a URI: All entities of interest should be identified by URIs.
  2. Follow Your Nose Principle: URIs should be dereferenceable, meaning that an application can look up a URI over HTTP and retrieve RDF data.
  3. Use standard formats: Data should be provided using the RDF/XML or Turtle syntax. If data is embedded in documents using a format such as microformats, those documents should link to transformations that automatically extract RDF data from them, à la GRDDL.
  4. Link Your Data: Resource descriptions should contain links to related information, in the form of dereferenceable URIs within RDF statements and rdfs:seeAlso links.

See the Linked Data session, Friday at 10:30am.

The Web as One Big Mashup

Follow your nose and query the whole Web

For each triple pattern, the library executes the following algorithm:

  1. look up URIs that appear in the triple pattern. Add retrieved graphs to the local graph set.
  2. look up any URI y where the graph set includes the triple { x rdfs:seeAlso y } and x is a URI from the triple pattern. Add retrieved graphs to the local graph set.
  3. match the triple pattern against all graphs in the local graph set.
  4. for each triple that matches the triple pattern
    1. look up all new URIs that appear in the triple. Add retrieved graphs to the local graph set.
    2. look up any new URI y where the graph set includes the triple { x rdfs:seeAlso y } and x is a URI from a matching triple. Add retrieved graphs to the local graph set.
  5. match the triple pattern against all newly retrieved graphs.
  6. repeat steps 4 and 5 until the maximum number of retrieval steps or the timeout is reached.
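
The steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's actual code: `fetch` stands in for an HTTP lookup and reads from a small in-memory dictionary, `None` is a wildcard in the triple pattern, only terms starting with `http://` are treated as URIs to look up, and a lookup budget replaces the timeout. All names and example.org URIs are hypothetical:

```python
SEE_ALSO = "rdfs:seeAlso"


def is_uri(term):
    return isinstance(term, str) and term.startswith("http://")


def matches(pattern, triple):
    return all(p is None or p == t for p, t in zip(pattern, triple))


def follow_your_nose(pattern, fetch, max_lookups=10):
    graphs = {}  # the local graph set: URI -> triples retrieved from it

    def look_up(uris):
        new = []
        for uri in uris:
            if uri not in graphs and len(graphs) < max_lookups:
                graphs[uri] = set(fetch(uri))
                new.append(uri)
        return new

    def see_also(uris):
        return {o for g in graphs.values() for s, p, o in g
                if p == SEE_ALSO and s in uris and is_uri(o)}

    # Steps 1 and 2: URIs in the pattern, then their rdfs:seeAlso targets.
    start = {t for t in pattern if is_uri(t)}
    look_up(sorted(start))
    look_up(sorted(see_also(start)))

    results = set()
    pending = set(graphs)  # graphs not yet matched against the pattern
    while pending:
        # Steps 3 and 5: match the pattern against graphs not yet matched.
        found = {t for uri in pending for t in graphs[uri] if matches(pattern, t)}
        found -= results
        results |= found
        # Step 4: look up new URIs in the matches and their rdfs:seeAlso targets.
        uris = {term for t in found for term in t if is_uri(term)}
        new = look_up(sorted(uris))
        new += look_up(sorted(see_also(uris)))
        pending = set(new)  # step 6: repeat until nothing new was retrieved
    return results


# A toy "web": each URI maps to the triples a lookup would return.
DAN = "http://example.org/dan"
WEB = {
    DAN: [(DAN, "foaf:knows", "http://example.org/harry"),
          (DAN, SEE_ALSO, "http://example.org/dan-more")],
    "http://example.org/dan-more": [(DAN, "foaf:knows", "http://example.org/tim")],
    "http://example.org/harry": [("http://example.org/harry", "foaf:name", "Harry")],
}


def fetch(uri):
    return WEB.get(uri, [])


pattern = (DAN, "foaf:knows", None)
for triple in sorted(follow_your_nose(pattern, fetch)):
    print(triple)
```

The second `foaf:knows` statement is only found because the `rdfs:seeAlso` link was followed, which is the point of step 2.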

The Future of GRDDL

References

GRDDL yourself!

Free time for adding microformats and GRDDL to your own vocabularies