Covers: semantic HTML, microformats, Turtle, RDF, and SPARQL
When can Jane, David, and Robin meet?
GRDDL and SPARQL to the rescue!
Start | End | Place | Summary |
---|---|---|---|
"2007-01-08" | "2007-01-11" | Edinburgh, UK | Web Design Conference |
Jane's friends Robin and David are both in town with her in Edinburgh on January 8th through 10th for the Web Design Conference.
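The answer in the table above is an interval intersection over the three schedules. Here is a minimal sketch, assuming each event is reduced to a (start, end) pair with an exclusive end date, as in iCalendar. Jane's and David's dates come from their schedules shown later in these slides; Robin's Edinburgh event is not shown in the slides, so its dates here are a placeholder assumption, and `common_window` is a hypothetical helper name.

```python
from datetime import date

# (start, end) pairs; the end date is exclusive, as in iCalendar DTEND.
jane = (date(2007, 1, 8), date(2007, 1, 11))   # "Web Conference" in Jane's schedule
david = (date(2007, 1, 8), date(2007, 1, 11))  # award ceremony in David's schedule
robin = (date(2007, 1, 8), date(2007, 1, 11))  # assumption: Robin's event is not shown


def common_window(*intervals):
    """Return the (start, end) overlap of all intervals, or None if there is none."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start < end else None


window = common_window(jane, david, robin)
print(window)
```

In the real scenario the three intervals would come out of a SPARQL query over the RDF that GRDDL extracts from each schedule, not from hard-coded tuples.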
Events in Robin's schedule...
... are marked up like this:
<li class="vevent"> <strong class="summary">Fashion Expo</strong> in <span class="location">Paris, France</span>: <abbr class="dtstart" title="2006-10-20">Oct 20</abbr> to <abbr class="dtend" title="2006-10-23">22</abbr> </li>
hCalendar = iCalendar in XHTML
iCalendar (RFC2445):
BEGIN:VEVENT UID:20020630T230445Z-3895-69-1-7@jammer DTSTART;VALUE=DATE:20020703 DTEND;VALUE=DATE:20020706 SUMMARY:XYZ Conference LOCATION:San Francisco END:VEVENT
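To make the "hCalendar = iCalendar in XHTML" claim concrete, here is a rough sketch that reads Robin's Fashion Expo markup with Python's standard `html.parser` and prints a VEVENT. It is not a conforming hCalendar parser: `HCalParser`, `escape_text` and `to_ical` are hypothetical names, only four properties are handled, there is no UID, and line endings are simplified.

```python
from html.parser import HTMLParser

# Robin's event, copied from the hCalendar markup above.
HCAL = ('<li class="vevent"> <strong class="summary">Fashion Expo</strong> in '
        '<span class="location">Paris, France</span>: '
        '<abbr class="dtstart" title="2006-10-20">Oct 20</abbr> to '
        '<abbr class="dtend" title="2006-10-23">22</abbr> </li>')

PROPS = ("summary", "location", "dtstart", "dtend")


class HCalParser(HTMLParser):
    """Collect a few hCalendar properties from a single vevent (no nesting)."""

    def __init__(self):
        super().__init__()
        self.event = {}
        self._current = None  # property whose text content is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for cls in (attrs.get("class") or "").split():
            if cls not in PROPS:
                continue
            if tag == "abbr" and "title" in attrs:
                # The machine-readable value lives in the title attribute.
                self.event[cls] = attrs["title"]
            else:
                self._current = cls
                self.event[cls] = ""

    def handle_data(self, data):
        if self._current:
            self.event[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


def escape_text(value):
    """Escape backslash, comma and semicolon, as iCalendar TEXT values require."""
    return value.replace("\\", "\\\\").replace(",", "\\,").replace(";", "\\;")


def to_ical(event):
    """Serialize the collected properties as a VEVENT (LF line ends for brevity)."""
    return "\n".join([
        "BEGIN:VEVENT",
        "DTSTART;VALUE=DATE:" + event["dtstart"].replace("-", ""),
        "DTEND;VALUE=DATE:" + event["dtend"].replace("-", ""),
        "SUMMARY:" + escape_text(event["summary"].strip()),
        "LOCATION:" + escape_text(event["location"].strip()),
        "END:VEVENT",
    ])


parser = HCalParser()
parser.feed(HCAL)
ical = to_ical(parser.event)
print(ical)
```

Note that the visible text says "22" while the `title` carries 2006-10-23: the sketch keeps the `title` value, since DTEND is exclusive.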
hCalendar and other microformats have shared tools, knowledge, process...
Microformats are centralized data formats for different types of data, often (nearly) isomorphic to already widely adopted non-Web standards:
The lower-case semantic web
She doesn't trust the wisdom of crowds. She trusts:
A hotel with a ranking of 5 reviewed by a trusted friend:
rating | name | region | homepage | hotelname |
---|---|---|---|---|
5 | PeterS | Edinburgh | http://peter.example.org | Witch's Caldron Hotel, Edinburgh |
"How did you do that?" I'm glad you asked...
Too many services replicate the same sort of data... what if you have a Friendster, a Myspace, and a Twitter account?
photo by Jon Hicks
I want my data back.
Jon Bosak circa 1997
I've long believed that customers of any application own the data they enter into it.
Is this what Web 2.0 is all about? If so, maybe it's not such a bad thing.
... is an open world and universal space for machine-readable data.
To a computer, then, the web is a flat, boring world devoid of meaning...This is a pity, as in fact documents on the web describe real objects and imaginary concepts, and give particular relationships between them...Adding semantics to the web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values. TimBL, WWW1994
<#p> foaf:name "PeterS"; foaf:homepage <http://peter.example.org>.
Note the relationship to HTML links, especially with the re-discovery of the rel attribute.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rev: <http://www.purl.org/stuff/rev#>.
@prefix vcard: <http://www.w3.org/2006/vcard/ns#>.
@prefix xfn: <http://gmpg.org/xfn/11#>.
_:hotel vcard:adr [ vcard:locality "Edinburgh" ];
  rev:hasReview [ rev:rating 5; rev:reviewer _:who; rdfs:label "Witch's Caldron Hotel, Edinburgh" ].
<jane> xfn:friend _:who.
_:who foaf:name "PeterS";
  foaf:homepage <http://peter.example.org>.
The Semantic Web is to spreadsheets and databases what the Web of hypertext documents is to word processor files.
 | Web | Semantic Web |
---|---|---|
Traditional Design | hypertext | database, spreadsheet, logic |
+ | URIs | |
- | link consistency | global consistency? |
= | viral growth |
XML (Extensible Markup Language) is a generalization of HTML that lets anyone name the elements and attributes.
Think ASCII for the 21st Century!
Also a tree model (DOM - Document Object Model), which is a handy data structure.
RDF statements* are independent. RDF semantics are monotonic.
 | RDF | XML |
---|---|---|
Premise | <Book rdf:ID="book1"> <dc:title>The Grapes of Wrath</dc:title> <dc:creator>Steinbeck</dc:creator> </Book> | <xsd:simpleType name="myInteger"> <xsd:restriction base="xsd:integer"> <xsd:minInclusive value="10000"/> <xsd:maxInclusive value="99999"/> </xsd:restriction> </xsd:simpleType> |
Conclusion | <Book rdf:ID="book1"> <dc:title>The Grapes of Wrath</dc:title> </Book> | <!-- no, this does not follow --> <xsd:simpleType name="myInteger"> <xsd:restriction base="xsd:integer"> <xsd:maxInclusive value="99999"/> </xsd:restriction> </xsd:simpleType> |
*RDF/XML does have an rdf:parseType="Collection" syntax, which expands to a Lisp-style binary tree in the abstract syntax. The erasure property applies to RDF statements, not to XML elements.
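The premise/conclusion comparison can be sketched by treating a graph as a set of (subject, predicate, object) triples: dropping a statement leaves something that still follows. This is a simplification that only covers ground triples (no blank nodes, no inference rules); the short names such as `"#book1"` and `"Book"` stand in for full URIs, and `follows_by_erasure` is a hypothetical helper.

```python
# Each RDF statement is a (subject, predicate, object) triple; a graph is a set of them.
premise = {
    ("#book1", "rdf:type", "Book"),
    ("#book1", "dc:title", "The Grapes of Wrath"),
    ("#book1", "dc:creator", "Steinbeck"),
}

# The same graph with the dc:creator statement erased.
conclusion = {
    ("#book1", "rdf:type", "Book"),
    ("#book1", "dc:title", "The Grapes of Wrath"),
}


def follows_by_erasure(premise, conclusion):
    """True when the conclusion only drops statements from the premise."""
    return conclusion <= premise


print(follows_by_erasure(premise, conclusion))  # the smaller graph follows
print(follows_by_erasure(conclusion, premise))  # adding a statement does not
```

No such subset test works for the XML Schema column: deleting the `minInclusive` element produces a different type, not a weaker version of the same one.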
At least the issues in the 1998 spec have all been resolved, complete with test cases. There are plenty of interoperable parsers. And it works great with Relax-NG and nxml-mode :)
GRDDL (Gleaning Resource Descriptions from Dialects of Languages) is a way to bootstrap RDF out of XML, and in particular XHTML, data by explicitly linking transformations from XML to RDF.
GRDDL terminology:
Recall that Jane needs her list of trusted sources in some machine-readable format.
As long as they can be mapped to RDF, they can be mapped to each other.
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head profile="http://gmpg.org/xfn/11">
<title>Jane's XFN List</title>
</head>
<body>
<h1>Jane's <abbr title="XHTML Friends Network">XFN</abbr> List</h1>
<ul class="xoxo">
<li class="vcard"><a href="http://peter.example.org" class="url fn" rel="met collegue friend">Peter Smith</a></li>
<li class="vcard"><a href="http://john.example.org" class="url fn" rel="met">John Doe</a></li>
<li class="vcard"><a href="http://paul.example.org" class="url fn" rel="met">Paul Revere</a></li>
</ul>
</body>
</html>
* actually, the XFN profile isn't quite GRDDL-happy yet; but the eRDF profile is.
magic* happens here...
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix h: <http://www.w3.org/1999/xhtml> .
@prefix xfn: <http://gmpg.org/xfn/11#> .
[] foaf:homepage <http://www.w3.org/2001/sw/grddl-wg/doc29/janefriends.html>;
  xfn:friend [ foaf:homepage <http://peter.example.org> ];
  xfn:met [ foaf:homepage <http://peter.example.org> ],
    [ foaf:homepage <http://john.example.org> ],
    [ foaf:homepage <http://paul.example.org> ] .
*we'll explain the trick later.
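Until then, here is a rough idea of what the transformation has to do, sketched with Python's standard `html.parser` instead of the XSLT a GRDDL agent would actually run. The sketch is an assumption-laden simplification: `XFNParser` is a hypothetical name, only a subset of XFN values is recognized, and page and homepage URLs are used directly as subject and object where the Turtle above uses blank nodes with `foaf:homepage`.

```python
from html.parser import HTMLParser

PAGE = "http://www.w3.org/2001/sw/grddl-wg/doc29/janefriends.html"

# The three links from Jane's XFN list, with rel values exactly as marked up.
LINKS = """
<a href="http://peter.example.org" class="url fn" rel="met collegue friend">Peter Smith</a>
<a href="http://john.example.org" class="url fn" rel="met">John Doe</a>
<a href="http://paul.example.org" class="url fn" rel="met">Paul Revere</a>
"""

XFN_VALUES = {"friend", "met", "colleague"}  # only a subset of the XFN vocabulary


class XFNParser(HTMLParser):
    """Turn rel values on <a> elements into (page, xfn:relation, homepage) triples."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        for rel in (attrs.get("rel") or "").split():
            # Tokens that are not XFN values are silently ignored.
            if href and rel in XFN_VALUES:
                self.triples.append((self.page_url, "xfn:" + rel, href))


xfn = XFNParser(PAGE)
xfn.feed(LINKS)
for triple in xfn.triples:
    print(triple)
```

The sketch yields one `xfn:friend` and three `xfn:met` statements, the same relationships as in the Turtle above; the unrecognized `collegue` token contributes nothing.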
The hReview microformat doesn't have an established profile yet, so the Hotel Review data uses GRDDL directly:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> <head profile="http://www.w3.org/2003/g/data-view"> <title>Hotel Reviews from Example.com</title> <link rel="transformation" href="http://www.w3.org/2001/sw/grddl-wg/doc29/hreview2rdfxml.xsl"/> </head> <div class="hreview" id="_665"> <div class="item vcard"> <b class="fn org">Witch's Caldron Hotel, Edinburgh</b> <span><span class="rating">5</span> out of 5 stars</span>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX c: <http://www.w3.org/2002/12/cal/icaltzd#>
SELECT ?name ?summary ?when
FROM <myFriendsBlogsData>
WHERE {
?somebody foaf:name ?name;
foaf:mbox ?mbox.
?event c:summary ?summary;
c:dtstart ?when;
c:attendee [ c:calAddress ?mbox ].
}
?name | ?summary | ?when |
---|---|---|
Tantek Çelik | Web 2.0 | 2005-10-05 |
Norm Walsh | XML 2005 | 2005-11-13 |
Dan Connolly | W3C tech plenary | 2006-02-27 |
See SPARQL Query Language for RDF W3C Working Draft.
"Find reviews better than 2 stars and tell me the name of the hotel and the reviewer."
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rev: <http://www.purl.org/stuff/rev#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?name ?rating ?hotelname
FROM <http://www.w3.org/2001/sw/grddl-wg/doc29/review.rdf>
WHERE {
?x rev:hasReview ?review.
?review rev:rating ?rating;
rdfs:label ?hotelname;
rev:reviewer [ foaf:name ?name ].
FILTER (?rating > 2).
}
Details: hotelquery1.rq
The query worked, but it's not precise enough:
name | rating | hotelname |
---|---|---|
"PeterS" | 5 | "Enlightenment Amsterdam Hotel" |
"RexR" | 5 | "Pilgrim Hostel" |
"PeterS" | 4 | "Fano Hotel" |
"MaryV" | 5 | "Franklin Hotel, Philadelphia" |
"Simon" | 5 | "Forest Cafe Youth Hostel, Edinburgh" |
"JennyR" | 3 | "Merton Atlanta" |
"JohnD" | 4 | "Walter Scott Hotel, Edinburgh" |
"PeterS" | 5 | "Royal Moon Hotel, Boston" |
"JohnD" | 5 | "Elena Plaza Hotel" |
"PeterS" | 5 | "Witch's Caldron Hotel, Edinburgh" |
"RexR" | 3 | "Bond Plaza Hotel" |
"RexR" | 5 | "McRae Palace, Edinburgh" |
"RexR" | 5 | "Ritchie Centre, Edinburgh" |
"PeterS" | 5 | "Maximus New York Hotel & Towers" |
"Find reviews of hotels in Edinburgh better than 2 stars and tell me the name of the hotel and the reviewer."
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
SELECT DISTINCT ?rating ?name ?hotelname ?region
FROM <hotel-data.rdf>
FROM <janefriends.rdf>
WHERE {
?x rev:hasReview ?review;
vcard:adr [ vcard:locality ?region ].
?review rev:rating ?rating;
rdfs:label ?hotelname;
rev:reviewer [ foaf:name ?name ].
FILTER (?rating > 2 && ?region = "Edinburgh").
}
This shows hotels in Edinburgh that are rated better than 2 stars, but there might be review spam:
rating | name | hotelname | region |
---|---|---|---|
5 | "RexR" | "Ritchie Centre, Edinburgh" | "Edinburgh" |
5 | "PeterS" | "Witch's Caldron Hotel, Edinburgh" | "Edinburgh" |
5 | "Simon" | "Forest Cafe Youth Hostel, Edinburgh" | "Edinburgh" |
5 | "RexR" | "McRae Palace, Edinburgh" | "Edinburgh" |
4 | "JohnD" | "Walter Scott Hotel, Edinburgh" | "Edinburgh" |
"Find reviews by my friends of hotels in Edinburgh better than 2 stars and tell me the name of the hotel and the reviewer."
PREFIX xfn: <http://gmpg.org/xfn/11#>
SELECT DISTINCT ?rating ?name ?homepage ?hotelname
FROM <review.rdf>
FROM <xfn.rdf>
WHERE {
?place rev:hasReview ?review;
vcard:adr [ vcard:locality "Edinburgh" ].
?review
rdfs:label ?hotelname;
rev:rating ?rating;
rev:reviewer ?reviewer.
FILTER (?rating > 2).
?reviewer foaf:name ?name;
foaf:homepage ?homepage.
[ foaf:homepage <janefriends.html> ]
xfn:friend [ foaf:homepage ?homepage ].
}
Details: hotelquery3.rq
Just right:
rating | name | region | homepage | hotelname |
---|---|---|---|---|
5 | PeterS | Edinburgh | http://peter.example.org | Witch's Caldron Hotel, Edinburgh |
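What the third query does is a join between the review data and Jane's friend list, keyed on the reviewer's homepage. Here is a small in-memory sketch of that join, not a SPARQL engine. The PeterS row comes from the result above; the other rows reuse names from the earlier result tables, but their homepages and the Boston locality are placeholder assumptions, since the slides do not give them.

```python
from collections import namedtuple

Review = namedtuple("Review", "rating name homepage hotelname locality")

reviews = [
    Review(5, "PeterS", "http://peter.example.org", "Witch's Caldron Hotel, Edinburgh", "Edinburgh"),
    # Assumed homepages below: the slides only list these reviewers by name.
    Review(5, "RexR", "http://rexr.example.org", "McRae Palace, Edinburgh", "Edinburgh"),
    Review(4, "JohnD", "http://john.example.org", "Walter Scott Hotel, Edinburgh", "Edinburgh"),
    # Assumed locality: the hotel name suggests Boston.
    Review(5, "PeterS", "http://peter.example.org", "Royal Moon Hotel, Boston", "Boston"),
]

# Homepages that Jane's XFN list marks as xfn:friend (john and paul are only xfn:met).
friends = {"http://peter.example.org"}

trusted = [
    r for r in reviews
    if r.rating > 2                  # FILTER (?rating > 2)
    and r.locality == "Edinburgh"    # vcard:locality "Edinburgh"
    and r.homepage in friends        # xfn:friend [ foaf:homepage ?homepage ]
]

for r in trusted:
    print(r.rating, r.name, r.homepage, r.hotelname)
```

The homepage URI is what ties the two data sets together, which is why both the review data and the XFN list need to expose it.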
When can Jane, David, and Robin meet?
David has chosen to mark up his schedule using Embedded RDF (an alternative to RDFa), a way of embedding RDF in documents so that GRDDL can extract it.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://purl.org/NET/erdf/profile">
<title>Where Am I</title>
<link rel="schema.cal" href="http://www.w3.org/2002/12/cal#" />
</head>
<body>
<p class="-cal-Vevent" id="tiddlywinks">
From <span class="cal-dtstart" title="2006-10-07">7 October, 2006</span>
to <span class="cal-dtend" title="2006-10-13">12 October, 2006</span>
I will be attending the <span class="cal-summary">National Tiddlywinks
Championship</span> in
<span class="cal-location">Bognor Regis, UK</span>.
</p>
<p class="-cal-Vevent" id="holiday">
Then I'm <span class="cal-summary">on holiday</span> in the
<span class="cal-location">Cayman Islands</span> between
<span class="cal-dtstart" title="2006-11-14">14 November, 2006</span>
and <span class="cal-dtend" title="2007-01-02">1 January, 2007</span>.
</p>
<p class="-cal-Vevent" id="award">
I then visit Scotland on <span class="cal-dtstart" title="2007-01-08">the 8th
January</span> to <span class="cal-summary">pick up a lifetime
achievement award from the world gamers association</span>. This time
the ceremony is in <span class="cal-location">Edinburgh, UK</span>. I'll be
taking the train home on the <span class="cal-dtend" title="2007-01-11">10th</span>.
</p>
</body>
</html>
GRDDL has gone meta!
The HTML profile document is itself GRDDL-enabled: it links the standard library transformation with <link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />, and running that transformation extracts the http://www.w3.org/2003/g/data-view#profileTransformation property, whose object is the transformation to apply.
Embedded RDF has a link to a GRDDL transformation in its profile document.
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://www.w3.org/2003/g/data-view">
<title>Embedded RDF HTML Profile</title>
<link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />
</head>
<body>
<p>
<a rel="profileTransformation" href="http://purl.org/NET/erdf/extract-rdf">GRDDL transform</a>
</p>
</body>
</html>
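The lookup a GRDDL-aware agent performs on this profile document can be approximated as below. This is a shortcut under stated assumptions: a real agent would run the glean-profile transformation and read the resulting RDF, whereas this sketch simply scans for `rel="profileTransformation"` with Python's `html.parser`. `ProfileLinkParser` is a hypothetical name.

```python
from html.parser import HTMLParser

# The Embedded RDF profile document shown above.
PROFILE_DOC = """
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head profile="http://www.w3.org/2003/g/data-view">
<title>Embedded RDF HTML Profile</title>
<link rel="transformation" href="http://www.w3.org/2003/g/glean-profile" />
</head>
<body>
<p><a rel="profileTransformation" href="http://purl.org/NET/erdf/extract-rdf">GRDDL transform</a></p>
</body>
</html>
"""


class ProfileLinkParser(HTMLParser):
    """Collect the href of every link marked rel="profileTransformation"."""

    def __init__(self):
        super().__init__()
        self.transformations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").split()
        if tag in ("a", "link") and "profileTransformation" in rels and attrs.get("href"):
            self.transformations.append(attrs["href"])


profile = ProfileLinkParser()
profile.feed(PROFILE_DOC)
print(profile.transformations)
```

The agent would then apply each transformation it finds to David's schedule, which names this profile in its `head` element.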
No Transformation Links - just go to the namespace document!
In RDF, OWL, RDF Schema:
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dataview="http://www.w3.org/2003/g/data-view#">
<rdf:Description rdf:about="http://www.w3.org/2004/01/rdxh/p3q-ns-example">
<dataview:namespaceTransformation
rdf:resource="http://www.w3.org/2004/01/rdxh/grokP3Q.xsl"/>
</rdf:Description>
</rdf:RDF>
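Reading the transformation out of such a namespace document could look roughly like this. It is a sketch using `xml.etree.ElementTree`, not a general RDF/XML parser: it assumes the statement is written as an `rdf:Description` with an `rdf:about` attribute, exactly as above, and `namespace_transformations` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DV_NS = "http://www.w3.org/2003/g/data-view#"

# The namespace document shown above.
NAMESPACE_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dataview="http://www.w3.org/2003/g/data-view#">
  <rdf:Description rdf:about="http://www.w3.org/2004/01/rdxh/p3q-ns-example">
    <dataview:namespaceTransformation
        rdf:resource="http://www.w3.org/2004/01/rdxh/grokP3Q.xsl"/>
  </rdf:Description>
</rdf:RDF>"""


def namespace_transformations(rdf_xml):
    """Return (namespace, transformation) pairs found in the document."""
    root = ET.fromstring(rdf_xml)
    found = []
    for desc in root.iter("{%s}Description" % RDF_NS):
        about = desc.get("{%s}about" % RDF_NS)
        for prop in desc.findall("{%s}namespaceTransformation" % DV_NS):
            found.append((about, prop.get("{%s}resource" % RDF_NS)))
    return found


pairs = namespace_transformations(NAMESPACE_DOC)
print(pairs)
```

An instance document in that namespace needs no transformation link of its own; the agent dereferences the namespace URI and finds the stylesheet there.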
In XML Schema:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns="http:.../Order-1.0"
targetNamespace="http:.../Order-1.0"
version="1.0"
...
xmlns:data-view="http://www.w3.org/2003/g/data-view#"
data-view:transformation="http://www.w3.org/2003/g/embeddedRDF.xsl" >
<xsd:element name="Order" type="OrderType">
<xsd:annotation>
<xsd:documentation>This element is the root element.</xsd:documentation>
</xsd:annotation>
...
<xsd:annotation>
<xsd:appinfo>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://www.w3.org/2003/g/po-ex">
<data-view:namespaceTransformation
rdf:resource="grokPO.xsl" />
</rdf:Description>
</rdf:RDF>
</xsd:appinfo>
</xsd:annotation>
...
Is it too much work to ask people to add the transformation and profile to their individual instance data?
Creators or maintainers of vocabularies can also give users of their data the option of having it transformed into RDF without adding any new markup to individual documents.
Once the transformation has been linked from the profile or namespace document, all users of the dialect get the added value of RDF for free.
The namespace document or profile document has to carry the matching RDF property: http://www.w3.org/2003/g/data-view#namespaceTransformation for a namespace document, or http://www.w3.org/2003/g/data-view#profileTransformation for a profile document. Its subject is the namespace or profile document and its object is the transformation itself.
While GRDDL has mostly been used in the wild to convert widely deployed microformats to RDF, it can also be used with RDFa, the W3C work item that lets you embed arbitrary RDF statements in HTML, microformat-style.
RDFa is useful because microformats exist only as a limited number of centralized vocabularies. What if you want to mark up metadata in a web page about a subject for which there is no microformat?
Since RDFa is still a moving target, we personally recommend Embedded RDF for the time being, unless you are willing to track the changes in RDFa. RDFa is, however, more expressive than Embedded RDF (it allows XML Schema datatypes, etc.).
This document is licensed under a <a href="http://cc.org/licenses/by/3.0/"> CC License </a> and was written by TimBL.
This document is licensed under a <a href="http://cc.org/licenses/by/3.0/" xmlns:cc="http://cc.org/ns#" rel="cc:license"> CC License </a> and was written by TimBL.
This document ... <div rel="dc:creator" class="foaf:Person" xmlns:dc="http://..." xmlns:foaf="http://..."> and was written by <span property="foaf:nickname"> TimBL </span>. </div>
yields
<> dc:creator [a foaf:Person ; foaf:nickname "TimBL"] .
RDFa for Jane's schedule online
<html xmlns:cal="http://www.w3.org/2002/12/cal/icaltzd#" xmlns:xs="http://www.w3.org/2001/XMLSchema#">
<head profile="http://www.w3.org/2003/g/data-view">
<title>Jane's Blog</title>
<link rel="transformation" href="http://www.w3.org/2001/sw/grddl-wg/td/RDFa2RDFXML.xsl"/>
</head>
<body>
<p about="#event1" class="cal:Vevent">
<b property="cal:summary">Weekend off in Iona</b>:
<span property="cal:dtstart" content="2006-10-21" datatype="xs:date">Oct 21st</span>
to <span property="cal:dtend" content="2006-10-23" datatype="xs:date">Oct 23rd</span>.
See <a rel="cal:url" href="http://freetime.example.org/">FreeTime.Example.org</a> for
info on <span property="cal:location">Iona, UK</span>.
</p>
<p about="#event2" class ="cal:Vevent">
<b property="cal:summary">Holiday in Ireland</b>:
<span property="cal:dtstart" content="2006-12-23" datatype="xs:date">Dec 23rd</span>
to <span property="cal:dtend" content="2006-12-27" datatype="xs:date">Dec 27th</span>.
See <a rel="cal:url" href="http://vacation.example.org/">Vacation.Example.org</a> for
info on <span property="cal:location">Belfast, Ireland</span>.
</p>
<p><b>New Years!</b> Now it's 2007...</p>
<p about="#event3" class ="cal:Vevent">
<b property="cal:summary">Web Conference</b>:
<span property="cal:dtstart" content="2007-01-08" datatype="xs:date">Jan 8th</span>
to <span property="cal:dtend" content="2007-01-11" datatype="xs:date">Jan 11th</span>.
See <a rel="cal:url" href="http://webconf.example.org/">webconf.example.org</a> for
info on <span property="cal:location">Edinburgh, UK</span>.
</p>