Bootstrapping the Semantic Web with GRDDL, Microformats, and RDFa

Fabien GANDON, INRIA (Acacia team)

Harry Halpin, University of Edinburgh

Introduction: Data and Documents

GRDDL enables you to...

Examples of declarations

Biblio example.

The Web 2.0 and the Semantic Web

Microformats can be thought of as a competing alternative to the Semantic Web

The Limits of Microformats

The main problem with microformats is that they put your data into HTML, but you have no standard way to get the data out.

Except really hairy XSLT style-sheets such as Suda's x2v...

A further problem is that they cannot be validated easily. You can mix hCard and hCal and there's no way to guarantee you will interpret it correctly.

Domain-specific: You cannot make a microformat for just anything!

From Microformats to the Semantic Web

With GRDDL, microformat data can be viewed as Semantic Web data.

So the first step of Semantic Web deployment is already happening...

Shockingly, a large number of web sites are using microformats:

  2. LinkedIn
  3. Yedda
  4. Yahoo! Local
  5. Yahoo! Tech Reviews

There's even a Dreamweaver plug-in to help authors!

GRDDL Working Group

The small, light and agile Working Group has produced:

Test Cases and Implementations

A draft of the GRDDL Test Cases is available, upon which an in-progress implementation report is based.

W3C provides a pair of online services on an experimental, best-effort basis:

The GrddlImplementations topic in the ESW Wiki is a community-maintained list of GRDDL implementations in C (Redland), Java (Jena), Python, and more.

Next Steps: Micromodels

We need a community-driven, easy-to-use site for people to create Semantic Web vocabularies, instead of always going through W3C Process or just posting them on their site.

Like an OASIS for the Semantic Web


Jane is trying to coordinate a meeting.

Jane is trying to coordinate a meeting with friends. She uses GRDDL to extract data from each of their calendar pages and combine it in a single model. She then writes a query to filter the events down to those dates when all of them are in the same city.
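
The merge-then-filter step in Jane's scenario can be sketched in plain Python. This is only an illustration: the (person, date, city) tuples stand in for the RDF that GRDDL would extract from each calendar page, the set comprehension stands in for the SPARQL query, and all names and dates are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-ins for the RDF extracted from each calendar page:
# one set of (person, date, city) tuples per friend.
JANE = {("jane", "2007-01-09", "Boston"), ("jane", "2007-01-12", "Paris")}
DAVID = {("david", "2007-01-12", "Paris"), ("david", "2007-01-15", "London")}
ROBIN = {("robin", "2007-01-09", "Boston"), ("robin", "2007-01-12", "Paris")}


def common_city_dates(*graphs):
    """Merge the per-person data and keep (date, city) pairs where everyone is present."""
    merged = set().union(*graphs)  # merging graphs is a plain set union
    people = {person for person, _, _ in merged}
    where = defaultdict(lambda: defaultdict(set))  # date -> city -> people there
    for person, date, city in merged:
        where[date][city].add(person)
    return sorted(
        (date, city)
        for date, cities in where.items()
        for city, present in cities.items()
        if present == people
    )


print(common_city_dates(JANE, DAVID, ROBIN))
```

The point of the sketch is that once every calendar is expressed in the same model, combining them is a union and the meeting question becomes a single query over the merged data.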

[Figure: calendars, Data in RDF, SPARQL]

Health Care

Kayode wants to query clinical data.

[Figure: patient files, Data in RDF, schemas, reports]

Kayode uses a single-purpose XML vocabulary as the main representation format for a computer-based patient record. He uses GRDDL to be able to query these records both in their XML vocabulary and as RDF, without managing a dual representation.

Aggregating data

Stephan wants a synthetic review before buying a guitar.

[Figure: reviews, Data in RDF, query]

Stephan wishes to buy a guitar and visits a site offering a review service. He uses GRDDL to aggregate reviews and profiles of the reviewers in order to select the reviews he can trust.

Querying sites and digital libraries

DC4Plus Corp. wants to automate the publication of its electronic documents.

[Figure: documents, Data in RDF, reports and indexes]

Adeline designs a system to allow her company to streamline the publication of Technical Reports. The system relies on shared templates for publishing documents and a GRDDL transformation to build an up-to-date RDF index used to create an authoritative repository.

Wikis and e-learning

The Technical University of Marcilly decided to use wikis to foster knowledge exchanges between lecturers and students.

The Technical University of Marcilly decides to use a wiki with metadata embedded in its pages to tag, structure, navigate and query the resources of the wiki. GRDDL is used to extract these metadata as RDF to feed the different tools of the system.

[Figure: wiki pages, Data in RDF, schemas, SPARQL]

Web syndication

Extracting form descriptions to push entries to Voltaire's blog.

[Figure: documents, Data in RDF, reports and indexes]

Voltaire has set up a weblog engine that uses XForms for editing entries. He also provides a GRDDL transformation that extracts an RDF description of the XForms, which other client applications can use to update existing entries through the identified service URIs and to perform other such services.

Validated Documents

The OAI would like to be able to specify document licenses in the schema they share.

The Open Archives Initiative (OAI) publishes an XML schema that universities can use to publish their archived documents. This schema also identifies a GRDDL transform to apply to all its instance documents in order to extract their Creative Commons license.

[Figure: wiki pages, Data in RDF, schemas, SPARQL]

Pulling Data from the Web

Steffen wants to build a directory of the people he works with.

Whenever he gets in touch with someone, Steffen runs a simple script that gathers as much metadata about that person as possible. Because most of these web pages are not even valid HTML, the script first calls an HTML-tidying tool; if the tidying is complex, some of the metadata is likely to lose its coherence.

[Figure: documents, Data in RDF, reports and indexes]

Pushing a transformation

Oceanic Consortium wants to provide transformations for their files without altering them or their schema.

Oceanic also wishes to publish RDF descriptions of its parts, reusing the AirPartML documents produced for an arrangement with a consortium of airlines. The AirPartML schemas are strict, so Oceanic cannot alter its XML documents to specify a transformation. Yet by using HTTP headers, Oceanic can specify links and profiles for transformations when serving its AirPartML documents.
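
A client-side view of this idea might look like the sketch below. It assumes the server advertises the transformation in a `Link` response header with `rel="transformation"`; that header shape, the example URLs, and the helper name are illustrative assumptions, not something the use case spells out, and the parser handles only simple header values.

```python
import re

# Matches one link-value: <target> followed by its ";param" list.
# Simplified: does not handle commas or semicolons inside quoted parameter values.
LINK_RE = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')

# Hypothetical response headers for an unmodified AirPartML document.
HEADERS = {
    "Content-Type": "application/xml",
    "Link": '<http://example.org/airpartml2rdf.xsl>; rel="transformation", '
            '<http://example.org/style.css>; rel="stylesheet"',
}


def transformations_from_headers(headers):
    """Return link targets whose rel list contains 'transformation'."""
    found = []
    for target, params in LINK_RE.findall(headers.get("Link", "")):
        rel = re.search(r'rel\s*=\s*"?([^";]+)"?', params)
        if rel and "transformation" in rel.group(1).split():
            found.append(target)
    return found


print(transformations_from_headers(HEADERS))
```

The document body is never inspected, which is the property Oceanic needs: the XML stays valid against the strict schema while the transformation is announced out of band.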

[Figure: documents, header tells its GRDDL source, get transforms, Data in RDF]

Direct Reference of GRDDL Transformations

Indirect Reference of GRDDL Transformations

XML namespace document (or XHTML profile document)

Provide a faithful rendition

Example with Microformats

Example referencing GRDDL transformations directly in the head of the HTML.
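
As an illustration of direct reference, the sketch below embeds a small XHTML page and discovers its transformations the way a GRDDL-aware agent would: the `head` must carry the GRDDL data-view profile, and each `link` or `a` element with `rel="transformation"` names a transformation. The hCard content and the stylesheet URL are made up for the example, and the sketch only discovers the transformation; it does not run it.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Hypothetical page: an hCard in XHTML, pointing at a made-up transformation URL.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Contact</title>
    <link rel="transformation" href="http://example.org/hcard2rdf.xsl"/>
  </head>
  <body>
    <div class="vcard"><span class="fn">Jane Doe</span></div>
  </body>
</html>"""


def direct_transformations(xhtml_text):
    """Return transformation URLs declared directly in an XHTML document."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    # Without the GRDDL profile, rel="transformation" carries no GRDDL meaning.
    if head is None or GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    hrefs = []
    for el in root.iter():
        if el.tag in (f"{{{XHTML_NS}}}link", f"{{{XHTML_NS}}}a"):
            if "transformation" in el.get("rel", "").split() and el.get("href"):
                hrefs.append(el.get("href"))
    return hrefs


print(direct_transformations(PAGE))
```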

Example with embedded RDF

Example referencing GRDDL transformations in a profile document referenced in the head of the HTML.
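
Indirect reference can be sketched in the same spirit. Here the page names only a profile; the profile document, in turn, carries a `rel="profileTransformation"` link. The sketch is a simplification under stated assumptions: it reads that link straight from the profile's markup rather than running GRDDL on the profile document itself, it replaces network access with a dictionary, and every URL in it is hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical profile document, keyed by its URL in a stand-in for the web.
PROFILE_URI = "http://example.org/profiles/embedded-rdf"
FAKE_WEB = {
    PROFILE_URI: """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view"><title>Profile</title></head>
  <body>
    <p><a rel="profileTransformation"
          href="http://example.org/extract-rdf.xsl">extractor</a></p>
  </body>
</html>""",
}

# Hypothetical page that references only the profile, not the transformation.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://example.org/profiles/embedded-rdf"><title>Doc</title></head>
  <body/>
</html>"""


def indirect_transformations(page_text, fetch):
    """Follow each profile URI and collect its profileTransformation links."""
    head = ET.fromstring(page_text).find(f"{{{XHTML_NS}}}head")
    found = []
    if head is None:
        return found
    for uri in head.get("profile", "").split():
        profile_text = fetch(uri)  # fetch returns None for unknown URIs
        if profile_text is None:
            continue
        for el in ET.fromstring(profile_text).iter():
            if "profileTransformation" in el.get("rel", "").split() and el.get("href"):
                found.append(el.get("href"))
    return found


print(indirect_transformations(PAGE, FAKE_WEB.get))
```

The benefit of the indirection is that many pages can share one profile, so changing the transformation means editing the profile document once.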

Example XML

Example referencing GRDDL transformations in an XML document.
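
For a general XML document, the transformation is named by a `grddl:transformation` attribute on the root element, whose value is a space-separated list of URIs. The sketch below shows that lookup on a made-up bibliography document; the `bib` vocabulary and both stylesheet URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

GRDDL_NS = "http://www.w3.org/2003/g/data-view#"

# Hypothetical bibliography document with two made-up transformation URLs.
DOC = """<bib xmlns="http://example.org/biblio#"
     xmlns:grddl="http://www.w3.org/2003/g/data-view#"
     grddl:transformation="http://example.org/bib2rdf.xsl http://example.org/bib2dc.xsl">
  <book><title>Candide</title></book>
</bib>"""


def xml_transformations(xml_text):
    """Return the URIs listed in the root element's grddl:transformation attribute."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    return root.get(f"{{{GRDDL_NS}}}transformation", "").split()


print(xml_transformations(DOC))
```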

Complete example in RDFa