World Wide Web Consortium Releases First Version of GRDDL Specification

GRDDL Links the Semantic Web and Microformats

Contact Americas, Australia --
Janet Daly, <janet@w3.org>, +1.617.253.5884 or +1.617.253.2613
Contact Europe, Africa and the Middle East --
Marie-Claire Forgue, <mcf@w3.org>, +33.492.38.75.94
Contact Asia --
Yasuyuki Hirakawa, <chibao@w3.org>, +81.466.49.1170


http://www.w3.org/ -- 24 October 2006 -- Today, the World Wide Web Consortium forged an important link between the Semantic Web and microformats communities. With "Gleaning Resource Descriptions from Dialects of Languages", or GRDDL (pronounced "griddle"), software can automatically extract information from structured Web pages and make it part of the Semantic Web. Those accustomed to expressing structured data with microformats in XHTML can thus increase the value of their existing data by porting it to the Semantic Web, at very low cost.

W3C invites community review of this First Public Working Draft, published by the GRDDL Working Group.

Different Needs, Different Ways to Express Data

One aspect of the recent developments that some people call "Web 2.0" involves applications that combine, in "mash-ups", various types of data spread across the Web. A number of active communities innovating on the Web have a common goal: sharing data such as calendar information, contact information, and geopositioning information. These communities have developed diverse social practices and technologies that satisfy their particular needs. For instance, search engines have had great success using statistical methods, while people who share photos have found it useful to tag their photos manually with short text labels. Much of this work can be captured via "microformats". Microformats are sets of simple, open data formats built upon existing and widely adopted standards, including HTML, CSS and XML.

This wave of activity has direct connections to the essence of the Semantic Web. The Semantic Web communities have pursued ways to improve the quality and availability of data on the Web, making possible more intensive data integration and more diverse applications that can scale to the size of the Web and allow even more powerful mash-ups. The set of Web standards that supports this work is known as the Semantic Web stack. The foundations of the Semantic Web stack meet the formality requirements of applications such as managing bank statements or combining volumes of medical data.

Each approach to "getting your data out there" has its place. But why limit yourself to just one approach if you can benefit, at low cost, from more than one? As microformats users consider more uses that require data modelling or validation, how can they take advantage of their existing data in more formal applications?

A Bridge from Flexible Web Applications to the Semantic Web

GRDDL is the bridge for turning data expressed in an XML format (such as XHTML) into Semantic Web data. With GRDDL, authors transform the data they wish to share into a format that can be used and transformed again for more rigorous applications.
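
To make the idea concrete, here is a minimal sketch in Python of what "gleaning" amounts to: reading an XHTML fragment marked up with hCard-style class names and emitting subject, predicate, object triples. GRDDL transformations are typically written in XSLT rather than Python, so this is a stand-in, not the mechanism the draft defines. The sample page, the example.org URLs, the vocabulary namespace and the function name glean are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
VOCAB = "http://example.org/vocab#"  # hypothetical vocabulary, for illustration only

# Hypothetical XHTML page using hCard-style class names.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="vcard" id="jane">
      <span class="fn">Jane Doe</span>
      <a class="email" href="mailto:jane@example.org">mail</a>
    </div>
  </body>
</html>"""


def glean(xhtml_text, base_uri):
    """Return (subject, predicate, object) triples for each vcard block."""
    root = ET.fromstring(xhtml_text)
    triples = []
    for card in root.iter("{%s}div" % XHTML_NS):
        if "vcard" not in card.get("class", "").split():
            continue
        # Name the subject after the element's id within the page.
        subject = base_uri + "#" + card.get("id", "")
        for el in card.iter():
            classes = el.get("class", "").split()
            if "fn" in classes:
                triples.append((subject, VOCAB + "fn", (el.text or "").strip()))
            if "email" in classes:
                triples.append((subject, VOCAB + "email", el.get("href", "")))
    return triples


TRIPLES = glean(PAGE, "http://example.org/people")
for s, p, o in TRIPLES:
    print(s, p, o)
```

Once the data is in triple form, it no longer depends on the layout of the page it came from, which is what allows it to be combined with data from other sources.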

The recently published GRDDL Use Cases provides insight into why this is useful through a number of scenarios, including scheduling a meeting, comparing information from various retailers before making a purchase, and extracting information from wikis to facilitate e-learning. Once data is part of the Semantic Web, it can be merged with other data (for example, from a relational database, similarly exposed to the Semantic Web) for queries, inferences, and conversion to other formats.

The GRDDL Primer shows several practical examples of "how to GRDDL" an ordinary XHTML document that uses microformats. The practical impact on current authoring practices of adopting GRDDL is minor; only small changes are required to existing documents. GRDDL is thus ready to deploy, at very low cost.
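
As a rough illustration of how small the change is: as GRDDL is generally described, an XHTML author adds a profile attribute to the head element and a link element with rel="transformation" that points at a transformation. The Python sketch below checks a document for both and lists the transformations it names. The sample page, the example.org stylesheet URL and the function name find_transformations are hypothetical, and the profile URI should be checked against the Working Draft; a real GRDDL-aware agent would go on to fetch and run each transformation, which this sketch does not do.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Profile URI as commonly cited for GRDDL; verify against the draft.
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Hypothetical page: the profile attribute and the link element are
# the only additions to otherwise ordinary XHTML.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Schedule</title>
    <link rel="transformation" href="http://example.org/glean-events.xsl" />
  </head>
  <body><p>Meeting on Tuesday.</p></body>
</html>"""


def find_transformations(xhtml_text):
    """Return the transformation URLs a page declares, or [] if none."""
    root = ET.fromstring(xhtml_text)
    head = root.find("{%s}head" % XHTML_NS)
    if head is None:
        return []
    # The profile attribute may hold several space-separated URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    found = []
    for link in head.iter("{%s}link" % XHTML_NS):
        # rel may also hold several space-separated tokens.
        if "transformation" in link.get("rel", "").split():
            href = link.get("href")
            if href:
                found.append(href)
    return found


TRANSFORMATIONS = find_transformations(PAGE)
print(TRANSFORMATIONS)
```

A page without the profile attribute yields an empty list here, reflecting the idea that a document opts in to being gleaned.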

About the World Wide Web Consortium [W3C]

The World Wide Web Consortium (W3C) is an international consortium where Member organizations, a full-time staff, and the public work together to develop Web standards. W3C primarily pursues its mission through the creation of Web standards and guidelines designed to ensure long-term growth for the Web. Over 400 organizations are Members of the Consortium. W3C is jointly run by the MIT Computer Science and Artificial Intelligence Laboratory (MIT CSAIL) in the USA, the European Research Consortium for Informatics and Mathematics (ERCIM) headquartered in France, and Keio University in Japan, and has additional Offices worldwide. For more information see http://www.w3.org/