Copyright © W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
Current web pages, written in HTML, are chock-full of structured data. When publishers can express this data, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published to enable structured search and sharing.
RDFa is a syntax for expressing this structured data in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.
This document is an introduction to RDFa. For a more detailed syntax specification, please consult the RDFa Syntax Document.
This is an internal draft produced by the Semantic Web Deployment Working Group [SWBPD-WG], in cooperation with the HTML Working Group [HTML-WG].
This document is for internal review only and is subject to change without notice. This document has no formal standing within the W3C.
1 Purpose of RDFa and Preliminaries
1.1 Audience
2 A First Scenario: Publishing Events and Contacts
2.1 The Basic HTML
2.2 Publishing An Event
2.3 Publishing Contact Information
2.4 The Complete HTML with RDFa
2.5 RDFa with Limited HTML control
2.6 The RDF Triples
3 A Second Scenario: Publishing Photos
3.1 The Shutr Photo Management System
3.2 Literal Properties
3.3 URI Properties
4 Beyond the Current Document
4.1 Qualifying Other Documents
4.2 Inheriting about
4.3 Qualifying Chunks of Documents
5 More Complex Structured Data: Social Networking with FOAF
5.1 Two Layers of Structured Data
5.2 Additional Layers
5.3 The RDF Triples
5.4 Naming the Nodes
6 Bibliography
7 Acknowledgments
Current web pages, written in HTML, are chock-full of structured data. When publishers can express this data, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published to enable structured search and sharing.
RDFa is a syntax that expresses this structured data using a set of elements and attributes that embed RDF in HTML. An important goal of RDFa is to achieve this RDF embedding without repeating existing HTML content when that content is the structured data. Though RDFa was initially designed for XHTML2, one should be able to use RDFa with other XML dialects, e.g. XHTML1, SVG, given proper schema additions. In addition, RDFa is defined so as to be compatible with non-XML HTML.
An HTML document marked up with RDFa constructs is a valid HTML Document. RDFa is about using HTML compatible constructs and extensions to specify RDF 'content'. It is not about embedding RDF/XML syntax into HTML documents.
We note that RDFa makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions:
dc: http://purl.org/dc/elements/1.1/
foaf: http://xmlns.com/foaf/0.1/
cc: http://web.resource.org/cc/
The audience for this document should have a working knowledge of XHTML. Some familiarity with RDF is useful, though the basics can be picked up from reading this Primer. Similarly, the basic XML concepts used in this work, in particular namespaces, can be picked up from reading this Primer.
Editorial note (Ben, 2006-10-21): We may want to reference other RDF documents here, like N3 or Turtle notation (as per LeeF's comments).
Jo blogs about her work, which involves web development.
When Jo has an upcoming conference talk, she simply blogs it at http://jo-blog.example.org/. Her blog also includes her contact information (Jo has a fantastic spam filter, so she is unafraid of publishing her email address):
<html> <head><title>Jo's Blog</title></head> <body> ... <p> I'm giving a talk at the XTech Conference about web widgets, on May 8th at 10am. </p> ... <p class="contactinfo"> My name is Jo Smith. I'm a distinguished web engineer at <a href="http://example.org"> Example.org </a>. You can contact me <a href="mailto:jo@example.org"> via email </a>. </p> ... </body> </html>
This short piece of mark-up is already full of structured data.
The markup describes an event: a talk that Jo is giving. This event starts at 10am on May 8th. A summary of the event is "a talk at the XTech Conference about web widgets." We also have contact information for Jo: she works for the organization Example.org, with job title of "Distinguished Web Engineer." She can be contacted at the email address "jo@example.org."
At the moment, it is very difficult for software — like web browsers and search engines — to make use of this implicit data. We need a standard mechanism to explicitly express it. This is precisely where RDFa comes in.
Jo would like to add some structure to this blog entry so that readers of her blog might be able to add her talk directly to their calendar. RDFa allows her to do just that, using extra attributes (and the occasional element). Since this is a calendar event, Jo will specifically use the iCal vocabulary [ICAL-RDF] to denote the data's structure.
The first step is to reference the iCal vocabulary within the HTML page, so that a parser may know where to look up the vocabulary terms:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"> ...
Then, Jo declares a new event and gives it a name:
<p class="cal:Vevent" id="xtech_conference_talk"> ... </p>
Note how the class attribute is used here to define the type of the data being expressed, exactly as this attribute was initially intended in HTML. Now, Jo wants to make sure that all information within this p element describes the event itself. She adds an additional attribute:
<p class="cal:Vevent" id="xtech_conference_talk" about="#xtech_conference_talk"> ... </p>
Then, inside this event declaration, Jo can set up the event fields, reusing the existing HTML. For example, the event summary can be declared as:
I'm giving <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,
The property attribute on the span element declares a data field that pertains to the closest declared about, in this case the Vevent declared in the p element. Note how the existing rendered content, "a talk at the XTech Conference about web widgets", is the value of this field. Sometimes, this isn't the right thing. Specifically, the start time of the event should be rendered in a readable form, "May 8th at 10am", but should be represented for machines in an easily parsed way, the standard iCal format: 20060508T1000-0500. In this case, the markup needs only a slight modification:
<span property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</span>
In this case, the actual content of the span element, "May 8th at 10am", is ignored for structured data purposes: it has been replaced by the explicit content attribute. The full markup is then:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"> <head><title>Jo's Blog</title></head> <body> ... <p class="cal:Vevent" about="#xtech_conference_talk"> I'm giving <span property="cal:summary"> a talk at the XTech Conference about web widgets </span>, on <span property="cal:dtstart" content="20060508T1000-0500"> May 8th at 10am </span>. </p> ... </body> </html>
The above markup can now be interpreted as a set of RDF triples, the details of which we explain in Section 2.6 The RDF Triples.
Note that Jo could have used any other HTML element, not just span, to carry the structure of her data. In other words, when the structure of the data is already laid out in the HTML using elements such as h1, em, div, etc., Jo can simply add the property attribute, and optionally the content attribute, to indicate the specific structure.
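The rule described in this section (an explicit content attribute takes precedence over the rendered text of the element) can be sketched in a few lines of Python. This is only an illustration built on the standard library's xml.etree.ElementTree, not a description of any actual RDFa parser; the helper names literal_value and extract_literals are hypothetical, and the sketch assumes the markup is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Jo's event paragraph, as marked up in this section.
SNIPPET = """
<p class="cal:Vevent" about="#xtech_conference_talk">
  I'm giving
  <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,
  on
  <span property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</span>.
</p>
"""

def literal_value(element):
    """Return the content attribute if present, otherwise the rendered text."""
    explicit = element.get("content")
    if explicit is not None:
        return explicit
    # itertext() walks the element's own text and that of its descendants.
    return "".join(element.itertext()).strip()

def extract_literals(xml_text):
    """Map each property name found in the markup to its literal value."""
    root = ET.fromstring(xml_text.strip())
    return {
        el.get("property"): literal_value(el)
        for el in root.iter()
        if el.get("property") is not None
    }

fields = extract_literals(SNIPPET)
for name, value in fields.items():
    print(name, "=", value)
```

The summary field takes its value from the text Jo already wrote, while the start time takes the machine-readable value from the content attribute and ignores "May 8th at 10am".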
Now that Jo has published an event using structured data, she realizes there is much data on her blog that she can mark up in the same way. Her contact information, in particular, is an easy target for structured markup with RDFa:
... <p class="contactinfo"> My name is Jo Smith. I'm a distinguished web engineer at <a href="http://example.org"> Example.org </a>. You can contact me <a href="mailto:jo@example.org"> via email </a>. </p> ...
Jo discovers the vCard RDF vocabulary [VCARD-RDF], which she adds to her existing page. Since Jo thinks of vCards as a way to publish her contact information, she uses the prefix contact to designate this vocabulary. Note that, although Jo already imported the iCal vocabulary, adding the vCard vocabulary is just as easy and does not interfere:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#" xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"> ...
Jo then sets up her vCard using RDFa, by deciding that the appropriate p element will be her vCard. She notes, however, that the vCard schema does not require declaring a vCard type. Instead, it is recommended that a vCard refer to a web page that identifies the individual. Jo thus uses RDFa's special attribute about just for this purpose, indicating that all contained HTML pertains to Jo's designated URL. Note how the about attribute is inherited down the DOM tree hierarchy: the nearest ancestor with a declared about applies to declared structured data.
... <p class="contactinfo" about="http://example.org/staff/jo"> ...everything here pertains to http://example.org/staff/jo... </p> ...
Simple enough, Jo realizes. She adds her first vCard fields: name, title, organization and email.
... <p class="contactinfo" about="http://example.org/staff/jo"> My name is <span property="contact:fn"> Jo Smith </span>. I'm a <span property="contact:title"> distinguished web engineer </span> at <a rel="contact:org" href="http://example.org"> Example.org </a>. You can contact me <a rel="contact:email" href="mailto:jo@example.org"> via email </a>. </p> ...
Notice how Jo was able to use the rel attribute directly within the anchor tag for designating her organization and email address. In this case, the rel attribute indicates a relationship between the current URL, designated by about, and the target URL, designated by href. The exact meaning of this relationship is defined by the value of rel. In this case, contact:org indicates the relationship of "vCard organization", while contact:email indicates the relationship of "vCard email".
The rel attribute is naturally paired with the href attribute, much like the property attribute is paired with the content attribute. The astute reader will notice that we have defined what happens when a property attribute is present without a content attribute, but not what happens when a rel attribute is present without its corresponding href. We explore this feature in Section 5 More Complex Structured Data: Social Networking with FOAF.
(The above example slightly simplifies the vCard vocabulary where email is concerned, since vCard technically requires indicating the type of the email. This simplification is for clarity's sake.)
Jo's complete HTML with RDFa is thus:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#" xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"> ... <p class="cal:Vevent" about="#xtech_conference_talk"> I'm giving <span property="cal:summary"> a talk at the XTech Conference about web widgets </span>, on <span property="cal:dtstart" content="20060508T1000-0500"> May 8th at 10am </span>. </p> ... <p class="contactinfo" about="http://example.org/staff/jo"> My name is <span property="contact:fn"> Jo Smith </span>. I'm a <span property="contact:title"> distinguished web engineer </span> at <a rel="contact:org" href="http://example.org"> Example.org </a>. You can contact me <a rel="contact:email" href="mailto:jo@example.org"> via email </a>. </p> ...
Note how, if Jo changes her email address link, her organization, or the title of her talk, the RDFa approach will automatically pick up these changes in the marked up, structured data. The only place where this doesn't happen is when the content attribute must override the rendered content, which is inevitable when the human-rendered data and the machine-readable data must differ.
The RDF triples generated by the above markup are detailed in Section 2.6 The RDF Triples.
What if Jo does not have complete control over the HTML of her blog? For example, she may be using a templating system which makes it particularly difficult to add the vocabularies in the html element at the top of her page without adding them to every page on her site. Or, she may be using a web blogging provider that doesn't allow her to change the header of the page to begin with.
Fortunately, RDFa uses standard XML namespaces, which means that the vocabularies can be imported "locally" to an HTML element. Jo's HTML blog page could express the exact same structured data with the following markup:
<html> ... <p class="cal:Vevent" about="#xtech_conference_talk" xmlns:cal="http://www.w3.org/2002/12/cal/ical#"> I'm giving <span property="cal:summary"> a talk at the XTech Conference about web widgets </span>, on <span property="cal:dtstart" content="20060508T1000-0500"> May 8th at 10am </span>. </p> ... <p class="contactinfo" about="http://example.org/staff/jo" xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"> My name is <span property="contact:fn"> Jo Smith </span>. I'm a <span property="contact:title"> distinguished web engineer </span> at <a rel="contact:org" href="http://example.org"> Example.org </a>. You can contact me <a rel="contact:email" href="mailto:jo@example.org"> via email </a>. </p> ...
Of course, just like in the case of the vocabularies defined on the top-level html tag, more than one vocabulary can be imported into any element. In this case, each p element only needs one vocabulary: the first uses iCal, the second uses vCard. This approach helps with the desired ability to copy-and-paste HTML from one page to another: the closer the namespace declarations to their relevant statements, the easier it is to copy and paste the content.
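To make the scoping of locally declared vocabularies concrete, the following Python sketch expands prefixed names such as cal:summary into full URIs using only the namespace declarations that are in scope at each element. It is a simplified illustration, not a conforming RDFa processor: the function name expand_curies and the sample page are hypothetical, and the sketch relies on the namespace events reported by the standard library's xml.etree.ElementTree.iterparse.

```python
import io
import xml.etree.ElementTree as ET

# Each paragraph declares only the vocabulary it needs.
PAGE = """
<body>
  <p class="cal:Vevent" about="#xtech_conference_talk"
     xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
    <span property="cal:summary">a talk about web widgets</span>
  </p>
  <p class="contactinfo" about="http://example.org/staff/jo"
     xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
    <span property="contact:fn">Jo Smith</span>
    <span property="cal:summary">the cal prefix is not in scope here</span>
  </p>
</body>
"""

def expand_curies(xml_text):
    """Return (attribute, full URI) pairs for prefixed class/property/rel values."""
    in_scope = []   # stack of (prefix, uri) declarations currently in scope
    results = []
    source = io.BytesIO(xml_text.strip().encode("utf-8"))
    events = ("start", "start-ns", "end-ns")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            in_scope.append(payload)   # payload is a (prefix, uri) tuple
        elif event == "end-ns":
            in_scope.pop()             # the declaring element has closed
        else:
            bindings = dict(in_scope)  # later declarations shadow earlier ones
            for attr in ("class", "property", "rel"):
                value = payload.get(attr)
                if value and ":" in value:
                    prefix, local = value.split(":", 1)
                    if prefix in bindings:
                        results.append((attr, bindings[prefix] + local))
    return results

expanded = expand_curies(PAGE)
for attr, uri in expanded:
    print(attr, uri)
```

The last span in the sample uses the cal prefix outside the paragraph that declares it, so the sketch skips it; this is the same locality that makes a paragraph and its declarations safe to copy and paste together.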
In Section 2.2 Publishing An Event, Jo published an event. The RDFa is parsed to generate RDF triples, specifically:
<#xtech_conference_talk> rdf:type cal:Vevent; cal:summary "a talk at the XTech Conference about web widgets"^^XMLLiteral; cal:dtstart "20060508T1000-0500" .
In Section 2.3 Publishing Contact Information, Jo published contact information. The RDFa is parsed to generate the following RDF triples:
<http://example.org/staff/jo> contact:fn "Jo Smith"^^XMLLiteral; contact:title "distinguished web engineer"^^XMLLiteral; contact:org <http://example.org>; contact:email <mailto:jo@example.org>.
Consider a (fictional) photo management web site called Shutr, whose web site is http://www.shutr.net. Users of Shutr can upload their photos at will, annotate them, organize them into albums, and share them with the world. They can choose to keep these photos private, or make them available for public consumption under licensing terms of their choosing.
The primary interface to Shutr is its web site and the HTML it delivers. Since photos are contributed by users with a significant amount of built-in structured data (camera type, exposure, etc.) and additional, explicitly provided data (photo caption, license, photographer's name), Shutr may benefit from using RDF to express this structure.
We explore how Shutr might use RDFa to express RDF right in the HTML it already publishes. We assume an additional XML namespace, shutr, which corresponds to the URI http://www.shutr.net/rdf/shutr#.
The simplest structured data Shutr might want to expose is basic information about a photo album: the creator of the album, the date of creation, and its license. We consider literal properties first, and URI properties second. (We ignore photo-specific data for now, as that involves RDF statements about an image, which is not an HTML document. We will, of course, get back to this soon.)
A literal property is a string of text, e.g. "Ben Adida", a number, e.g. "28", or any other typed, self-contained datum that one might want to express as a property. In RDFa, literal properties are expressed using the property attribute and an optional content attribute.
Consider Mark Birbeck, a user of the Shutr system with username markb, and his latest photo album "Vacation in the South of France." This photo album resides at http://www.shutr.net/user/markb/album/12345. The HTML document presented upon request of that URI includes the following HTML snippet:
<h1>Photo Album #12345: Vacation in the South of France</h1> <h2>created by Mark Birbeck</h2>
Notice how the rendered HTML contains elements of the photo album's structured data. Using RDFa, Shutr can mark up this HTML to indicate these structured data properties without repeating the raw data:
<h1>Photo Album #12345: <span property="dc:title">Vacation in the South of France</span></h1> <h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
An RDFa-aware browser would thus extract the following RDF triples:
<> dc:title "Vacation in the South of France"^^XMLLiteral . <> dc:creator "Mark Birbeck"^^XMLLiteral .
(The ^^XMLLiteral notation, which denotes a datatype, will be explained shortly.)
The span element is actually not required. One can easily add the RDFa attributes to existing HTML elements, without adding a span:
<h1 property="dc:title">Vacation in the South of France</h1> <h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
which yields the same RDF triples, of course. The use of an extra span is helpful when existing HTML markup isn't enough to isolate the rendered content that is relevant to the RDF triple.
A reader who knows about XML datatypes might, at this point in the presentation, wonder what datatype these values will have. Given the above RDFa, "Vacation in the South of France" is an XML Literal. In some cases, this may not be appropriate. Consider an expanded HTML snippet which includes the photo album's creation date:
<h1>Vacation in the South of France</h1> <h2>created by Mark Birbeck on 2006-01-02</h2>
A precise way to augment this HTML with RDFa is:
<h1 property="dc:title">Vacation in the South of France</h1> <h2>created by <span property="dc:creator">Mark Birbeck</span> on <span property="dc:date" type="xsd:date">2006-01-02</span></h2>
which would yield the following triples (note how the default datatype is XMLLiteral, which explains the first example above):
<> dc:title "Vacation in the South of France"^^XMLLiteral . <> dc:creator "Mark Birbeck"^^XMLLiteral . <> dc:date "2006-01-02"^^xsd:date .
Going further, Shutr realizes that 2006-01-02, while a correct xsd:date representation, is not exactly user-friendly. In this case, having the rendered data be the same as the structured data might not be the right answer. Shutr may instead opt for the following RDFa:
<h1 property="dc:title">Vacation in the South of France</h1> <h2>created by <span property="dc:creator">Mark Birbeck</span> on <span property="dc:date" type="xsd:date" content="2006-01-02"> January 2nd, 2006 </span> </h2>
The above HTML will render the date as "January 2nd, 2006" but will yield the exact same triples as above. The use of the content attribute should be limited to cases where the rendered text is not well-enough structured to represent the data.
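The interplay of the property, type, and content attributes described in this section can be sketched as follows. This is a hedged illustration in Python using the standard library's xml.etree.ElementTree; the function name typed_literals is hypothetical, the wrapping div exists only to give the sample a single root element, and, as a simplification, a default XMLLiteral value is reduced to its text rather than kept as markup.

```python
import xml.etree.ElementTree as ET

# The album header from this section, wrapped in a div for parsing.
ALBUM = """
<div>
  <h1 property="dc:title">Vacation in the South of France</h1>
  <h2>created by <span property="dc:creator">Mark Birbeck</span>
    on <span property="dc:date" type="xsd:date" content="2006-01-02">
      January 2nd, 2006
    </span>
  </h2>
</div>
"""

def typed_literals(xml_text):
    """Map each property to a (value, datatype) pair."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for el in root.iter():
        prop = el.get("property")
        if prop is None:
            continue
        # An explicit content attribute overrides the rendered text.
        value = el.get("content")
        if value is None:
            value = "".join(el.itertext()).strip()
        # Without a type attribute, the datatype defaults to XMLLiteral.
        result[prop] = (value, el.get("type", "XMLLiteral"))
    return result

literals = typed_literals(ALBUM)
for prop, (value, datatype) in literals.items():
    print(prop, repr(value), datatype)
```

The date is rendered for people as "January 2nd, 2006", yet the extracted value is the xsd:date form carried by the content attribute.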
A URI property is one that is merely a reference to a web-accessible resource, e.g. an image, a PDF document, or another HTML document, all reachable via the web. In RDFa, URI properties are expressed using the rel and href attributes. The href attribute is the well-understood target of a link, while rel indicates a relationship.
Shutr may want to give its users the ability to license their photos to the world under certain specific conditions. For this purpose, there are numerous existing licenses, including those published by Creative Commons. Thus, if Mark Birbeck chooses to license his vacation album for others to reuse, Shutr might use the following HTML snippet:
This document is licensed under a <a href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
This clickable link has an intended meaning: it is the document's license. Using RDFa can cement that meaning within the HTML itself:
This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
Note the use of the rel attribute to indicate a URI property rather than a textual one. The use of this attribute goes hand in hand with an href attribute within the same element. This href attribute indicates the URI object of the RDF triple. Thus, the above RDFa yields the following triple:
<> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
Compared with other existing RDF mechanisms to indicate Creative Commons licensing (e.g. a parallel RDF/XML file or inline RDF/XML within HTML comments), the RDFa approach provides Creative Commons and Shutr with a significant integrity advantage: the clickable link is the semantic link, and any change to the target will change both the human and machine views. Also, a simple copy-and-paste of the HTML will carry through both the rendered and semantic data.
In both cases, the target URI may provide an HTML document which includes further RDFa statements. The Creative Commons license page, for example, may include RDFa statements about its legal details.
Editorial note (Ben, 2006-10-21): This section is a bit repetitive, since we have chunks of documents in the first section already. We may want to revise it, shorten it, and make it more relevant to the use cases.
The above examples casually swept under the rug the issue of the RDF subject: most of the triples expressed were about the current document representing a photo album. However, not all RDF triples in a given HTML document will be about that document itself. In RDFa, the default subject is the current document, but it can easily be overridden using the about attribute, which we briefly introduced in the very first example.
Shutr may choose to present many photos in a given HTML page. In particular, at the URI http://www.shutr.net/user/markb/album/12345, all of the album's photos will appear inline. Structured data about each photo can be included simply by specifying an about attribute:
<ul> <li> <img src="/user/markb/photo/23456" />, <span about="/user/markb/photo/23456" property="dc:title"> Sunset in Nice </span> </li> <li> <img src="/user/markb/photo/34567" />, <span about="/user/markb/photo/34567" property="dc:title"> W3C Meeting in Mandelieu </span> </li> </ul>
The above RDFa yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral . </user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
This same approach applies to statements with URI objects. For example, each photo in the album has a creator and may have its own usage license.
<ul> <li> <img src="/user/markb/photo/23456" />, <span about="/user/markb/photo/23456" property="dc:title"> Sunset in Nice </span> taken by photographer <a about="/user/markb/photo/23456" property="dc:creator" href="/user/markb"> Mark Birbeck </a>, licensed under a <a about="/user/markb/photo/23456" rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>. </li> <li> <img src="/user/markb/photo/34567" /> <span about="/user/markb/photo/34567" property="dc:title"> W3C Meeting in Mandelieu </span> taken by photographer <a about="/user/markb/photo/34567" property="dc:creator" href="/user/stevenp"> Steven Pemberton </a>, licensed under a <a about="/user/markb/photo/34567" rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/"> Creative Commons Commercial License </a>. </li> </ul>
This yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral . </user/markb/photo/23456> dc:creator "Mark Birbeck"^^XMLLiteral . </user/markb/photo/23456> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> . </user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral . </user/markb/photo/34567> dc:creator "Steven Pemberton"^^XMLLiteral . </user/markb/photo/34567> cc:license <http://creativecommons.org/licenses/by/2.5/> .
At this point, Shutr might begin to worry about the fast-growing size of its HTML document, given that the photo's URI must be repeated in the about attribute for every RDF property expressed. To address this issue, RDFa allows the value of this attribute to be inherited from a parent or ancestor element. In other words, if an element carries a rel or property attribute, but no about attribute, an RDFa-aware browser will determine the subject of the RDF statement by navigating up the parent hierarchy of that element until it finds an about attribute, or until it gets to the root element, at which point the default is about="".
Thus, the markup for the above example can be simplified to:
<ul> <li about="/user/markb/photo/23456"> <img src="/user/markb/photo/23456" /> <span property="dc:title"> Sunset in Nice </span>, taken by photographer <a property="dc:creator" href="/user/markb/"> Mark Birbeck </a>, licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>. </li> <li about="/user/markb/photo/34567"> <img src="/user/markb/photo/34567" /> <span property="dc:title"> W3C Meeting in Mandelieu </span>, taken by photographer <a property="dc:creator" href="/user/stevenp"> Steven Pemberton </a> licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/"> Creative Commons Commercial License </a>. </li> </ul>
which yields the same triples as the previous example, though, in this case, one can easily see the parallel to the corresponding N3 shorthand:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral ; dc:creator "Mark Birbeck"^^XMLLiteral ; cc:license <http://creativecommons.org/licenses/by-nc/2.5/> . </user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral ; dc:creator "Steven Pemberton"^^XMLLiteral ; cc:license <http://creativecommons.org/licenses/by/2.5/> .
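The inheritance rule for about can be sketched in Python as a walk up the element tree. This is a minimal illustration under stated assumptions, not the behavior of any particular RDFa implementation: the names statements and subject_of are hypothetical, the sample markup is abbreviated, and the sketch deliberately ignores the special cases covered elsewhere in this Primer (link and meta, and rel without href).

```python
import xml.etree.ElementTree as ET

# Abbreviated album page: one statement about the page, two about a photo.
ALBUM_PAGE = """
<div>
  <h1 property="dc:title">Vacation in the South of France</h1>
  <ul>
    <li about="/user/markb/photo/23456">
      <span property="dc:title">Sunset in Nice</span>,
      licensed under a
      <a rel="cc:license"
         href="http://creativecommons.org/licenses/by-nc/2.5/">license</a>.
    </li>
  </ul>
</div>
"""

def statements(xml_text):
    """Return (subject, predicate) pairs, resolving subjects by inheritance."""
    root = ET.fromstring(xml_text.strip())
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    def subject_of(element):
        node = element
        while node is not None:
            if node.get("about") is not None:
                return node.get("about")
            node = parent_of.get(node)   # None once we pass the root
        return ""                        # default subject: the current document

    found = []
    for el in root.iter():
        for attr in ("property", "rel"):
            if el.get(attr) is not None:
                found.append((subject_of(el), el.get(attr)))
    return found

pairs = statements(ALBUM_PAGE)
for subject, predicate in pairs:
    print("<%s> %s" % (subject, predicate))
```

The title in the h1 element has no about attribute on any ancestor, so its subject falls back to the empty string, which stands for the current document; the statements inside the li element pick up the photo's URI from that element.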
While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is the more likely scenario. For this purpose, RDFa provides ways to make structured data statements about chunks of documents using natural HTML constructs.
Consider the page http://www.shutr.net/user/markb/cameras, which, as its URI implies, lists Mark Birbeck's cameras. Its HTML includes:
<ul> <li id="nikon_d200"> Nikon D200, purchased on 2004-06-01. </li> <li id="canon_sd550"> Canon Powershot SD550, purchased on 2005-08-01. </li> </ul>
and the photo page will then include information about which camera was used to take each photo:
<ul> <li> <img src="/user/markb/photo/23456" /> ... using the <a href="/user/markb/cameras#nikon_d200">Nikon D200</a>, ... </li> ... </ul>
The RDFa syntax for formally specifying the relationship is exactly the same as before, as expected:
<ul> <li about="/user/markb/photo/23456"> <img src="/user/markb/photo/23456" /> ... using the <a rel="shutr:takenWith" href="/user/markb/cameras#nikon_d200">Nikon D200</a>, ... </li> ... </ul>
which corresponds to:
</user/markb/photo/23456> shutr:takenWith </user/markb/cameras#nikon_d200> .
Then, the HTML snippet at http://www.shutr.net/user/markb/cameras is:
<ul> <li id="nikon_d200" about="#nikon_d200"> <span property="dc:title" type="xsd:string"> Nikon D200 </span> purchased on <span property="dc:date" type="xsd:date"> 2004-06-01 </span> </li> <li id="canon_sd550" about="#canon_sd550"> <span property="dc:title" type="xsd:string"> Canon Powershot SD550 </span> purchased on <span property="dc:date" type="xsd:date"> 2005-08-01 </span> </li> </ul>
which then yields the following triples:
<#nikon_d200> dc:title "Nikon D200"^^xsd:string ; dc:date "2004-06-01"^^xsd:date . <#canon_sd550> dc:title "Canon Powershot SD550"^^xsd:string ; dc:date "2005-08-01"^^xsd:date .
One immediately wonders whether the redundancy between the about and id attributes can be simplified. Partly for this purpose, RDFa includes the elements link and meta. These elements are similar in function to the a and span elements except for one special behavior: they only apply to their immediate parent element, even if an ancestor element bears an alternate about attribute.
<ul> <li id="nikon_d200"> <meta property="dc:title" type="xsd:string"> Nikon D200 </meta> purchased on <meta property="dc:date" type="xsd:date"> 2004-06-01 </meta> </li> <li id="canon_sd550"> <meta property="dc:title" type="xsd:string"> Canon Powershot SD550 </meta> purchased on <meta property="dc:date" type="xsd:date"> 2005-08-01 </meta> </li> </ul>
One might now wonder how meta and link behave when their parent element doesn't have an id or about attribute. The result of such syntax is an RDF bnode, an advanced topic which we skip in this Primer.
One popular RDF vocabulary is FOAF [FOAF], which provides structure for social networking and personal information. FOAF is particularly interesting to consider because it provides deeper structure than the examples provided so far: a FOAF person has an office, which has an address, which has a street, city, zip code, and country. So far, we have only explained how to define structure "one-level deep."
Consider, specifically, that Tim Berners-Lee is encouraging folks to publish a FOAF file. Let's express (a portion of) Tim's FOAF file using RDFa. We start with a portion of Tim's homepage:
<dl> <dt>Email</dt> <dd>timbl@w3.org</dd> <dt>Address</dt> <dd> 77 Massachusetts Ave.<br /> MIT Room 32-G524<br /> Cambridge MA 02139<br /> USA </dd> <dt>Phone</dt> <dd>+1 (617) 253 5702</dd> <dt>Fax:</dt> <dd>+1 (617) 258 5999</dd> </dl>
We can easily mark up the "one-layer deep" structure, specifically the email, phone, and fax fields with properties foaf:mbox, foaf:phone, and foaf:fax:
<dl class="foaf:Person" about="#card"> <dt>Email</dt> <dd property="foaf:mbox">timbl@w3.org</dd> ... <dt>Phone</dt> <dd property="foaf:phone">+1 (617) 253 5702</dd> <dt>Fax:</dt> <dd property="foaf:fax">+1 (617) 258 5999</dd> </dl>
Now, we need to express the address information in relation to Tim, as well as the address's properties, e.g. street address, city, state, etc. Recall that, when referencing another named resource, we've used the rel attribute. For example, to describe the licensing of a document, we've used the markup:
This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/"> Creative Commons License </a>
What we need to do here is describe a relationship of foaf:address between Tim and some unnamed node of data, off which we want to hang additional properties. Thus, we use the rel attribute again, this time without a corresponding href.
... <dt>Address</dt> <dd rel="foaf:address"> 77 Massachusetts Ave.<br /> MIT Room 32-G524<br /> Cambridge MA 02139<br /> USA </dd> ... </dl>
The HTML element on which this rel is expressed, in this case dd, then represents a blank node (in RDF terminology) that is the object of the foaf:address relationship. In addition, the subject of all contained RDFa statements is transparently set to be this blank node, as if there were an implicit about. This then allows the following markup to say exactly what we mean:
<dl class="foaf:Person" about="#card">
  ...
  <dt>Address</dt>
  <dd rel="foaf:address">
    <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
    <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
    <span property="foaf:city">Cambridge</span> MA 02139<br />
    <span property="foaf:country">USA</span>
  </dd>
  ...
</dl>
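The rule just described (a rel without an href introduces a blank node, which then becomes the subject of everything the element contains) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the normative algorithm: the names extract and walk and the blank node labels _:b0, _:b1, ... are hypothetical, only the about, class, rel, href, and property attributes are handled, and a class is turned into an rdf:type triple only when it appears alongside about.

```python
import itertools
import xml.etree.ElementTree as ET

# The address markup from the text, with the "..." elisions removed.
ADDRESS_MARKUP = """
<dl class="foaf:Person" about="#card">
  <dt>Address</dt>
  <dd rel="foaf:address">
    <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
    <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
    <span property="foaf:city">Cambridge</span> MA 02139<br />
    <span property="foaf:country">USA</span>
  </dd>
</dl>
"""


def extract(markup):
    """Walk the element tree and collect (subject, predicate, object)."""
    counter = itertools.count()
    triples = []

    def walk(element, subject):
        about = element.get("about")
        if about is not None:
            # An explicit about sets the subject for this subtree.
            subject = about
            if element.get("class"):
                triples.append((subject, "rdf:type", element.get("class")))
        rel = element.get("rel")
        if rel is not None and element.get("href") is None:
            # rel without href: mint a blank node as the object, then
            # use it as the subject of all contained statements.
            node = f"_:b{next(counter)}"
            triples.append((subject, rel, node))
            subject = node
        prop = element.get("property")
        if prop is not None:
            literal = "".join(element.itertext()).strip()
            triples.append((subject, prop, literal))
        for child in element:
            walk(child, subject)

    # Assumption: "" stands for the current document as default subject.
    walk(ET.fromstring(markup), "")
    return triples


if __name__ == "__main__":
    for triple in extract(ADDRESS_MARKUP):
        print(triple)
```

Because the subject is passed down the recursion, the same sketch handles a rel nested inside another rel without any extra code: each level simply mints its own blank node.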
This layering of structured data easily extends to multiple layers. Consider what Tim might do if he were to list both his home and office addresses. The properties foaf:office and foaf:home each relate a foaf:Person to a location, and each location has an address. Thus, the markup becomes, quite naturally:
<dl class="foaf:Person" about="#card">
  ...
  <dt>Office Address</dt>
  <dd rel="foaf:office">
    <div rel="foaf:address">
      <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
      <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
      <span property="foaf:city">Cambridge</span> MA 02139<br />
      <span property="foaf:country">USA</span>
    </div>
  </dd>
  <dt>Home Address</dt>
  <dd rel="foaf:home">
    <div rel="foaf:address">
      <span property="foaf:address_line_1">1 Web Way</span><br />
      <span property="foaf:city">Cambridge</span> MA 02139<br />
      <span property="foaf:country">USA</span>
    </div>
  </dd>
  ...
</dl>
Using this technique, it is relatively easy and natural to express fairly extensive and complex structured data.
When interpreted as RDF, the use of the rel attribute without a corresponding href creates a new RDF blank node. Specifically, in the first example where Tim publishes a single address, the triples are:
<#card> rdf:type foaf:Person .
<#card> foaf:address _:dd0 .
_:dd0 foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral .
_:dd0 foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral .
_:dd0 foaf:city "Cambridge"^^XMLLiteral .
_:dd0 foaf:country "USA"^^XMLLiteral .
In the case of the multiple addresses, the triples become:
<#card> rdf:type foaf:Person .
<#card> foaf:office _:dd0 .
<#card> foaf:home _:dd1 .
_:dd0 foaf:address _:div0 .
_:div0 foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral .
_:div0 foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral .
_:div0 foaf:city "Cambridge"^^XMLLiteral .
_:div0 foaf:country "USA"^^XMLLiteral .
_:dd1 foaf:address _:div1 .
_:div1 foaf:address_line_1 "1 Web Way"^^XMLLiteral .
_:div1 foaf:city "Cambridge"^^XMLLiteral .
_:div1 foaf:country "USA"^^XMLLiteral .
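One way to see why the intermediate blank nodes matter is to follow a chain of predicates from Tim's card through them. The Python sketch below is only an illustration: the helper name follow is hypothetical, the triples are held as plain tuples rather than in an RDF library, literal values are copied from the markup, and datatypes are ignored.

```python
# Triples for the two-address example, as (subject, predicate, object).
# Literal values are taken from the markup; datatypes are omitted.
TRIPLES = [
    ("#card", "rdf:type", "foaf:Person"),
    ("#card", "foaf:office", "_:dd0"),
    ("#card", "foaf:home", "_:dd1"),
    ("_:dd0", "foaf:address", "_:div0"),
    ("_:div0", "foaf:address_line_1", "77 Massachusetts Ave."),
    ("_:div0", "foaf:address_line_2", "MIT Room 32-G524"),
    ("_:div0", "foaf:city", "Cambridge"),
    ("_:div0", "foaf:country", "USA"),
    ("_:dd1", "foaf:address", "_:div1"),
    ("_:div1", "foaf:address_line_1", "1 Web Way"),
    ("_:div1", "foaf:city", "Cambridge"),
    ("_:div1", "foaf:country", "USA"),
]


def follow(triples, start, path):
    """Follow a list of predicates from start, one hop per predicate,
    and return every object reached at the end of the path."""
    nodes = [start]
    for predicate in path:
        nodes = [o for (s, p, o) in triples if p == predicate and s in nodes]
    return nodes


if __name__ == "__main__":
    home_path = ["foaf:home", "foaf:address", "foaf:address_line_1"]
    print(follow(TRIPLES, "#card", home_path))
```

The blank node labels never appear in the path itself; a consumer only needs the starting URI and the predicate names, which is the sense in which the internal structure stays private.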
In most cases, it is neither useful nor desired to make the internal components of the structured data accessible by the outside world. For example, there is likely no good reason for Tim to give his office address a URI which other RDF statements might reference. Simply handing out his FOAF URI is enough.
However, in some cases, it may in fact be useful to name all components yet continue to use the rel without an href to designate the structured relationship. RDFa allows this as naturally as possible: if the element on which the rel is added has an id or an about, then the value of that attribute becomes the name of the node, and all triples are appropriately updated. The about attribute takes precedence over id, since it is an explicit RDFa statement.
For example, if Tim chose the following markup:
<dl class="foaf:Person" about="#card">
  ...
  <dt>Address</dt>
  <dd rel="foaf:address" id="address">
    <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
    <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
    <span property="foaf:city">Cambridge</span> MA 02139<br />
    <span property="foaf:country">USA</span>
  </dd>
  ...
</dl>
the triples would then become:
<#card> rdf:type foaf:Person .
<#card> foaf:address <#address> .
<#address> foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral .
<#address> foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral .
<#address> foaf:city "Cambridge"^^XMLLiteral .
<#address> foaf:country "USA"^^XMLLiteral .
Note that this doesn't change anything significant about the structured data itself, only that the address is now addressable by other structured data statements.
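The naming rule for a rel without an href can be summarized as a small decision function. The Python sketch below is a hedged illustration: the name node_name and its parameters are hypothetical, and the mapping of id="address" to the fragment #address is inferred from the example triples above rather than taken from the syntax specification.

```python
def node_name(attributes, fresh_blank_node):
    """Pick the name of the node introduced by a rel without an href.

    attributes: dict of the element's attributes.
    fresh_blank_node: label to use when the element is unnamed.
    """
    if "about" in attributes:
        # about is an explicit RDFa statement, so it wins over id.
        return attributes["about"]
    if "id" in attributes:
        # Assumption: an id names the node as a fragment of this document.
        return "#" + attributes["id"]
    # Neither attribute is present: the node stays anonymous.
    return fresh_blank_node


if __name__ == "__main__":
    print(node_name({"rel": "foaf:address", "id": "address"}, "_:dd0"))
    print(node_name({"rel": "foaf:address"}, "_:dd0"))
```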
This document is the work of the RDF-in-HTML Task Force, including (in alphabetical order) Ben Adida, Mark Birbeck, Jeremy Carroll, Steven Pemberton, and Ralph Swick. This work would not have been possible without the help of the Semantic Web Deployment and Best Practices Working Group, in particular chairs Guus Schreiber and David Wood. Earlier versions of this document were officially reviewed by Gary Ng and David Booth, both of whom provided insightful comments that significantly improved the work.