W3C

RDFa Primer 1.0

Embedding RDF in XHTML

W3C Working Draft 12 March 2007

This version:
http://www.w3.org/TR/2007/WD-xhtml-rdfa-primer-20070312/
Latest version:
http://www.w3.org/TR/xhtml-rdfa-primer
Previous version:
http://www.w3.org/TR/2006/WD-xhtml-rdfa-primer-20060516/
Editors:
Ben Adida, Creative Commons <ben@adida.net>
Mark Birbeck, x-port.net Ltd. <mark.birbeck@x-port.net>

Abstract

Current web pages, written in HTML, contain significant inherent structured data. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing.

RDFa is a syntax for expressing this structured data in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.

This document is an introduction to RDFa. A more detailed syntax specification is being produced.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is joint work of the W3C Semantic Web Deployment Working Group [SWD-WG] and the former W3C HTML Working Group, now called the W3C XHTML2 Working Group [XHTML2-WG]. This work is part of both the W3C Semantic Web Activity and the HTML Activity. The two Working Groups expect to advance this work to Recommendation Status.

Comments on this Working Draft are welcome and may be sent to public-rdf-in-xhtml-tf@w3.org; please include the text "comment" in the subject line. All messages received at this address are viewable in a public archive.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Changes

Changes since the previous version include:

Table of Contents

1 Purpose of RDFa and Preliminaries
    1.1 Audience
2 A First Scenario: Publishing Events and Contacts
    2.1 The Basic HTML
    2.2 Publishing An Event
    2.3 Publishing Contact Information
    2.4 The Complete HTML with RDFa
    2.5 RDFa with Limited HTML control
    2.6 The RDF Triples
3 A Second Scenario: Publishing Photos
    3.1 The Shutr Photo Management System
    3.2 Literal Properties
    3.3 URI Properties
4 Beyond the Current Document
    4.1 Qualifying Other Documents
    4.2 Inheriting about
    4.3 Qualifying Chunks of Documents
5 More Complex Structured Data: Social Networking with FOAF
    5.1 Two Layers of Structured Data
    5.2 Additional Layers
    5.3 The RDF Triples
    5.4 Naming the Nodes
6 Bibliography
7 Acknowledgments


1 Purpose of RDFa and Preliminaries

Current web pages, written in HTML, contain significant inherent structured data. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing.

RDFa is a syntax that expresses this structured data using a set of elements and attributes that embed RDF in HTML. An important goal of RDFa is to achieve this RDF embedding without repeating existing HTML content when that content is the structured data. RDFa is designed to work with different XML dialects, e.g. XHTML1, SVG, etc., given proper schema additions. In addition, RDFa is defined so as to be compatible with non-XML HTML.

An XHTML document marked up with RDFa constructs should validate, and a non-XML HTML document marked up with RDFa remains compliant. RDFa uses existing HTML constructs and HTML-compatible extensions to specify RDF 'content'. It is not about embedding RDF/XML syntax into HTML documents.

We note that RDFa makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions.

1.1 Audience

The audience for this document should have a working knowledge of XHTML. Some familiarity with RDF is useful, though the basics can be picked up from reading this Primer. Similarly, the basic XML concepts used in this work — in particular namespaces — can be picked up from reading this Primer.

2 A First Scenario: Publishing Events and Contacts

Jo blogs about her work, which involves web development.

2.1 The Basic HTML

Jo has an upcoming talk at the XTech Conference, on May 8th at 10am, where she will be discussing "web widgets". She blogs an announcement of her talk at http://jo-blog.example.org/. Her blog also includes her contact information (Jo has a fantastic spam filter, so she is unafraid of publishing her email address):

<html>
    <head><title>Jo's Blog</title></head>
    <body>
...
    <p>
        I'm giving a talk at the XTech Conference about web widgets, on May 8th at 10am.
    </p>
...
    <p class="contactinfo">
        My name is Jo Smith. I'm a distinguished web engineer
        at
        <a href="http://example.org">
            Example.org
        </a>.
        You can contact me
        <a href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...
    </body>
</html>

This short piece of mark-up is already full of structured data.

The markup describes an event: a talk that Jo is giving. This event starts at 10am on May 8th. A summary of the event is "a talk at XTech 2007 on web widgets." We also have contact information for Jo: she works for the organization Example.org, with job title of "Distinguished Web Engineer." She can be contacted at the email address "jo@example.org."

At the moment, it is very difficult for software like web browsers and search engines to make use of this implicit data. We need a standard mechanism to explicitly express it. This is precisely where RDFa comes in.

2.2 Publishing An Event

Jo would like to add some structure to this blog entry so that readers of her blog might be able to add her talk directly to their calendar. RDFa allows her to do just that, using extra attributes. Since this is a calendar event, Jo will specifically use the iCal vocabulary [ICAL-RDF] to denote the data's structure.

The first step is to reference the iCal vocabulary within the HTML page, so that a parser may know where to look up the vocabulary terms:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
...

Then, Jo declares a new event and gives it a name:

   <p class="cal:Vevent" id="xtech_conference_talk">
        ...
   </p> 

Note how the class attribute is used here to define the type of the data being expressed, exactly as this attribute was initially intended in HTML. (If Jo wanted to declare multiple types, she could include more than one value in the class attribute with space separation.) Now, Jo wants to make sure that all information within this p describes the event itself. She adds an additional attribute:

   <p class="cal:Vevent" id="xtech_conference_talk" about="#xtech_conference_talk">
        ...
   </p> 

Then, inside this event declaration, Jo can set up the event fields, reusing the existing HTML. For example, the event summary can be declared as:

       I'm giving <span property="cal:summary">a talk at the XTech Conference about web widgets</span>,

The property attribute on the span element declares a data field that pertains to the closest declared about, in this case the Vevent declared in the p. Note how the existing rendered content, "a talk at the XTech Conference about web widgets", is the value of this field. Sometimes, this isn't the right thing. Specifically, the start time of the event should be rendered nicely — "May 8th" — but should likely be represented in an easy, machine-parsable way, the standard iCal format: 20070508T1000+0200. In this case, the markup needs only a slight modification:

       <span property="cal:dtstart" content="20070508T1000+0200">May 8th at 10am</span>

In this case, the actual content of the span element, "May 8th at 10am", is ignored for structured data purposes: it has been replaced by the explicit content attribute. The full markup is then:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
    <head><title>Jo's Blog</title></head>
    <body>
...
    <p class="cal:Vevent" about="#xtech_conference_talk">
        I'm giving
        <span property="cal:summary">
            a talk at the XTech Conference about web widgets
        </span>,
        on 
        <span property="cal:dtstart" content="20070508T1000+0200">
            May 8th at 10am
        </span>.
    </p>
...
    </body>
</html>

The above markup can now be interpreted as a set of RDF triples, the details of which we explain in Section 2.6 The RDF Triples.

Note that Jo could have used any other HTML element, not just span, to carry the structure of her data. In other words, when the structure of the data is already laid out in the HTML using elements such as h1, em, div, etc., Jo can simply add the property attribute, and optionally the content attribute, to indicate the specific structure.

2.3 Publishing Contact Information

Now that Jo has published an event using structured data, she realizes there is much data on her blog that she can mark up in the same way. Her contact information, in particular, is an easy target for structured markup with RDFa:

...
    <p class="contactinfo">
        My name is Jo Smith. I'm a distinguished web engineer
        at
        <a href="http://example.org">
            Example.org
        </a>.
        You can contact me 
        <a href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Jo discovers the vCard RDF vocabulary [VCARD-RDF], which she adds to her existing page. Since Jo thinks of vCards as a way to publish her contact information, she uses the prefix contact to designate this vocabulary. Note that, although Jo already imported the iCal vocabulary, adding the vCard vocabulary is just as easy and does not interfere:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
...

Jo then sets up her vCard using RDFa, by deciding that the appropriate p will be her vCard. She notes, however, that the vCard schema does not require declaring a vCard type. Instead, it is recommended that a vCard refer to a web page that identifies the individual. Jo thus uses RDFa's special attribute about for just this purpose, indicating that all contained HTML pertains to Jo's designated URL. Note how the about attribute is inherited from parent elements in the HTML: the about attribute on the nearest ancestor applies to declared structured data.

...
    <p class="contactinfo" about="http://example.org/staff/jo">
        ...everything here pertains to http://example.org/staff/jo...
    </p>
...

"Simple enough!" Jo realizes. She adds her first vCard fields: name, title, organization and email.

...
    <p class="contactinfo" about="http://example.org/staff/jo">
        My name is
        <span property="contact:fn">
            Jo Smith
        </span>.
        I'm a
        <span property="contact:title">
            distinguished web engineer
        </span>
        at
        <a rel="contact:org" href="http://example.org">
            Example.org
        </a>.
        You can contact me
        <a rel="contact:email" href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Notice how Jo was able to use the rel attribute directly within the anchor tag for designating her organization and email address. In this case, the rel indicates a relationship between the current URL, designated by about, and the target URL, designated by href. The exact meaning of this relationship is defined by the rel. In this case, contact:org indicates the relationship of "vCard organization", while contact:email indicates the relationship of "vCard email".

The rel attribute is naturally paired with the href attribute, much like the property attribute is paired with the content attribute. The astute reader will notice that we have defined what happens when a property attribute is present without a content attribute, but not what happens when a rel attribute is present without its corresponding href. We explore this feature in Section 5 More Complex Structured Data: Social Networking with FOAF.

(The above example slightly simplifies the vCard vocabulary where email is concerned, since vCard technically requires indicating the type of the email. This simplification is for clarity's sake.)
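As a rough illustration of the rel and href pairing described above, the following Python sketch reads the contact paragraph and emits one triple per link. It is a reading aid under stated assumptions, not an RDFa parser: the function name link_triples is hypothetical, the subject is simply taken from the about attribute of the enclosing element, and elements carrying property are ignored.

```python
import xml.etree.ElementTree as ET

# Jo's contact paragraph from this section, slightly compacted.
MARKUP = """
<p class="contactinfo" about="http://example.org/staff/jo">
    My name is <span property="contact:fn">Jo Smith</span>.
    I'm a <span property="contact:title">distinguished web engineer</span>
    at <a rel="contact:org" href="http://example.org">Example.org</a>.
    You can contact me
    <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
</p>
"""

def link_triples(container):
    """Return (subject, predicate, object) for each rel/href pair in container.

    Hypothetical helper: assumes the subject is the container's about value.
    """
    subject = container.get("about")
    triples = []
    for element in container.iter():
        rel, href = element.get("rel"), element.get("href")
        if rel is not None and href is not None:
            # rel names the relationship, href is the URI object.
            triples.append((subject, rel, href))
    return triples

contact_triples = link_triples(ET.fromstring(MARKUP))
for s, p, o in contact_triples:
    print(f"<{s}> {p} <{o}> .")
```

The two statements printed correspond to the contact:org and contact:email relationships discussed above.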

2.4 The Complete HTML with RDFa

Jo's complete HTML with RDFa is thus:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
...
    <p class="cal:Vevent" about="#xtech_conference_talk">
        I'm giving
        <span property="cal:summary">
            a talk at the XTech Conference about web widgets
        </span>,
        on
        <span property="cal:dtstart" content="20070508T1000+0200">
            May 8th at 10am
        </span>.
    </p>
...
    <p class="contactinfo" about="http://example.org/staff/jo">
        My name is
        <span property="contact:fn">
            Jo Smith
        </span>.
        I'm a
        <span property="contact:title">
            distinguished web engineer
        </span>
        at
        <a rel="contact:org" href="http://example.org">
            Example.org
        </a>.
        You can contact me 
        <a rel="contact:email" href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Note how, if Jo changes her email address link, her organization, or the title of her talk, the RDFa approach will automatically pick up these changes in the marked up, structured data. The only place where this doesn't happen is where the content attribute must override the rendered content, which is inevitable when the human-rendered data and the machine-readable data must differ.

The RDF triples generated by the above markup are detailed in Section 2.6 The RDF Triples.

2.5 RDFa with Limited HTML control

What if Jo does not have complete control over the HTML of her blog? For example, she may be using a templating system which makes it particularly difficult to add the vocabularies in the html element at the top of her page without adding them to every page on her site. Or, she may be using a blogging provider that doesn't allow her to change the header of the page to begin with.

Fortunately, RDFa uses standard XML namespaces, which means that the vocabularies can be imported "locally" to an HTML element. Jo's HTML blog page could express the exact same structured data with the following markup:

<html>
...
    <p class="cal:Vevent" about="#xtech_conference_talk"
       xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
        I'm giving
        <span property="cal:summary">
            a talk at the XTech Conference about web widgets
        </span>,
        on
        <span property="cal:dtstart" content="20070508T1000+0200">
            May 8th at 10am
        </span>.
    </p>
...
    <p class="contactinfo" about="http://example.org/staff/jo"
       xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
        My name is
        <span property="contact:fn">
            Jo Smith
        </span>.
        I'm a
        <span property="contact:title">
            distinguished web engineer
        </span>
        at
        <a rel="contact:org" href="http://example.org">
            Example.org
        </a>.
        You can contact me 
        <a rel="contact:email" href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Of course, just like in the case of the vocabularies defined on the top-level html tag, more than one vocabulary can be imported into any element. In this case, each p only needs one vocabulary: the first uses iCal, the second uses vCard. This approach helps with the desired ability to copy-and-paste HTML from one page to another: the closer the namespace declarations to their relevant statements, the easier it is to copy and paste the content.
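The effect of local namespace declarations can be sketched in Python as a lookup through nested scopes. This is only an illustration of ordinary XML namespace scoping: the function name expand_qname and the representation of scopes as a list of dictionaries (innermost element first) are assumptions made for this example.

```python
def expand_qname(qname, scopes):
    """Expand prefix:local using the innermost scope that declares the prefix.

    scopes is a list of {prefix: namespace URI} dicts, innermost element first.
    """
    prefix, _, local = qname.partition(":")
    for declarations in scopes:
        if prefix in declarations:
            return declarations[prefix] + local
    raise KeyError(f"undeclared prefix: {prefix}")

# Declarations visible inside the first p element: its own xmlns:cal,
# then the html element, which declares nothing in this variant of the page.
event_scopes = [{"cal": "http://www.w3.org/2002/12/cal/ical#"}, {}]

summary_uri = expand_qname("cal:summary", event_scopes)
print(summary_uri)
```

Because the contact prefix is declared only on the second p, looking it up from inside the first p fails in this sketch, which is why a copied fragment should carry its own declarations.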

2.6 The RDF Triples

RDFa is parsed to generate RDF triples, which we denote using N3 notation [N3]. URIs are written using angle brackets, e.g. <http://example.org/foo/bar>, literals are written in quotation marks, e.g. "a talk", and QNames are written directly, e.g. cal:summary.

In Section 2.2 Publishing An Event, Jo published an event. The RDF triples extracted from her markup are:

<http://jo-blog.example.org/blog/?p=123#xtech_conference_talk>
                         rdf:type cal:Vevent; 
                         cal:summary "a talk at the XTech Conference about web widgets"^^XMLLiteral;
                         cal:dtstart "20070508T1000+0200" .

In Section 2.3 Publishing Contact Information, Jo published contact information. The RDFa is parsed to generate the following RDF triples:

<http://example.org/staff/jo>
               contact:fn "Jo Smith"^^XMLLiteral;
               contact:title "distinguished web engineer"^^XMLLiteral;
               contact:org <http://example.org>;
               contact:email <mailto:jo@example.org>.
        

(The ^^XMLLiteral notation, which denotes a datatype, will be explained shortly.)

3 A Second Scenario: Publishing Photos

3.1 The Shutr Photo Management System

Consider a (fictional) photo management web site called Shutr, whose web site is http://www.shutr.net. Users of Shutr can upload their photos at will, annotate them, organize them into albums, and share them with the world. They can choose to keep these photos private, or make them available for public consumption under licensing terms of their choosing.

The primary interface to Shutr is its web site and the HTML it delivers. Since photos are contributed by users with a significant amount of built-in structured data (camera type, exposure, etc.) and additional, explicitly provided data (photo caption, license, photographer's name), Shutr may benefit from using RDF to express this structure.

We explore how Shutr might use RDFa to express RDF right in the HTML it already publishes. We assume an additional XML namespace, shutr, which corresponds to URI http://www.shutr.net/rdf/shutr#.

The simplest structured data Shutr might want to expose is basic information about a photo album: the creator of the album, the date of creation, and its license. We consider literal properties first, and URI properties second. (We ignore photo-specific data for now, as that involves RDF statements about an image, which is not an HTML document. We will, of course, get back to this soon.)

3.2 Literal Properties

A literal property is a string of text, e.g. "Ben Adida", a number, e.g. "28", or any other typed, self-contained datum that one might want to express as a property. In RDFa, literal properties are expressed using the property attribute and an optional content attribute.

Consider Mark Birbeck, a user of the Shutr system with username markb, and his latest photo album "Vacation in the South of France." This photo album resides at http://www.shutr.net/user/markb/album/12345. The HTML document presented upon request of that URI includes the following HTML snippet:

<h1>Photo Album #12345: Vacation in the South of France</h1>
<h2>created by Mark Birbeck</h2>

Notice how the rendered HTML contains elements of the photo album's structured data. Using RDFa, Shutr can mark up this HTML to indicate these structured data properties without repeating the raw data:

<h1>Photo Album #12345: <span property="dc:title">Vacation in the South of France</span></h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>

An RDFa-aware browser would thus extract the following RDF triples:

<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .

When the existing HTML elements already delineate the exact structure, adding a new span element is not required. One can easily add the RDFa attributes to existing HTML elements:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>

which yields the same RDF triples, of course. The use of an extra span is helpful when existing HTML markup isn't enough to isolate the rendered content that is relevant to the RDF triple.

A reader who knows about XML datatypes might, at this point in the presentation, wonder what datatype these values will have. Given the above RDFa, "Vacation in the South of France" is an XML Literal. In some cases, this may not be appropriate. Consider an expanded HTML snippet which includes the photo album's creation date:

<h1>Vacation in the South of France</h1>
<h2>created by Mark Birbeck on 2007-01-02</h2>

A precise way to augment this HTML with RDFa is:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span>
    on <span property="dc:date" datatype="xsd:date">2007-01-02</span></h2>

which would yield the following triples (note how the default datatype is XMLLiteral, which explains the first example above):

<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
<> dc:date "2007-01-02"^^xsd:date .

Going further, Shutr realizes that 2007-01-02, while a correct xsd:date representation, is not exactly user-friendly. In this case, having the rendered data be the same as the structured data might not be the right answer. Shutr may instead opt for the following RDFa:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created 
  by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" datatype="xsd:date"
           content="2007-01-02">
    January 2nd, 2007
     </span>
</h2>

The above HTML will render the date as "January 2nd, 2007" but will yield the exact same triples as above. The use of the content attribute should be limited to cases where the rendered text is not well-enough structured to represent the data.
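The interplay of property, content, and datatype can be summarized in a short Python sketch. The rule it encodes is inferred from the examples in this Primer rather than taken from a normative specification: an explicit datatype is used when present, a content attribute on its own is treated as a plain literal (as with cal:dtstart in Section 2.6), and rendered text defaults to XMLLiteral. The function name n3_literal and the whitespace normalization are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

def n3_literal(element):
    """Format the literal object of a property element in N3-like notation."""
    content = element.get("content")
    datatype = element.get("datatype")
    if content is not None:
        # The content attribute replaces the rendered text.
        value = content
    else:
        # Assumption: collapse runs of whitespace in the rendered text.
        value = " ".join("".join(element.itertext()).split())
    if datatype is not None:
        return f'"{value}"^^{datatype}'
    if content is not None:
        return f'"{value}"'           # assumption: plain literal
    return f'"{value}"^^XMLLiteral'   # default for rendered text

date_span = ET.fromstring(
    '<span property="dc:date" datatype="xsd:date" content="2007-01-02">'
    "\n    January 2nd, 2007\n</span>"
)
title_h1 = ET.fromstring(
    '<h1 property="dc:title">Vacation in the South of France</h1>'
)

date_literal = n3_literal(date_span)
title_literal = n3_literal(title_h1)
print(date_literal)
print(title_literal)
```

The rendered text "January 2nd, 2007" never reaches the output for the date, since the content attribute takes precedence.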

If Shutr wants to indicate that a specific object, e.g. "Mark Birbeck", plays two different roles, e.g. creator and publisher, the markup can easily be updated to reflect this duality without repeating the data:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator dc:publisher">Mark Birbeck</span></h2>

In all of the above markup and triples, as well as in the rest of the document, we use the dc:creator predicate with both literals, e.g. strings, and URIs, e.g. names and "persons": both are allowed by the Dublin Core specification. We show in 4 Beyond the Current Document how to refer to fragments of a document, and in 5 More Complex Structured Data: Social Networking with FOAF how to create deeper structures to address this type of issue.

3.3 URI Properties

A URI property is one that is merely a reference to a web-accessible resource, e.g. an image, a PDF document, or another HTML document, all reachable via the web. In RDFa, URI properties are expressed using the rel and href attributes. The href attribute is the well-understood target of a link, while rel indicates a relationship.

Shutr may want to give its users the ability to license their photos to the world under certain specific conditions. For this purpose, there are numerous existing licenses, including those published by Creative Commons. Thus, if Mark Birbeck chooses to license his vacation album for others to reuse, Shutr might use the following HTML snippet:

This document is licensed under a
<a href="http://creativecommons.org/licenses/by-nc/2.5/">
  Creative Commons Non-Commercial License
</a>.

This clickable link has an intended meaning: it is the document's license. Using RDFa can cement that meaning within the HTML itself:

This document is licensed under a
<a rel="cc:license"
   href="http://creativecommons.org/licenses/by-nc/2.5/">
  Creative Commons Non-Commercial License
</a>.

Note the use of the rel attribute to indicate a URI property rather than a textual one. The use of this attribute goes hand in hand with an href attribute within the same element. This href attribute indicates the URI object of the RDF triple. Thus, the above RDFa yields the following triple:

<> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .

It is worth noting that the rel attribute, like property and class, supports multiple values, separated by spaces. The triples generated are the same as if each value were declared independently.
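The handling of multiple values can be sketched in a few lines of Python. The helper name triples_for_values is hypothetical, and the example reuses the dc:creator dc:publisher markup from Section 3.2, where the current document is the subject.

```python
def triples_for_values(subject, attribute_value, obj):
    """One triple per whitespace-separated value of a rel or property attribute."""
    return [(subject, predicate, obj) for predicate in attribute_value.split()]

roles = triples_for_values(
    "<>", "dc:creator dc:publisher", '"Mark Birbeck"^^XMLLiteral'
)
for s, p, o in roles:
    print(s, p, o, ".")
```

Each value yields its own statement with the same subject and object, exactly as if two separate elements had been written.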

Compared with other existing RDF mechanisms to indicate Creative Commons licensing, e.g. a parallel RDF/XML file or inline RDF/XML within HTML comments, the RDFa approach provides Creative Commons and Shutr with a significant integrity advantage: the clickable link is the semantic link, and any change to the target will change both the human and machine views. Also, a simple copy-and-paste of the HTML will carry through both the rendered and semantic data.

In both cases, the target URI may provide an HTML document which includes further RDFa statements. The Creative Commons license page, for example, may include RDFa statements about its legal details.

4 Beyond the Current Document

The above examples casually swept under the rug the issue of the RDF subject: most of the triples expressed were about the current document representing a photo album. However, not all RDF triples in a given HTML document will be about that document itself. In RDFa, the default subject is the current document, but it can easily be overridden using the about attribute, which we briefly introduced in the very first example.

4.1 Qualifying Other Documents

Shutr may choose to present many photos in a given HTML page. In particular, at the URI http://www.shutr.net/user/markb/album/12345, all of the album's photos will appear inline. Structured data about each photo can be included simply by specifying an about attribute:

<ul>
  <li> <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </span>
  </li>

  <li> <img src="/user/markb/photo/34567" />,
    <span about="/user/markb/photo/34567" property="dc:title">
      W3C Meeting in Mandelieu
    </span>
  </li>
</ul>

The above RDFa yields the following triples:

</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .

</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .

This same approach applies to statements with URI objects. For example, each photo in the album has a creator and may have its own usage license.

<ul>
  <li> <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </span>
    taken by photographer
    <a about="/user/markb/photo/23456" 
       property="dc:creator"
       href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a about="/user/markb/photo/23456" rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>

  <li> <img src="/user/markb/photo/34567" /> 
    <span about="/user/markb/photo/34567" property="dc:title">
      W3C Meeting in Mandelieu
    </span>
    taken by photographer
    <a about="/user/markb/photo/34567"
      property="dc:creator"
      href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a about="/user/markb/photo/34567" rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons Commercial License
    </a>.
  </li>
</ul>

This yields the following triples:

</user/markb/photo/23456>
        dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/23456>
        dc:creator "Mark Birbeck"^^XMLLiteral .
</user/markb/photo/23456>
        cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .

</user/markb/photo/34567>
        dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
</user/markb/photo/34567>
        dc:creator "Steven Pemberton"^^XMLLiteral .
</user/markb/photo/34567>
        cc:license <http://creativecommons.org/licenses/by/2.5/> .
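The subjects in the triples above are written as relative URIs. A consumer that needs absolute URIs can resolve them against the address of the page, as the following Python sketch suggests. It assumes the base is simply the URI the page was retrieved from (no other base is declared); the helper name resolve_about is hypothetical.

```python
from urllib.parse import urljoin

ALBUM_PAGE = "http://www.shutr.net/user/markb/album/12345"

def resolve_about(about, base=ALBUM_PAGE):
    """Resolve an about value against the page URI using standard URI joining."""
    return urljoin(base, about)

photo_uri = resolve_about("/user/markb/photo/23456")
document_uri = resolve_about("")  # empty about: the current document
talk_uri = resolve_about(
    "#xtech_conference_talk", base="http://jo-blog.example.org/blog/?p=123"
)

print(photo_uri)
print(document_uri)
print(talk_uri)
```

The last call mirrors the event subject shown in Section 2.6, where a fragment-only about value is combined with the address of Jo's blog entry.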

4.2 Inheriting about

At this point, Shutr might begin to worry about the fast-growing size of its HTML document, given that the photo's URI must be repeated in the about attribute for every RDF property expressed. To address this issue, RDFa allows the value of this attribute to be inherited from a parent or ancestor element. In other words, if an element carries a rel or property attribute, but no about attribute, an RDFa browser will determine the subject of the RDF statement by navigating up the parent hierarchy of that element until it finds an about, or until it gets to the root element, at which point the default is about="".

Thus, the markup for the above example can be simplified to:

<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    <span property="dc:title">
      Sunset in Nice
    </span>,
    taken by photographer
    <a property="dc:creator" href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>

  <li about="/user/markb/photo/34567">
    <img src="/user/markb/photo/34567" />
    <span property="dc:title">
      W3C Meeting in Mandelieu
    </span>,
    taken by photographer 
    <a property="dc:creator" href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons Commercial License
    </a>.
  </li>
</ul>

which yields the same triples as the previous example, though, in this case, one can easily see the parallel to the corresponding N3 shorthand:

</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral ;
                          dc:creator "Mark Birbeck"^^XMLLiteral ;
                          cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .

</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral ;
                          dc:creator "Steven Pemberton"^^XMLLiteral ;
                          cc:license <http://creativecommons.org/licenses/by/2.5/> .
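The inheritance rule described in this section can be sketched in Python as a walk up the element tree. This is a simplified illustration rather than a complete RDFa processor: the function name find_subject is hypothetical, and the second list item, which has no about on any ancestor, is added here only to show the default.

```python
import xml.etree.ElementTree as ET

MARKUP = """
<ul>
  <li about="/user/markb/photo/23456">
    <span property="dc:title">Sunset in Nice</span>,
    licensed under a
    <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>
  <li>
    <span property="dc:title">An item with no about on any ancestor</span>
  </li>
</ul>
"""

def find_subject(element, parents):
    """Walk up from element until an about attribute is found; default to ""."""
    node = element
    while node is not None:
        about = node.get("about")
        if about is not None:
            return about
        node = parents.get(node)  # None once the root has been passed
    return ""

root = ET.fromstring(MARKUP)
# ElementTree keeps no parent pointers, so build a child -> parent map first.
parents = {child: parent for parent in root.iter() for child in parent}

subjects = [
    find_subject(element, parents)
    for element in root.iter()
    if element.get("property") or element.get("rel")
]
print(subjects)
```

The first two statements pick up the about value of their enclosing li, while the last one falls back to the empty value, that is, the current document.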

4.3 Qualifying Chunks of Documents

While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is the more likely scenario. For this purpose, RDFa provides ways to make structured data statements about chunks of documents using natural HTML constructs.

Consider the page http://www.shutr.net/user/markb/cameras, which, as its URI implies, lists Mark Birbeck's cameras. Its HTML includes:

<ul>
  <li id="nikon_d200"> Nikon D200, purchased on 2004-06-01.
  </li>

  <li id="canon_sd550"> Canon Powershot SD550, purchased on 2005-08-01.
  </li>
</ul>

and the photo page will then include information about which camera was used to take each photo:

<ul>
  <li> <img src="/user/markb/photo/23456" />
    ...
    using the <a href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
...
</ul>

The RDFa syntax for formally specifying the relationship is exactly the same as before, as expected:

<ul>
  <li about="/user/markb/photo/23456"> <img src="/user/markb/photo/23456" />
    ...
    using the <a rel="shutr:takenWith" 
         href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
...
</ul>

which corresponds to:

</user/markb/photo/23456> shutr:takenWith </user/markb/cameras#nikon_d200> .

With the RDFa attributes added, the HTML snippet at http://www.shutr.net/user/markb/cameras becomes:

<ul>
  <li id="nikon_d200" about="#nikon_d200">
    <span property="dc:title" datatype="xsd:string">
      Nikon D200
    </span>
    purchased on
    <span property="dc:date" datatype="xsd:date">
      2004-06-01
    </span>
  </li>

  <li id="canon_sd550" about="#canon_sd550">
    <span property="dc:title" datatype="xsd:string">
      Canon Powershot SD550
    </span>
    purchased on
    <span property="dc:date" datatype="xsd:date">
      2005-08-01
    </span>
  </li>
</ul>

which then yields the following triples:

<#nikon_d200> dc:title "Nikon D200"^^xsd:string ;
              dc:date "2004-06-01"^^xsd:date .

<#canon_sd550> dc:title "Canon Powershot SD550"^^xsd:string ;
               dc:date "2005-08-01"^^xsd:date .
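
The link between the two pages depends on ordinary relative URI resolution: the object written on the photo page and the subject written on the cameras page resolve to the same absolute URI. The following Python sketch checks this with the standard-library urljoin function. The cameras page URI comes from the text; the photo page URI is an assumption made for illustration, and any page on www.shutr.net would give the same result because the reference starts with a slash.

```python
from urllib.parse import urljoin

# URI of the cameras page, as given in the text.
CAMERAS_PAGE = "http://www.shutr.net/user/markb/cameras"
# Assumed URI of the photo page (hypothetical, for illustration only).
PHOTO_PAGE = "http://www.shutr.net/user/markb/photos"

# about="#nikon_d200" on the cameras page.
subject_on_cameras_page = urljoin(CAMERAS_PAGE, "#nikon_d200")
# href="/user/markb/cameras#nikon_d200" on the photo page.
object_on_photo_page = urljoin(PHOTO_PAGE, "/user/markb/cameras#nikon_d200")

print(subject_on_cameras_page)
print(subject_on_cameras_page == object_on_photo_page)
```

Because both references resolve to one URI, the shutr:takenWith statement on the photo page and the dc:title and dc:date statements on the cameras page describe the same resource.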

5 More Complex Structured Data: Social Networking with FOAF

If the reader wishes only to embed simple name-value pairs into an HTML document, this section is not required reading. However, many structured datasets quickly require some additional level of depth. In this section, we consider these more complex structures. One popular RDF vocabulary is FOAF [FOAF], which provides structure for social networking and personal information. FOAF is particularly interesting to consider because it provides deeper structure than the examples provided so far: a FOAF person has an office, which has an address, which has a street, city, zip code, and country. So far, we have only explained how to define structure "one-level deep."

5.1 Two Layers of Structured Data

Consider, specifically, that Tim Berners-Lee is encouraging folks to publish a FOAF file. Let's express (a portion of) Tim's FOAF file using RDFa. We start with a portion of Tim's homepage:

<dl>
 <dt>Email</dt>
 <dd>timbl@w3.org</dd>

 <dt>Address</dt>
 <dd>
  77 Massachusetts Ave.<br />
  MIT Room 32-G524<br />
  Cambridge MA 02139<br />
  USA
 </dd>

 <dt>Phone</dt>
 <dd>+1 (617) 253 5702</dd>

 <dt>Fax:</dt>
 <dd>+1 (617) 258 5999</dd>
</dl>

We can easily mark up the "one-layer deep" structure, specifically the email, phone, and fax fields with properties foaf:mbox, foaf:phone, and foaf:fax:

<dl class="foaf:Person" about="#card" id="card">
 <dt>Email</dt>
 <dd property="foaf:mbox">timbl@w3.org</dd>

...

 <dt>Phone</dt>
 <dd property="foaf:phone">+1 (617) 253 5702</dd>

 <dt>Fax:</dt>
 <dd property="foaf:fax">+1 (617) 258 5999</dd>
</dl>

Now, we need to express the address information in relation to Tim, as well as the address's properties, e.g. street address, city, state, etc. Recall that, when referencing another named resource, we've used the rel attribute. For example, to describe the licensing of a document, we've used the markup:

This document is licensed under a
<a rel="cc:license" href="http://creativecommons.org/licenses/by/2.5/">
  Creative Commons License
</a>

What we need to do here is describe a relationship of foaf:address between Tim and some unnamed node of data, off which we want to hang additional properties. Thus, we use the rel attribute again, this time without a corresponding href.

...
 <dt>Address</dt>
 <dd rel="foaf:address">
  77 Massachusetts Ave.<br />
  MIT Room 32-G524<br />
  Cambridge MA 02139<br />
  USA
 </dd>

...
</dl>

The HTML element on which this rel is expressed, in this case dd, then represents a blank node (in RDF terminology) that is the object of the foaf:address relationship. In addition, the subject of all contained RDFa statements is transparently set to be this blank node, as if there were an implicit about. This then allows the following markup to say exactly what we mean:

<dl class="foaf:Person" about="#card" id="card">
...
 <dt>Address</dt>
 <dd rel="foaf:address">
  <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
  <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
  <span property="foaf:city">Cambridge</span> MA 02139<br />
  <span property="foaf:country">USA</span>
 </dd>
...
</dl>

5.2 Additional Layers

This layering of structured data easily extends to multiple layers. Consider what Tim might do if he were to list both his home and office addresses. The properties foaf:office and foaf:home each relate a foaf:Person to a location, and each location has an address. Thus, the markup becomes, quite naturally:

<dl class="foaf:Person" about="#card" id="card"> ...
 <dt>Office Address</dt>
 <dd rel="foaf:office">
   <div rel="foaf:address">
     <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
     <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
     <span property="foaf:city">Cambridge</span> MA 02139<br />
     <span property="foaf:country">USA</span>
   </div>
 </dd>

 <dt>Home Address</dt>
 <dd rel="foaf:home">
   <div rel="foaf:address">
     <span property="foaf:address_line_1">1 Web Way</span><br />
     <span property="foaf:city">Cambridge</span> MA 02139<br />
     <span property="foaf:country">USA</span>
   </div>
 </dd>
...
</dl>

Using this technique, it is relatively easy and natural to express fairly extensive and complex structured data.

5.3 The RDF Triples

When interpreted as RDF, the use of the rel attribute without a corresponding href creates a new RDF blank node. Specifically, in the first example where Tim publishes a single address, the triples are:

<#card> rdf:type foaf:Person .
<#card> foaf:address _:dd0 .

_:dd0 foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral .
_:dd0 foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral .
_:dd0 foaf:city "Cambridge"^^XMLLiteral .
_:dd0 foaf:country "USA"^^XMLLiteral .

In the case of the multiple addresses, the triples become:

<#card> rdf:type foaf:Person .
<#card> foaf:office _:dd0 .
<#card> foaf:home _:dd1 .

_:dd0 foaf:address _:div0 .

_:div0 foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral .
_:div0 foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral .
_:div0 foaf:city "Cambridge"^^XMLLiteral .
_:div0 foaf:country "USA"^^XMLLiteral .

_:dd1 foaf:address _:div1 .

_:div1 foaf:address_line_1 "1 Web Way"^^XMLLiteral .
_:div1 foaf:city "Cambridge"^^XMLLiteral .
_:div1 foaf:country "USA"^^XMLLiteral .
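
The way nested rel attributes give rise to chained blank nodes can be sketched as a short recursive walk in Python. This is a simplified illustration under stated assumptions, not a conforming RDFa processor: the function name extract and the trimmed markup are hypothetical, the blank node labels (_:b0, _:b1) are arbitrary and differ from the _:dd0 style used above, literals are emitted without a datatype, and the class attribute is ignored.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical sample: a trimmed version of the office address markup.
MARKUP = """
<dl about="#card">
  <dd rel="foaf:office">
    <div rel="foaf:address">
      <span property="foaf:city">Cambridge</span>
      <span property="foaf:country">USA</span>
    </div>
  </dd>
</dl>
"""


def extract(element, subject, triples, counter):
    """Collect (subject, predicate, object) tuples, depth first."""
    about = element.get("about")
    if about is not None:
        subject = "<%s>" % about

    rel = element.get("rel")
    if rel is not None:
        href = element.get("href")
        if href is not None:
            triples.append((subject, rel, "<%s>" % href))
        else:
            # rel without href: mint a blank node and make it the
            # subject of every statement nested inside this element.
            blank = "_:b%d" % next(counter)
            triples.append((subject, rel, blank))
            subject = blank

    prop = element.get("property")
    if prop is not None:
        text = "".join(element.itertext()).strip()
        triples.append((subject, prop, '"%s"' % text))

    for child in element:
        extract(child, subject, triples, counter)
    return triples


root = ET.fromstring(MARKUP)
# "<>" stands for the current document, the default subject.
triples = extract(root, "<>", [], itertools.count())
for triple in triples:
    print("%s %s %s ." % triple)
```

Each rel without an href pushes a fresh blank node down to its descendants as their subject, which is why the depth of the markup maps directly onto the depth of the resulting graph.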

5.4 Naming the Nodes

In most cases, it is neither useful nor desired to make the internal components of the structured data accessible by the outside world. For example, there is likely no good reason for Tim to give his office address a URI which other RDF statements might reference. Simply handing out his FOAF URI is enough.

However, in some cases, it may in fact be useful to name all components yet to continue to use the rel without an href to designate the structured relationship. RDFa allows this as naturally as possible: if the element on which the rel is added has an id or an about, then the value of that attribute becomes the name of the node, and all triples are appropriately updated. about takes precedence over id, since it is an explicit RDFa statement.

For example, if Tim chose the following markup:

<dl class="foaf:Person" about="#card" id="card">
...
 <dt>Address</dt>
 <dd rel="foaf:address" id="address">
  <span property="foaf:address_line_1">77 Massachusetts Ave.</span><br />
  <span property="foaf:address_line_2">MIT Room 32-G524</span><br />
  <span property="foaf:city">Cambridge</span> MA 02139<br />
  <span property="foaf:country">USA</span>
 </dd>
...
</dl>

the triples would then become:

<#card> rdf:type foaf:Person .
<#card> foaf:address <#address> .

<#address> foaf:address_line_1 "77 Massachusetts Ave."^^XMLLiteral .
<#address> foaf:address_line_2 "MIT Room 32-G524"^^XMLLiteral .
<#address> foaf:city "Cambridge"^^XMLLiteral .
<#address> foaf:country "USA"^^XMLLiteral .

Note that this doesn't change anything significant about the structured data itself, only that the address is now addressable by other structured data statements.
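
The naming rule, including the precedence of about over id, can be sketched as a small Python function. The function name node_name, the blank_label parameter, and the "#office" value are hypothetical and exist only for illustration; the sketch assumes an id is turned into a fragment reference by prefixing it with "#", as in the example above.

```python
import xml.etree.ElementTree as ET


def node_name(element, blank_label):
    """Name the node created by a rel that has no href.

    about wins over id; with neither, the node stays blank and
    keeps the label supplied by the caller.
    """
    about = element.get("about")
    if about is not None:
        return "<%s>" % about
    ident = element.get("id")
    if ident is not None:
        return "<#%s>" % ident
    return blank_label


named = ET.fromstring('<dd rel="foaf:address" id="address" />')
both = ET.fromstring('<dd rel="foaf:address" id="address" about="#office" />')
anonymous = ET.fromstring('<dd rel="foaf:address" />')

print(node_name(named, "_:dd0"))
print(node_name(both, "_:dd0"))
print(node_name(anonymous, "_:dd0"))
```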

6 Bibliography

FOAF
The Friend of a Friend (FOAF) Project (See http://www.foaf-project.org/.)
RDFHTML
RDF-in-HTML Task Force (See http://www.w3.org/2001/sw/BestPractices/HTML/.)
SWD-WG
Semantic Web Deployment Working Group (See http://www.w3.org/2006/07/SWD/.)
SWBPD-WG
Semantic Web Best Practices and Deployment Working Group (See http://www.w3.org/2001/sw/BestPractices/.)
XHTML2-WG
XHTML2 Working Group, previously called HTML Working Group (See http://www.w3.org/MarkUp/.)
ICAL-RDF
RDF Calendar Interest Group Note (See http://www.w3.org/TR/rdfcal/.)
VCARD-RDF
Representing vCard Objects in RDF/XML (See http://www.w3.org/TR/vcard-rdf.)

7 Acknowledgments

This document is the work of the RDF-in-HTML Task Force, including (in alphabetical order) Ben Adida, Mark Birbeck, Jeremy Carroll, Michael Hausenblas, Steven Pemberton, Ralph Swick, Elias Torres, and Wing Yung. This work would not have been possible without the help of the Semantic Web Deployment Working Group, in particular chairs Guus Schreiber and Tom Baker. Earlier versions of this document were produced with the help of members of the Semantic Web Best Practices and Deployment Working Group, chaired by Guus Schreiber and David Wood. Gary Ng and David Booth provided insightful comments on previous versions.