W3C

RDFa Primer 1.0

Embedding RDF in XHTML

W3C Working Draft 16 May 2006

This version:
http://www.w3.org/TR/2006/WD-xhtml-rdfa-primer-20060516/
Latest version:
http://www.w3.org/TR/xhtml-rdfa-primer/
Previous version:
http://www.w3.org/TR/2006/WD-xhtml-rdfa-primer-20060310/
Editors:
Ben Adida, Creative Commons <ben@creativecommons.org>
Mark Birbeck, x-port.net Ltd. <mark.birbeck@x-port.net>

Abstract

Current web pages, written in HTML, are chock-full of structured data. When publishers can express the document's metadata, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected automatically so that users are informed of their rights. A photo's creator, camera setting information, resolution, and topic can be published to enable structured search and sharing.

RDFa is a syntax for expressing such metadata in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract metadata representation is RDF, which lets publishers build their own metadata vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The metadata is closely tied to the data it describes, so that rendered data can be copied and pasted along with its relevant structure.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was created by the RDF in XHTML Task Force (HTML) of the W3C Semantic Web Best Practices and Deployment Working Group (SWBPD) and the W3C HTML Working Group [member-only link]. This work is part of both the W3C Semantic Web Activity and the HTML Activity.

This document is a W3C Working Draft. It is a companion document to the XHTML 2.0 specification and explains the expected use of the XHTML Metainformation Module and XHTML Metainformation Attributes Module. This Working Draft is non-normative; the XHTML2 specification is the normative description of the features described in this primer. At the time of publication of this Working Draft the published XHTML2 Working Draft lags in some details. This is expected to be corrected with the next XHTML2 Working Draft. Comments are welcome and may be sent to public-rdf-in-xhtml-tf@w3.org; please include the text "comment" in the subject line. All messages received at this address are viewable in a public archive.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy. W3C also maintains a public list of any patent disclosures made in connection with the deliverables of the SWBPD Working Group.

Changes from the previous version include:

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

Table of Contents

1 Purpose of RDFa and Preliminaries
2 A First Scenario: Publishing Events and Contacts
    2.1 The Basic HTML
    2.2 Publishing An Event
    2.3 Publishing Contact Information
    2.4 The Complete HTML with RDFa
    2.5 RDFa with Limited HTML control
    2.6 What Does The Metadata Look Like?
3 A Second Scenario: Publishing Photos
    3.1 The Shutr Photo Management System
    3.2 Literal Properties
    3.3 URI Properties
4 Beyond the Current Document
    4.1 Qualifying Other Documents
    4.2 Inheriting about
    4.3 Qualifying Chunks of Documents
    4.4 Compact URIs (CURIEs)
        4.4.1 Mixing CURIEs and URIs
        4.4.2 Which Attributes are Which?
        4.4.3 Back to Shutr
5 Social Networking with FOAF
6 Bibliography
7 Acknowledgments


1 Purpose of RDFa and Preliminaries

Current web pages, written in HTML, are chock-full of structured data. When publishers can express the document's metadata, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be detected automatically so that users are informed of their rights. A photo's creator, camera setting information, resolution, and topic can be published to enable structured search and sharing.

RDFa is a syntax that accomplishes this metadata expression using a set of elements and attributes that embed RDF in XHTML. An important goal of RDFa is to achieve this RDF embedding without repeating existing XHTML content when that content is the metadata. Though RDFa was initially designed for XHTML2, one should be able to use RDFa with other XML dialects, e.g. XHTML1, SVG, given proper schema additions.

An XHTML document marked up with RDFa constructs is a valid XHTML document. RDFa is about using XHTML-compatible constructs and extensions to specify RDF 'content'. It is not about embedding RDF/XML syntax into XHTML documents.

We note that RDFa makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions.

2 A First Scenario: Publishing Events and Contacts

Jo blogs about her work, which involves web development.

2.1 The Basic HTML

When Jo has an upcoming conference talk, she simply blogs it at http://jo-blog.example.org/. Her blog also includes her contact information (Jo has a fantastic spam filter, so she is unafraid of publishing her email address):

<html>
    <head><title>Jo's Blog</title></head>
    <body>
...
    <p>
        I'm giving a talk at the XTech Conference about web widgets, on May 8th at 10am.
    </p>
...
    <p class="contactinfo">
        My name is Jo Smith. I'm a distinguished web engineer
        at
        <a href="http://example.org">
            Example.org
        </a>.
        You can contact me
        <a href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...
    </body>
</html>

This short piece of markup is already full of metadata.

The markup describes an event: a talk that Jo is giving. This event starts at 10am on May 8th. A summary of the event is "a talk at the XTech Conference about web widgets." We also have contact information for Jo: she works for the organization Example.org, with the job title of 'Distinguished Web Engineer'. She can be contacted at the email address 'jo@example.org'.

At the moment, it is very difficult for software — like web browsers and search engines — to make use of this implicit metadata. We need a standard way to explicitly express this metadata. In the next section, we will see how easy it is for Jo to use RDFa to mark up her data for just this purpose.

2.2 Publishing An Event

Jo would like to add some "structure" to this blog entry so that readers of her blog might be able to add her talk directly to their calendar. RDFa allows her to do just that, using extra elements and attributes. Since this is a calendar event, Jo will specifically use the iCal vocabulary [ICAL-RDF] to denote the metadata.

The first step is to import the iCal vocabulary into the HTML page:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
...

Then, Jo declares a new event:

   <p role="cal:Vevent">
        ...
   </p> 

Then, inside this event declaration, Jo can set up the event fields, reusing the existing HTML. For example, the event summary can be declared as:

       I'm giving <meta property="cal:summary">a talk at the XTech Conference about web widgets</meta>,

The meta tag effectively adds fields to the parent tag, in this case the Vevent declared in the p. Note how the existing rendered content, "a talk at the XTech Conference about web widgets", is the value of this metadata field. Sometimes, this isn't the right thing. Specifically, the start time of the event should be rendered nicely for readers ("May 8th at 10am"), but should be represented for software in an easily machine-parsable form, the standard iCal format: 20060508T1000-0500. In this case, the markup needs only a slight modification:

       <meta property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</meta>

The full markup is then:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
    <head><title>Jo's Blog</title></head>
    <body>
...
    <p role="cal:Vevent">
        I'm giving
        <meta property="cal:summary">
            a talk at the XTech Conference about web widgets
        </meta>,
        on 
        <meta property="cal:dtstart" content="20060508T1000-0500">
            May 8th at 10am
        </meta>.
    </p>
...
    </body>
</html>

The above markup can now be interpreted as a set of RDF triples, the details of which we explain in Section 2.6.
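To make the role of the content attribute concrete, here is a minimal Python sketch of how a consumer might read the event fields from the markup above. It is not part of RDFa itself: the function name literal_fields and the whitespace collapsing are assumptions made for illustration, and the snippet uses only the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# The event paragraph from this section, as a well-formed fragment.
SNIPPET = """
<p role="cal:Vevent">
    I'm giving
    <meta property="cal:summary">
        a talk at the XTech Conference about web widgets
    </meta>,
    on
    <meta property="cal:dtstart" content="20060508T1000-0500">
        May 8th at 10am
    </meta>.
</p>
"""


def literal_fields(xml_text):
    """Map each property to its value: `content` if present, else the rendered text.

    Hypothetical helper, not an RDFa-defined API.
    """
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root.iter():
        prop = element.get("property")
        if prop is None:
            continue
        override = element.get("content")
        if override is not None:
            # The machine-readable value replaces the rendered text.
            fields[prop] = override
        else:
            # Assumption: collapse whitespace that only reflects source indentation.
            fields[prop] = " ".join("".join(element.itertext()).split())
    return fields


event_fields = literal_fields(SNIPPET)
print(event_fields)
```

The summary is taken from the rendered text, while the start time comes from the content attribute, so editing the visible sentence changes the summary but not the machine-readable date.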

2.3 Publishing Contact Information

Now that Jo has published an event using structured metadata, she realizes there is much data on her blog that she can mark up in the same way. Her contact information, in particular, is an easy target for structured markup with RDFa:

...
    <p class="contactinfo">
        My name is Jo Smith. I'm a distinguished web engineer
        at
        <a href="http://example.org">
            Example.org
        </a>.
        You can contact me 
        <a href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Jo discovers the vCard RDF vocabulary [VCARD-RDF], which she adds to her existing page. Since Jo thinks of vCards as a way to publish her contact information, she uses the prefix contact to designate this vocabulary. Note that, although Jo already imported the iCal vocabulary, adding the vCard vocabulary is just as easy and does not interfere:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
...

Jo then sets up her vCard using RDFa, by deciding that the appropriate p will be her vCard. She notes, however, that the vCard schema does not require declaring a vCard type, i.e. there is no need to declare an rdf:type. Instead, it is recommended that a vCard refer to a web page that identifies the individual. Jo discovers that RDFa supports a special attribute just for this purpose: about, which indicates that all contained HTML pertains to this designated URL:

...
    <p class="contactinfo" about="http://example.org/staff/jo">
        ...everything here pertains to http://example.org/staff/jo...
    </p>
...

Simple enough, Jo realizes. She adds her first vCard fields: name, title, organization and email.

...
    <p class="contactinfo" about="http://example.org/staff/jo">
        My name is
        <meta property="contact:fn">
            Jo Smith
        </meta>.
        I'm a
        <meta property="contact:title">
            distinguished web engineer
        </meta>
        at
        <a rel="contact:org" href="http://example.org">
            Example.org
        </a>.
        You can contact me
        <a rel="contact:email" href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Notice how Jo was able to use the rel attribute directly within the anchor tag for designating her organization and email address. In this case, the rel indicates a relationship between the current URL, designated by about, and the target URL, designated by href. The exact meaning of this relationship is defined by the rel. In this case, contact:org indicates the relationship of "vCard organization", while contact:email indicates the relationship of "vCard email".

(Note that the above example slightly simplifies the vCard vocabulary where email is concerned, since vCard technically requires indicating the type of the email. This simplification is for clarity's sake, and a more complete example is provided in Section ??.)
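The relationship between about, rel, and href can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a complete RDFa parser: the helper name link_triples is hypothetical, the subject is read only from the outermost element, and CURIEs are left unexpanded.

```python
import xml.etree.ElementTree as ET

# A reduced version of Jo's contact paragraph, keeping only the two links.
SNIPPET = """
<p class="contactinfo" about="http://example.org/staff/jo">
    I work at
    <a rel="contact:org" href="http://example.org">Example.org</a>.
    You can contact me
    <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
</p>
"""


def link_triples(xml_text):
    """Return (subject, predicate, object) for every element with rel and href.

    Hypothetical helper: the subject comes from the root element's `about`;
    an empty string stands for the current document.
    """
    root = ET.fromstring(xml_text)
    subject = root.get("about", "")
    triples = []
    for element in root.iter():
        rel, href = element.get("rel"), element.get("href")
        if rel is not None and href is not None:
            triples.append((subject, rel, href))
    return triples


contact_triples = link_triples(SNIPPET)
for triple in contact_triples:
    print(triple)
```

Because the object of each triple is the href itself, changing the clickable link changes the structured data at the same time.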

2.4 The Complete HTML with RDFa

Jo's complete HTML with RDFa now looks as follows:

<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
...
    <p role="cal:Vevent">
        I'm giving
        <meta property="cal:summary">
            a talk at the XTech Conference about web widgets
        </meta>,
        on
        <meta property="cal:dtstart" content="20060508T1000-0500">
            May 8th at 10am
        </meta>.
    </p>
...
    <p class="contactinfo" about="http://example.org/staff/jo">
        My name is
        <meta property="contact:fn">
            Jo Smith
        </meta>.
        I'm a
        <meta property="contact:title">
            distinguished web engineer
        </meta>
        at
        <a rel="contact:org" href="http://example.org">
            Example.org
        </a>.
        You can contact me 
        <a rel="contact:email" href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Note how, if Jo changes her email address link, her organization, or the title of her talk, the RDFa approach will automatically pick up these changes in the "marked up", structured data. The only place where this doesn't happen is where the content attribute overrides the rendered content, which is inevitable when the human-rendered data and the machine-readable data must differ.

The RDF triples generated by the above markup are detailed in Section 2.6.

2.5 RDFa with Limited HTML control

What if Jo does not have complete control over the HTML of her blog? For example, she may be using a templating system which makes it particularly difficult to add the vocabularies in the html tag at the top of her page without adding it to every page on her site. Or, she may be using a web blogging provider that doesn't allow her to change this tag to begin with.

Fortunately, RDFa uses standard XML namespaces, which means that the vocabularies can be imported "locally" to an HTML element. Jo's HTML blog page could express the exact same metadata with the following markup:

<html>
...
    <p role="cal:Vevent"
       xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
        I'm giving
        <meta property="cal:summary">
            a talk at the XTech Conference about web widgets
        </meta>,
        on
        <meta property="cal:dtstart" content="20060508T1000-0500">
            May 8th at 10am
        </meta>.
    </p>
...
    <p class="contactinfo" about="http://example.org/staff/jo"
       xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
        My name is
        <meta property="contact:fn">
            Jo Smith
        </meta>.
        I'm a
        <meta property="contact:title">
            distinguished web engineer
        </meta>
        at
        <a rel="contact:org" href="http://example.org">
            Example.org
        </a>.
        You can contact me 
        <a rel="contact:email" href="mailto:jo@example.org">
            via email
        </a>.
    </p>
...

Of course, just like in the case of the vocabularies defined on the top-level html tag, more than one vocabulary can be imported into any element. In this case, each p only needs one vocabulary: the first uses iCal, the second uses vCard.
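The scoping of locally declared vocabularies can be observed with Python's standard xml.etree.ElementTree.iterparse, which reports namespace declarations as "start-ns" and "end-ns" events. The sketch below is an illustration only: the function name resolve_prefixes is hypothetical, and the last meta element is deliberately added (it is not in Jo's page) to show a prefix used outside the element that declares it.

```python
import io
import xml.etree.ElementTree as ET

SNIPPET = b"""
<body>
    <p role="cal:Vevent"
       xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
        <meta property="cal:summary">a talk</meta>
    </p>
    <p about="http://example.org/staff/jo"
       xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
        <meta property="contact:fn">Jo Smith</meta>
        <meta property="cal:summary">cal is not declared here</meta>
    </p>
</body>
"""


def resolve_prefixes(xml_bytes):
    """Pair each prefixed role/property/rel value with the namespace in scope.

    Hypothetical helper; returns None as the namespace when the prefix
    is not declared on the element or any of its ancestors.
    """
    scopes = []    # stack of (prefix, uri) declarations currently in scope
    resolved = []
    events = ("start", "start-ns", "end-ns")
    for event, payload in ET.iterparse(io.BytesIO(xml_bytes), events=events):
        if event == "start-ns":
            scopes.append(payload)   # payload is a (prefix, uri) tuple
        elif event == "end-ns":
            scopes.pop()             # the declaring element has ended
        else:
            in_scope = dict(scopes)  # inner declarations override outer ones
            for attribute in ("role", "property", "rel"):
                value = payload.get(attribute)
                if value and ":" in value:
                    prefix = value.split(":", 1)[0]
                    resolved.append((value, in_scope.get(prefix)))
    return resolved


resolved = resolve_prefixes(SNIPPET)
for value, namespace in resolved:
    print(value, "->", namespace)
```

The cal prefix resolves inside the first p only; once that element ends, the declaration goes out of scope and the same prefix in the second p has no namespace.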

2.6 What Does The Metadata Look Like?

In Section 2.2, Jo published an event. The RDFa generates RDF triples, specifically:

_:p0 rdf:type cal:Vevent;
               cal:summary "a talk at the XTech Conference about web widgets"^^XMLLiteral;
               cal:dtstart "20060508T1000-0500".
        

In Section 2.3, Jo published contact information. The RDFa generates the following RDF triples:

<http://example.org/staff/jo>
               contact:fn "Jo Smith"^^XMLLiteral;
               contact:title "distinguished web engineer"^^XMLLiteral;
               contact:org <http://example.org>;
               contact:email <mailto:jo@example.org>.
        

3 A Second Scenario: Publishing Photos

3.1 The Shutr Photo Management System

Consider a (fictional) photo management web site called Shutr, whose web site is http://www.shutr.net. Users of Shutr can upload their photos at will, annotate them, organize them into albums, and share them with the world. They can choose to keep these photos private, or make them available for public consumption under licensing terms of their choosing.

The primary interface to Shutr is its web site and the XHTML it delivers. Since photos are contributed by users with a significant amount of built-in metadata (camera type, exposure, etc.) and additional, explicitly provided metadata (photo caption, license, photographer's name), Shutr may benefit from using RDF to express this rich metadata.

We explore how Shutr might use RDFa to express this RDF metadata right in the XHTML it already publishes. We assume an additional XML namespace, shutr, which corresponds to URI http://www.shutr.net/rdf/shutr#.

The simplest structured metadata Shutr might want to expose is basic information about a photo album: the creator of the album, the date of creation, and its license. We consider literal properties first, and URI properties second. (We ignore photo-specific metadata for now, as that involves RDF statements about an image, which is not an XHTML document. We will, of course, get back to this soon.)

3.2 Literal Properties

A literal property is a string of text, e.g. "Ben Adida", a number, e.g. "28", or any other typed, self-contained datum that one might want to express as a metadata property.

Consider Mark Birbeck, a user of the Shutr system with username markb, and his latest photo album "Vacation in the South of France." This photo album resides at http://www.shutr.net/user/markb/album/12345. The XHTML document presented upon request of that URI includes the following XHTML snippet:

<h1>Photo Album #12345: Vacation in the South of France</h1>
<h2>created by Mark Birbeck</h2>

Notice how the rendered XHTML contains elements of the photo album's structured metadata. Using RDFa, Shutr can mark up this XHTML to indicate these structured metadata properties without repeating the raw data:

<h1>Photo Album #12345: <span property="dc:title">Vacation in the South of France</span></h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>

An RDFa-aware browser would thus extract the following RDF triples:

<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .

(The ^^XMLLiteral notation, which denotes a datatype, will be explained shortly.)

The span element is actually not required to attach RDF properties to rendered content. One can easily add the RDFa attributes to existing HTML elements, without adding a span:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span></h2>

which yields the same RDF triples, of course. The use of an extra span is helpful when existing HTML markup isn't enough to isolate the rendered content that is relevant to the RDF triple.

A reader who knows about XML datatypes might, at this point in the presentation, wonder what datatype these values will have. Given the above RDFa, "Vacation in the South of France" is an XML Literal. In some cases, this may not be appropriate. Consider an expanded HTML snippet which includes the photo album's creation date:

<h1>Vacation in the South of France</h1>
<h2>created by Mark Birbeck on 2006-01-02</h2>

A precise way to augment this HTML with RDFa is:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span>
    on <span property="dc:date" type="xsd:date">2006-01-02</span></h2>

which would yield the following triples (note that the default datatype is XMLLiteral, which explains the first example above):

<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
<> dc:date "2006-01-02"^^xsd:date .

Going further, Shutr realizes that 2006-01-02, while a correct xsd:date representation, is not exactly user-friendly. In this case, having the rendered data be the same as the structured data might not be the right answer. Shutr may instead opt for the following RDFa:

<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created 
  by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" type="xsd:date"
           content="2006-01-02">
    January 2nd, 2006
  </span>
</h2>

The above XHTML will render the date as "January 2nd, 2006" but will yield the exact same triples as above. The use of the content attribute should be limited to cases where the rendered text is not well-enough structured to represent the metadata.
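The interplay of the type and content attributes can be summarized in a short Python sketch. It is only an approximation of the rules described in this section: the helper name typed_literals is hypothetical, whitespace collapsing is an assumption, and the output is an N3-like string rather than a real RDF graph.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<h2>created
  by <span property="dc:creator">Mark Birbeck</span>
  on <span property="dc:date" type="xsd:date"
           content="2006-01-02">
    January 2nd, 2006
  </span>
</h2>
"""


def typed_literals(xml_text):
    """Format one N3-like line per property found in the fragment.

    Hypothetical helper. Value: `content` if present, else the rendered text.
    Datatype: `type` if present, else XMLLiteral (the default named above).
    """
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        prop = element.get("property")
        if prop is None:
            continue
        value = element.get("content")
        if value is None:
            # Assumption: collapse whitespace introduced by source indentation.
            value = " ".join("".join(element.itertext()).split())
        datatype = element.get("type", "XMLLiteral")
        lines.append(f'<> {prop} "{value}"^^{datatype} .')
    return lines


album_literals = typed_literals(SNIPPET)
print("\n".join(album_literals))
```

The rendered text "January 2nd, 2006" never reaches the triple, since the content attribute supplies the xsd:date value.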

3.3 URI Properties

A URI property is one that is merely a reference to a web-accessible resource, e.g. an image, a PDF document, or another XHTML document, all reachable via the web.

Shutr may want to give its users the ability to license their photos to the world under certain specific conditions. For this purpose, there are numerous existing licenses, including those published by Creative Commons. Thus, if Mark Birbeck chooses to license his vacation album for others to reuse, Shutr might use the following XHTML snippet (currently -- April 2006 -- recommended by Creative Commons):

This document is licensed under a
<a href="http://creativecommons.org/licenses/by-nc/2.5/">
  Creative Commons Non-Commercial License
</a>.

This clickable link has an intended meaning: it is the document's license. Using RDFa can cement that meaning within the XHTML itself:

This document is licensed under a
<a rel="cc:license"
   href="http://creativecommons.org/licenses/by-nc/2.5/">
  Creative Commons Non-Commercial License
</a>.

Note the use of the rel attribute to indicate a URI property rather than a textual one. The use of this attribute goes hand in hand with an href attribute within the same element. This href attribute indicates the URI object of the RDF triple. Thus, the above RDFa yields the following triple:

<> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .

Compared with other existing RDF mechanisms to indicate Creative Commons licensing (e.g. a parallel RDF/XML file or inline RDF/XML within XHTML comments), the RDFa approach provides Creative Commons and Shutr with a significant integrity advantage: the clickable link is the semantic link, and any change to the target will change both the human and machine views. Also, a simple copy-and-paste of the XHTML will carry through both the rendered and semantic data.

In both cases, the target URI may provide an XHTML document which includes further RDFa statements. The Creative Commons license page, for example, may include RDFa statements about its legal details.

4 Beyond the Current Document

The above examples casually swept under the rug the issue of the RDF subject: all the triples expressed were about the current document representing a photo album. However, not all RDF triples in a given XHTML2 document will be about that document itself. In RDFa, the default subject is the current document, but it can easily be overridden using the about attribute.

4.1 Qualifying Other Documents

Shutr may choose to present many photos in a given XHTML page. In particular, at the URI http://www.shutr.net/user/markb/album/12345, all of the album's photos will appear inline. Metadata about each photo can be included simply by specifying an about attribute:

<ul>
  <li> <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </span>
  </li>

  <li> <img src="/user/markb/photo/34567" />,
    <span about="/user/markb/photo/34567" property="dc:title">
      W3C Meeting in Mandelieu
    </span>
  </li>
</ul>

The above RDFa yields the following triples:

</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .

</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .

This same approach applies to statements with URI objects. For example, each photo in the album has a creator and may have its own usage license.

<ul>
  <li> <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">
      Sunset in Nice
    </span>
    taken by photographer
    <a about="/user/markb/photo/23456" 
       property="dc:creator"
       href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a about="/user/markb/photo/23456" rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>

  <li> <img src="/user/markb/photo/34567" /> 
    <span about="/user/markb/photo/34567" property="dc:title">
      W3C Meeting in Mandelieu
    </span>
    taken by photographer
    <a about="/user/markb/photo/34567"
      property="dc:creator"
      href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a about="/user/markb/photo/34567" rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons Commercial License
    </a>.
  </li>
</ul>

This yields the following triples:

</user/markb/photo/23456>
        dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/23456>
        dc:creator "Mark Birbeck"^^XMLLiteral .
</user/markb/photo/23456>
        cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .

</user/markb/photo/34567>
        dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
</user/markb/photo/34567>
        dc:creator "Steven Pemberton"^^XMLLiteral .
</user/markb/photo/34567>
        cc:license <http://creativecommons.org/licenses/by/2.5/> .

4.2 Inheriting about

At this point, Shutr might begin to worry about the fast-growing size of its HTML document, given that the photo's URI must be repeated in the about attribute for every RDF property expressed. To address this issue, RDFa allows the value of this attribute to be inherited from a parent or ancestor element. In other words, if an element carries a rel or property attribute, but no about attribute, an RDFa browser will determine the subject of the RDF statement by navigating up the parent hierarchy of that element until it finds an about, or until it gets to the root element, at which point the default is about="".

Thus, the markup for the above example can be simplified to:

<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    <span property="dc:title">
      Sunset in Nice
    </span>,
    taken by photographer 
    <a property="dc:creator" href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>

  <li about="/user/markb/photo/34567">
    <img src="/user/markb/photo/34567" />
    <span property="dc:title">
      W3C Meeting in Mandelieu
    </span>,
    taken by photographer 
    <a property="dc:creator" href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons Commercial License
    </a>.
  </li>
</ul>

which yields the same triples as the previous example, though, in this case, one can easily see the parallel to the corresponding N3 shorthand:

</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral ;
                          dc:creator "Mark Birbeck"^^XMLLiteral ;
                          cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .

</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral ;
                          dc:creator "Steven Pemberton"^^XMLLiteral ;
                          cc:license <http://creativecommons.org/licenses/by/2.5/> .
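The inheritance rule can be sketched in Python by walking from each element up through its ancestors until an about attribute is found. This is a minimal illustration, not a full RDFa processor: the helper name inherited_subjects is hypothetical, the second list item is added here only to show the default, and ElementTree has no parent pointers, so the sketch builds its own parent map.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<ul>
  <li about="/user/markb/photo/23456">
    <span property="dc:title">Sunset in Nice</span>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>
  <li>
    <span property="dc:title">No about on any ancestor</span>
  </li>
</ul>
"""


def inherited_subjects(xml_text):
    """Return (subject, predicate) for every element with rel or property.

    Hypothetical helper: the subject is the nearest `about` on the element
    or an ancestor, defaulting to "" (the current document) at the root.
    """
    root = ET.fromstring(xml_text)
    parent_of = {child: parent for parent in root.iter() for child in parent}
    results = []
    for element in root.iter():
        predicate = element.get("rel") or element.get("property")
        if predicate is None:
            continue
        node = element
        while node is not None and node.get("about") is None:
            node = parent_of.get(node)   # None once we climb past the root
        subject = node.get("about") if node is not None else ""
        results.append((subject, predicate))
    return results


subjects = inherited_subjects(SNIPPET)
for subject, predicate in subjects:
    print(repr(subject), predicate)
```

Both statements inside the first li pick up its about value, while the statement in the second li falls back to the current document.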

4.3 Qualifying Chunks of Documents

While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is the more likely scenario. For this purpose, RDFa provides ways to make metadata statements about chunks of documents using natural XHTML constructs.

Consider the page http://www.shutr.net/user/markb/cameras, which, as its URI implies, lists Mark Birbeck's cameras. Its HTML includes:

<ul>
  <li id="nikon_d200"> Nikon D200, purchased on 2004-06-01.
  </li>

  <li id="canon_sd550"> Canon Powershot SD550, purchased on 2005-08-01.
  </li>
</ul>

and the photo page will then include information about which camera was used to take each photo:

<ul>
  <li> <img src="/user/markb/photo/23456" />
    ...
    using the <a href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
...
</ul>

The RDFa syntax for formally specifying the relationship is exactly the same as before, as expected:

<ul>
  <li about="/user/markb/photo/23456"> <img src="/user/markb/photo/23456" />
    ...
    using the <a rel="shutr:takenWith" 
         href="/user/markb/cameras#nikon_d200">Nikon D200</a>,
    ...
  </li>
...
</ul>

which generates the triple:

</user/markb/photo/23456> shutr:takenWith </user/markb/cameras#nikon_d200> .

Then, the XHTML snippet at http://www.shutr.net/user/markb/cameras is:

<ul>
  <li id="nikon_d200" about="#nikon_d200">
    <span property="dc:title" type="xsd:string">
      Nikon D200
    </span>
    purchased on
    <span property="dc:date" type="xsd:date">
      2004-06-01
    </span>
  </li>

  <li id="canon_sd550" about="#canon_sd550">
    <span property="dc:title" type="xsd:string">
      Canon Powershot SD550
    </span>
    purchased on
    <span property="dc:date" type="xsd:date">
      2005-08-01
    </span>
  </li>
</ul>

which then yields the following triples:

<#nikon_d200> dc:title "Nikon D200"^^xsd:string ;
              dc:date "2004-06-01"^^xsd:date .

<#canon_sd550> dc:title "Canon Powershot SD550"^^xsd:string ;
               dc:date "2005-08-01"^^xsd:date .
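The two pages describe the same camera only if the relative references resolve to the same absolute URI. The following Python sketch checks this with the standard urllib.parse.urljoin, assuming that each page's base URI is simply its own address (no xml:base or similar override).

```python
from urllib.parse import urljoin

# Base URIs of the two pages, taken from the text above.
PHOTO_PAGE = "http://www.shutr.net/user/markb/album/12345"
CAMERA_PAGE = "http://www.shutr.net/user/markb/cameras"

# Object of shutr:takenWith, as written in the photo page's href.
object_uri = urljoin(PHOTO_PAGE, "/user/markb/cameras#nikon_d200")

# Subject of the camera statements, as written in the camera page's about.
subject_uri = urljoin(CAMERA_PAGE, "#nikon_d200")

print(object_uri)
print(subject_uri == object_uri)
```

Since both references resolve to the same URI, a consumer that merges triples from the two pages can connect a photo to the title and purchase date of the camera that took it.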

One immediately wonders whether the redundancy between the about and id attributes can be simplified. Partly for this purpose, RDFa includes the elements link and meta. These elements are similar in function to a and span except for one special behavior: they only apply to their immediate parent element, even if an ancestor element bears an alternate about attribute.

<ul>
  <li id="nikon_d200">
    <meta property="dc:title" type="xsd:string">
      Nikon D200
    </meta>
    purchased on
    <meta property="dc:date" type="xsd:date">
      2004-06-01
    </meta>
  </li>

  <li id="canon_sd550">
    <meta property="dc:title" type="xsd:string">
      Canon Powershot SD550
    </meta>
    purchased on
    <meta property="dc:date" type="xsd:date">
      2005-08-01
    </meta>
  </li>
</ul>

One might now wonder how meta and link behave when their parent element doesn't have an id or about attribute. The result of such syntax is an RDF bnode, an advanced topic which we skip in this Primer.
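A rough Python sketch of this parent-only behavior follows. It is a guess at the mechanics rather than a statement of the RDFa processing rules: the helper name meta_subjects is hypothetical, giving about precedence over id is an assumption, the blank node labels (_:b0, ...) are arbitrary, and the second list item is added here only to show the bnode case.

```python
import xml.etree.ElementTree as ET

SNIPPET = """
<ul>
  <li id="nikon_d200">
    <meta property="dc:title" type="xsd:string">Nikon D200</meta>
    purchased on
    <meta property="dc:date" type="xsd:date">2004-06-01</meta>
  </li>
  <li>
    <meta property="dc:title" type="xsd:string">A camera with no id</meta>
  </li>
</ul>
"""


def meta_subjects(xml_text):
    """Return (subject, predicate) for each meta/link child of any element.

    Hypothetical helper. Subject: the parent's `about` (assumed to win),
    else "#" + the parent's `id`, else a freshly numbered blank node.
    """
    root = ET.fromstring(xml_text)
    results = []
    blank_count = 0
    for parent in root.iter():
        children = [
            child for child in parent
            if child.tag in ("meta", "link")
            and (child.get("property") or child.get("rel"))
        ]
        if not children:
            continue
        if parent.get("about") is not None:
            subject = parent.get("about")
        elif parent.get("id") is not None:
            subject = "#" + parent.get("id")
        else:
            subject = f"_:b{blank_count}"
            blank_count += 1
        for child in children:
            results.append((subject, child.get("property") or child.get("rel")))
    return results


camera_subjects = meta_subjects(SNIPPET)
for subject, predicate in camera_subjects:
    print(subject, predicate)
```

Only the immediate parent is consulted, so no ancestor walk is needed here, in contrast to the inheritance of about by a and span.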

4.4 Compact URIs (CURIEs)

For Shutr, as for many other web publishers, the introduction of RDFa attributes tends to increase the size of the XHTML noticeably, sometimes unnecessarily so: there is significant data duplication with full expression of URIs. We have already shown how judicious use of the about attribute can reduce the number of times an RDF subject is expressed. We have also shown how the use of link and meta elements can further reduce the use of the about attribute when attaching metadata to particular XHTML chunks. We now address URI duplication, RDFa's most significant data duplication issue, with Compact URIs, known as CURIEs.

A CURIE, e.g. dc:title, is composed of a prefix, e.g. dc, followed by a colon, followed by a suffix, e.g. title. The compact URI is resolved by:

  • resolving the prefix according to normal XML namespace resolution,
  • resolving the suffix as a relative URI against the base URI defined by the resolved prefix.

Note that QNames used for RDF properties are valid CURIEs, and resolve in exactly the same way. Thus dc:title and cc:license resolve as expected when dc and cc are correctly defined namespaces.
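
The two resolution steps can be sketched in Python as follows. This is only an illustration under stated assumptions, not part of RDFa: the function name resolve_curie is hypothetical, and the NAMESPACES dictionary stands in for the XML namespace declarations that would be in scope in the XHTML document.

```python
from urllib.parse import urljoin

# Hypothetical stand-in for the XML namespace declarations in scope.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "cclicenses": "http://creativecommons.org/licenses/",
}

def resolve_curie(curie, namespaces):
    """Resolve a prefix:suffix CURIE to a full URI."""
    # Everything before the first colon is the prefix.
    prefix, suffix = curie.split(":", 1)
    # Step 1: resolve the prefix to its namespace URI.
    base = namespaces[prefix]
    # Step 2: resolve the suffix as a relative URI against that base.
    return urljoin(base, suffix)

print(resolve_curie("dc:title", NAMESPACES))
print(resolve_curie("cclicenses:by/2.5/", NAMESPACES))
```

For the namespace URIs used here, which end in a slash, relative resolution gives the same result as appending the suffix to the namespace URI.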

The differences to note between CURIEs and QNames are:

  • CURIEs allow any sequence of legal URI characters in the suffix, including, for example, digits only, dashes, and slashes.
  • CURIEs allow the empty string as a prefix, e.g. :next, in which case the base URI defaults to the default XML namespace, which is usually xhtml2 in our case. Note that this can also be expressed simply as next without the colon, which provides backwards compatibility for existing standard rel values.
  • CURIEs allow the underscore character _ as a prefix when referencing bnodes. More on this in the Advanced section.
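
The three differences listed above can be made concrete with a small Python sketch that splits a CURIE into its parts. The function name split_curie, the kind labels it returns, and the bnode name in the example are assumptions made for illustration only.

```python
def split_curie(curie):
    """Return (kind, prefix, suffix) for a CURIE."""
    if ":" not in curie:
        # A bare value such as "next" is read like ":next".
        return ("default", "", curie)
    prefix, suffix = curie.split(":", 1)
    if prefix == "":
        # Empty prefix: the default XML namespace applies.
        return ("default", "", suffix)
    if prefix == "_":
        # Underscore prefix: the suffix names a bnode.
        return ("bnode", "_", suffix)
    # Ordinary prefix; the suffix is not restricted to QName characters.
    return ("prefixed", prefix, suffix)

for value in ["next", ":next", "_:a", "photos:23456", "cclic:by/2.5/"]:
    print(value, split_curie(value))
```

Note that photos:23456 and cclic:by/2.5/ are accepted here even though their suffixes would not be legal in a QName.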

4.4.1 Mixing CURIEs and URIs

One of the most important applications of CURIEs in RDFa is the CURIE/URI attribute, in which a normal URI and a CURIE can be used interchangeably. In order to differentiate between the two types, square brackets [] are used around a CURIE, whereas a URI is written normally.

For example, if Shutr wants to reference the Creative Commons license http://creativecommons.org/licenses/by/2.5/ in an attribute that accepts both CURIEs and URIs, it can use either:

... attr="http://creativecommons.org/licenses/by/2.5/" ...

or, assuming the namespace cclicenses has been properly defined:

... attr="[cclicenses:by/2.5/]" ...
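
A processor reading such an attribute might distinguish the two forms as in the following Python sketch. The function name resolve_curie_or_uri and the NAMESPACES dictionary, which stands in for the in-scope namespace declarations, are assumptions of this example.

```python
from urllib.parse import urljoin

# Hypothetical stand-in for the XML namespace declarations in scope.
NAMESPACES = {"cclicenses": "http://creativecommons.org/licenses/"}

def resolve_curie_or_uri(value, namespaces):
    """Interpret the value of an attribute that accepts CURIEs and URIs."""
    if value.startswith("[") and value.endswith("]"):
        # Square brackets mark a CURIE: strip them, then resolve it.
        prefix, suffix = value[1:-1].split(":", 1)
        return urljoin(namespaces[prefix], suffix)
    # Anything else is a URI, written normally.
    return value

full = resolve_curie_or_uri("http://creativecommons.org/licenses/by/2.5/", NAMESPACES)
compact = resolve_curie_or_uri("[cclicenses:by/2.5/]", NAMESPACES)
print(full == compact)
```

Both spellings identify the same license, so the comparison holds.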

4.4.2 Which Attributes are Which?

In RDFa, the attributes property, rel, and rev are all CURIE-only, which ensures backwards compatibility with past uses of rel, e.g. rel="next". The about and href attributes, on the other hand, accept mixed CURIE/URI datatypes. This ensures compatibility with browsers that expect clickability for the href, and consistency between subject and object.
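
The split between the two groups of attributes can be summarized in a short Python sketch. The names CURIE_ONLY, CURIE_OR_URI, and is_curie are hypothetical, and the sketch covers only the five attributes named in this section.

```python
# Attributes whose values are always CURIEs (no brackets needed).
CURIE_ONLY = {"property", "rel", "rev"}
# Attributes that accept either a URI or a bracketed CURIE.
CURIE_OR_URI = {"about", "href"}

def is_curie(attribute, value):
    """Report whether value is read as a CURIE in the given attribute."""
    if attribute in CURIE_ONLY:
        return True
    if attribute in CURIE_OR_URI:
        # In mixed attributes, only a bracketed value is a CURIE.
        return value.startswith("[") and value.endswith("]")
    raise ValueError("attribute not covered by this sketch: " + attribute)

print(is_curie("rel", "next"))
print(is_curie("href", "http://creativecommons.org/licenses/by/2.5/"))
print(is_curie("about", "[photos:23456]"))
```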

4.4.3 Back to Shutr

Thus, getting back to Shutr's photo list:

<ul>
  <li> <img src="/user/markb/photo/23456" />,
    Sunset in Nice,
    taken by
    <a href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a 
    <a href="http://creativecommons.org/licenses/by/2.5/">
      Creative Commons License
    </a>.
  </li>

  <li> <img src="/user/markb/photo/34567" />,
    W3C Meeting in Mandelieu,
    taken by
    <a href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a 
    <a href="http://creativecommons.org/licenses/by-nc/2.5/">
      Creative Commons Non-Commercial License
    </a>.
  </li>
</ul>

adding metadata to these photos with CURIEs can save significant space, compared with writing out full URIs, as soon as the list contains more than a few photos:

<ul xmlns:cclic="http://creativecommons.org/licenses/" xmlns:photos="/user/markb/photo/">
  <li about="[photos:23456]"> <img src="/user/markb/photo/23456" />,
    <span property="dc:title">
      Sunset in Nice
    </span>,
    taken by
    <a property="dc:creator" href="/user/markb">
      Mark Birbeck
    </a>,
    licensed under a 
    <a rel="cc:license"
       href="[cclic:by/2.5/]">
      Creative Commons License
    </a>.
  </li>

  <li about="[photos:34567]"> <img src="/user/markb/photo/34567" />,
    <span property="dc:title">
      W3C Meeting in Mandelieu
    </span>,
    taken by 
    <a property="dc:creator" href="/user/stevenp">
      Steven Pemberton
    </a>,
    licensed under a 
    <a rel="cc:license"
       href="[cclic:by-nc/2.5/]">
      Creative Commons Non-Commercial License
    </a>.
  </li>
</ul>

Of course, this assumes a browser that can parse CURIEs for clickable links. Initially, complete URIs may be preferable in the href attribute.
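
To give a rough sense of when the savings begin, the following Python sketch estimates the net number of characters saved by declaring a prefix once and using it in several attribute values. It counts only the namespace declaration and the attribute values themselves; the function name net_saving is an assumption of this example.

```python
def net_saving(prefix, namespace, uses):
    """Characters saved by using [prefix:...] CURIEs instead of full URIs."""
    # One-time cost of the declaration, including its leading space.
    declaration = ' xmlns:%s="%s"' % (prefix, namespace)
    # Each use replaces the namespace URI with "[prefix:" and "]".
    per_use = len(namespace) - len("[%s:]" % prefix)
    return uses * per_use - len(declaration)

for count in (1, 3, 10):
    print(count, net_saving("cclic", "http://creativecommons.org/licenses/", count))
```

Under this count, a single use does not repay the cost of the declaration, but the saving grows with every additional photo in the list.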

5 Social Networking with FOAF

... more to come here ...

6 Bibliography

RDFHTML
RDF-in-HTML Task Force (See http://www.w3.org/2001/sw/BestPractices/HTML/.)
SWBPD-WG
Semantic Web Best Practices and Deployment Working Group (See http://www.w3.org/2001/sw/BestPractices/.)
HTML-WG
HTML Working Group (See http://www.w3.org/MarkUp/Group/.)
ICAL-RDF
RDF Calendar Interest Group Note (See http://www.w3.org/TR/rdfcal/.)
VCARD-RDF
Representing vCard Objects in RDF/XML (See http://www.w3.org/TR/vcard-rdf.)

7 Acknowledgments

This document is the work of the RDF-in-HTML Task Force, including (in alphabetical order) Ben Adida, Mark Birbeck, Jeremy Carroll, Steven Pemberton, and Ralph Swick. This work would not have been possible without the help of the Semantic Web Best Practices and Deployment Working Group, in particular chairs Guus Schreiber and David Wood. Earlier versions of this document were officially reviewed by Gary Ng and David Booth, both of whom provided insightful comments that significantly improved the work.