Copyright © 2006 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
Current web pages, written in HTML, are chock-full of structured data. When publishers can express the document's metadata, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be automatically detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published to enable structured search and sharing.
RDFa is a syntax for expressing such metadata in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract metadata representation is RDF, which lets publishers build their own metadata vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The metadata is closely tied to the data it describes, so that rendered data can be copied and pasted along with its relevant structure.
This is an internal draft produced by the RDF-in-HTML task force [RDFHTML], a joint task force of the Semantic Web Best Practices and Deployment Working Group [SWBPD-WG] and HTML Working Group [HTML-WG].
This document is for internal review only and is subject to change without notice. This document has no formal standing within the W3C.
1 Purpose of RDFa and Preliminaries
2 A First Scenario: Publishing Events and Contacts
2.1 The Basic HTML
2.2 Publishing An Event
2.3 Publishing Contact Information
2.4 The Complete HTML with RDFa
2.5 RDFa with Limited HTML control
2.6 What Does The Metadata Look Like?
3 A Second Scenario: Publishing Photos
3.1 The Shutr Photo Management System
3.2 Literal Properties
3.3 URI Properties
4 Beyond the Current Document
4.1 Qualifying Other Documents
4.2 Inheriting about
4.3 Qualifying Chunks of Documents
4.4 Compact URIs (CURIEs)
4.4.1 Mixing CURIEs and URIs
4.4.2 Which Attributes are Which?
4.4.3 Back to Shutr
5 Social Networking with FOAF
6 Bibliography
7 Acknowledgments
Current web pages, written in HTML, are chock-full of structured data. When publishers can express the document's metadata, and when tools can read it, a new world of user functionality becomes available, letting users copy and paste structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar. A license on a document can be automatically detected so that the user is informed of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published to enable structured search and sharing.
RDFa is a syntax that accomplishes this metadata expression using a set of elements and attributes that embed RDF in XHTML. An important goal of RDFa is to achieve this RDF embedding without repeating existing XHTML content when that content is the metadata. Though RDFa was initially designed for XHTML2, one should be able to use RDFa with other XML dialects, e.g. XHTML1, SVG, given proper schema additions.
An XHTML document marked up with RDFa constructs is a valid XHTML Document. RDFa is about using XHTML compatible constructs and extensions to specify RDF 'content'. It is not about embedding RDF/XML syntax into XHTML documents.
We note that RDFa makes use of XML namespaces. In this document, we assume, for simplicity's sake, that the following namespaces are defined: dc for Dublin Core, foaf for FOAF, cc for Creative Commons, and xsd for XML Schema Definitions.
Jo blogs about her work, which involves web development.
When Jo has an upcoming conference talk, she simply blogs about it at http://jo-blog.example.org/. Her blog also includes her contact information (Jo has a fantastic spam filter, so she is unafraid of publishing her email address):
<html>
  <head><title>Jo's Blog</title></head>
  <body>
    ...
    <p>
      I'm giving a talk at the XTech Conference about web widgets,
      on May 8th at 10am.
    </p>
    ...
    <p class="contactinfo">
      My name is Jo Smith. I'm a distinguished web engineer at
      <a href="http://example.org">Example.org</a>.
      You can contact me
      <a href="mailto:jo@example.org">via email</a>.
    </p>
    ...
  </body>
</html>
This short piece of mark-up is already full of metadata.
The markup describes an event: a talk that Jo is giving. This event starts at 10am on May 8th. A summary of the event is "a talk at XTech 2006 on web widgets." We also have contact information for Jo: she works for the organization Example.org, with job title of 'Distinguished Web Engineer'. She can be contacted at the email address 'jo@example.org'.
At the moment, it is very difficult for software — like web browsers and search engines — to make use of this implicit metadata. We need a standard way to explicitly express this metadata. In the next section, we will see how easy it is for Jo to use RDFa to mark up her data for just this purpose.
Jo would like to add some "structure" to this blog entry so that readers of her blog might be able to add her talk directly to their calendar. RDFa allows her to do just that, using extra elements and attributes. Since this is a calendar event, Jo will specifically use the iCal vocabulary [ICAL-RDF] to denote the metadata.
The first step is to reference the iCal vocabulary into the HTML page:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"> ...
Then, Jo declares a new event:
<p role="cal:Vevent"> ... </p>
Then, inside this event declaration, Jo can set up the event fields, reusing the existing HTML. For example, the event summary can be declared as:
I'm giving <meta property="cal:summary">a talk at the XTech Conference about web widgets</meta>,
The meta tag effectively adds fields to the parent tag, in this case the Vevent declared in the p. Note how the existing rendered content, "a talk at the XTech Conference about web widgets", is the value of this metadata field. Sometimes, this isn't the right thing. Specifically, the start time of the event should be rendered nicely ("May 8th at 10am"), but should likely be represented in an easy, machine-parsable way, the standard iCal format: 20060508T1000-0500. In this case, the markup needs only a slight modification:
<meta property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</meta>
The full markup is then:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
  <head><title>Jo's Blog</title></head>
  <body>
    ...
    <p role="cal:Vevent">
      I'm giving
      <meta property="cal:summary">a talk at the XTech Conference about web widgets</meta>,
      on
      <meta property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</meta>.
    </p>
    ...
  </body>
</html>
The above markup can now be interpreted as a set of RDF triples, the details of which we explain in Section 2.6.
Now that Jo has published an event using structured metadata, she realizes there is much data on her blog that she can mark up in the same way. Her contact information, in particular, is an easy target for structured markup with RDFa:
... <p class="contactinfo"> My name is Jo Smith. I'm a distinguished web engineer at <a href="http://example.org"> Example.org </a>. You can contact me <a href="mailto:jo@example.org"> via email </a>. </p> ...
Jo discovers the vCard RDF vocabulary [VCARD-RDF], which she adds to her existing page. Since Jo thinks of vCards as a way to publish her contact information, she uses the prefix contact to designate this vocabulary. Note that, although Jo already imported the iCal vocabulary, adding the vCard vocabulary is just as easy and does not interfere:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#" xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#"> ...
Jo then sets up her vCard using RDFa, by deciding that the appropriate p will be her vCard. She notes, however, that the vCard schema does not require declaring a vCard type, i.e. there is no need to declare an rdf:type. Instead, it is recommended that a vCard refer to a web page that identifies the individual. Jo discovers that RDFa supports a special attribute just for this purpose: about, which indicates that all contained HTML pertains to this designated URL:
... <p class="contactinfo" about="http://example.org/staff/jo"> ...everything here pertains to http://example.org/staff/jo... </p> ...
Simple enough, Jo realizes. She adds her first vCard fields: name, title, organization and email.
...
<p class="contactinfo" about="http://example.org/staff/jo">
  My name is <meta property="contact:fn">Jo Smith</meta>.
  I'm a <meta property="contact:title">distinguished web engineer</meta>
  at <a rel="contact:org" href="http://example.org">Example.org</a>.
  You can contact me
  <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
</p>
...
Notice how Jo was able to use the rel attribute directly within the anchor tag for designating her organization and email address. In this case, the rel indicates a relationship between the current URL, designated by about, and the target URL, designated by href. The exact meaning of this relationship is defined by the value of rel. In this case, contact:org indicates the relationship of "vCard organization", while contact:email indicates the relationship of "vCard email".
(Note that the above example slightly simplifies the vCard vocabulary where email is concerned, since vCard technically requires indicating the type of the email. This simplification is for clarity's sake, and a more complete example is provided in Section ??.)
Jo's complete HTML with RDFa now looks as follows:
<html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
      xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
  ...
  <p role="cal:Vevent">
    I'm giving
    <meta property="cal:summary">a talk at the XTech Conference about web widgets</meta>,
    on
    <meta property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</meta>.
  </p>
  ...
  <p class="contactinfo" about="http://example.org/staff/jo">
    My name is <meta property="contact:fn">Jo Smith</meta>.
    I'm a <meta property="contact:title">distinguished web engineer</meta>
    at <a rel="contact:org" href="http://example.org">Example.org</a>.
    You can contact me
    <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
  </p>
  ...
Note how, if Jo changes her email address link, her organization, or the title of her talk, the RDFa approach will automatically pick up these changes in the "marked up", structured data. The only place where this doesn't happen is where the content attribute must override the rendered content, which is inevitable when the human-rendered data and the machine-readable data must differ.
The RDF triples generated by the above markup are detailed in Section 2.6.
What if Jo does not have complete control over the HTML of her blog? For example, she may be using a templating system which makes it particularly difficult to add the vocabularies in the html tag at the top of her page without adding them to every page on her site. Or, she may be using a web blogging provider that doesn't allow her to change this tag to begin with.
Fortunately, RDFa uses standard XML namespaces, which means that the vocabularies can be imported "locally" to an HTML element. Jo's HTML blog page could express the exact same metadata with the following markup:
<html>
  ...
  <p role="cal:Vevent" xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
    I'm giving
    <meta property="cal:summary">a talk at the XTech Conference about web widgets</meta>,
    on
    <meta property="cal:dtstart" content="20060508T1000-0500">May 8th at 10am</meta>.
  </p>
  ...
  <p class="contactinfo" about="http://example.org/staff/jo"
     xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
    My name is <meta property="contact:fn">Jo Smith</meta>.
    I'm a <meta property="contact:title">distinguished web engineer</meta>
    at <a rel="contact:org" href="http://example.org">Example.org</a>.
    You can contact me
    <a rel="contact:email" href="mailto:jo@example.org">via email</a>.
  </p>
  ...
Of course, just like in the case of the vocabularies defined on the top-level html tag, more than one vocabulary can be imported into any element. In this case, each p only needs one vocabulary: the first uses iCal, the second uses vCard.
In Section 2.2, Jo published an event. The RDFa generates RDF triples, specifically:
_:p0 rdf:type cal:Vevent ;
     cal:summary "a talk at the XTech Conference about web widgets"^^XMLLiteral ;
     cal:dtstart "20060508T1000-0500" .
In Section 2.3, Jo published contact information. The RDFa generates the following RDF triples:
<http://example.org/staff/jo>
     contact:fn "Jo Smith"^^XMLLiteral ;
     contact:title "distinguished web engineer"^^XMLLiteral ;
     contact:org <http://example.org> ;
     contact:email <mailto:jo@example.org> .
Consider a (fictional) photo management web site called Shutr, whose web site is http://www.shutr.net. Users of Shutr can upload their photos at will, annotate them, organize them into albums, and share them with the world. They can choose to keep these photos private, or make them available for public consumption under licensing terms of their choosing.
The primary interface to Shutr is its web site and the XHTML it delivers. Since photos are contributed by users with a significant amount of built-in metadata (camera type, exposure, etc.) and additional, explicitly provided metadata (photo caption, license, photographer's name), Shutr may benefit from using RDF to express this rich metadata.
We explore how Shutr might use RDFa to express this RDF metadata right in the XHTML it already publishes. We assume an additional XML namespace, shutr, which corresponds to the URI http://www.shutr.net/rdf/shutr#.
The simplest structured metadata Shutr might want to expose is basic information about a photo album: the creator of the album, the date of creation, and its license. We consider literal properties first, and URI properties second. (We ignore photo-specific metadata for now, as that involves RDF statements about an image, which is not an XHTML document. We will, of course, get back to this soon.)
A literal property is a string of text, e.g. "Ben Adida", a number, e.g. "28", or any other typed, self-contained datum that one might want to express as a metadata property.
Consider Mark Birbeck, a user of the Shutr system with username markb, and his latest photo album, "Vacation in the South of France." This photo album resides at http://www.shutr.net/user/markb/album/12345. The XHTML document presented upon request of that URI includes the following XHTML snippet:
<h1>Photo Album #12345: Vacation in the South of France</h1> <h2>created by Mark Birbeck</h2>
Notice how the rendered XHTML contains elements of the photo album's structured metadata. Using RDFa, Shutr can mark up this XHTML to indicate these structured metadata properties without repeating the raw data:
<h1>Photo Album #12345: <span property="dc:title">Vacation in the South of France</span></h1> <h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
An RDFa-aware browser would thus extract the following RDF triples:
<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
(The ^^XMLLiteral notation, which denotes a datatype, will be explained shortly.)
The span element is actually not required to attach RDF properties to rendered content. One can easily add the RDFa attributes to existing HTML elements, without adding a span:
<h1 property="dc:title">Vacation in the South of France</h1> <h2>created by <span property="dc:creator">Mark Birbeck</span></h2>
which yields the same RDF triples, of course. The use of an extra span is helpful when existing HTML markup isn't enough to isolate the rendered content that is relevant to the RDF triple.
A reader who knows about XML datatypes might, at this point in the presentation, wonder what datatype these values will have. Given the above RDFa, "Vacation in the South of France" is an XML Literal. In some cases, this may not be appropriate. Consider an expanded HTML snippet which includes the photo album's creation date:
<h1>Vacation in the South of France</h1> <h2>created by Mark Birbeck on 2006-01-02</h2>
A precise way to augment this HTML with RDFa is:
<h1 property="dc:title">Vacation in the South of France</h1> <h2>created by <span property="dc:creator">Mark Birbeck</span> on <span property="dc:date" type="xsd:date">2006-01-02</span></h2>
which would yield the following triples (note how the default datatype is XMLLiteral, which explains the first example above):
<> dc:title "Vacation in the South of France"^^XMLLiteral .
<> dc:creator "Mark Birbeck"^^XMLLiteral .
<> dc:date "2006-01-02"^^xsd:date .
Going further, Shutr realizes that 2006-01-02, while a correct xsd:date representation, is not exactly user-friendly. In this case, having the rendered data be the same as the structured data might not be the right answer. Shutr may instead opt for the following RDFa:
<h1 property="dc:title">Vacation in the South of France</h1>
<h2>created by <span property="dc:creator">Mark Birbeck</span>
    on <span property="dc:date" type="xsd:date" content="2006-01-02">January 2nd, 2006</span>
</h2>
The above XHTML will render the date as "January 2nd, 2006" but will yield the exact same triples as above. The use of the content attribute should be limited to cases where the rendered text is not well-enough structured to represent the metadata.
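The choice between rendered text and the content attribute can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of RDFa: the function name literal_value is hypothetical, the parser is Python's standard xml.etree.ElementTree, and stripping surrounding whitespace is a simplification made for readability.

```python
import xml.etree.ElementTree as ET

def literal_value(element):
    """Return the literal for a property element (hypothetical helper).

    The content attribute, when present, overrides the rendered text.
    """
    override = element.get("content")
    if override is not None:
        return override
    # itertext() yields the element's text and that of its descendants.
    return "".join(element.itertext()).strip()

# Rendered text doubles as the structured value.
rendered = ET.fromstring(
    '<span property="dc:date" type="xsd:date">2006-01-02</span>'
)
# Rendered text is user-friendly; content carries the structured value.
overridden = ET.fromstring(
    '<span property="dc:date" type="xsd:date" content="2006-01-02">'
    " January 2nd, 2006 </span>"
)

print(literal_value(rendered))
print(literal_value(overridden))
```

Both spans produce the same literal, which is why the two markups yield the same dc:date triple.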
A URI property is one that is merely a reference to a web-accessible resource, e.g. an image, a PDF document, or another XHTML document, all reachable via the web.
Shutr may want to give its users the ability to license their photos to the world under certain specific conditions. For this purpose, there are numerous existing licenses, including those published by Creative Commons. Thus, if Mark Birbeck chooses to license his vacation album for others to reuse, Shutr might use the following XHTML snippet (currently -- April 2006 -- recommended by Creative Commons):
This document is licensed under a <a href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
This clickable link has an intended meaning: it is the document's license. Using RDFa can cement that meaning within the XHTML itself:
This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc/2.5/"> Creative Commons Non-Commercial License </a>.
Note the use of the rel attribute to indicate a URI property rather than a textual one. The use of this attribute goes hand in hand with an href attribute within the same element. This href attribute indicates the URI object of the RDF triple. Thus, the above RDFa yields the following triple:
<> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
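As a rough illustration of how rel and href pair up into a triple, consider the following Python sketch. It is not an RDFa processor: the function name uri_triple is hypothetical, the subject is passed in by the caller (defaulting to the empty reference that stands for the current document), and prefixes such as cc are left unexpanded.

```python
import xml.etree.ElementTree as ET

def uri_triple(element, subject=""):
    """Build a (subject, predicate, object) tuple from rel and href.

    Hypothetical helper: returns None unless both attributes are present.
    """
    predicate = element.get("rel")
    target = element.get("href")
    if predicate is None or target is None:
        return None
    return (subject, predicate, target)

link = ET.fromstring(
    '<a rel="cc:license" '
    'href="http://creativecommons.org/licenses/by-nc/2.5/">'
    "Creative Commons Non-Commercial License</a>"
)
plain = ET.fromstring('<a href="http://example.org">Example.org</a>')

print(uri_triple(link))
print(uri_triple(plain))
```

The anchor without rel produces no triple, which reflects that an ordinary link carries no declared relationship.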
Compared with other existing RDF mechanisms to indicate Creative Commons licensing -- e.g. a parallel RDF/XML file or inline RDF/XML within XHTML comments --, the RDFa approach provides Creative Commons and Shutr with a significant integrity advantage: the clickable link is the semantic link, and any change to the target will change both the human and machine views. Also, a simple copy-and-paste of the XHTML will carry through both the rendered and semantic data.
In both cases, the target URI may provide an XHTML document which includes further RDFa statements. The Creative Commons license page, for example, may include RDFa statements about its legal details.
The above examples casually swept under the rug the issue of the RDF subject: all the triples expressed were about the current document representing a photo album. However, not all RDF triples in a given XHTML2 document will be about that document itself. In RDFa, the default subject is the current document, but it can easily be overridden using the about attribute.
Shutr may choose to present many photos in a given XHTML page. In particular, at the URI http://www.shutr.net/user/markb/album/12345, all of the album's photos will appear inline. Metadata about each photo can be included simply by specifying an about attribute:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">Sunset in Nice</span>
  </li>
  <li>
    <img src="/user/markb/photo/34567" />,
    <span about="/user/markb/photo/34567" property="dc:title">W3C Meeting in Mandelieu</span>
  </li>
</ul>
The above RDFa yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
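The subjects in these triples are written as relative references. Assuming a processor resolves them against the URI of the containing document using ordinary URI resolution (an assumption made here for illustration; the helper name absolute_subject is hypothetical), the absolute subjects can be computed with Python's standard urllib.parse.urljoin:

```python
from urllib.parse import urljoin

# URI of the album page that contains the markup above.
document_uri = "http://www.shutr.net/user/markb/album/12345"

def absolute_subject(about, base=document_uri):
    """Resolve an about value against the document URI (hypothetical helper)."""
    return urljoin(base, about)

print(absolute_subject("/user/markb/photo/23456"))
print(absolute_subject(""))  # the empty reference denotes the document itself
```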
This same approach applies to statements with URI objects. For example, each photo in the album has a creator and may have its own usage license.
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    <span about="/user/markb/photo/23456" property="dc:title">Sunset in Nice</span>
    taken by photographer
    <a about="/user/markb/photo/23456" property="dc:creator" href="/user/markb">Mark Birbeck</a>,
    licensed under a
    <a about="/user/markb/photo/23456" rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
  </li>
  <li>
    <img src="/user/markb/photo/34567" />
    <span about="/user/markb/photo/34567" property="dc:title">W3C Meeting in Mandelieu</span>
    taken by photographer
    <a about="/user/markb/photo/34567" property="dc:creator" href="/user/stevenp">Steven Pemberton</a>,
    licensed under a
    <a about="/user/markb/photo/34567" rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">Creative Commons Commercial License</a>.
  </li>
</ul>
This yields the following triples:
</user/markb/photo/23456> dc:title "Sunset in Nice"^^XMLLiteral .
</user/markb/photo/23456> dc:creator "Mark Birbeck"^^XMLLiteral .
</user/markb/photo/23456> cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
</user/markb/photo/34567> dc:title "W3C Meeting in Mandelieu"^^XMLLiteral .
</user/markb/photo/34567> dc:creator "Steven Pemberton"^^XMLLiteral .
</user/markb/photo/34567> cc:license <http://creativecommons.org/licenses/by/2.5/> .
4.2 Inheriting about
At this point, Shutr might begin to worry about the fast-growing size of its HTML document, given that the photo's URI must be repeated in the about attribute for every RDF property expressed. To address this issue, RDFa allows the value of this attribute to be inherited from a parent or ancestor element. In other words, if an element carries a rel or property attribute, but no about attribute, an RDFa browser will determine the subject of the RDF statement by navigating up the parent hierarchy of that element until it finds an about, or until it gets to the root element, at which point the default is about="".
Thus, the markup for the above example can be simplified to:
<ul>
  <li about="/user/markb/photo/23456">
    <img src="/user/markb/photo/23456" />
    <span property="dc:title">Sunset in Nice</span>,
    taken by photographer
    <a property="dc:creator" href="/user/markb/">Mark Birbeck</a>,
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
  </li>
  <li about="/user/markb/photo/34567">
    <img src="/user/markb/photo/34567" />
    <span property="dc:title">W3C Meeting in Mandelieu</span>,
    taken by photographer
    <a property="dc:creator" href="/user/stevenp">Steven Pemberton</a>
    licensed under a
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by/2.5/">Creative Commons Commercial License</a>.
  </li>
</ul>
which yields the same triples as the previous example, though, in this case, one can easily see the parallel to the corresponding N3 shorthand:
</user/markb/photo/23456>
     dc:title "Sunset in Nice"^^XMLLiteral ;
     dc:creator "Mark Birbeck"^^XMLLiteral ;
     cc:license <http://creativecommons.org/licenses/by-nc/2.5/> .
</user/markb/photo/34567>
     dc:title "W3C Meeting in Mandelieu"^^XMLLiteral ;
     dc:creator "Steven Pemberton"^^XMLLiteral ;
     cc:license <http://creativecommons.org/licenses/by/2.5/> .
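The inheritance rule for about (walk up the ancestors until an about is found, otherwise fall back to the current document) can be sketched in Python as follows. This is a simplified illustration, not a conforming processor: the name subject_of is hypothetical, the parent map is needed only because xml.etree.ElementTree elements do not record their parent, and the special handling of link and meta elements is ignored.

```python
import xml.etree.ElementTree as ET

def subject_of(element, parents):
    """Find the nearest about on the element or its ancestors.

    Hypothetical helper: returns "" (the current document) when no
    element up to the root carries an about attribute.
    """
    node = element
    while node is not None:
        about = node.get("about")
        if about is not None:
            return about
        node = parents.get(node)  # None once we step past the root
    return ""

markup = """
<ul>
  <li about="/user/markb/photo/23456">
    <span property="dc:title">Sunset in Nice</span>
    <a rel="cc:license"
       href="http://creativecommons.org/licenses/by-nc/2.5/">license</a>
  </li>
  <li><span property="dc:title">An item with no about</span></li>
</ul>
"""
root = ET.fromstring(markup)
# Map every child element to its parent so we can navigate upwards.
parents = {child: parent for parent in root.iter() for child in parent}

spans = root.findall(".//span")
print(repr(subject_of(spans[0], parents)))
print(repr(subject_of(spans[1], parents)))
```

The first span inherits its subject from the enclosing li; the second finds no about and falls back to the default subject.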
While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is the more likely scenario. For this purpose, RDFa provides ways to make metadata statements about chunks of documents using natural XHTML constructs.
Consider the page http://www.shutr.net/user/markb/cameras, which, as its URI implies, lists Mark Birbeck's cameras. Its HTML includes:
<ul>
  <li id="nikon_d200">
    Nikon D200, purchased on 2004-06-01.
  </li>
  <li id="canon_sd550">
    Canon Powershot SD550, purchased on 2005-08-01.
  </li>
</ul>
and the photo page will then include information about which camera was used to take each photo:
<ul> <li> <img src="/user/markb/photo/23456" /> ... using the <a href="/user/markb/cameras#nikon_d200">Nikon D200</a>, ... </li> ... </ul>
The RDFa syntax for formally specifying the relationship is exactly the same as before, as expected:
<ul> <li about="/user/markb/photo/23456"> <img src="/user/markb/photo/23456" /> ... using the <a rel="shutr:takenWith" href="/user/markb/cameras#nikon_d200">Nikon D200</a>, ... </li> ... </ul>
which generates the triple:
</user/markb/photo/23456> shutr:takenWith </user/markb/cameras#nikon_d200>
Then, the XHTML snippet at http://www.shutr.net/user/markb/cameras is:
<ul>
  <li id="nikon_d200" about="#nikon_d200">
    <span property="dc:title" type="xsd:string">Nikon D200</span>
    purchased on
    <span property="dc:date" type="xsd:date">2004-06-01</span>
  </li>
  <li id="canon_sd550" about="#canon_sd550">
    <span property="dc:title" type="xsd:string">Canon Powershot SD550</span>
    purchased on
    <span property="dc:date" type="xsd:date">2005-08-01</span>
  </li>
</ul>
which then yields the following triples:
<#nikon_d200> dc:title "Nikon D200"^^xsd:string ;
     dc:date "2004-06-01"^^xsd:date .
<#canon_sd550> dc:title "Canon Powershot SD550"^^xsd:string ;
     dc:date "2005-08-01"^^xsd:date .
One immediately wonders whether the redundancy between the about and id attributes can be simplified. Partly for this purpose, RDFa includes the elements link and meta. These elements are similar in function to a and span except for one special behavior: they only apply to their immediate parent element, even if an ancestor element bears an alternate about attribute.
<ul>
  <li id="nikon_d200">
    <meta property="dc:title" type="xsd:string">Nikon D200</meta>
    purchased on
    <meta property="dc:date" type="xsd:date">2004-06-01</meta>
  </li>
  <li id="canon_sd550">
    <meta property="dc:title" type="xsd:string">Canon Powershot SD550</meta>
    purchased on
    <meta property="dc:date" type="xsd:date">2005-08-01</meta>
  </li>
</ul>
One might now wonder how meta and link behave when their parent element doesn't have an id or about attribute. The result of such syntax is an RDF bnode, an advanced topic which we skip in this Primer.
For Shutr, as for many other web publishers, the introduction of RDFa attributes tends to increase the size of the XHTML noticeably, sometimes unnecessarily so: there is significant data duplication with full expression of URIs. We have already shown how judicious use of the about attribute can reduce the number of times an RDF subject is expressed. We have also shown how the use of link and meta elements can further reduce the use of the about attribute when attaching metadata to particular XHTML chunks.
We now address URI duplication, RDFa's most significant data duplication issue, with Compact URIs, known as CURIEs.
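A CURIE, e.g. dc:title, is composed of a prefix, e.g. dc, followed by a colon, followed by a suffix, e.g. title. The compact URI is resolved by mapping the prefix to a base URI, in the same manner as an XML namespace prefix, and appending the suffix to that base URI.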
Note that QNames used for RDF properties are valid CURIEs, and resolve in exactly the same way. Thus dc:title and cc:license resolve as expected when dc and cc are correctly defined namespaces.
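A minimal Python sketch of this resolution follows, assuming it amounts to concatenating the URI bound to the prefix with the suffix (the way QNames used as RDF properties resolve). The function name resolve_curie and the dictionary standing in for in-scope namespace declarations are hypothetical; the two bindings are the cal and shutr namespaces used earlier in this document.

```python
def resolve_curie(curie, namespaces):
    """Expand prefix:suffix into a full URI (hypothetical helper).

    Only the first colon separates prefix from suffix, so the suffix
    may itself contain colons or slashes.
    """
    prefix, _, suffix = curie.partition(":")
    return namespaces[prefix] + suffix

# Stand-in for the xmlns declarations in scope.
namespaces = {
    "cal": "http://www.w3.org/2002/12/cal/ical#",
    "shutr": "http://www.shutr.net/rdf/shutr#",
}

print(resolve_curie("cal:summary", namespaces))
print(resolve_curie("shutr:takenWith", namespaces))
```

An undeclared prefix raises KeyError in this sketch; how a real processor should treat that case is not covered here.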
The differences to note between CURIEs and QNames are:
A CURIE may have an empty prefix, e.g. :next, in which case the base URI defaults to the default XML namespace, which is usually xhtml2 in our case. Note that this can also be expressed simply as next without the colon, which provides backwards compatibility for existing standard rel values.
A CURIE may use _ as a prefix when referencing bnodes. More on this in the Advanced section.
One of the most important applications of CURIEs in RDFa is the use of a CURIE/URI attribute, where either a normal URI or a CURIE can be used interchangeably. In order to differentiate between the two types, square brackets [] are used around a CURIE, whereas a URI is written normally.
For example, if Shutr wants to reference the Creative Commons license http://creativecommons.org/licenses/by/2.5/ in an attribute that accepts both CURIEs and URIs, it can use either:
... attr="http://creativecommons.org/licenses/by/2.5/" ...
or, assuming the namespace cclicenses has been properly defined:
... attr="[cclicenses:by/2.5/]" ...
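The bracket convention can be sketched in Python as follows. This is an illustration only: the function name resolve_curie_or_uri is hypothetical, and the binding of cclicenses to http://creativecommons.org/licenses/ is an assumption inferred from the two equivalent attribute values shown above.

```python
def resolve_curie_or_uri(value, namespaces):
    """Resolve a mixed CURIE/URI attribute value (hypothetical helper).

    A value wrapped in square brackets is treated as a CURIE and
    expanded; anything else is returned unchanged as a URI.
    """
    if value.startswith("[") and value.endswith("]"):
        prefix, _, suffix = value[1:-1].partition(":")
        return namespaces[prefix] + suffix
    return value

# Assumed binding for the cclicenses prefix.
namespaces = {"cclicenses": "http://creativecommons.org/licenses/"}

as_uri = resolve_curie_or_uri(
    "http://creativecommons.org/licenses/by/2.5/", namespaces
)
as_curie = resolve_curie_or_uri("[cclicenses:by/2.5/]", namespaces)
print(as_uri == as_curie)
```

The brackets are what keep a URI such as mailto:jo@example.org from being mistaken for a CURIE with the prefix mailto.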
In RDFa, the attributes property, rel, and rev are all CURIE-only, which ensures backwards compatibility with past uses of rel, e.g. rel="next". The about and href attributes, on the other hand, accept mixed CURIE/URI datatypes. This ensures compatibility with browsers that expect clickability for the href, and consistency between subject and object.
Thus, getting back to Shutr's photo list:
<ul>
  <li>
    <img src="/user/markb/photo/23456" />,
    Sunset in Nice, taken by
    <a href="/user/markb">Mark Birbeck</a>,
    licensed under a
    <a href="http://creativecommons.org/licenses/by/2.5/">Creative Commons License</a>.
  </li>
  <li>
    <img src="/user/markb/photo/34567" />,
    W3C Meeting in Mandelieu taken by
    <a href="/user/stevenp">Steven Pemberton</a>,
    licensed under a
    <a href="http://creativecommons.org/licenses/by-nc/2.5/">Creative Commons Non-Commercial License</a>.
  </li>
</ul>
adding metadata to these photos with CURIEs can save significant space (over the non-CURIE use) as soon as there are a number of photos in the list:
<ul xmlns:cclic="http://creativecommons.org/licenses/"
    xmlns:photos="/user/markb/photo/">
  <li about="[photos:23456]">
    <img src="/user/markb/photo/23456" />,
    <span property="dc:title">Sunset in Nice</span>,
    taken by
    <a property="dc:creator" href="/user/markb">Mark Birbeck</a>,
    licensed under a
    <a rel="cc:license" href="[cclic:by/2.5/]">Creative Commons License</a>.
  </li>
  <li about="[photos:34567]">
    <img src="/user/markb/photo/34567" />,
    <span property="dc:title">W3C Meeting in Mandelieu</span>
    taken by
    <a property="dc:creator" href="/user/stevenp">Steven Pemberton</a>,
    licensed under a
    <a rel="cc:license" href="[cclic:by-nc/2.5/]">Creative Commons Non-Commercial License</a>.
  </li>
</ul>
Of course, this assumes a browser that can parse CURIEs for clickable links. Initially, complete URIs may be preferable in the href attribute.
This document is the work of the RDF-in-HTML Task Force, including (in alphabetical order) Ben Adida, Mark Birbeck, Jeremy Carroll, Steven Pemberton, and Ralph Swick. This work would not have been possible without the help of the Semantic Web Best Practices and Deployment Working Group, in particular chairs Guus Schreiber and David Wood. Earlier versions of this document were officially reviewed by Gary Ng and David Booth, both of whom provided insightful comments that significantly improved the work.