This document is also available in these non-normative formats: PostScript version, PDF version, ZIP archive, and Gzip'd TAR archive.
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright © 2007 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
Current web pages, written in HTML, contain significant inherent structured data. When publishers can express this data more completely, and when tools can read it, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a user's desktop calendar; a license on a document can be detected so that users can be informed of their rights automatically; a photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself, enabling structured search and sharing.
RDFa is a syntax for expressing this structured data in XHTML. The rendered, hypertext data of XHTML is reused by the RDFa markup, so that publishers don't repeat themselves. The underlying abstract representation is RDF, which lets publishers build their own vocabulary, extend others, and evolve their vocabulary with maximal interoperability over time. The expressed structure is closely tied to the data, so that rendered data can be copied and pasted along with its relevant structure.
The rules for interpreting the data are generic, so that there do not have to be different rules for different structures; this allows authors and publishers of data to define their own formats without having to update software, or register formats via a central authority.
This document is a detailed syntax specification for RDFa. For a more gentle introduction, please consult the RDFa Primer.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is an internal draft produced by the Semantic Web Deployment Working Group [SWD-WG], in cooperation with the XHTML 2 Working Group [XHTML2-WG]. Initial work on RDFa began with the Semantic Web Best Practices and Deployment Working Group [SWBPD-WG].
At this time, the prose of the document is not complete and is not quite consistent with the current agreements the task force has with regard to processing model, etc. The section on Processing Model is up-to-date; if there is a discrepancy between the prose and that section, that section is most likely correct.
This document has no official standing within the W3C. It is also a work in progress, which means it may change at any time, without warning, and you shouldn't rely on anything in this document.
xml:base
property
attributerel
attributerev
attributerel
and rev
attributeRDF/XML [RDF-SYNTAX] provides sufficient flexibility to represent all of the abstract concepts in RDF [RDF-CONCEPTS]. However, it presents two challenges; first it is difficult or impossible to validate documents that contain RDF/XML using XML Schemas or DTD's, which makes it difficult to import RDF/XML into other markup languages. Whilst newer schema languages such as RELAX NG [RELAXNG] do provide a way to validate documents that contain arbitrary RDF/XML, it will be a while before they gain wide support.
Second, even if one could add RDF/XML directly into an XML dialect like XHTML, there would be significant data duplication between the rendered data and the RDF/XML structured data. It would be far better to add RDF to a document without repeating the document's existing data. For example, an XHTML document that explicitly renders its author's name "Mark Birbeck" should not need to repeat this name for the RDF expression of the same concept: it should be possible to supplement the existing markup in such a way that it can also be interpreted as RDF, with minimal repetition of data.
Third, as users often want to transfer structured data from one application to another, sometimes to or from a non-web-based application, it is highly beneficial to express the web data's structure "in context." The user experience could then be enhanced, for example by providing contextual information about specific rendered data, perhaps when the user "right-clicks" on an item of interest.
In the past, many attributes were 'hard-wired' directly into the markup language to represent specific concepts. For example, in XHTML 1.1 [XHTML11] and
HTML [HTML4] there is a cite
attribute; the attribute allows an author to add information to a document which is used to indicate the origin of a
quote.
However, these 'hard-wired' attributes make it difficult to define a generic process for extracting metadata from any document since a parser would need to know about each of the special attributes. One motivation for RDFa then, has been to devise a means by which documents can be augmented with metadata in a generic rather than hard-wired manner. This has been achieved by creating a fixed set of attributes and parsing rules, but allowing those attibutes to contain properties from any of a number of the growing range of available taxonomies. The values of those properties are in most cases the information that is already in an author's document.
RDFa takes the pressure off language authors to anticipate all the structural requirements users of their language might have, by outlining a new syntax for RDF that relies only on attributes. RDFa can be easily imported into other XML-based markup languages, as well as HTML, allowing any mark-up language to carry arbitrary RDF.
This specification deals specifically with the use of RDFa in HTML-based languages, such as HTML and XHTML.
In the following examples, for brevity assume that the following namespace prefixes are defined:
cc: | http://creativecommons.org/ns# |
dc: | http://purl.org/dc/elements/1.1/ |
ex: | http://example.org/ |
foaf: | http://xmlns.com/foaf/0.1/ |
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
svg: | http://www.w3.org/2000/svg |
xh11: | http://www.w3.org/1999/xhtml |
xsd: | http://www.w3.org/2001/XMLSchema# |
biblio: | http://example.org/biblio/0.1 |
taxo: | http://purl.org/rss/1.0/modules/taxonomy/ |
The metadata information that RDFa provides access to is generally understood to be a collection of statements. A statement is a basic unit of information that has been constructed in a very specific way to make it easier to process. In turn, by breaking large sets of information down into a collection of statements, even very complex metadata can be made available for processing.
To illustrate, suppose we have the following set of facts:
Albert was born on March 14, 1879, in Germany. There is a picture of him at http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.This would be quite difficult for a machine to process, and it is certainly not in a format that could be passed from one system to another. However, if we convert the information to a set of statements it begins to be more manageable. The same information could therefore be represented as follows:
Albert was born on March 14, 1879. Albert was born in Germany. Albert has a picture at http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.
To make this information machine-processable RDF defines the structure of these statements very tightly. A statement is actually a triple, meaning that it is made up three components. The first is the subject of the statement, and is what we are making our statements about. In these examples the subject is always 'Albert'.
The second part of a triple is the property of the subject that we want to define. In the examples here, the properties would be 'was born on', 'was born in', and 'has a picture at'. These are more usually called predicates in RDF.
The final part of a triple is the value of the property, or the object. In the examples here the object values are 'March 14, 1879', 'Germany', and 'http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg'.
Breaking complex information into manageable units obviously helps, but there is still ambiguity here. For example, which 'Albert' are we talking about? If some system has further facts--triples--about 'Albert' how could we know whether they are about the same person, and so add them to the list of things we know about that person? Also, if we wanted to find people born in Germany, how could we know that the predicate 'was born in' has the same purpose as the predicate 'birthplace' that exists in some other system? RDF solves this problem by replacing our vague terms with URI references.
URIs are most commonly used to identify web pages, but RDF makes use of them as a way to provide unique identifiers for concepts. For example, we could identify the subject of all of our statements by using the DBPedia URI for Albert Einstein:
<http://dbpedia.org/resource/Albert_Einstein> has the name Albert Einstein. <http://dbpedia.org/resource/Albert_Einstein> was born on March 14, 1879. <http://dbpedia.org/resource/Albert_Einstein> was born in Germany. <http://dbpedia.org/resource/Albert_Einstein> has a picture at http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg.
URI references are also used to uniquely identify the objects in metadata statements (note that the picture of Einstein is already a URI):
<http://dbpedia.org/resource/Albert_Einstein> has the name Albert Einstein. <http://dbpedia.org/resource/Albert_Einstein> was born on March 14, 1879. <http://dbpedia.org/resource/Albert_Einstein> was born in <http://dbpedia.org/resource/Germany>. <http://dbpedia.org/resource/Albert_Einstein> has a picture at <http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg>.
And of course URI references are also used to ensure that predicates are unambiguous:
<http://dbpedia.org/resource/Albert_Einstein> <http://xmlns.com/foaf/0.1/name> Albert Einstein. <http://dbpedia.org/resource/Albert_Einstein> <http://dbpedia.org/property/dateOfBirth> March 14, 1879. <http://dbpedia.org/resource/Albert_Einstein> <http://dbpedia.org/property/birthPlace> <http://dbpedia.org/resource/Germany>. <http://dbpedia.org/resource/Albert_Einstein> <http://xmlns.com/foaf/0.1/depiction> <http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg>.
Although URI resources are always used for subjects and predicates, the object part of a triple can be either a URI or a literal. In the example triples, Einstein's name is represented by a plain literal, which means that it is a basic string with no type or language information:
<http://dbpedia.org/resource/Albert_Einstein> <http://xmlns.com/foaf/0.1/name> "Albert Einstein".
Some literals, such as dates and numbers, have very specific meanings, so RDF provides a mechanism for indicating the type of a literal. A typed literal is indicated by attaching a URI
to the end of a plain literal which indicates the literal's datatype. This URI is usually based on datatypes defined in the XML Schema Datatypes specification [XML SCHEMA DATATYPES REFERENCE /
http://www.w3.org/TR/xmlschema-2/]. The following syntax would be used to unambiguously express Einstein's date of birth as a literal of type xsd:date
:
<http://dbpedia.org/resource/Albert_Einstein> <http://dbpedia.org/property/dateOfBirth> "1879-03-14"^^<http://www.w3.org/2001/XMLSchema#date>.
RDF does not have one set way to express triples, since the key ideas of RDF are the triple and the use of URIs. However, a number of mechanisms are available, such as RDF/XML, N-Triples [N-TRIPLES], and of course RDFa. Most discussions of RDF make use of the N-Triple syntax to explain their ideas, since it's quite compact. The examples we have just seen are already using this syntax, and we'll continue to use it throughout this document, with a slight variation that long URIs can be abbreviated by using a URI mapping. This is indicated by removing the angle brackets from the URI, as follows:
<http://dbpedia.org/resource/Albert_Einstein> foaf:name "Albert Einstein" . <http://dbpedia.org/resource/Albert_Einstein> p:dateOfBirth "1879-03-14"^^xs:date . <http://dbpedia.org/resource/Albert_Einstein> p:birthPlace <http://dbpedia.org/resource/Germany>. <http://dbpedia.org/resource/Albert_Einstein> foaf:depiction <http://en.wikipedia.org/wiki/Image:Albert_Einstein_Head.jpg>.Note that this is merely a way to make examples more compact and the actual triples generated would use the full URIs.
When writing examples, you will often see the following URI:
<>This indicates the 'current document', i.e., the document being processed.
A collection of triples is called a graph.
For more information on the concepts described above, see [RDF-CONCEPTS].
The following is a description of RDFa that uses RDF terminology:
The aim of RDFa is to allow [RDF graph]s to be carried in XML documents of any type. An [RDF graph] comprises [node]s linked by relationships. The basic unit of a graph is a [triple], in which a subject [node] is linked to an object [node] via a [predicate]. The subject [node] is always either an [RDF URI reference] or a [blank node], the predicate is always an [RDF URI reference], and the object of a statement can be an [RDF URI reference], a [literal], or a [blank node].
In RDFa, a subject [RDF URI reference] is indicated using the attribute about
and predicates are represented using one of the attributes property
,
instanceof
, rel
, or rev
. Objects which are [RDF URI reference]s are represented using the attributes href
, resource
or src
,
whilst objects that are [literal]s are represented either with the attribute content
(with an optional [datatype] expressed using the datatype
attribute), or the content of
the element in question.
xml:base
All [RDF URI references] are subject to xml:base
[XMLBASE]. Note that this means that in the absence of an xml:base
attribute, the document
containing the RDF statements is itself the base.
An example follows to show how xml:base
affects the subject:
<span xml:base="http://internet-apps.blogspot.com/"> <link about="" rel="dc:creator" href="http://www.blogger.com/profile/1109404" /> <meta about="" property="dc:title" content="Internet Applications" /> </span>
The triples generated would be as follows:
<http://internet-apps.blogspot.com/> dc:creator <http://www.blogger.com/profile/1109404> . <http://internet-apps.blogspot.com/> dc:title "Internet Applications" .
In order to allow for the compact expression of RDF statements, RDFa uses a superset of QNames [QName] that allows the contraction of all URIs (QNames have a syntactic restriction on the sorts of URI that can be contracted).
These Compact URIs are called CURIEs here.
The rel
, rev
, and property
attributes accept CURIE-only datatypes, while href
and about
accept mixed CURIE data. In particular, the following notation is a valid RDFa statement:
This document is licensed under a <a xmlns:cclicenses="http://creativecommons.org/licenses/" rel="cc:license" href="http://creativecommons.org/licenses/by/nc-nd/3.0/"> Creative Commons License </a>.
which generates the following triple, as expected:
<> cc:license <http://creativecommons.org/licenses/by/nc-nd/3.0/> .
A basic CURIE is comprised of two components, a prefix and a reference. The prefix is separated from the reference by a colon (:
).
curie := [ prefix [ ':' ] ] reference
prefix := NCName
reference := irelative-ref (as defined in [IRI])
The prefix value MUST be defined using the 'xmlns:' syntax specified in [XMLNAMES].
If the prefix is omitted from a CURIE, the default value of http://www.w3.org/1999/xhtml MUST be used.
A CURIE is a representation of a full IRI. This IRI is obtained by concatenating the IRI associated with the prefix
with the reference
. The result MUST
be a syntactically valid IRI [IRI].
The CURIE prefix '_' is reserved. For this reason, prefix declarations using '_' SHOULD be avoided by authors.
Host languages MAY define additional constraints on these syntax rules when CURIES are used in the context of those host languages. Host languages MUST NOT relax the CURIE syntax constraints defined in this specification.
The main idea behind the syntax for RDFa is that existing data should be easy to update to convey RDF triples. Thus, the bulk of RDFa can be expressed using only attributes applied to existing elements within the XML document. These are
about
,rel
,rev
,property
,href
,resource
,src
,datatype
,content
instanceof
Several of these attributes alredy exist in HTML, and their general HTML use is preserved. This specification simply gives an RDF interpretation to their existing use.
For example, given an XHTML chunk as follows:
This photo was taken by <span class="author">Mark Birbeck</span>.
a simple attribute augmentation can yield an RDF triple:
This photo was taken by <span class="author" about="photo1.jpg" property="dc:creator">Mark Birbeck</span>.
which yields:
<photo1.jpg> dc:creator "Mark Birbeck" .
Similarly, links can be augmented to express RDF triples. Consider an XHTML chunk:
This photo was taken by <a href="http://www.blogger.com/profile/1109404">Mark Birbeck</a>.
When the RDF object is a URI, the RDF predicate is designated using rel
:
This photo was taken by <a about="photo1.jpg" rel="dc:creator" href="http://www.blogger.com/profile/1109404">Mark Birbeck</a>.
which yields:
<photo1.jpg> dc:creator <http://www.blogger.com/profile/1109404> .
It's important to note that the various RDFa attributes can be used on any existing element of the host language. Note also that one can express a reverse relationship using the rev
attribute. For example, if the photo in question is actually a depiction of Mark, one could write:
This photo was taken by <a about="photo1.jpg" rev="foaf:img" href="http://www.blogger.com/profile/1109404">Mark Birbeck</a>.
which would yield:
<http://www.blogger.com/profile/1109404> foaf:img <photo1.jpg> .
Both relations can be expressed simultaneously:
This photo was taken by <a about="photo1.jpg" rel="dc:creator" rev="foaf:img" href="http://www.blogger.com/profile/1109404">Mark Birbeck</a>.
which then yields two triples:
<photo1.jpg> dc:creator <http://www.blogger.com/profile/1109404> . <http://www.blogger.com/profile/1109404> foaf:img <photo1.jpg> .
It's possible to go further and add the attributes used for denoting statements in which the object is a [literal]:
This photo was taken by <a about="photo1.jpg" property="dc:title" rel="dc:creator" rev="foaf:img" href="http://www.blogger.com/profile/1109404">Mark Birbeck</a>.
which would then yield:
<photo1.jpg> dc:creator <http://www.blogger.com/profile/1109404> . <http://www.blogger.com/profile/1109404> foaf:img <photo1.jpg> . <photo1.jpg> dc:title "Mark Birbeck" .
Or going further:
This photo was taken by <a about="photo1.jpg" property="dc:title" content="Portrait of Mark" rel="dc:creator" rev="foaf:img" href="http://www.blogger.com/profile/1109404">Mark Birbeck</a>.
which would then yield:
<photo1.jpg> dc:creator <http://www.blogger.com/profile/1109404> . <http://www.blogger.com/profile/1109404> foaf:img <photo1.jpg> . <photo1.jpg> dc:title "Portrait of Mark" .
It's possible to do all of this without ambiguity, since the property
attribute always denotes a predicate in a statement using a [literal] as the object, whilst the rel
and rev
attributes always denote a predicate in a statement using a [URI reference] as the object.
Of course, the more natural way to express the three above triples is to strive to make all metadata literals and URIs meaningful within the host language. Specifically, in the case of XHTML, it makes sense to render as much of the useful metadata as possible and use RDFa to mark up this rendered data. The following XHTML thus generates the same triples as shown above.
This photo, entitled <span about="photo1.jpg" property="dc:title">Portrait of Mark</span> was taken by <a about="photo1.jpg" rel="dc:creator" rev="foaf:img" href="http://www.blogger.com/profile/1109404">Mark himself</a>.
The value of the about
attribute sets the subject for any nested triples which means that the same triples can be expressed using this, more compact, syntax:
<div about="photo1.jpg"> This photo, entitled <span property="dc:title">Portrait of Mark</span> was taken by <a rel="dc:creator" rev="foaf:img" href="http://www.blogger.com/profile/1109404">Mark himself</a>. </div>
A second feature of RDFa is that it is possible to use parts of the host document to provide the [subject] of a [triple]. This marks RDFa from other approaches to serialising RDF, in that the the same syntax can now be used to make statements about parts of a document, and external documents.
It is possible to make such statements using the syntax introduced in the examples above:
<html xmlns:dc="http://purl.org/dc/elements/1.1/"> <head> <title>On Crime and Punishment</title> </head> <body> <blockquote id="q1" about="#q1" rel="dc:source" resource="urn:isbn:0140449132" > <p> Rodion Romanovitch! My dear friend! If you go on in this way you will go mad, I am positive! Drink, pray, if only a few drops! </p> </blockquote> </body> </html>
Using qualifying statements, RDFa allows a single XML dialect document to include multiple RDF entities. Relations between the various entities of a given page can also be defined using RDFa notation.
Consider the following XHTML, which defines two RDF entities of type taxo:topic
, two RDF entities of type biblio:Publication
, metadata pertinent to each publication,
including dc:title
and dc:creator
, and relations of type taxo:topics
between the publications and tags:
<html xmlns:dc="http://purl.org/dc/elements/1.1/"> <head> <title>Mark's Publications</title> </head> <body> <h2>Tags</h2> <div id="tag_standards"> <link rel="rdf:type" href="[taxo:topic]" /> Standards </div> <div id="tag_xforms"> <link rel="rdf:type" href="[taxo:topic]" /> XForms </div> <h2>Publications</h2> <div id="publication_1"> <link rel="rdf:type" href="[biblio:Publication]" /> <link rel="dc:creator" href="http://www.blogger.com/profile/1109404" /> <meta property="dc:title">A Standards-Based Virtual Machine</meta> <link rel="taxo:topics" href="#tag_standards" /> </div> <div id="publication_2"> <link rel="rdf:type" href="[biblio:Publication]" /> <link rel="dc:creator" href="http://www.blogger.com/profile/1109404" /> <meta property="dc:title">XForms and Internet Applications</meta> <link rel="taxo:topics" href="#tag_standards" /> <link rel="taxo:topics" href="#tag_xforms" /> </div> </body> </html>
This yields the expected triples:
<#tag_standards> rdf:type taxo:topic . <#tag_xforms> rdf:type taxo:topic . <#publication_1> rdf:type biblio:Publication . <#publication_1> dc:creator <http://www.blogger.com/profile/1109404> <#publication_1> dc:title "A Standards-Based Virtual Machine"^^rdf:XMLLiteral . <#publication_1> taxo:topics <#tag_standards> . <#publication_2> rdf:type biblio:Publication . <#publication_2> dc:creator <http://www.blogger.com/profile/1109404> . <#publication_2> dc:title "XForms and Internet Applications"^^rdf:XMLLiteral . <#publication_2> taxo:topics <#tag_standards> . <#publication_2> taxo:topics <#tag_xforms> .
Beyond this theoretical example, this application of RDFa is particularly useful for formats like FOAF. (See examples.)
The previous series of examples may mislead one to think that RDFa statements are only contextual, only meant to qualify existing elements. However, as the first examples implied, a fixed
about
attribute can be used to specify a global subject. It is actually quite easy to make independent, global RDF statements. Statements like:
This document is licensed under a <a about="" rel="cc:license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/"> Creative Commons </a>.
will produce the same triple no matter where they're located in the document:
<> cc:license <http://creativecommons.org/licenses/by-nc-nd/3.0/> .
This section is normative.
This section looks at a generic set of processing rules for creating a set of triples that represent the metadata present in an XHTML+RDFa document. Processing need not follow the DOM traversal technique outlined here, although the effect of following some other manner of processing must be the same as if the processing outlined here were followed. The processing model is explained using the idea of DOM traversal which makes it easier to describe (particularly in relation to the 'evaluation context').
Parsing a document for RDFa triples is carried out by starting at the root element of the document, and visiting each of its child elements in turn, applying processing rules. Processing is recursive in that for each child element the processor also visits each of its child elements, and applies the same processing rules.
As processing continues, various rules are applied which will either generate triples, or change the [evaluation context] information which will be used in subsequent processing. Some of the rules will be determined by the host language--in this case XHTML--and some of the rules will be part of RDFa.
Note that we don't say anything about what should happen to the triples generated, or whether more triples might be generated during processing than are outlined here. However, to be conformant, an RDFa processor needs to act as if at least the rules in this section are applied.
During processing, each rule is applied within an 'evaluation context'. Rules may further modify this evaluation context, or create triples that can be established by making use of this evaluation context. The context itself consists of the following pieces of information:
base
element. The important
thing is that it establishes a URL against which relative paths can be evaluated.base
, but it will usually change during the course of processing. Processing would normally begin after the document to be parsed has been completely loaded. However, there is no requirement for this to be the case, and it is certainly possible to use a
SAX-style processing model to extract the RDFa information. Note that if some approach other than the DOM traversal approach defined here is used, it is important to ensure that any meta
or link
elements processed in the head
of the document honour any occurrences of base
which may appear after those elements. (In other words, HTML
processing rules must still be applied, even if document processing takes place in a non-HTML environment such as a search indexer.)
At the beginning of processing, the [current evaluation context] is initialised as follows:
base
element, if present;Processing then begins with the root element, and all nodes in the tree are processed according to the following rules, depth-first:
xmlns
atttribute. The value to be mapped is set by the XML namespace prefix, and the value to map is the value of the
attribute--a URI. Note that the URI is not processed in any way; in particular if it is a relative path it is not resolved against the [current base]. Authors are advised to follow best practice for
using namespaces, which includes not using relative paths. (See [xyz].)xml:lang
, or the HTML attribute lang
.@about
. Note that the final value of the [current resource] is an absolute URI, which means that if @about
contains a relative path the value must be normalised against [base] in the [current evaluation context], using the algorithm defined in RFC 3986.resource
. If there is no resource
attribute then the HTML src
attribute is used, and if that is not present, the HTML href
attribute is used. If
none of these are present then a unique identifier or [bnode] is created. Note that the final value of the [current object resource] is an absolute URI, which means that if any of these attributes
contain relative paths they must be normalised against [base] in the [current evaluation context], using the algorithm defined in RFC 3986.content
attribute is present, or the body of the [current element] contains only
text (i.e., there are no child elements), or the body of the [current element] does have child elements but the datatype
attribute has an empty value. Additionally, if
there is a value for [current language] then the value of the [plain literal] should include this language information, as described here:???. The actual literal is either the value of the
content
attribute (if present) or a string created by concatenating the inner content of each of the children in turn, of the [current element].datatype
attribute is present, and does not have an empty value. The actual literal is
either the value of the content
attribute (if present) or a string created by concatenating the inner content of each of the children in turn, of the [current element]. The
final string includes the datatype, as described here:???datatype
attribute is not present. The value of
the [XML literal] is a string created from the inner content of the [current element], i.e., not including the element itself.property
attribute. If present, the attribute must contain one or more [basic curies], each
of which is converted to an absolute URI using CURIE processing rules, and then used to generate a triple as follows:
instanceof
attribute. If present, the attribute must contain one or more [basic
curies], each of which is converted to an absolute URI using CURIE processing rules, and then used to generate a triple as follows:
true
.@rel
and @rev
attributes.rel
attribute must contain one or more [basic curies], each of which is converted to an absolute URI using CURIE processing rules, and then used to generate a triple as
follows:
rev
attribute must contain one or more [basic curies], each of which is converted to an absolute URI using CURIE processing rules, and then used to generate a triple as
follows:
true
.true
then the [current resource] is set to the value of the [current object resource], and the [chaining] flag is set to false
.NOTE: The recursive aspect of this needs to be explained a little better, in particular when we get stuff back off the stack.
This section provides an in-depth examination of the processing steps described in the previous section. It also includes examples which may help clarify some of the steps involved.
When triples are created they will always be in relation to the [current resource]. When parsing begins the [current resource] will be the URI of the document being parsed, or a value as set by
base
. However, as processing progresses, any about
attributes will change the [current resource]. The value of about
is a URI. If it is relative it is resolved
against the current [base] value.
Daniel knows <a about="mailto:daniel.brickley@bristol.ac.uk" rel="foaf:knows" href="mailto:libby.miller@bristol.ac.uk">Libby</a>. Libby knows <a about="mailto:libby.miller@bristol.ac.uk" rel="foaf:knows" href="mailto:ian.sealy@bristol.ac.uk">Daniel</a>.
<div about="photo1.jpg"> <span class="attribution-line">this photo was taken by <span property="dc:creator">Mark Birbeck</span> </span> </div>
will inherit the about
attribute from the enclosing div
and yield the expected triple:
<photo1.jpg> dc:creator "Mark Birbeck" .
There are two types of object, [URI resources] and [literals].
A [literal] object can be set using either the attribute content
, or any inline text in an element. A [URI resource] object can be set using one of the attributes href
,
resource
or src
. Which attribute is used to generate a triple depends on how the predicate is indicated, as discussed in the next section.
content
attributeThe content
attribute can be used to indicate a [plain literal] as follows:
<meta about="http://internet-apps.blogspot.com/" property="dc:creator" content="Mark Birbeck" />
or, alternatively, using the content of the element:
<span about="http://internet-apps.blogspot.com/" property="dc:creator">Mark Birbeck</span>
Both of these examples give the following triple:
<http://internet-apps.blogspot.com/> dc:creator "Mark Birbeck" .
The value of the content
attribute is given precedence over any element content, so the following would give exactly the same triple:
<span about="http://internet-apps.blogspot.com/" property="dc:creator" content="Mark Birbeck">John Doe</span>
RDF allows [plain literal]s to have a language tag, as illustrated by the following example from [RDFTESTS-RDFMS-XMLLANG-TEST006]:
<http://example.org/node> <http://example.org/property> "chat"@fr .
In RDFa the XML language attribute -- xml:lang
-- is used to add this information, whether the plain literal is designated by the content
attribute, or by a
datatype
value of plaintext
:
<meta about="http://example.org/node" property="ex:property" xml:lang="fr" content="chat" />
Note that the value can be inherited as defined in [XML-LANG], so the following syntax will give the same triple as above:
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr"> <head> <title xml:lang="en">Example</title> <meta about="http://example.org/node" property="ex:property" content="chat" /> </head> ... </html>
XML documents cannot contain XML mark-up in their attributes, which means it is not possible to represent XML within the content
attribute. The following would cause an XML parser to
generate an error:
<head about=""> <meta property="dc:title" content="E = mc<sup>2</sup>: The Most Urgent Problem of Our Time" /> </head>
It does not help to escape the content, since the output would simply be a string of text containing numerous ampersands:
<> dc:title "E = mc&amp;lt;sup&amp;gt;2&amp;lt;/sup&amp;gt;: The Most Urgent Problem of Our Time" .
RDF does, however, provide a datatype for indicating [XML literal]s. RDFa therefore adds this datatype to any [literal] that has child elements. For example:
<h2 property="dc:title"> E = mc<sup>2</sup>: The Most Urgent Problem of Our Time </h2>
would generate the expected triple:
<> dc:title "E = mc<sup>2</sup>: The Most Urgent Problem of Our Time"^^rdf:XMLLiteral .
There will be situations where the extra mark-up is not actually part of the meaning of the literal, and can be ignored. In this situation an empty datatype
value can be used to
override the XML literal behaviour:
<p>You searched for <strong>Einstein</strong>:</p> <p about="http://dbpedia.org/resource/Albert_Einstein"> <span property="foaf:name" datatype="">Albert <strong>Einstein</strong></span> (March 14, 1879 – April 18, 1955) was a German-born theoretical physicist. </p>
Although the rendering of this page has highlighted the term the user searched for, setting datatype
to nothing will ignore this, giving the following triples:
<http://dbpedia.org/resource/Albert_Einstein> foaf:name "Albert Einstein" .
Note that the value of this [XML Literal] is the exclusive canonicalization of the RDFa element's value.
clarify canonicalization
as per Elias's email, we need to clarify what this canonicalization is. datatype
RDF allows [literal]s to be given a data type, as illustrated by the following example from [RDFTESTS-DATATYPES-TEST001]:
<http://example.org/foo> <http://example.org/bar> "10"^^<http://www.w3.org/2001/XMLSchema#integer> .
This can be represented in RDFa as follows:
<span about="http://example.org/foo" property="ex:bar" content="10" datatype="xsd:integer">ten</span>
EliasT comments
We need to explain which datatypes are allowed and emphasize "plaintext".href
attribute When a triple predicate has been expressed using the rel
attribute, the href
attribute on the [RDFa statement]'s element is used to indicate the object as a [URI
reference]. Its type, just like that of the about
attribute, is a URI:
<link about="mailto:daniel.brickley@bristol.ac.uk" rel="foaf:knows" href="mailto:libby.miller@bristol.ac.uk" />
rel
without href
When a triple predicate has been expressed using the rel
attribute, and no href
attribute exists on the same [RDFa element], then the CURIE represented by this element is used as the object. Specifically, this CURIE is affected by the about
and
id
attributes. When neither is present, the object is a bnode (bnodes are discussed further in Section 5 [REF]). In all cases, the subject resolution for child elements is affected:
where they do not override the subject, their subject is this same CURIE here resolved as the object.
Consider, for example, a simple fragment of HTML for describing the creator of a web page, with further information about the creator, including his name and email address:
<div rel="dc:creator"> <span property="foaf:name">Ben Adida</span> (<a property="foaf:mbox" href="mailto:ben@adida.net">ben@adida.net</a>)
The above yields the following triples:
<> dc:creator _:div0 . _:div0 foaf:name "Ben Adida" . _:div0 foaf:mbox <mailto:ben@adida.net> .
The predicate of a statement is specified using a property
, instanceof
, rel
or rev
attribute. These attributes can be placed on any element in
a document, and can co-exist on the same element. Each of these attributes accepts space-separated CURIEs, each of which will be used to generate exactly one triple. The attribute indicates the type
of object to use for the triple, literal or URI. In the descriptions, the case of a single CURIE for the attributes is described; when an attribute consists of more than one (space separated) CURIES,
the process is applied for each of them.
property
attribute A property
attribute designates a predicate whose object is a literal. The object of the triple will have been established using [literal] object resolution (See Section @@4.4). The
following example indicates the name of the author responsible for the text being quoted:
<blockquote about="#q1"> <p> Rodion Romanovitch! My dear friend! If you go on in this way you will go mad, I am positive! Drink, pray, if only a few drops! </p> <p> by <span property="dc:creator">Fyodor Dostoevsky</span> </p> </blockquote>
rel
attribute A rel
attribute designates a predicate whose object is a resource. The object of the triple is determined using [URI reference] object resolution (Section 4.4). The following example
indicates that one 'FOAF person' knows another:
Daniel <a about="mailto:daniel.brickley@bristol.ac.uk" rel="foaf:knows" href="mailto:libby.miller@bristol.ac.uk">knows</a> Libby.
The triple generated is:
<mailto:daniel.brickley@bristol.ac.uk> foaf:knows <mailto:libby.miller@bristol.ac.uk> .
rev
attributeA rev
attribute, like its cousin the rel
attribute, indicates a predicate whose object is a resource, though its subject and object resolutions are reversed. The subject
of the triple is determined using [URI reference] object resolution (Section 4.4). Note that resolution is effectively the same as if the rev
attribute had been a
rel
attribute with object and subject reversed. The following example indicates that one 'FOAF person' knows another:
Libby <a about="mailto:daniel.brickley@bristol.ac.uk" rev="foaf:knows" href="mailto:libby.miller@bristol.ac.uk">knows</a> Daniel.
and the [triple] generated is essentially a reversal of our previous example:
<mailto:libby.miller@bristol.ac.uk> foaf:knows <mailto:daniel.brickley@bristol.ac.uk> .
rel
and rev
attributeIt is perfectly acceptable to use both the rel
and rev
attributes within the same element, yielding two triples without repeating the subject and object. For example:
Libby and Daniel <a about="mailto:daniel.brickley@bristol.ac.uk" rel="foaf:knows" rev="foaf:knows" href="mailto:libby.miller@bristol.ac.uk" >know each other</a>.
expresses:
<mailto:libby.miller@bristol.ac.uk> foaf:knows <mailto:daniel.brickley@bristol.ac.uk> . <mailto:daniel.brickley@bristol.ac.uk> foaf:knows <mailto:libby.miller@bristol.ac.uk> .
The predicates need not be the same, of course.
The rel
, rev
, and property
attributes accept multiple space-separated CURIEs as a single attribute value. When there is more than one CURIE, then each
expresses the exact same triples it would if it were the single CURIE in the attribute value. For example:
This document was authored and published by <a about="" rel="dc:creator dc:publisher" href="http://example.org/~markb"> Mark Birbeck </a>.
is interpreted by performing the normal subject and object resolutions dictated by the rel
attribute on both the dc:creator
and dc:publisher
values. The
resulting triples are:
<> dc:creator <http://example.org/~markb> . <> dc:publisher <http://example.org/~markb> .
The same exact reasoning applies to the rev
and property
attributes.
NOTE: I intend to remove this section.
Having established the different parts of the syntax of RDFa, we will now look at the various aspects of the RDF Abstract Syntax, and see how they can be represented in RDFa.
A [blank node] is generated explicitly when an [RDFa statement] uses a bnode CURIE as its subject. A [blank node] can be generated more implicitly when an XML element without an
about
attribute has meta
or link
child elements, also without about
attributes of their own. In the latter case, the [unique anonymous ID] generated to
identify the [blank node] is associated with the [context statement] of the meta
and link
elements. This allows a number of statements to be made about the same [blank
node].
For example, to establish relationships between a [blank node] and literals or URIs, one can use the implicit [blank node] construction of our earlier example, repeated here:
NOTE: This is all out of date. It can probably come out, but I just want to look at the use-case a bit more to see if the example should be re-created with correct syntax.
<blockquote> <link rel="dc:source" href="urn:isbn:0140449132" /> <meta property="dc:creator" content="Fyodor Dostoevsky" /> <p> Rodion Romanovitch! My dear friend! If you go on in this way you will go mad, I am positive! Drink, pray, if only a few drops! </p> </blockquote>
This would generate the following [triple]s:
_:a dc:source <urn:isbn:0140449132> . _:a dc:creator "Fyodor Dostoevsky" .
One could also use the more explicit declaration:
<blockquote about="[_:a]"> <p> Rodion Romanovitch! My dear friend! If you go on in this way you will go mad, I am positive! Drink, pray, if only a few drops! </p> </blockquote> <link about="[_:a]" rel="dc:source" href="urn:isbn:0140449132" /> <meta about="[_:a]" property="dc:creator" content="Fyodor Dostoevsky" />
To establish relationships between [blank node]s, the [unique anonymous ID] must be set explicity using a CURIE bnode as subject or object. For example, if our desired output is the following [triple]s:
_:a foaf:mbox <mailto:daniel.brickley@bristol.ac.uk> . _:b foaf:mbox <mailto:libby.miller@bristol.ac.uk> . _:a foaf:knows _:b .
we could use the following XHTML:
<link about="[_:a]" rel="foaf:mbox" href="mailto:daniel.brickley@bristol.ac.uk" /> <link about="[_:b]" rel="foaf:mbox" href="mailto:libby.miller@bristol.ac.uk" /> <link about="[_:a]" rel="foaf:knows" href="[_:b]" />
or, alternatively, if we wish to partly render the information in XHTML:
<div about="[_:a]"> DanBri can be reached via <a rel="foaf:mbox" href="mailto:daniel.brickley@bristol.ac.uk"> email </a>. He knows Libby. <link rel="foaf:knows" href="[_:b]" /> </div> <div about="[_:b]"> Libby can be reached via <a rel="foaf:mbox" href="mailto:libby.miller@bristol.ac.uk"> email </a> </div>
One of the advantages of using the same syntax to make general statements as well as statements about a document is that in many cases a document can carry its own metadata. For example, if an XHTML document contains a navigable link to the Creative Commons license, this link can also be used to express metadata:
<div about=""> This document is licensed under a <a rel="cc:license" href="http://creativecommons.org/licenses/by-sa/2.0/"> Creative Commons License </a> which, among other things, requires that you provide attribution to the author, <a rel="dc:creator" href="http://ben.adida.net">Ben Adida</a>. </div>
This chunk of XHTML will generate the same triples, no matter what other XHTML contains it:
<> cc:license <http://creativecommons.org/licenses/by-sa/2.0/> . <> dc:creator <http://ben.adida.net> .
FOAF requires the definition of at least two RDF entities: the FOAF person, and the FOAF homepage, which cannot be the same. Thus, the following XHTML can be used to represent a FOAF record:
<html xmlns:geo="http://www.w3.org/2003/01/geo/" ...> <head> <title property="dc:title">Dan's home page</title> </head> <body> <section id="person"> <span about="[_:geolocation]"> Dan is located at latitude <meta property="geo:lat">51.47026</meta> and longitude <meta property="geo:long">-2.59466</meta> </span> <link rel="rdf:type" href="[foaf:Person]" /> <link rel="foaf:homepage" href="" /> <link rel="foaf:based_near" href="[_:geolocation]" /> <h1 property="foaf:name">Dan Brickley</h1> </section> </body> </html>
which yields the correct FOAF triples:
<> dc:title "Dan's home page"^^rdf:XMLLiteral . _:geolocation geo:lat "51.47026"^^rdf:XMLLiteral . _:geolocation geo:long "-2.59466"^^rdf:XMLLiteral . <#person> rdf:type foaf:Person . <#person> foaf:homepage <> . <#person> foaf:based_near _:geolocation . <#person> foaf:name "Dan Brickley"^^rdf:XMLLiteral .
If one wants to make the foaf:Person
a blank node, then the only change required is taking out the id="person"
from the span
element, which then yields the
following triples:
<> dc:title "Dan's home page" . _:geolocation geo:lat "51.47026" . _:geolocation geo:long "-2.59466" . _:span0 rdf:type foaf:Person . _:span0 foaf:homepage <> . _:span0 foaf:based_near _:geolocation . _:span0 foaf:name "Dan Brickley" .
This section is normative.
This section is informative.
2007-09-04: Migrated to XHTML 2 Working Group Publication System. Converted to a format that is consistent with REC-Track documents. Updated to reflect current processing model. Added normative definition of CURIEs. Started updating prose to be consistent with current task force agremeents. [ShaneMcCarron], [StevenPemberton], [MarkBirbeck]
2007-04-06: fixed some of the language to talk about "structure" rather than metadata. Added note regarding space-separated values in predicate-denoting attributes. [BenAdida]
2006-01-16: made the use of CURIE type for rel
,rev
,property
consistent across document (particularly section 2.4 was erroneous). [BenAdida]
This section is informative.
At the time of publication, the participants in the W3C XHTML 2 Working Group were: