W3C

RDFa API

An API for extracting structured data from Web documents

W3C Working Draft 23 September 2010

This version:
http://www.w3.org/TR/2010/WD-rdfa-api-20100923/
Latest published version:
http://www.w3.org/TR/rdfa-api/
Latest editor's draft:
http://www.w3.org/2010/02/rdfa/sources/rdfa-api/
Previous version:
http://www.w3.org/TR/2010/WD-rdfa-api-20100608/
Editors:
Manu Sporny, Digital Bazaar, Inc.
Benjamin Adrian, German Research Center for Artificial Intelligence GmbH
Mark Birbeck, Backplane Ltd.
Author:
Ivan Herman, W3C

This document is also available in this non-normative format: diff to previous version.


Abstract

RDFa [RDFA-CORE] enables authors to publish structured information that is both human- and machine-readable. Concepts that have traditionally been difficult for machines to detect, like people, places, events, music, movies, and recipes, are now easily marked up in Web documents. While publishing this data is vital to the growth of Linked Data, using the information to improve the collective utility of the Web for humankind is the true goal. To accomplish this goal, it must be simple for Web developers to extract and utilize structured information from a Web document. This document details such a mechanism: an RDFa Application Programming Interface (RDFa API) that allows simple extraction and usage of structured information from a Web document.

How to Read this Document

This section is non-normative.

This is a version with examples edited by TimBL to see what the tabulator RDF API would look like in the same examples.

This document is a detailed specification for an RDFa API. The document is primarily intended for the following audiences:

For those looking for an introduction to the use of RDFa, or some real-world examples, please consult the RDFa Primer [RDFA-PRIMER].

If you are not familiar with RDF, you should read about the Resource Description Framework (RDF) [RDF-CONCEPTS] before reading this document. The [RDF-CONCEPTS] document outlines the core data model that is used by RDFa to express information.

If you are not familiar with RDFa, you should read and understand the [RDFA-CORE] specification. It describes how data is encoded in host languages using RDFa. A solid understanding of concepts in RDFa Core will inevitably help you understand how the RDFa API works in concert with how the data is expressed in a host language.

If you are a Web developer and are already familiar with RDF and RDFa, and you want to programmatically extract RDFa content from documents, then you will find the Concept Diagram and Developing with the API sections of most interest. They contain a handful of ECMAScript examples on how to use the RDFa API.

Readers who are not familiar with the Terse RDF Triple Language [TURTLE] may want to read the specification in order to understand the short-hand RDF notation used in some of the examples.

This document uses the Web Interface Definition Language [WEBIDL] to specify all language bindings. If you intend to implement the RDFa API you should be familiar with the Web IDL language [WEBIDL].

Examples may contain references to existing vocabularies and use abbreviations in CURIEs and source code. The following is a list of all vocabularies and their abbreviations, as used in this document:

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

The following changes have been made since the First Public Working Draft:

This document was published by the RDFa Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-rdfa-wg@w3.org (subscribe, archives). All feedback is welcome.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.


1. Introduction

This section is non-normative.

RDFa provides a means to attach properties to elements in XML and HTML documents. Since the purpose of these additional properties is to provide information about real-world items, such as people, films, companies, events, and so on, properties are grouped into objects called PropertyGroups.

The RDFa API provides a set of interfaces that make it easy to manipulate DOM objects that contain information that is also part of a PropertyGroup. This specification defines these interfaces.

A document that contains RDFa effectively provides two data layers. The first layer is the information about the document itself, such as the relationship between the elements, the value of its attributes, the origin of the document, and so on, and this information is usually provided by the Document Object Model, or DOM [DOM-LEVEL-1].

The second data layer comprises information provided by embedded metadata, such as company names, film titles, ratings, and so on, and this is usually provided by RDFa [RDFA-CORE], Microformats [MICROFORMATS], DC-HTML, GRDDL, or Microdata.

Whilst this embedded information could be accessed via the usual DOM interfaces -- for example, by iterating through child elements and checking attribute values -- the potentially complex interrelationships between the data mean that it is more efficient for developers if they have access to the data after it has been interpreted.

For example, a document may contain the name of a person in one section and the phone number of the same person in another; whilst the basic DOM interfaces provide access to these two pieces of information through normal navigation, it is more convenient for authors to have these two pieces of information available in one property collection, reflecting the final PropertyGroup.

All of this is achieved through the RDFa API.
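
As a minimal sketch of the idea above, assume the interpreted data is already available as an array of subject/property/value records. The `triples` array, the phone number, and the `groupBySubject` helper are illustrative assumptions and are not part of the RDFa API; the sketch only shows how two facts found in different parts of a document can end up in one property collection per subject.

```javascript
// Hypothetical, already-interpreted data: the name and the phone number of the
// same person were found in two different sections of the document.
var triples = [
  { subject: "http://example.org/people#bob", property: "http://xmlns.com/foaf/0.1/name", value: "Bob" },
  { subject: "http://example.org/people#bob", property: "http://xmlns.com/foaf/0.1/phone", value: "tel:+1-555-0100" }
];

// Collect every property of a subject into a single object, keyed by subject.
// Each property maps to a list, because a property may have several values.
function groupBySubject(triples) {
  var groups = {};
  triples.forEach(function (t) {
    if (!Object.prototype.hasOwnProperty.call(groups, t.subject)) {
      groups[t.subject] = {};
    }
    var group = groups[t.subject];
    if (!Object.prototype.hasOwnProperty.call(group, t.property)) {
      group[t.property] = [];
    }
    group[t.property].push(t.value);
  });
  return groups;
}

var bob = groupBySubject(triples)["http://example.org/people#bob"];
console.log(bob["http://xmlns.com/foaf/0.1/name"][0]);  // "Bob"
console.log(bob["http://xmlns.com/foaf/0.1/phone"][0]); // "tel:+1-555-0100"
```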

There are many scenarios in which the RDFa API can be used to extract information from a Web document. The following sections describe a few of these scenarios.

1.1 Importing Data

Amy has enriched her band's web-site to include Google Rich Snippets event information. Google Rich Snippets are used to mark up information for the search engine to use when displaying enhanced search results. Amy also uses some ECMAScript code that she found on the web that automatically extracts the event information from a page and adds an entry into a personal calendar.

Brian finds Amy's web-site through Google and opens the band's page. He decides that he wants to go to the next concert. Brian is able to add the details to his calendar by clicking on the link that is automatically generated by the ECMAScript tool. The ECMAScript extracts the RDFa from the web page and places the event into Brian's personal calendaring software - Google Calendar.

<div prefix="v: http://rdf.data-vocabulary.org/#" typeof="v:Event"> 
  <a rel="v:url" href="http://amyandtheredfoxies.example.com/events" 
     property="v:summary">Tour Info: Amy And The Red Foxies</a>
  
  <span rel="v:location">
  	<a typeof="v:Organization" rel="v:url" href="http://www.kammgarn.de/" property="v:name">Kammgarn</a>
  </span>
  <div rel="v:photo"><img src="foxies.jpg"/></div>
  <span property="v:summary">Hey K-Town, Amy And The Red Foxies will rock Kammgarn in October.</span>
  When: 
  <span property="v:startDate" content="2009-10-15T19:00">15. Oct., 7:00 pm</span>-
  <span property="v:endDate" content="2009-10-15T21:00">9:00 pm</span>

  Category: <span property="v:eventType">concert</span>
</div>

1.2 Enhanced Browser Interfaces

Dave is writing a browser plugin that filters product offers in a web page and displays an icon to buy the product or save it to a public wishlist. The plugin searches for any mention of product names, thumbnails, and offered prices. The information is listed in the URL bar as an icon, and upon clicking the icon, displayed in a sidebar in the browser. He can then add each item to a list that is managed by the browser plugin and published on a wishlist website.

<div prefix="rdfs: http://www.w3.org/2000/01/rdf-schema#
             foaf: http://xmlns.com/foaf/0.1/
             gr: http://purl.org/goodrelations/v1# 
             xsd: http://www.w3.org/2001/XMLSchema#" xml:lang="en">
 
  <div about="#offering" typeof="gr:Offering">
    <div rel="foaf:page" resource="http://www.amazon.com/Harry-Potter-Deathly-Hallows-Book/dp/0545139708"></div>
    <div property="rdfs:label">Harry Potter and the Deathly Hallows</div>
    <div property="rdfs:comment">In this final, seventh installment of the Harry Potter series, J.K. Rowling 
    unveils in spectacular fashion the answers to the many questions that have been so eagerly 
    awaited. The spellbinding, richly woven narrative, which plunges, twists and turns at a 
    breathtaking pace, confirms the author as a mistress of storytelling, whose books will be read, 
    reread and read again.</div>
    <div rel="foaf:depiction">
       <img src="http://ecx.images-amazon.com/images/I/51ynI7I-qnL._SL500_AA300_.jpg" />
    </div>
    <div rel="gr:hasBusinessFunction" resource="http://purl.org/goodrelations/v1#Sell"></div>
    <div rel="gr:hasPriceSpecification">Buy for
      <span typeof="gr:UnitPriceSpecification">
        <span property="gr:hasCurrency" content="USD" datatype="xsd:string">$</span>
        <span property="gr:hasCurrencyValue" datatype="xsd:float">7.49</span>
      </span>
    </div> Pay via: 
      <span rel="gr:acceptedPaymentMethods" resource="http://purl.org/goodrelations/v1#PayPal">PayPal</span>
      <span rel="gr:acceptedPaymentMethods" resource="http://purl.org/goodrelations/v1#MasterCard">MasterCard</span>
  </div>

</div>

1.3 Data-based Web Page Modification

Dale has a site that contains a number of images, showcasing his photography. He has already used RDFa to add licensing information about the images to his pages, following the instructions provided by Creative Commons. Dale would like to display the correct Creative Commons icons for each image so that people will be able to quickly determine which licenses apply to each image.

<div prefix="cc: http://creativecommons.org/ns#">
  <img src="http://dale.example.com/images/image1.png" 
       rel="cc:license" 
       resource="http://creativecommons.org/licenses/by/3.0/us/"/>
<a href="http://dale.example.com" property="cc:attributionName" 
   rel="cc:attributionURL">Dale</a>
</div>   
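
One step Dale's script might need is working out which license conditions apply once the `cc:license` IRI has been extracted. The following is only a sketch: the `licenseConditions` helper is hypothetical, and it relies on the usual layout of Creative Commons license IRIs (`/licenses/<conditions>/<version>/...`). Mapping each condition code to an actual icon is left to the page.

```javascript
// Split a Creative Commons license IRI such as
// http://creativecommons.org/licenses/by-nc-sa/3.0/us/ into its condition codes.
// Returns an empty list when the IRI does not look like a CC license IRI.
function licenseConditions(licenseIri) {
  var match = /^https?:\/\/creativecommons\.org\/licenses\/([a-z-]+)\//.exec(licenseIri);
  return match ? match[1].split("-") : [];
}

console.log(licenseConditions("http://creativecommons.org/licenses/by/3.0/us/"));    // [ 'by' ]
console.log(licenseConditions("http://creativecommons.org/licenses/by-nc-sa/3.0/")); // [ 'by', 'nc', 'sa' ]
```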

1.4 Automatic Summaries

Mary is responsible for keeping the projects section of her company's home page up-to-date. She wants to display info-boxes that summarize details about the members associated with each project. The information should appear when hovering the mouse over the link to each member's homepage. Since each member's homepage is annotated with RDFa, Mary writes a script that requests the page's content and extracts necessary information via the RDFa API.

<div prefix="dc: http://purl.org/dc/terms/ foaf: http://xmlns.com/foaf/0.1/">
<div about="#me" property="foaf:name" content="Bob">My interests are:
  <ol about="#me" typeof="foaf:Person">
  <li rel="foaf:interests">
    <a href="facebook" rel="tag" property="dc:title">facebook</a>
  </li>
  <li rel="foaf:interests">
    <a href="opengraph" rel="tag" property="dc:title">opengraph</a>
  </li>
  <li rel="foaf:interests">
    <a href="semanticweb" rel="tag" property="dc:title">semanticweb</a>
  </li>
</ol>

<p>Please follow me on Twitter: 
  <span about="#me" rel="foaf:account">
    @<a typeof="foaf:OnlineAccount" property="foaf:accountName"
        href="http://twitter.com/bob">bob</a>.
  </span>
</p>
</div>
</div>

1.5 Data Visualization

Richard has created a site that lists his favourite restaurants and their locations. He doesn't want to generate code specific to the various mapping services on the Web. Instead of creating specific markup for Yahoo Maps, Google Maps, MapQuest, and Google Earth, he instead adds address information via RDFa to each restaurant entry. This enables him to build on top of the structured data in the page as well as letting visitors to the site use the same data to create innovative new applications based on the address information in the page.

<div prefix="vc: http://www.w3.org/2006/vcard/ns# foaf: http://xmlns.com/foaf/0.1/" typeof="vc:VCard">
  <span property="vc:fn">Wong Kei</span>
  <span property="vc:street-address">41-43 Wardour Street</span>
  <span property="vc:locality">London</span>, <span property="vc:country-name">United Kingdom</span>
  <span property="vc:tel">020 74373071</span>
</div>

1.6 Linked Data Mashups

Marie is a chemist, researching the effects of ethanol on the spatial orientation of animals. She writes about her research on her blog and often makes references to chemical compounds. She would like any reference to these compounds to automatically have a picture of the compound's structure shown as a tooltip, and a link to the compound's entry on the National Center for Biotechnology Information [NCBI] Web site. Similarly, she would like visitors to be able to visualize the chemical compound in the page using a new HTML5 canvas widget she has found on the web that combines data from different chemistry websites.

<div prefix="dbp: http://dbpedia.org/ontology/ fb: http://rdf.freebase.com/rdf/" >
   My latest study about the effects of 
   <span about="[fb:en.ethanol]" 
      typeof="[dbp:ChemicalCompound]" 
      property="[fb:chemistry.chemical_compound.pubchem_id]" 
      content="702">ethanol</span> on mice's spatial orientation show that ...
</div>

2. Design Considerations

This section is non-normative.

RDFa 1.0 [RDFA-SYNTAX] has seen substantial growth since it became an official W3C Recommendation in October 2008. It has seen wide adoption among search companies, e-commerce sites, governments, and content management systems. There are numerous interoperable implementations, and adoption is expected to keep growing with the latest releases of RDFa 1.1 [RDFA-CORE], XHTML+RDFa 1.1 [XHTML-RDFA], and HTML+RDFa 1.1 [HTML-RDFA].

In an effort to ensure that Web applications are able to fully utilize RDFa, this specification outlines an API and a set of interfaces that extract RDF Triples from Web documents or other document formats that utilize RDFa. The RDFa API is designed with maximum code expressiveness and ease of use in mind. Furthermore, a deep understanding of RDF and RDFa is not necessary in order to extract and utilize the structured data embedded in RDFa documents.

Since there are many Web browsers and programming environments for the Web, the rapid adoption of RDFa requires an interoperable API that Web document designers can count on being available in all Web browsers. The RDFa API provides a uniform and developer-friendly interface for extracting RDFa from Web documents.

Since most browser-based applications and browser extensions that utilize Web documents are written in ECMAScript [ECMA-262], the implementation of the RDFa API is primarily concerned with ensuring that concepts covered in this document are easily utilized in ECMAScript.

While ECMAScript is of primary concern, the RDFa API specification is language independent and is designed such that DOM tool developers may implement it in many of the other common Web programming languages such as Python, Java, Perl, and Ruby. Objects that are defined by the RDFa API are designed to work as seamlessly as possible with language-native types, operators, and program flow constructs.

2.1 Goals

The design goals that drove the creation of the APIs that are described in this document are:

Ease of Use and Expressiveness
While this should be a design goal for all APIs, special care is taken to ensure that developers can accomplish common tasks using a minimal amount of code. While execution speed is always an important factor to consider, it is secondary to minimizing the amount of work a developer must perform to extract and use data contained in the document.
Modularity and Pluggability
Each part of the API is modular and pluggable, meaning that data storage, parsers and query modules may be replaced by other developer-defined mechanisms as long as the interfaces listed in this document are supported.
DOM Orthogonality
Interfaces defined on the RDF graph carried by the document should be the same as the interfaces to any other RDF graph, such as one loaded from other sources or one representing HTTP transactions that have taken place.
Support for Non-RDFa Parsers
Other languages that store data in the DOM are considered as first-class languages when it comes to extraction and support via the RDFa API. Mechanisms like DC-HTML, eRDF, Microformats, and Microdata can be used to express structured data in DOM-based languages. It is a goal of this API to ensure that information expressed in these languages can be extracted, using a developer-defined parser, and stored in the Data Store.
Low-level Access and the Freedom to Tinker
Providing an abstract API, while simpler, may not produce the kind of innovation that the semantics community would like to see. It is important to give developers access to the entire RDFa API stack in order to ensure that each layer of the stack can be improved independently of a standards body.
A Modular RDF API
The RDFa Working Group understood that, in order to provide an RDFa API, an RDF API would have to be created. This specification details a low-level RDF API as well as a higher-level RDFa API. While it is not certain how many developers would require a stand-alone RDF API, the hope is that those that require one will be able to re-use the RDF API defined in this specification. This specification is an attempt to unify RDF APIs much like POSIX unified the mechanisms used to access operating system services across all vendors.
Native Language Constructs
Data is exposed and processed in a way that is natural for ECMAScript and many other Web programming languages like Python, Ruby and even C++. For example, lists and Date objects may be passed directly to the API and will be interpreted as their RDF model equivalents. By ensuring that programming language constructs are considered in the design of the API, we ensure that the API won't fight the language and thus, the developer.
Macros and Templates
Some of the mechanisms that underpin RDF are difficult to use in everyday programming. For example, having to type out an entire URI is not only laborious for a programmer, but also error prone and overly verbose. RDFa Core [RDFA-CORE] introduces the concept of a Compact URI Expression, or CURIE. This API builds on the CURIE concept and allows IRIs to be expressed as CURIEs. The API should also provide short-cuts that reduce the amount of code that has to be repeated. Property Group Templates are one example of reducing repetitive code, since a template can be stored in a single variable and re-used for many objects.
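
To make the CURIE short-cut concrete, the following sketch expands a CURIE against a prefix table. The `prefixes` object and the `expandCurie` helper are illustrative assumptions, not interfaces defined by this specification; the API keeps its own mappings.

```javascript
// Illustrative prefix table.
var prefixes = {
  foaf: "http://xmlns.com/foaf/0.1/",
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
};

// Expand "prefix:reference" to a full IRI. The input is returned unchanged
// when it has no colon or when its prefix is not in the table (for example,
// when it is already a full IRI such as "http://...").
function expandCurie(value, prefixes) {
  var colon = value.indexOf(":");
  if (colon < 0) {
    return value;
  }
  var prefix = value.substring(0, colon);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
    return value;
  }
  return prefixes[prefix] + value.substring(colon + 1);
}

console.log(expandCurie("foaf:Person", prefixes)); // "http://xmlns.com/foaf/0.1/Person"
```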

2.2 Concept Diagram

The following diagram describes the relationship between all concepts discussed in this document.

Figure: The RDFa API Concept Stack (diagram of RDFa API concepts).

The lowest layer of the API defines the basic structures that are used to store information: Symbol, PlainLiteral, TypedLiteral, BlankNode, and finally the RDFStatement and the RDFGraph.

The next layer of the API, the DataStore, supports the storage of information.

The DataParser and DataQuery interfaces directly interact with the DataStore. The DataParser is used to extract information from the Document and store the result in a DataStore. The DataQuery interface is used to extract different views of data from the DataStore. The PropertyGroup is an abstract, easily manipulable view of this information for developers. While PropertyGroup objects can address most use cases, a developer also has access to the information in the DataStore at a basic level. Access to the raw data allows developers to create new views and ways of directly manipulating the data in a DataStore.

The highest layer of the API is the Document object and is what most web developers will use to retrieve PropertyGroups created from data stored in the document.

3. Developing with the API

This section is non-normative.

3.1 Basic Concepts

This section is non-normative.

This API provides a number of interfaces to enable:

  • parsing of DOM objects that contain embedded metadata;
  • extraction of the embedded metadata into a data store;
  • querying a data store in order to retrieve PropertyGroups from the data store.

3.1.1 The Basic API

The RDFa API has a number of advanced methods that can be used to access the DataStore, DataParser and DataQuery mechanisms. Most web developers will not need the advanced methods; the following interfaces cover most day-to-day development activities.

Retrieving PropertyGroups

document.getItemsByType(type)
Retrieves a list of PropertyGroups by their type, such as foaf:Person.
document.getItemBySubject(subject)
Retrieves a single PropertyGroup by its subject, such as http://example.org/people#bob.
document.getItemsByProperty(property, optional value)
Retrieves a list of PropertyGroups by a particular property and optional value that the PropertyGroup contains.

Retrieving DOM Elements

document.getElementsByType(type)
Retrieves a list of DOM Nodes by the type of data that they express, such as foaf:Person.
document.getElementsBySubject(subject)
Retrieves a list of DOM Nodes by the subject associated with the data that they express, such as http://example.org/people#bob.
document.getElementsByProperty(property, optional value)
Retrieves a list of DOM Nodes by a particular property and optional value that each expresses.

IRI Mapping

document.data.context.setPrefix(prefix, iri)
Gets and sets short-hand IRI mappings that are used by the API, such as expanding foaf:Person to http://xmlns.com/foaf/0.1/Person.

Advanced Processing

document.data.query.select(query, template)
Retrieves an array of PropertyGroups based on a set of selection criteria.
document.data.store.filter(pattern)
Filters a given DataStore by matching a given triple pattern.
document.data.parser.iterate(pattern)
Iterates through a DOM, using a low-memory, stream-based approach, matching on the given triple pattern.

3.1.2 Using the Basic API

The following section uses the markup shown below to demonstrate how to extract and use PropertyGroups using the RDFa API. The following markup is assumed to be served from a document located at http://example.org/people.

<div prefix="foaf: http://xmlns.com/foaf/0.1/" about="#albert" typeof="foaf:Person">
  <span property="foaf:name">Albert Einstein</span>
</div>

Working with Property Groups

Retrieving Terms by Type

You can retrieve the set (associative array) of people described above by doing the following:

var people = document.data.findAllMembers("http://xmlns.com/foaf/0.1/Person");

or you can specify a short-cut to use when specifying the IRI:

var foaf = document.ns("http://xmlns.com/foaf/0.1/");
var rdf = document.ns("http://www.w3.org/1999/02/22-rdf-syntax-ns#");
var people = document.data.each(undefined, rdf('type'), foaf('Person'));

Retrieving triples by Subject

You can also get a PropertyGroup by its subject:

var kb = document.data;
var aboutAlbert = kb.statementsMatching(kb.sym("http://example.org/people#albert"));

You can also specify a relative IRI and the document IRI will be automatically pre-pended:

var albert = document.data.sym("#albert");

Retrieving Items by Property

You can get a list of PropertyGroups by their properties:

var peopleNamedAlbertEinstein = document.data.each(undefined, foaf('name'), "Albert Einstein");

You can get a single item by its properties:

var personNamedAlbertEinstein = document.data.any(undefined, foaf('name'), "Albert Einstein");

The wildcard (undefined) can be used in any one of the subject, predicate, or object positions, so each() and any() are very flexible functions. The function each() always returns a list, and any() always returns a single item.
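
The matching behaviour described above can be sketched over a plain array. This is a stand-alone illustration, not the tabulator implementation: the `statements` array and the free functions `each` and `any` are assumptions made for this example, and the sketch expects exactly one wildcard per call.

```javascript
var FOAF_NAME = "http://xmlns.com/foaf/0.1/name";
var ALBERT = "http://example.org/people#albert";
var MARIE = "http://example.org/people#marie";

// Hypothetical in-memory data: each statement is [subject, predicate, object].
var statements = [
  [ALBERT, FOAF_NAME, "Albert Einstein"],
  [MARIE, FOAF_NAME, "Marie Curie"]
];

// Return, for every statement matching the pattern, the term found in the
// position where the wildcard (undefined) was given.
function each(s, p, o) {
  var pattern = [s, p, o];
  var wildcard = pattern.indexOf(undefined);
  return statements.filter(function (st) {
    return pattern.every(function (term, i) {
      return term === undefined || term === st[i];
    });
  }).map(function (st) {
    return st[wildcard];
  });
}

// Like each(), but return only the first match (or undefined if none).
function any(s, p, o) {
  var results = each(s, p, o);
  return results.length > 0 ? results[0] : undefined;
}

console.log(each(undefined, FOAF_NAME, "Albert Einstein")); // [ 'http://example.org/people#albert' ]
console.log(any(MARIE, FOAF_NAME, undefined));              // "Marie Curie"
```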

Using PropertyGroups

Note: Property groups are not used; it is not clear they are needed.

You can retrieve property values from PropertyGroups like so:

var albert = document.data.sym("#albert");
var name = document.data.any(albert, foaf("name"));

You can also specify values that you would like to map to PropertyGroup attributes:

RDFa API:

var albert = document.getItemBySubject("#albert", {"foaf:name": "name"});
var name = albert.name;

RDF API:

var albert = document.data.sym("#albert");
var fn = foaf('name');
var name = document.data.any(albert, fn);

Managing Elements with Data

Retrieving Elements Containing Data by Type

You can retrieve the DOM Element that is described above by doing the following:

var elements = document.getElementsByType("http://xmlns.com/foaf/0.1/Person");

or you can specify a short-cut to use when specifying the IRI:

Old RDFa API:

// Context considered harmful where it can be avoided
document.data.context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");
var elements = document.getElementsByType("foaf:Person");

RDF API:

var foaf = $rdf.Namespace("http://xmlns.com/foaf/0.1/");
var elements = document.getElementsByType(foaf("Person"));

Retrieving Elements Containing Data by Subject

You can also get a list of Elements by the subject of data:

var elements = document.getElementsBySubject("http://example.org/people#albert");

You can also specify a relative IRI and the document IRI will be automatically pre-pended:

var elements = document.getElementsBySubject("#albert");

Retrieving Elements by Property

You can get a list of Elements by the properties and values that they declare:

var elements = document.data.each(undefined, foaf("name"), "Albert Einstein");

Modifying DOM Elements

Note: The mechanism to access the DOM Element associated with in-document data is being redesigned for the next draft version of the RDFa API.

You can modify elements that are returned just like any other DOM Node, for example:

var elements = document.getElementsByProperty("foaf:name", "Bob");
for(var i = 0; i < elements.length; i++)
{
   var e = elements[i];
   e.style.setProperty('color', '#00cc00', null);
}

The code above would change the color of all the areas of the page where the item's name is "Bob" to green.

3.2 Advanced Concepts

This section covers a number of concepts that go beyond basic everyday usage of the RDFa API. The interfaces to the API allow you to work with data at an abstract level, or query structured data and override key parts of the software stack in order to extend the functionality that the API provides.

3.2.1 Advanced Queries

The features available via a Query object will depend on the implementation. However, all conforming processors will provide the basic element selection mechanisms described here.

Querying by Type

Perhaps the most basic task is to select PropertyGroups of a particular type. The type of a PropertyGroup is set in RDFa via the special attribute @typeof. For example, the following markup expresses a PropertyGroup of type Person in the Friend-of-a-Friend vocabulary:

<div typeof="foaf:Person">
  <span property="foaf:name">Albert Einstein</span>
</div>

To locate all PropertyGroups that are people, we could use the document.getItemsByType() method:

document.getItemsByType("http://xmlns.com/foaf/0.1/Person");

or we could do the same using the DataQuery interface:

var query = document.data.createQuery("rdfa", store);
var people = query.select( { "rdf:type": "foaf:Person" } );

While the query interface is more verbose for simple queries, it becomes necessary for more complex queries as demonstrated later in this section. Note that the Query object has access to the mappings provided via the document.data.context object, so they can also be used in queries. It is also possible to write the same query in a way that is independent of any prefix-mappings:

var people = query.select( { "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person" } );

Querying by Property Value

The previous query selected all PropertyGroups of a certain type, but it did so by indicating that the property rdf:type should have a specific value. Queries can also specify other properties. For example, given the following mark-up:

<div typeof="foaf:Person">
  <span property="foaf:name">Albert Einstein</span> -
  <span property="foaf:myersBriggs">INTP</span>
  <a rel="foaf:workInfoHomepage" href="http://en.wikipedia.org/wiki/Albert_Einstein">More...</a>
</div>
<div typeof="foaf:Person">
  <span property="foaf:name">Mother Teresa</span> -
  <span property="foaf:myersBriggs">ISFJ</span>
  <a rel="foaf:workInfoHomepage" href="http://en.wikipedia.org/wiki/Mother_Teresa">More...</a>
</div>
<div typeof="foaf:Person">
  <span property="foaf:name">Marie Curie</span> - 
  <span property="foaf:myersBriggs">INTP</span>
  <a rel="foaf:workInfoHomepage" href="http://en.wikipedia.org/wiki/Marie_Curie">More...</a>
</div>

The following query demonstrates how a developer would select and use all PropertyGroups of type Person that also have a Myers-Briggs personality type of "INTP" (aka: The Architect):

var architects = query.select( {
  "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person",
  "http://xmlns.com/foaf/0.1/myersBriggs": "INTP"
} );

var name = architects[0].get("http://xmlns.com/foaf/0.1/name");

As before, prefix-mappings can also be used:

var architects = query.select( {"rdf:type": "foaf:Person", "foaf:myersBriggs": "INTP"} );

var name = architects[0].get("foaf:name");

Directives to generate the PropertyGroup object based on a template specified by the developer can also be used. In this case, all of the "INTP" personality types are gleaned from the page and presented as PropertyGroups containing each person's name and work information homepage:

var architects = query.select( {"rdf:type": "foaf:Person", "foaf:myersBriggs": "INTP"},
                               {"foaf:name": "name", "foaf:workInfoHomepage": "webpage"} );

var name = architects[0].name;
var infoWebpage = architects[0].webpage;
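
The combination of a selection pattern and a template can be sketched without a browser. In the following illustration the `groups` array stands in for data already extracted from the markup above, and the stand-alone `select` function is an assumption made for this example rather than the DataQuery implementation; it only shows how the pattern narrows the results and how the template renames properties.

```javascript
// Hypothetical extracted data: one plain object per person, keyed by property.
var groups = [
  { "rdf:type": "foaf:Person", "foaf:name": "Albert Einstein", "foaf:myersBriggs": "INTP",
    "foaf:workInfoHomepage": "http://en.wikipedia.org/wiki/Albert_Einstein" },
  { "rdf:type": "foaf:Person", "foaf:name": "Mother Teresa", "foaf:myersBriggs": "ISFJ",
    "foaf:workInfoHomepage": "http://en.wikipedia.org/wiki/Mother_Teresa" },
  { "rdf:type": "foaf:Person", "foaf:name": "Marie Curie", "foaf:myersBriggs": "INTP",
    "foaf:workInfoHomepage": "http://en.wikipedia.org/wiki/Marie_Curie" }
];

// Keep the groups whose properties equal every value in the pattern, then,
// if a template is given, copy the listed properties under their new names.
function select(groups, pattern, template) {
  return groups.filter(function (group) {
    return Object.keys(pattern).every(function (property) {
      return group[property] === pattern[property];
    });
  }).map(function (group) {
    if (!template) {
      return group;
    }
    var view = {};
    Object.keys(template).forEach(function (property) {
      view[template[property]] = group[property];
    });
    return view;
  });
}

var architects = select(groups,
                        {"rdf:type": "foaf:Person", "foaf:myersBriggs": "INTP"},
                        {"foaf:name": "name", "foaf:workInfoHomepage": "webpage"});
console.log(architects[0].name);    // "Albert Einstein"
console.log(architects[1].webpage); // "http://en.wikipedia.org/wiki/Marie_Curie"
```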

3.2.2 Direct Access to the Data Store

The RDFa API not only allows a developer to query the DataStore via the DataQuery mechanism, it also allows a developer to get to the underlying data structures that represent the structured data at the "atomic level".

The filter interface is a part of the DataStore and enables a developer to filter a series of triples out of the DataStore. For example, to extract all triples about a particular subject, the developer could do the following:

var names = document.data.store.filter( {"subject": "http://example.org/people#benjamin"} );

Developers could also combine subject-property filters by doing the following:

var names = document.data.store.filter( {"subject": "http://example.org/people#benjamin", "property": "foaf:nick"} );

The query above would extract all known nicknames for the subject as triples.

A developer may also retrieve all triples in the DataStore by specifying no filter parameters:

var allTriples = document.data.store.filter();
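
The filtering behaviour shown in these calls can be sketched over a plain array of triples. The `filterTriples` function and the sample data are assumptions made for illustration and are not the DataStore implementation; the sketch follows the examples above in using the keys "subject" and "property", and adds an "object" key for the value.

```javascript
// Hypothetical triples, as they might be held by a data store.
var triples = [
  { subject: "http://example.org/people#benjamin", property: "foaf:nick", object: "ben" },
  { subject: "http://example.org/people#benjamin", property: "foaf:name", object: "Benjamin" },
  { subject: "http://example.org/people#mark", property: "foaf:nick", object: "marky" }
];

// Return the triples that match every key given in the pattern.
// With no pattern (or an empty one) every triple matches.
function filterTriples(triples, pattern) {
  pattern = pattern || {};
  return triples.filter(function (triple) {
    return Object.keys(pattern).every(function (key) {
      return triple[key] === pattern[key];
    });
  });
}

var nicks = filterTriples(triples, {"subject": "http://example.org/people#benjamin", "property": "foaf:nick"});
console.log(nicks.length);                  // 1
console.log(filterTriples(triples).length); // 3
```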

3.2.3 Direct Access to the Parser Stream

The .iterate interface will almost certainly see large changes in the next version of the RDFa API specification. Implementers are advised not to implement the interface yet and to wait for the next revision of this specification.

The iterate interface can be used to process triples in the document as they are discovered. This interface is most useful for processing large amounts of data in low-memory environments.

var iter = document.data.parser.iterate( {"subject": "http://example.org/people#mark"} );
for(var triple = iter.next(); triple != null; triple = iter.next())
{
   // process each triple that is associated with http://example.org/people#mark
}

3.2.4 Overriding the Defaults

The sub-modules in the RDFa API are meant to be overridden by developers in order to extend basic functionality as well as innovate new interfaces for the RDFa API.

Overriding the DataStore

The API is designed such that a developer may override the default data store provided by the browser by providing their own. This is useful, for instance, if the developer wanted to create a permanent site-specific data store using Local Storage features in the browser, or to allow provenance information to be stored with each triple.

var mydatastore = new MyCustomDataStore();

document.data.store = mydatastore;
Overriding the Data Parser

Developers may create and specify different parsers for extracting structured data from the document. Such parsers may build upon RDFa or parse other languages not related to RDFa. For example, Microformats-specific parsers could be created to extract structured hCard data and store it in an object that is compatible with the DataStore interface.

var hcardParser = new MyHCardParser();

document.data.parser = hcardParser;
Overriding the Data Query

The query mechanism for the API can be overridden to provide different or more powerful query mechanisms. For example, by replacing the standard query mechanism, developers could provide a full SPARQL query mechanism:

var sparqlQuery = new MySparqlEngine();
document.data.query = sparqlQuery;

var books = document.data.query.select("SELECT ?book ?title WHERE { ?book <http://purl.org/dc/elements/1.1/title> ?title . }",
                                       {"?book": "subject", "?title": "title"} );

4. The Interfaces Specification

The following section contains all of the interfaces that developers are expected to implement as well as implementation guidance.

4.1 Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].

Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.

User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.

Implementations that use ECMAScript or Java to implement the APIs defined in this specification must implement them in a manner consistent with the respective ECMAScript or Java Bindings defined in the Web IDL specification, as this specification uses that specification's terminology. [WEBIDL]

Implementations written in any other language, for which the Web IDL specification defines no bindings, should attempt to map the API as closely as possible to the implementation language's native mechanisms and datatypes. Developers are encouraged to work with other developers who are providing the RDFa API in the same language to ensure that RDFa Processors are modular and easily exchangeable.

4.2 The RDF Interfaces

RDFa is a syntax for expressing the RDF Data Model [RDF-CONCEPTS] in Web documents. The RDFa API is designed to extract the RDF Data Model from Web documents. The following RDF Resources are utilized in this specification: PlainLiterals, TypedLiterals, IRI References (as defined in [IRI]) and BlankNodes. The interfaces for each of these RDF Resources are detailed in this section. The types as exposed by the RDFa API conform to the same data and comparison restrictions as specified in the RDF concepts specification [RDF-CONCEPTS] and the [IRI] specification.

Each RDF interface provides access to both the extracted RDFa value and the DOM Node from which the value was extracted. This allows developers to extract and use RDF triples from a host language and also manipulate the DOM based on the associated DOM Node. For example, an agent could highlight all strings that are marked up as foaf:name properties in a Web document.

The basic RDF Resource types used in the RDFa API are:

  • IRI Reference — A reference to a resource as defined in Internationalized Resource Identifiers [IRI]. Example: http://www.w3.org/2001/XMLSchema#string.
  • PlainLiteral — A string value with optional information about the language of its content. Example: "Harry Potter and the Half-Blood Prince"@en is a plain literal expressed in the English language.
  • Typed Literal — A typed string value with information about the datatype of its content. Example: "7"^^xsd:integer is a typed literal with a value of type xsd:integer.
  • Blank Node — A blank node is a reference to a resource that does not have a corresponding IRI. Examples of BlankNodes include _:me and _:42.
  • RDF Triple — Triples are the basic data structure utilized by RDF to express logical statements. An RDF triple is a 3-item ordered set consisting of a subject, a property, and an object. Example: <http://example.org/hp> rdfs:label "Harry Potter" .

[Figure: class diagram of the basic RDF classes, attributes, methods and linkages described in the sections that follow.]

An RDFa API implementer must provide the basic types as described in this specification. An implementer may provide additional types and/or a deeper type or class hierarchy that includes these basic types.

4.2.1 RDFNode and RDFResource

An RDFNode is anything that can be an object of an RDFTriple. It is also the base class of PlainLiteral and TypedLiteral.

[NoInterfaceObject]
interface RDFNode {
    readonly attribute stringifier DOMString value;
};
Attributes
value of type stringifier DOMString, readonly
The value of an RDFNode is either a literal's lexical value or a lexical identifier of an IRI or BlankNode. Note that the value attribute is marked with a stringifier decorator. In [WEBIDL], this means that the value is used as the return value if toString() is called on an object that inherits from this type. The stringifier can be overridden by the re-specification of the toString() method on a subclass of this type.
No exceptions.

An RDFResource is anything that can be a subject of an RDFTriple.

[NoInterfaceObject]
interface RDFResource : RDFNode {
    readonly attribute stringifier DOMString value;
};
Attributes
value of type stringifier DOMString, readonly
The value of an RDFResource is either a string that represents an IRI or an identifier of a BlankNode.
No exceptions.
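
The stringifier behaviour described above maps naturally onto toString() in ECMAScript. The following is a minimal sketch, assuming simple constructor functions named after the two interfaces; a conforming implementation is free to structure this differently.

```javascript
// Hypothetical constructors mirroring the RDFNode and RDFResource interfaces.
function RDFNode(value)
{
   this.value = value;
}
// The stringifier: toString() returns the value attribute.
RDFNode.prototype.toString = function() { return this.value; };

function RDFResource(value)
{
   RDFNode.call(this, value);
}
// RDFResource inherits from RDFNode, including the stringifier.
RDFResource.prototype = Object.create(RDFNode.prototype);
RDFResource.prototype.constructor = RDFResource;

var resource = new RDFResource("http://example.org/people#mark");
var label = "Subject: " + resource;
// label is "Subject: http://example.org/people#mark"
```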

4.2.2 IRI References

An IRI Reference in the RDFa API points to a resource and is further defined in [IRI].

[NoInterfaceObject]
interface IRI : RDFResource {
};

4.2.3 RDF Literals

An RDF Literal is an RDF Resource that represents lexical values in RDFa data. The two RDF Literals provided via the RDFa API are PlainLiterals and TypedLiterals. For a given RDF Literal, either language or type information can be provided. If the type is set, the RDF Literal is a Typed Literal. If a type is not set, it is a PlainLiteral.

PlainLiteral
RDF Literals that may contain language information about the given text. The language is a text string as specified in [BCP47] (e.g., 'en', 'fr', 'de').
TypedLiteral
RDF Literals that contain type information about the given text. The type is always specified in the form of an IRI Reference (e.g., http://www.w3.org/2001/XMLSchema#dateTime).
PlainLiterals

PlainLiterals have a string value and may specify a language.

[NoInterfaceObject]
interface PlainLiteral : RDFNode {
    readonly attribute stringifier DOMString value;
    readonly attribute DOMString             language;
};
Attributes
language of type DOMString, readonly
A language string as defined in [BCP47], normalized to lowercase.
No exceptions.
value of type stringifier DOMString, readonly
The lexical value of the literal encoded in the character encoding of the source document. The value is extracted from an RDFa document using the algorithm defined in the RDFa Core Specification [RDFA-CORE], Section 7.5: Sequence, Step 11.
No exceptions.
Example

This section is non-normative.

The following example demonstrates a few common use cases of the PlainLiteral type.

// Create a new PlainLiteral using the DataStore's createPlainLiteral interface
var literal = document.data.store.createPlainLiteral("Harry Potter and the Half-Blood Prince", "en");

// The API supports conversion of PlainLiterals to native language string types
var str = literal.toString();
// At this point, str is equivalent to "Harry Potter and the Half-Blood Prince"

// The API supports attribute-based access of PlainLiteral values, the native type
// of a PlainLiteral value is a DOMString.
var val = literal.value;
// At this point, val is equivalent to "Harry Potter and the Half-Blood Prince"

// The API supports attribute-based access of PlainLiteral language values, the
// native type of a PlainLiteral language value is a DOMString
var lang = literal.language;
// At this point, lang is equivalent to "en"
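
The lowercase normalization of the language attribute can be sketched as follows. This is an illustration only: the PlainLiteral constructor is a hypothetical stand-in for whatever createPlainLiteral returns, and representing a missing language as null is an assumption rather than something this specification states.

```javascript
// Hypothetical constructor mirroring the PlainLiteral interface.
function PlainLiteral(value, language)
{
   this.value = value;
   // The language is optional; when present it is normalized to lowercase.
   // Using null for "no language" is an assumption made for this sketch.
   this.language = (language === undefined || language === null)
      ? null : String(language).toLowerCase();
}
PlainLiteral.prototype.toString = function() { return this.value; };

var title = new PlainLiteral("Harry Potter and the Half-Blood Prince", "en-GB");
// title.language is "en-gb"
var untagged = new PlainLiteral("Harry Potter");
// untagged.language is null
```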
Typed Literals

A TypedLiteral has a string value and a datatype specified as an IRI Reference. TypedLiterals can be converted into native language datatypes of the implementing programming language by registering a Typed Literal Converter as defined later in the specification.

The datatype's IRI reference specifies the datatype of the text value, e.g., xsd:dateTime or xsd:boolean.

The RDFa API provides a method to explicitly convert TypedLiteral values to native datatypes supported by the host programming language. Developers may write their own Typed Literal Converters in order to convert an RDFLiteral into a native language type. The converters are registered by using the registerTypeConversion() method. Default TypedLiteral converters must be supported by the RDFa API implementation for the following XML Schema datatypes:

  • xsd:string
  • xsd:boolean
  • xsd:float
  • xsd:double
  • xsd:integer
  • xsd:long
  • xsd:date
  • xsd:time
  • xsd:dateTime
[NoInterfaceObject]
interface TypedLiteral : RDFNode {
    readonly attribute stringifier DOMString value;
    readonly attribute IRI                   type;
    any valueOf ();
};
Attributes
type of type IRI, readonly
A datatype identified by an IRI reference
No exceptions.
value of type stringifier DOMString, readonly
The lexical value of the literal encoded in the character encoding of the source document. The value is extracted from an RDFa document using the algorithm defined in the RDFa Core Specification [RDFA-CORE], Section 7.5: Sequence, Step 11.
No exceptions.
Methods
valueOf
Returns a native language representation of this literal. The type conversion should be performed by translating the value of the literal using the IRI reference of the datatype to the closest native datatype in the programming language.
No parameters.
No exceptions.
Return type: any
Example

This section is non-normative.

The following example demonstrates how a TypedLiteral representing a date is automatically converted to ECMAScript's native Date object.

// Create a new TypedLiteral using the DataStore's createTypedLiteral interface
var literal = document.data.store.createTypedLiteral("2010-12-24", "xsd:date");

// The value attribute stores the data in the raw format as specified via the
// createTypedLiteral interface or in the document text
var value = literal.value;
// At this point, value is equivalent to "2010-12-24"

// The toString() method will convert the TypedLiteral into a language-native 
// string value of "2010-12-24". This value may be different from the raw format
// stored in the literal.value attribute
var str = literal.toString();
// At this point, str is equivalent to "2010-12-24"

// The valueOf() method will convert the TypedLiteral into a language-native
// datatype. In ECMAScript, the return value of valueOf() will be a 
// Date object.
var date = literal.valueOf();
// At this point, date will be a Date object with a value of 
// Fri Dec 24 2010 00:00:00 GMT+0000
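
One way valueOf() could choose a native type from the datatype IRI is a lookup table of converter functions. The sketch below is hedged accordingly: the converters table and the TypedLiteral constructor are hypothetical, only two datatypes are covered, and the date parser ignores time zone suffixes and negative years that xsd:date permits.

```javascript
var XSD = "http://www.w3.org/2001/XMLSchema#";

// Hypothetical converter table keyed by datatype IRI.
var converters = {};
converters[XSD + "integer"] = function(value) { return parseInt(value, 10); };
converters[XSD + "date"] = function(value) {
   // "2010-12-24" becomes a Date at midnight UTC on that day.
   var parts = value.split("-");
   return new Date(Date.UTC(Number(parts[0]), Number(parts[1]) - 1, Number(parts[2])));
};

function TypedLiteral(value, type)
{
   this.value = value;
   this.type = type;
}
TypedLiteral.prototype.toString = function() { return this.value; };
TypedLiteral.prototype.valueOf = function() {
   var convert = converters[this.type];
   // With no converter for the datatype, fall back to the lexical value.
   return convert ? convert(this.value) : this.value;
};

var christmasEve = new TypedLiteral("2010-12-24", XSD + "date").valueOf();
// christmasEve.toISOString() is "2010-12-24T00:00:00.000Z"
var seven = new TypedLiteral("7", XSD + "integer").valueOf();
// seven is the number 7
```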

4.2.4 Blank Nodes

A BlankNode is an RDF resource that does not have a corresponding IRI reference, as defined in [RDF-CONCEPTS]. The value of a BlankNode is not required to be the same for identical documents that are parsed at different times. The purpose of a BlankNode is to ensure that RDF Resources in the same document can be compared for equivalence by ID.

The reasoning behind how we stringify BlankNodes should be explained in more detail.

BlankNodes are stringified by prepending "_:" to BlankNode.value.

[NoInterfaceObject]
interface BlankNode : RDFResource {
    readonly attribute stringifier DOMString value;
};
Attributes
value of type stringifier DOMString, readonly
The temporary identifier of the BlankNode. The value must not be relied upon in any way between two separate RDFa processing runs of the same document.
No exceptions.

Developers and authors must not assume that the value of a BlankNode will remain the same between two processing runs. BlankNode values are only valid for the most recent processing run on the document. BlankNode values will often be generated differently by different RDFa Processors.

Example

This section is non-normative.

The following example demonstrates the use of a BlankNode in an ECMAScript implementation that uses incrementing numbers for the identifier.

// Create two new BlankNodes - A and B
var bna = document.data.store.createBlankNode();
var bnb = document.data.store.createBlankNode();

// BlankNode stringification changes from implementation to implementation
// and between structured data extraction runs. Developers must not depend on 
// the stringified names of BlankNodes to be the same between implementations 
// or two different processing runs of a structured data processor.
var stra = bna.toString();
// The stringified representation of the BlankNode at this point could be
// "_:1", "_:a", "_:blank_node_alpha", or any other unique identifier starting
// with "_:"

// Extract the unique values associated with BlankNode A and BlankNode B
var bnavalue = bna.value;
var bnbvalue = bnb.value;
// The two values above, bnavalue and bnbvalue, must not ever be allowed to
// be the same. The values can be strings, integers or any other value that
// easily establishes a unique identifier for the BlankNode.
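
The incrementing-number scheme mentioned above could be implemented along these lines. This is a sketch under assumptions: createBlankNodeFactory is a hypothetical helper, and a fresh factory is assumed to be created for each processing run, which is why identifiers are not stable between runs.

```javascript
// Hypothetical factory; the counter is private to one processing run.
function createBlankNodeFactory()
{
   var counter = 0;
   return function() {
      counter += 1;
      var value = String(counter);
      return {
         value: value,
         // Stringification prepends "_:" to the value.
         toString: function() { return "_:" + value; }
      };
   };
}

var createBlankNode = createBlankNodeFactory();
var first = createBlankNode();
var second = createBlankNode();
// first.toString() is "_:1" and second.toString() is "_:2"
// first.value and second.value differ, as required within one run
```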

4.2.5 RDF Triples

The RDFTriple interface represents an RDF triple as specified in [RDF-CONCEPTS]. RDFTriple can be used by referring to properties, such as subject, property, and object. RDFTriple can also be used by referring to pre-defined indexes. The stringification of RDFTriple results in an N-Triples-based representation as defined in [RDF-TESTCASES].

[NoInterfaceObject, Null=Null]
interface RDFTriple {
    const unsigned long size = 3;
    readonly attribute RDFResource subject;
    readonly attribute IRI         property;
    readonly attribute RDFNode     object;
    getter RDFNode        get (in unsigned long index);
    stringifier DOMString toString ();
};
Attributes
object of type RDFNode, readonly
The object associated with the RDFTriple.
No exceptions.
property of type IRI, readonly
The property associated with the RDFTriple.
No exceptions.
subject of type RDFResource, readonly
The subject associated with the RDFTriple.
No exceptions.
Methods
get
An index method to access subject, property, and object with indices from zero to two, respectively.
Parameter | Type | Nullable | Optional | Description
index | unsigned long | no | no | (none)
No exceptions.
Return type: getter RDFNode
toString
Converts this triple into a string in N-Triples notation.
No parameters.
No exceptions.
Constants
size of type unsigned long
A constant used to define an RDFTriple as an array of length three.
Example

This section is non-normative.

The following examples demonstrate the basic usage of an RDFTriple object. The creation of the triple uses the foaf:name CURIE, which is transformed into an IRI. The example assumes that a mapping has already been created for foaf. For more information on creating RDFa API mappings, see the section on IRI Mappings.

// Create a new RDFTriple with a PlainLiteral as the object
var triple = document.data.store.createTriple("http://www.example.com#manu", "foaf:name",
   document.data.store.createPlainLiteral("Manu Sporny"));

// Adding the RDFTriple's subject to a string will automatically call the
// toString() method for the RDFResource, which will serialize the IRI value
// to a language-native string
var str = "Triple subject: " + triple.subject;
// At this point, str will be "Triple subject: http://www.example.com#manu"

// You can also convert the entire triple to N-Triples format by making the
// language implementation call the underlying toString() interface for RDFTriple.
var tstr = "N-Triples: " + triple;
// At this point, tstr will be 'N-Triples: <http://www.example.com#manu> <http://xmlns.com/foaf/0.1/name> "Manu Sporny" .'
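
The indexed getter and the N-Triples stringification can be sketched together. The code below is illustrative only: terms are modelled as hypothetical plain objects with a kind field, and the escaping covers just backslashes, double quotes and newlines rather than the full set of escapes that N-Triples defines.

```javascript
// Serialize one term. Terms are hypothetical plain objects with a "kind" field.
function termToNTriples(term)
{
   if(term.kind === "iri") { return "<" + term.value + ">"; }
   if(term.kind === "bnode") { return "_:" + term.value; }
   // Literal: escape backslash, double quote and newline (a partial escape set).
   var escaped = term.value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
   var text = '"' + escaped + '"';
   if(term.type) { return text + "^^<" + term.type + ">"; }
   if(term.language) { return text + "@" + term.language; }
   return text;
}

function RDFTriple(subject, property, object)
{
   this.subject = subject;
   this.property = property;
   this.object = object;
}
RDFTriple.prototype.size = 3;
// Index 0 is the subject, 1 the property and 2 the object.
RDFTriple.prototype.get = function(index) {
   return [this.subject, this.property, this.object][index];
};
RDFTriple.prototype.toString = function() {
   return termToNTriples(this.subject) + " " + termToNTriples(this.property) +
      " " + termToNTriples(this.object) + " .";
};

var example = new RDFTriple(
   {kind: "iri", value: "http://www.example.com#manu"},
   {kind: "iri", value: "http://xmlns.com/foaf/0.1/name"},
   {kind: "literal", value: "Manu Sporny"});
var line = example.toString();
// line is '<http://www.example.com#manu> <http://xmlns.com/foaf/0.1/name> "Manu Sporny" .'
```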

4.3 The Linked Data Interfaces

A number of convenience objects and methods are provided by the RDFa API to help developers manipulate RDF Resources more easily when writing Web applications.

The basic RDF interface types described earlier in this document are utilized by the following Linked Data interfaces:

  • DataContext — A mechanism that greatly reduces the amount of code a developer must write to express IRIs and convert data.
  • DataStore — A store containing a set of RDFTriple objects.
  • DataParser — A parser that is capable of parsing DOM Nodes and placing extracted data into a DataStore.
  • DataIterator — An iterator capable of incrementally filtering triples discovered via a DataParser.
  • PropertyGroup — An associative array of all statements in a document about a single subject.
  • DataQuery — Provides the capability of using a particular language to query and extract values from a DataStore.

[Figure: class diagram of the basic Linked Data classes, attributes, methods and linkages described in the sections that follow.]

4.3.1 Data Context

Processing RDF data involves the frequent use of unwieldy IRI references and frequent type conversion. The DataContext interface is provided in order to simplify contextual operations such as shortening IRIs and converting RDF data into native language datatypes.

It is assumed that this interface is created and available before a document is parsed for RDFa data. For example, while operating within a Browser Context, it is assumed that the following lines of code are executed before a developer has access to the RDFa API methods on the document object:

document.data.context = new DataContext();
document.data.context.setPrefix("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
document.data.context.setPrefix("xsd", "http://www.w3.org/2001/XMLSchema#");

In general, when a CURIE is resolved by the RDFa API or a TypedLiteral is converted to a native language type, the current DataContext stored in the DocumentData object must be used to perform the action. This is to ensure that there is only one active DataContext in use by the RDFa API at any given time. The default DataContext stored in the DocumentData object may be changed at runtime.

All of the code that sets up the default type converters for Browser Contexts that use ECMAScript should probably be in the code snippet above.

The following interface allows IRI mappings to be easily created and used by Web Developers at run-time. It also allows for conversion of RDF data into native language datatypes.

interface DataContext {
    void setPrefix (in DOMString prefix, in DOMString iri);
    void registerTypeConversion (in DOMString iri, in TypedLiteralConverter converter);
    IRI  resolveCurie (in DOMString curie);
    any  convertType (in DOMString value, in optional DOMString inputType, in optional DOMString modifier);
};
Methods
convertType
Converts the passed literal value into a language-native type and returns the result. If the given value cannot be converted, the given value must be returned.
The return value upon conversion failure is being actively discussed in the RDFa WG. There are proposals to raise exceptions upon conversion failure, proposals to return tuples containing conversion success/failure and the converted value, as well as other mechanisms that would allow the signalling of a conversion failure from the method to calling code.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The value to convert that is associated with the TypedLiteral.
inputType | DOMString | no | yes | The input type for the TypedLiteral passed as a string. For example, xsd:string or xsd:integer.
modifier | DOMString | no | yes | The developer-supplied modifier for the conversion. The string is a free-form string that is used by the TypedLiteralConverter that is associated with the inputType.
The RDFa Working Group is still discussing whether or not having a modifier is a good idea as 90% of use cases will never use it and the remaining use cases could provide the functionality with an external switch to the conversion function.
No exceptions.
Return type: any
registerTypeConversion
Registers a type converter from a given IRI datatype to a native language datatype in the current programming language.
Parameter | Type | Nullable | Optional | Description
iri | DOMString | no | no | A string specifying the IRI datatype. The string may be a CURIE. For example: http://www.w3.org/2001/XMLSchema#integer or xsd:integer.
converter | TypedLiteralConverter | no | no | Converts the TypedLiteral's value into a native language datatype in the current programming language.
No exceptions.
Return type: void
resolveCurie
Resolves a given CURIE. Returns an IRI if the CURIE could be resolved, or Null if it could not.
Parameter | Type | Nullable | Optional | Description
curie | DOMString | no | no | The CURIE that is to be resolved into an IRI.
No exceptions.
Return type: IRI
setPrefix
Registers a mapping from a prefix to an IRI. The given IRI must be a full IRI. For example, if a developer wants to specify the foaf IRI mapping, they would call setPrefix("foaf", "http://xmlns.com/foaf/0.1/"). Calling the setPrefix() method with a prefix value that does not yet exist results in the creation of a new mapping. Calling the method with a null IRI value will remove the mapping.
Parameter | Type | Nullable | Optional | Description
prefix | DOMString | no | no | The prefix to put into the mapping as the key. (e.g., foaf)
iri | DOMString | no | no | The IRI reference to place into the mapping as the mapped value of the given prefix. (e.g., "http://xmlns.com/foaf/0.1/")
No exceptions.
Return type: void
Easy IRI Mapping

All methods that accept CURIEs as arguments in the RDFa API must use the algorithm specified in RDFa Core, Section 7.4: CURIE and URI Processing [RDFA-CORE] for TERMorCURIEorURI. The prefix and term mappings are provided by the current document.data.context instance.

The following examples demonstrate how mappings are created and used via the RDFa API.

// Create a new mapping for the Friend-of-a-Friend vocabulary
document.data.context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");

// The new mapping is automatically used when CURIEs are expanded to IRIs,
// the following statement will return all the people in the document
var people = document.getItemsByType("foaf:Person");

// The following statement will result in the exact same list of PropertyGroups
// as returned by the previous getItemsByType() call. Note that the only
// difference is that this call uses a full IRI, while the call above uses a
// CURIE
var people2 = document.getItemsByType("http://xmlns.com/foaf/0.1/Person");

In the example above, the CURIE "foaf:Person" is expanded to an IRI with the value "http://xmlns.com/foaf/0.1/Person" in the getItemsByType method.
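
The prefix expansion at work in the example above can be sketched in a few lines. This is a simplified illustration, not the TERMorCURIEorURI algorithm from [RDFA-CORE]: the DataContext constructor here is a hypothetical stand-in, it returns plain strings rather than IRI objects, and it does not handle terms, safe CURIEs or values that are already full IRIs.

```javascript
// Hypothetical stand-in for DataContext; returns strings rather than IRI objects.
function DataContext()
{
   this.prefixes = {};
}
// A null IRI removes the mapping; anything else creates or replaces it.
DataContext.prototype.setPrefix = function(prefix, iri) {
   if(iri === null) { delete this.prefixes[prefix]; }
   else { this.prefixes[prefix] = iri; }
};
// Split at the first colon and look the prefix up in the mapping table.
DataContext.prototype.resolveCurie = function(curie) {
   var colon = curie.indexOf(":");
   if(colon < 0) { return null; }
   var prefix = curie.substring(0, colon);
   if(!Object.prototype.hasOwnProperty.call(this.prefixes, prefix)) { return null; }
   return this.prefixes[prefix] + curie.substring(colon + 1);
};

var context = new DataContext();
context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");
var person = context.resolveCurie("foaf:Person");
// person is "http://xmlns.com/foaf/0.1/Person"
var unknown = context.resolveCurie("dc:title");
// unknown is null because no "dc" prefix has been registered
```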

IRI mappings for all terms in the following vocabularies must be included: rdf and xsd.

Automatic Type Conversion

TypedLiteralConverter is a callable interface that transforms the value of a TypedLiteral into a native language type in the current programming language. The type IRI of the TypedLiteral is used to determine the best mapping to the native language type.

Specifying Typed Literal Converters

A TypedLiteralConverter may be implemented as a class, or as a language-native callback in languages like ECMAScript.

[NoInterfaceObject, Callback]
interface TypedLiteralConverter {
    any convert (in DOMString value, in optional IRI inputType, in optional DOMString modifier);
};
Methods
convert
Converts the passed literal value into a language-native type and returns the result. If the given value cannot be converted, the given value must be returned.
The return value upon conversion failure is being actively discussed in the RDFa WG. There are proposals to raise exceptions upon conversion failure, proposals to return tuples containing conversion success/failure and the converted value, as well as other mechanisms that would allow the signalling of a conversion failure from the method to calling code.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The value to convert that is associated with the TypedLiteral.
inputType | IRI | no | yes | The input type for the TypedLiteral passed as an IRI. For example, http://www.w3.org/2001/XMLSchema#string or http://www.w3.org/2001/XMLSchema#integer.
modifier | DOMString | no | yes | A developer-specified modifier used during the conversion. The string is a free-form string that is used by the developer-specified convert() method.
No exceptions.
Return type: any

Examples

The following example demonstrates how a developer could register and use a TypedLiteralConverter.

// Register a new type converter for the "xsd:boolean" type. The lexical
// values "true" and "1" map to true; every other value maps to false.
document.data.context.registerTypeConversion("xsd:boolean", function(value) {return value === "true" || value === "1";});

// Create a new TypedLiteral of type "xsd:boolean" with a string value of "1"
var literal = document.data.store.createTypedLiteral("1", "xsd:boolean");

// Convert the literal to a string
var lstr = literal.toString();
var lvalue = literal.value;
// At this point, lstr and lvalue will both be "1"

// Get the language-native value of the literal
var lnvalue = literal.valueOf();
// At this point, the language-native value of the lnvalue variable will 
// be a Boolean type whose value is 'true'.

The following example demonstrates how to create and pass a TypedLiteralConverter function in ECMAScript:

var converter = function (value) { return new String(value); };
document.data.context.registerTypeConversion("xsd:string", converter);

A TypedLiteralConverter can also inspect the input type and the modifier, so that one converter can be used for multiple types:

var converter = function(value, inputType, modifier) 
{
   if(inputType == "http://www.w3.org/2001/XMLSchema#integer")
   {
      // if the input type is xsd:integer, convert to an ECMAScript integer
      return parseInt(value, 10);
   }
   else if(modifier == "caps")
   {
      // if the modifier is "caps" convert to a string and uppercase the 
      // return value
      return new String(value).toUpperCase();
   }
   else
   {
      // in all other cases, convert to a string
      return new String(value);
   }
};

// register the converter
document.data.context.registerTypeConversion("xsd:string", converter);
document.data.context.registerTypeConversion("xsd:integer", converter);

// Use the developer-defined TypedLiteralConverter to resolve the value to
// a language-native type. In this example, airport codes are commonly
// upper-cased values, so specify that the conversion should be capitalized.
var airportCode = document.data.context.convertType("wac", "xsd:string", "caps");
// At this point, the value of airportCode should be "WAC"
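
The rule that a value which cannot be converted must be returned unchanged can be sketched as a small registry. The following is an assumption-laden illustration: the free-standing registerTypeConversion and convertType functions stand in for the DataContext methods, datatypes are given as full IRIs because CURIE expansion is omitted, and treating a NaN result or a thrown exception as a failed conversion is a choice made for this sketch, not something this specification defines.

```javascript
// Hypothetical registry keyed by full datatype IRI; CURIE expansion is omitted.
var typeConverters = {};

function registerTypeConversion(iri, converter)
{
   typeConverters[iri] = converter;
}

function convertType(value, inputType, modifier)
{
   var converter = typeConverters[inputType];
   // No converter registered: hand the input back unchanged.
   if(typeof converter !== "function") { return value; }
   try
   {
      var result = converter(value, inputType, modifier);
      // Assumption: a NaN result counts as a failed conversion.
      return (typeof result === "number" && isNaN(result)) ? value : result;
   }
   catch(e)
   {
      // Assumption: a throwing converter also counts as a failed conversion.
      return value;
   }
}

var XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer";
registerTypeConversion(XSD_INTEGER, function(value) { return parseInt(value, 10); });

var count = convertType("42", XSD_INTEGER);
// count is the number 42
var unconverted = convertType("forty-two", XSD_INTEGER);
// unconverted is the string "forty-two", since parseInt produced NaN
var untyped = convertType("2010", "http://example.org/unknown-type");
// untyped is the string "2010", since no converter is registered
```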

4.3.2 Data Store

The DataStore is a set of RDFTriple objects. It provides a basic getter as well as an indexed getter for retrieving individual items from the store. The DataStore can be used to create primitive types as well as store collections of them in the form of RDFTriples.
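
The set semantics of the store, together with the rule stated for add() below that a duplicate triple is reported as added but stored only once, can be sketched as follows. SimpleStore is a hypothetical name, and its triples are plain objects with string-valued fields rather than RDFTriple instances.

```javascript
// Hypothetical store; triples are plain objects with string-valued fields.
function SimpleStore()
{
   this.triples = [];
   this.seen = {};
}

// size reports the number of distinct triples held by the store.
Object.defineProperty(SimpleStore.prototype, "size", {
   get: function() { return this.triples.length; }
});

SimpleStore.prototype.get = function(index) { return this.triples[index]; };

SimpleStore.prototype.add = function(triple) {
   var key = JSON.stringify([triple.subject, triple.property, triple.object]);
   if(!Object.prototype.hasOwnProperty.call(this.seen, key))
   {
      this.seen[key] = true;
      this.triples.push(triple);
   }
   // A duplicate is not stored again, but add() still reports success.
   return true;
};

var store = new SimpleStore();
var firstAdd = store.add({subject: "http://example.org/people#mark",
                          property: "http://xmlns.com/foaf/0.1/name", object: "Mark"});
var secondAdd = store.add({subject: "http://example.org/people#mark",
                           property: "http://xmlns.com/foaf/0.1/name", object: "Mark"});
// firstAdd and secondAdd are both true, and store.size is 1
```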

The forEach method is not properly defined in WebIDL - need to get input from the WebApps Working Group on how best to author this interface.

[NoInterfaceObject]
interface DataStore {
    readonly attribute unsigned long size;
    getter RDFTriple get (in unsigned long index);
    boolean          add (in RDFTriple triple);
    IRI              createIRI (in DOMString iri, in optional Node node);
    PlainLiteral     createPlainLiteral (in DOMString value, in optional DOMString? language);
    TypedLiteral     createTypedLiteral (in DOMString value, in DOMString type);
    BlankNode        createBlankNode (in optional DOMString name);
    RDFTriple        createTriple (in RDFResource subject, in IRI property, in RDFNode object);
    [Null=Null]
    DataStore        filter (in optional Object? pattern, in optional Element? element, in optional RDFTripleFilter filter);
    void             clear ();
    void             forEach (in DataStoreIterator iterator);
    boolean          merge (in DataStore store);
};
Attributes
size of type unsigned long, readonly
A non-negative integer that specifies the size, in RDFTriples, of the store.
No exceptions.
Methods
add
Adds an RDFTriple to the DataStore. Returns True if the RDFTriple was added to the store successfully. Adding an RDFTriple that already exists in the DataStore must return True and must not store two duplicate triples in the DataStore.
Parameter | Type | Nullable | Optional | Description
triple | RDFTriple | no | no | The triple to add to the DataStore.
No exceptions.
Return type: boolean
clear
Clears all data from the store.
No parameters.
No exceptions.
Return type: void
createBlankNode
Creates a BlankNode given an optional name value.
Parameter | Type | Nullable | Optional | Description
name | DOMString | no | yes | The name of the BlankNode, which will be used when Stringifying the BlankNode.
No exceptions.
Return type: BlankNode
createIRI
Creates an IRI given a value and an optional Node.
Parameter | Type | Nullable | Optional | Description
iri | DOMString | no | no | The IRI reference's lexical value.
node | Node | no | yes | An optional DOM Node to associate with the IRI.
No exceptions.
Return type: IRI
createPlainLiteral
Creates a PlainLiteral given a value and an optional language.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The value of the PlainLiteral, which is usually a human-readable string.
language | DOMString | yes | yes | The language that is associated with the PlainLiteral encoded according to the rules outlined in [BCP47].
No exceptions.
Return type: PlainLiteral
createTriple
Creates an RDFTriple given a subject, property and object. If any incoming value does not match the requirements listed below, a Null value must be returned by this method.
Parameter | Type | Nullable | Optional | Description
subject | RDFResource | no | no | The subject value of the RDFTriple. The value must be either an IRI or a BlankNode.
property | IRI | no | no | The property value of the RDFTriple.
object | RDFNode | no | no | The object value of the RDFTriple. The value must be an IRI, PlainLiteral, TypedLiteral, or BlankNode.
No exceptions.
Return type: RDFTriple
createTypedLiteral
Creates a TypedLiteral given a value and a type.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The value of the TypedLiteral.
type | DOMString | no | no | The IRI type of the TypedLiteral. The argument can either be a full IRI or a CURIE.
No exceptions.
Return type: TypedLiteral
filter
Returns a DataStore, which consists of zero or more RDFTriple objects.
The RDFa Working Group is debating whether the return value of the filter method should be a DataStore or a Sequence of RDFTriples. DataStores allow much higher-level functions to be carried out versus a simple Sequence of RDFTriples. However, DataStores may be very memory intensive to construct and manage.
Parameter | Type | Nullable | Optional | Description
pattern | Object | | | A filter pattern that determines which triples to select from the DataStore. The filter pattern makes it easy to perform complex filtering on subject, property and object by providing a flexible data container to convey selection information. The pattern parameter is designed to be easily implemented with data dictionaries or associative arrays, but should be expressed in a way that is natural to the implementation language. To ensure a similar programming environment, the variables that convey the subject, property and object parameters should use keys or variables named "subject", "property" and "object", respectively.

The subject parameter is used to filter RDFTriple objects. Values can be of type DOMString, IRI, or BlankNode. If a DOMString is supplied, the API must convert the value into an IRI or BlankNode before comparison is performed. If the subject is specified, the RDFTriple must not be placed in the final output Array unless the given subject matches the RDFTriple's subject. If a subject parameter is not specified, the function must not reject any triple based on the subject unless specified by the RDFTripleFilter.

The property parameter is used to filter RDFTriple objects. Values can be of type DOMString or IRI. If a DOMString is supplied, the API must convert the value into an IRI before comparison is performed. If the property is specified, the RDFTriple must not be placed in the final output Array unless the given property matches the RDFTriple's property. If a property parameter is not specified, the function must not reject any triple based on the property unless specified by the RDFTripleFilter.

The object parameter is used to filter RDFTriple objects. Values can be of type DOMString, BlankNode, PlainLiteral, TypedLiteral or IRI. If a DOMString is supplied, the API must convert the value into a PlainLiteral before comparison is performed. If the object is specified, the RDFTriple must not be placed in the final output Array unless the given object matches the RDFTriple's object. If an object parameter is not specified, the function must not reject any triple based on the object unless specified by the RDFTripleFilter.

element | Element | | | The parent DOM Element where filtering should start. The implementation must only consider RDF triples on the current DOM Element and its children.
filter | RDFTripleFilter | | | A user-defined function, returning a true or false value, that determines whether or not an RDFTriple should be added to the final Array.
No exceptions.
Return type: DataStore
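
The pattern rules above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the normative behaviour: nodes are plain objects with a hypothetical kind field, a leading "_:" is assumed to mark a blank node label, CURIE expansion is left out, and matchesPattern and filterSketch are invented names. The optional element argument is omitted because the sketch does not touch the DOM.

```javascript
// Sketch only: nodes are plain objects { kind, value }; "kind" is a
// hypothetical field, not part of the RDFa API.
function sameNode(a, b) {
   return a.kind === b.kind && a.value === b.value;
}

function matchesPattern(triple, pattern) {
   var keys = ["subject", "property", "object"];
   for (var i = 0; i < keys.length; i++) {
      var key = keys[i];
      var wanted = pattern[key];
      if (wanted === undefined || wanted === null) { continue; } // not specified
      if (typeof wanted === "string") {
         if (key === "object") {
            // A string object is compared as a PlainLiteral.
            wanted = { kind: "PlainLiteral", value: wanted };
         } else if (key === "subject" && wanted.indexOf("_:") === 0) {
            // Assumption: "_:" marks a blank node label.
            wanted = { kind: "BlankNode", value: wanted };
         } else {
            wanted = { kind: "IRI", value: wanted };
         }
      }
      if (!sameNode(wanted, triple[key])) { return false; }
   }
   return true;
}

// Applies the pattern first, then the optional filter function.
function filterSketch(triples, pattern, filterFn) {
   return triples.filter(function (triple) {
      return matchesPattern(triple, pattern || {}) &&
             (!filterFn || filterFn(triple));
   });
}

var einstein = { kind: "IRI", value: "http://dbpedia.org/resource/Albert_Einstein" };
var triples = [
   { subject: einstein,
     property: { kind: "IRI", value: "http://xmlns.com/foaf/0.1/name" },
     object: { kind: "PlainLiteral", value: "Albert Einstein" } },
   { subject: einstein,
     property: { kind: "IRI", value: "http://dbpedia.org/property/birthPlace" },
     object: { kind: "IRI", value: "http://dbpedia.org/resource/Germany" } }
];

var byName = filterSketch(triples, { object: "Albert Einstein" });
var all = filterSketch(triples, {});
```

Note that a string supplied for object is compared as a PlainLiteral, so passing "http://dbpedia.org/resource/Germany" as a string would not match the IRI-valued birthPlace triple in this sketch; an IRI object would have to be supplied instead.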
forEach
Calls the given callback for each item in the DataStore.
While the forEach() method is intended to provide a functional mechanism for iterating through a DataStore, it has been questioned whether the interface would be useful for developers since there is already a procedural array-index-based iteration mechanism built into a DataStore.
Parameter | Type | Nullable | Optional | Description
iterator | DataStoreIterator | | | A function that takes the following arguments: index, subject, property, object. The function is called for each item in the DataStore.
No exceptions.
Return type: void
get
Returns the RDFTriple object at the given index in the list.
Parameter | Type | Nullable | Optional | Description
index | unsigned long | | | The index of the RDFTriple in the list to retrieve. The value must be a non-negative integer that is less than DataStore::length.
No exceptions.
Return type: getter RDFTriple
merge
Merges all triples in an external DataStore into this DataStore. Duplicate triples must not be inserted into the same data store. Returns True if all triples were merged into the store successfully.
Parameter | Type | Nullable | Optional | Description
store | DataStore | | | The external DataStore to merge into this DataStore.
No exceptions.
Return type: boolean
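
The duplicate-handling rules for add and merge can be illustrated with a small in-memory sketch. SketchStore and tripleKey are hypothetical names, and triples are simplified to objects holding plain strings; a real DataStore would compare RDF nodes rather than strings.

```javascript
// Minimal in-memory sketch; not the DataStore interface itself.
function SketchStore() {
   this.triples = [];
}

// Hypothetical helper: builds a comparison key for a triple whose
// subject, property and object are plain strings in this sketch.
function tripleKey(t) {
   return JSON.stringify([t.subject, t.property, t.object]);
}

SketchStore.prototype.add = function (triple) {
   var key = tripleKey(triple);
   var exists = this.triples.some(function (t) { return tripleKey(t) === key; });
   if (!exists) { this.triples.push(triple); }
   return true; // true even when the triple was already present
};

SketchStore.prototype.merge = function (other) {
   var self = this;
   // add() already skips duplicates, so merge can simply delegate to it.
   return other.triples.every(function (t) { return self.add(t); });
};

var a = new SketchStore();
var b = new SketchStore();
var t = { subject: "#bob", property: "foaf:name", object: "Bob" };
a.add(t);
a.add({ subject: "#bob", property: "foaf:name", object: "Bob" }); // duplicate
b.add(t);
b.add({ subject: "#bob", property: "foaf:nick", object: "bobby" });
var merged = a.merge(b);
```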

The DataStoreIterator interface is used by the forEach() method on the DataStore when processing all of the triples in a DataStore.

[NoInterfaceObject, Callback, Null=Null]
interface DataStoreIterator {
    void process (in int index, in RDFResource subject, in IRI property, in RDFNode object);
};
Methods
process
A callable function that takes an array index, subject, property, and object as arguments and processes each based on a developer-supplied algorithm.
Parameter | Type | Nullable | Optional | Description
index | int | | | The offset into the DataStore that contains the current RDFTriple being processed.
subject | RDFResource | | | The subject of the RDFTriple being processed.
property | IRI | | | The property associated with the RDFTriple being processed.
object | RDFNode | | | The object associated with the RDFTriple being processed.
No exceptions.
Return type: void
Examples

This section is non-normative.

The following example demonstrates the two mechanisms that are available for navigating through a DataStore: index getter-based iteration and array index-based iteration.

// Get all triples in the document.data.store object
var store = document.data.store.filter(); 

// Loop through the DataStore
for(var i = 0; i < store.size; i++) 
{
   // a developer may use the get() interface to retrieve a triple. This 
   // approach is called index getter-based iteration.
   var t1 = store.get(i);
   
   // alternatively, a developer may use the indexed-method of retrieving a
   // triple from the DataStore. This approach is called array index-based
   // iteration.
   var t2 = store[i];
}

The following example demonstrates a more functional mechanism that can be used to process each triple in a DataStore:

// Specify a callback function as defined by the DataStoreIterator
// interface
function alertTriple(index, subject, property, object)
{
   alert("DataStore subject: " + subject + ", property: " + property + ", object: " + object);
}

// Iterate over the DataStore, executing the alertTriple callback for each 
// triple in the DataStore.
document.data.store.forEach(alertTriple);

4.3.3 Data Parser

The DataParser is capable of processing a DOM Element and placing the parsing results into a DataStore. While this section specifies how one would parse RDFa data and place it into a DataStore, the interface is also intended to support the parsing and storage of various Microformats, eRDF, GRDDL, DC-HTML, and Microdata. Web developers who would like to write custom parsers should extend this interface.

[NoInterfaceObject]
interface DataParser {
    attribute DataStore store;
    [Null=Null]
    DataIterator iterate (in optional Object? pattern, in optional Element? element, in optional RDFTripleFilter filter);
    boolean      parse (in Element domElement);
};
Attributes
store of type DataStore
The DataStore that is associated with the DataParser. The results of each parsing run will be placed into the store.
No exceptions.
Methods
iterate
Returns a DataIterator, which is capable of iterating through a set of RDF triples, one RDFTriple at a time. The DataIterator is most useful in small memory footprint environments, or in documents that contain a very large number of triples.
It has been noted that having a stream-based mechanism for processing triples via the iterate() method and a process-and-store-based mechanism for storing triples via the parse() method on the same interface is confusing. Each mechanism provides an alternate way of processing triples in a document. In the future, iterate() and parse() may be separated out into two distinct interfaces.
Parameter | Type | Nullable | Optional | Description
pattern | Object | | | A filter pattern that determines which triples to select from the DataStore. The filter pattern makes it easy to perform complex filtering on subject, property and object by providing a flexible data container to convey selection information. The pattern parameter is designed to be easily implemented with data dictionaries or associative arrays, but should be expressed in a way that is natural to the implementation language. To ensure a similar programming environment, the variables that convey the subject, property and object parameters should use keys or variables named "subject", "property" and "object", respectively.

The subject parameter is used to filter RDFTriple objects. Values can be of type DOMString, IRI, or BlankNode. If a DOMString is supplied, the API must convert the value into an IRI or BlankNode before comparison is performed. If the subject is specified, the RDFTriple must not be placed in the final output Array unless the given subject matches the RDFTriple's subject. If a subject parameter is not specified, the function must not reject any triple based on the subject unless specified by the RDFTripleFilter.

The property parameter is used to filter RDFTriple objects. Values can be of type DOMString or IRI. If a DOMString is supplied, the API must convert the value into an IRI before comparison is performed. If the property is specified, the RDFTriple must not be placed in the final output Array unless the given property matches the RDFTriple's property. If a property parameter is not specified, the function must not reject any triple based on the property unless specified by the RDFTripleFilter.

The object parameter is used to filter RDFTriple objects. Values can be of type DOMString, BlankNode, PlainLiteral, TypedLiteral or IRI. If a DOMString is supplied, the API must convert the value into a PlainLiteral before comparison is performed. If the object is specified, the RDFTriple must not be placed in the final output Array unless the given object matches the RDFTriple's object. If an object parameter is not specified, the function must not reject any triple based on the object unless specified by the RDFTripleFilter.

element | Element | | | The parent DOM Element where filtering should start. The implementation must only consider RDF triples on the current DOM Element and its children.
filter | RDFTripleFilter | | | A user-defined function, returning a true or false value, that determines whether or not an RDFTriple should be added to the final Array.
No exceptions.
Return type: DataIterator
parse
Parses starting at the given DOM Element and populates the store with the information that is discovered. If a starting element isn't specified, or the value of the starting element is Null, then the document object must be used as the starting element.
Even though a specific DOM Element can be specified to start the process of placing RDFTriples into the DataStore, the entire document must be processed by an RDFa Processor due to context that may affect the generation of a set of triples. Specifying the DOM Element is useful when a subset of the document data is to be stored in the DataStore.
There are two ways to approach this mechanism. The first is to only parse the sub-tree, ignoring the context of the greater document. The second is to parse the entire document, but only store triples that are a part of the sub-tree.
Parameter | Type | Nullable | Optional | Description
domElement | Element | | | The DOM Element that should trigger triple generation.
No exceptions.
Return type: boolean

4.3.4 Data Iterator

The DataIterator interface may undergo changes in the next version of the RDFa API specification. Implementers are advised not to implement the interface yet, and to wait for the next revision of this specification.

The DataIterator iterates through a DOM subtree and returns RDFTriples that match a filter function or triple pattern. A DOM Element can be specified so that only triples contained in the Element and its children will be a part of the iteration. The DataIterator is provided in order to allow implementers to provide a less memory intensive implementation for processing triples in very large documents.

A DataIterator is created by calling the document.data.parser.iterate() method.

[NoInterfaceObject]
interface DataIterator {
             attribute DataStore       store;
    readonly attribute Element         root;
    readonly attribute RDFTripleFilter filter;
    readonly attribute RDFTriple       triplePattern;
    RDFTriple next ();
};
Attributes
filter of type RDFTripleFilter, readonly
The RDFTripleFilter is a function that is provided by developers to filter RDFTriples in a subtree.
No exceptions.
root of type Element, readonly
The DOM Element that was used as the starting point for extracting RDFTriples.
No exceptions.
store of type DataStore
The DataStore that is associated with the DataIterator.
No exceptions.
triplePattern of type RDFTriple, readonly
An RDF triple pattern is a set of filter parameters that can be passed to an RDFTripleFilter to match particular triple patterns.
No exceptions.
Methods
next
Returns the next RDFTriple object that is found in the DOM subtree or NULL if no more RDFTriples match the filtering criteria.
No parameters.
No exceptions.
Return type: RDFTriple
Example

This section is non-normative.

The following examples describe how various filter patterns can be applied to the DOM via document.data.parser.iterate().

// Get a DataIterator via the DataParser interface. A DataIterator allows 
// stream-based access to document data and does not store the matched
// triples into a DataStore. DataIterators are useful in environments where 
// memory is not plentiful and stream-based processing would reduce memory 
// usage.
var iter1 = document.data.parser.iterate();

// Iterate through each triple matched by the iterator
for(var triple = iter1.next(); triple != null; triple = iter1.next()) 
{
   // do something with the RDFTriple
}

// A developer may provide a pattern filter to use when filtering on subject.
// The following iterator would only iterate on triples with a subject
// of "http://www.example.com#foo"
var iter2 = document.data.parser.iterate( {"subject": "http://www.example.com#foo"} );

// A developer may also provide a pattern filter to use when filtering on property.
// The following iterator would only iterate on triples with a property
// of "http://xmlns.com/foaf/0.1/name"
var iter3 = document.data.parser.iterate( {"property": "foaf:name"} );
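
The next() contract used in the loop above (return the next matching triple, or null once none remain) can be sketched without a DOM. In this illustration an array stands in for the triples a parser would discover, and makeIterator is a hypothetical name that is not part of the API.

```javascript
// Sketch of a pull-style iterator over an array standing in for the
// triples a parser would discover; makeIterator is a hypothetical name.
function makeIterator(triples, filterFn) {
   var position = 0;
   return {
      next: function () {
         while (position < triples.length) {
            var candidate = triples[position++];
            if (!filterFn || filterFn(candidate)) { return candidate; }
         }
         return null; // no more matching triples
      }
   };
}

var source = [
   { subject: "#a", property: "foaf:name", object: "Alice" },
   { subject: "#a", property: "foaf:nick", object: "al" },
   { subject: "#b", property: "foaf:name", object: "Bob" }
];
var iter = makeIterator(source, function (t) { return t.property === "foaf:name"; });

// Same loop shape as the example above.
var seen = [];
for (var triple = iter.next(); triple != null; triple = iter.next()) {
   seen.push(triple.object);
}
```

Because the iterator keeps only a position rather than a copy of the matches, memory use stays constant regardless of how many triples match, which is the motivation given for the DataIterator.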

4.3.5 Property Group

The PropertyGroup interface provides a view on a particular subject contained in the DataStore. The PropertyGroup aggregates the RDFTriples as a single language-native object in order to provide a more natural programming primitive for developers.

PropertyGroup attributes can be accessed in the following ways in ECMAScript:

// Retrieves a PropertyGroup for the given subject - since the IRI is
// fragment-relative, the document's IRI is used as the base IRI.
var person = document.getItemBySubject("#bob");

// Access the PropertyGroup's name attribute via an IRI
var name1 = person.get("http://xmlns.com/foaf/0.1/name");

// Access the PropertyGroup's name attribute via a CURIE
var name2 = person.get("foaf:name");
// At this point name1 and name2 are equivalent
[NoInterfaceObject]
interface PropertyGroup {
    attribute Object info;
    attribute IRI[]  properties;
    Sequence<RDFNode> get (in DOMString property);
};
Attributes
info of type Object
A developer-specific info object to associate with the PropertyGroup. The info object must provide a property-based accessor mechanism for languages that support it, such as ECMAScript. For DOM-based environments, the Element that originated the PropertyGroup must be specified in a property called source in the info attribute.
No exceptions.
properties of type array of IRI
A list of all attributes that are available via this PropertyGroup.
No exceptions.
Methods
get
Returns a sequence of IRIs, BlankNodes, PlainLiterals, and/or TypedLiterals in the PropertyGroup that have a property IRI that is equivalent to the given value.
Parameter | Type | Nullable | Optional | Description
property | DOMString | | | A stringified IRI representing a property whose values are to be retrieved from the PropertyGroup. For example, using a property of http://xmlns.com/foaf/0.1/name will return a sequence of values that represent FOAF names in the PropertyGroup. The given property may also be a CURIE.
No exceptions.
Return type: Sequence<RDFNode>
Example

This section is non-normative.

The following examples demonstrate how to use the PropertyGroup interface.

// Get a PropertyGroup by subject. In this case, the PropertyGroup will 
// represent Ivan Herman, the current Semantic Web Activity Lead at the
// World Wide Web Consortium
var ivan = document.getItemBySubject("http://www.ivan-herman.net/foaf#me");

// Set the context mapping for the foaf prefix so that we can retrieve 
// properties using CURIEs
document.data.context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");

// Get the names associated with Ivan Herman
var names = ivan.get("foaf:name");
// At this point, names will be a list containing two items, 
// ["Herman Iván", "Ivan Herman"], Ivan's native name and Ivan's 
// Westernized name.

// Get the titles associated with Ivan Herman
var titles = ivan.get("foaf:title");
// At this point, the titles list will contain only a single item - ["Dr."]

// Get all of the work homepages associated with Ivan
var pages = ivan.get("foaf:workInfoHomepage");
// At this point, the pages list will contain three entries:
// "http://www.iswsa.org/", "http://www.iw3c2.org", and
// "http://www.w3.org/2001/sw/#activity"
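
To make the equivalence between IRI and CURIE lookups concrete, the following sketch builds a PropertyGroup-like view from a plain list of triples. Everything here is an assumption for illustration: the prefixes table stands in for the DataContext mappings, and expand and propertyGroupSketch are invented names. The sample data reuses the names and title from the example above.

```javascript
// Hypothetical prefix table standing in for the DataContext mappings.
var prefixes = { foaf: "http://xmlns.com/foaf/0.1/" };

// Expands "prefix:reference" when the prefix is known; otherwise the
// value is assumed to already be a full IRI.
function expand(value) {
   var colon = value.indexOf(":");
   if (colon > 0) {
      var prefix = value.substring(0, colon);
      if (Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
         return prefixes[prefix] + value.substring(colon + 1);
      }
   }
   return value;
}

// Builds a PropertyGroup-like view of one subject from plain triples.
function propertyGroupSketch(triples, subject) {
   var own = triples.filter(function (t) { return t.subject === subject; });
   return {
      // Distinct property IRIs, in order of first appearance.
      properties: own.map(function (t) { return t.property; })
                     .filter(function (p, i, all) { return all.indexOf(p) === i; }),
      get: function (property) {
         var iri = expand(property);
         return own.filter(function (t) { return t.property === iri; })
                   .map(function (t) { return t.object; });
      }
   };
}

var me = "http://www.ivan-herman.net/foaf#me";
var data = [
   { subject: me, property: "http://xmlns.com/foaf/0.1/name", object: "Herman Iván" },
   { subject: me, property: "http://xmlns.com/foaf/0.1/name", object: "Ivan Herman" },
   { subject: me, property: "http://xmlns.com/foaf/0.1/title", object: "Dr." }
];
var ivan = propertyGroupSketch(data, me);
var viaCurie = ivan.get("foaf:name");
var viaIri = ivan.get("http://xmlns.com/foaf/0.1/name");
```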
The Source Pointer

The RDFa Working Group is currently discussing the best mechanism to enable access to the DOMNode that contains a particular subject, predicate or object. While several mechanisms have been proposed, none of them is easy or straightforward to use. This mechanism will be modified heavily in the next version of the document in order to allow easier access to the DOMNode associated with a particular piece of structured data:

var people = document.getItemsByType("foaf:Person");
 
for (var i = 0; i < people.length; i++) {
  people[i].info("foaf:name", "source")[0].style.border = "1px solid blue";
}
Property Group Templates

A query can be used to retrieve not only basic PropertyGroups, but can also specify how PropertyGroups are built by utilizing PropertyGroup Templates.

There has been a complaint that this section comes from out of nowhere. The purpose of this section is to describe that PropertyGroups can be mapped to native language objects to ease development. We may need to elaborate more on this at this point in the document to help integrate this section with the flow of the document.

For example, assume our source document contains the following event, marked up using the Google Rich Snippet Event format (example taken from the Rich Snippet tutorial, and slightly modified):

<div prefix="v: http://rdf.data-vocabulary.org/#" typeof="v:Event"> 
  <a rel="v:url" href="http://amyandtheredfoxies.example.com/events" 
     property="v:summary">Tour Info: Amy And The Red Foxies</a>
  
  <span rel="v:location">
  	<a typeof="v:Organization" rel="v:url" href="http://www.kammgarn.de/" property="v:name">Kammgarn</a>
  </span>
  <div rel="v:photo"><img src="foxies.jpg"/></div>
  <span property="v:summary">Hey K-Town, Amy And The Red Foxies will rock Kammgarn in October.</span>
  When: 
  <span property="v:startDate" content="20091015T19:00">15. Oct., 7:00 pm</span>-
  <span property="v:endDate" content="20091015T21:00">9:00 pm</span>

  Category: <span property="v:eventType">concert</span>
</div>

To query for all Event PropertyGroups we know that we can do this:

var ar = query.select({ "rdf:type": "http://rdf.data-vocabulary.org/#Event" });

However, to build a special PropertyGroup that contains the summary, start date and end date, we need only do this:

var events = query.select({ "rdf:type": "http://rdf.data-vocabulary.org/#Event" },
                          {"rdf:type" : "type", "v:summary": "summary", 
                           "v:startDate": "start", "v:endDate": "end"} );

The second parameter is a Property Group Template. Each key-value pair specifies an IRI to map to an attribute in the resulting PropertyGroup object.
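
One way to picture what a template does is the following sketch, which maps property names onto attribute names for a single subject. The helper applyTemplate, the shape of eventValues, and the choice to expose only the first value of each property are all assumptions of this sketch; the text above does not specify how multiple values are handled.

```javascript
// applyTemplate is a hypothetical helper for this sketch. "values" maps
// property names to arrays of values for one subject; "template" maps
// property names to the attribute names to create.
function applyTemplate(values, template) {
   var result = {};
   Object.keys(template).forEach(function (property) {
      var found = values[property];
      // Assumption: expose the first value only.
      if (found && found.length > 0) {
         result[template[property]] = found[0];
      }
   });
   return result;
}

var eventValues = {
   "rdf:type": ["http://rdf.data-vocabulary.org/#Event"],
   "v:summary": ["Tour Info: Amy And The Red Foxies"],
   "v:startDate": ["20091015T19:00"],
   "v:endDate": ["20091015T21:00"],
   "v:eventType": ["concert"]
};
var concert = applyTemplate(eventValues, {
   "rdf:type": "type", "v:summary": "summary",
   "v:startDate": "start", "v:endDate": "end"
});
```

Properties that the template does not mention, such as v:eventType here, are simply left out of the resulting object.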

Exposing the embedded data in each PropertyGroup makes it easy to create an HTML anchor that will allow users to add the event to their Google Calendar, as follows:

var anchor, button, event, i;
 
for (i = 0; i < events.length; i++) {
  // Get the PropertyGroup
  event = events[i];
 
  // Create the anchor
  anchor = document.createElement("a");
 
  // Point to Google Calendar
  anchor.href = "http://www.google.com/calendar/event?action=TEMPLATE"
    + "&text=" + event.summary + "&dates=" + event.start + "/" + event.end;
 
  // Add the button
  button = document.createElement("img");
  button.src = "http://www.google.com/calendar/images/ext/gc_button6.gif";
  anchor.appendChild(button);
 
  // Add the link and button to the DOM object
  // NOTE: The next call will likely change in the next version of the RDFa
  //       API specification as it is too difficult to use for most developers.
  event.info("rdf:type", "source")[0].appendChild(anchor);
}

The result will be that the event has an HTML a element at the end (and any Event on the page will follow this pattern):

<div prefix="v: http://rdf.data-vocabulary.org/#" typeof="v:Event">
  .
  .
  .
  <a href="http://www.google.com/calendar/event?action=TEMPLATE& →
text=Hey+K-Town,+Amy+And+The+Red+Foxies+will+rock+Kammgarn+in+October.&dates=20091015T1900Z/20091015T2100Z">
    <img src="http://www.google.com/calendar/images/ext/gc_button6.gif" />
  </a>
</div>

For more detailed information about queries see the DataQuery interface.

4.3.6 Data Query

The DataQuery interface provides a means to query a DataStore. While this interface provides a simple mechanism for querying a DataStore for RDFa, it is expected that developers will implement other query interfaces that conform to this DataQuery interface for languages such as SPARQL or other domain-specific languages.

[NoInterfaceObject]
interface DataQuery {
    attribute DataStore store;
    Sequence<PropertyGroup> select (in Object? query, in optional Object template);
};
Attributes
store of type DataStore
The DataStore that is associated with the DataQuery.
No exceptions.
Methods
select
Generates a sequence of PropertyGroups that matches the given selection criteria.
Parameter | Type | Nullable | Optional | Description
query | Object | | | An associative array containing properties as keys and objects to match as values. If the query is null, every item in the DataStore that the query is associated with must be returned.
template | Object | | | A template describing the attributes to create in each PropertyGroup that is returned. The template is an associative array containing properties as keys and attribute names that should be created in the returned PropertyGroup as values.
No exceptions.
Return type: Sequence<PropertyGroup>
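
The selection behaviour can be sketched as grouping triples by subject and keeping the subjects that satisfy every property/value pair in the query. This is an illustration under assumptions: triples hold plain strings, no CURIE expansion is performed, templates are ignored, and selectSketch is an invented name.

```javascript
// Sketch: triples use plain strings; selectSketch is a hypothetical name.
function selectSketch(triples, query) {
   // Group triples by subject.
   var bySubject = {};
   triples.forEach(function (t) {
      (bySubject[t.subject] = bySubject[t.subject] || []).push(t);
   });

   return Object.keys(bySubject).filter(function (subject) {
      // A null query selects every item in the store.
      if (query === null || query === undefined) { return true; }
      // Every property/value pair in the query must be present.
      return Object.keys(query).every(function (property) {
         return bySubject[subject].some(function (t) {
            return t.property === property && t.object === query[property];
         });
      });
   }).map(function (subject) {
      return { subject: subject, triples: bySubject[subject] };
   });
}

var EVENT = "http://rdf.data-vocabulary.org/#Event";
var graph = [
   { subject: "_:e1", property: "rdf:type", object: EVENT },
   { subject: "_:e1", property: "v:eventType", object: "concert" },
   { subject: "#bob", property: "rdf:type", object: "foaf:Person" }
];
var events = selectSketch(graph, { "rdf:type": EVENT });
var everything = selectSketch(graph, null);
```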

4.4 The Document Interface

The RDFa API is designed to provide a small, powerful set of interfaces that a developer may use to retrieve RDF triples from a Web document. The core interfaces were described in the previous two sections. This section focuses on the final RDFa API that most developers will utilize to generate the objects that are described in the RDF Interfaces and the Structured Data Interfaces sections. The following API is provided by this specification:

  • Document Interface Extensions — A set of extensions to the Document interface to help developers manage structured data in Web documents.
  • Document Data — The abstract container object for managing structured data in a Document.

4.4.1 Document Extensions

The following section describes all of the extensions that are necessary to enable manipulation of structured data within a Web Document.

[Supplemental, NoInterfaceObject]
interface RDFaDocument {
    readonly attribute DocumentData data;
    DocumentData            createDocumentData ();
    Sequence<PropertyGroup> getItemsByType (in DOMString type);
    PropertyGroup           getItemBySubject (in DOMString subject);
    Sequence<PropertyGroup> getItemsByProperty (in DOMString property, in DOMString value);
    NodeList                getElementsByType (in DOMString type);
    NodeList                getElementsBySubject (in DOMString subject);
    NodeList                getElementsByProperty (in DOMString property, in DOMString value);
};
Attributes
data of type DocumentData, readonly
The DocumentData interface is useful for extracting and storing data that is associated with the Document.
No exceptions.
Methods
createDocumentData
Creates a DocumentData object and returns it. The object that is returned must have the store, context, parser and query attributes initialized to sensible defaults that would allow the immediate extraction of RDFa data from the current document by calling DocumentData.parser.parse(document).
No parameters.
No exceptions.
Return type: DocumentData
getElementsByProperty
Retrieves a list of Node objects based on the value of a given property.
Parameter | Type | Nullable | Optional | Description
property | DOMString | | | A DOMString representing an IRI-based property. The string can either be a full IRI or a CURIE.
value | DOMString | | | A DOMString representing the value to match against.
No exceptions.
Return type: NodeList
getElementsBySubject
Retrieves a NodeList consisting of Nodes that have explicitly specified the given subject.
Parameter | Type | Nullable | Optional | Description
subject | DOMString | | | A DOMString representing an IRI-based subject. The string can either be a full IRI or a CURIE.
No exceptions.
Return type: NodeList
getElementsByType
Retrieves a list of Nodes based on the object type of the data that they specify.
Parameter | Type | Nullable | Optional | Description
type | DOMString | | | A DOMString representing an rdf:type to select against.
No exceptions.
Return type: NodeList
getItemBySubject
Retrieves a PropertyGroup object based on its subject.
Parameter | Type | Nullable | Optional | Description
subject | DOMString | | | A DOMString representing an IRI-based subject. The string can either be a full IRI or a CURIE.
No exceptions.
Return type: PropertyGroup
getItemsByProperty
Retrieves a list of PropertyGroup objects based on the values of a property.
Parameter | Type | Nullable | Optional | Description
property | DOMString | | | A DOMString representing an IRI-based property. The string can either be a full IRI or a CURIE.
value | DOMString | | | A DOMString representing the value to match against.
No exceptions.
Return type: Sequence<PropertyGroup>
getItemsByType
Retrieves a list of PropertyGroup objects based on their rdf:type property.
Parameter | Type | Nullable | Optional | Description
type | DOMString | | | A DOMString representing an rdf:type to select against.
No exceptions.
Return type: Sequence<PropertyGroup>
Document implements RDFaDocument;
All instances of the DOM Document interface must implement RDFaDocument.

4.4.2 DOMImplementation Extensions

If the RDFa API is implemented in a DOM environment and a DOMImplementation interface is provided, the following additional requirements for the hasFeature() method must be met:

interface DOMImplementation {
    boolean hasFeature (in DOMString feature, in DOMString version);
};
Methods
hasFeature
Checks to see whether or not the DOM implementation has exposed all of the mandatory RDFa API features specified in this specification. An implementation that supports all of the mandatory features in this specification must return true for a feature string of "RDFaAPI" and a version string of "1.1".
Parameter | Type | Nullable | Optional | Description
feature | DOMString | | | The feature string to use when checking to see if the DOM environment exposes all of the RDFa API attributes and methods.
version | DOMString | | | The version string to use when checking to see if the DOM environment exposes all of the RDFa API attributes and methods.
No exceptions.
Return type: boolean
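
A feature check based on the requirement above might look like the following sketch. supportsRdfaApi is a hypothetical helper, and the conforming and legacy objects are stand-ins for document.implementation so that the sketch can be exercised outside a browser.

```javascript
// supportsRdfaApi is a hypothetical helper. It accepts any object with a
// DOMImplementation-style hasFeature() method.
function supportsRdfaApi(implementation) {
   return !!implementation &&
          typeof implementation.hasFeature === "function" &&
          implementation.hasFeature("RDFaAPI", "1.1") === true;
}

// Stand-ins for document.implementation, for illustration only.
var conforming = {
   hasFeature: function (feature, version) {
      return feature === "RDFaAPI" && version === "1.1";
   }
};
var legacy = { hasFeature: function () { return false; } };

var withApi = supportsRdfaApi(conforming);
var withoutApi = supportsRdfaApi(legacy);
```

In a DOM environment the helper would be called as supportsRdfaApi(document.implementation) before any of the RDFa API methods are used.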

4.4.3 Document Data

The DocumentData interface is used to create structured-data related context, storage, parsing and query objects.

interface DocumentData {
    attribute DataStore   store;
    attribute DataContext context;
    attribute DataParser  parser;
    attribute DataQuery   query;
    DataContext createContext ();
    DataStore   createStore (in optional DOMString type);
    DataParser  createParser (in DOMString type, in DataStore store);
    DataQuery   createQuery (in DOMString type, in DataStore store);
};
Attributes
context of type DataContext
The default DataContext for the document.
No exceptions.
parser of type DataParser
The default DataParser for the document.
No exceptions.
query of type DataQuery
The default DataQuery for the document.
No exceptions.
store of type DataStore
The default DataStore for the document.
No exceptions.
Methods
createContext
Creates a DataContext and returns it.
No parameters.
No exceptions.
Return type: DataContext
createParser
Creates a DataParser of the given type and returns it.
Parameter | Type | Nullable | Optional | Description
type | DOMString | | | The type of DataParser to create. A value of "rdfa" must be accepted for all conforming implementations of this specification.
store | DataStore | | | The DataStore to associate with the DataParser.
No exceptions.
Return type: DataParser
createQuery
Creates a DataQuery for the given store.
Parameter | Type | Nullable | Optional | Description
type | DOMString | | | The type of query to create for the given store. A value of "rdfa" must be accepted for all conforming implementations of this specification. Implementations may provide alternative query interfaces, such as SPARQL, SQL, HQL, GQL, or other query languages to enable innovative new ways of querying the underlying storage mechanism.
store | DataStore | | | The DataStore to associate with the DataQuery.
No exceptions.
Return type: DataQuery
createStore
Creates a DataStore and returns it. If the type is not specified, a DataStore must be created and returned. Alternatively, developers may provide other DataStore implementations such as persisted triple stores, quad stores, distributed graph stores and other more advanced storage mechanisms. The type determines the underlying DataStore that will be created.
Parameter | Type | Nullable | Optional | Description
type | DOMString | | | The type of DataStore to create. A value of "triple" must be accepted for all conforming implementations of this specification. If the type is omitted, a value of "triple" must be assumed.
No exceptions.
Return type: DataStore

4.4.4 Pattern Filters

An important goal of the RDFa API is to help Web developers filter the set of RDF triples in a document down to only the ones that interest them. This section covers pattern-based filters, which match on one or more of the subject, property, or object values of RDF triples. This section also introduces the interfaces for the other filter types.

Function Filters

Filter criteria may also be defined by the developer as a filter function. The RDFTripleFilter is a callable function that determines whether an RDFTriple should be included in the set of output triples.

[NoInterfaceObject, Callback, Null=Null]
interface RDFTripleFilter {
    boolean match (in RDFTriple triple);
};
Methods
match
A callable function that returns true if the input RDFTriple should be included in the output set, or false if the input RDFTriple should be rejected from the output set.
Parameter | Type | Nullable | Optional | Description
triple | RDFTriple | | | The triple to test against the filter.
No exceptions.
Return type: boolean
Example

This section is non-normative.

The examples below use the following HTML code:

<div id="start" about="http://dbpedia.org/resource/Albert_Einstein">

  <span property="foaf:name">Albert Einstein</span>

  <span property="dbp:dateOfBirth" datatype="xsd:date">1879-03-14</span>
  <div rel="dbp:birthPlace" resource="http://dbpedia.org/resource/Germany">
    <span property="dbp:conventionalLongName">Federal Republic of Germany</span>
  </div>
</div>

The following examples demonstrate the use of document.data.store.filter() and document.data.parser.iterate() in ECMAScript.

// create a filter function that keeps only triples whose property is in the
// foaf namespace.
function myFilter(element, subject, property, object)
{
   if(property.value.search(/http:\/\/xmlns\.com\/foaf\/0\.1/) >= 0)
   {
      return true;
   }
   // reject every other triple explicitly
   return false;
}

// start filtering at the element with the id attribute whose value is "start"
// using the DataStore's filter() method
var store = document.data.store.filter({}, document.getElementById("start"), myFilter);
store.forEach(function (index, subject, property, object) { alert(object); });
// The code above will display one alert dialog box containing the following 
// text: "Albert Einstein".

// start filtering at the element with the id attribute whose value is "start"
// using the DataParser's iterate() method
var iter = document.data.parser.iterate({}, document.getElementById("start"), myFilter);
for(var triple=iter.next(); triple != null; triple = iter.next())
{
   alert(triple.object);
}
// The code above will display one alert dialog box containing the following 
// text: "Albert Einstein".

5. The Initialization Process

The RDFa API must be initialized before the Web developer has access to any of the methods that are defined in this specification. To initialize the API in a browser-based environment, an implementor must do the following:

  1. create a default Store object, which will hold information obtained from parsing;
  2. create a default Parser object, passing it a pointer to a store;
  3. initiate parsing, to extract information from some object -- usually a DOM object -- and place it into the store;
  4. create a default Query object which can be used to interrogate the information placed in the store.
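
The four steps above can be sketched end to end with stand-in objects. SketchStore, SketchParser, SketchQuery and fakeDocument are hypothetical and are not defined by this specification; "parsing" here only copies pre-extracted items out of a fake document rather than processing a DOM.

```javascript
// Hypothetical stand-ins for the Store, Parser and Query objects.
function SketchStore() { this.items = []; }
SketchStore.prototype.add = function (item) { this.items.push(item); };

function SketchParser(store) { this.store = store; }
// "Parsing" here just copies pre-extracted items from the source object.
SketchParser.prototype.parse = function (source) {
  for (var i = 0; i < source.items.length; i++) {
    this.store.add(source.items[i]);
  }
};

function SketchQuery(store) { this.store = store; }
SketchQuery.prototype.count = function () { return this.store.items.length; };

// A fake "document" carrying one item, used instead of a real DOM.
var fakeDocument = { items: ["Albert Einstein"] };

var store = new SketchStore();         // step 1: create the store
var parser = new SketchParser(store);  // step 2: create the parser, bound to the store
parser.parse(fakeDocument);            // step 3: parse the source into the store
var query = new SketchQuery(store);    // step 4: create a query over the store
```

The point of the sketch is the ordering and the shared store: the parser writes into the same object that the query later reads from.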

Some platforms may merge one or more of these steps as a convenience to developers. For example, a browser that supports this API may carry out all four steps when a document loads, and then expose the Query interface to allow developers to access the PropertyGroups. Some approaches to this will be discussed in the next section, but before we look at those, we'll give a brief overview of how each of these phases would normally be accomplished.

5.1 Creating the Data Store

To create a store, the createStore method is called:

document.data.store = document.data.createStore();

The store object that is created supports the DataStore interface, which provides methods for adding data to the store. These methods are used during parsing to populate the store, but they can also be called directly to add further information. Examples of this are shown later.

5.2 Creating the Data Parser

Once a store has been created, the implementor should create a default parser:

document.data.parser = document.data.createParser("rdfa", document.data.store);

Note that an implementation may support many types of parser, so the specific parser required needs to be specified. For example, an implementation may also support a Microformats hCard parser:

var parser = document.data.createParser("hCard", store);

Implementations may also support different versions of a parser, for example:

var parser1 = document.data.createParser("rdfa1.0", store);
var parser2 = document.data.createParser("rdfa1.1", store);
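
One way to picture how a type string could select a parser is a simple lookup table. The sketch below is an assumption for illustration only: registerParserSketch, createParserSketch and the parserFactories registry are hypothetical names, the returned "parsers" are bare objects, and raising an error for an unregistered type is a choice of the sketch rather than behaviour stated in this document.

```javascript
// Hypothetical registry mapping parser type strings to factory functions.
var parserFactories = {};

function registerParserSketch(type, factory) {
  parserFactories[type] = factory;
}

// Sketch of a createParser-like lookup: find the factory for the requested
// type and bind the resulting parser to the given store.
function createParserSketch(type, store) {
  if (!Object.prototype.hasOwnProperty.call(parserFactories, type)) {
    // Assumption of this sketch only: unknown types are reported as an error.
    throw new Error("No parser registered for type: " + type);
  }
  return parserFactories[type](store);
}

registerParserSketch("rdfa1.0", function (store) {
  return { version: "1.0", store: store };
});
registerParserSketch("rdfa1.1", function (store) {
  return { version: "1.1", store: store };
});

var demoStore = {};
var parser10 = createParserSketch("rdfa1.0", demoStore);
var parser11 = createParserSketch("rdfa1.1", demoStore);
```

Both parsers share demoStore, while the type string alone decides which version is constructed.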

Parsers should probably be identified by a URI rather than a string: not only are there many different Microformats, but people may also want to add parsers for RDF/XML, different varieties of JSON, and so on. However, if we treat the parameter here as a CURIE, then we can avoid having long strings. If we do that, then the version number would need to be combined with the language type: "rdfa1.0", "rdfa1.1", and so on.

5.3 Parsing the DOM

Once we have a parser, we can use it to extract information from sources that contain embedded data. In the following example we extract data from the Document object:

parser.parse( document );

Since the parser is connected to a store, the PropertyGroups obtained from processing the document are now available in the variable document.data.store.

A store can be used more than once for parsing. For example, if we wanted to apply an hCard Microformat parser to the same document, and put the extracted data into the same store, we could do this:

var store = document.data.createStore();
 
document.data.createParser("rdfa", store).parse(document);
document.data.createParser("hCard", store).parse(document);

The store will now contain PropertyGroups from the RDFa parsing, as well as PropertyGroups from the hCard parsing.

If the developer wishes to reuse the store but clear it first, then the clear() method on the DataStore interface can be used.
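
Reusing and clearing a store can be sketched with stand-in objects. ReusableStore and makeParser are hypothetical names; only the existence of a clear() method on the DataStore interface comes from the text above, and each stand-in "parser" simply adds one labelled entry per parse.

```javascript
// Hypothetical store with the two operations this sketch needs.
function ReusableStore() { this.entries = []; }
ReusableStore.prototype.add = function (entry) { this.entries.push(entry); };
ReusableStore.prototype.clear = function () { this.entries.length = 0; };

// Hypothetical parser factory: each "parser" adds one labelled entry per parse.
function makeParser(label, store) {
  return {
    parse: function (source) { store.add(label + ":" + source.name); }
  };
}

var sharedStore = new ReusableStore();
var fakeDoc = { name: "page" };

// Two different parsers write into the same store.
makeParser("rdfa", sharedStore).parse(fakeDoc);
makeParser("hCard", sharedStore).parse(fakeDoc);
var sizeAfterBothParsers = sharedStore.entries.length;

// Clearing empties the store so that it can be reused from scratch.
sharedStore.clear();
makeParser("rdfa", sharedStore).parse(fakeDoc);
var sizeAfterClearAndReparse = sharedStore.entries.length;
```

After both parsers run, the store holds two entries; after clear() and a single re-parse, it holds one.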

Diagram: Show the connection between a PropertyGroup and the DOM.

5.4 Creating the Data Query

Query objects are used to interrogate stores and obtain a list of DOM objects that are linked to PropertyGroups. Since there are a number of languages and techniques that can be used to express queries, we need to specify the type of query object that we'd like:

var query = document.data.createQuery("rdfa", store);

6. Future Discussion

This section is non-normative.

The current version of the RDFa API focuses on filtering RDF triples. It also provides methods for filtering DOM Nodes that contain certain types of RDF triples.

The RDFa Working Group is currently discussing whether or not to include the following advanced functionality:

A. Acknowledgements

At the time of publication, the members of the RDFa Working Group were:

B. References

B.1 Normative references

[BCP47]
A. Phillips, M. Davis. Tags for Identifying Languages. September 2009. IETF Best Current Practice. URL: http://tools.ietf.org/rfc/bcp/bcp47.txt
[HTML-RDFA]
Manu Sporny; et al. HTML+RDFa. 04 March 2010. W3C Working Draft. URL: http://www.w3.org/TR/rdfa-in-html/
[IRI]
M. Duerst, M. Suignard. Internationalized Resource Identifiers (IRI). January 2005. Internet RFC 3987. URL: http://www.ietf.org/rfc/rfc3987.txt
[RDF-CONCEPTS]
Graham Klyne; Jeremy J. Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210
[RDF-TESTCASES]
Jan Grant; Dave Beckett. RDF Test Cases. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-testcases-20040210
[RDFA-CORE]
Shane McCarron; et al. RDFa Core 1.1: Syntax and processing rules for embedding RDF through attributes. 3 August 2010. W3C Working Draft. URL: http://www.w3.org/TR/2010/WD-rdfa-core-20100803
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[WEBIDL]
Cameron McCormack, Sam Weinig. Web IDL. 11 March 2010. W3C Editor's Draft. URL: http://dev.w3.org/2006/webapi/WebIDL/
[XHTML-RDFA]
Shane McCarron; et al. XHTML+RDFa 1.1. 3 August 2010. W3C Working Draft. URL: http://www.w3.org/TR/WD-xhtml-rdfa-20100803

B.2 Informative references

[DOM-LEVEL-1]
Vidur Apparao; et al. Document Object Model (DOM) Level 1. 1 October 1998. W3C Recommendation. URL: http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/
[ECMA-262]
ECMAScript Language Specification, Third Edition. December 1999. URL: http://www.ecma-international.org/publications/standards/Ecma-262.htm
[MICROFORMATS]
Microformats. URL: http://microformats.org
[RDFA-PRIMER]
Mark Birbeck; Ben Adida. RDFa Primer. 14 October 2008. W3C Note. URL: http://www.w3.org/TR/2008/NOTE-xhtml-rdfa-primer-20081014
[RDFA-SYNTAX]
Ben Adida; et al. RDFa in XHTML: Syntax and Processing. 14 October 2008. W3C Recommendation. URL: http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014
[SVGTINY12]
Scott Hayman; et al. Scalable Vector Graphics (SVG) Tiny 1.2 Specification. 22 December 2008. W3C Recommendation. URL: http://www.w3.org/TR/2008/REC-SVGTiny12-20081222
[TURTLE]
David Beckett, Tim Berners-Lee. Turtle: Terse RDF Triple Language. January 2008. W3C Team Submission. URL: http://www.w3.org/TeamSubmission/turtle/