W3C

The RDFa DOM API

W3C Editor's Draft 31 March 2010

This Version:
http://dev.w3.org/rdfa/rdfa-dom-api.html
Latest Published Version:
http://www.w3.org/TR/rdfa-dom-api/
Latest Editor's Draft:
http://dev.w3.org/rdfa/rdfa-dom-api.html
Previous version:
none
Editors:
Benjamin Adrian, German Research Center for Artificial Intelligence GmbH
Manu Sporny, Digital Bazaar, Inc.

Abstract

This specification defines an Application Programming Interface (API) for accessing RDFa [RDFA-SYNTAX] data contained in the Document Object Model (DOM) of a structured document, such as SVG, XHTML, or HTML. The RDFa DOM API can be used to extract specific RDF triples from DOM nodes that contain RDFa. Conversely, it provides methods to retrieve the DOM nodes whose existing RDFa content matches specific RDF triple patterns.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was published by the RDFa Working Group as an Editor's Draft. If you wish to make comments regarding this document, please send them to public-rdfa-wg@w3.org (subscribe, archives). All feedback is welcome.

Publication as an Editor's Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

RDFa [RDFA-SYNTAX] has seen substantial growth since it became an official W3C Recommendation in October 2008. It has been widely adopted by search companies, e-commerce sites, governments, and content management systems. There are numerous interoperable implementations, and growth is expected to continue.

In an effort to ensure that browser-based applications are able to fully utilize RDFa, this specification outlines a set of routines for extracting RDFa from web pages for use in JavaScript applications and browser extensions. The RDFa DOM API allows a programmer to query the RDFa data expressed by a document and then make programmatic decisions based on the semantic data embedded in the page.

2. The RDFa DOM API

The RDFa DOM API strives to be a simple set of calls that a developer may use to retrieve the triples contained in a document. It is an extension of the existing DOM specification [DOM-LEVEL-1]. Its central design decision is to use filters, both for retrieving RDF triples from DOM nodes and for retrieving DOM nodes with RDFa content.

The API first defines a set of basic types (RDF Resource, RDF Literal, RDF Node, URI Reference, and Blank Node) that extend the DOM API so that programs can work with RDF triples. To retrieve RDF triples from DOM nodes, it uses RDFTripleFilters. Inspired by the NodeFilters of the DOM specification, RDFTripleFilters can be user-defined and give application programmers a high degree of flexibility in filtering RDF triples. For convenience, the RDFa DOM API provides a set of predefined RDFTripleFilters (RDF_OBJECT_IS_LITERAL, RDF_OBJECT_IS_URI, and RDF_PREDICATE_IS_TYPE) that cover the main use cases for retrieving RDF from RDFa.

To retrieve DOM nodes that contain certain RDFa content, the RDFa DOM API extends the DOM Node interface with a lookup method called containsRDFa(). An RDFTripleFilter can be passed as an optional parameter in order to test DOM nodes for certain RDF triples.

The Hierarchy of Basic Datatypes

The basic types of the RDFa DOM API are DOMString and a hierarchy rooted at RDF Resource. RDF Literal and RDF Node derive directly from RDF Resource; URI Reference and Blank Node derive from RDF Node.
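
The hierarchy of basic datatypes can be sketched in TypeScript as follows. This is an illustrative reading of the interfaces defined in the following sections, not normative text; the `kind` discriminant and the `describeResource` helper are assumptions added for the example and are not part of the API.

```typescript
// Illustrative TypeScript rendering of the datatype hierarchy.
// The `kind` field is a hypothetical discriminant, not part of the API.
interface RDFResource { readonly kind: "literal" | "uri" | "blank"; }

interface RDFLiteral extends RDFResource {
  readonly kind: "literal";
  readonly value: string;
  readonly language: string | null;
  readonly type: URI | null;
}

interface RDFNode extends RDFResource { readonly kind: "uri" | "blank"; }

interface URI extends RDFNode {
  readonly kind: "uri";
  readonly value: string;
  readonly namespace: string;
}

interface BlankNode extends RDFNode {
  readonly kind: "blank";
  readonly id: string;
}

// Hypothetical helper: renders a resource in an N-Triples-like notation.
function describeResource(r: RDFLiteral | URI | BlankNode): string {
  switch (r.kind) {
    case "literal": return JSON.stringify(r.value);
    case "uri": return `<${r.value}>`;
    case "blank": return `_:${r.id}`;
  }
}
```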

3. RDF Resource

An RDFResource is the abstract root type in the hierarchy of RDF datatypes used in the RDFa DOM API.
interface RDFResource {
    RDFLiteral createLiteral (in DOMString value, in optional DOMString language);

    RDFLiteral createLiteral (in DOMString value, in URI type);

    URI        createURI (in DOMString value);
    URI        createURI (in DOMString namespace, in DOMString suffix);

    BlankNode  createBlankNode (in DOMString id);
};

3.1 Methods

createBlankNode
Creates a new instance of type BlankNode.
Parameter | Type | Nullable | Optional | Description
id | DOMString | no | no | The identifier of the blank node.
No exceptions.
Return type: BlankNode
createLiteral
Creates a new RDFLiteral that is a plain literal in RDF.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The lexical value of this literal encoded in the character encoding of the source document.
language | DOMString | no | yes | A two-character language tag as defined by [RFC-3066], normalized to lowercase.
No exceptions.
Return type: RDFLiteral
createLiteral
Creates a new RDFLiteral that is a typed literal in RDF.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The lexical value of this literal encoded in the character encoding of the source document.
type | URI | no | no | A datatype identified by a URI reference.
No exceptions.
Return type: RDFLiteral
createURI
Creates a new instance of type URI.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The lexical value of this URI.
No exceptions.
Return type: URI
createURI
Creates a new instance of type URI.
Parameter | Type | Nullable | Optional | Description
namespace | DOMString | no | no | The namespace component of the URI's lexical representation.
suffix | DOMString | no | no | The suffix component of the URI's lexical representation.
No exceptions.
Return type: URI
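
The overloaded factory methods above could be approximated as in the following sketch. The plain-object types, the rule for splitting a one-argument URI into namespace and suffix, and leaving `type` as null for plain literals are all assumptions made for illustration; the draft does not specify them.

```typescript
// Plain-object stand-ins for the API types (assumptions for this sketch).
type URI = { readonly value: string; readonly namespace: string };
type RDFLiteral = {
  readonly value: string;
  readonly language: string | null;
  readonly type: URI | null;
};

// One argument: the full lexical value. Two arguments: namespace plus suffix.
function createURI(valueOrNamespace: string, suffix?: string): URI {
  if (suffix !== undefined) {
    return { value: valueOrNamespace + suffix, namespace: valueOrNamespace };
  }
  // Assumed split rule: the namespace ends at the last '#' or '/'.
  const cut = Math.max(valueOrNamespace.lastIndexOf("#"), valueOrNamespace.lastIndexOf("/"));
  return { value: valueOrNamespace, namespace: valueOrNamespace.slice(0, cut + 1) };
}

// A string second argument is a language tag; a URI is a datatype.
function createLiteral(value: string, languageOrType?: string | URI): RDFLiteral {
  if (typeof languageOrType === "string") {
    return { value, language: languageOrType.toLowerCase(), type: null };
  }
  return { value, language: null, type: languageOrType ?? null };
}
```

For example, `createLiteral("Hallo", "DE")` yields a plain literal whose language is normalized to `de`, while passing a URI as the second argument yields a typed literal.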

4. RDF Literal

An RDF Literal represents a lexical value defined in RDFa data. In RDF, a literal may carry either a language tag that describes the language of its text or a datatype given as a URI reference. The language tag identifies a language as defined by [RFC-3066]. The datatype's URI reference defines the datatype of the text value, e.g., xsd:dateTime or xsd:boolean. According to the RDF specification, a given RDF Literal can have either language or type information, but not both. If the type is set, the RDF Literal is a Typed Literal as specified in [RDF-CONCEPTS]. Otherwise, it is a Plain Literal.

interface RDFLiteral : RDFResource {

    readonly attribute DOMString value;
    readonly attribute DOMString language;
    readonly attribute URI       type;

};

4.1 Attributes

language of type DOMString, readonly
A two-character language tag as defined by [RFC-3066], normalized to lowercase. If language is set, then the value of type is a URI with value rdf:Literal.
No exceptions.
type of type URI, readonly
A datatype identified by a URI reference.
No exceptions.
value of type DOMString, readonly
The lexical value of this literal encoded in the character encoding of the source document.
No exceptions.

5. RDF Node

An RDF Node is an abstract type in an RDF graph that subsumes the disjoint subtypes URI and BlankNode.
interface RDFNode : RDFResource {
};

6. URI Reference

A URI in the RDFa DOM API is a URI reference as specified in [RDF-CONCEPTS]. A URI reference consists of a lexical representation that can be split into two components, namespace and suffix. Two URI references are the same if their lexical representations are equal character by character.
interface URI : RDFNode {
    readonly attribute DOMString value;

    readonly attribute DOMString namespace;
};

6.1 Attributes

namespace of type DOMString, readonly
The namespace component of the URI reference.
No exceptions.
value of type DOMString, readonly
The lexical representation of the URI reference.
No exceptions.

7. BlankNode

A BlankNode is an anonymous RDFNode as defined in [RDF-CONCEPTS]. BlankNodes are compared by identifier: two BlankNodes are the same if their identifiers are equal character by character.
interface BlankNode : RDFNode {
    readonly attribute DOMString id;
};

7.1 Attributes

id of type DOMString, readonly
The identifier of the BlankNode.
No exceptions.
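
The two equality rules for RDFNode subtypes (URI references compare by lexical representation, BlankNodes by identifier) could be sketched as below. The plain-object shapes, the `kind` tag, and the `sameNode` helper are assumptions made for this example; the API itself does not define a comparison method.

```typescript
// Hypothetical plain-object stand-ins for the URI and BlankNode interfaces.
type URI = { kind: "uri"; value: string; namespace: string };
type BlankNode = { kind: "blank"; id: string };
type RDFNode = URI | BlankNode;

// Compares two nodes character by character, as the prose describes.
function sameNode(a: RDFNode, b: RDFNode): boolean {
  if (a.kind === "uri" && b.kind === "uri") return a.value === b.value;
  if (a.kind === "blank" && b.kind === "blank") return a.id === b.id;
  return false; // a URI reference never equals a blank node
}
```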

8. RDFTriple

In the RDFa DOM API, RDFTriple defines the data structure that represents an RDF triple as specified in [RDF-CONCEPTS]. The RDFa DOM API is intended to extract RDFTriple objects from the source document.

interface RDFTriple {
    readonly attribute RDFNode     subject;

    readonly attribute URI         predicate;
    readonly attribute RDFResource object;
    readonly attribute URI         context;

    RDFTriple createTriple (in RDFNode? subject, in URI? predicate, in RDFResource? object);

};

8.1 Attributes

context of type URI, readonly
A URI, representing the base URI of the source document.
No exceptions.
object of type RDFResource, readonly
Object value of an RDFTriple.
No exceptions.
predicate of type URI, readonly
Predicate value of an RDFTriple.
No exceptions.
subject of type RDFNode, readonly
Subject value of an RDFTriple.
No exceptions.

8.2 Methods

createTriple
Creates a new RDFTriple object. If the subject, predicate, or object value is set to null, the RDFTriple is considered a triple pattern for use in an RDFTripleFilter.
Parameter | Type | Nullable | Optional | Description
subject | RDFNode | yes | no | Subject value of the RDFTriple.
predicate | URI | yes | no | Predicate value of the RDFTriple.
object | RDFResource | yes | no | Object value of the RDFTriple.
No exceptions.
Return type: RDFTriple
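
The distinction between a concrete triple and a triple pattern can be illustrated with the sketch below. It simplifies resources to plain strings, and it assumes that a null component of a pattern acts as a wildcard; the `Term` type and the `isPattern` and `matches` helpers are hypothetical and do not appear in the API.

```typescript
// Hypothetical simplification: resources are represented by plain strings.
type Term = string | null;

interface RDFTriple {
  readonly subject: Term;
  readonly predicate: Term;
  readonly object: Term;
}

// Mirrors createTriple: any argument may be null.
function createTriple(subject: Term, predicate: Term, object: Term): RDFTriple {
  return { subject, predicate, object };
}

// A triple with at least one null component is treated as a pattern.
function isPattern(t: RDFTriple): boolean {
  return t.subject === null || t.predicate === null || t.object === null;
}

// Null components act as wildcards (an assumption about pattern semantics).
function matches(t: RDFTriple, pattern: RDFTriple): boolean {
  return (pattern.subject === null || pattern.subject === t.subject)
      && (pattern.predicate === null || pattern.predicate === t.predicate)
      && (pattern.object === null || pattern.object === t.object);
}
```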

9. RDF Triple List

The RDFTripleList is a plain list of RDFTriple objects.

interface RDFTripleList {
    readonly attribute unsigned long length;

    RDFTriple get (in unsigned long index);
};

9.1 Attributes

length of type unsigned long, readonly
The number of RDFTriple objects in the RDFTripleList.
No exceptions.

9.2 Methods

get
Returns the RDFTriple object inside this list at position index.
Parameter | Type | Nullable | Optional | Description
index | unsigned long | no | no | A positive integer value less than RDFTripleList::length that represents an index inside the RDFTripleList.
No exceptions.
Return type: RDFTriple
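
A typical traversal of an RDFTripleList uses `length` and `get` together, as in the sketch below. It assumes zero-based indexing, as with DOM NodeList. The array-backed `listOf` helper, the string-valued triples, and the RangeError on an out-of-range index are assumptions for the example; the draft does not state what `get` does in that case.

```typescript
interface RDFTriple { subject: string; predicate: string; object: string }

interface RDFTripleList {
  readonly length: number;
  get(index: number): RDFTriple;
}

// Hypothetical array-backed list, used here only to have something to iterate.
function listOf(items: RDFTriple[]): RDFTripleList {
  return {
    length: items.length,
    get(index: number): RDFTriple {
      const item = items[index];
      // Out-of-range behaviour is not specified in the draft; throwing is an assumption.
      if (item === undefined) throw new RangeError(`index ${index} out of range`);
      return item;
    },
  };
}

// Collects the objects of all triples with the given predicate.
function objectsOf(list: RDFTripleList, predicate: string): string[] {
  const out: string[] = [];
  for (let i = 0; i < list.length; i++) {
    const t = list.get(i);
    if (t.predicate === predicate) out.push(t.object);
  }
  return out;
}
```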

10. Iterating through RDF Triples

The RDFTripleIterator is an Iterator through RDFa content inside Nodes of the DOM tree.

interface RDFTripleIterator {
    readonly attribute Node            root;

    readonly attribute RDFTripleFilter filter;
    RDFTriple         previousRDFTriple ();
    RDFTriple         nextRDFTriple ();

    RDFTripleIterator createRDFTripleIterator (in Node root, in RDFTripleFilter filter);

};

10.1 Attributes

filter of type RDFTripleFilter, readonly
The RDFTripleFilter object that filters the Node objects in the subtree for certain RDFa content.
No exceptions.
root of type Node, readonly
The Node inside the DOM tree that is used as the root node from which the extraction of RDFa content starts.
No exceptions.

10.2 Methods

createRDFTripleIterator
Creates a new instance of RDFTripleIterator.
Parameter | Type | Nullable | Optional | Description
root | Node | no | no | The Node inside the DOM tree that is used as the root node from which the extraction of RDFa content starts.
filter | RDFTripleFilter | no | no | An RDFTripleFilter object that filters the Node objects in the subtree for certain RDFa content.
No exceptions.
Return type: RDFTripleIterator
nextRDFTriple
Returns the next RDFTriple object that is found inside the subtree of DOM nodes or NULL if no more exist.
No parameters.
No exceptions.
Return type: RDFTriple
previousRDFTriple
Returns the previous RDFTriple object that is found inside the subtree of DOM nodes or NULL if no previous RDFTriples exist.
No parameters.
No exceptions.
Return type: RDFTriple
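
The forward and backward movement of an RDFTripleIterator could behave as in the following sketch, which iterates over an already-extracted array of triples rather than a DOM subtree. The `ArrayTripleIterator` class is hypothetical, and its cursor semantics (the cursor sits between triples, so a call to `previousRDFTriple` after `nextRDFTriple` returns the same triple) are an assumption modelled on the DOM NodeIterator.

```typescript
interface RDFTriple { subject: string; predicate: string; object: string }
interface RDFTripleFilter { acceptRDFTriple(triple: RDFTriple): boolean }

// Hypothetical iterator over an already-extracted array of triples.
class ArrayTripleIterator {
  private pos = 0; // index of the triple that nextRDFTriple() would return
  private readonly items: RDFTriple[];

  constructor(triples: RDFTriple[], filter: RDFTripleFilter = { acceptRDFTriple: () => true }) {
    // Only triples accepted by the filter are visible to the iterator.
    this.items = triples.filter(t => filter.acceptRDFTriple(t));
  }

  // Returns the next triple, or null if no more exist.
  nextRDFTriple(): RDFTriple | null {
    const t = this.items[this.pos];
    if (t === undefined) return null;
    this.pos++;
    return t;
  }

  // Returns the previous triple, or null if none exists.
  previousRDFTriple(): RDFTriple | null {
    const t = this.pos > 0 ? this.items[this.pos - 1] : undefined;
    if (t === undefined) return null;
    this.pos--;
    return t;
  }
}
```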

11. Filtering RDF Triples

The RDFTripleFilter interface lets applications test an RDFTriple against conditions, which application developers can implement themselves. An RDFTripleFilter may also be given an RDF triple that serves as a dynamic filter pattern. For example, suppose an RDFTripleFilter is written to accept all RDF triples with rdfs:label as predicate, and the developer wants to use it to find the labels of instances of foaf:Person and foaf:Organization. The developer creates the RDF triple patterns (null, rdf:type, foaf:Person) and (null, rdf:type, foaf:Organization) and passes them to the existing filter instead of writing two additional filters. The RDFa DOM API also predefines several static RDFTripleFilter objects for common RDF queries; these are defined as attributes of the Document interface in section 12.

interface RDFTripleFilter {
    boolean acceptRDFTriple (in RDFTriple triple, in optional RDFTriple pattern);

};

11.1 Methods

acceptRDFTriple
This function returns true if the passed RDFTriple triple passes the filter's test. Otherwise it returns false.
Parameter | Type | Nullable | Optional | Description
triple | RDFTriple | no | no | The RDF triple that should be tested.
pattern | RDFTriple | no | yes | The RDF triple pattern, given as a parameter, that should also be tested.
No exceptions.
Return type: boolean
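
A user-defined filter might look like the sketch below. It accepts only triples whose predicate is rdfs:label and, when a pattern is supplied, additionally requires the triple to fit that pattern. The draft does not define how a filter must interpret the pattern, so the wildcard treatment of null components is an assumption, as are the string-valued triples and the `fits` helper.

```typescript
type Term = string | null;
interface RDFTriple { subject: Term; predicate: Term; object: Term }

interface RDFTripleFilter {
  acceptRDFTriple(triple: RDFTriple, pattern?: RDFTriple): boolean;
}

const RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

// Null components of the pattern act as wildcards (assumed semantics).
function fits(value: Term, wanted: Term): boolean {
  return wanted === null || wanted === value;
}

// User-defined filter: accepts rdfs:label triples, optionally narrowed by a pattern.
const labelFilter: RDFTripleFilter = {
  acceptRDFTriple(triple, pattern) {
    if (triple.predicate !== RDFS_LABEL) return false;
    if (pattern === undefined) return true;
    return fits(triple.subject, pattern.subject)
        && fits(triple.predicate, pattern.predicate)
        && fits(triple.object, pattern.object);
  },
};
```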

12. Querying and Testing DOM nodes for RDF Triples

The following interface extends the DOM Document interface. It adds three predefined RDFTripleFilter attributes and one function to retrieve RDF data from web pages.

interface Document {
    readonly attribute RDFTripleFilter RDF_PREDICATE_IS_TYPE;

    readonly attribute RDFTripleFilter RDF_OBJECT_IS_LITERAL;
    readonly attribute RDFTripleFilter RDF_OBJECT_IS_URI;
    RDFTripleList getRDFTriples (in optional RDFTripleFilter filter);

};

12.1 Attributes

RDF_OBJECT_IS_LITERAL of type RDFTripleFilter, readonly
A predefined filter that tests for RDF triples T with T.object typeof RDFLiteral.
No exceptions.
RDF_OBJECT_IS_URI of type RDFTripleFilter, readonly
A predefined filter that tests for RDF triples T with T.object typeof URI.
No exceptions.
RDF_PREDICATE_IS_TYPE of type RDFTripleFilter, readonly
A predefined filter that tests for RDF triples T with T.predicate = RDF.type.
No exceptions.

12.2 Methods

getRDFTriples
This function extracts RDFa content from the current source document in the form of a list of RDFTriple objects.
Parameter | Type | Nullable | Optional | Description
filter | RDFTripleFilter | no | yes | An RDFTripleFilter object that filters the Node objects in the DOM tree for certain RDFa content.
No exceptions.
Return type: RDFTripleList
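
The three predefined filters and their use with getRDFTriples could behave as in the following sketch. Because no DOM is assumed here, the hypothetical `getRDFTriples` function takes an array of already-extracted triples instead of being a method of Document, and it returns an array rather than an RDFTripleList. The `kind` tag on the object types is likewise an assumption.

```typescript
type URI = { kind: "uri"; value: string };
type RDFLiteral = { kind: "literal"; value: string };
interface RDFTriple { subject: URI; predicate: URI; object: URI | RDFLiteral }
interface RDFTripleFilter { acceptRDFTriple(triple: RDFTriple): boolean }

const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

// Possible behaviour of the three predefined filters.
const RDF_PREDICATE_IS_TYPE: RDFTripleFilter = {
  acceptRDFTriple: t => t.predicate.value === RDF_TYPE,
};
const RDF_OBJECT_IS_LITERAL: RDFTripleFilter = {
  acceptRDFTriple: t => t.object.kind === "literal",
};
const RDF_OBJECT_IS_URI: RDFTripleFilter = {
  acceptRDFTriple: t => t.object.kind === "uri",
};

// Hypothetical stand-in for Document.getRDFTriples over pre-extracted triples.
// Without a filter, every triple is returned.
function getRDFTriples(all: RDFTriple[], filter?: RDFTripleFilter): RDFTriple[] {
  const out: RDFTriple[] = [];
  for (const t of all) {
    if (filter === undefined || filter.acceptRDFTriple(t)) out.push(t);
  }
  return out;
}
```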

The following interface extends the DOM Node interface. It adds one method for testing whether RDFa content can be extracted from a Node.

interface Node {
    boolean containsRDFa (in optional RDFTripleFilter filter);
};

12.3 Methods

containsRDFa
This function returns true if the current DOM Node object contains any RDFa content. An optional RDFTripleFilter can be passed to test Node objects for certain kinds of RDFTriple objects.
Parameter | Type | Nullable | Optional | Description
filter | RDFTripleFilter | no | yes | This optional RDFTripleFilter can be used to test for certain kinds of RDFa content.
No exceptions.
Return type: boolean
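
One possible reading of the unfiltered containsRDFa() test is sketched below: a node contains RDFa if it, or any of its descendants, carries an RDFa attribute. The `MockNode` type is a hypothetical, DOM-free stand-in for an element, the attribute list is an assumed and partial selection, and the optional filter parameter is omitted. The draft does not define the test at this level of detail.

```typescript
// Hypothetical, DOM-free stand-in for an element node.
interface MockNode {
  attributes: Record<string, string>;
  children: MockNode[];
}

// Attributes taken to signal RDFa content (an assumed, partial list).
const RDFA_ATTRIBUTES = ["about", "property", "rel", "rev", "resource", "typeof"];

// True if the node itself or any descendant carries an RDFa attribute.
function containsRDFa(node: MockNode): boolean {
  if (RDFA_ATTRIBUTES.some(attr => attr in node.attributes)) return true;
  return node.children.some(child => containsRDFa(child));
}
```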

Storing and Retrieving RDF Triples

This section will contain best practices on storing and retrieving triples on a page.

A. Acknowledgements

This document has been prepared with the help of the following people (in alphabetical order):

B. References

B.1 Normative references

[DOM-LEVEL-1]
Vidur Apparao; et al. Document Object Model (DOM) Level 1. 1 October 1998. W3C Recommendation. URL: http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001/
[RDF-CONCEPTS]
Graham Klyne; Jeremy J. Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210
[RDFA-SYNTAX]
Steven Pemberton; et al. RDFa in XHTML: Syntax and Processing. 14 October 2008. W3C Recommendation. URL: http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014

B.2 Informative references

No informative references.