W3C

The RDFa DOM API

W3C Working Draft 21 April 2010

This version:
http://www.w3.org/TR/2010/WD-rdfa-dom-api-20100421/
Latest published version:
http://www.w3.org/TR/rdfa-dom-api/
Latest editor's draft:
http://dev.w3.org/rdfa/rdfa-dom-api.html
Editors:
Benjamin Adrian, German Research Center for Artificial Intelligence GmbH
Manu Sporny, Digital Bazaar, Inc.

Abstract

This specification defines an Application Programming Interface (API) for accessing RDFa [RDFA-SYNTAX] data contained in the Document Object Model (DOM) [DOM-LEVEL-3-CORE] of a structured document, such as SVG [SVG12], XHTML [XHTML+RDFa] or HTML [XHTML+RDFa].

RDFa is a serialization format of the Resource Description Framework RDF [RDF-PRIMER]. The RDFa DOM API provides methods to extract RDF triples from the document that contain RDFa data. The RDFa DOM API also provides a mechanism that allows developers to search the document for RDF data using developer-defined search functions.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was published by the RDFa Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-rdfa-wg@w3.org (subscribe, archives). All feedback is welcome.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

RDFa [RDFA-SYNTAX] has seen substantial growth since it became an official W3C Recommendation in October 2008. It has been widely adopted by search companies, e-commerce sites, governments, and content management systems. There are numerous interoperable implementations, and adoption is expected to keep growing.

In an effort to ensure that browser-based applications are able to fully utilize RDFa, this specification outlines a set of interfaces that are capable of finding and extracting RDFa from web pages or other document formats that are based on the Document Object Model (DOM). The RDFa DOM API is designed with ease of use in mind. A deep understanding of RDFa is no longer necessary to extract and utilize the structured data embedded in RDFa documents.

TODO: Write more on: Why do we need an RDFa DOM API?

Most browser-based applications and browser extensions that utilize HTML documents are written in Javascript [ECMA-262]. For this reason, the RDFa DOM API is designed to be implemented and used primarily in Javascript in a browser environment. The RDFa DOM API specification is also easily implemented in other languages commonly used in Semantic Web applications, e.g., Python, Java, Perl, or Ruby.

2. How to Read this Document

The specification of the RDFa DOM API is written in the Web Interface Definition Language [WEBIDL].

TODO: List vocabularies used in examples.

The RDFa DOM API uses RDF as its underlying data model [RDF-SYNTAX]. Therefore, it provides interface definitions for programming with plain and typed Literals, URI References, and Blank Nodes. These interfaces are the components used to work with RDFTriples in the RDFa DOM API.

3. The RDFa DOM API

TODO: What features does an RDFa DOM API provide?
TODO: How does an RDFa DOM API provide those features?

The RDFa DOM API strives to be a simple set of methods that a developer may use to retrieve RDF triples contained in a document.

TODO: Describe how RDFa DOM API extends Javascript use in browsers in order to extract RDF from DOM nodes. Describe the constant object rdfa.

The key mechanism for extracting RDF data from the document focuses on the use of filters. Filters are used to select RDF triple data as well as elements that are contained in the document.

TODO: Describe idea behind rdfa.filter, rdfa.list, rdfa.iterate

The RDFa DOM API uses a concept called an RDFTripleFilter to retrieve data from DOM nodes. Inspired by NodeFilters [DOM-LEVEL-2-TRAVERSAL-RANGE], RDFTripleFilters can be user-defined and give application programmers a highly flexible way to filter RDF triples.

In order to retrieve DOM nodes that contain RDFa content, the RDFa DOM API provides a method to test DOM nodes. RDFTripleFilters can be passed as an optional parameter to rdfa.containsRDFa() in order to test DOM nodes for RDF triples that match the given filter.

Finally, the following RDFa DOM API allows a programmer to query the RDFa data expressed by a document and then make programmatic decisions based on the semantic data embedded in the page.

Basic Datatypes

The working group has not reached consensus on whether a traditional object model should be used to express the items returned by the RDFa DOM API, or whether a simpler object representation format, such as a Javascript associative Array, would be more appropriate for processing RDF data in Javascript.

The basic types of the RDFa DOM API are DOMString, Plain RDF Literal, Typed RDF Literal, URI reference, and Blank Node. (The normative specification of these types is described in RDFa in XHTML: Syntax and Processing [RDFA-SYNTAX].)

4. RDF Literals

An RDF Literal represents a lexical value in RDFa data. In RDF, a literal may carry either language information, given as a language tag (e.g., 'en', 'fr', 'de'), or a datatype, given as a URI reference (e.g., xsd:dateTime). The former is called a plain literal; the latter is called a typed literal.

The language tag is an identifier for a language as defined by [RFC-3066]. The datatype's URI reference defines the datatype of the text value, e.g., xsd:dateTime or xsd:boolean. According to the RDF specification, a given RDF Literal can carry either language or type information, but not both. If the type is set, the RDF Literal conforms to the RDF specification of a Typed Literal in [RDF-CONCEPTS]. Otherwise, it is a Plain Literal.

The RDFa DOM API provides a method to explicitly cast TypedLiteral values to datatypes of the programming language that implements the RDFa DOM API. Developers may write their own TypedLiteralConverter in order to convert literal values into preferred language constructs. By default, standard converters for the following XML Schema datatypes [XSD] are provided if corresponding datatypes exist in the current programming language:

TypedLiteral

A TypedLiteral wraps a DOMString and adds type information about its content.

interface TypedLiteral {
    readonly attribute DOMString value;
    readonly attribute URI       type;
    DOMString toString ();
    DOMString valueOf ();
};

4.1 Attributes

type of type URI, readonly
A datatype identified by a URI reference.
No exceptions.
value of type DOMString, readonly
The lexical value of this literal encoded in the character encoding of the source document.
No exceptions.

4.2 Methods

toString
Returns the string representation of this literal.
No parameters.
No exceptions.
Return type: DOMString
valueOf
Returns a typed representation of this literal that follows a given mapping from RDFS datatypes to datatypes of the current programming language.
No parameters.
No exceptions.
Return type: DOMString

Example

Example use of TypedLiteral in Javascript

>> var literal = new TypedLiteral('2010-12-24', new URI("http://www.w3.org/2001/XMLSchema#date"));
>> print(literal.toString());
2010-12-24
>> print(literal.valueOf());
Fri Dec 24 2010 00:00:00 GMT+0100
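
The relationship between toString and valueOf described above can be sketched as follows. This is a minimal illustration, not an implementation prescribed by this draft: the defaultConverters table, the choice of converters, and the decision to interpret xsd:date values as midnight UTC are all assumptions made for the example.

```javascript
// Hypothetical sketch; the draft defines interfaces, not an implementation.
const XSD_NS = 'http://www.w3.org/2001/XMLSchema#';

class URI {
  constructor(value) { this.value = value; }
  toString() { return this.value; }
  valueOf() { return this.value; }
}

// Assumed default converters, keyed by datatype URI.
const defaultConverters = {
  [XSD_NS + 'integer']: (v) => parseInt(v, 10),
  [XSD_NS + 'boolean']: (v) => v === 'true' || v === '1',
  // Assumption: an xsd:date lexical value is read as midnight UTC.
  [XSD_NS + 'date']: (v) => new Date(v + 'T00:00:00Z'),
};

class TypedLiteral {
  constructor(value, type) {
    this.value = value;   // lexical value
    this.type = type;     // datatype URI
    Object.freeze(this);  // mimic the readonly attributes
  }
  toString() { return this.value; }
  valueOf() {
    // Fall back to the lexical value when no converter is known.
    const convert = defaultConverters[this.type.value];
    return convert ? convert(this.value) : this.value;
  }
}

const literal = new TypedLiteral('2010-12-24', new URI(XSD_NS + 'date'));
console.log(literal.toString());              // 2010-12-24
console.log(literal.valueOf().toISOString()); // 2010-12-24T00:00:00.000Z
```

Falling back to the lexical value for unknown datatypes is one possible design choice; an implementation could equally raise an error.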

PlainLiteral

A PlainLiteral wraps a DOMString and adds optional language information about its content.

interface PlainLiteral {
    readonly attribute DOMString value;
    readonly attribute DOMString language;
    DOMString toString ();
    DOMString valueOf ();
};

4.3 Attributes

language of type DOMString, readonly
A two-character language tag as defined by [RFC-3066], normalized to lowercase. If language is set, then the value of type is a URI with value rdf:Literal.
No exceptions.
value of type DOMString, readonly
The lexical value of this literal encoded in the character encoding of the source document.
No exceptions.

4.4 Methods

toString
Returns the string representation of this literal.
No parameters.
No exceptions.
Return type: DOMString
valueOf
Returns the string representation of this literal.
No parameters.
No exceptions.
Return type: DOMString

Example

Example use of PlainLiteral in Javascript

>> var literal = new PlainLiteral('foo', 'en');
>> print(literal.toString());
foo
>> print(literal.valueOf());
foo
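
The lowercase normalization of the language tag can be sketched like this. The class below is a hypothetical stand-in for an implementation of the PlainLiteral interface; representing a missing language as null is an assumption, since the draft does not say how an absent tag is exposed.

```javascript
// Hypothetical sketch of a PlainLiteral implementation.
class PlainLiteral {
  constructor(value, language) {
    this.value = value;
    // The language tag is optional; normalize it to lowercase when present.
    // Assumption: an absent tag is represented as null.
    this.language = language ? language.toLowerCase() : null;
    Object.freeze(this); // mimic the readonly attributes
  }
  toString() { return this.value; }
  valueOf() { return this.value; }
}

const greeting = new PlainLiteral('Bonjour', 'FR');
console.log(greeting.language);                // fr
console.log(greeting.valueOf());               // Bonjour
console.log(new PlainLiteral('foo').language); // null
```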

TypedLiteralConverter

This callable interface provides a function that converts the value of a TypedLiteral into a corresponding type of the current programming language.

interface TypedLiteralConverter {
    T convertType (in DOMString value);
};

4.5 Methods

convertType
Returns the passed literal value converted to a certain type.
Parameter | Type | Nullable | Optional | Description
value | DOMString | no | no | The value of the TypedLiteral.
No exceptions.
Return type: T

interface rdfa {
    void registerTypeConversion (in URI type, in TypedLiteralConverter converter);
};

4.6 Methods

registerTypeConversion
Registers a new type conversion from an RDFS datatype to a datatype of the current programming language.
Parameter | Type | Nullable | Optional | Description
type | URI | no | no | A URI reference identifying the datatype of the TypedLiteral.
converter | TypedLiteralConverter | no | no | A function that converts the literal's value into the correct type of the current programming language.
No exceptions.
Return type: void

Example

Example use of registerTypeConversion in Javascript

>> rdfa.registerTypeConversion(XSD["boolean"], function(value) {return value === 'true' || value === '1';});
>> var literal = new TypedLiteral('1', XSD["boolean"]);
>> print(literal.toString());
1
>> print(literal.valueOf());
true
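
One way the registration mechanism could work is a lookup table from datatype URI to converter function. The sketch below assumes exactly that; the converters map on the rdfa object is a hypothetical detail, and plain strings stand in for URI objects to keep the example short.

```javascript
const XSD_BOOLEAN = 'http://www.w3.org/2001/XMLSchema#boolean';

// Hypothetical stand-in for the rdfa object; "converters" is not part of the draft.
const rdfa = {
  converters: new Map(),
  registerTypeConversion(type, converter) {
    this.converters.set(String(type), converter);
  },
};

class TypedLiteral {
  constructor(value, type) {
    this.value = value;
    this.type = type; // a plain string stands in for a URI object here
  }
  toString() { return this.value; }
  valueOf() {
    // Look up the converter registered for this literal's datatype, if any.
    const converter = rdfa.converters.get(String(this.type));
    return converter ? converter(this.value) : this.value;
  }
}

// xsd:boolean allows the lexical forms "true", "false", "1" and "0".
rdfa.registerTypeConversion(XSD_BOOLEAN, (value) => value === 'true' || value === '1');

const yes = new TypedLiteral('1', XSD_BOOLEAN);
const no = new TypedLiteral('false', XSD_BOOLEAN);
console.log(yes.valueOf()); // true
console.log(no.valueOf());  // false
```

Note that wrapping the lexical value in a Boolean object would not work as a converter, because every non-empty string, including 'false' and '0', is truthy in Javascript.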

5. URI Reference

A URI in the RDFa DOM API is a URI reference as specified in [RDF-CONCEPTS].

interface URI {
    readonly attribute DOMString value;
    DOMString toString ();
    DOMString valueOf ();
};

5.1 Attributes

value of type DOMString, readonly
The lexical representation of the URI reference.
No exceptions.

5.2 Methods

toString
Returns the string representation of this URI reference.
No parameters.
No exceptions.
Return type: DOMString
valueOf
Returns the string representation of this URI reference.
No parameters.
No exceptions.
Return type: DOMString

Example

Example use of URI in Javascript

>> var uri = new URI("http://www.example.com");
>> print(uri.toString());
http://www.example.com
>> print(uri.valueOf());
http://www.example.com

6. BlankNode

A BlankNode is an anonymous RDFNode as defined in [RDF-CONCEPTS].

interface BlankNode {
    readonly attribute DOMString value;
    DOMString toString ();
    DOMString valueOf ();
};

6.1 Attributes

value of type DOMString, readonly
The identifier of the BlankNode
No exceptions.

6.2 Methods

toString
Returns the string representation of this BlankNode.
No parameters.
No exceptions.
Return type: DOMString
valueOf
Returns the string representation of this BlankNode.
No parameters.
No exceptions.
Return type: DOMString

Example

Example use of BlankNode in Javascript

>> var bn = new BlankNode("_:42");
>> print(bn.toString());
_:42
>> print(bn.valueOf());
_:42

7. RDFTriple

In the RDFa DOM API, RDFTriple defines the data structure that represents an RDF triple as specified in [RDF-CONCEPTS]. The RDFa DOM API is intended to extract RDFTriple objects from the source document.

interface RDFTriple {
    readonly attribute Object subject;
    readonly attribute Object predicate;
    readonly attribute Object object;
    DOMString toString ();
    DOMString valueOf ();
};

7.1 Attributes

object of type Object, readonly
Object value of an RDFTriple.
No exceptions.
predicate of type Object, readonly
Predicate value of an RDFTriple.
No exceptions.
subject of type Object, readonly
Subject value of an RDFTriple.
No exceptions.

7.2 Methods

toString
Returns the string representation of this RDFTriple.
No parameters.
No exceptions.
Return type: DOMString
valueOf
Returns the string representation of this RDFTriple.
No parameters.
No exceptions.
Return type: DOMString

Example

Example use of RDFTriple in Javascript

>> var triple = new RDFTriple([new URI('http://www.example.com#foo'), RDFS['label'], new PlainLiteral('foo')]);
>> print(triple.subject);
http://www.example.com#foo
>> print(triple.toString());
http://www.example.com#foo http://www.w3.org/2000/01/rdf-schema#label foo
>> print(triple.valueOf());
http://www.example.com#foo http://www.w3.org/2000/01/rdf-schema#label foo
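
A minimal sketch of a class that behaves like the RDFTriple interface is shown below. It is hypothetical: plain strings stand in for URI and PlainLiteral objects, and the space-separated string form is inferred from the example output above rather than defined normatively.

```javascript
// Hypothetical sketch of an RDFTriple implementation.
class RDFTriple {
  // Takes [subject, predicate, object], mirroring the array argument used in the example above.
  constructor([subject, predicate, object]) {
    this.subject = subject;
    this.predicate = predicate;
    this.object = object;
    Object.freeze(this); // mimic the readonly attributes
  }
  // Joins the string forms of the three components with single spaces.
  toString() {
    return [this.subject, this.predicate, this.object].join(' ');
  }
  valueOf() { return this.toString(); }
}

const triple = new RDFTriple([
  'http://www.example.com#foo',
  'http://www.w3.org/2000/01/rdf-schema#label',
  'foo',
]);
console.log(triple.subject);    // http://www.example.com#foo
console.log(triple.toString());
// http://www.example.com#foo http://www.w3.org/2000/01/rdf-schema#label foo
```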

8. Filtering RDF Triples

The RDFa DOM API provides these RDF filters for testing an RDFTriple instance against certain criteria.

Which filter functions should be provided?

rdfa

The core interface of the RDFa DOM API provides a list of methods for filtering. It distinguishes between filter methods that return RDF triples as tuples, list methods that return RDF triples in an RDFTripleList, and iterate methods that return RDF triples in an RDFTripleIterator.

interface rdfa {
    Object[]          filter (in Object? subject, in Object? predicate, in Object? object, in Node? node, in RDFTripleFilter? myfilter);
    RDFTripleList     list (in Object? subject, in Object? predicate, in Object? object, in Node? node, in RDFTripleFilter? myfilter);
    RDFTripleIterator iterate (in Object? subject, in Object? predicate, in Object? object, in Node? node, in RDFTripleFilter? myfilter);
    boolean           containsRDFa (in Object? subject, in Object? predicate, in Object? object, in Node? node, in RDFTripleFilter? myfilter);
    Object[]          getRDFTriples (in Object? subject, in Object? predicate, in Object? object, in Node? node);
};

8.1 Methods

containsRDFa
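Tests whether a node or any of its descendants contains RDFa content.
Parameter | Type | Nullable | Optional | Description
subject | Object | yes | no |
predicate | Object | yes | no |
object | Object | yes | no |
node | Node | yes | no |
myfilter | RDFTripleFilter | yes | no |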
No exceptions.
Return type: boolean
filter
Returns a list of triples of the form [[s,p,o],[s,p,o], ..., [s,p,o]].
Parameter | Type | Nullable | Optional | Description
subject | Object | yes | no |
predicate | Object | yes | no |
object | Object | yes | no |
node | Node | yes | no |
myfilter | RDFTripleFilter | yes | no |
No exceptions.
Return type: Object[]
getRDFTriples
Returns a list of triples of the form [[s,p,o],[s,p,o], ..., [s,p,o]] from the given node that match the subject, predicate, and object patterns.
Parameter | Type | Nullable | Optional | Description
subject | Object | yes | no |
predicate | Object | yes | no |
object | Object | yes | no |
node | Node | yes | no |
No exceptions.
Return type: Object[]
iterate
Returns an RDFTripleIterator. An RDFTripleIterator may be used in documents with a large number of triples to save memory.
Parameter | Type | Nullable | Optional | Description
subject | Object | yes | no |
predicate | Object | yes | no |
object | Object | yes | no |
node | Node | yes | no |
myfilter | RDFTripleFilter | yes | no |
No exceptions.
Return type: RDFTripleIterator
list
Returns an RDFTripleList that is a list of RDFTriple entries.
Parameter | Type | Nullable | Optional | Description
subject | Object | yes | no |
predicate | Object | yes | no |
object | Object | yes | no |
node | Node | yes | no |
myfilter | RDFTripleFilter | yes | no |
No exceptions.
Return type: RDFTripleList

Example

Example use of rdfa.(filter|list|iterate) in Javascript
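
A minimal sketch of how rdfa.filter and rdfa.containsRDFa might be called. It makes several assumptions: the triples live in an in-memory array instead of being extracted from DOM nodes, plain strings stand in for URI and literal objects, the node argument is ignored, and a null pattern component is treated as a wildcard.

```javascript
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const FOAF = 'http://xmlns.com/foaf/0.1/';

// In-memory stand-in for the triples an implementation would extract from the DOM.
const triples = [
  ['http://example.com/#alice', RDF_TYPE, FOAF + 'Person'],
  ['http://example.com/#alice', FOAF + 'name', 'Alice'],
  ['http://example.com/#acme', RDF_TYPE, FOAF + 'Organization'],
];

// Assumption: a null pattern component matches any value.
const matches = (pattern, value) => pattern === null || pattern === value;

// Hypothetical stand-in for the rdfa object.
const rdfa = {
  // Returns matching triples as [s, p, o] arrays; node is ignored in this sketch.
  filter(subject, predicate, object, node = null, myfilter = null) {
    return triples.filter(([s, p, o]) =>
      matches(subject, s) && matches(predicate, p) && matches(object, o) &&
      (myfilter === null || myfilter.match(node, s, p, o)));
  },
  // True when at least one triple matches the given patterns.
  containsRDFa(subject, predicate, object, node = null, myfilter = null) {
    return this.filter(subject, predicate, object, node, myfilter).length > 0;
  },
};

const people = rdfa.filter(null, RDF_TYPE, FOAF + 'Person');
console.log(people.length);  // 1
console.log(people[0][0]);   // http://example.com/#alice
console.log(rdfa.containsRDFa(null, FOAF + 'knows', null)); // false
```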



RDFTripleFilter

Criteria can also be specified by application developers. An RDFTripleFilter may be created with a default RDF triple used as a dynamic filter pattern. RDFTripleFilters are helpful when a developer wants to retrieve a subset of triples from the structured data in the page. For example, to retrieve all subjects that have the rdf:type of foaf:Person, as well as all subjects that have the rdf:type of foaf:Organization, one could either create two filter functions or one RDFTripleFilter. An example of a triple pattern is (null, rdf:type, foaf:Person).

interface RDFTripleFilter {
    boolean match (in Node? node, in Object? subject, in Object? predicate, in Object? object);
};

8.2 Methods

match
This callable function returns true if the triple described by the passed subject, predicate, and object matches the filter. Otherwise it returns false.
Parameter | Type | Nullable | Optional | Description
node | Node | yes | no | The DOM Node that should be tested.
subject | Object | yes | no | Subject pattern of an RDF triple that should be tested.
predicate | Object | yes | no | Predicate pattern of an RDF triple that should be tested.
object | Object | yes | no | Object pattern of an RDF triple that should be tested.
No exceptions.
Return type: boolean

Example

Example use of RDFTripleFilter in Javascript
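
The foaf:Person and foaf:Organization case described above could be handled by a single filter along these lines. This is a sketch under assumptions: the filter is a plain object with a match method, plain strings stand in for URI objects, and the triples come from a hypothetical in-memory array instead of the DOM.

```javascript
const RDF_TYPE = 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type';
const FOAF = 'http://xmlns.com/foaf/0.1/';

// A single filter that accepts triples typing a subject as a Person or an Organization.
const agentFilter = {
  match(node, subject, predicate, object) {
    return predicate === RDF_TYPE &&
      (object === FOAF + 'Person' || object === FOAF + 'Organization');
  },
};

// Hypothetical triples, written as [subject, predicate, object] arrays.
const triples = [
  ['http://example.com/#alice', RDF_TYPE, FOAF + 'Person'],
  ['http://example.com/#alice', FOAF + 'name', 'Alice'],
  ['http://example.com/#acme', RDF_TYPE, FOAF + 'Organization'],
];

// Keep the subjects of all triples that the filter accepts.
const agents = triples
  .filter(([s, p, o]) => agentFilter.match(null, s, p, o))
  .map(([s]) => s);

console.log(agents.join(', '));
// http://example.com/#alice, http://example.com/#acme
```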





9. Iterating through RDF Triples

The RDFTripleIterator iterates over the RDFa content inside the Nodes of the DOM tree.

interface RDFTripleIterator {
    readonly attribute Node            root;
    readonly attribute RDFTripleFilter filter;
    readonly attribute RDFTriple       triplePattern;
    RDFTriple         previousRDFTriple ();
    RDFTriple         nextRDFTriple ();
    RDFTripleIterator RDFTripleIterator (in Node root, in Object? subject, in Object? predicate, in Object? object, in RDFTripleFilter filter);
};

9.1 Attributes

filter of type RDFTripleFilter, readonly
The RDFTripleFilter object that filters the Node objects in the subtree for certain RDFa content.
No exceptions.
root of type Node, readonly
The Node in the DOM tree that serves as the root from which RDFa content is extracted.
No exceptions.
triplePattern of type RDFTriple, readonly
An RDF triple pattern is an additional filter parameter that can be passed to an RDFTripleFilter.
No exceptions.

9.2 Methods

RDFTripleIterator
Creates a new instance of RDFTripleIterator.
Parameter | Type | Nullable | Optional | Description
root | Node | no | no | The Node in the DOM tree that serves as the root from which RDFa content is extracted.
subject | Object | yes | no | Subject pattern of an RDF triple that should be tested.
predicate | Object | yes | no | Predicate pattern of an RDF triple that should be tested.
object | Object | yes | no | Object pattern of an RDF triple that should be tested.
filter | RDFTripleFilter | no | no | An RDFTripleFilter object that filters the Node objects in the subtree for certain RDFa content.
No exceptions.
Return type: RDFTripleIterator
nextRDFTriple
Returns the next RDFTriple object that is found inside the subtree of DOM nodes, or null if no more exist.
No parameters.
No exceptions.
Return type: RDFTriple
previousRDFTriple
Returns the previous RDFTriple object that is found inside the subtree of DOM nodes, or null if no previous RDFTriple exists.
No parameters.
No exceptions.
Return type: RDFTriple

Example
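
The traversal methods could behave as in the following sketch. It is not the interface defined above: the constructor takes a ready-made array of triples instead of a root node, patterns, and a filter, and the cursor semantics are modelled on the DOM NodeIterator, where the position sits between items. Both choices are assumptions made for illustration.

```javascript
// Hypothetical sketch of the traversal behaviour of an RDFTripleIterator.
class RDFTripleIterator {
  constructor(triples) {
    this.triples = triples;
    this.position = 0; // index of the triple that nextRDFTriple() would return
  }
  // Returns the next triple and advances, or null at the end.
  nextRDFTriple() {
    return this.position < this.triples.length ? this.triples[this.position++] : null;
  }
  // Steps back and returns the triple before the cursor, or null at the start.
  previousRDFTriple() {
    return this.position > 0 ? this.triples[--this.position] : null;
  }
}

const iterator = new RDFTripleIterator(['triple 1', 'triple 2']);
const seen = [];
let current;
while ((current = iterator.nextRDFTriple()) !== null) {
  seen.push(current);
}
console.log(seen.join(', '));              // triple 1, triple 2
console.log(iterator.nextRDFTriple());     // null
console.log(iterator.previousRDFTriple()); // triple 2
```

Because only the cursor position is held as state, an implementation could extract triples lazily as the iterator advances, which is what makes this approach attractive for large documents.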

10. RDF Triple List

The RDFTripleList is a plain list of RDFTriple objects.

interface RDFTripleList {
    readonly attribute unsigned long length;
    RDFTriple get (in unsigned long index);
};

10.1 Attributes

length of type unsigned long, readonly
The number of RDFTriple objects contained in this RDFTripleList.
No exceptions.

10.2 Methods

get
Returns the RDFTriple object inside this list at position index.
Parameter | Type | Nullable | Optional | Description
index | unsigned long | no | no | A non-negative integer value less than RDFTripleList::length that represents an index inside the RDFTripleList.
No exceptions.
Return type: RDFTriple

Example
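
A minimal sketch of a list that exposes length and get. The class is a hypothetical stand-in; returning null for an out-of-range index is an assumption, since this draft does not specify that case.

```javascript
// Hypothetical sketch of an RDFTripleList implementation.
class RDFTripleList {
  constructor(triples) {
    this.items = Array.from(triples); // private copy of the triples
    this.length = this.items.length;
    Object.freeze(this);              // mimic the readonly length attribute
  }
  get(index) {
    // Assumption: out-of-range indexes yield null.
    return index >= 0 && index < this.length ? this.items[index] : null;
  }
}

const list = new RDFTripleList(['triple 1', 'triple 2']);
for (let i = 0; i < list.length; i++) {
  console.log(list.get(i)); // triple 1, then triple 2
}
console.log(list.get(2));   // null
```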

A. Acknowledgements

This document has been prepared with the help of the following people (in alphabetical order): Mark Birbeck, Ivan Herman, Toby Inkster.

B. IDL Definitions

This document contains the Web IDL definitions of the RDFa DOM API: rdfa_dom_api.idl

Javascript Conversion

This document contains the Javascript Conversion of the RDFa DOM API: rdfa_dom_api.js

The API documentation of the Javascript Conversion is provided here: JSDoc API

C. References

C.1 Normative references

[DOM-LEVEL-2-TRAVERSAL-RANGE]
Vidur Apparao; et al. Document Object Model (DOM) Level 2 Traversal and Range Specification. 13 November 2000. W3C Recommendation. URL: http://www.w3.org/TR/2000/REC-DOM-Level-2-Traversal-Range-20001113
[DOM-LEVEL-3-CORE]
Gavin Nicol; et al. Document Object Model (DOM) Level 3 Core Specification. 7 April 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407
[ECMA-262]
ECMAScript Language Specification, Third Edition. December 1999. URL: http://www.ecma-international.org/publications/standards/Ecma-262.htm
[RDF-CONCEPTS]
Graham Klyne; Jeremy J. Carroll. Resource Description Framework (RDF): Concepts and Abstract Syntax. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-concepts-20040210
[RDF-PRIMER]
Frank Manola; Eric Miller. RDF Primer. 10 February 2004. W3C Recommendation. URL: http://www.w3.org/TR/2004/REC-rdf-primer-20040210/
[RDF-SYNTAX]
Ora Lassila; Ralph R. Swick. Resource Description Framework (RDF) Model and Syntax Specification. 22 February 1999. W3C Recommendation. URL: http://www.w3.org/TR/1999/REC-rdf-syntax-19990222
[RDFA-SYNTAX]
Ben Adida; et al. RDFa in XHTML: Syntax and Processing. 14 October 2008. W3C Recommendation. URL: http://www.w3.org/TR/2008/REC-rdfa-syntax-20081014
[SVG12]
Craig Northway; Dean Jackson. Scalable Vector Graphics (SVG) Full 1.2 Specification. 13 April 2005. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2005/WD-SVG12-20050413
[WEBIDL]
Cameron McCormack. Web IDL. 19 December 2008. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2008/WD-WebIDL-20081219

C.2 Informative references

No informative references.