This document is also available in this non-normative format: diff to previous version.
Copyright © 2010 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
RDFa [RDFA-CORE] enables authors to publish structured information that is both human- and machine-readable. Concepts that have traditionally been difficult for machines to detect, like people, places, events, music, movies, and recipes, are now easily marked up in Web documents. While publishing this data is vital to the growth of Linked Data, using the information to improve the collective utility of the Web for humankind is the true goal. To accomplish this goal, it must be simple for Web developers to extract and utilize structured information from a Web document. This document details such a mechanism; an RDFa Application Programming Interface (RDFa API) that allows simple extraction and usage of structured information from a Web document.
This section is non-normative.
This is a version with examples edited by TimBL to see what the tabulator RDF API would look like in the same examples.
This document is a detailed specification for an RDFa API. The document is primarily intended for the following audiences:
For those looking for an introduction to the use of RDFa, or some real-world examples, please consult the RDFa Primer [RDFA-PRIMER].
If you are not familiar with RDF, you should read about the Resource Description Framework (RDF) [RDF-CONCEPTS] before reading this document. The [RDF-CONCEPTS] document outlines the core data model that is used by RDFa to express information.
If you are not familiar with RDFa, you should read and understand the [RDFA-CORE] specification. It describes how data is encoded in host languages using RDFa. A solid understanding of concepts in RDFa Core will inevitably help you understand how the RDFa API works in concert with how the data is expressed in a host language.
If you are a Web developer and are already familiar with RDF and RDFa, and you want to programmatically extract RDFa content from documents, then you will find the Concept Diagram and Developing with the API sections of most interest. They contain a handful of ECMAScript examples on how to use the RDFa API.
Readers who are not familiar with the Terse RDF Triple Language [TURTLE] may want to read the specification in order to understand the short-hand RDF notation used in some of the examples.
This document uses the Web Interface Definition Language [WEBIDL] to specify all language bindings. If you intend to implement the RDFa API you should be familiar with the Web IDL language [WEBIDL].
Examples may contain references to existing vocabularies and use abbreviations in CURIEs and source code. The following is a list of all vocabularies and their abbreviations, as used in this document:
rdf (e.g., rdf:type)
xsd (e.g., xsd:integer)
rdfs (e.g., rdfs:label)
foaf (e.g., foaf:name)
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
The following changes have been made since the First Public Working Draft:
This document was published by the RDFa Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-rdfa-wg@w3.org (subscribe, archives). All feedback is welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This section is non-normative.
RDFa provides a means to attach properties to elements in XML and HTML documents. Since the purpose of these additional properties is to provide information about real-world items, such as people, films, companies, events, and so on, properties are grouped into objects called PropertyGroups.
The RDFa API provides a set of interfaces that make it easy to manipulate DOM objects that contain information that is also part of a PropertyGroup. This specification defines these interfaces.
A document that contains RDFa effectively provides two data layers. The first layer is the information about the document itself, such as the relationship between the elements, the value of its attributes, the origin of the document, and so on, and this information is usually provided by the Document Object Model, or DOM [DOM-LEVEL-1].
The second data layer comprises information provided by embedded metadata, such as company names, film titles, ratings, and so on, and this is usually provided by RDFa [RDFA-CORE], Microformats [MICROFORMATS], DC-HTML, GRDDL, or Microdata.
Whilst this embedded information could be accessed via the usual DOM interfaces -- for example, by iterating through child elements and checking attribute values -- the potentially complex interrelationships between the data mean that it is more efficient for developers if they have access to the data after it has been interpreted.
For example, a document may contain the name of a person in one section and the phone number of the same person in another; whilst the basic DOM interfaces provide access to these two pieces of information through normal navigation, it is more convenient for authors to have these two pieces of information available in one property collection, reflecting the final PropertyGroup.
All of this is achieved through the RDFa API.
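As a rough illustration of why an interpreted view is convenient, the sketch below groups already-extracted triples by subject, so that a name and a phone number found in different parts of a page end up in one collection. The triple array, the groupBySubject helper, and the example IRIs are assumptions made for illustration; none of them is defined by the RDFa API.

```javascript
// Hypothetical, already-extracted triples. In a real page the name and the
// phone number could come from two unrelated sections of the markup.
var triples = [
  { subject: "http://example.org/people#bob", property: "http://xmlns.com/foaf/0.1/name", object: "Bob" },
  { subject: "http://example.org/people#amy", property: "http://xmlns.com/foaf/0.1/name", object: "Amy" },
  { subject: "http://example.org/people#bob", property: "http://xmlns.com/foaf/0.1/phone", object: "tel:+1-555-0100" }
];

// Illustrative helper (not part of this specification): collect every
// property of a subject into one object keyed by property IRI.
function groupBySubject(list) {
  var groups = {};
  list.forEach(function (t) {
    if (!groups[t.subject]) {
      groups[t.subject] = {};
    }
    if (!groups[t.subject][t.property]) {
      groups[t.subject][t.property] = [];
    }
    groups[t.subject][t.property].push(t.object);
  });
  return groups;
}

var groups = groupBySubject(triples);
var bob = groups["http://example.org/people#bob"];
```

Here bob holds both the name and the phone number, even though the two triples were not adjacent in the input.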
There are many scenarios in which the RDFa API can be used to extract information from a Web document. The following sections describe a few of these scenarios.
Amy has enriched her band's web-site to include Google Rich Snippets event information. Google Rich Snippets are used to mark up information for the search engine to use when displaying enhanced search results. Amy also uses some ECMAScript code that she found on the web that automatically extracts the event information from a page and adds an entry into a personal calendar.
Brian finds Amy's web-site through Google and opens the band's page. He decides that he wants to go to the next concert. Brian is able to add the details to his calendar by clicking on the link that is automatically generated by the ECMAScript tool. The ECMAScript extracts the RDFa from the web page and places the event into Brian's personal calendaring software - Google Calendar.
<div prefix="v: http://rdf.data-vocabulary.org/#" typeof="v:Event">
  <a rel="v:url" href="http://amyandtheredfoxies.example.com/events" property="v:summary">Tour Info: Amy And The Red Foxies</a>
  <span rel="v:location">
    <a typeof="v:Organization" rel="v:url" href="http://www.kammgarn.de/" property="v:name">Kammgarn</a>
  </span>
  <div rel="v:photo"><img src="foxies.jpg"/></div>
  <span property="v:summary">Hey K-Town, Amy And The Red Foxies will rock Kammgarn in October.</span>
  When:
  <span property="v:startDate" content="2009-10-15T19:00">15. Oct., 7:00 pm</span>-
  <span property="v:endDate" content="2009-10-15T21:00">9:00 pm</span>
  Category: <span property="v:eventType">concert</span>
</div>
Dave is writing a browser plugin that filters product offers in a web page and displays an icon to buy the product or save it to a public wishlist. The plugin searches for any mention of product names, thumbnails, and offered prices. The information is listed in the URL bar as an icon, and upon clicking the icon, displayed in a sidebar in the browser. He can then add each item to a list that is managed by the browser plugin and published on a wishlist website.
<div prefix="rdfs: http://www.w3.org/2000/01/rdf-schema# foaf: http://xmlns.com/foaf/0.1/ gr: http://purl.org/goodrelations/v1# xsd: http://www.w3.org/2001/XMLSchema#" xml:lang="en">
  <div about="#offering" typeof="gr:Offering">
    <div rel="foaf:page" resource="http://www.amazon.com/Harry-Potter-Deathly-Hallows-Book/dp/0545139708"></div>
    <div property="rdfs:label">Harry Potter and the Deathly Hallows</div>
    <div property="rdfs:comment">In this final, seventh installment of the Harry Potter series, J.K. Rowling unveils in spectacular fashion the answers to the many questions that have been so eagerly awaited. The spellbinding, richly woven narrative, which plunges, twists and turns at a breathtaking pace, confirms the author as a mistress of storytelling, whose books will be read, reread and read again.</div>
    <div rel="foaf:depiction">
      <img src="http://ecx.images-amazon.com/images/I/51ynI7I-qnL._SL500_AA300_.jpg" />
    </div>
    <div rel="gr:hasBusinessFunction" resource="http://purl.org/goodrelations/v1#Sell"></div>
    <div rel="gr:hasPriceSpecification">Buy for
      <span typeof="gr:UnitPriceSpecification">
        <span property="gr:hasCurrency" content="USD" datatype="xsd:string">$</span>
        <span property="gr:hasCurrencyValue" datatype="xsd:float">7.49</span>
      </span>
    </div>
    Pay via:
    <span rel="gr:acceptedPaymentMethods" resource="http://purl.org/goodrelations/v1#PayPal">PayPal</span>
    <span rel="gr:acceptedPaymentMethods" resource="http://purl.org/goodrelations/v1#MasterCard">MasterCard</span>
  </div>
</div>
Dale has a site that contains a number of images, showcasing his photography. He has already used RDFa to add licensing information about the images to his pages, following the instructions provided by Creative Commons. Dale would like to display the correct Creative Commons icons for each image so that people will be able to quickly determine which licenses apply to each image.
<div prefix="cc: http://creativecommons.org/ns#"> <img src="http://dale.example.com/images/image1.png" rel="cc:license" resource="http://creativecommons.org/licenses/by/3.0/us/"/> <a href="http://dale.example.com" property="cc:attributionName" rel="cc:attributionURL">Dale</a> </div>
Mary is responsible for keeping the projects section of her company's home page up-to-date. She wants to display info-boxes that summarize details about the members associated with each project. The information should appear when hovering the mouse over the link to each member's homepage. Since each member's homepage is annotated with RDFa, Mary writes a script that requests the page's content and extracts necessary information via the RDFa API.
<div prefix="dc: http://purl.org/dc/terms/ foaf: http://xmlns.com/foaf/0.1/">
  <div about="#me" property="foaf:name" content="Bob">
    <span>My interests are:</span>
    <ol about="#me" typeof="foaf:Person">
      <li rel="foaf:interests">
        <a href="facebook" rel="tag" property="dc:title">facebook</a>
      </li>
      <li rel="foaf:interests">
        <a href="opengraph" rel="tag" property="dc:title">opengraph</a>
      </li>
      <li rel="foaf:interests">
        <a href="semanticweb" rel="tag" property="dc:title">semanticweb</a>
      </li>
    </ol>
    <p>Please follow me on Twitter:
      <span about="#me" rel="foaf:account">
        @<a typeof="foaf:OnlineAccount" property="foaf:accountName" href="http://twitter.com/bob">bob</a>.
      </span>
    </p>
  </div>
</div>
Richard has created a site that lists his favourite restaurants and their locations. He doesn't want to generate code specific to the various mapping services on the Web. Instead of creating specific markup for Yahoo Maps, Google Maps, MapQuest, and Google Earth, he instead adds address information via RDFa to each restaurant entry. This enables him to build on top of the structured data in the page as well as letting visitors to the site use the same data to create innovative new applications based on the address information in the page.
<div prefix="vc: http://www.w3.org/2006/vcard/ns# foaf: http://xmlns.com/foaf/0.1/" typeof="vc:VCard"> <span property="vc:fn">Wong Kei</span> <span property="vc:street-address">41-43 Wardour Street</span> <span property="vc:locality">London</span>, <span property="vc:country-name">United Kingdom</span> <span property="vc:tel">020 74373071</span> </div>
Marie is a chemist, researching the effects of ethanol on the spatial orientation of animals. She writes about her research on her blog and often makes references to chemical compounds. She would like any reference to these compounds to automatically have a picture of the compound's structure shown as a tooltip, and a link to the compound's entry on the National Center for Biotechnology Information [NCBI] Web site. Similarly, she would like visitors to be able to visualize the chemical compound in the page using a new HTML5 canvas widget she has found on the web that combines data from different chemistry websites.
<div prefix="dbp: http://dbpedia.org/ontology/ fb: http://rdf.freebase.com/rdf/">
  My latest study about the effects of
  <span about="[fb:en.ethanol]" typeof="[dbp:ChemicalCompound]" property="[fb:chemistry.chemical_compound.pubchem_id]" content="702">ethanol</span>
  on mice's spatial orientation shows that ...
</div>
This section is non-normative.
RDFa 1.0 [RDFA-SYNTAX] has seen substantial growth since it became an official W3C Recommendation in October 2008. It has seen wide adoption among search companies, e-commerce sites, governments, and content management systems. There are numerous interoperable implementations, and growth is expected to continue with the latest releases of RDFa 1.1 [RDFA-CORE], XHTML+RDFa 1.1 [XHTML-RDFA], and HTML+RDFa 1.1 [HTML-RDFA].
In an effort to ensure that Web applications are able to fully utilize RDFa, this specification outlines an API and a set of interfaces that extract RDF Triples from Web documents or other document formats that utilize RDFa. The RDFa API is designed with maximum code expressiveness and ease of use in mind. Furthermore, a deep understanding of RDF and RDFa is not necessary in order to extract and utilize the structured data embedded in RDFa documents.
Since there are many Web browsers and programming environments for the Web, the rapid adoption of RDFa requires an interoperable API that Web document designers can count on being available in all Web browsers. The RDFa API provides a uniform and developer-friendly interface for extracting RDFa from Web documents.
Since most browser-based applications and browser extensions that utilize Web documents are written in ECMAScript [ECMA-262], the implementation of the RDFa API is primarily concerned with ensuring that concepts covered in this document are easily utilized in ECMAScript.
While ECMAScript is of primary concern, the RDFa API specification is language independent and is designed such that DOM tool developers may implement it in many of the other common Web programming languages such as Python, Java, Perl, and Ruby. Objects that are defined by the RDFa API are designed to work as seamlessly as possible with language-native types, operators, and program flow constructs.
The design goals that drove the creation of the APIs that are described in this document are:
The following diagram describes the relationship between all concepts discussed in this document.
Diagram of RDFa API Concepts
The lowest layer of the API defines the basic structures that are used to store information: Symbol, PlainLiteral, TypedLiteral, BlankNode, and finally the RDFStatement and the RDFGraph.
The next layer of the API, the DataStore, supports the storage of information. The DataParser and DataQuery interfaces directly interact with the DataStore.
The DataParser is used to extract information from the Document and store the result in a DataStore. The DataQuery interface is used to extract different views of data from the DataStore. The PropertyGroup is an abstract, easily manipulable view of this information for developers. While PropertyGroup objects can address most use cases, a developer also has access to the information in the DataStore at a basic level. Access to the raw data allows developers to create new views and ways of directly manipulating the data in a DataStore.
The highest layer of the API is the Document object and is what most web developers will use to retrieve PropertyGroups created from data stored in the document.
This section is non-normative.
This section is non-normative.
This API provides a number of interfaces to enable:
PropertyGroups from the data store.
The RDFa API has a number of advanced methods that can be used to access the DataStore, DataParser and DataQuery mechanisms. Most web developers will not need the advanced methods; the following interfaces cover most day-to-day development activities.
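document.getItemsByType(type)
Retrieves a list of PropertyGroups by their type, such as foaf:Person.
document.getItemBySubject(type)
Retrieves a single PropertyGroup by its subject, such as http://example.org/people#bob.
document.getItemsByProperty(property, optional value)
Retrieves a list of PropertyGroups by a particular property and optional value that the PropertyGroup contains.
document.getElementsByType(type)
Retrieves a list of DOM elements by the type of the data they describe, such as foaf:Person.
document.getElementsBySubject(type)
Retrieves a list of DOM elements by the subject of the data they describe, such as http://example.org/people#bob.
document.getElementsByProperty(property, optional value)
Retrieves a list of DOM elements by the properties and optional values that they declare.
document.data.context.setPrefix(prefix, iri)
Registers a prefix mapping so that a CURIE such as foaf:Person expands to http://xmlns.com/foaf/0.1/Person.
document.data.query.select(query, template)
Selects PropertyGroups based on a set of selection criteria.
document.data.store.filter(pattern)
Filters the DataStore by matching a given triple pattern.
document.data.parser.iterate(pattern)
Processes triples in the document that match a given pattern as they are discovered.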
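To make the prefix mapping concrete, here is a minimal sketch of how a CURIE such as foaf:Person could be expanded once a prefix has been registered. The PrefixContext constructor and its expand method are hypothetical names modelled loosely on document.data.context.setPrefix; this is not the normative algorithm, and it ignores details such as default prefixes and safe CURIEs.

```javascript
// Hypothetical prefix table, for illustration only.
function PrefixContext() {
  this.prefixes = {};
}

// Remember that `prefix` stands for the IRI `iri`.
PrefixContext.prototype.setPrefix = function (prefix, iri) {
  this.prefixes[prefix] = iri;
};

// Expand "prefix:reference" to a full IRI when the prefix is known;
// otherwise return the input unchanged (it may already be a full IRI).
PrefixContext.prototype.expand = function (curie) {
  var colon = curie.indexOf(":");
  if (colon === -1) {
    return curie;
  }
  var prefix = curie.substring(0, colon);
  var reference = curie.substring(colon + 1);
  if (Object.prototype.hasOwnProperty.call(this.prefixes, prefix)) {
    return this.prefixes[prefix] + reference;
  }
  return curie;
};

var context = new PrefixContext();
context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");
var expanded = context.expand("foaf:Person");
var untouched = context.expand("http://example.org/people#bob");
```

With the foaf prefix registered, expanded is the full IRI for foaf:Person, while an input whose leading segment is not a registered prefix is returned as-is.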
The following section uses the markup shown below to demonstrate how to extract and use PropertyGroups using the RDFa API. The following markup is assumed to be served from a document located at http://example.org/people.
<div prefix="foaf: http://xmlns.com/foaf/0.1/" about="#albert" typeof="foaf:Person"> <span property="foaf:name">Albert Einstein</span> </div>
You can retrieve the set (associative array) that is described above by doing the following:
var people = document.data.findAllMembers("http://xmlns.com/foaf/0.1/Person");
or you can specify a short-cut to use when specifying the IRI:
var rdf = document.ns("http://www.w3.org/1999/02/22-rdf-syntax-ns#");
var foaf = document.ns("http://xmlns.com/foaf/0.1/");
var people = document.data.each(undefined, rdf('type'), foaf('Person'));
You can also get a PropertyGroup by its subject:
var kb = document.data;
var aboutAlbert = kb.statementsMatching(kb.sym("http://example.org/people#albert"));
You can also specify a relative IRI and the document IRI will be automatically pre-pended:
var albert = document.data.sym("#albert");
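The resolution of a relative IRI against the document IRI can be pictured with the standard URL constructor available in browsers and Node.js. The resolveIri helper and the hard-coded document IRI below are assumptions for illustration; the API itself may resolve references differently.

```javascript
// Illustrative helper: resolve `reference` against `documentIri` using the
// standard URL constructor. Absolute IRIs pass through unchanged.
function resolveIri(reference, documentIri) {
  return new URL(reference, documentIri).href;
}

// The example markup is assumed to be served from this location.
var documentIri = "http://example.org/people";

var albertIri = resolveIri("#albert", documentIri);
var absoluteIri = resolveIri("http://example.org/people#albert", documentIri);
```

Both calls yield http://example.org/people#albert, which is why "#albert" and the full IRI can be used interchangeably in the examples.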
You can get a list of PropertyGroups by their properties:
var peopleNamedAlbertEinstein = document.data.each(undefined, foaf('name'), "Albert Einstein");
You can get a single item by its properties:
var personNamedAlbertEinstein = document.data.any(undefined, foaf('name'), "Albert Einstein");
The wildcard (undefined) can be used in any one of the subject, predicate, or object positions, so each() and any() are very flexible functions. The function each() always returns a list, and any() always returns a single item.
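The behaviour of each() and any() can be sketched with a tiny in-memory store. TinyStore is a hypothetical stand-in, not the tabulator or RDFa API implementation; it stores plain strings rather than node objects and assumes exactly one wildcard per call.

```javascript
// Hypothetical in-memory store holding [subject, predicate, object] arrays.
function TinyStore() {
  this.triples = [];
}

TinyStore.prototype.add = function (subject, predicate, object) {
  this.triples.push([subject, predicate, object]);
};

// Return the value found at the wildcard position of every matching triple.
TinyStore.prototype.each = function (subject, predicate, object) {
  var pattern = [subject, predicate, object];
  var results = [];
  this.triples.forEach(function (triple) {
    var wildcardValue;
    for (var i = 0; i < 3; i++) {
      if (pattern[i] === undefined) {
        wildcardValue = triple[i]; // remember the wildcard position
      } else if (pattern[i] !== triple[i]) {
        return; // a fixed position did not match, skip this triple
      }
    }
    results.push(wildcardValue);
  });
  return results;
};

// Return a single match, or undefined when nothing matches.
TinyStore.prototype.any = function (subject, predicate, object) {
  var matches = this.each(subject, predicate, object);
  return matches.length > 0 ? matches[0] : undefined;
};

var NAME = "http://xmlns.com/foaf/0.1/name";
var store = new TinyStore();
store.add("http://example.org/people#albert", NAME, "Albert Einstein");
store.add("http://example.org/people#marie", NAME, "Marie Curie");

var subjects = store.each(undefined, NAME, "Albert Einstein");
var marieName = store.any("http://example.org/people#marie", NAME, undefined);
var missing = store.any("http://example.org/people#nobody", NAME, undefined);
```

The first call puts the wildcard in the subject position and returns a list; the second puts it in the object position and returns one value.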
Note: Property groups are not used -- it is not clear they are needed.
You can retrieve property values from PropertyGroups like so:
var albert = document.data.sym("#albert");
var name = document.data.any(albert, foaf("name"));
You can also specify values that you would like to map to PropertyGroup attributes:
RDFa API:
var albert = document.getItemBySubject("#albert", {"foaf:name": "name"});
var name = albert.name;
RDF API:
var albert = document.data.sym("#albert");
var fn = foaf('name');
var name = document.data.any(albert, fn);
You can retrieve the DOM Element that is described above by doing the following:
var elements = document.getElementsByType("http://xmlns.com/foaf/0.1/Person");
or you can specify a short-cut to use when specifying the IRI:
OLD RDFa API:
// Context considered harmful where it can be avoided
document.data.context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");
var elements = document.getElementsByType("foaf:Person");
RDF API:
var foaf = $rdf.Namespace("http://xmlns.com/foaf/0.1/");
var elements = document.getElementsByType(foaf("Person"));
You can also get a list of Elements by the subject of data:
var elements = document.getElementsBySubject("http://example.org/people#albert");
You can also specify a relative IRI and the document IRI will be automatically pre-pended:
var elements = document.getElementsBySubject("#albert");
You can get a list of Elements by the properties and values that they declare:
var elements = document.data.each(undefined, foaf("name"), "Albert Einstein");
You can modify elements that are returned just like any other DOM Node, for example:
var elements = document.getElementsByProperty("foaf:name", "Bob");
for (var i = 0; i < elements.length; i++) {
  var e = elements[i];
  e.style.setProperty('color', '#00cc00', null);
}
The code above would change the color of all the areas of the page where the item's name is "Bob" to green.
This section covers a number of concepts that go beyond basic everyday usage of the RDFa API. The interfaces to the API allow you to work with data at an abstract level, or query structured data and override key parts of the software stack in order to extend the functionality that the API provides.
The features available via a Query object will depend on the implementation. However, all conforming processors will provide the basic element selection mechanisms described here.
Perhaps the most basic task is to select PropertyGroups of a particular type. The type of a PropertyGroup is set in RDFa via the special attribute @typeof. For example, the following markup expresses a PropertyGroup of type Person in the Friend-of-a-Friend vocabulary:
<div typeof="foaf:Person"> <span property="foaf:name">Albert Einstein</span> </div>
To locate all PropertyGroups that are people, we could use the document.getItemsByType() method:
document.getItemsByType("http://xmlns.com/foaf/0.1/Person");
or we could do the same using the DataQuery interface:
var query = document.data.createQuery("rdfa", store); var people = query.select( { "rdf:type": "foaf:Person" } );
While the query interface is more verbose for simple queries, it becomes necessary for more complex queries as demonstrated later in this section. Note that the Query object has access to the mappings provided via the document.data.context object, so they can also be used in queries. It is also possible to write the same query in a way that is independent of any prefix-mappings:
var people = query.select( { "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person" } );
The previous query selected all PropertyGroups of a certain type, but it did so by indicating that the property rdf:type should have a specific value. Queries can also specify other properties. For example, given the following mark-up:
<div typeof="foaf:Person">
  <span property="foaf:name">Albert Einstein</span> -
  <span property="foaf:myersBriggs">INTP</span>
  <a rel="foaf:workInfoHomepage" href="http://en.wikipedia.org/wiki/Albert_Einstein">More...</a>
</div>
<div typeof="foaf:Person">
  <span property="foaf:name">Mother Teresa</span> -
  <span property="foaf:myersBriggs">ISFJ</span>
  <a rel="foaf:workInfoHomepage" href="http://en.wikipedia.org/wiki/Mother_Teresa">More...</a>
</div>
<div typeof="foaf:Person">
  <span property="foaf:name">Marie Curie</span> -
  <span property="foaf:myersBriggs">INTP</span>
  <a rel="foaf:workInfoHomepage" href="http://en.wikipedia.org/wiki/Marie_Curie">More...</a>
</div>
The following query demonstrates how a developer would select and use all PropertyGroups of type Person that also have a Myers-Briggs personality type of "INTP" (aka: The Architect):
var architects = query.select( { "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": "http://xmlns.com/foaf/0.1/Person", "http://xmlns.com/foaf/0.1/myersBriggs": "INTP" } ); var name = architects[0].get("http://xmlns.com/foaf/0.1/name");
As before, prefix-mappings can also be used:
var architects = query.select( {"rdf:type": "foaf:Person", "foaf:myersBriggs": "INTP"} ); var name = architects[0].get("foaf:name");
Directives to generate the PropertyGroup object based on a template specified by the developer can also be used. In this case, all of the "INTP" personality types are gleaned from the page and presented as PropertyGroups containing each person's name and work information page:
var architects = query.select( {"rdf:type": "foaf:Person", "foaf:myersBriggs": "INTP"},
                               {"foaf:name": "name", "foaf:workInfoHomepage": "webpage"} );
var name = architects[0].name;
var infoWebpage = architects[0].webpage;
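The combination of selection criteria and a template can be sketched over plain objects. The select function and the groups array below are hypothetical stand-ins for the DataQuery interface and its data; real PropertyGroups may hold several values per property, which this sketch does not model.

```javascript
// Hypothetical property groups, one per person in the markup above,
// keyed by CURIE with a single value per property.
var groups = [
  { "rdf:type": "foaf:Person", "foaf:name": "Albert Einstein", "foaf:myersBriggs": "INTP",
    "foaf:workInfoHomepage": "http://en.wikipedia.org/wiki/Albert_Einstein" },
  { "rdf:type": "foaf:Person", "foaf:name": "Mother Teresa", "foaf:myersBriggs": "ISFJ",
    "foaf:workInfoHomepage": "http://en.wikipedia.org/wiki/Mother_Teresa" },
  { "rdf:type": "foaf:Person", "foaf:name": "Marie Curie", "foaf:myersBriggs": "INTP",
    "foaf:workInfoHomepage": "http://en.wikipedia.org/wiki/Marie_Curie" }
];

// Illustrative select: keep items whose properties equal every criterion,
// then, if a template is given, copy the named properties to short keys.
function select(items, criteria, template) {
  return items.filter(function (item) {
    return Object.keys(criteria).every(function (property) {
      return item[property] === criteria[property];
    });
  }).map(function (item) {
    if (!template) {
      return item;
    }
    var view = {};
    Object.keys(template).forEach(function (property) {
      view[template[property]] = item[property];
    });
    return view;
  });
}

var architects = select(groups,
  { "rdf:type": "foaf:Person", "foaf:myersBriggs": "INTP" },
  { "foaf:name": "name", "foaf:workInfoHomepage": "webpage" });
```

Each result exposes only the templated keys, so architects[0].name and architects[0].webpage can be read without knowing the underlying property IRIs.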
The RDFa API not only allows a developer to query the DataStore via the DataQuery mechanism, it also allows a developer to get to the underlying data structures that represent the structured data at the "atomic level".
The filter interface is a part of the DataStore and enables a developer to filter a series of triples out of the DataStore. For example, to extract all triples about a particular subject, the developer could do the following:
var names = document.data.store.filter( {"subject": "http://example.org/people#benjamin"} );
Developers could also combine subject-property filters by doing the following:
var names = document.data.store.filter( {"subject": "http://example.org/people#benjamin", "property": "foaf:nick"} );
The query above would extract all known nicknames for the subject as triples.
A developer may also retrieve all triples in the DataStore by specifying no filter parameters:
var allTriples = document.data.store.filter();
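The three filter calls above share one rule: a triple matches when every key present in the pattern equals the corresponding part of the triple, and an absent key matches anything. The following is a minimal sketch of that rule; the filterTriples function, the object shape of a triple, and the sample data are assumptions made for illustration.

```javascript
// Hypothetical triples as plain objects.
var triples = [
  { subject: "http://example.org/people#benjamin", property: "foaf:nick", object: "Ben" },
  { subject: "http://example.org/people#benjamin", property: "foaf:name", object: "Benjamin" },
  { subject: "http://example.org/people#mark", property: "foaf:nick", object: "Marky" }
];

// Illustrative pattern match: keys missing from the pattern act as wildcards,
// and a missing pattern matches every triple.
function filterTriples(list, pattern) {
  pattern = pattern || {};
  return list.filter(function (triple) {
    return ["subject", "property", "object"].every(function (key) {
      return pattern[key] === undefined || pattern[key] === triple[key];
    });
  });
}

var aboutBenjamin = filterTriples(triples, { subject: "http://example.org/people#benjamin" });
var nicknames = filterTriples(triples, { subject: "http://example.org/people#benjamin", property: "foaf:nick" });
var everything = filterTriples(triples);
```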
The .iterate interface will almost certainly see large changes in the next version of the RDFa API specification. Implementers are advised not to implement the interface yet and to wait for the next revision of this specification.
The iterate interface can be used to process triples in the document as they are discovered. This interface is most useful for processing large amounts of data in low-memory environments.
var iter = document.data.parser.iterate( {"subject": "http://example.org/people#mark"} );
for(var triple = iter.next(); triple != null; triple = iter.next())
{
// process each triple that is associated with http://example.org/people#mark
}
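Purely as an illustration of the calling convention used in the loop above, where next() returns null once no triples remain, an iterator of that shape could look like the sketch below. The iterate function and the in-memory triple list are hypothetical; a real parser would discover triples incrementally while walking the document, and the interface itself is expected to change.

```javascript
// Hypothetical source of triples; a real parser would walk the DOM instead.
var discovered = [
  { subject: "http://example.org/people#mark", property: "foaf:name", object: "Mark" },
  { subject: "http://example.org/people#anna", property: "foaf:name", object: "Anna" },
  { subject: "http://example.org/people#mark", property: "foaf:nick", object: "Marky" }
];

// Illustrative iterator: hands out one matching triple per call to next()
// and returns null when the input is exhausted.
function iterate(list, pattern) {
  var position = 0;
  return {
    next: function () {
      while (position < list.length) {
        var triple = list[position++];
        if (pattern.subject === undefined || pattern.subject === triple.subject) {
          return triple;
        }
      }
      return null;
    }
  };
}

var seen = [];
var iter = iterate(discovered, { subject: "http://example.org/people#mark" });
for (var triple = iter.next(); triple != null; triple = iter.next()) {
  seen.push(triple.object); // process each triple as it is handed out
}
```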
The sub-modules in the RDFa API are meant to be overridden by developers in order to extend basic functionality as well as innovate new interfaces for the RDFa API.
The API is designed such that a developer may override the default data store provided by the browser by providing their own. This is useful, for instance, if the developer wanted to create a permanent site-specific data store using Local Storage features in the browser, or allowing provenance information to be stored with each triple.
var mydatastore = new MyCustomDataStore(); document.data.store = mydatastore;
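A custom store that records provenance might look roughly like the sketch below. The add and filter method names, the record shape, and the source field are assumptions; the DataStore interface that a real replacement must satisfy is defined elsewhere in this specification.

```javascript
// Hypothetical custom store that remembers where each triple came from.
function MyCustomDataStore(source) {
  this.source = source;   // e.g. the IRI of the document being parsed
  this.records = [];
}

// Store the triple together with its provenance.
MyCustomDataStore.prototype.add = function (subject, property, object) {
  this.records.push({
    subject: subject,
    property: property,
    object: object,
    source: this.source
  });
};

// Same wildcard rule as the default filter: absent keys match anything.
MyCustomDataStore.prototype.filter = function (pattern) {
  pattern = pattern || {};
  return this.records.filter(function (record) {
    return ["subject", "property", "object"].every(function (key) {
      return pattern[key] === undefined || pattern[key] === record[key];
    });
  });
};

var customStore = new MyCustomDataStore("http://example.org/people");
customStore.add("http://example.org/people#albert", "foaf:name", "Albert Einstein");
var found = customStore.filter({ property: "foaf:name" });
```

Every record returned by filter() carries the source it was added under, which is the kind of extra information the default store would not necessarily keep.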
Developers may create and specify different parsers for parsing structured data from the document, whether they build upon RDFa or parse other languages not related to RDFa. For example, Microformats-specific parsers could be created to extract structured hCard data and store it in an object that is compatible with the DataStore interface.
var hcardParser = new MyHCardParser(); document.data.parser = hcardParser;
The query mechanism for the API can be overridden to provide different or more powerful query mechanisms. For example, by replacing the standard query mechanism, developers could provide a full SPARQL query mechanism:
var sparqlQuery = new MySparqlEngine(); document.data.query = sparqlQuery; var books = document.data.query.select("SELECT ?book ?title WHERE { ?book <http://purl.org/dc/elements/1.1/title> ?title . }", {"?book": "subject", "?title": "title"} );
The following section contains all of the interfaces that developers are expected to implement as well as implementation guidance.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
Implementations that use ECMAScript or Java to implement the APIs defined in this specification must implement them in a manner consistent with the respective ECMAScript or Java Bindings defined in the Web IDL specification, as this specification uses that specification's terminology. [WEBIDL]
Implementations that use any other language to implement the APIs defined in this specification that do not have bindings defined in the Web IDL specification should attempt to map the API as closely as possible to the implementation language's native mechanisms and datatypes. Developers are encouraged to work with other developers who are providing the RDFa API in the same language to ensure that RDFa Processors are modular and easily exchangeable.
RDFa is a syntax for expressing the RDF Data Model [RDF-CONCEPTS] in Web documents. The RDFa API is designed to extract the RDF Data Model from Web documents. The following RDF Resources are utilized in this specification: PlainLiterals, TypedLiterals, IRI References (as defined in [IRI]) and BlankNodes. The interfaces for each of these RDF Resources are detailed in this section. The types as exposed by the RDFa API conform to the same data and comparison restrictions as specified in the RDF concepts specification [RDF-CONCEPTS] and the [IRI] specification.
Each RDF interface provides access to both the extracted RDFa value and the DOM Node from which the value was extracted. This allows developers to extract and use RDF triples from a host language and also manipulate the DOM based on the associated DOM Node. For example, an agent could highlight all strings that are marked up as foaf:name properties in a Web document.
The basic RDF Resource types used in the RDFa API are:
http://www.w3.org/2001/XMLSchema#string.
"Harry Potter and the Half-Blood Prince"@en is a plain literal expressed in the English language.
"7"^^xsd:integer is a typed literal with a value of type xsd:integer.
BlankNodes include _:me and _:42.
<http://example.org/hp> rdfs:label "Harry Potter" .
Diagram of RDF Classes, Attributes, Methods and linkages.
An RDFa API implementer must provide the basic types as described in this specification. An implementer may provide additional types and/or a deeper type or class hierarchy that includes these basic types.
An RDFNode is anything that can be an object of an RDFTriple. It is also the base class of PlainLiteral and TypedLiteral.
[NoInterfaceObject]
interface RDFNode {
readonly attribute stringifier DOMString value;
};
value of type stringifier DOMString, readonly
The value of an RDFNode is either a literal's lexical value or a lexical identifier of an IRI or BlankNode. Note that the value attribute is marked with a stringifier decorator. In [WEBIDL], this means that the value is used as the return value if toString() is called on an object that inherits from this type. The stringifier can be overridden by the re-specification of the toString() method on a subclass of this type.
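In ECMAScript terms, the stringifier behaviour can be pictured as a toString() method that returns the value attribute, with a subclass free to replace it. The constructor names below (RDFNodeSketch, QuotedLiteralSketch) are hypothetical and exist only for this illustration; they are not interface objects defined by this specification.

```javascript
// Hypothetical base type: toString() returns the `value` attribute,
// mirroring the effect of the stringifier decorator.
function RDFNodeSketch(value) {
  this.value = value;
}
RDFNodeSketch.prototype.toString = function () {
  return this.value;
};

// Hypothetical subtype that re-specifies toString() to add quotes.
function QuotedLiteralSketch(value) {
  RDFNodeSketch.call(this, value);
}
QuotedLiteralSketch.prototype = Object.create(RDFNodeSketch.prototype);
QuotedLiteralSketch.prototype.constructor = QuotedLiteralSketch;
QuotedLiteralSketch.prototype.toString = function () {
  return '"' + this.value + '"';
};

var node = new RDFNodeSketch("http://example.org/hp");
var asString = String(node);            // uses the inherited stringifier
var concatenated = "IRI: " + node;      // string concatenation also uses it
var quoted = String(new QuotedLiteralSketch("Harry Potter"));
```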
An RDFResource is anything that can be a subject of an RDFTriple.
[NoInterfaceObject]
interface RDFResource : RDFNode
{
readonly attribute stringifier DOMString value;
};
value of type stringifier DOMString, readonly
The value of an RDFResource is either a string that represents an IRI or an identifier of a BlankNode.
An IRI Reference in the RDFa API points to a resource and is further defined in [IRI].
[NoInterfaceObject]
interface IRI : RDFResource
{
};
An RDF Literal is an RDF Resource that represents lexical values in RDFa data. The two RDF Literals provided via the RDFa API are PlainLiterals and TypedLiterals. For a given RDF Literal, either language or type information can be provided. If the type is set, the RDF Literal is a TypedLiteral. If a type is not set, it is a PlainLiteral.
PlainLiteral: a literal that may carry a language (e.g., 'en', 'fr', 'de').
TypedLiteral: a literal that carries a datatype IRI (e.g., http://www.w3.org/2001/XMLSchema#dateTime).
PlainLiterals have a string value and may specify a language.
[NoInterfaceObject]
interface PlainLiteral : RDFNode
{
readonly attribute stringifier DOMString value;
readonly attribute DOMString language;
};
language of type DOMString, readonly
value of type stringifier DOMString, readonly
This section is non-normative.
The following example demonstrates a few common use cases of the PlainLiteral type.
// Create a new PlainLiteral using the DataStore's createPlainLiteral interface
var literal = document.data.store.createPlainLiteral("Harry Potter and the Half-Blood Prince", "en");

// The API supports conversion of PlainLiterals to native language string types
var str = literal.toString();
// At this point, str is equivalent to "Harry Potter and the Half-Blood Prince"

// The API supports attribute-based access of PlainLiteral values; the native type
// of a PlainLiteral value is a DOMString.
var val = literal.value;
// At this point, val is equivalent to "Harry Potter and the Half-Blood Prince"

// The API supports attribute-based access of PlainLiteral language values; the
// native type of a PlainLiteral language value is a DOMString
var lang = literal.language;
// At this point, lang is equivalent to "en"
A TypedLiteral has a string value and a datatype specified as an IRI Reference. TypedLiterals can be converted into native language datatypes of the implementing programming language by registering a Typed Literal Converter as defined later in the specification.
The datatype's IRI reference specifies the datatype of the text value, e.g., xsd:dateTime or xsd:boolean.
The RDFa API provides a method to explicitly convert TypedLiteral values to native datatypes supported by the host programming language. Developers may write their own Typed Literal Converters in order to convert an RDF Literal into a native language type. The converters are registered by using the registerTypeConversion() method. Default TypedLiteral converters must be supported by the RDFa API implementation for the following XML Schema datatypes:
[NoInterfaceObject]
interface TypedLiteral : RDFNode
{
readonly attribute stringifier DOMString value;
readonly attribute IRI
type;
any valueOf ();
};
type of type IRI, readonly
value of type stringifier DOMString, readonly
valueOf
any
This section is non-normative.
The following example demonstrates how a TypedLiteral representing a date is automatically converted to ECMAScript's native Date object.
// Create a new TypedLiteral using the DataStore's createTypedLiteral interface
var literal = document.data.store.createTypedLiteral("2010-12-24", "xsd:date");

// The value attribute stores the data in the raw format as specified via the
// createTypedLiteral interface or in the document text
var value = literal.value;
// At this point, value is equivalent to "2010-12-24"

// The toString() method will convert the TypedLiteral into a language-native
// string value of "2010-12-24". This value may be different from the raw format
// stored in the literal.value attribute
var str = literal.toString();
// At this point, str is equivalent to "2010-12-24"

// The valueOf() method will convert the TypedLiteral into a language-native
// datatype. In ECMAScript, the return value of valueOf() will be a
// Date object.
var date = literal.valueOf();
// At this point, date will be a Date object with a value of
// Fri Dec 24 2010 00:00:00 GMT+0000
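How a default converter for xsd:date might produce such a Date object can be sketched as follows. This is a non-normative illustration: the function name convertXsdDate is hypothetical, and the sketch assumes a plain YYYY-MM-DD lexical form with no timezone suffix, interpreted as midnight UTC.

```javascript
// Hypothetical converter: turns an xsd:date lexical value into a Date.
// Returns null when the value is not in plain YYYY-MM-DD form.
function convertXsdDate(value) {
  var match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (!match) {
    return null;
  }
  var year = Number(match[1]);
  var month = Number(match[2]) - 1; // ECMAScript months are zero-based
  var day = Number(match[3]);
  return new Date(Date.UTC(year, month, day));
}

var christmasEve = convertXsdDate("2010-12-24");
var notADate = convertXsdDate("24 December 2010");
```

Building the Date from explicit UTC components avoids depending on how a particular engine parses date strings.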
A BlankNode is an RDF resource that does not have a corresponding IRI reference, as defined in [RDF-CONCEPTS]. The value of a BlankNode is not required to be the same for identical documents that are parsed at different times. The purpose of a BlankNode is to ensure that RDF Resources in the same document can be compared for equivalence by ID.
The reasoning behind how we stringify BlankNodes should be explained in more detail.
BlankNodes are stringified by prepending "_:" to BlankNode.value.
[NoInterfaceObject]
interface BlankNode : RDFResource
{
readonly attribute stringifier DOMString value;
};
value of type stringifier DOMString, readonly
The identifier of the BlankNode. The value must not be relied upon in any way between two separate RDFa processing runs of the same document.
Developers and authors must not assume that the value of a BlankNode will remain the same between two processing runs. BlankNode values are only valid for the most recent processing run on the document. BlankNode values will often be generated differently by different RDFa Processors.
This section is non-normative.
The following example demonstrates the use of a BlankNode in an ECMAScript implementation that uses incrementing numbers for the identifier.
// Create two new BlankNodes - A and B
var bna = document.data.store.createBlankNode();
var bnb = document.data.store.createBlankNode();

// BlankNode stringification changes from implementation to implementation
// and between structured data extraction runs. Developers must not depend on
// the stringified names of BlankNodes to be the same between implementations
// or two different processing runs of a structured data processor.
var stra = bna.toString();
// The stringified representation of the BlankNode at this point could be
// "_:1", "_:a", "_:blank_node_alpha", or any other unique identifier starting
// with "_:"

// Extract the unique values associated with BlankNode A and BlankNode B
var bnavalue = bna.value;
var bnbvalue = bnb.value;
// The two values above, bnavalue and bnbvalue, must not ever be allowed to
// be the same. The values can be strings, integers or any other value that
// easily establishes a unique identifier for the BlankNode.
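One way an implementation could guarantee unique BlankNode values within a processing run is a simple counter. The following non-normative sketch assumes that approach; makeBlankNodeFactory and the "b" prefix are hypothetical choices, not requirements of this specification.

```javascript
// Hypothetical factory: each call returns a node with a fresh, run-local value.
function makeBlankNodeFactory() {
  var counter = 0;
  return function createBlankNode(name) {
    counter += 1;
    // An optional name is combined with the counter so values stay unique.
    var value = (name ? name + "-" : "b") + counter;
    return {
      value: value,
      toString: function () { return "_:" + value; }
    };
  };
}

var createBlankNode = makeBlankNodeFactory();
var first = createBlankNode();
var second = createBlankNode();
var named = createBlankNode("me");
```

Because the counter lives inside the factory, a new processing run that creates a new factory is free to hand out different values, which is why the values must not be compared across runs.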
The RDFTriple interface represents an RDF triple as specified in [RDF-CONCEPTS]. RDFTriple can be used by referring to properties, such as subject, property, and object. RDFTriple can also be used by referring to pre-defined indexes. The stringification of RDFTriple results in an N-Triples-based representation as defined in [RDF-TESTCASES].
[NoInterfaceObject, Null=Null]
interface RDFTriple {
const unsigned long size = 3;
readonly attribute RDFResource
subject;
readonly attribute IRI
property;
readonly attribute RDFNode
object;
getter RDFNode get (in unsigned long index);
stringifier DOMString toString ();
};
get
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
index | unsigned long | ✘ | ✘ |
getter RDFNode
toString
stringifier DOMString
size of type unsigned long
The size of an RDFTriple when it is treated as an array, which is always three.
This section is non-normative.
The following examples demonstrate the basic usage of an RDFTriple object.
The creation of the triple uses the foaf:name CURIE, which is transformed into an IRI. The example assumes that a mapping has already been created for foaf. For more information on creating RDFa DOM API mappings, see the section on IRI Mappings.
// Create a new RDFTriple with a PlainLiteral as the object
var triple = document.data.store.createTriple("http://www.example.com#manu", "foaf:name",
    document.data.store.createPlainLiteral("Manu Sporny"));

// Adding the RDFTriple's subject to a string will automatically call the
// toString() method for the RDFResource, which will serialize the IRI value
// to a language-native string
var str = "Triple subject: " + triple.subject;
// At this point, str will be "Triple subject: http://www.example.com#manu"

// You can also convert the entire triple to N-Triples format by making the
// language implementation call the underlying toString() interface for RDFTriple.
var tstr = "N-Triples: " + triple;
// At this point, tstr will be 'N-Triples: <http://www.example.com#manu> <http://xmlns.com/foaf/0.1/name> "Manu Sporny" .'
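The N-Triples stringification used in the example above can be approximated with a few lines of ECMAScript. This sketch is non-normative and simplified: the kind tag and the function names are hypothetical, and the escaping of non-ASCII characters required by [RDF-TESTCASES] is omitted.

```javascript
// Escape backslashes, double quotes and line breaks in a literal's lexical value.
function escapeLiteral(text) {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// Serialize one term. The "kind" field is a hypothetical tag used only here.
function termToNTriples(term) {
  switch (term.kind) {
    case "iri":
      return "<" + term.value + ">";
    case "blank":
      return "_:" + term.value;
    case "plain":
      return '"' + escapeLiteral(term.value) + '"' +
        (term.language ? "@" + term.language : "");
    case "typed":
      return '"' + escapeLiteral(term.value) + '"^^<' + term.type + ">";
    default:
      throw new Error("Unknown term kind: " + term.kind);
  }
}

// Join subject, property and object, then terminate the statement with " .".
function tripleToNTriples(triple) {
  return [triple.subject, triple.property, triple.object]
    .map(termToNTriples)
    .join(" ") + " .";
}

var sketchTriple = {
  subject: { kind: "iri", value: "http://www.example.com#manu" },
  property: { kind: "iri", value: "http://xmlns.com/foaf/0.1/name" },
  object: { kind: "plain", value: "Manu Sporny" }
};
var line = tripleToNTriples(sketchTriple);
// line is '<http://www.example.com#manu> <http://xmlns.com/foaf/0.1/name> "Manu Sporny" .'
```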
A number of convenience objects and methods are provided by the RDFa DOM API to help developers manipulate RDF Resources more easily when writing Web applications.
The basic RDF interface types described earlier in this document are utilized by the following Structured Data Interfaces:
RDFTriple
objects.DataStore
.DataParser
.DataStore
.
Diagram of Linked Data Classes, Attributes, Methods and linkages.
Processing RDF data involves the frequent use of unwieldy IRI references and frequent type conversion. The DataContext interface is provided in order to simplify contextual operations such as shortening IRIs and converting RDF data into native language datatypes.
It is assumed that this interface is created and available before a document is parsed for RDFa data. For example, while operating within a Browser Context, it is assumed that the following lines of code are executed before a developer has access to the RDFa API methods on the document object:
document.data.context = new DataContext();
document.data.context.setPrefix("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
document.data.context.setPrefix("xsd", "http://www.w3.org/2001/XMLSchema#");
In general, when a CURIE is resolved by the RDFa API or a TypedLiteral is converted to a native language type, the current DataContext stored in the DocumentData object must be used to perform the action. This is to ensure that there is only one active DataContext in use by the RDFa API at any given time. The default DataContext stored in the DocumentData object may be changed at runtime.
All of the code that sets up the default type converters for Browser Contexts that use ECMAScript should probably be in the code snippet above.
The following interface allows IRI mappings to be easily created and used by Web Developers at run-time. It also allows for conversion of RDF data into native language datatypes.
interface DataContext {
void setPrefix (in DOMString prefix, in DOMString iri);
void registerTypeConversion (in DOMString iri, in TypedLiteralConverter
converter);
IRI
resolveCurie (in DOMString curie);
any convertType (in DOMString value, in optional DOMString inputType, in optional DOMString modifier);
};
convertType
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
value | DOMString | ✘ | ✘ | The value to convert that is associated with the TypedLiteral. |
inputType | DOMString | ✘ | ✔ | The input type for the TypedLiteral passed as a string. For example, xsd:string or xsd:integer. |
modifier | DOMString | ✘ | ✔ | The developer-supplied modifier for the conversion. The string is a free-form string that is used by the TypedLiteralConverter that is associated with the inputType. The RDFa Working Group is still discussing whether or not having a modifier is a good idea, as 90% of use cases will never use it and the remaining use cases could provide the functionality with an external switch to the conversion function. |
any
registerTypeConversion
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
iri | DOMString | ✘ | ✘ | A string specifying the IRI datatype. The string may be a CURIE. For example: http://www.w3.org/2001/XMLSchema#integer or xsd:integer. |
converter | TypedLiteralConverter | ✘ | ✘ | Converts the TypedLiteral's value into a native language datatype in the current programming language. |
void
resolveCurie
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
curie | DOMString | ✘ | ✘ | The CURIE that is to be resolved into an IRI. |
IRI
setPrefix
For example, if a developer wanted to create the foaf IRI mapping, they would call setPrefix("foaf", "http://xmlns.com/foaf/0.1/").
Calling the setPrefix() method with a prefix value that does not exist results in the creation of a new mapping. Calling the method with a null IRI value will remove the mapping.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
prefix | DOMString | ✘ | ✘ | The prefix to put into the mapping as the key (e.g., foaf). |
iri | DOMString | ✘ | ✘ | The IRI reference to place into the mapping as the mapped value of the given prefix (e.g., "http://xmlns.com/foaf/0.1/"). |
void
All methods that accept CURIEs as arguments in the RDFa API must use the algorithm specified in RDFa Core, Section 7.4: CURIE and URI Processing [RDFA-CORE] for TERMorCURIEorURI. The prefix and term mappings are provided by the current document.data.context instance.
The following examples demonstrate how mappings are created and used via the RDFa API.
// Create a new mapping for the Friend-of-a-Friend vocabulary
document.data.context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");

// The new mapping is automatically used when CURIEs are expanded to IRIs,
// the following statement will return all the people in the document
var people = document.getItemsByType("foaf:Person");

// The following statement will result in the exact same list of PropertyGroups
// as returned by the previous getItemsByType() call. Note that the only
// difference is that this call uses a full IRI, while the call above uses a
// CURIE
var people2 = document.getItemsByType("http://xmlns.com/foaf/0.1/Person");
In the example above, the CURIE "foaf:Person" is expanded to an IRI with the value "http://xmlns.com/foaf/0.1/Person" in the getItemsByType method.
IRI mappings for all terms in the following vocabularies must be included: rdf and xsd.
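The prefix handling described in this section can be sketched with a plain lookup table. This is a non-normative and deliberately simplified illustration: it does not implement the full TERMorCURIEorURI algorithm of [RDFA-CORE], the standalone functions and the prefixes table are hypothetical, and the xsd namespace shown is an assumption of the sketch.

```javascript
// Hypothetical prefix table, pre-loaded with the rdf and xsd mappings.
var prefixes = {
  rdf: "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
  xsd: "http://www.w3.org/2001/XMLSchema#"
};

function setPrefix(prefix, iri) {
  if (iri === null) {
    delete prefixes[prefix]; // a null IRI removes the mapping
  } else {
    prefixes[prefix] = iri;  // otherwise create or overwrite it
  }
}

// Simplified expansion: "prefix:reference" becomes mapping + reference.
// Anything without a known prefix is returned unchanged and treated as an IRI.
function resolveCurie(curie) {
  var colon = curie.indexOf(":");
  if (colon > 0) {
    var prefix = curie.slice(0, colon);
    if (Object.prototype.hasOwnProperty.call(prefixes, prefix)) {
      return prefixes[prefix] + curie.slice(colon + 1);
    }
  }
  return curie;
}

setPrefix("foaf", "http://xmlns.com/foaf/0.1/");
var expanded = resolveCurie("foaf:Person");
var untouched = resolveCurie("http://xmlns.com/foaf/0.1/Person");
```

Both calls yield the same IRI, which is why a CURIE and a full IRI can be used interchangeably as method arguments.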
TypedLiteralConverter is a callable interface that transforms the value of a TypedLiteral into a native language type in the current programming language. The type IRI of the TypedLiteral is used to determine the best mapping to the native language type.
A TypedLiteralConverter may be implemented as a class, or as a language-native callback in languages like ECMAScript.
[NoInterfaceObject, Callback]
interface TypedLiteralConverter {
any convert (in DOMString value, in optional IRI
inputType, in optional DOMString modifier);
};
convert
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
value | DOMString | ✘ | ✘ | The value to convert that is associated with the TypedLiteral. |
inputType | IRI | ✘ | ✔ | The input type for the TypedLiteral passed as an IRI. For example, http://www.w3.org/2001/XMLSchema#string or http://www.w3.org/2001/XMLSchema#integer. |
modifier | DOMString | ✘ | ✔ | A developer-specified modifier used during the conversion. The string is a free-form string that is used by the developer-specified convert() method. |
any
The following example demonstrates how a developer could register and use a TypedLiteralConverter.
// Register a new type converter for the "xsd:boolean" type. The lexical
// forms "true" and "1" map to true; every other value maps to false.
document.data.context.registerTypeConversion("xsd:boolean",
    function(value) { return value === "true" || value === "1"; });

// Create a new TypedLiteral of type "xsd:boolean" with a string value of "1"
var literal = document.data.store.createTypedLiteral("1", "xsd:boolean");

// Convert the literal to a string
var lstr = literal.toString();
var lvalue = literal.value;
// At this point, lstr and lvalue will both be "1"

// Get the language-native value of the literal
var lnvalue = literal.valueOf();
// At this point, the language-native value of the lnvalue variable will
// be a Boolean type whose value is 'true'.
The following example demonstrates how to create and pass a TypedLiteralConverter function in ECMAScript:
var converter = function (value) { return new String(value); };
document.data.context.registerTypeConversion("xsd:string", converter);
A TypedLiteralConverter can also inspect the input type so that one converter can be used for multiple types:
var converter = function(value, inputType, modifier) {
  if(inputType == "http://www.w3.org/2001/XMLSchema#integer") {
    // if the input type is xsd:integer, convert to an ECMAScript integer
    return parseInt(value, 10);
  }
  else if(modifier == "caps") {
    // if the modifier is "caps" convert to a string and uppercase the
    // return value
    return new String(value).toUpperCase();
  }
  else {
    // in all other cases, convert to a string
    return new String(value);
  }
};

// register the converter
document.data.context.registerTypeConversion("xsd:string", converter);
document.data.context.registerTypeConversion("xsd:integer", converter);

// Use the developer-defined TypedLiteralConverter to resolve the value to
// a language-native type. In this example, airport codes are commonly
// upper-cased values, so specify that the conversion should be capitalized.
var airportCode = document.data.context.convertType("wac", "xsd:string", "caps");
// At this point, the value of airportCode should be "WAC"
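How registerTypeConversion() and convertType() might cooperate internally can be sketched as a registry keyed by datatype IRI. This is a non-normative illustration under stated assumptions: the standalone functions, the converters table and the expandType helper are hypothetical, only the xsd prefix is expanded, and falling back to the raw string when no converter is registered is a choice made for this sketch.

```javascript
// Hypothetical registry keyed by full datatype IRI.
var XSD = "http://www.w3.org/2001/XMLSchema#";
var converters = Object.create(null);

function expandType(type) {
  // Only the xsd prefix is handled in this sketch.
  return type.indexOf("xsd:") === 0 ? XSD + type.slice(4) : type;
}

function registerTypeConversion(type, converter) {
  converters[expandType(type)] = converter;
}

function convertType(value, inputType, modifier) {
  var iri = inputType ? expandType(inputType) : undefined;
  var converter = iri ? converters[iri] : undefined;
  // Fall back to the raw string when no converter is registered.
  return converter ? converter(value, iri, modifier) : value;
}

registerTypeConversion("xsd:integer", function (value) {
  return parseInt(value, 10);
});
registerTypeConversion("xsd:boolean", function (value) {
  // xsd:boolean allows the lexical forms "true", "false", "1" and "0".
  return value === "true" || value === "1";
});

var seven = convertType("7", "xsd:integer");     // 7
var flag = convertType("0", "xsd:boolean");      // false
var raw = convertType("2010-12-24", "xsd:date"); // "2010-12-24" (no converter)
```

Expanding the type before both registration and lookup means a converter registered under a CURIE is also found when the full IRI is supplied.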
The DataStore is a set of RDFTriple objects. It provides a basic getter as well as an indexed getter for retrieving individual items from the store. The DataStore can be used to create primitive types as well as store collections of them in the form of RDFTriples.
The forEach method is not properly defined in WebIDL - need to get input from the WebApps Working Group on how best to author this interface.
[NoInterfaceObject]
interface DataStore {
readonly attribute unsigned long size;
getter RDFTriple get (in unsigned long index);
boolean add (in RDFTriple
triple);
IRI
createIRI (in DOMString iri, in optional Node node);
PlainLiteral
createPlainLiteral (in DOMString value, in optional DOMString? language);
TypedLiteral
createTypedLiteral (in DOMString value, in DOMString type);
BlankNode
createBlankNode (in optional DOMString name);
RDFTriple
createTriple (in RDFResource
subject, in IRI
property, in RDFNode
object);
[Null=Null]
DataStore
filter (in optional Object? pattern, in optional Element? element, in optional RDFTripleFilter
filter);
void clear ();
void forEach (in DataStoreIterator
iterator);
boolean merge (in DataStore
store);
};
size of type unsigned long, readonly
The size, in number of RDFTriples, of the store.
add
Adds an RDFTriple to the DataStore. Returns True if the RDFTriple was added to the store successfully. Adding an RDFTriple that already exists in the DataStore must return True and must not store two duplicate triples in the DataStore.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
triple | RDFTriple | ✘ | ✘ | The triple to add to the DataStore. |
boolean
clear
void
createBlankNode
Creates a BlankNode given an optional name value.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
name | DOMString | ✘ | ✔ | The name of the BlankNode, which will be used when stringifying the BlankNode. |
BlankNode
createIRI
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
iri | DOMString | ✘ | ✘ | The IRI reference's lexical value. |
node | Node | ✘ | ✔ | An optional DOM Node to associate with the IRI. |
IRI
createPlainLiteral
Creates a PlainLiteral given a value and an optional language.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
value | DOMString | ✘ | ✘ | The value of the PlainLiteral, which is usually a human-readable string. |
language | DOMString | ✔ | ✔ | The language that is associated with the PlainLiteral, encoded according to the rules outlined in [BCP47]. |
PlainLiteral
createTriple
Creates an RDFTriple given a subject, property and object. If any incoming value does not match the requirements listed below, a Null value must be returned by this method.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
subject | RDFResource | ✘ | ✘ | The subject value of the RDFTriple. The value must be either an IRI or a BlankNode. |
property | IRI | ✘ | ✘ | The property value of the RDFTriple. |
object | RDFNode | ✘ | ✘ | The object value of the RDFTriple. The value must be an IRI, PlainLiteral, TypedLiteral, or BlankNode. |
RDFTriple
createTypedLiteral
Creates a TypedLiteral given a value and a type.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
value | DOMString | ✘ | ✘ | The value of the TypedLiteral. |
type | DOMString | ✘ | ✘ | The IRI type of the TypedLiteral. The argument can either be a full IRI or a CURIE. |
TypedLiteral
filter
Returns a DataStore, which consists of zero or more RDFTriple objects.
It is an open question whether this method should return a DataStore or a Sequence of RDFTriples. DataStores allow much higher-level functions to be carried out versus a simple Sequence of RDFTriples. However, DataStores may be very memory intensive to construct and manage.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
pattern | Object | ✔ | ✔ | A filter pattern that determines which triples to select from the DataStore. The pattern may contain subject, property and object parameters, which are used to filter triples by subject, property and object, respectively. |
element | Element | ✔ | ✔ | The parent DOM Element where filtering should start. The implementation must only consider RDF triples on the current DOM Element and its children. |
filter | RDFTripleFilter | ✘ | ✔ | A user-defined function, returning a true or false value, that determines whether or not an RDFTriple should be added to the result. |
DataStore
forEach
Calls the given iterator for each RDFTriple in the DataStore.
While the forEach() method is intended to provide a functional mechanism for iterating through a DataStore, it has been questioned whether the interface would be useful for developers since there is already a procedural array-index-based iteration mechanism built into a DataStore.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
iterator | DataStoreIterator | ✘ | ✘ | A function that takes the following arguments: index, subject, property, object. The function is called for each item in the DataStore. |
void
get
Returns the RDFTriple object at the given index in the list.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
index | unsigned long | ✘ | ✘ | The index of the RDFTriple in the list to retrieve. The value must be an integer greater than or equal to zero and less than DataStore::size. |
getter RDFTriple
merge
Merges the triples of an external DataStore into this DataStore. Duplicate triples must not be inserted into the same data store. Returns True if all triples were merged into the store successfully.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
store | DataStore | ✘ | ✘ | The external DataStore to merge into this DataStore. |
boolean
The DataStoreIterator interface is used by the forEach() method on the DataStore when processing all of the triples in a DataStore.
[NoInterfaceObject, Callback, Null=Null]
interface DataStoreIterator {
void process (in int index, in RDFResource
subject, in IRI
property, in RDFNode
object);
};
process
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
index | int | ✘ | ✘ | The offset into the DataStore that contains the current RDFTriple being processed. |
subject | RDFResource | ✘ | ✘ | The subject of the RDFTriple being processed. |
property | IRI | ✘ | ✘ | The property associated with the RDFTriple being processed. |
object | RDFNode | ✘ | ✘ | The object associated with the RDFTriple being processed. |
void
This section is non-normative.
The following example demonstrates two mechanisms that are available for navigating through a DataStore: index getter-based iteration and array index-based iteration.
// Get all triples in the document.data.store object
var store = document.data.store.filter();

// Loop through the DataStore
for(var i = 0; i < store.size; i++) {
  // a developer may use the get() interface to retrieve a triple. This
  // approach is called index getter-based iteration.
  var t1 = store.get(i);

  // alternatively, a developer may use the indexed-method of retrieving a
  // triple from the DataStore. This approach is called array index-based
  // iteration.
  var t2 = store[i];
}
The following example demonstrates a more functional mechanism that can be used to process each triple in a DataStore:
// Specify a callback function as defined by the DataStoreIterator
// interface
function alertTriple(index, subject, property, object) {
  alert("DataStore subject: " + subject + ", property: " + property + ", object: " + object);
}

// Iterate over the DataStore, executing the alertTriple callback for each
// triple in the DataStore.
document.data.store.forEach(alertTriple);
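The duplicate handling required of add() and merge(), together with pattern-based filtering, can be sketched over an in-memory array. This is a non-normative illustration: SketchStore is a hypothetical name, triples are reduced to plain objects with string terms, and the element and filter-function arguments of filter() are left out.

```javascript
// Hypothetical store over triples of plain strings: { subject, property, object }.
function SketchStore() {
  this.triples = [];
}

// A triple's identity is the combination of its three terms.
function tripleKey(t) {
  return JSON.stringify([t.subject, t.property, t.object]);
}

SketchStore.prototype.add = function (triple) {
  var key = tripleKey(triple);
  var exists = this.triples.some(function (t) { return tripleKey(t) === key; });
  if (!exists) {
    this.triples.push(triple);
  }
  return true; // adding a duplicate still reports success
};

SketchStore.prototype.merge = function (other) {
  other.triples.forEach(function (t) { this.add(t); }, this);
  return true;
};

// Pattern keys that are absent or null match any value.
SketchStore.prototype.filter = function (pattern) {
  var result = new SketchStore();
  pattern = pattern || {};
  this.triples.forEach(function (t) {
    var matches = ["subject", "property", "object"].every(function (part) {
      return pattern[part] == null || pattern[part] === t[part];
    });
    if (matches) {
      result.add(t);
    }
  });
  return result;
};

var sketchStore = new SketchStore();
var manu = "http://www.example.com#manu";
sketchStore.add({ subject: manu, property: "http://xmlns.com/foaf/0.1/name", object: "Manu Sporny" });
sketchStore.add({ subject: manu, property: "http://xmlns.com/foaf/0.1/name", object: "Manu Sporny" });
sketchStore.add({ subject: manu, property: "http://xmlns.com/foaf/0.1/nick", object: "manu" });

var nameTriples = sketchStore.filter({ property: "http://xmlns.com/foaf/0.1/name" });
```

A linear scan keeps the sketch short; an implementation holding many triples would more likely index them by key.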
The DataParser is capable of processing a DOM Element and placing the parsing results into a DataStore. While this section specifies how one would parse RDFa data and place it into a DataStore, the interface is also intended to support the parsing and storage of various Microformats, eRDF, GRDDL, DC-HTML, and Microdata. Web developers that would like to write custom parsers should extend this interface.
[NoInterfaceObject]
interface DataParser {
attribute DataStore
store;
[Null=Null]
DataIterator
iterate (in optional Object? pattern, in optional Element? element, in optional RDFTripleFilter
filter);
boolean parse (in Element domElement);
};
store of type DataStore
The DataStore that is associated with the DataParser. The results of each parsing run will be placed into the store.
iterate
Returns a DataIterator, which is capable of iterating through a set of RDF triples, one RDFTriple at a time. The DataIterator is most useful in small memory footprint environments, or in documents that contain a very large number of triples.
Providing both a stream-based mechanism for processing triples via the iterate() method, and a process-and-store-based mechanism for storing triples via the parse() method, on the same interface is confusing. Each mechanism provides an alternate way of processing triples in a document. In the future, iterate() and parse() may be separated out into two distinct interfaces.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
pattern | Object | ✔ | ✔ | A filter pattern that determines which triples to select. The pattern may contain subject, property and object parameters, which are used to filter triples by subject, property and object, respectively. |
element | Element | ✔ | ✔ | The parent DOM Element where filtering should start. The implementation must only consider RDF triples on the current DOM Element and its children. |
filter | RDFTripleFilter | ✘ | ✔ | A user-defined function, returning a true or false value, that determines whether or not an RDFTriple should be added to the result. |
DataIterator
parse
Parses the document, starting at the given DOM Element, and populates the store with the information that is discovered. If a starting element isn't specified, or the value of the starting element is Null, then the document object must be used as the starting element.
When placing RDFTriples into the DataStore, the entire document must be processed by an RDFa Processor due to context that may affect the generation of a set of triples. Specifying the DOM Element is useful when a subset of the document data is to be stored in the DataStore.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
domElement | Element | ✘ | ✘ | The DOM Element that should trigger triple generation. |
boolean
The DataIterator interface may undergo changes in the next version of the RDFa API specification. Implementers are warned to not implement the interface and wait for the next revision of this specification.
The DataIterator iterates through a DOM subtree and returns RDFTriples that match a filter function or triple pattern. A DOM Element can be specified so that only triples contained in the Element and its children will be a part of the iteration. The DataIterator is provided in order to allow implementers to provide a less memory intensive implementation for processing triples in very large documents.
A DataIterator is created by calling the document.data.parser.iterate() method.
[NoInterfaceObject]
interface DataIterator {
attribute DataStore
store;
readonly attribute Element root;
readonly attribute RDFTripleFilter
filter;
readonly attribute RDFTriple
triplePattern;
RDFTriple
next ();
};
filter of type RDFTripleFilter, readonly
The filter function that is used to select RDFTriples in a subtree.
root of type Element, readonly
The root DOM Element of the subtree that is searched for RDFTriples.
store of type DataStore
The DataStore that is associated with the DataIterator.
triplePattern of type RDFTriple, readonly
This section is non-normative.
The following examples describe how various filter patterns can be applied to the DOM via document.data.parser.iterate().
// Get a DataIterator via the DataParser interface. A DataIterator allows
// stream-based access to document data and does not store the matched
// triples into a DataStore. DataIterators are useful in environments where
// memory is not plentiful and stream-based processing would reduce memory
// usage.
var iter1 = document.data.parser.iterate();

// Iterate through each triple matched by the iterator
for(var triple = iter1.next(); triple != null; triple = iter1.next()) {
  // do something with the RDFTriple
}

// A developer may provide a pattern filter to use when filtering on subject.
// The following iterator would only iterate on triples with a subject
// of "http://www.example.com#foo"
var iter2 = document.data.parser.iterate( {"subject": "http://www.example.com#foo"} );

// A developer may also provide a pattern filter to use when filtering on property.
// The following iterator would only iterate on triples with a property
// of "http://xmlns.com/foaf/0.1/name"
var iter3 = document.data.parser.iterate( {"property": "foaf:name"} );
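The pull-style contract of next(), which returns one matching triple per call and null when nothing is left, can be sketched as follows. This is a non-normative illustration: the iterate function here is a hypothetical stand-in that walks an in-memory array rather than a DOM subtree, and triples are plain objects with string terms.

```javascript
// Hypothetical iterator over an in-memory array; a real DataIterator would
// walk the DOM instead. Triples are plain { subject, property, object } objects.
function iterate(triples, pattern, filter) {
  var position = 0;
  pattern = pattern || {};

  function matches(t) {
    var byPattern = ["subject", "property", "object"].every(function (part) {
      return pattern[part] == null || pattern[part] === t[part];
    });
    return byPattern && (!filter || filter(t));
  }

  return {
    // Returns the next matching triple, or null once the input is exhausted.
    next: function () {
      while (position < triples.length) {
        var candidate = triples[position];
        position += 1;
        if (matches(candidate)) {
          return candidate;
        }
      }
      return null;
    }
  };
}

var source = [
  { subject: "http://www.example.com#foo", property: "http://xmlns.com/foaf/0.1/name", object: "Foo" },
  { subject: "http://www.example.com#bar", property: "http://xmlns.com/foaf/0.1/name", object: "Bar" },
  { subject: "http://www.example.com#foo", property: "http://xmlns.com/foaf/0.1/nick", object: "f" }
];

var seen = [];
var iter = iterate(source, { subject: "http://www.example.com#foo" });
for (var t = iter.next(); t !== null; t = iter.next()) {
  seen.push(t.object);
}
// seen is ["Foo", "f"]
```

Only the current position is kept between calls, so memory use does not grow with the number of matches.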
The PropertyGroup interface provides a view on a particular subject contained in the DataStore. The PropertyGroup aggregates the RDFTriples as a single language-native object in order to provide a more natural programming primitive for developers.
PropertyGroup attributes can be accessed in the following ways in ECMAScript:
// Retrieves a PropertyGroup for the given subject - since the IRI is
// fragment-relative, the document's IRI is used as the base IRI.
var person = document.getItemBySubject("#bob");

// Access the PropertyGroup's name attribute via an IRI
var name1 = person.get("http://xmlns.com/foaf/0.1/name");

// Access the PropertyGroup's name attribute via a CURIE
var name2 = person.get("foaf:name");
// At this point name1 and name2 are equivalent
[NoInterfaceObject]
interface PropertyGroup {
attribute Object info;
attribute IRI
[] properties;
Sequence<RDFNode> get (in DOMString property);
};
info of type Object
Additional information about the PropertyGroup. The info object must provide a property-based accessor mechanism for languages that support it, such as ECMAScript. For DOM-based environments, the Element that originated the PropertyGroup must be specified in a property called source in the info attribute.
properties of type array of IRI
The properties that are present in the PropertyGroup.
get
Retrieves the IRIs, BlankNodes, PlainLiterals, and/or TypedLiterals in the PropertyGroup that have a property IRI that is equivalent to the given value.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
property | DOMString | ✘ | ✘ | A stringified IRI representing a property whose values are to be retrieved from the PropertyGroup. For example, using a property of http://xmlns.com/foaf/0.1/name will return a sequence of values that represent FOAF names in the PropertyGroup. The given property may also be a CURIE. |
Sequence<RDFNode>
This section is non-normative.
The following examples demonstrate how to use the PropertyGroup interface.
// Get a PropertyGroup by subject. In this case, the PropertyGroup will
// represent Ivan Herman, the current Semantic Web Activity Lead at the
// World Wide Web Consortium
var ivan = document.getItemBySubject("http://www.ivan-herman.net/foaf#me");

// Set the context mapping for the foaf prefix so that we can retrieve
// properties using CURIEs
document.data.context.setPrefix("foaf", "http://xmlns.com/foaf/0.1/");

// Get the names associated with Ivan Herman
var names = ivan.get("foaf:name");
// At this point, names will be a list containing two items,
// ["Herman Iván", "Ivan Herman"], Ivan's native name and Ivan's
// Westernized name.

// Get the titles associated with Ivan Herman
var titles = ivan.get("foaf:title");
// At this point, the titles list will contain only a single item - ["Dr."]

// Get all of the work homepages associated with Ivan
var pages = ivan.get("foaf:workInfoHomepage");
// At this point, the pages list will contain three entries:
// "http://www.iswsa.org/", "http://www.iw3c2.org", and
// "http://www.w3.org/2001/sw/#activity"
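How a PropertyGroup could be assembled from the triples that share a subject is sketched below. This is a non-normative illustration: buildPropertyGroup is a hypothetical helper, triples are plain objects with string terms, CURIE resolution is left out, and the sample data simply mirrors the values mentioned in the example above.

```javascript
// Hypothetical builder: collect every triple about one subject into a
// property-keyed table. Triples are plain { subject, property, object } objects.
function buildPropertyGroup(triples, subject) {
  var values = Object.create(null);
  triples.forEach(function (t) {
    if (t.subject !== subject) {
      return;
    }
    if (!values[t.property]) {
      values[t.property] = [];
    }
    values[t.property].push(t.object);
  });
  return {
    properties: Object.keys(values),
    // Unknown properties yield an empty list rather than undefined.
    get: function (property) {
      return values[property] ? values[property].slice() : [];
    }
  };
}

var FOAF = "http://xmlns.com/foaf/0.1/";
var ivanSubject = "http://www.ivan-herman.net/foaf#me";
var data = [
  { subject: ivanSubject, property: FOAF + "name", object: "Herman Iván" },
  { subject: ivanSubject, property: FOAF + "name", object: "Ivan Herman" },
  { subject: ivanSubject, property: FOAF + "title", object: "Dr." },
  { subject: "http://www.example.com#manu", property: FOAF + "name", object: "Manu Sporny" }
];

var group = buildPropertyGroup(data, ivanSubject);
var groupNames = group.get(FOAF + "name");  // ["Herman Iván", "Ivan Herman"]
var missing = group.get(FOAF + "homepage"); // []
```

Returning a copy from get() keeps callers from modifying the group's internal lists.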
The RDFa Working Group is currently discussing the best mechanism to enable access to the DOMNode that contains a particular subject, predicate or object. While several mechanisms have been proposed, none of them is easy or straightforward to use. This mechanism will be modified heavily in the next version of the document in order to allow easier access to the DOMNode associated with a particular piece of structured data:
var people = document.getItemsByType("foaf:Person");
for (var i = 0; i < people.length; i++) {
  people[i].info("foaf:name", "source")[0].style.border = "1px solid blue";
}
A query can not only retrieve basic PropertyGroups, but can also specify how PropertyGroups are built by using PropertyGroup Templates.
There has been a complaint that this section comes from out of nowhere. The purpose of this section is to describe that PropertyGroups can be mapped to native language objects to ease development. We may need to elaborate more on this at this point in the document to help integrate this section with the flow of the document.
For example, assume our source document contains the following event, marked up using the Google Rich Snippet Event format (example taken from the Rich Snippet tutorial, and slightly modified):
<div prefix="v: http://rdf.data-vocabulary.org/#" typeof="v:Event">
  <a rel="v:url" href="http://amyandtheredfoxies.example.com/events"
     property="v:summary">Tour Info: Amy And The Red Foxies</a>
  <span rel="v:location">
    <a typeof="v:Organization" rel="v:url" href="http://www.kammgarn.de/"
       property="v:name">Kammgarn</a>
  </span>
  <div rel="v:photo"><img src="foxies.jpg"/></div>
  <span property="v:summary">Hey K-Town, Amy And The Red Foxies will rock Kammgarn in October.</span>
  When:
  <span property="v:startDate" content="20091015T19:00">15. Oct., 7:00 pm</span>-
  <span property="v:endDate" content="20091015T21:00">9:00 pm</span>
  Category: <span property="v:eventType">concert</span>
</div>
To query for all Event PropertyGroups, we can do this:
var ar = query.select({ "rdf:type": "http://rdf.data-vocabulary.org/#Event" });
However, to build a special PropertyGroup that contains the summary, start date and end date, we need only do this:
var events = query.select(
  { "rdf:type": "http://rdf.data-vocabulary.org/#Event" },
  { "rdf:type": "type",
    "v:summary": "summary",
    "v:startDate": "start",
    "v:endDate": "end" }
);
The second parameter is a PropertyGroup Template. Each key-value pair specifies an IRI to map to an attribute in the resulting PropertyGroup object.
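As a non-normative illustration, the mapping performed by a template can be sketched as a plain function. The applyTemplate name and the input shape (an object keyed by property, with one value per property) are assumptions made for this sketch; a conforming implementation would resolve CURIEs and draw the values from the DataStore.

```javascript
// Hypothetical helper: copy each templated property into a new object
// under the attribute name that the template assigns to it.
function applyTemplate(propertyValues, template) {
  var result = {};
  for (var property in template) {
    if (Object.prototype.hasOwnProperty.call(template, property) &&
        Object.prototype.hasOwnProperty.call(propertyValues, property)) {
      result[template[property]] = propertyValues[property];
    }
  }
  return result;
}

// Assumed input: the literal values of the event from the markup above.
var eventSketch = applyTemplate(
  {
    "v:summary": "Tour Info: Amy And The Red Foxies",
    "v:startDate": "20091015T19:00",
    "v:endDate": "20091015T21:00",
    "v:eventType": "concert"
  },
  { "v:summary": "summary", "v:startDate": "start", "v:endDate": "end" }
);
```

Properties that are not named in the template, such as v:eventType here, are simply not exposed as attributes in this sketch.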
Exposing the embedded data in each PropertyGroup makes it easy to create an HTML anchor that will allow users to add the event to their Google Calendar, as follows:
var anchor, button, event, i;
for (i = 0; i < events.length; i++) {
  // Get the PropertyGroup
  event = events[i];

  // Create the anchor
  anchor = document.createElement("a");

  // Point to Google Calendar
  anchor.href = "http://www.google.com/calendar/event?action=TEMPLATE"
    + "&text=" + event.summary
    + "&dates=" + event.start + "/" + event.end;

  // Add the button
  button = document.createElement("img");
  button.src = "http://www.google.com/calendar/images/ext/gc_button6.gif";
  anchor.appendChild(button);

  // Add the link and button to the DOM object
  // NOTE: The next call will likely change in the next version of the RDF
  // API specification as it is too difficult to use for most developers.
  event.info("rdf:type", "source")[0].appendChild(anchor);
}
The result will be that the event has an HTML a element at the end (and any Event on the page will follow this pattern):
<div vocab="http://rdf.data-vocabulary.org/#" typeof="Event">
  . . .
  <a href="http://www.google.com/calendar/event?action=TEMPLATE&text=Hey+K-Town,+Amy+And+The+Red+Foxies+will+rock+Kammgarn+in+October.&dates=20091015T1900Z/20091015T2100Z">
    <img src="http://www.google.com/calendar/images/ext/gc_button6.gif" />
  </a>
</div>
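Summaries that contain characters such as & or spaces would need to be escaped before being placed in a query string. The following non-normative sketch shows one way to do that with the standard encodeURIComponent function. The buildCalendarUrl name is hypothetical, and the query parameter names are taken from the example above rather than from any Google Calendar documentation.

```javascript
// Hypothetical helper: build the calendar link from a templated event
// object that exposes summary, start and end attributes.
function buildCalendarUrl(event) {
  return "http://www.google.com/calendar/event?action=TEMPLATE" +
    "&text=" + encodeURIComponent(event.summary) +
    "&dates=" + encodeURIComponent(event.start) +
    "/" + encodeURIComponent(event.end);
}

var calendarUrl = buildCalendarUrl({
  summary: "Rock & Roll",
  start: "20091015T1900Z",
  end: "20091015T2100Z"
});
```

Note that encodeURIComponent encodes a space as %20 rather than +, so the resulting link differs in form from the markup shown above.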
For more detailed information about queries see the DataQuery interface.
The DataQuery interface provides a means to query a DataStore. While this interface provides a simple mechanism for querying a DataStore for RDFa, it is expected that developers will implement other query interfaces that conform to this DataQuery interface for languages like SPARQL or other domain-specific languages.
[NoInterfaceObject]
interface DataQuery {
    attribute DataStore store;
    Sequence<PropertyGroup> select (in Object? query, in optional Object template);
};
select
Retrieves a list of PropertyGroups that match the given selection criteria.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
query | Object | ✔ | ✘ | An associative array containing properties as keys and objects to match as values. If the query is null, every item in the DataStore that the query is associated with must be returned. |
template | Object | ✘ | ✔ | A template describing the attributes to create in each PropertyGroup that is returned. The template is an associative array containing properties as keys and attribute names that should be created in the returned PropertyGroup as values. |
Sequence<PropertyGroup>
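The matching rule described for the query parameter can be sketched, non-normatively, over an in-memory list. The selectSketch name and the item shape (a plain object mapping each property to an array of values) are assumptions made for illustration and are not part of the DataQuery interface.

```javascript
// Hypothetical matcher: an item is selected when, for every key in the
// query, the item holds the requested value for that property.
// A null query selects every item.
function selectSketch(items, query) {
  if (query === null) {
    return items.slice();
  }
  return items.filter(function (item) {
    return Object.keys(query).every(function (property) {
      var values = item[property] || [];
      return values.indexOf(query[property]) !== -1;
    });
  });
}

// Assumed sample data: two items with different rdf:type values.
var sketchItems = [
  { "rdf:type": ["http://rdf.data-vocabulary.org/#Event"], "v:eventType": ["concert"] },
  { "rdf:type": ["http://rdf.data-vocabulary.org/#Organization"], "v:name": ["Kammgarn"] }
];
var sketchEvents = selectSketch(sketchItems, {
  "rdf:type": "http://rdf.data-vocabulary.org/#Event"
});
var sketchAll = selectSketch(sketchItems, null);
```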
The RDFa API is designed to provide a small, powerful set of interfaces that a developer may use to retrieve RDF triples from a Web document. The core interfaces were described in the previous two sections. This section focuses on the final RDFa API that most developers will utilize to generate the objects that are described in the RDF Interfaces and the Structured Data Interfaces sections. The following API is provided by this specification:
The following section describes all of the extensions that are necessary to enable manipulation of structured data within a Web Document.
[Supplemental, NoInterfaceObject]
interface RDFaDocument {
    readonly attribute DocumentData data;
    DocumentData createDocumentData ();
    Sequence<PropertyGroup> getItemsByType (in DOMString type);
    PropertyGroup getItemBySubject (in DOMString subject);
    Sequence<PropertyGroup> getItemsByProperty (in DOMString property, in DOMString value);
    NodeList getElementsByType (in DOMString type);
    NodeList getElementsBySubject (in DOMString subject);
    NodeList getElementsByProperty (in DOMString property, in DOMString value);
};
data of type DocumentData, readonly
The DocumentData interface is useful for extracting and storing data that is associated with the Document.
createDocumentData
Creates a DocumentData object and returns it. The object that is returned must have the store, context, parser and query attributes initialized to sensible defaults that would allow the immediate extraction of RDFa data from the current document by calling DocumentData.parser.parse(document).
DocumentData
getElementsByProperty
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
property | DOMString | ✘ | ✘ | A DOMString representing an IRI-based property. The string can either be a full IRI or a CURIE. |
value | DOMString | ✘ | ✘ | A DOMString representing the value to match against. |
NodeList
getElementsBySubject
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
subject | DOMString | ✘ | ✘ | A DOMString representing an IRI-based subject. The string can either be a full IRI or a CURIE. |
NodeList
getElementsByType
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | A DOMString representing an rdf:type to select against. |
NodeList
getItemBySubject
Retrieves a PropertyGroup object based on its subject.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
subject | DOMString | ✘ | ✘ | A DOMString representing an IRI-based subject. The string can either be a full IRI or a CURIE. |
PropertyGroup
getItemsByProperty
Retrieves a list of PropertyGroup objects based on the values of a property.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
property | DOMString | ✘ | ✘ | A DOMString representing an IRI-based property. The string can either be a full IRI or a CURIE. |
value | DOMString | ✘ | ✘ | A DOMString representing the value to match against. |
Sequence<PropertyGroup>
getItemsByType
Retrieves a list of PropertyGroup objects based on their rdf:type property.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | A DOMString representing an rdf:type to select against. |
Sequence<PropertyGroup>
Document implements RDFaDocument;
RDFaDocument
.
If the RDFa API is implemented in a DOM environment and a DOMImplementation interface is provided, the following additional requirements for the hasFeature() method must be met:
interface DOMImplementation {
boolean hasFeature (in DOMString feature, in DOMString version);
};
hasFeature
Returns true for a feature string of "RDFaAPI" and a version string of "1.1".
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
feature | DOMString | ✘ | ✘ | The feature string to use when checking to see if the DOM environment exposes all of the RDFa API attributes and methods. |
version | DOMString | ✘ | ✘ | The version string to use when checking to see if the DOM environment exposes all of the RDFa API attributes and methods. |
boolean
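A script might use this feature string for detection along the following lines. This is a non-normative sketch: the supportsRDFaAPI name is hypothetical, and the extra check for getItemsByType is an assumption added because some DOM implementations return true from hasFeature() for any feature string.

```javascript
// Hypothetical detection helper: true only when the environment reports
// the RDFaAPI 1.1 feature and also exposes one of the RDFaDocument methods.
function supportsRDFaAPI(doc) {
  var impl = doc && doc.implementation;
  var reportsFeature = !!impl &&
    typeof impl.hasFeature === "function" &&
    impl.hasFeature("RDFaAPI", "1.1") === true;
  return reportsFeature && typeof doc.getItemsByType === "function";
}
```

In a browser, the helper would be called as supportsRDFaAPI(document) before any of the methods defined in this specification are used.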
The DocumentData interface is used to create structured-data related context, storage, parsing and query objects.
interface DocumentData {
    attribute DataStore store;
    attribute DataContext context;
    attribute DataParser parser;
    attribute DataQuery query;
    DataContext createContext ();
    DataStore createStore (in optional DOMString type);
    DataParser createParser (in DOMString type, in DataStore store);
    DataQuery createQuery (in DOMString type, in DataStore store);
};
context of type DataContext
The DataContext for the document.
parser of type DataParser
The DataParser for the document.
query of type DataQuery
The DataQuery for the document.
store of type DataStore
The DataStore for the document.
createContext
Creates a DataContext and returns it.
DataContext
createParser
Creates a DataParser of the given type and returns it.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | The type of DataParser to create. A value of "rdfa" must be accepted for all conforming implementations of this specification. |
store | DataStore | ✘ | ✘ | The DataStore to associate with the DataParser. |
DataParser
createQuery
Creates a DataQuery for the given store.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✘ | The type of query to create for the given store. A value of "rdfa" must be accepted for all conforming implementations of this specification. Implementations may provide alternative query interfaces, such as SPARQL, SQL, HQL, GQL, or other query languages to enable innovative new ways of querying the underlying storage mechanism. |
store | DataStore | ✘ | ✘ | The DataStore to associate with the DataQuery. |
DataQuery
createStore
Creates a DataStore and returns it. If the type is not specified, a DataStore must be created and returned. Alternatively, developers may provide other DataStore implementations such as persisted triple stores, quad stores, distributed graph stores and other more advanced storage mechanisms. The type determines the underlying DataStore that will be created.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
type | DOMString | ✘ | ✔ | The type of DataStore to create. A value of "triple" must be accepted for all conforming implementations of this specification. If the type is omitted, a value of "triple" must be assumed. |
DataStore
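The defaulting rule for the type parameter can be sketched, non-normatively, as a small factory. The storeFactories registry, the createStoreSketch name, the shape of the returned object, and the choice to throw on an unknown type are all assumptions; this specification does not say how an unsupported type is reported.

```javascript
// Hypothetical registry of store constructors keyed by type name.
var storeFactories = {
  triple: function () {
    return { type: "triple", triples: [] };
  }
};

// Hypothetical factory: an omitted type is treated as "triple".
function createStoreSketch(type) {
  var name = (type === undefined) ? "triple" : type;
  if (!Object.prototype.hasOwnProperty.call(storeFactories, name)) {
    // Assumption: unknown types are rejected with an exception.
    throw new Error("Unsupported DataStore type: " + name);
  }
  return storeFactories[name]();
}

var defaultStoreSketch = createStoreSketch();
```

An implementation offering additional back ends, such as a quad store, could register further entries in a table of this kind.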
An important goal of the RDFa API is to help Web developers filter the set of RDF triples in a document down to only the ones that interest them. This section covers pattern-based filters. Pattern filters match against one or more of the subject, property, or object components of an RDF triple. This section also introduces the interfaces for the other filter types.
Filter criteria may also be defined by the developer as a filter function.
The RDFTripleFilter is a callable function that determines whether an RDFTriple should be included in the set of output triples.
[NoInterfaceObject, Callback, Null=Null]
interface RDFTripleFilter {
    boolean match (in RDFTriple triple);
};
match
Returns true if the input RDFTriple should be included in the output set, or false if the input RDFTriple should be rejected from the output set.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
triple | RDFTriple | ✘ | ✘ | The triple to test against the filter. |
boolean
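The interplay of a pattern filter and a filter function can be sketched, non-normatively, over a plain array. The matchesPattern and filterTriples names are hypothetical, and triples are modelled here as simple objects with string subject, property and object fields, which is a simplification of the RDFTriple interface.

```javascript
// Hypothetical pattern test: every component named in the pattern must
// equal the corresponding component of the triple.
function matchesPattern(triple, pattern) {
  return ["subject", "property", "object"].every(function (key) {
    return pattern[key] === undefined || pattern[key] === triple[key];
  });
}

// Keep a triple only when it matches the pattern and, if a filter
// function is supplied, that function returns true for it.
function filterTriples(triples, pattern, filterFn) {
  return triples.filter(function (triple) {
    return matchesPattern(triple, pattern || {}) &&
      (!filterFn || filterFn(triple) === true);
  });
}

// Assumed sample data modelled on the Albert Einstein markup.
var einstein = "http://dbpedia.org/resource/Albert_Einstein";
var sketchTriples = [
  { subject: einstein, property: "http://xmlns.com/foaf/0.1/name", object: "Albert Einstein" },
  { subject: einstein, property: "http://dbpedia.org/property/dateOfBirth", object: "1879-03-14" }
];
var foafTriples = filterTriples(sketchTriples, { subject: einstein }, function (triple) {
  return triple.property.indexOf("http://xmlns.com/foaf/0.1/") === 0;
});
```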
This section is non-normative.
The examples below use the following HTML code:
<div id="start" about="http://dbpedia.org/resource/Albert_Einstein">
  <span property="foaf:name">Albert Einstein</span>
  <span property="dbp:dateOfBirth" datatype="xsd:date">1879-03-14</span>
  <div rel="dbp:birthPlace" resource="http://dbpedia.org/resource/Germany">
    <span property="dbp:conventionalLongName">Federal Republic of Germany</span>
  </div>
</div>
The following examples demonstrate the use of document.data.store.filter() and document.data.parser.iterate() in ECMAScript.
// create a filter function that filters on triples with properties in the
// foaf namespace.
function myFilter(element, subject, property, object) {
  // test the property IRI, since the filter selects on the foaf namespace
  return property.value.search(/http:\/\/xmlns\.com\/foaf\/0\.1/) >= 0;
}

// start filtering at the element with the id attribute whose value is "start"
// using the DataStore's filter() method
var store = document.data.store.filter({}, document.getElementById("start"), myFilter);
store.forEach(function (index, subject, property, object) {
  alert(object);
});
// The code above will display one alert dialog box containing the following
// text: "Albert Einstein".

// start filtering at the element with the id attribute whose value is "start"
// using the DataParser's iterate() method
var iter = document.data.parser.iterate({}, document.getElementById("start"), myFilter);
for (var triple = iter.next(); triple != null; triple = iter.next()) {
  alert(triple.object);
}
// The code above will display one alert dialog box containing the following
// text: "Albert Einstein".
The RDFa API must be initialized before the Web developer has access to any of the methods that are defined in this specification. To initialize the API environment in a Browser-based environment, an implementor must do the following:
Some platforms may merge one or more of these steps as a convenience to developers. For example, a browser that supports this API may carry out the first four steps when a document loads, and then expose a Query interface to allow developers to access the PropertyGroups. Some approaches to this will be discussed in the next section, but before we look at those, we'll give a brief overview of how each of these phases would normally be accomplished.
To create a store, the createStore method is called:
document.data.store = document.data.createStore();
The store object created supports the Store interfaces providing methods to add metadata to the store. These methods are used during parsing to populate the store but they can also be used directly to add additional information. Examples of this are shown later.
Once a store has been created, the implementor should create a default parser:
document.data.parser = document.data.createParser("rdfa", store);
Note that an implementation may support many types of parser, so the specific parser required needs to be specified. For example, an implementation may also support a Microformats hCard parser:
var parser = document.data.createParser("hCard", store);
Implementations may also support different versions of a parser, for example:
var parser1 = document.data.createParser("rdfa1.0", store);
var parser2 = document.data.createParser("rdfa1.1", store);
Probably should have a URI to identify parsers rather than a string, since not only are there many different Microformats, but also, people may end up wanting to add parsers for RDF/XML, different varieties of JSON, and so on. However, if we treat the parameter here as a CURIE, then we can avoid having long strings. If we do that, then the version number would need to be elided with the language type: "rdfa1.0", "rdfa1.1", and so on.
Once we have a parser, we can use it to extract information from sources that contain embedded data. In the following example we extract data from the Document object:
parser.parse( document );
Since the parser is connected to a store, the PropertyGroups obtained from processing the document are now available in the variable document.data.store.
A store can be used more than once for parsing. For example, if we wanted to apply an hCard Microformat parser to the same document, and put the extracted data into the same store, we could do this:
var store = document.data.createStore();
document.data.createParser("rdfa", store).parse(document);
document.data.createParser("hCard", store).parse(document);
The store will now contain PropertyGroups from the RDFa parsing, as well as PropertyGroups from the hCard parsing.
If the developer wishes to reuse the store but clear it first, then the clear() method on the DataStore interface can be used.
Diagram: Show the connection between a PropertyGroup and the DOM.
Query objects are used to interrogate stores and obtain a list of DOM objects that are linked to PropertyGroups. Since there are a number of languages and techniques that can be used to express queries, we need to specify the type of query object that we'd like:
var query = document.data.createQuery("rdfa", store);
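Taken together, the phases above can be gathered into one function. The following is a non-normative sketch: the initRDFaSketch name is hypothetical, and the doc argument is assumed to expose a data object with the createStore, createParser and createQuery methods described in this specification.

```javascript
// Hypothetical helper that runs the initialization phases in order and
// returns the query object for further use.
function initRDFaSketch(doc) {
  var data = doc.data;

  // 1. Create the store and make it the document's default store.
  var store = data.createStore();
  data.store = store;

  // 2. Create an RDFa parser that is bound to the store.
  var parser = data.createParser("rdfa", store);
  data.parser = parser;

  // 3. Parse the document so that the store is populated.
  parser.parse(doc);

  // 4. Create a query object over the populated store.
  var query = data.createQuery("rdfa", store);
  data.query = query;

  return query;
}
```

In a browser that has not already performed these steps, the helper would be called as initRDFaSketch(document).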
This section is non-normative.
The current version of the RDFa API focuses on filtering RDF triples. It also provides methods for filtering DOM Nodes that contain certain types of RDF triples.
The RDFa Working Group is currently discussing whether or not to include the following advanced functionality:
At the time of publication, the members of the RDFa Working Group were: