Subsets

From RDFa Working Group Wiki

Safe Subsets of RDFa

10 November 2010

Editor:
Toby Inkster, Invited Expert

Abstract

Certain consumers of RDFa have indicated that they wish to offer only partial support. While the W3C has recommended full RDFa, the RDFa Working Group recognises demand for simpler, consumer-specific subsets.

Some subsets of RDFa are safer than others, however. For example, the following snippet:

	<div prefix="foaf: http://xmlns.com/foaf/0.1/"
		about="#alice" rev="foaf:knows">
		<span property="foaf:name">Bob</span>
	</div>
	

This means that Alice is known by somebody named "Bob". However, if a consumer chooses to ignore the @rev attribute, it will interpret the snippet as stating that Alice's name is "Bob".

A safe subset of RDFa support is a subset that does not risk the consumer drawing conclusions from the document that could not be drawn by a full RDFa consumer.
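
As a rough illustration of this definition, the sketch below models each consumer's output as a Python set of triples and treats a subset as safe for a given document when everything it emits is also emitted by a full consumer. Plain set inclusion is a simplifying assumption: it ignores blank node relabelling and RDF entailment. The triples are written by hand from the Alice and Bob snippet, and the names `full`, `without_rev`, `name_only` and `is_safe_for` are illustrative only.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# Triples a full RDFa consumer draws from the Alice/Bob snippet
# ("#alice" is left unresolved for brevity).
full = {
    ("_:b0", FOAF + "knows", "#alice"),
    ("_:b0", FOAF + "name", "Bob"),
}

# Triples drawn by a consumer that ignores @rev.
without_rev = {("#alice", FOAF + "name", "Bob")}

# Triples drawn by a consumer that keeps only foaf:name triples.
name_only = {t for t in full if t[1] == FOAF + "name"}


def is_safe_for(subset_triples, full_triples):
    """True if the subset consumer emitted nothing a full consumer would not."""
    return subset_triples <= full_triples
```

Under these assumptions, `is_safe_for(without_rev, full)` is false because the consumer has invented a name for Alice, while `is_safe_for(name_only, full)` is true.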

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document was developed by the RDFa Working Group. This document has no formal status (it is neither a W3C Recommendation nor a Working Group Note).

Safe Subsets

Filtered Triples

Filtering triples is the usual way of consuming a subset of RDF, and it applies easily to RDFa. The consumer processes the entire RDFa document but discards any triples that it does not understand or chooses to ignore.

To filter triples, process the RDFa document according to the RDFa syntax. Whenever processing results in a triple being emitted, the processor checks whether the triple is a supported one (usually by examining the predicate URI, and sometimes the subject or object in conjunction with the predicate). If the triple is not supported, the processor behaves as if it had never been emitted.
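
The filtering step can be sketched as follows. This is a minimal illustration, assuming that a full RDFa processor has already produced an iterable of (subject, predicate, object) tuples; the set `SUPPORTED_PREDICATES` and the function `filter_triples` are hypothetical names, not part of any RDFa library.

```python
# Hypothetical set of predicate URIs this consumer understands.
SUPPORTED_PREDICATES = {
    "http://xmlns.com/foaf/0.1/name",
    "http://xmlns.com/foaf/0.1/knows",
}


def filter_triples(triples, supported=SUPPORTED_PREDICATES):
    """Yield only triples whose predicate is supported.

    Unsupported triples are dropped, as if they had never been emitted.
    """
    for subject, predicate, obj in triples:
        if predicate in supported:
            yield (subject, predicate, obj)


emitted = [
    ("_:b0", "http://xmlns.com/foaf/0.1/name", "Bob"),
    ("_:b0", "http://purl.org/dc/terms/created", "2010-11-10"),
]
kept = list(filter_triples(emitted))
```

A consumer that also needs to inspect the subject or object would extend the test inside the loop; the parsing itself is unchanged.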

Fixed Sets of Profiles

This subset of RDFa can offer performance improvements over full RDFa. The consumer offers support for only an enumerated subset of RDFa profile URIs. Support for these would typically be hard-coded, avoiding the need for costly network operations.

To offer only a fixed set of profiles, process the RDFa document according to the RDFa syntax. When a profile needs to be loaded and it is not in the supported set of profile URIs, the consumer behaves as if the profile could not be dereferenced; that is, it ignores the current element and its children.

The default profiles of any RDFa host languages supported by the consumer (e.g. the default XHTML profile document) must be included in this fixed set.
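
A hard-coded profile table might look like the sketch below. The profile URI and its prefix mapping are placeholders invented for illustration, and `load_profile` is a hypothetical helper; a real consumer would list the default profiles of its supported host languages here as well.

```python
# Hypothetical hard-coded profiles: profile URI -> prefix mappings.
HARD_CODED_PROFILES = {
    "http://example.org/profiles/people": {
        "foaf": "http://xmlns.com/foaf/0.1/",
    },
}


def load_profile(uri):
    """Return the prefix mappings for a supported profile, without any network access.

    Returns None for every other URI; the caller should then behave as if the
    profile could not be dereferenced and skip the current element and its children.
    """
    return HARD_CODED_PROFILES.get(uri)
```

Because the lookup never touches the network, an unsupported profile costs no more than a dictionary miss.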

Ignoring XML Namespaces

The RDFa 1.1 Core syntax deprecates XML namespaces as a CURIE prefix mapping method. A consumer may prefer not to support XML namespaces as a CURIE mapping method. This subset seems to offer little performance improvement over full RDFa parsing.

To ignore XML namespaces, when an element is encountered that contains one or more attributes from the XMLNS namespace (http://www.w3.org/2000/xmlns/), behave as if a non-dereferencable profile URI had been encountered on that element, ignoring that element and its children. Do not consider the default namespace attribute (@xmlns itself) as part of this treatment.
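
One possible way to detect such elements is sketched below using Python's standard `xml.dom.minidom`, which keeps namespace declarations as ordinary attributes. The function name `declares_prefix_via_xmlns` is an assumption made for this example; it checks attribute names rather than namespace URIs, which is a simplification.

```python
from xml.dom import minidom


def declares_prefix_via_xmlns(element):
    """True if the element carries at least one xmlns:prefix declaration.

    A bare xmlns="..." (the default namespace) is deliberately not counted.
    """
    return any(name.startswith("xmlns:") for name in element.attributes.keys())


doc = minidom.parseString(
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    '<p xmlns:foaf="http://xmlns.com/foaf/0.1/" property="foaf:name">Bob</p>'
    '<p>No declarations here</p>'
    '</div>'
)
root = doc.documentElement
first_p, second_p = root.getElementsByTagName("p")
```

A consumer would skip `first_p` and its children, while `root` and `second_p` would be processed normally.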

Ignoring Particular Elements

RDFa allows data to be embedded anywhere in a document. For speed of processing and clarity of markup, however, it may be beneficial to place data only in <head> and ignore data in <body>. Conversely, a consumer may wish to ensure that the data it uses is visible to humans, and so ignore <head>.

To ignore particular elements, process RDFa as usual, but each time a new current element is processed, check if the element is one that you want to ignore (this might be done by inspecting the tag name) and if so behave as if a non-dereferencable profile URI had been encountered on that element, ignoring that element and its children.
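
The tree walk might be sketched like this with Python's standard `xml.etree.ElementTree`. The sample document has no namespace so that tag names compare as plain strings; with namespaced XHTML the tags would need to be written in ElementTree's `{namespace}name` form. `IGNORED_TAGS` and `visible_elements` are names invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical choice: this consumer only trusts human-visible data.
IGNORED_TAGS = {"head"}


def visible_elements(element, ignored=IGNORED_TAGS):
    """Yield elements in document order, skipping ignored elements and their subtrees."""
    if element.tag in ignored:
        return  # behave as if the element and its children were not there
    yield element
    for child in element:
        yield from visible_elements(child, ignored)


page = ET.fromstring(
    "<html>"
    "<head><meta property='dc:title' content='Hidden'/></head>"
    "<body><span property='dc:creator'>Alice</span></body>"
    "</html>"
)
tags = [el.tag for el in visible_elements(page)]
```

Only the elements that the generator yields would be handed to the RDFa processing steps.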

Ignoring Resource Types

Resource types may be indicated using the value of the @typeof attribute. A consumer may prefer to ignore these. This subset seems to offer little performance improvement over full RDFa parsing.

To ignore resource types, when @typeof is encountered during processing, treat it as if the attribute were present but empty.
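
In code this amounts to blanking the attribute value before the element is processed, rather than deleting the attribute. The sketch assumes that an element's attributes are available as a plain dictionary; `blank_typeof` is a hypothetical helper.

```python
def blank_typeof(attributes):
    """Return a copy of the attributes with any @typeof value emptied.

    The attribute is kept (present but empty) instead of being removed.
    """
    cleaned = dict(attributes)
    if "typeof" in cleaned:
        cleaned["typeof"] = ""
    return cleaned
```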

Ignoring Data Types

Data types are indicated using the value of the @datatype attribute. A consumer may prefer to ignore these. This subset seems to offer little performance improvement over full RDFa parsing.

To ignore data types, when @datatype is encountered during processing, expand any CURIE it contains into a URI. If this URI is not http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral then behave as if @datatype were present but empty. If the URI is http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral then either honour that data type, or ignore the element and its children.
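
The decision described above can be sketched as a small function. It assumes that the CURIE in @datatype has already been expanded into a URI; the name `effective_datatype`, its `honour_xml_literal` flag and the use of `None` as a "skip this element" signal are choices made for this illustration.

```python
XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


def effective_datatype(expanded_uri, honour_xml_literal=True):
    """Decide how a consumer that ignores data types treats an expanded @datatype URI.

    Returns "" to mean "behave as if @datatype were present but empty",
    XML_LITERAL if that data type is honoured, or None to tell the caller
    to ignore the element and its children.
    """
    if expanded_uri != XML_LITERAL:
        return ""
    if honour_xml_literal:
        return XML_LITERAL
    return None
```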

Safe Ways of Implementing Otherwise Unsafe Subsets

In a Controlled Environment

If the producer and consumer of RDFa data are in fact the same person or organisation, then it may be safe to use otherwise unsafe subsets, for example on an intranet.

Publishers should be careful when republishing data that relies on such an unsafe subset of RDFa on the public Web.

Using a Proprietary Profile URI

People and organisations wishing to consume an unsafe subset of RDFa from the public Web must define a non-dereferencable non-HTTP profile URI to identify this subset. For example, Example Corp may define the URI:

tag:example.com,2010:rdfa

Publishers wishing to target this consumer must include this profile URI on their document's root element. When the consumer encounters a document that does not contain this profile URI, it must process the document as full RDFa, process it as a safe subset, or ignore it altogether.
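
The consumer's check might be sketched as below. It assumes that the root element's @profile value is a whitespace-separated list of URIs and that the consumer receives it as a string (or `None` when the attribute is absent); `SUBSET_PROFILE` reuses the Example Corp URI above, while `choose_processing_mode` and its return values are hypothetical.

```python
# The proprietary, non-dereferencable profile URI from the example above.
SUBSET_PROFILE = "tag:example.com,2010:rdfa"


def choose_processing_mode(root_profile_attr):
    """Pick a processing mode from the root element's @profile value."""
    tokens = (root_profile_attr or "").split()
    if SUBSET_PROFILE in tokens:
        return "proprietary-subset"
    # Otherwise: full RDFa, a safe subset, or ignore the document altogether.
    return "full-safe-or-ignore"
```

Matching whole tokens rather than substrings avoids treating a longer URI that merely contains the tag URI as an opt-in.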