W3C

Embedding Accessibility Role and State Metadata in HTML Documents

Editor's Draft December 2006

This version:
http://www.w3.org/WAI/PF/adaptable/HTML4/embedding-20061212
Latest version:
http://www.w3.org/WAI/PF/adaptable/HTML4/
Previous version:
http://www.w3.org/WAI/PF/adaptable/HTML4/embedding-20060318
Authors:
Mark Pilgrim, IBM
Aaron Leventhal, IBM
Becky Gibson, IBM

Abstract

This document summarizes the best current practice for embedding accessibility roles and states in HTML documents. In summary, HTML documents SHOULD define a data format to store such role and state metadata in the class attribute of individual HTML elements. HTML documents with such embedded metadata should include an ECMAScript library that understands this data format and parses the role and state metadata for the purpose of copying it to the appropriate namespaced attributes defined in [ROLEMOD] and [WAISTATE].

Status of This Document

This document is offered to the World Wide Web Consortium (W3C) as a suggestion for embedding accessibility roles and states in HTML documents.

This draft represents the current thinking within IBM with regard to creating accessible HTML documents and HTML-based web applications.

W3C has had no editorial control over the preparation of this Note. This document is a work in progress and may be updated, replaced, or rendered obsolete by other documents at any time.

A list of current W3C technical documents can be found at the Technical Reports page.

Table of Contents

  1. Introduction
  2. Terms and Definitions
  3. Technique for embedding accessible role and state metadata in HTML
  4. Examples of the technique
  5. Limitations of the technique
  6. Summary
  7. References

Introduction

User agents parse [HTML4] and XHTML family documents into a Document Object Model [DOM2]. The DOM is a document tree structure with an API to access, modify, and monitor the document. The DOM represents the data model of the document, and the DOM API is the primary means for assistive technologies to retrieve accessibility information about the document.

Assistive technologies depend on extracting semantics from the document in order to represent it to the end user. The HTML specification defines a limited number of semantic elements and attributes, for example data tables within the page, or checkboxes and radio buttons within a web form. Web authors who wish to create richer interfaces generally use generic HTML elements (<div> and <span>) to build custom controls, then script each control's behavior with ECMAScript. These custom controls tend to have poor accessibility, because an assistive technology cannot determine how to represent the generic HTML elements to the end user. The Web Accessibility Initiative (WAI) Protocols and Formats working group has created the Dynamic Accessible Web Content Roadmap [DHTML ROADMAP] to describe the problem and solutions to these accessibility issues.

[ROLEMOD] defines an extensible framework for web authors to declare the role of each element on the page. The WAI Working Group is using this extensible framework to create a taxonomy of roles [WAIROLE] and states [WAISTATE] to describe common user interface controls that are not represented in XHTML. Web authors can declare these roles and states on any element within an XHTML document, to indicate that the element represents a custom interface control. User agents can then retrieve this accessibility metadata from the DOM and expose it to assistive technologies through the accessibility architecture of the underlying operating system.

However, the methods described in [ROLEMOD], [WAIROLE], and [WAISTATE] cannot be used directly in [HTML4] documents, because they rely on namespace features of XHTML that HTML documents do not support. (More precisely, user agents that parse HTML documents do not parse namespaces defined in HTML documents, so the namespaced attributes would not end up in the appropriate place in the DOM.) The DOM API itself, on the other hand, does support namespaces, even in HTML documents. A user agent that supports [DOM2] can execute scripts that add namespaced attributes to the DOM after the document has been parsed.

This opens the door to building a bridge between the limitations of HTML and the new capabilities of XHTML. If accessibility metadata could be declared in some way within an HTML document, an ECMAScript library could use the DOM API to retrieve the metadata and copy it into the appropriate role and state namespaces at runtime. This approach has the significant advantage of not requiring any changes to user agents which already support the [WAIROLE] and [WAISTATE] modules in XHTML.

Terms and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

Note: The following XHTML definitions are taken from the W3C XHTML Media Types document.

HTML
The HyperText Markup Language. Unless otherwise noted, it is assumed that this refers to documents conforming to the [HTML4] specification.
XHTML
The Extensible HyperText Markup Language. XHTML is not the name of a single, monolithic markup language, but the name of a family of document types which collectively form this markup language.
XHTML Family document type
A document type which belongs to a family of XHTML document types. Such document types include [XHTML1], and XHTML Host Language document types such as XHTML 1.1 [XHTML11] and XHTML Basic [XHTMLBasic]. Elements and attributes in those document types belong to the XHTML namespace (except those from the XML namespace, such as xml:lang), but an XHTML Family document type MAY also include elements and attributes from other namespaces, such as MathML [MathML2].

Technique for embedding accessible role and state metadata in HTML

HTML documents contain a finite set of elements, and they do not allow the use of namespaces for defining additional elements. However, there is one attribute common to virtually every HTML element, the class attribute, which allows for some limited amount of extensibility. The HTML specification states that the class attribute is primarily used for defining style sheet selectors, but it may also be used for general purpose processing by user agents. [HTML4 section 7.5.2] Furthermore, [HTML4] defines the class attribute as a space-separated list of keywords. This provides enough extensibility to define a data format to encode a finite set of accessible roles and states.

Although XHTML allows multiple roles per element, in practice we have not found a need for such flexibility, so we have limited this technique to declaring a single role per element. However, there is a need for each element to be able to describe multiple states. For example, a tree item within a tree widget can be simultaneously selected, checked, and expanded (or deselected, unchecked, and collapsed). Therefore, the data format for defining a single element's accessible attributes must be able to handle multiple states.

Web authors may still wish to make use of class attributes for their primary purpose, declaring style sheet selectors. This technique should not use up the class attribute completely. Therefore, we will define a keyword that acts as a delimiter. The ECMAScript library will be parsing the document to insert accessibility metadata into the appropriate role and state namespaces; this library should ignore keywords within the class attribute until it sees the delimiter keyword. The next keyword after the delimiter represents the accessible role, and further keywords (if any) represent one or more accessible states. This will allow web authors to continue to define one or more classes that act as style sheet selectors, while allowing the necessary flexibility to also define an accessible role and any number of accessible states.

User agents will treat all keywords within the class attribute as style sheet selectors. Web authors should take care not to conflate their existing style sheet selectors with the keywords used by the data format for defining accessibility attributes.
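
The delimiter convention described in this section can be sketched in ECMAScript as follows. This is a minimal illustration under stated assumptions, not the parsing code of any particular library: the function name splitClassKeywords is hypothetical, and the delimiter keyword "axs" is the one used in the HTML 4 example later in this document.

```javascript
// Hypothetical helper (not part of any published library): splits a class
// attribute value into style sheet selectors, a single role, and the raw
// state keywords. Assumes "axs" is the delimiter keyword.
function splitClassKeywords(className) {
  var keywords = className.split(/\s+/);
  var result = { selectors: [], role: null, states: [] };
  var seenDelimiter = false;
  for (var i = 0; i < keywords.length; i++) {
    var keyword = keywords[i];
    if (keyword === "") {
      continue; // produced by leading or trailing whitespace
    }
    if (!seenDelimiter) {
      if (keyword === "axs") {
        seenDelimiter = true;
      } else {
        result.selectors.push(keyword); // ordinary style sheet selector
      }
    } else if (result.role === null) {
      result.role = keyword; // first keyword after the delimiter
    } else {
      result.states.push(keyword); // every further keyword is a state
    }
  }
  return result;
}
```

Keywords that appear before the delimiter are left untouched, so existing style sheet selectors keep working regardless of whether the element carries accessibility metadata.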

Examples of the technique

Using XHTML 1.1

Consider the following XHTML 1.1 document:


<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<head>
<script type="text/javascript" src="slider.js" />
</head>
<body>
<span id="slider" class="myslider myselector2" />
</body>
</html>

In XHTML 1.1, you can add accessible roles and states by declaring the appropriate namespaced attributes. The following example defines the <span> element to be a slider control with a range of 0 to 50 and a current value of 33.


<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"
xmlns:wairole="http://www.w3.org/2005/01/wai-rdf/GUIRoleTaxonomy#"
xmlns:waistate="http://www.w3.org/2005/07/aaa">
<head>
<script type="text/javascript" src="slider.js" />
</head>
<body>
<span id="slider" class="myslider myselector2"
 role="wairole:slider"
 waistate:valuemin="0"
 waistate:valuemax="50"
 waistate:valuenow="33">
</span>
</body>
</html>

Using XHTML 1.0

Although [ROLEMOD] provides the capability of using the role attribute without a namespace in XHTML 1.1 documents, XHTML 1.0 does not support modularization and therefore requires the role attribute to be namespaced.


<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml"
xmlns:xhtml10="http://www.w3.org/1999/xhtml"
xmlns:wairole="http://www.w3.org/2005/01/wai-rdf/GUIRoleTaxonomy#"
xmlns:waistate="http://www.w3.org/2005/07/aaa">
<head>
<script type="text/javascript" src="slider.js" />
</head>
<body>
<span id="slider" class="myslider myselector2"
 xhtml10:role="wairole:slider"
 waistate:valuemin="0"
 waistate:valuemax="50"
 waistate:valuenow="33">
</span>
</body>
</html>

Using HTML 4

HTML documents do not support namespaces, so the required accessibility role and state metadata cannot be included directly in these documents. In HTML 4, you can define the accessible role and accessible states as keywords in the class attribute, then use an ECMAScript library to parse the class keywords and copy them into the appropriate role and state namespaces.


<html lang="en">
<head>
<script type="text/javascript" src="slider.js"></script>
<script type="text/javascript" src="enable.js"></script>
</head>
<body>
<span id="slider" class="myslider myselector2 axs slider valuemin-0 valuemax-50 valuenow-33" tabindex="0" >
</span>
</body>
</html>

After declaring any classes used for styling, the accessible DHTML control defines a class of "axs". This signals to the ECMAScript compatibility library that this element defines additional role and state information. The next class in the space-separated list of classes is the DHTML role. The remaining classes in the class attribute list represent state information.

A state is a key-value pair, separated by a hyphen. For example, to represent that a checkbox control is currently checked, you could use class="axs checkbox checked-true". Because boolean states are so common, a boolean state whose value is true may be represented by the state name alone. So instead of class="axs checkbox checked-true", you can simply say class="axs checkbox checked".
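
The state syntax can be illustrated with a short sketch. The function name parseStateKeyword is hypothetical, and splitting on the first hyphen is an assumption made here so that values which themselves contain a hyphen (such as negative numbers) survive; it presumes that state names contain no hyphen.

```javascript
// Hypothetical helper: turns one state keyword into a name/value pair.
// Assumption: the name ends at the FIRST hyphen; a keyword without a
// hyphen is the shorthand for a boolean state that is true.
function parseStateKeyword(keyword) {
  var hyphen = keyword.indexOf("-");
  if (hyphen === -1) {
    return { name: keyword, value: "true" };
  }
  return {
    name: keyword.substring(0, hyphen),
    value: keyword.substring(hyphen + 1)
  };
}
```

With this sketch, "checked" and "checked-true" yield the same name and value, and "valuenow-33" yields the name "valuenow" with the value "33".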

The ECMAScript library, enable.js, implements a function that parses the role and state information stored in the class attribute of each element and copies it to the correct attributes in the appropriate namespaces defined by [ROLEMOD] and [WAISTATE]. From the assistive technology's point of view, the accessibility information stored in HTML documents will "look" identical to accessibility information stored directly in XHTML documents.
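
The copy step might look roughly like the sketch below. It is an illustration only and makes several assumptions: the function name copyMetadata is hypothetical, the role value is written with a "wairole:" prefix to mirror the XHTML examples above, and the role and states are assumed to have been parsed out of the class attribute already. The namespace URIs are the ones used elsewhere in this document.

```javascript
var XHTML_NS = "http://www.w3.org/1999/xhtml";
var STATE_NS = "http://www.w3.org/2005/07/aaa";

// Hypothetical helper: copies an already-parsed role and list of states
// onto an element as namespaced attributes, using the DOM Level 2 method
// setAttributeNS. "states" is an array of { name, value } objects.
function copyMetadata(element, role, states) {
  // Assumption: the role value carries the "wairole:" prefix, as in the
  // XHTML examples in this document.
  element.setAttributeNS(XHTML_NS, "role", "wairole:" + role);
  for (var i = 0; i < states.length; i++) {
    element.setAttributeNS(STATE_NS, states[i].name, states[i].value);
  }
}
```

In a browser this could be called as copyMetadata(document.getElementById("slider"), "slider", [{ name: "valuenow", value: "33" }]).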

By default the enable.js library does nothing. In order to activate it, you will need to call the initApp function with a root element. initApp will iterate through all the children of the given root element. If no root element is given, initApp will iterate through all elements on the page (i.e. document is the default root element). initApp is generally added as the onload event handler of the body element, <body onload="initApp();">, or assigned to the onload property of the window object within a <script> tag.


<script type="text/javascript">
window.onload=initApp;
</script>

(If your web application needs to register more than one onload event handler, you can use the DOM method window.addEventListener, or one of the widely available cross-browser addEvent functions.)
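
One way to register several load handlers is sketched below. The function name addLoadHandler is hypothetical and the code is not taken from any specific addEvent implementation; it tries the DOM method addEventListener first, then the legacy Internet Explorer method attachEvent, and finally chains the existing onload property.

```javascript
// Hypothetical helper: registers a load handler without overwriting
// handlers that were registered earlier.
function addLoadHandler(target, handler) {
  if (typeof target.addEventListener === "function") {
    // DOM Level 2 Events
    target.addEventListener("load", handler, false);
  } else if (target.attachEvent) {
    // Legacy Internet Explorer event model
    target.attachEvent("onload", handler);
  } else {
    // Last resort: wrap whatever handler is already assigned to onload
    var previous = target.onload;
    target.onload = function (event) {
      if (typeof previous === "function") {
        previous.call(target, event);
      }
      handler.call(target, event);
    };
  }
}
```

A page could then call addLoadHandler(window, function () { initApp(); }); alongside any other handlers it needs.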

Since most browsers do not currently support DHTML Accessibility, the ECMAScript compatibility library doesn't actually enhance accessibility in any browser except Firefox 1.5 or later. But as other browsers add support for DHTML Accessibility, this library should work unmodified, and until then it won't do any harm.

If your web application adds new elements to the page after initApp has been called, and those elements store role and state information in the class attribute, you will need to call initContent with each new element as a parameter, or set the role and state directly on each element.

The ECMAScript compatibility library contains a set of helper functions to map any calls to access role and state attributes. You can use these functions to set the roles and states on newly created elements, or you can modify states on elements at any time. These functions are getAttrNS, setAttrNS, removeAttrNS, and hasAttrNS. The first parameter of each function is the element on which to set the attribute.


var slider = document.getElementById("slider");
setAttrNS(slider, "http://www.w3.org/2005/07/aaa", "valuenow", "75");

On browsers which support [DOM2] namespaces, these functions map to the appropriate DOM namespace APIs getAttributeNS, setAttributeNS, removeAttributeNS, and hasAttributeNS, respectively. For example:


function setAttrNS(elmTarget, uriNamespace, sAttrName, sAttrValue) {
  elmTarget.setAttributeNS(uriNamespace, sAttrName, sAttrValue);
}

On browsers which do not support namespaces, these functions map to the [DOM1] methods getAttribute, setAttribute, and removeAttribute. (Additional work is needed for hasAttrNS since there is no equivalent hasAttribute function.) In order to add a pseudo namespace to these attributes, the namespace value is converted to a short textual representation followed by a colon and prepended to the attribute name.


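// Map each namespace URI to the pseudo-namespace prefix used in attribute names.
var nsMapping = {};
nsMapping["http://www.w3.org/1999/xhtml"] = "xhtml10:";
nsMapping["http://www.w3.org/2005/07/aaa"] = "aaa:";
function setAttrNS(elmTarget, uriNamespace, sAttrName, sAttrValue) {
  // Without namespace support, prepend the prefix to the attribute name.
  elmTarget.setAttribute(nsMapping[uriNamespace] + sAttrName, sAttrValue);
}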
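
The additional work needed for hasAttrNS is not spelled out here. One possible approach, shown purely as an assumption and not as the library's actual code, is to fall back to the [DOM1] method getAttributeNode, which returns null when the attribute is absent. The prefix mapping is repeated so that the sketch stands on its own.

```javascript
// Pseudo-namespace prefixes, repeated from the example in this section.
var nsMapping = {};
nsMapping["http://www.w3.org/1999/xhtml"] = "xhtml10:";
nsMapping["http://www.w3.org/2005/07/aaa"] = "aaa:";

// Sketch of hasAttrNS: uses the DOM Level 2 method when it exists,
// otherwise looks up the prefixed attribute node (an assumed fallback).
function hasAttrNS(elmTarget, uriNamespace, sAttrName) {
  if (typeof elmTarget.hasAttributeNS === "function") {
    return elmTarget.hasAttributeNS(uriNamespace, sAttrName);
  }
  var prefix = nsMapping[uriNamespace] || "";
  // getAttributeNode returns null when no such attribute exists.
  return elmTarget.getAttributeNode(prefix + sAttrName) !== null;
}
```

Using getAttributeNode rather than getAttribute avoids having to guess whether a missing attribute is reported as null or as an empty string.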

Limitations of the technique

There are a number of limitations inherent in this technique, compared to the infinitely extensible framework defined by [ROLEMOD], [WAIROLE], and [WAISTATE].

  1. Only the roles defined in the [WAIROLE] taxonomy are supported. The ECMAScript library has hard-coded knowledge of the role namespace, so any other role taxonomy would require changes to the ECMAScript library. If a web author needed to support roles defined in two different namespaces simultaneously, she would need to make changes to the class attribute data format to declare which namespace to use for each role.
  2. Only the states defined in the [WAISTATE] taxonomy are supported, for the same reason.
  3. Each HTML element can only be associated with a single role. [ROLEMOD] allows multiple roles per element.

Despite these limitations, this technique addresses the most common accessibility requirements of web applications. If web authors find themselves needing more flexibility than this technique offers, they would be well-advised to migrate to XHTML and use its built-in extensible framework.

Summary

This technique demonstrates how role and state information can be added to user interface controls using HTML. The role and state information is embedded into the class attribute so it becomes part of the DOM when the document is loaded. Immediately after the document has loaded, the enable.js ECMAScript library parses the role and state information into the appropriate namespaced attributes defined in [ROLEMOD] and [WAISTATE]. Adding this namespaced role and state information allows the browser to communicate this information to assistive technologies to create fully accessible components.

While the desired method of adding this accessibility metadata is through namespace features of [XHTML11], this technique provides a bridge to enable the creation of fully accessible components in HTML web applications. Currently the only user agent to retrieve this accessibility metadata from the DOM and expose it to assistive technologies is Firefox 1.5 or later. However, the addition of this information does not negatively impact other user agents and provides a standard mechanism for future support of DHTML accessibility.

References

[DHTML ROADMAP]

"Dynamic Accessible Web Content Roadmap", W3C Working Draft, Richard Schwerdtfeger, 11 November, 2005.

Available at: http://www.w3.org/WAI/PF/roadmap/DHTMLRoadmap110505.html

The latest version of the Dynamic Accessible Web Content Roadmap is available at: http://www.w3.org/WAI/PF/roadmap/DHTMLRoadmap110505.html

[DOM1]

"Document Object Model (DOM) Level 1 Specification", W3C Recommendation, Lauren Wood et al., eds., 1 October 1998.

Available at: http://www.w3.org/TR/1998/REC-DOM-Level-1-19981001

[DOM2]

"Document Object Model (DOM) Level 2 Core Specification", W3C Recommendation, A. Le Hors et al., eds., 13 November 2000.

Available at: http://www.w3.org/TR/2000/REC-DOM-Level-2-Core-20001113

A list of DOM Level 2 specifications can be found at: http://www.w3.org/DOM/DOMTR#dom2

[RFC2119]

"Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, S. Bradner, March 1997.

Available at: http://www.rfc-editor.org/rfc/rfc2119.txt

[HTML4]

"HTML 4.01 Specification", W3C Recommendation, D. Raggett, A. Le Hors, I. Jacobs, eds., 24 December 1999.

Available at: http://www.w3.org/TR/1999/REC-html401-19991224

The latest version of HTML 4.01 is available at: http://www.w3.org/TR/html401

The latest version of HTML 4 is available at: http://www.w3.org/TR/html4

[MathML2]

"Mathematical Markup Language (MathML) Version 2.0", W3C Recommendation, D. Carlisle, P. Ion, R. Miner, N. Poppelier, eds., 21 February 2001.

Available at: http://www.w3.org/TR/2001/REC-MathML2-20010221

The latest version is available at: http://www.w3.org/TR/MathML2

[ROLEMOD]

"XHTML Role Attribute Module", W3C Working Draft, Mark Birkbeck, Shane McCarron, Steven Pemberton, T.V. Raman and Richard Schwerdtfeger, 25 July 2006.

Available at: http://www.w3.org/TR/2006/WD-xhtml-role-20060725/

The latest version of the XHTML Role Attribute Module is available at: http://www.w3.org/WAI/PF/GUI/

[WAIROLE]

"Role Taxonomy for Accessible Adaptable Applications Working Draft", W3C Working Draft, Lisa Seeman, 21 February 2006.

Available at: http://www.w3.org/WAI/PF/GUI/roleTaxonomy-20060221.html

The latest version of the Role Taxonomy is available at: http://www.w3.org/WAI/PF/GUI/

[WAISTATE]

"States and Adaptable Properties Module", Draft, 06 November 2005.

Available at: http://www.w3.org/WAI/PF/adaptable/StatesAndProperties-20051106.html

The latest version of the State Module is available at: http://www.w3.org/WAI/PF/adaptable/

[XHTML11]

"XHTML™1.1 - Module-based XHTML", W3C Recommendation, M. Altheim, S. McCarron, eds., 31 May 2001.

Available at: http://www.w3.org/TR/2001/REC-xhtml11-20010531

The latest version is available at: http://www.w3.org/TR/xhtml11

[XHTMLBasic]

"XHTML™ Basic", W3C Recommendation, M. Baker, M. Ishikawa, S. Matsui, P. Stark, T. Wugofski, T. Yamakami, eds., 19 December 2000.

Available at: http://www.w3.org/TR/2000/REC-xhtml-basic-20001219

The latest version is available at: http://www.w3.org/TR/xhtml-basic