This document examines the User Agent Guidelines
Working Group's rationale for establishing which user agent
functionalities must be supported natively by general purpose user
agents and which are expected to be supported by assistive
technologies.
Status of this Document
This document does not represent consensus of the User Agent Accessibility Guidelines Working Group. As of
the date at the top of the document,
it only represents the musings of
Ian Jacobs, who
hopes it will serve as a support document for the User
Agent Accessibility Guidelines as they advance
on the Recommendation track.
The User Agent Accessibility
Guidelines include two types of requirements for general purpose
user agents:
- Requirements for native implementation of some functionalities
(i.e., the user agent implements them without requiring additional
software other than the operating system).
- Communication through Application Programming
Interfaces (APIs). The Guidelines
require user agents to allow read and write access to both
content and user interface controls.
The second set of requirements allows assistive technologies (ATs)
to offer missing functionalities not offered natively. Since the
Guidelines require that ATs have access to both content and UI
controls, in theory, general-purpose user agents have to implement
natively few functionalities related to accessibility since ATs can
fill in the gaps. They might even do a better job since developers of
specialized tools know their target audience well.
The Working Group has decided that general-purpose user agents must
implement some important functionalities natively rather than relying
on Assistive Technologies to shoulder the load. One important reason
for this decision is that some users do not have access to or cannot
afford specialized browsers, so general-purpose user agents must
themselves be accessible.
This document explains which requirements the User
Agent Guidelines Working Group has chosen for general purpose user
agents to implement natively and why.
A user agent is a set of modules that retrieves Web resources,
renders them, allows control of those processes, and communicates with
other software. For instance, a graphical desktop browser might
include:
- A parser and a tree processing engine
- One or more rendering engines that, given a tree and style parameters,
create rendering structures.
- A user interface for providing access to content. This includes:
- Navigation and search mechanisms, which allow the user to access
content other than sequentially.
- Orientation mechanisms such as proportional scroll bars,
highlighting of viewports, selection and focus, etc.
- A user interface for configuring the browser, including parameters of
the rendering engine(s), processing information such as natural language
preferences, and the user interface itself (e.g., position of buttons in
- A user interface for facilitating browsing (history mechanism, bookmark
- Other user interface features (e.g., refresh the screen, reload the
document, send to the printer, etc.)
- Interpreters for scripting languages.
- APIs for communication with plug-ins.
- Interfaces (e.g., for HTTP, for DOM, for file i/o including document
loading and printing, communication with the operating system, etc.)
Note that there are areas where content and user interface mingle,
including:
- Form controls
- Links and their styling
- Keyboard and other input configurations provided by the author
- Adoption of markup languages for implementation of UI controls (as is
done, e.g., by IE using HTML and as is done by Mozilla by using XML and
the DOM for UI controls).
For simplicity, I will consider for now that the UI refers to the UA's
components, not those contributed by Web content.
An assistive technology is a user agent that (1) relies on another
user agent to provide some services and (2) provides services beyond
those offered by the "host user agent" to meet the requirements of a
target audience. Additional services might include:
- Read access to the document tree would allow application of different
rendering engines (e.g., speech output).
- Write access to the document tree would allow completion of forms
through, say, voice input.
- Read access to the UI would allow an assistive technology to know which
viewport the user has selected, user agent configuration settings,
- Write access to the UI would allow users to navigate viewports (i.e.,
change the current viewport) through speech input.
- Content transformation tools
- Additional navigation mechanisms
- Additional orientation mechanisms
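
As a rough illustration of the first two services, the sketch below uses Python's standard `xml.dom.minidom` module as a stand-in for a tree exposed by a host user agent. The sample document and the helper names `speak` and `fill_field` are hypothetical, and a real assistive technology would reach the tree through whatever interface the host user agent actually exposes.

```python
from xml.dom import minidom

# Hypothetical document that a host user agent has already parsed.
SOURCE = """<html><body>
<h1>Order form</h1>
<p>See the <a href="help.html">help page</a>.</p>
<form><input type="text" name="city" value=""/></form>
</body></html>"""


def speak(node, out):
    """Read access: walk the tree and collect phrases for a speech engine."""
    if node.nodeType == node.TEXT_NODE:
        text = " ".join(node.data.split())
        if text:
            out.append(text)
        return
    if node.nodeType == node.ELEMENT_NODE and node.tagName == "a":
        out.append("link:")  # announce links before their text
    for child in node.childNodes:
        speak(child, out)


def fill_field(doc, name, value):
    """Write access: set a form control's value, e.g. from voice input."""
    for control in doc.getElementsByTagName("input"):
        if control.getAttribute("name") == name:
            control.setAttribute("value", value)
            return True
    return False


doc = minidom.parseString(SOURCE)
phrases = []
speak(doc.documentElement, phrases)
filled = fill_field(doc, "city", "Boston")
```

Here `phrases` holds the text in document order with links announced, and the `city` control carries the new value after `fill_field` returns.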
An assistive technology may not parse document source, for example, but
probably has to include tree processing capabilities in order to offer
The following sections describe some of the factors
that have affected the decision about which functionalities
should be supported natively by general purpose user agents.
Some general-purpose user agents already provide
useful functionalities such as allowing users to
navigate links through the keyboard.
Assistive technologies read the focus and speak
or render as braille the link text. One might argue
that links are so fundamental to browsing the Web that
it makes sense to require navigation of these links to
be a native functionality. I believe that in part
the current requirement is a perpetuation of existing
practice. Lynx offers direct access to links by number, not just
sequential access, and this has shown itself to be a useful
The existence of a platform- and programming-language independent
way to access content means that it's more understandable to ask
AT developers to provide some functionalities. Note, however, that
the lack of a standard for exposing the DOM may hinder adoption.
Also, since assistive technology products are usually designed
to work with other software than user agents, requiring them to
implement a UA-specific interface may be considered burdensome.
Finally, there is not yet a platform-independent API for
accessing user interface controls.
No minimal functional requirement obvious
The WG attempted to identify "minimal requirements"
a user agent would have to satisfy to be considered accessible
(at times, the bar is quite high in fact). For some functionalities,
minimal requirements were difficult or impossible to identify,
and therefore the WG chose either:
- To make a general requirement and leave the specifics
to the Techniques Document, or
- To make no requirement and leave the job to
assistive technologies.
One example is table navigation. Access
to table cells and cell context (headers, neighboring cells, etc.)
is very important to users and there is a Priority 1 requirement
that such information be made available to users. However, navigation
of table cells is just one (admittedly useful) means to achieve
the goals of access to content and orientation. Two problems
present themselves, however:
- Requiring navigation through N-dimensional space (up, down,
left, right) frames the functionality in terms of the
graphical artifact. For non-sighted users or users with
motor disabilities, requiring navigation through two-dimensional
space may not be an efficient way to access the information.
- Which navigation methods suffice? Sequential access alone?
For large tables (say 500x500 cells), this would surely
be tedious and therefore direct access (by row/column position)
would be more effective. In addition, it would be useful
to be able to shrink or expand parts of the table, search on
table contents, identify all cells (possibly in N-dimensional
space) under a particular header, and all headers for a particular
cell, etc. In short, the WG recognizes the importance of
navigation as a technique for making tables accessible, but
has not been able to identify a minimal requirement for
general-purpose user agents.
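
To make the contrast between these access methods concrete, here is a minimal sketch over a small hypothetical table. It assumes the first row holds column headers and the first column holds row headers. The function names are illustrative only and do not come from the Guidelines or the Techniques Document.

```python
# Hypothetical table: first row holds column headers, first column row headers.
TABLE = [
    ["",        "Mon", "Tue"],
    ["Morning", "9",   "12"],
    ["Evening", "4",   "7"],
]


def sequential(table):
    """Sequential access: visit every data cell in reading order."""
    for r in range(1, len(table)):
        for c in range(1, len(table[r])):
            yield (r, c, table[r][c])


def cell_at(table, row, col):
    """Direct access by row/column position."""
    return table[row][col]


def headers_for(table, row, col):
    """Cell context: the row header and column header that apply to a cell."""
    return (table[row][0], table[0][col])


def cells_under(table, header):
    """All data cells under a given column header."""
    col = table[0].index(header)
    return [table[r][col] for r in range(1, len(table))]
```

For a 500x500 table, `sequential` would yield 249,001 data cells under this layout, which is why direct access and header lookups such as `headers_for` matter to users.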
Likelihood of implementation
The requirements of the Guidelines are not independent
of considerations of implementability or cost. The Techniques
Document represents the WG's efforts at showing how each
requirement may be implemented. However, the WG may have chosen
not to make certain requirements either because it seemed
"unreasonable" to ask desktop browsers to implement the
functionality or because the likelihood of implementation and
conformance seemed low.
The Working Group has endeavored to incorporate feedback from users
with disabilities and experts in different fields related to
accessibility about important requirements for these guidelines.
The following review is based on the 20 December
1999 UA Guidelines.
In order to provide rationale for requiring native support by
general purpose user agents of certain functionalities, I've grouped
them by theme. This grouping makes it relatively easy to understand
why most of the checkpoints require native support in general purpose
user agents for the functionalities in question. The themes are:
- The requirements of these checkpoints are
applicable to all user agents.
- The requirements of these checkpoints refer to
content rendered natively by the user agent.
- The requirements of these checkpoints pertain
to communication with assistive technologies and
thus were designed specifically for general purpose user agents.
- The requirements of these checkpoints are not
readily assignable to a particular class of user agent.
- The requirements of these checkpoints were considered
to be the responsibility of assistive
technologies by the Working Group.
All user agents should meet these requirements, although how they
are met will depend on the type of user agent. These requirements
concern device independence, the native user interface and to
- Checkpoint 2.1 Ensure that the user has access to all content, including alternative equivalents for content.
- Checkpoint 6.1 Implement the accessibility features of supported specifications (markup languages, style sheet languages, metadata languages, graphics formats, etc.).
- Checkpoint 11.1 Provide a version of the product documentation that conforms to the Web Content Accessibility Guidelines.
- Checkpoint 11.2 Document all user agent features that promote accessibility.
- Checkpoint 11.3 Document the default input configuration (e.g., default keyboard bindings).
- Checkpoint 4.13 Allow the user to control how the selection is highlighted (e.g., foreground and background color).
- Checkpoint 4.14 Allow the user to control how the content focus is highlighted (e.g., foreground and background color).
- Checkpoint 5.3 Implement selection, content focus, and user interface focus mechanisms and make them available to users and through APIs.
- Checkpoint 7.1 Allow the user to navigate viewports (including frames).
- Checkpoint 7.2 For user agents that offer a browsing history mechanism, when the user returns to a previous viewport, restore the point of regard in the viewport.
- Checkpoint 8.4 Provide a mechanism for highlighting and identifying (through a standard interface where available) the current viewport, selection, and content focus.
- Checkpoint 1.3 Ensure that the user can interact with all active elements in a device-independent manner.
- Checkpoint 5.6 Follow operating system conventions and accessibility settings. In particular, follow conventions for user interface design, default keyboard configuration, product installation, and documentation.
- Checkpoint 10.6 Allow the user to configure the user agent in named profiles that may be shared on systems with distinct user accounts.
- Checkpoint 11.4 In a dedicated section, document all features of the user agent that promote accessibility.
- Checkpoint 4.15 Allow the user to control user agent-initiated spawned viewports.
- Checkpoint 10.4 Use operating system conventions to indicate the input configuration.
- Checkpoint 10.5 Avoid default input configurations that interfere with operating system conventions.
- Checkpoint 8.8 Provide a mechanism for highlighting and identifying (through a standard interface where available) active elements.
- Checkpoint 9.5 When loading content (e.g., document, video clip, audio clip, etc.) indicate what portion of the content has loaded and whether loading has stalled.
- Checkpoint 9.6 Indicate the relative position of the viewport in content (e.g., the percentage of an audio or video clip that has been played, the percentage of a Web page that has been viewed, etc.).
- Checkpoint 8.9 Maintain consistent user agent behavior and default configurations between software releases. Consistency is less important than accessibility and adoption of operating system conventions.
- Checkpoint 10.7 Provide default input configurations for frequently performed tasks.
It makes sense for user agents to provide native support
for content rendered natively.
- Checkpoint 2.2 For presentations that require user interaction within a specified time interval, allow the user to control the time interval (e.g., by allowing the user to pause and restart the presentation, to slow it down, etc.).
- Checkpoint 2.6 Allow the user to specify that captions and auditory descriptions be rendered at the same time as the associated auditory and visual tracks.
- Checkpoint 3.1 Allow the user to turn on and off rendering of background images.
- Checkpoint 3.2 Allow the user to turn on and off rendering of background audio.
- Checkpoint 3.3 Allow the user to turn on and off rendering of video.
- Checkpoint 3.4 Allow the user to turn on and off rendering of audio.
- Checkpoint 3.5 Allow the user to turn on and off animated or blinking text.
- Checkpoint 3.6 Allow the user to turn on and off animations and blinking images.
- Checkpoint 3.7 Allow the user to turn on and off support for scripts and applets.
- Checkpoint 4.1 Allow the user to control font family.
- Checkpoint 4.2 Allow the user to control the size of text.
- Checkpoint 4.3 Allow the user to control foreground color.
- Checkpoint 4.4 Allow the user to control background color.
- Checkpoint 4.5 Allow the user to slow the presentation rate of audio, video, and animations.
- Checkpoint 4.8 Allow the user to control the position of captions on graphical displays.
- Checkpoint 4.9 Allow the user to control synthesized speech playback rate.
- Checkpoint 4.10 Allow the user to control synthesized speech volume.
- Checkpoint 4.12 Allow the user to select from available author and user style sheets or ignore them.
- Checkpoint 2.5 If more than one alternative equivalent is available for content, allow the user to choose from among the alternatives. This includes the choice of viewing no alternatives.
- Checkpoint 2.3 When no text equivalent has been supplied for an object, make available author-supplied information to help identify the object (e.g., object type, file name, etc.).
- Checkpoint 4.6 Allow the user to start, stop, pause, advance, and rewind audio, video, and animations.
- Checkpoint 4.7 Allow the user to control the audio volume.
- Checkpoint 4.11 Allow the user to control synthesized speech pitch, gender, and other articulation characteristics.
- Checkpoint 2.4 When a text equivalent for content is explicitly empty (i.e., an empty string), render nothing.
- Checkpoint 2.7 For author-identified but unsupported natural languages, allow the user to request notification of language changes in content.
- Checkpoint 3.8 Allow the user to turn on and off rendering of images.
These requirements were designed specifically for
general purpose user agents to ensure interoperability. They
may also apply to user agents in general.
- Checkpoint 1.4 Ensure that every functionality offered through the user interface is available through the standard keyboard API.
- Checkpoint 1.1 Ensure that every functionality offered through the user interface is available through every input device API used by the user agent. User agents are not required to reimplement low-level functionalities (e.g., for character input or pointer motion) that are inherently bound to a particular API and most naturally accomplished with that API.
- Checkpoint 1.2 Use the standard input and output device APIs of the operating system.
- Checkpoint 1.5 Ensure that all messages to the user (e.g., informational messages, warnings, errors, etc.) are available through all output device APIs used by the user agent. Do not bypass the standard output APIs when rendering information (e.g., for reasons of speed, efficiency, etc.).
- Checkpoint 5.1 Provide programmatic read and write access to content by conforming to W3C Document Object Model specifications.
- Checkpoint 5.2 Provide programmatic read and write access to user agent user interface controls using standard APIs (e.g., platform-independent APIs, standard APIs for the operating system, and conventions for programming languages, plug-ins, virtual machine environments, etc.)
- Checkpoint 5.4 Provide programmatic notification of changes to content and user interface controls (including selection, content focus, and user interface focus).
- Checkpoint 9.1 Provide information about user agent-initiated content and viewport changes through the user interface and through APIs.
- Checkpoint 9.4 Allow the user to configure notification preferences for common types of content and viewport changes.
- Checkpoint 9.2 Ensure that when the selection or content focus changes, it is in a viewport after the change.
- Checkpoint 9.3 Prompt the user to confirm any form submission triggered indirectly, that is by any means other than the user activating an explicit form submit control.
- Checkpoint 5.5 Ensure that programmatic exchanges proceed in a timely manner.
- Checkpoint 10.1 Provide information directly to the user and through APIs about current user preferences for input configurations (e.g., keyboard or voice bindings).
- Checkpoint 10.2 Provide information directly to the user and through APIs about current author-specified input configurations (e.g., keyboard bindings specified in content such as by "accesskey" in HTML 4.0).
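
The notification requirements above (for instance Checkpoint 5.4) can be pictured with a simple listener pattern. This is only a sketch under stated assumptions: `FocusModel` is a hypothetical stand-in for a user agent's content-focus state, not an API defined by the Guidelines or by any platform.

```python
class FocusModel:
    """Hypothetical stand-in for a user agent's content-focus state."""

    def __init__(self):
        self._focus = None
        self._listeners = []

    def add_listener(self, callback):
        """Register an assistive technology callback taking (old, new)."""
        self._listeners.append(callback)

    def set_focus(self, element_id):
        """Move the focus and notify listeners only when it actually changes."""
        old, self._focus = self._focus, element_id
        if old != element_id:
            for callback in self._listeners:
                callback(old, element_id)


announcements = []
ua = FocusModel()
ua.add_listener(
    lambda old, new: announcements.append(f"focus moved from {old} to {new}")
)
ua.set_focus("link-1")
ua.set_focus("link-1")  # no change, so no notification
ua.set_focus("submit")
```

With notifications pushed by the user agent, an assistive technology does not need to poll for focus changes, which also bears on the timeliness concern of Checkpoint 5.5.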
These checkpoints cannot be readily assigned to a particular
class of user agent.
The Working Group has generally considered navigation
a technique for providing access to content and context.
People would probably agree that without adequate navigation,
access to content and context may be so slow as to make the
content unusable. However, there has not been agreement as to
what minimal navigation requirements (if any) should be made
of general purpose user agents. Below, some rationale is offered.
- Checkpoint 7.3 Allow the user to navigate all active elements.
- Links are so important to the Web that general purpose user
agents must provide native support for navigation of them.
- Links and form controls add user interface to a page, thus
it makes sense that the user agent provide native support for
this "imported" user interface. But why limit "active elements"
to links and form controls and not tables, for example? Why
links and form controls a priori?
- Checkpoint 7.4 Allow the user to navigate just among all active elements.
This is just a special case of 7.3 and so once 7.3 is settled, this
one will follow.
- Checkpoint 7.5 Allow the user to search for rendered text content, including text equivalents of visual and auditory content.
- Most user agents do this anyway (except for the alt content part).
- Searching might be considered a minimal form of navigation.
- Checkpoint 7.6 Allow the user to navigate according to structure.
This checkpoint has been included as an umbrella checkpoint
because there are many navigation possibilities. Rather than
list all of them (per element type, by tree structure, by
element content, etc.), the WG put them all into a single,
Priority 2, intentionally ambiguous checkpoint.
- Checkpoint 7.7 Allow the user to configure structured navigation.
This one follows 7.6.
- Checkpoint 8.1 Convey the author-specified purpose of each table and the relationships among the table cells and headers.
- Checkpoint 8.5 Provide an "outline" view of content, built from structural elements (e.g., frames, headers, lists, forms, tables, etc.).
- Checkpoint 8.6 Allow the user to configure the outline view.
- This one follows 8.5.
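
One possible reading of the outline view in Checkpoint 8.5 is sketched below with Python's standard `xml.dom.minidom`. The sample markup, the set of element types treated as structural, and the indentation levels are all assumptions made for illustration. The checkpoint itself does not prescribe them.

```python
from xml.dom import minidom

# Hypothetical content; element names follow HTML conventions.
SOURCE = """<body>
<h1>Guide</h1><p>Intro</p>
<h2>Setup</h2><table><caption>Ports</caption></table>
<h2>Usage</h2><form name="login"/>
</body>"""

# Assumed mapping from structural element type to indent level.
STRUCTURAL = {"h1": 0, "h2": 1, "table": 2, "form": 2}


def walk(node):
    """Yield all descendants of a node in document order."""
    for child in node.childNodes:
        yield child
        yield from walk(child)


def label(element):
    """Use the element's text content, falling back to its name attribute."""
    text = "".join(n.data for n in walk(element) if n.nodeType == n.TEXT_NODE)
    return " ".join(text.split()) or element.getAttribute("name")


def outline(doc):
    """Build an indented outline from the structural elements of a document."""
    lines = []
    for node in walk(doc):
        if node.nodeType == node.ELEMENT_NODE and node.tagName in STRUCTURAL:
            indent = "  " * STRUCTURAL[node.tagName]
            lines.append(f"{indent}{node.tagName}: {label(node)}")
    return lines


result = outline(minidom.parseString(SOURCE))
```

Making `STRUCTURAL` user-editable would be one way to approach the configurability asked for in Checkpoint 8.6.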
These may apply to all user agents.
- Checkpoint 10.3 Allow the user to change and control the input configuration. Allow the user to configure the user agent so that some functionalities may be activated with a single command (e.g., single key, single voice command, etc.).
- Checkpoint 10.8 Allow the user to configure the arrangement of graphical user agent user interface controls.
The Working Group has decided that the following requirements, once
checkpoints, belong to assistive technologies. These requirements
are listed in the Appendix of Assistive Technology Functionalities
of the 20 December 1999 Techniques Document.
- Allow users to navigate up/down and among the cells of a table
(e.g., by using the focus to designate a selected table cell).
- Indicate the row and column dimensions of a selected table.
- Describe a selected element's
position within larger structures (e.g., numerical or relative
position in a document, table, list, etc.).
- Provide information about form structure
and navigation (e.g., groups of controls, control labels,
navigation order, and keyboard configuration).
- Enable announcing of information regarding title, value, grouping,
type, status and position of specific focused elements.