Techniques for User Agent Accessibility Guidelines 1.0

7 August 2002

3 Accessibility topics

3.1 Access to content
3.2 User control of rendering and style
3.3 Link techniques
3.4 List techniques
3.5 Table techniques
3.6 Image map techniques
3.7 Frame techniques
3.8 Form techniques
3.9 Generated content techniques
3.10 Content repair techniques
3.11 Script and applet techniques
3.12 Input configuration techniques
3.13 Synthesized speech techniques
3.14 Techniques for reducing dependency on spatial interactions
3.15 Accessibility and internationalization techniques
3.16 Appendix: Impact matrix
3.17 Appendix: Accessibility features of some operating systems
3.18 Appendix: Loading assistive technologies for access to the document object model

This section presents general techniques that may be relevant to more than one checkpoint. This section is not written in terms of the requirements of the checkpoints. Instead, it is organized according to other topics that should also be familiar to user agent designers.

3.1 Access to content

User agents need to ensure that users have access to content, either rendered through the user interface or made available to assistive technologies through an API. While providing serial access to a stream of content would satisfy this requirement, this would be analogous to offering recorded music on a cassette: other technologies exist (e.g., CD-ROMs) that allow direct access to music. It is just as important for user agents to allow users to access Web content efficiently, whether the content is being rendered as a two-dimensional graphical layout, an audio stream, or a series of lines of braille. Providing efficient access to content involves:

Preserving structure when rendering;
Allowing the user to select specific content and query its structure or context (what am I examining?);
Using and generating metadata to provide context (where am I?).

These topics are addressed below.

Note: Throughout this document, the term context menu refers to a user interaction mechanism whereby:

The user designates a piece of content (e.g., with the selection or focus).
The user invokes a menu (e.g., through a keyboard shortcut) consisting of options determined by the context selected by the user. For instance, if the user has selected an HTML table cell, the context menu might offer information about associated headers.

Preserve and provide structure

When used properly, markup languages structure content in ways that allow user agents to communicate that structure across different renderings. A table describes relationships among cells and headers. Graphically, user agents generally render tables as a two-dimensional grid. However, relationships also have to be apparent for users with serial access to content or who navigate sequentially, otherwise users may not understand the purpose of the table and the relationships among its cells (see the section on table techniques). User agents need to render content in ways that allow users to understand the underlying document structure, which may consist of headings, lists, tables, synchronized multimedia, link relationships, etc. Providing alternative renderings (e.g., an outline view) will also help users understand document structure.

Note: Even though the structure of a language like HTML may be defined by a document type definition (DTD) or a schema, user agents may convey structure according to a "more intelligent" document model when that model is well-known. For instance, in the HTML 4 [HTML4] and XHTML 1.0 [XHTML10] DTDs, heading elements (H1 - H6) do not nest, but presenting the document as nested headings may convey the document's structure more effectively than as a flat list of headings.

Allow access to selected content

The guidelines emphasize the importance of navigation as a way to provide efficient access to content. Navigation allows users to access content more efficiently and, when used in conjunction with selection and focus mechanisms, allows users to query content for metadata. For instance, blind users often navigate a document by skipping from link to link, deciding whether to follow each link based on metadata about the link. User agents can help them decide whether to follow a link by allowing them to query each focused link for the link text, title information, information about whether the link has been visited, etc. While much of this information may be rendered, the information also has to be available to assistive technologies.

For example, the Amaya browser/editor [AMAYA] makes available all attributes and their values to the user through a context menu. The user selects an element and opens an attribute menu that shows which attributes are available for the element and which have been assigned values. The user may read or write values to attributes (since Amaya is an editor as well as a browser). Information about attributes is also available through Amaya's structured view, which renders the document tree as structured text.

The selection may be widened (moved to the nearest node one level up the document tree) by pressing the Escape key; this is a form of structured navigation based on the underlying document object model.

Users may want to select content based on structure alone (as offered by Amaya) but also based on how the content has been rendered. For instance, most user agents allow users to select ranges of rendered text that may cross "element boundaries" (e.g., to select part of a paragraph that includes a phrase that is emphasized).

Context

Authors and user agents provide context to users through content, structure, navigation mechanisms, and query mechanisms. Titles, dimensions, dates, relationships, the number of elements, and other metadata all help orient the user, particularly when available as text. For instance, user agents can help orient users by allowing them to request that document headings and lists be numbered. See also the section on table techniques, which explains how user agents can offer table navigation and the ability to query a table cell for information about the cell's row and column position, associated header information, etc.

User agents can use style sheet languages such as CSS 2 [CSS2] and XSLT [XSLT] to generate context information (see techniques for generated content).
For information about elements and attributes that convey metadata in HTML, refer to the index of elements and attributes in "Techniques for Web Content Accessibility Guidelines 1.0" [WCAG10-TECHS].
For information about elements and attributes that convey metadata in SMIL, refer to the index of attributes in the W3C Note "Accessibility Features of SMIL" [SMIL-ACCESS].
Describe a selected element's position within larger structures (e.g., numerical or relative position in a document, table, list, etc.). For example: tenth link of fifty links; document heading 3.4; list one of two, item 4.5; third table, three rows and four columns; current cell in third row, fourth column; etc. Allow users to get this information on demand (e.g., through a keyboard shortcut). Provide this information on the status line on demand from the user.

3.2 User control of rendering and style

To ensure accessibility, users need to be able to configure the style of rendered content and the user interface. Author-specified styles, while important, may make content inaccessible to some users. User agents need to allow users to increase the scale of rendered text, to change colors and color combinations, to slow down multimedia presentations, etc.

Cascading Style Sheets (CSS, defined in [CSS1] and [CSS2]) give authors design flexibility and allow users to control important aspects of content style; see checkpoint 4.14 for information about allowing users to choose from among available style sheets. CSS includes mechanisms for tailoring rendering for a particular output medium, including audio, braille, screen, and print.

User agents should implement the cascade order of CSS 2 ([CSS2], section 6.4.1), not that of CSS 1. In CSS 2, user style sheets with "!important" declarations (section 6.4.2) take precedence over author styles. Refer also to Web Content Accessibility Guidelines 1.0 checkpoint 3.3 [WCAG10] for content requirements related to style sheets.
CSS-enabled user agents should consider as part of the cascade the markup used for style, giving it a lower weight than actual style sheets. This allows authors to specify style through markup for older user agents and to use more powerful style sheets for CSS-enabled user agents. Refer to the section on the precedence of non-CSS presentational hints in CSS 2 ([CSS2], section 6.4.4).
To hide the CSS syntax from the user, user agents may implement user style sheets through the user agent user interface. User agents can generate a user style sheet from user preferences or behave as though they did. Amaya [AMAYA] provides a GUI-based interface to create and apply internal style sheets. The same technique may be used to control a user style sheet.
In JavaScript, the following may be used to change style information:
document.all.myElement.style.color = "red";
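
The CSS 2 cascade order described above can be illustrated with a minimal Python sketch. The function name, the tuple layout, and the origin labels are assumptions made for this illustration; specificity and non-CSS presentational hints are deliberately left out.

```python
# Weight of each (origin, important) pair in the CSS 2 cascade
# ([CSS2], section 6.4.1), from lowest to highest precedence.
CASCADE_WEIGHT = {
    ("user-agent", False): 0,
    ("user", False): 1,
    ("author", False): 2,
    ("author", True): 3,
    ("user", True): 4,  # user "!important" beats author styles in CSS 2
}


def winning_value(declarations):
    """Return the value of the declaration that wins the cascade.

    Each declaration is a hypothetical (origin, important, value) tuple
    for one property on one element. Specificity is ignored; among
    declarations of equal weight, the one that comes last wins.
    """
    best = None
    best_weight = -1
    for origin, important, value in declarations:
        weight = CASCADE_WEIGHT[(origin, important)]
        if weight >= best_weight:
            best, best_weight = value, weight
    return best
```

For instance, a user declaration marked "!important" overrides an author declaration marked "!important", while a normal user declaration does not override a normal author declaration.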

3.3 Link techniques

User agents make links accessible by providing navigation to links, helping users decide whether to follow them, and allowing interaction in a device-independent manner. Link techniques include the following:

See sequential navigation techniques for information about navigating to links.
When the user follows a link where the URI has a fragment identifier (e.g., "#chap1"), render the beginning of the target fragment at the top of the viewport (not at the bottom, even if closer). Note: This technique applies to formats like HTML where the fragment identifier semantics are to identify a piece of content.
Provide a link view that lists all links in the document. Allow the user to configure how the links are sorted (e.g., by document order, sequential navigation order, alphabetical order, visited or unvisited or both, internal or external or both, etc.).
Help the user remember links by including metadata in the link view. For example, identify a selected link as "Link X of Y", where "Y" is the total number of links. Lynx [LYNX] numbers each link and provides information about the relative position in the document. Position is relative to the current page and the number of the current page out of all pages. Each page usually has 24 lines.
Allow the user to configure how much information about a link to present in the content view (when a link receives focus). For instance, allow the user to choose between "Display links using hyperlink text" or "Display links by title (if present)", with an option to toggle between the two views. For a link without a title, use the link text.
Here is a sample algorithm for ensuring that an HTML link that has image content has associated text.
1. If the author has specified conditional content (that is not empty content) for the image (e.g., "alt" in HTML), use that as the link text;
2. Otherwise, use the link title if available;
3. [Repair] Otherwise, use title information of the designated Web resource (e.g., the TITLE element of HTML for links to HTML documents).
4. [Repair] Otherwise, render part of the filename or URI of the designated Web resource.
5. [Repair] Otherwise, insert a generic text placeholder (e.g., [LINK]) in place of the image (if configured to do so).
For an image in link content, ensure that the user has access to the link and any long description associated with the image.
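
The sample algorithm for finding text for an image link can be sketched in Python as follows. This is an illustration only: the function and parameter names are hypothetical, and step 4 is interpreted here as using the last path segment of the URI.

```python
def link_text_for_image(alt=None, link_title=None, target_title=None,
                        target_uri=None, use_placeholder=True):
    """Choose text for a link whose content is an image.

    All parameters are hypothetical inputs that a user agent would
    gather from the document and from the designated resource.
    """
    if alt:                    # 1. non-empty conditional content
        return alt
    if link_title:             # 2. title of the link itself
        return link_title
    if target_title:           # 3. [Repair] title of the target resource
        return target_title
    if target_uri:             # 4. [Repair] part of the file name or URI
        last_segment = target_uri.rstrip("/").rsplit("/", 1)[-1]
        return last_segment or target_uri
    if use_placeholder:        # 5. [Repair] generic placeholder, if configured
        return "[LINK]"
    return ""
```

The use_placeholder flag stands in for the user configuration mentioned in step 5.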

[Image: JAWS for Windows HTML Options menu, which allows configuration of a number of link rendering options]

As shown in the image above, JAWS for Windows [JFW] offers a view for configuring a number of rendering features, notably some concerning link types, text link verbosity, image map link verbosity, graphical link verbosity, and internal links.

3.4 List techniques

User agents can make lists accessible by ensuring that list structure – and in particular, embedded list structure – is available through navigation and rendering.

Allow users to turn on "contextual" rendering of lists (even for unordered "bullet" lists). Use compound numbers (or letters, numbers, etc.) to introduce each list item (e.g., "1, 1.1, 1.2, 1.2.1, 1.3, 2, 2.1"). This provides more context and does not rely on the information conveyed by a graphical rendering, as in:
```
1.
  1.
  2.
    1.
  3.
2.
  1.
```
which might be serialized for synthesized speech or braille as "1, 1, 2, 1, 2, 3, 2, 1".
Specify list numbering styles in CSS. Refer to the section on generated content, automatic numbering, and lists in CSS ([CSS2], section 12).
Example.

The following CSS 2 style sheet (taken from CSS 2, section 12.5) shows how to specify compound numbers for nested lists created with either UL or OL elements. Items are numbered as "1", "1.1", "1.1.1", etc.
```
<STYLE type="text/css">
 UL, OL { counter-reset: item }
 LI { display: block }
 LI:before { content: counters(item, "."); 
             counter-increment: item }
</STYLE>
```
End example.
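
A user agent that does not rely on CSS counters could compute the same compound numbers itself. The following Python sketch assumes a hypothetical input, namely the 1-based nesting depth of each list item in document order, and assumes that depth never increases by more than one level at a time.

```python
def compound_numbers(depths):
    """Return compound labels ("1", "1.1", ...) for nested list items.

    depths is a hypothetical input: the 1-based nesting depth of each
    list item, in document order.
    """
    counters = []  # one running counter per open nesting level
    labels = []
    for depth in depths:
        if depth > len(counters):
            # Entering a nested list: start its counter at zero.
            counters.extend([0] * (depth - len(counters)))
        else:
            # Leaving nested lists: discard their counters.
            del counters[depth:]
        counters[depth - 1] += 1
        labels.append(".".join(str(n) for n in counters))
    return labels
```

Called with the depths [1, 2, 2, 3, 2, 1, 2], which correspond to the nested list shown earlier in this section, the function returns the labels "1", "1.1", "1.2", "1.2.1", "1.3", "2", "2.1".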

3.5 Table techniques

The HTML TABLE element was designed to represent relationships among data ("data" tables). Even when authored well and used according to format specification, tables may pose problems for users with disabilities for a number of reasons:

Users who have serial access to content or who navigate sequentially to a table may have difficulty grasping the relationships among cells, especially for large and complex tables.
Users with cognitive disabilities may have trouble grasping or remembering relationships between cells and headers, especially for large and complex tables.
Users of screen magnifiers or with physical disabilities may have difficulties navigating to the desired cells of a table.

For these situations, user agents may assist these users by providing table navigation mechanisms and supplying context that is present in a two-dimensional rendering (e.g., the cells surrounding a given cell).

To complicate matters, many authors use tables to lay out Web content ("layout" tables). Not only are table structures used to arrange objects horizontally and vertically on the screen, but table elements such as TH (table header) in HTML are also used to change the style of text (as table headers are often rendered in bold fonts) rather than to indicate a true table header. These practices make it difficult for assistive technologies to rely on markup to convey document structure. Consequently, assistive technologies often resort to interpreting the rendered content, even though the rendered content has "lost" information encoded in the markup. For instance, when an assistive technology "reads" a table from its graphical rendering, the contents of multiline cells may become intermingled. For example, consider the following table:

This is the top left cell    This is the top right cell 
of the table.                of the table.

This is the bottom left      This is the bottom right 
cell of the table.           cell of the table.

Screen readers that read rendered content line by line would read the table cells incorrectly as "This is the top left cell This is the top right cell". So that assistive technologies are not required to gather incomplete information from renderings, these guidelines require that user agents provide access to content through an API (see checkpoint 6.3).
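
The difference between reading the rendering and reading the structure can be made concrete with a short Python sketch. The data layout (one tuple of text fragments per rendered line) and the function names are assumptions made for this illustration.

```python
def read_line_by_line(row_lines):
    """Read a rendered table row as a screen scraper would: line by line.

    row_lines is a hypothetical list of rendered lines; each line is a
    tuple holding the text fragment that each cell shows on that line.
    """
    return " ".join(" ".join(fragments) for fragments in row_lines)


def read_cell_by_cell(row_lines):
    """Reassemble each cell's text, as structural access (an API) allows."""
    n_cells = len(row_lines[0])
    return [" ".join(line[i] for line in row_lines) for i in range(n_cells)]
```

Applied to the top row of the table above, the first function interleaves the two cells, while the second returns each cell's text intact.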

The following sections discuss techniques for providing improved access to tables.

Table metadata

Users with serial access to content or who navigate sequentially cannot gather information "at a glance" about a two-dimensional table. User agents can make tables more accessible by providing the user with table metadata such as the following:

The table caption (the CAPTION element in HTML) or summary information (the "summary" attribute in HTML).
The number of column groups and columns. Note that the number of columns may change according to the row. Also, some parts of a table may have more than two "dimensions." Project dimensionality higher than two onto two when rendering information.
The number of row groups and rows, in particular information about table headers and footers.
Which rows contain header information (whether at the top or bottom of the table).
Which columns contain header information (whether at the left or right of the table).
Whether there are subheads.
How many rows or columns a header spans.

When navigating, quick access to table metadata will allow users to decide whether to navigate within the table or skip over it. Other techniques:

Allow users to query table summary information from inside a cell.
Allow the user to choose different levels of detail for the summary (e.g., brief table summary and a more detailed summary).
Allow the user to configure navigation so that table metadata is not (re-)rendered each time the user enters the table.

Linear rendering of tables

A linear rendering of tables – cells presented one at a time, row by row or column by column – may be useful, but generally only for simple tables. For more complex tables, user agents need to convey more information about relationships among cells and their headers. A linear rendering of a table may be useful as an equivalent for a multi-dimensional table.

Note: The following techniques apply to columns as well as rows. The elements listed in this section are HTML 4.01 table elements ([HTML4], section 11).

Provide access to one row (TR) at a time, beginning with any column header (TH). If a header is associated with more than one row, offer that header for each row concerned.
Render cells (TD) with their associated headers. Allow the user to configure how often headers are rendered (e.g., by implementing the 'speak-header' property in CSS 2 [CSS2], section 17.7.1). Note also that the "abbr" attribute in HTML 4 specifies abbreviated headers for synthesized speech and other rendering ([HTML4], section 11.2.6). See also information about cell headers later in this section.
Provide access to cell content as marked up in the document source.
Refer to techniques for authoring accessible tables in "Techniques for Web Content Accessibility Guidelines 1.0" [WCAG10-TECHS].
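
As an illustration of rendering cells with their associated headers, the following Python sketch linearizes a simple table row by row. It assumes a table with a single row of column headers and no spanning cells; the function name and the repeat_headers option (loosely analogous to choosing how often headers are rendered) are hypothetical.

```python
def linearize(column_headers, rows, repeat_headers=True):
    """Render a simple table one row at a time.

    column_headers: hypothetical list of header texts, one per column.
    rows: list of rows, each a list of cell texts.
    repeat_headers: when False, headers accompany only the first row.
    """
    lines = []
    for index, row in enumerate(rows):
        if repeat_headers or index == 0:
            parts = [f"{header}: {cell}"
                     for header, cell in zip(column_headers, row)]
        else:
            parts = list(row)
        lines.append("; ".join(parts))
    return lines
```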

Cell rendering

The most important aspect of rendering a table cell is that the cell's contents be rendered faithfully and be identifiable as the contents of a single cell. However, user agents may provide additional information to help orient the user:

Render the row and column position of the cell in the table.
Indicate how many rows and columns a cell spans.
Since the contents of a cell in a data table may only be comprehensible in context (i.e., with associated header information, row/column position, neighboring cell information etc.), allow users to navigate to cells and query them for this information.
For HTML tables, refer to the section on associating header information with data cells of HTML 4 ([HTML4], section 11.4.1).
In a table with a leading row and a leading column of TH cells, the corner cell (whether interpreted as an empty TD or an empty TH) should not contribute to the set of headers for cells in that row and column.
For nested tables, render information about the level of nesting.
Since a cell may belong to N different dimensions in a multi-dimensional table, provide information about headers from each dimension.

Cell header algorithm

Properly constructed data tables distinguish header cells from data cells. How headers are associated with table cells depends on the markup language. The following algorithm is based on the HTML 4.01 algorithm to calculate header information ([HTML4], section 11.4.3). For the sake of brevity, it assumes a left-to-right ordering, but will also work (with minor modifications) for right-to-left tables (refer to the "dir" attribute of HTML 4 [HTML4], section 8.2). For a given cell:

Search left from the cell's position to find row header (TH) cells. Then search upwards from the cell's position to find column header cells. The search in a given direction stops when the edge of the table is reached or when a data cell is found after a header cell. If no headers are found in either direction (left or up), search in the other directions (right or down).
Allow the user to configure where the header text comes from. For example, in HTML 4, use either the header cell element's content or the value of the "abbr" attribute ([HTML4], section 11.2.6).
Insert row headers into the list in the (left-to-right) order they appear in the table. Include values implicitly resulting from header cells in prior rows with rowspan="R", sufficient to extend into the current row.
Insert column headers after row headers, in the (top-to-bottom) order they appear in the table. Include values implicitly resulting from header cells in other columns with colspan="C", sufficient to extend into the current column containing the TD cell.
If a header cell has a value for the "headers" attribute, then insert the headers that the attribute references into the list and stop the search for the current direction.
Treat cells with a value for the "axis" attribute as header cells.
Be sure to take into account header cells that span several rows or columns.
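
The core of the search described above (left for row headers, then up for column headers) can be sketched in Python. This is a simplified illustration, not a full implementation of the HTML 4.01 algorithm: it assumes a left-to-right table with no spanning cells, ignores the "headers", "axis", and "abbr" attributes, and omits the fallback search to the right and downwards. The grid representation and function name are hypothetical.

```python
def find_headers(grid, row, col):
    """Return header texts for the cell at (row, col).

    grid[r][c] is a hypothetical (kind, text) pair, where kind is
    "th" for a header cell and "td" for a data cell.
    """
    def search(positions):
        found = []
        for r, c in positions:
            kind, text = grid[r][c]
            if kind == "th":
                found.append(text)
            elif found:
                break  # a data cell after a header cell ends the search
        found.reverse()  # collected while moving away from the cell
        return found

    # Row headers, in the left-to-right order they appear in the table.
    row_headers = search([(row, c) for c in range(col - 1, -1, -1)])
    # Column headers, in top-to-bottom order, listed after row headers.
    column_headers = search([(r, col) for r in range(row - 1, -1, -1)])
    return row_headers + column_headers
```

Because the search only visits cells in the same row and column as the given cell, an empty corner cell in a table with a leading header row and column never contributes to the result for a data cell.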

Cell header repair strategies

Not all data tables include proper header markup, which the user agent may be able to detect. Some repair strategies for finding header information include the following:

Consider that the top or bottom row contains header information.
Consider that the leftmost or rightmost column in a column group contains header information.
If cells in an edge row or column span more than one row or column, consider the following row or column to contain header information as well.
When trying to guess table structure, present several solutions to the user.

Other repair issues to consider:

Consider TH cells on both the left and right of the table.
For TH cells with rowspan="N": consider the content of those TH cells for each of the N-1 rows below the one containing that TH content.
An internal TH surrounded by TD elements makes it difficult to know whether the header applies to cells to its left or right in the same row (or in both directions) or cells above or below it in the same column (or in both directions).
Finding column header cells assumes they are all above the TD element to which they apply.
A TH element with colspan="N" needs to be included in the list of TH elements for the N-1 columns to its right.

Table navigation

To permit efficient access to tables, user agents should allow users to navigate to tables and within tables, to select individual cells, and to query them for information about the cell and the table as a whole.

Allow users to navigate to a table, down to one of its cells, and back up to the table level. This should work recursively for nested tables.
Allow users to navigate to a cell by its row and column position.
Allow users to navigate to all cells under a given header.
Allow users to navigate row by row or column by column.
Allow users to navigate to the cells around the current cell.
Allow users to navigate to the first or last cell of a row, column, or the table.
Allow users to navigate from a cell directly to its related headers (if it's possible to navigate to the headers).
Allow the user to search for text content within a table (i.e., without searching outside of the table). Allow the user to search for text within specific rows or columns, row groups or column groups, or limited by associated headers.
Alert the user when the navigation reaches a table edge and when a cell contains another table.
Allow relative and direct navigation. For example, entering "-3, 20" might mean "left three cells, up 20 cells".
Allow navigation of table headers or footers only.
Consider the issues raised by navigation to or from a cell that spans more than one row or column.
For examples of table navigation, refer to the table navigation script from the Trace Research Center [TABLENAV].
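
Relative navigation with an alert at the table edge might be sketched as follows in Python. The command syntax follows the "-3, 20" example above (first number horizontal, negative meaning left; second number vertical, positive meaning up); the function name, zero-based coordinates, and return value are assumptions made for this illustration, and spanning cells are not handled.

```python
def move_relative(row, col, command, n_rows, n_cols):
    """Move from (row, col) according to a relative command.

    Returns (new_row, new_col, hit_edge). hit_edge is True when the
    move was cut short at a table edge, so the user can be alerted.
    """
    dx_text, dy_text = command.split(",")
    dx, dy = int(dx_text), int(dy_text)  # int() tolerates surrounding spaces
    target_row = row - dy                # positive dy means "up"
    target_col = col + dx                # negative dx means "left"
    new_row = min(max(target_row, 0), n_rows - 1)
    new_col = min(max(target_col, 0), n_cols - 1)
    hit_edge = (new_row, new_col) != (target_row, target_col)
    return new_row, new_col, hit_edge
```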

3.6 Image map techniques

One way to make an image map accessible to some users (e.g., users with blindness) is to render the links it contains as text links. This allows assistive technologies to render the links as synthesized speech or braille, and benefits users with slow access to the Web and users of small Web devices that do not support images but can support hypertext. User agents may allow users to toggle back and forth between a graphical mode for image maps and a text mode.

To construct a text version of an image map in HTML:

If the content of the MAP element includes links, use them.
Otherwise, for each AREA in the map, if non-empty conditional text content is available (e.g., the "alt" attribute), use it as the content of a generated link.
When the author has specified empty conditional text content ("alt=''"), do not render the link.
When the author has specified no text equivalent (no "alt"), repair the missing content per checkpoint 2.7.
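
These steps can be sketched in Python as follows. The data structures and function name are hypothetical, and the repair step simply falls back to the URI of the area, which is only one possible way to repair missing content.

```python
def text_links_for_map(map_links, areas):
    """Build (text, href) pairs for a text version of an image map.

    map_links: hypothetical list of (text, href) links found in the
    content of the MAP element.
    areas: hypothetical list of dicts, one per AREA, with an "href" key
    and an optional "alt" key.
    """
    if map_links:
        return list(map_links)           # links in MAP content win
    links = []
    for area in areas:
        alt = area.get("alt")
        if alt is None:
            # No text equivalent at all: repair (assumption: use the URI).
            links.append((area["href"], area["href"]))
        elif alt == "":
            continue                     # empty "alt": do not render the link
        else:
            links.append((alt, area["href"]))
    return links
```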

Furthermore, user agents that render a text image map instead of an image may preface the text image map with inline metadata such as:

a string that announces the image map (e.g., "Start map");
any conditional text content associated with the image (e.g., "alt" for IMG);
the number of links in the map.

Allow users to suppress, shrink, and expand text versions of image maps so that they may quickly navigate to an image map (which may be, for example, a navigation tool bar) and decide whether to "expand" it and follow the links of the map. The metadata listed above will allow users to decide whether to expand the map. Ensure that the user can expand and shrink the map and navigate its links using the keyboard and other input devices.

3.7 Frame techniques

HTML frames (see, for example, the FRAME and IFRAME elements in HTML 4 [HTML4]) were originally designed so that authors could divide up graphic real estate and allow the pieces to change independently (e.g., selecting an entry in a table of contents in one frame changes the contents of a second frame). While frames are not inherently inaccessible, they raise some accessibility issues:

Equivalents to frame content. Some users cannot make use of frames because they cannot grasp the (spatial or logical) relationships conveyed by frame layout. Others cannot use them because their user agents or assistive technologies do not support them or make access difficult (e.g., users with screen readers or screen magnifiers).
Navigation. Users need to be able to navigate from frame to frame in a device-independent manner.
Orientation. Users need to know what frame they are in (so, for example, authors should provide a title for each frame), what other frames are available, and how the frames of a frameset are organized.
Dynamic changes. Users need to know how the changes they cause in one frame affect other frames.

To name a frame in HTML, use the following algorithm:

Use the "title" attribute on FRAME; or if not present,
Use the "name" attribute on FRAME; or if not present,
Use title information of the referenced frame source (e.g., the TITLE element of the source HTML document); or
Use title information of the referenced long description (e.g., what "longdesc" refers to in HTML), or
Use frame context (e.g., "Frame 2.1.3" to indicate the path to this frame in nested framesets).
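
The naming algorithm is a simple fallback chain, sketched here in Python. The function and parameter names are hypothetical; path stands for the position of the frame within nested framesets.

```python
def frame_name(title=None, name=None, source_title=None,
               longdesc_title=None, path=()):
    """Pick a name for a frame, trying each source in order of preference.

    title, name: the "title" and "name" attributes of FRAME.
    source_title: title of the referenced frame source.
    longdesc_title: title of the referenced long description.
    path: position in nested framesets, e.g. (2, 1, 3).
    """
    for candidate in (title, name, source_title, longdesc_title):
        if candidate:
            return candidate
    # Last resort: frame context built from the path through the framesets.
    return "Frame " + ".".join(str(n) for n in path)
```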

To make frames accessible, user agents should do the following:

Make available conditional content related to frames (e.g., provided by the HTML 4 NOFRAMES element ([HTML4], section 16.4.1)).
Here is a technique for the case of a frameset that does not contain a NOFRAMES element but the individual frames have associated long descriptions ("longdesc"):
1. For each frameset, render the frameset title as an H1 heading.
2. For each frame, render the frame title in an H2 heading, followed by the content of the associated long description.
3. Create a navigable table of contents according to the (possibly nested) frameset structure. Each entry in the table of contents should link to a frameset or frame. The end of the content used for each frame should include a link back to this table of contents.
Alert the user when the viewport contains a frameset.
Render a frameset as a list of links to named frames so the user can identify the number of frames. The list of links may be nested if framesets are nested.
Provide information about the number of frames in the frameset.
Highlight the frameset with the current focus (e.g., by using a thick border, by displaying the name of the frameset in the status bar).
Allow the user to query the frame with the current focus for metadata about the frame. Make available the frame title for speech synthesizers and braille displays. Users may also use information about the number of images and words in the frame to guess the purpose of the frame.
Allow navigation between frames (forward and backward through the nested structure, return to a top-level list of links to frames).
Alert the user when an action in one frame causes the content of another frame to change. Allow the user to navigate with little effort to the frame(s) that changed.
Authors can suppress scrolling of HTML frames with scrolling="no".
In order to ensure that content is accessible, allow the user to override some attributes of the FRAME element of HTML 4 ([HTML4], section 16.2.2): "noresize", "scrolling", and "frameborder".

The following screen shot illustrates how Home Page Reader [HPR] renders a frameset by allowing the user to navigate (on the left side) to each of five frames in a frameset.

[Image: Example frameset with five links, one for each of the frame elements, in IBM Home Page Reader]

Rendering of a frameset by Home Page Reader [HPR].

The next screen shot illustrates how the user agent can provide information about the number and structure of frames in the user agent user interface.

[Image: A pull-down menu indicating the number of frames in a document, the labels associated with each frame, and a check mark to indicate the frame with the current focus]

In this image of the Accessible Web Browser [AWB], the menu bar indicates the number of frames and uses a check mark next to the name of the frame with the current focus. The menu bar makes the information highly visible to all users and is very accessible to assistive technologies.

3.8 Form techniques

To make a form accessible, the user agent needs to ensure that:

the user can navigate to all of the form elements;
information about the form and its elements is available on demand;
the user can interact with all form elements through the keyboard alone (or voice alone or pointing device alone).

Form navigation techniques

Allow users to navigate to forms and to all controls within a form (refer also to table navigation techniques). Opera [OPERA] and Navigator [NAVIGATOR] offer a number of "form navigation" keyboard commands. When invoked, these "form navigation" commands move the user agent's current focus to the first form element (if any) in the document.
If there are no forms in a document and the user attempts to navigate to a form, alert the user.
Provide a navigable, structured view of form elements (e.g., those grouped by LEGEND or OPTGROUP in HTML) along with their labels.
Allow the user to navigate away from a menu without selecting any option (e.g., by pressing the Escape key).

Form orientation techniques

Provide the following information about forms on demand:

The number of forms in the document.
The percentage of a form that has already been filled out. This will help users who navigate sequentially to form controls know whether they have completed the form. Otherwise, users who encounter a submit button that is not the last control of the form might inadvertently submit the incomplete form.

Form element orientation techniques

In conjunction with navigation:

As the user navigates to a form element, provide information about whether the control has to be activated before form submission. For instance, in section 6.1.3 of XForms 1.0 [XFORMS10], the required property describes whether a value is required before the form's instance data is submitted.
For labels associated with form elements in markup (e.g., the "for" attribute on LABEL in HTML), make available label information when the user navigates among the form elements.
As the user navigates to a form element, provide information (e.g., through context-sensitive help) about how the user can activate the element. Provide information about what is required for each form element. Lynx [LYNX] conveys this information by providing information about the currently selected form element via a status line message:
- Radio Button: Use right-arrow or Return to toggle
- Checkbox Field: Use right-arrow or Return to toggle
- Option List: Press return and use arrow keys and return to select option
- Text Entry Field: Enter Text. Use Up or Down arrows or Tab to move off
- Textarea: Enter text. Up or Down arrows or Tab to move off (^Ve for editor)
Note: The ^Ve (caret-V, e) command, included in the TEXTAREA status line message, enables the user to invoke an external editor defined in the local Lynx configuration file (lynx.cfg).

Provide the following information about the elements in a form on demand (e.g., for the element with focus):

Indicate the number of elements in the form.
Indicate the number of elements that have not yet been completed.
Provide a list of elements that have to be activated before form submission.
Provide information about the order of form elements (e.g., as specified by "tabindex" in HTML). This is important since:
1. Most forms are visually oriented, employing changes in font size and color.
2. Users who navigate sequentially to form controls need to know they have supplied all the necessary information before submitting the form.
Provide information about which element has focus (e.g., "element X of Y for the form named MyForm"). The form name is very important for documents that contain more than one form. This will help users who navigate sequentially to form controls know whether they have completed the form.
Allow the user to query a form element for information about title, value, grouping, type, status, and position.
When a group of radio buttons receives content focus, identify the radio button with content focus as "Radio Button X of Y", where "Y" represents the total number of radio buttons in the group. HTML 4 specifies the FIELDSET element ([HTML4], section 17.10), which allows authors to group thematically related elements and labels. The LEGEND element ([HTML4], section 17.10) assigns a caption to a FIELDSET. For example, the LEGEND element might identify a FIELDSET of radio buttons as "Connection Rate". Each button could have a LABEL element ([HTML4], section 17.9.1) stating a rate. When it receives content focus, identify the radio button as "Connection Rate: Radio button X of Y: 28.8kbps", where "Y" represents the total number of radio buttons in the grouping and "28.8kbps" is the information contained in the LABEL.
Allow the user to invoke an external editor instead of editing directly in a TEXTAREA element. This allows users to use all the features of the external editor: macros, spell-checkers, validators, known input configurations, autosave features, etc.
Provide an option for transforming menus into checkboxes or radio buttons. In the transformation, retain the accessibility information specified by the author for the original form elements. Preserve the labels provided for the OPTGROUP and each individual OPTION, and re-associate them with the generated checkboxes. The LABEL defined for the OPTGROUP should be converted into a LEGEND for the resulting FIELDSET, and each checkbox should retain the LABEL defined for the corresponding OPTION. Lynx [LYNX] does this for HTML SELECT elements that have the "multiple" attribute specified.
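
The radio button announcement described in this list can be assembled from the group caption, the position of the focused button, and its label. The following Python sketch is one possible way to do this; the function name and its parameters are hypothetical, and the caption and label text are assumed to have been extracted from the markup already.

```python
from typing import List, Optional


def announce_radio(legend: Optional[str], labels: List[str], focused_index: int) -> str:
    """Build the announcement for the radio button at focused_index (0-based).

    legend: caption of the enclosing group (e.g., from LEGEND), or None.
    labels: label text of every radio button in the group, in document order.
    """
    if not 0 <= focused_index < len(labels):
        raise IndexError("focused_index is outside the radio button group")
    position = f"Radio button {focused_index + 1} of {len(labels)}"
    # Omit the caption when the author did not provide one.
    parts = ([legend] if legend else []) + [position, labels[focused_index]]
    return ": ".join(parts)
```

For a "Connection Rate" group with the labels "14.4kbps", "28.8kbps", and "56kbps", focusing the second button would yield "Connection Rate: Radio button 2 of 3: 28.8kbps".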

3.9 Generated content techniques

User agents may help orient users by generating additional content that "announces" a context change. This may be done through CSS 2 [CSS2] style sheets using a combination of selectors (including the ':before' and ':after' pseudo-elements described in section 12.1) and the 'content' property (section 12.2).

For instance, the user might choose to hear "language:German" when the natural language changes to German and "language:default" when it changes back. This may be implemented in CSS 2 with the ':before' and ':after' pseudo-elements ([CSS2], section 5.12.3).

Example.

With the following definition in the style sheet:

[lang|=es]:before { content: "start Spanish "; }
[lang|=es]:after  { content: " end Spanish"; }

the following HTML example:

<p lang="es" class="Spanish">
 <a href="foo_esp.html" 
    hreflang="es">Esta pagina en español</a></p>

might be spoken "start Spanish _Esta pagina en español_ end Spanish". Refer also to information on matching attributes and attribute values useful for language matching in CSS 2 ([CSS2], section 5.8.1).
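
The '|=' attribute selector used above matches a value that is exactly "es" or that begins with "es-" (for example, "es-MX"). The following Python sketch imitates that matching rule and the resulting spoken string. It is a simplified illustration, not a CSS implementation: the function names are hypothetical, and comparing values case-insensitively is an assumption made here for HTML language codes.

```python
def matches_lang_prefix(lang_value: str, prefix: str) -> bool:
    """Mimic the CSS 2 [att|=val] selector: exact match, or val followed by a hyphen."""
    value = lang_value.lower()   # assumption: compare language codes case-insensitively
    prefix = prefix.lower()
    return value == prefix or value.startswith(prefix + "-")


def speak_with_language_cues(text: str, lang_value: str) -> str:
    """Wrap text in the generated "start"/"end" strings when lang matches 'es'."""
    if matches_lang_prefix(lang_value, "es"):
        return "start Spanish " + text + " end Spanish"
    return text
```

Note that a value such as "est" does not match, because the prefix must be followed by a hyphen or be the entire value.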

The following example uses style sheets to distinguish visited from unvisited links with color and a text prefix.

Example.

The phrase "Visited link:" is inserted before every visited link:

A:link           { color: red }   /* Unvisited links */
A:visited        { color: green } /* Visited links */
A:visited:before { content: "Visited link: " }

To hide content, use the CSS 'display' or 'visibility' properties ([CSS2], sections 9.2.5 and 11.2, respectively). Setting 'display' to 'none' suppresses rendering of an entire subtree. Setting 'visibility' to 'hidden' causes the user agent to generate a rendering structure, so the content still occupies space in the layout, but the content is invisible.

The following XSLT style sheet (excerpted from the XSLT Recommendation [XSLT], Section 7.7) shows how one might number H4 elements in HTML with a three-part label.

Example.

<xsl:template match="H4">
 <fo:block>
  <xsl:number level="any" from="H1" count="H2"/>
  <xsl:text>.</xsl:text>
  <xsl:number level="any" from="H2" count="H3"/>
  <xsl:text>.</xsl:text>
  <xsl:number level="any" from="H3" count="H4"/>
  <xsl:text> </xsl:text>
  <xsl:apply-templates/>
 </fo:block>
</xsl:template>

End example.
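
The counting behind the three-part label can also be expressed procedurally. The following Python sketch is a rough equivalent for well-structured heading sequences, in which every H4 follows an H2 and an H3; it does not reproduce every edge case of xsl:number. The function name and the representation of headings as a list of integer levels are assumptions made for illustration.

```python
from typing import List


def label_h4_headings(levels: List[int]) -> List[str]:
    """Given heading levels in document order (1 for H1, 2 for H2, ...),
    return one "a.b.c" label per H4.

    a = H2 elements since the last H1, b = H3 elements since the last H2,
    c = H4 elements since the last H3.
    """
    h2 = h3 = h4 = 0
    labels = []
    for level in levels:
        if level == 1:
            h2 = 0            # H2 numbering restarts after each H1
        elif level == 2:
            h2 += 1
            h3 = 0            # H3 numbering restarts after each H2
        elif level == 3:
            h3 += 1
            h4 = 0            # H4 numbering restarts after each H3
        elif level == 4:
            h4 += 1
            labels.append(f"{h2}.{h3}.{h4}")
    return labels
```

For the heading sequence H1, H2, H3, H4, H4, H3, H4, H2, H3, H4, the labels generated for the four H4 elements are "1.1.1", "1.1.2", "1.2.1", and "2.1.1".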

3.10 Content repair techniques

When generating repair content, user agent developers should consider the following issues:

Some assistive technologies rely on an accurate mapping between the document object and what the user agent renders. If repair content is not included in the document object, but is used to determine rendering, there is likely to be a mismatch between the two that may lead to confusion. User agent developers should therefore consider including repair content in the document object. On the other hand, some users may wish to leave the author's content unmodified, and others may wish to have repair content included in the document object, but to be informed about what was repaired.
Repair content inserted in the document object should conform to the Web Content Accessibility Guidelines 1.0 [WCAG10]. For example, if the user agent inserts a graphical icon in the document object, that icon should have a text equivalent: since the icon is known to the user agent developer, the developer can provide a sensible text equivalent to accompany it (for the benefit of users of assistive technologies).
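
As an illustration of repair content that is both included in the document object and identifiable as a repair, the following Python sketch adds a text equivalent to an image that has none and records what was repaired. It is a sketch under stated assumptions: the image is modeled as a plain dictionary of attributes, deriving the text from the file name is only one possible repair strategy, and the "data-repaired" key is a hypothetical marker, not a standard attribute.

```python
import posixpath
from typing import Dict
from urllib.parse import urlparse


def repair_missing_alt(img: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the image's attributes with repair text if 'alt' is absent.

    An empty alt ("") is author-supplied and is left untouched.
    """
    repaired = dict(img)  # leave the author's original attributes unmodified
    if "alt" not in repaired:
        filename = posixpath.basename(urlparse(repaired.get("src", "")).path)
        repaired["alt"] = f"[image: {filename}]" if filename else "[image]"
        repaired["data-repaired"] = "alt"  # hypothetical marker naming what was repaired
    return repaired
```

Because the function returns a copy, a user agent could keep the author's content available for users who prefer it unmodified, while exposing the repaired version, together with the marker, to assistive technologies.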

See also the section on table cell header repair strategies. Refer also to the W3C document "Techniques for Authoring Tool Accessibility Guidelines 1.0" [ATAG10-TECHS].

3.11 Script and applet techniques

User agents need to make dynamic content accessible to users who may be disoriented by changes in content, who may have a physical disability that prevents them from interacting with a document within a time interval specified by the author, or whose user agent does not support scripts or applets. User agents need not only to make available equivalents to dynamic content (e.g., audio, animations), but also to allow users to turn off scripts, stop animations, adjust timing parameters, etc. Some user agents also allow users to turn off scripts for security reasons.

Script techniques

Certain elements of a markup language may have associated event handlers that are activated when certain events occur. User agents need to be able to identify those elements with event handlers statically associated (i.e., associated in the content, not in a script). In HTML 4 ([HTML4], section 18.2.3), intrinsic events are specified by the attributes beginning with the prefix "on": "onblur", "onchange", "onclick", "ondblclick", "onkeydown", "onkeypress", "onkeyup", "onload", "onmousedown", "onmousemove", "onmouseout", "onmouseover", "onmouseup", "onreset", "onselect", "onsubmit", and "onunload".

Techniques for providing access to scripts include the following:

Allow the user to configure the user agent so that mouseover/mouseout event handlers are activated by (and activate) focus/blur events. Similarly, allow the user to use a key command, such as Enter and Shift-Enter, to fire "onclick" and "ondblclick" events.
Implement "Document Object Model (DOM) Level 2 Events Specification" [DOM2EVENTS] events with a single activation event and provide a method for firing that event with each supported input device or input API. These should be the same as the click events and mappings provided above (but note that a user agent which is also an editor may wish to use single click events for moving a system caret, and want to provide a different behavior to activate using the mouse). For example, Amaya [AMAYA] uses a "doAction" command for activating links and form elements, which can be activated either by the mouse (and it is possible to set it for single-click or double-click) or by the keyboard (it is possible to set it for any key using Amaya's keyboard configuration).
For authors: Document the effects of known important scripts to give users an idea in advance of what they do. Do so by using the relevant elements or attributes of the (markup language) specification, or if there aren't any, make available a description of the script behavior.
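
The first technique above (letting focus/blur and key commands trigger pointer-oriented handlers) can be sketched as a lookup from device-independent events to handler attributes. The following Python sketch covers one direction only (focus and key events triggering mouse handlers). The mapping table, the function name, and the representation of an element's handlers as a dictionary are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Assumed user-configurable mapping from device-independent events to the
# pointer-specific handler attributes they should also trigger.
EVENT_ALIASES: Dict[str, str] = {
    "focus": "onmouseover",
    "blur": "onmouseout",
    "Enter": "onclick",
    "Shift-Enter": "ondblclick",
}


def handlers_to_fire(event: str, element_handlers: Dict[str, Callable[[], None]],
                     aliases_enabled: bool = True) -> List[str]:
    """Return the names of handler attributes to run for the given event, in order."""
    names = []
    native = "on" + event.lower()          # e.g., "focus" -> "onfocus"
    if native in element_handlers:
        names.append(native)
    alias = EVENT_ALIASES.get(event) if aliases_enabled else None
    if alias and alias in element_handlers and alias not in names:
        names.append(alias)
    return names
```

With the aliases enabled, moving focus to an element that has both "onfocus" and "onmouseover" handlers would run both, and pressing Enter on an element with an "onclick" handler would run that handler.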

Applet techniques

When a user agent loads a Java applet, it should support the Java system conventions for loading an assistive technology (see the appendix on loading assistive technologies for DOM access). If the user is accessing the applet through an assistive technology, the assistive technology should alert the user when the applet receives content focus as this will likely result in the launch of an associated plug-in or browser-specific Java Virtual Machine. The user agent then needs to turn control of the applet over to the assistive technology. User agents need to make conditional content available to the assistive technology. Applets generally include an application frame that provides title information.

3.12 Input configuration techniques

User agents that allow users to modify default input configurations need to account for configuration information from several sources: user agent defaults, user preferences, author-specified configurations, and operating environment conventions. Some examples include:

Author: in HTML, the author may specify keyboard bindings with the "accesskey" attribute ([HTML4], section 17.11.2).
User: Users generally specify their preferences through the user interface but may also do so programmatically or through a profile.
Operating environment: Users may specify preferred color contrasts at the operating environment level.

To the user, the most important information is the final configuration once all sources have been cascaded (combined) and all conflicts resolved. Knowing the default configuration is also important; checkpoint 12.3 requires that the default configuration be documented. The user may also want to know how the current configuration differs from the default configuration and what configuration in the viewport with the current focus comes from the author. This information may also be useful to technical support personnel who may be assisting users.

The user interfaces for viewing and editing the input configuration may be combined, but need not be. When a single interface is available to edit bindings from any source, allow the user to apply filters to the list of bindings (e.g., author-specified only, user agent default, user preference, final configuration, etc.).
The user interfaces for viewing and editing the input configuration need to be accessible: do not rely on color alone to convey information, use conventional user interface controls, allow device-independent input and output, etc.
In the user interface, associate with each binding a short text description of the function to be activated. For example, if "Control-P" maps to a print functionality, a short description might be "Print" or "Print setup". For author-specified configurations, use available information (e.g., "title") or use generic descriptions of what action will be taken (e.g., "Follow the link with this link text").
Allow users to query user interface controls for pertinent input configuration information (e.g., what key will activate the functionality).

Resolution of input configuration conflicts

In general, user preferences should override other configurations; however, this may not always be desirable. For example, users should be prevented from configuring the user agent in a way that would interfere with important functionalities such as quitting the user agent or reconfiguring it.

Some options for resolving conflicts:

Allow author configurations to override other configurations and alert the user when this happens.
Do not allow author configurations to override other configurations. Alert the user when an author-specified binding has been overridden and provide access to the author-specified control through other means (e.g., an unused binding, a menu, in a list of all author-specified bindings, etc.).
Author-specified keyboard bindings in combination with the user agent's native configuration may conflict with operating environment conventions. For example, Internet Explorer [IE-WIN] in Windows uses the Alt key as the Compose key for author-specified bindings. If the author has specified a configuration with the characters "h" or "f", this will interfere with the operating environment conventions for accessing help and the file menu, respectively. In addition to the previous two options for handling conflicts, the user agent may allow the user to choose another Compose key (either globally or on a per-document basis when conflicts are detected).
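
The cascade of input configurations and the alerts described in this section can be sketched as follows. This Python example implements only one of the possible policies (author bindings override user agent defaults, user preferences override both, and a small set of reserved bindings can never be overridden). The RESERVED table, the function name, and the key and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical bindings that the user agent never lets other sources override.
RESERVED = {"Alt-F4": "Quit"}


def cascade(ua_defaults: Dict[str, str], author: Dict[str, str],
            user: Dict[str, str]) -> Tuple[Dict[str, Tuple[str, str]], List[str]]:
    """Combine bindings; later sources win, except for reserved keys.

    Returns (final, alerts): final maps key -> (function, source) and alerts
    lists human-readable messages about overridden or rejected bindings.
    """
    final = {key: (func, "user agent default") for key, func in ua_defaults.items()}
    final.update({key: (func, "reserved") for key, func in RESERVED.items()})
    alerts = []
    for source, bindings in (("author", author), ("user", user)):
        for key, func in bindings.items():
            if key in RESERVED:
                alerts.append(f"{source} binding {key} -> {func} rejected: reserved")
                continue
            if key in final and final[key][0] != func:
                old_func, old_source = final[key]
                alerts.append(f"{old_source} binding {key} -> {old_func} "
                              f"overridden by {source}")
            final[key] = (func, source)
    return final, alerts
```

Because each entry in the final configuration records its source, the same structure could back a filtered view of bindings (author-specified only, user preference only, and so on), and the alerts list gives the user agent the information it needs to notify the user about conflicts.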

Refer to the section on restricted functionality and conformance in UAAG 1.0 for more information about limitations in functionality due to content.

3.13 Synthesized speech techniques

The following techniques apply to user agents that render content as synthesized speech.

Since user agents that render content as synthesized speech do not always pronounce it correctly, they should provide additional context to facilitate understanding. Techniques include:
- Spelling words
- Indicating punctuation, capitalization, etc.
- Allowing users to repeat words alone and in context.
- Using auditory nuances – including pitch, articulation model, volume, and orientation – to convey meaning the way fonts, spacing, and borders do in graphical media.
- Generating context. For example, a user agent might speak the word "link" before a link, "heading" before the text content of a heading or "item 1.4" before a list item.
- Rendering text in the appropriate natural language.
User agents that synthesize speech should implement the CSS 2 aural style sheet properties ([CSS2], section 19) to allow users to configure synthesized speech rate, volume, and pitch.
User agents that provide accessible solutions for images should, by default, provide no information about images where the author has provided empty conditional content associated with the image, otherwise information may clutter the user's view of the content and cause confusion. See checkpoints 2.7 and 2.8 for more information about repair content.
User agents may recognize different natural languages and be able to render content according to language markup defined for a certain part of the document. For instance, a screen reader might change the pronunciation of spoken text according to the language definition. This is usually desired and done according to the capabilities of the tool. Some specialized tools might give some finer user control for the pronunciation as well.
Switching natural languages for blocks of content may be more helpful than switching for short phrases. In some language combinations (e.g., Japanese being the primary and English being the secondary or quoted language), short foreign language phrases are often well-integrated in the primary language. Dynamic switching for these short phrases may make the content sound unnatural and possibly harder to understand. User agents might allow users to choose elements for which they want to be alerted.
Announce different classes of links differently (see checkpoint 10.5). For instance, announce links internal to a resource as being different from links to another page in the same domain, from links to another domain, etc. Announce visited links differently as well.
The following techniques for speaking data tables are adapted from the "Tape Recording Manual" produced by the (USA) National Braille Association [NBA]:
1. Read the title, source, captions and any explanatory keys.
2. Describe the structure of the table. Include the number of columns, the headers of each column and any associated sub-columns, reading from left to right (for left-to-right tables). The subhead is not considered a column. If column heads have footnotes, read them following each header.
3. Explain whether the table will be read by rows (horizontally) or by columns (vertically). The horizontal reading is usual but, in some cases, the vertical reading better conveys the content. On rare occasions it is necessary to read a table both ways.
4. Repeat the column headers with the figures under them for the first two rows. If the table is long, repeat the headers every fifth row. Always repeat them during the reading of the last row.
5. Indicate the last row by saying, "and finally . . . " or "last row ..."
6. At the completion of the reading say "End table X." If the table appeared on a page other than the one you were recording, add "Returning to text on page Y."
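
Steps 4 and 5 of the table-reading technique can be sketched in code. The following Python example reads a table by rows and decides, for each row, whether to repeat the column headers. It is an illustration only: the function name is hypothetical, "every fifth row" is interpreted here as rows 5, 10, 15, and so on, and the header repetition is applied to tables of any length.

```python
from typing import List


def speak_rows(headers: List[str], rows: List[List[str]]) -> List[str]:
    """Return one spoken line per row, reading the table by rows."""
    spoken = []
    last = len(rows)
    for number, row in enumerate(rows, start=1):
        # Repeat headers for the first two rows, every fifth row, and the last row.
        with_headers = number <= 2 or number % 5 == 0 or number == last
        if with_headers:
            cells = [f"{h}: {c}" for h, c in zip(headers, row)]
        else:
            cells = list(row)
        prefix = "And finally, " if number == last else ""
        spoken.append(prefix + ", ".join(cells))
    return spoken
```

For a six-row table with the headers "City" and "Temp", rows 1, 2, 5, and 6 are spoken with their headers (for example, "City: A, Temp: 1"), rows 3 and 4 are spoken as bare values, and the last row begins with "And finally".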

References:

For more information about voice browser technology developed at W3C, refer to "Voice Browsers: An introduction and glossary for the requirements drafts" [VOICEBROWSER].
For information about voice recognition and accessibility, refer to "Speak to Write" [SPEAK2WRITE].

3.14 Techniques for reducing dependency on spatial interactions

Users with serial access to content or who navigate sequentially to content may have difficulty interacting with content in two-dimensional space (e.g., to move a pointing device). Using the keyboard to move the pointing device may help some users, but this technique usually requires a significant amount of visual feedback as well as physical dexterity, both of which may not be possible for users with some disabilities.

To illustrate ways to reduce dependency on spatial interactions, consider a Web site for travel through Europe. The author provides a map of Europe and allows users to select regions they wish to visit using a pointing device. This type of application is very convenient to some users, but may be inaccessible to others. Authors can design such content to place the emphasis on objects (in this case European countries) rather than screen coordinates. In HTML, for example, authors should use client-side image maps instead of server-side image maps.

When this is done, the user agent can present the list of countries in an alternative fashion (e.g., as a list of links or menu entries) for users who may find it difficult to select countries with the pointing device. This type of interface generally benefits all users (e.g., some users may not recognize countries by shape or flag, some users may prefer the keyboard, some users may have images turned off, the text may be searched, etc.).
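
As a sketch of how a user agent might derive such a list from a client-side image map, the following Python example collects the text equivalent and target of each AREA element using the standard library HTML parser. The class and function names are hypothetical, and falling back to the URI when an AREA has no "alt" text is an assumption, not a requirement.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class AreaCollector(HTMLParser):
    """Collect (text equivalent, URI) pairs from AREA elements."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[Tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag != "area":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href:  # AREA elements without "href" (e.g., "nohref") have no link to list
            self.links.append((attributes.get("alt") or href, href))


def image_map_links(html: str) -> List[Tuple[str, str]]:
    """Return the links of all AREA elements found in the given markup."""
    parser = AreaCollector()
    parser.feed(html)
    return parser.links
```

The resulting pairs could be rendered as a list of links or as menu entries, which also makes the regions searchable as text.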

3.15 Accessibility and internationalization techniques

The following techniques may be considered when integrating accessibility features and internationalization.

Implement content negotiation so that users may specify language preferences. Or allow the user to choose manually from among related resources available in different languages.
Consider operating environment level natural language preferences as the user's default language preference. However, be cautious about sending HTTP Accept-Language request headers ([RFC2616], section 14.4) based on the operating environment preferences. First, there may be a privacy problem as indicated in RFC 2616, section 15.1.4 "Privacy Issues Connected to Accept Headers". Also, the operating environment may define only one language, while the Accept-Language request header may include many languages in different priorities. Setting Accept-Language to be the operating environment language may prevent a user from receiving content from a server that does not have a match for this particular language but does for other languages acceptable to the user.
Render characters with the appropriate directionality. Refer to the "dir" attribute and the BDO element in HTML 4 ([HTML4], sections 8.2 and 8.2.4, respectively).
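
To illustrate the point about Accept-Language carrying several languages with different priorities, the following Python sketch builds a header value from a list of languages in order of preference. The function name is hypothetical, and the scheme for assigning quality values (decreasing by 0.1, with a floor of 0.1) is an arbitrary choice made for illustration.

```python
from typing import List


def accept_language(preferences: List[str]) -> str:
    """Build an Accept-Language value from languages listed in order of preference.

    The first language gets the default quality (1); each later one gets a
    quality 0.1 lower, with a floor of 0.1 (an arbitrary illustrative scheme).
    """
    parts = []
    for index, tag in enumerate(preferences):
        quality = max(10 - index, 1) / 10
        parts.append(tag if quality == 1 else f"{tag};q={quality:.1f}")
    return ", ".join(parts)
```

For the preferences "da", "en-gb", and "en", the function returns "da, en-gb;q=0.9, en;q=0.8", which lets a server without Danish content fall back to English instead of failing to find a match.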

For more information about content internationalization, refer to W3C's "Character Model for the World Wide Web" [CHARMOD] and the Unicode Consortium's Unicode specification [UNICODE].

3.16 Appendix: Impact matrix

This matrix summarizes which checkpoints are expected to benefit users with certain types of disabilities. For more information about types of disabilities, assistive technologies, access strategies, and more, refer to W3C's "How People with Disabilities Use the Web" [PWD-USE-WEB].

learning: 4.4
hard of hearing: 1.3, 2.2, 2.5, 2.6, 4.1, 4.2, 4.3, 4.7, 4.12
memory: 9.4, 9.10, 10.4, 10.5, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7
color deficiency: 3.1, 4.3, 10.2, 10.3, 10.6
low vision: 2.2, 2.5, 2.7, 2.8, 2.9, 3.2, 3.3, 3.5, 4.1, 4.2, 4.4, 4.6, 4.7, 4.8, 4.9, 4.10, 4.11, 4.12, 4.13, 5.1, 5.4, 9.1, 9.2, 9.3, 9.4, 9.7, 9.8, 9.10, 10.2, 10.3, 10.4, 10.5, 10.6
seizure disorder: 3.3, 3.4
physical: 1.1, 1.2, 2.4, 2.9, 2.10, 5.3, 5.5, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 9.10, 10.3, 10.4, 10.5, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7
all: 2.1, 2.3, 4.14, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 6.10, 7.1, 7.2, 7.3, 7.4, 8.1, 8.2, 12.1, 12.2, 12.3, 12.4, 12.5
deafness: 1.3, 2.2, 2.5, 2.6, 4.1, 4.2, 4.3, 4.6
cognitive: 2.6, 3.1, 3.2, 3.3, 3.5, 3.6, 4.2, 4.3, 4.4, 4.5, 4.6, 5.1, 5.2, 5.3, 5.4, 5.5, 9.4, 9.8, 9.10, 10.4, 10.5, 10.7, 11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 11.7
blindness: 1.1, 1.2, 1.3, 2.2, 2.4, 2.5, 2.7, 2.8, 2.9, 2.10, 3.2, 3.3, 3.5, 4.4, 4.7, 4.8, 4.9, 4.10, 4.11, 4.12, 4.13, 5.1, 5.3, 5.4, 5.5, 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7, 9.8, 9.9, 9.10, 10.2, 10.3, 10.4, 10.5, 10.6, 10.7, 11.1, 11.2, 11.3, 11.5, 11.6, 11.7

3.17 Appendix: Accessibility features of some operating systems

Several operating systems include built-in accessibility features designed to assist individuals with disabilities. Despite operating system differences, the built-in accessibility features use a similar naming convention and offer similar functionalities, within the limits imposed by each operating system (or particular hardware platform). The following is a list of built-in accessibility features common to several operating environments:

StickyKeys: StickyKeys allows users who have difficulties with pressing several keys simultaneously to press and release in sequence each key of the configuration.
MouseKeys: MouseKeys allows users to move the mouse cursor and activate the mouse button(s) from the keyboard.
RepeatKeys: RepeatKeys allows users to set how fast a key repeats ("repeat rate") when the key is held pressed. It also allows users to control how quickly the key starts to repeat after the key has been pressed ("delay until repeat"). Users can also turn off key repeating.
SlowKeys: SlowKeys instructs the computer not to accept a key as pressed until it has been pressed and held down for more than a user-configurable length of time.
BounceKeys: BounceKeys prevents extra characters from being typed if the user bounces (e.g., due to a tremor) on the same key when pressing or releasing it.
ToggleKeys: ToggleKeys provides an audible indication for the status of keys that have a toggled state (keys that maintain status after being released). The most common toggling keys include Caps Lock, Num Lock, and Scroll Lock.
SoundSentry: SoundSentry monitors the operating system and applications for sounds in order to provide a graphical indication when a sound is being played. Older versions of SoundSentry may have flashed the entire display screen for example, while newer versions of SoundSentry provide the user with a selection of options, such as flashing the viewport that has the current focus or flashing the active window caption bar.

The next three built-in accessibility features are not as commonly available as the above group of features, but are included here for completeness and future compatibility.

ShowSounds: ShowSounds are user settings or software switches that cause audio messages to be presented using both audio and graphics. Applications may use these switches as the basis of user preferences.
HighContrast: HighContrast sets fonts and colors designed to make the screen easier to read.
TimeOut: TimeOut turns off built-in accessibility features automatically if the computer remains idle for a user-configurable length of time. This is useful for computers in public settings such as a library. TimeOut might also be referred to as "reset" or "automatic reset".

The next accessibility feature listed here is not considered to be a built-in accessibility feature (since it only provides an alternative input channel) and is presented here only for completeness and future compatibility.

SerialKeys: SerialKeys allows a user to perform all keyboard and mouse functions from an external assistive device (such as a communication aid) communicating with the computer via a serial character stream (e.g., through a serial port or infra-red port) rather than, or in conjunction with, the keyboard, mouse, and other conventional input devices and methods.

Microsoft Windows 95, Windows 98, and Windows NT 4.0

The following accessibility features can be adjusted from the Accessibility Options Control Panel:

StickyKeys: modifier keys include Shift, Control, and Alt.
FilterKeys: grouping term for SlowKeys, RepeatKeys, and BounceKeys.
MouseKeys
ToggleKeys
SoundSentry
ShowSounds
Automatic reset (term used for TimeOut)
High Contrast
SerialKeys

Additional accessibility features available in Windows 98:

Magnifier: Magnifier is a windowed, screen enlargement and enhancement program used by people with low vision to magnify an area of the graphical display (by tracking the text cursor, current focus, etc.). Magnifier can also invert colors within the magnification window.
Accessibility Wizard: The Accessibility Wizard is a setup tool to assist users with the configuration of operating environment accessibility features.

References:

To find out about built-in accessibility features on Windows platforms, ask the operating system via the "SystemParametersInfo" function. Refer to "Software accessibility guidelines for Windows applications" [MS-ENABLE] for more information.
For information about Microsoft keyboard configurations (Internet Explorer, Windows 95, Windows 98, and more), refer to documentation on keyboard assistance for Internet Explorer and MS Windows [MS-KEYBOARD].

Apple Macintosh operating system

The following accessibility features can be adjusted from the Easy Access Control panel. Note: The Apple naming convention for accessibility features is to put spaces between the terms (e.g., "Sticky Keys" instead of "StickyKeys").

Sticky Keys: modifier keys include the Shift, Command (Open apple), Option (Alt), and Control keys.
Slow Keys
Mouse Keys

The following accessibility features can be adjusted from the Keyboard Control Panel.

Key Repeat Rate (part of RepeatKeys)
Delay Until Repeat (part of RepeatKeys)

The following accessibility feature can be adjusted from the Sound or Monitors and Sound Control Panel (depending on operating system version).

Adjusting the volume to off or mute causes the Macintosh to flash the title bar whenever the operating system detects a sound (e.g., SoundSentry).

Additional accessibility features available for the Macintosh OS:

CloseView: CloseView is a full screen, screen enlargement and enhancement program used by people with low vision to magnify the information on the graphical display, and it can also change the colors used by the operating system.
SerialKeys: SerialKeys is available as freeware from Apple and several other Web sites.

AccessX, X Keyboard Extension (XKB), and the X Window System

The following accessibility features can be adjusted from the AccessX graphical user interface X client on some DEC, SUN, and SGI operating systems. Other operating systems supporting XKB may require command line interaction.

StickyKeys: modifier keys are platform-dependent, but usually include the Shift, Control, and Meta keys.
RepeatKeys
SlowKeys
BounceKeys
MouseKeys
ToggleKeys

Note: AccessX became a supported part of the X Window System X Server with the release of the X Keyboard Extension in version X11R6.1.

DOS (Disk Operating System)

The following accessibility features are available from a freeware program called AccessDOS, which is available from several Internet Web sites including IBM, Microsoft, and the Trace Center, for either PC-DOS or MS-DOS versions 3.3 or higher.

StickyKeys: modifier keys include the Shift, Control, and Alt keys.
Keyboard Response Group: grouping term for SlowKeys, RepeatKeys, and BounceKeys
MouseKeys
ToggleKeys
SoundSentry (incorrectly named ShowSounds)
SerialKeys
TimeOut

3.18 Appendix: Loading assistive technologies for access to the document object model

Many of the checkpoints in the guidelines require a "host" user agent to communicate information about content and the user interface to assistive technologies. This appendix explains how developers can ensure the timely exchange of this information (see checkpoint 6.10). The techniques described here include:

Loading the entire assistive technology in the address space of the host user agent;
Loading part of the assistive technology in the address space of the host user agent (e.g., piece of stub code, a dynamically linked library (DLL), a browser helper object, etc.);
Out-of-process access to the document object model.

The first two techniques are similar, differing in the amount of, or capability of, the assistive technology loaded in the same process or address space as the host user agent. These techniques are likely to provide faster access to the document object model since they will not be subject to inter-process communication overhead.

Loading assistive technologies for direct navigation of the document object model

First, the host user agent needs to know which assistive technology to load. One technique for this is to store a reference to an assistive technology in a system registry file or, in the case of Java, a properties file. Registry files are common among many operating environments:

Windows: use the system registry file
IBM OS/2: use the system.ini
On client/server systems: use a system registry server that an application running on the network client computer can query.
In Sun Java 2, use the "accessibility.properties" file, which causes the system event queue to examine the file for assistive technologies required for loading. If the file contains a property called "assistive_technologies", the system loads all registered assistive technologies and starts each on its own thread within the Java Virtual Machine, which runs as a single process.

Here is an example entry for Java:

assistive_technologies=com.ibm.sns.svk.AccessEngine
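As a rough illustration of how a host might read such an entry, the following sketch parses properties text and splits the "assistive_technologies" value into class names. The class AssistiveTechnologyRegistry and its method registeredNames are hypothetical names introduced for this sketch, and the choice of spaces and commas as separators is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.StringTokenizer;

// Hypothetical helper; not part of any Sun API.
class AssistiveTechnologyRegistry {
  // Returns the class names listed under "assistive_technologies",
  // or an empty list when the property is absent.
  static List<String> registeredNames(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    List<String> names = new ArrayList<>();
    String value = props.getProperty("assistive_technologies");
    if (value != null) {
      // Assumption: names are separated by spaces and/or commas.
      StringTokenizer parser = new StringTokenizer(value, " ,");
      while (parser.hasMoreTokens()) {
        names.add(parser.nextToken());
      }
    }
    return names;
  }
}
```

A host would typically pass the contents of the properties file to registeredNames and then attempt to load each returned class name.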

In Microsoft Windows, a similar technique could be followed by storing the name of a Dynamic Link Library (DLL) for each assistive technology under a designated registry key, as a key name/assistive technology pair.

Here is an example entry for Windows:

HKEY_LOCAL_MACHINE\Software\Accessibility\DOM 
    "ScreenReader, VoiceNavigation"

Attaching the assistive technologies to the document object model

Once the assistive technology has been registered, any other user agent can determine whether it needs to be loaded and then load it. Once loaded, the assistive technology can monitor the document object model as needed.

On a non-Java platform, one technique is to create a separate thread that holds a reference to the document object model and hosts the assistive technology's DLL. The new thread loads the DLL and calls a specified DLL entry point with a pointer to the document object model interface. The assistive technology then runs for as long as required.
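The same pattern can be sketched in Java as an analogy; this is not the DLL mechanism itself. The host starts a dedicated thread and calls an agreed entry point with a reference to its document object model. The interface DomEntryPoint and the class AssistiveTechnologyLauncher are hypothetical names, and the document object model is represented by a plain Object.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical entry point that an assistive technology would export.
interface DomEntryPoint {
  void attach(Object documentObjectModel);
}

// Hypothetical host-side helper.
class AssistiveTechnologyLauncher {
  // Runs the entry point on its own thread and hands it a reference
  // to the host's document object model. The returned future
  // completes when the entry point returns.
  static CompletableFuture<Void> launch(DomEntryPoint entryPoint,
                                        Object documentObjectModel) {
    return CompletableFuture.runAsync(
        () -> entryPoint.attach(documentObjectModel),
        task -> {
          Thread worker = new Thread(task, "assistive-technology");
          // Daemon thread: the assistive technology should not keep
          // the host process alive after the host exits.
          worker.setDaemon(true);
          worker.start();
        });
  }
}
```

Returning a future is a design choice made for this sketch: it lets the host learn when, or whether, the entry point has returned without blocking its own user interface thread.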

The assistive technology has the option to either:

communicate with a main assistive technology of its own and process the document object model as a caching mechanism for the main assistive technology, or
act as a bridge to the document object model for the main assistive technology.
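The difference between the two options can be sketched as follows. This is a simplified illustration in which the document object model is reduced to a map from element identifiers to text; DocumentAccess, BridgeAccess, and CachingAccess are hypothetical names, not part of any existing API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical query interface used by the main assistive technology.
interface DocumentAccess {
  String textOf(String elementId);
}

// Bridge: every query is forwarded to the live document, so answers
// are always current but each query reaches into the host.
class BridgeAccess implements DocumentAccess {
  private final Map<String, String> liveDocument;

  BridgeAccess(Map<String, String> liveDocument) {
    this.liveDocument = liveDocument;
  }

  public String textOf(String elementId) {
    return liveDocument.get(elementId);
  }
}

// Cache: the document is copied once; later queries are answered
// from the copy and never touch the host.
class CachingAccess implements DocumentAccess {
  private final Map<String, String> snapshot;

  CachingAccess(Map<String, String> liveDocument) {
    this.snapshot = new HashMap<>(liveDocument);
  }

  public String textOf(String elementId) {
    return snapshot.get(elementId);
  }
}
```

In this sketch the cache is never refreshed, so it can go stale; a practical cache would presumably be updated when the host reports a change to the document.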

In the future, it will be necessary to provide a more comprehensive reference to the application, one that provides direct navigation not only to its client area document object model but also to the multiple document object models that it is processing, together with an event model for monitoring them.

Java's direct access

Java can facilitate timely access to accessibility components. In this example, an assistive technology running on a separate thread monitors user interface events such as focus changes. When focus changes, the assistive technology is notified which component object has focus. The assistive technology can communicate directly with all components in the application by walking the parent/child hierarchy, connecting to each component's methods, and monitoring events directly. In this case, an assistive technology has direct access to component-specific methods as well as to those provided by the Java Accessibility API. There is no reason that a document object model interface to user agent components could not be provided via Java.
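Walking the parent/child hierarchy can be sketched with the Java Accessibility API as follows. The calls on AccessibleContext (getAccessibleName, getAccessibleChildrenCount, getAccessibleChild) are part of javax.accessibility; the class AccessibleTreeWalker and its method collectNames are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import javax.accessibility.Accessible;
import javax.accessibility.AccessibleContext;

// Hypothetical helper for an in-process assistive technology.
class AccessibleTreeWalker {
  // Collects accessible names depth-first, parent before children.
  // A name may be null when a component does not provide one.
  static List<String> collectNames(Accessible root) {
    List<String> names = new ArrayList<>();
    visit(root, names);
    return names;
  }

  private static void visit(Accessible node, List<String> names) {
    if (node == null) {
      return;
    }
    AccessibleContext context = node.getAccessibleContext();
    if (context == null) {
      return;
    }
    names.add(context.getAccessibleName());
    for (int i = 0; i < context.getAccessibleChildrenCount(); i++) {
      visit(context.getAccessibleChild(i), names);
    }
  }
}
```

An assistive technology could start such a walk from the component reported by a focus event, or from a top-level window, and attach listeners to the components it visits instead of only collecting names.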

In Java 1.1.x, Sun's Java access utilities check the Java awt.properties file for the presence of assistive technologies and load them as shown in the following code example:

Example.

import java.awt.*;
import java.util.*;

// The enclosing class and method are added here only so that the
// fragment compiles on its own; their names are illustrative.
class AssistiveTechnologyLoader {
  static void loadAssistiveTechnologies() {
    // Read the list of assistive technology class names
    String atNames =
      Toolkit.getProperty("AWT.assistive_technologies", null);
    if (atNames != null) {
      // Names are separated by spaces and/or commas
      StringTokenizer parser = new StringTokenizer(atNames, " ,");
      String atName;
      while (parser.hasMoreTokens()) {
        atName = parser.nextToken();
        try {
          // Load the class and run its no-argument constructor
          Class.forName(atName).newInstance();
        }
        catch (ClassNotFoundException e) {
          throw new AWTError("Assistive Technology not found: "
                          + atName);
        }
        catch (InstantiationException e) {
          throw new AWTError("Could not instantiate Assistive"
                          + " Technology: " + atName);
        }
        catch (IllegalAccessException e) {
          throw new AWTError("Could not access Assistive"
                          + " Technology: " + atName);
        }
        catch (Exception e) {
          throw new AWTError("Error trying to install Assistive"
                          + " Technology: " + atName + " " + e);
        }
      }
    }
  }
}

In the above code example, the expression Class.forName(atName).newInstance() creates a new instance of the assistive technology. The constructor of the assistive technology is then responsible for monitoring application component objects by listening for system events.
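The following sketch isolates this point: because the loader only calls the no-argument constructor, all of the assistive technology's setup has to happen there. DemoTechnology and ReflectiveLoader are hypothetical classes. Unlike the example above, the sketch returns false instead of throwing an error, and it uses getDeclaredConstructor().newInstance(), which replaces the deprecated Class.newInstance() in Java 9 and later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical assistive technology. Its setup runs in the
// no-argument constructor, the only code the loader invokes.
class DemoTechnology {
  static final List<String> STARTED = new ArrayList<>();

  public DemoTechnology() {
    // A real assistive technology would register event listeners
    // here; this sketch only records that it was started.
    STARTED.add(getClass().getName());
  }
}

// Hypothetical loader.
class ReflectiveLoader {
  // Instantiates the named class. Returns false when the class is
  // missing or cannot be constructed.
  static boolean tryLoad(String className) {
    try {
      Class.forName(className).getDeclaredConstructor().newInstance();
      return true;
    } catch (ReflectiveOperationException e) {
      return false;
    }
  }
}
```

Reporting failure through a return value is a choice made for this sketch; a host might instead log the failure and continue loading the remaining assistive technologies.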

In the following code example, the constructor for the assistive technology, AccessEngine, adds a focus change listener using the Java accessibility utilities. When the assistive technology is alerted that an object has received focus, it has direct access to that object. If the object o implements a document object model interface (represented in the example by the placeholder type DOM), the assistive technology will have direct access to the document object model in the same process space as the application.

Example.

import java.awt.*;
import java.awt.event.FocusEvent;
import java.awt.event.FocusListener;
import javax.accessibility.*;
import com.sun.java.accessibility.util.*;

class AccessEngine implements FocusListener {
  public AccessEngine() {
    // Add the AccessEngine as a focus change listener
    SwingEventMonitor.addFocusListener(this);
  }

  public void focusGained(FocusEvent theEvent) {
    // Get the component object that is the source of the event
    Object o = theEvent.getSource();
    // Check whether this is a DOM component (DOM stands for
    // whatever document object model interface the host provides)
    if (o instanceof DOM) {
      ...
    }
  }

  public void focusLost(FocusEvent theEvent) {
    // Do nothing
  }
}

In this example, the assistive technology has the option of running stand-alone or acting as a cache for a bridge that communicates with a main assistive technology running outside the Java virtual machine.

Loading part of the assistive technologies for direct access to the document object model

To attach to a running instance of Internet Explorer 4.0 [IE-WIN], you can use a Browser Helper Object ([BHO]): a DLL that attaches itself to every new instance of Internet Explorer 4.0, provided that the instance was started by running iexplore.exe. You can use this feature to gain access to the object model of Internet Explorer and to monitor events. This is especially helpful when many method calls need to be made to Internet Explorer, because each call executes much more quickly than in the out-of-process case.

There are some requirements when creating a Browser Helper Object:

The application that you create must be an in-process server (that is, a DLL).
This DLL must implement IObjectWithSite.
The IObjectWithSite::SetSite() method must be implemented. It is through this method that your application receives a pointer to Internet Explorer's IUnknown. Internet Explorer actually passes a pointer to IWebBrowser2 but the implementation of SetSite() receives a pointer to IUnknown. You can use this IUnknown pointer to automate Internet Explorer or to sink events from Internet Explorer.
It must be registered as a Browser Helper Object as described above.

Java access bridge

To provide native Microsoft Windows assistive technologies access to Java applications without creating a Java native solution, Sun Microsystems provides the "Java Access Bridge." This bridge is loaded as an assistive technology as described in the section on loading assistive technologies for direct navigation of the document object model. The bridge uses a Java Native Interface (JNI) to Dynamic Link Library (DLL) communication and caching mechanism that allows a native assistive technology to gather and monitor accessibility information in the Java environment. In this environment, the assistive technology determines that a Java application or applet is running and communicates with the Java Access Bridge DLL to process accessibility information about the application or applet running in the Java Virtual Machine.

Loading assistive technologies for indirect access to the document object model

Access to application-specific data across process boundaries or address spaces might be costly in terms of performance. However, there are other considerations that might lead a developer to access the document object model from the assistive technology's own process or memory address space. One obvious protection this method provides is that, if the user agent fails, it does not disable the user's assistive technology as well. Another consideration is legacy systems, where the user relies on the assistive technology for access to software other than the user agent and thus keeps the assistive technology loaded all the time.

There are several ways to gain access to the user agent's document object model. Most user agents support some kind of external interface, or act as a mini-server to other applications running on the desktop. Internet Explorer [IE-WIN] is a good example of this, as IE can behave as a component object model (COM) server to other applications. Mozilla [MOZILLA], the open source release of Navigator, also supports cross-platform COM (XPCOM).

The following example illustrates the use of COM to access the IE object model. It first obtains a pointer to the IWebBrowser2 interface, which in turn gives access to an interface pointer to the document object, that is, the IE document object model for the content.

Example.

/* First, get a pointer to the WebBrowser2 control. This fragment
   belongs in an initialization method where hr is declared. */
if (m_pIE == NULL) {
  hr = CoCreateInstance(CLSID_InternetExplorer,
       NULL, CLSCTX_LOCAL_SERVER, IID_IWebBrowser2,
       (void**)&m_pIE);
}

/* Next, get an interface pointer to the document in view;
   this is an interface to the document object model (DOM). */
void CHelpdbDlg::Digest_Document() {
  HRESULT hr;
  if (m_pIE != NULL) {
    IDispatch* pDisp;
    hr = m_pIE->QueryInterface(IID_IDispatch,
                               (void**) &pDisp);
    if (SUCCEEDED(hr)) {
      IDispatch* lDisp = NULL;
      hr = m_pIE->get_Document(&lDisp);
      /* Guard against a null document pointer, for example
         when no document has been loaded yet. */
      if (SUCCEEDED(hr) && lDisp != NULL) {
        IHTMLDocument2* pHTMLDocument2;
        hr = lDisp->QueryInterface(IID_IHTMLDocument2,
                                   (void**) &pHTMLDocument2);
        if (SUCCEEDED(hr)) {
          /* With this interface pointer, IHTMLDocument2*,
             you can then work on the document. */
          IHTMLElementCollection* pColl;
          hr = pHTMLDocument2->get_all(&pColl);
          if (SUCCEEDED(hr)) {
            LONG c_elem;
            hr = pColl->get_length(&c_elem);
            if (SUCCEEDED(hr)) {
              FindElements(c_elem, pColl);
            }
            pColl->Release();
          }
          pHTMLDocument2->Release();
        }
        lDisp->Release();
      }
      pDisp->Release();
    }
  }
}

For a working example of this method, refer to HelpDB [HELPDB].