This Wiki page is edited by participants of the HTML Accessibility Task Force. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Task Force participants, WAI, or W3C. It may also have some very useful information.

Spec Review/All




Under review:

Simon Harper

Looks fine; however, I'd like to highlight "Other Applicable Specifications", which, without some form of accessibility semantics built into the additional spec description, could be a problem in the future.

James Craig

Section 2.2.1, Conformance Classes: Non-interactive presentation user agents

User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply to the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.

Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.

A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.

The HTML TF should consider if there are any other implications for assistive technology. For example, would exporting to a print-formatted file be considered a non-interactive presentation, and may there be opportunity for assistive technology to interact with the file in another context like a word processor or e-book reader? If so, some of the focus management capabilities may need to be maintained.

Greg Lowney will file a bug that encompasses this

Section 2.2.1, Conformance Classes: Visual user agents that support the suggested default rendering

User agents, whether interactive or not, may be designated (possibly as a user option) as supporting the suggested default rendering defined by this specification.

User agents that are designated as supporting the suggested default rendering must implement the rules in the rendering section that that section defines as the behavior that user agents are expected to implement.

I suggest adding a sentence explicitly stating that a user agent or assistive technology may override the suggested default rendering (e.g. color, font-size, focus style) for the sake of accessibility, readability, or other personal preference. Then again, this is loosely implied by the first paragraph, so it may not be necessary.

Kelly to file bug Bug 13489

Section 2.2.1, Conformance Classes: Conformance Checkers

(Note: no direct link to Conformance Checkers DT. Add ID?)

Oddly phrased sentence:

(This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])

If the requirement is really impossible, the requirement shouldn't be in there at all. I think the author probably meant something like:

"This is only a SHOULD and not a MUST requirement because exhaustive and complete testing of all failure cases would be impossible."

Kelly to file bug Bug 13490

Section 2.2.1, Conformance Classes: Authoring tools and markup generators

However, WYSIWYG tools are legitimate. WYSIWYG tools should use elements they know are appropriate, and should not use elements that they do not know to be appropriate. This might in certain extreme cases mean limiting the use of flow elements to just a few elements, like div, b, i, and span and making liberal use of the style attribute.

This means authoring tools are explicitly required to not use semantic markup in places where there is any ambiguity. While the idealistic purity of the requirement is admirable, it seems rather impractical, since it risks encouraging the creation of highly inaccessible content. The HTML TF and ATAG WG should discuss this point.


All authoring tools, whether WYSIWYG or not, should make a best effort attempt at enabling users to create well-structured, semantically rich, media-independent content.

Add "accessible" to this list.

Kelly to file bug Bug 13418

2.2.3 Extensibility

Authors can create plugins and invoke them using the embed element. This is how Flash works.

Embed? Is the object element deprecated for this purpose? I seem to recall there is better accessibility support for object than embed. Last I checked, Flash was exporting with a preference for object, with embed only used as the legacy fallback (for Netscape 4!). Admittedly, this may have changed, since I haven't authored Flash in the last 7 years or so.

John to file bug: Bug 13430 - 2.2.3 Extensibility

The conformance terminology for documents depends on the nature of the changes introduced by such applicable specifications, and on the content and intended interpretation of the document. Applicable specifications MAY define new document content (e.g. a foobar element), MAY prohibit certain otherwise conforming content (e.g. prohibit use of <table>s), or MAY change the semantics, DOM mappings, or other processing rules for content defined in this specification. Whether a document is or is not a conforming HTML5 document does not depend on the use of applicable specifications: if the syntax and semantics of a given conforming HTML5 document is unchanged by the use of applicable specification(s), then that document remains a conforming HTML5 document. If the semantics or processing of a given (otherwise conforming) document is changed by use of applicable specification(s), then it is not a conforming HTML5 document. For such cases, the applicable specifications SHOULD define conformance terminology.

This may not be an issue, since it claims removing support would make it non-conforming HTML, but this seems to open other document formats that implement part of HTML (EPUB for example) to explicitly forbid certain features of the language, to prohibit or change the processing rules for certain features, such as ARIA or other accessibility features. The HTML TF should review this section.


2.5.6 Colors

Note: CSS2 System Colors are not recognized.

Why not? System colors represent a major accessibility benefit to some users.

Janina to file bug noting that this overlaps with CSS WG issue Bug 13639

HTMLCollection

Only a, applet, area, embed, form, frame, frameset, iframe, img, and object elements can have a name for the purpose of this method; their name is given by the value of their name attribute.

Should input, select, textarea, and others be in that list? The name attribute is most frequently used on form elements. Same comment on the next section, "HTMLAllCollection."

discuss

HTMLFormControlsCollection

… an input element whose type attribute is in the Radio Button state and whose checkedness is true.

Editorial: The term "checkedness" is not a perfectly cromulent word. I believe "checked state" would be less awkward.

John to file bug Bug 13412 - Editorial change to HTMLFormControlsCollection

Greg Lowney

The current draft HTML5 spec says:

Non-interactive presentation user agents

User agents that process HTML and XHTML documents purely to render non-interactive versions of them must comply to the same conformance criteria as Web browsers, except that they are exempt from requirements regarding user interaction.

Typical examples of non-interactive presentation user agents are printers (static UAs) and overhead displays (dynamic UAs). It is expected that most static non-interactive presentation user agents will also opt to lack scripting support.

A non-interactive but dynamic presentation UA would still execute scripts, allowing forms to be dynamically submitted, and so forth. However, since the concept of "focus" is irrelevant when the user cannot interact with the document, the UA would not need to support any of the focus-related DOM APIs.

It is very important that developers should not relegate user agents to the "non-interactive" category if there is benefit to the user in being able to interact with them.

Use case: Imelda uses a screen enlarger, and normally reads blocks of text by having the magnifier track the text caret as she moves it through the content. She runs an application that displays a static, web-based slide detailing today's weather forecast. However, as the developers considered theirs a non-interactive user agent, they left out the ability to do caret browsing. With luck, the screen enlarger will be able to access the application's DOM and determine the screen coordinates of each word, but it would certainly be easier if the application supported caret browsing.

Use case: The weather application that Imelda is running allows the user to select text with the mouse and automatically copies that text to the clipboard. However, the developers did not consider this "focus" or "activation", or even "selection" because the selection does not persist and the user can't perform their choice of actions on it. However, by this decision they are making functionality available to only one input modality, and users who rely on other modalities such as keyboard or speech recognition are denied full access. In this case, the developers should not have considered their application non-interactive, and instead implemented full focus and selection functionality.

Recommendation: The HTML5 spec should include wording that clarifies that user agents should not consider themselves non-interactive if they render content to the user on any system that can take input, and more specifically should not omit support for focus-related DOM APIs just because they do not expect to be taking input. Ideally it would include use cases similar to the above in order to help readers understand the issue.

Greg to file bug for this with keyword a11ytf Bug 13442


Under review:

Simon Harper

The following are notes and concerns for primarily UAWG but with notes to ATWG and HTML5 if the PFWG decide they can be integrated.

3.2.3 Global attributes

  • accesskey - By only allowing an accesskey to be one Unicode character, we remove the possibility of sequenced entry such as Alt+F S for file save. This is important in cognition (called Progressive Disclosure). There is an argument which goes: 'A sequence such as Alt+F S is activating two UI elements in sequence, the File menu followed by the Save sub menu item. Although you are hitting them in sequence, and you the user may perhaps internally recall the key sequence as a combination to invoke the save action, it does not imply that the Save menu choice has an access key, or shortcut, property value of "Alt+F S". The handling of sequenced keys should be in the menu keyboard handler, not through assigning sequences to a given item.' However, if we're talking about Rich Internet Applications, I think this gets to be quite tricky if the developer of an app is trying to maintain compatibility with desktop UI conventions. In addition, I also think we need a Web Key (think Windows or Apple key) which gives focus to accesskeys first, before chrome. mhakkine and I have had an email discussion about this, titled 'HTML5 Sanity Check - before it goes to the Wiki', which should inform any decision.

overlaps with others

  • Class - looks OK
  • contenteditable - Does contenteditable set to true mean that the content/UA now needs to conform to ATAG?


  • contextmenu - typo 'by the invoking the'; Does the 'show' event modify the DOM (cannot find a definition) - if not how will AT 'see' contextmenu?

Michael to file bug on typo Bug 13618


  • dir - looks OK
  • draggable - looks OK
  • dropzone - looks OK
  • hidden - 'if something is marked hidden, it is hidden from all presentations, including, for instance, screen readers.' It seems to me that it should be up to the AT to decide this based on user preference. While the spec states 'should not be rendered', will it be in the DOM? If not, it should be.

UAWG to discuss

  • id - looks OK
  • lang - looks OK
  • spellcheck - looks OK
  • style - looks OK
  • tabindex - looks OK but 'an element that is only focusable because of its tabindex attribute will fire a click event in response to a non-mouse activation (e.g. hitting the "enter" key while the element is focused)' - seems right but thoughts?
  • title - 'If this attribute is omitted from an element, then it implies that the title attribute of the nearest ancestor HTML element with a title attribute set is also relevant to this element. Setting the attribute overrides this, explicitly stating that the advisory information of any ancestors is not relevant to this element. Setting the attribute to the empty string indicates that the element has no advisory information.' Do we need to say that the nearest ancestor HTML element with a title attribute set is automatically used as the UAAG repair default? Jim Allan and I have had an email discussion about this, again titled 'HTML5 Sanity Check - before it goes to the Wiki', which should inform any decision.

UAWG to discuss
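
To make the title inheritance rule quoted above concrete, here is a minimal sketch in Python. It is an illustration only: the Node class and the advisory_title function are hypothetical names, not part of HTML5 or UAAG, and the model assumes that None stands for an omitted title attribute while the empty string stands for an explicitly empty one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for an HTML element (illustration only)."""
    tag: str
    title: Optional[str] = None      # None = attribute omitted, "" = set but empty
    parent: Optional["Node"] = None


def advisory_title(node: Optional[Node]) -> Optional[str]:
    """Return the advisory text that applies to node, or None if there is none.

    Walks up the ancestor chain until an element with a title attribute
    set is found. An empty title stops the walk and yields no advisory text.
    """
    current = node
    while current is not None:
        if current.title is not None:
            # A set attribute overrides ancestors; "" means "no advisory info".
            return current.title or None
        current = current.parent
    return None


body = Node("body", title="Site help")
paragraph = Node("p", parent=body)               # inherits from body
span = Node("span", title="", parent=paragraph)  # explicitly opts out
```

Whether a user agent should treat the inherited value as a repair default is exactly the open question above; the sketch only shows what "nearest ancestor with a title attribute set" resolves to.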

Content Models

I find the Content Model description to be fine including 3.2.7 WAI-ARIA / however someone from PF should take a look at this just to flag anything I've missed in 3.2.7 (I'm technical but not nuanced in ARIA).

Joshue O Connor A11y Review

3.2 Elements [1]

Issue and Concerns.

While this section introduces the use of semantics and how they relate to conformance, the examples given don’t convey the negative impact incorrect or insufficient semantics can have on the user experience, in particular for users of Assistive Technology.

I suggest the following or similar spec text be added to the ‘Elements’ section to illustrate these issues and to provide a deeper sense of context and purpose behind the need to create conformant content.

This text could go in either ‘3.2.1 Semantics’ or in the following section ‘3.2.2 Elements in the DOM’.

Draft Spec text

<sample spec text>

"At all stages, in a dynamically generated or altered document, the semantic integrity of the document snapshot must be maintained at each step in the document mutation – in particular where the document is in a ‘steady’ state (after a user interaction, for example). These rules may be relaxed to some degree during transitional steps that involve background transitions or operations with other APIs but, if the document is to be a conformant HTML5 document, then suitable elements/attributes from this spec must be used in a way that supports the document's integrity, the UA parsing requirements and above all the user experience, as this can impact on the usability and accessibility of the content.

Maintaining this integrity will help user agents such as Assistive Technologies understand and successfully interact with the DOM or other user agent structures, [ref]. This means that a dynamic widget which is semantically structured to be accessible on page load will, when responding to user interaction/input, have to maintain its semantic structure as scripting or other dynamic events are applied and the widget content/structure is altered, in order to be considered conforming HTML.

This example is applicable to processes such as a shopping cart checkout, or in mashup-type applications, and semantic integrity will help to maintain the interoperability of application/document structures as well as facilitate a more consistent user experience. **

    • Examples of where this kind of mutation or dynamically altered content can impact on the user can be found in WCAG."
</sample spec text>

Josh to file bug for this Bug 13444

Related WCAG SC, Fails etc

  • **SC 2.4.5: note wcag reference to dynamic processes [2]
  • F76: Failure of Success Criterion 3.2.2 due to providing instruction material about the change of context by change of setting in a user interface element at a location that users may bypass (context change in an interface) [3]
  • G13: Describing what will happen before a change to a form control that causes a change of context to occur is made. [4]

discuss, send to WCAG

3.2.5 Content models [5]

Issues and Concerns

The SVG content model VENN diagram needs to be made accessible [content-venn.svg].

  • Issues – On focus the SVG diagram container is just announced as ‘Frame 0’. This is in breach of WCAG, as the frame needs a suitable title or description of the diagram, such as via @longdesc or ARIA describedby.

Also, the SVG diagram isn’t keyboard accessible: the descriptions of each of the nodes only appear on mouseover events. They should also be triggerable via the keyboard.

The fallback diagram ‘content-venn.png’, for where SVG is unsupported etc., also needs alternate text to be applied in a way that is accessible. It currently isn’t, as the alternate text of the .png image is not announced when the diagram is parsed by Assistive Technology.

Josh to file bug with help from SVG experts Bug 13447 pinged Chaals

Device-independent Events

SVG uses the new event set provided in DOM 2 [DOM2], which supports device-independent interactive content. This allows authors of SVG to ensure that interactive content does not rely on a user having a particular type of device. Good authoring practice will normally use the focusin, focusout and activate events rather than the device specific events for gaining and losing the focus on an element or activating the element. Device-independent scripting is required by Web Content Accessibility Guidelines [WCAG10] checkpoints 9.3 and 6.4. [6]

Metadata Content/ Other notes

Describing ‘Metadata Content’ as ‘out of band’ is rather vague. It is outlined as ‘[…] content that sets up the presentation or behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other "out of band" information’. What does ‘out of band’ mean?

For example, this doesn’t tally with the editor's stance that accessibility-related semantics such as @summary are hidden ‘metadata’ or ‘out of band’. The summary attribute doesn’t come under the umbrella of ‘[..] behavior of the rest of the content, or that sets up the relationship of the document with other documents, or that conveys other "out of band" information’.

As the @summary on a complex data table can currently be used easily by users of Assistive Technology such as screen readers, and added by content authors, how is this content, by inference, ‘out of band’? It could be argued that the summary attribute relates to ‘presentation’ of content, but it could equally be stated that so does the use of heading elements, list items, etc., yet none of these elements are labeled as ‘metadata’. So there is a disconnect here, and a lack of consistency about what is or isn’t ‘metadata’ and about the inference that @summary is ‘out of band’.


Fallback Content [7]

The definition of what Fallback content is needs to be clearer, as well as how it varies depending on context.

For example, <object> has a fallback model via being "transparent", but <embed> lacks this, and there's confusion among authors about when to use <embed> and when to use <object>.

There is also the question of whether fallback content as supported by <object> is browser-support fallback or accessibility-alternative fallback. These are two separate use cases, even though they sometimes get confused or blended. As currently defined in the spec, the fallback for object is for browser support and does not address accessibility.[8]

This confusion led the bug triage sub-team to conclude that neither <embed> nor <object> has an accessibility fallback mechanism at the moment.

This issue turns out to be represented by Bug 8885. Its current status (as of 20/6/11) is RESOLVED NEEDSINFO.

discuss, relates to bug 8885 which needs to be revived


[1] [2] [3] [4] [5] [6] [7] [8]

Greg Lowney

As a general rule, when it's common for pieces of content to serve a conventional purpose, the markup should allow this purpose to be identified in an unambiguous and automation-friendly way. This allows user agents and assistive technology to make use of it to provide users with advanced navigation and customization features. HTML5 does this in many areas, many more in fact than HTML4, but there are still common purposes that are not yet addressed.

For example, in order to let people find and navigate content using a wider range of mechanisms, markup languages could provide a way to add tags (also called keywords) to any or almost any content elements. Currently keyword tagging is supported at the HTML document level, but not at smaller granularities. As many web sites support content tagging on images, posts, or articles, and HTML5 puts a lot of effort into facilitating aggregation of these, the new feature would also be an example of providing a standardized way for content to expose information that is currently handled in an ad hoc or site-specific fashion, thus allowing user agents and assistive technology to make use of it in novel ways.

Use case: Nadia uses a screen reader to explore a blogger's web site. The page shows the first paragraph of each recent post, each in an HTML5 article element, and each includes the title, poster, posting date, keyword tags, and a link to the full article. Because scanning the entire page searching for articles that match her criteria is much more difficult and time consuming for her than for most users, Nadia uses a browser add-in that lets her navigate to the most recent article before or after a given date. Each article's title and publication date are marked up in automation-friendly ways, as <article><header><h1>The Very First Rule of Life</h1><header><p><time pubdate datetime="2009-10-09T14:28-08:00">…, allowing user agents and accessibility aids to use this information for navigation, filtering, color-coding, etc. However, HTML5 defines no standard way to identify the content that represents the poster's identity or keyword tags, so the author has to put those in plain text. Because of this, her add-in cannot let her quickly navigate to or between articles that are posted by a particular person or that are tagged with specific keywords.

Use case: Marge is browsing a web page with a hundred images, but she wants to find the one of a sailboat. While many users can make a quick visual scan of each screenful of content, paging down until they find the correct image, this is much more difficult for her, so she activates an add-in for her browser, selects the check box for "Images", and a list box is populated with the keywords for all the images on the page. She types the first characters of "boats" to move the focus to this entry, presses Space to select it, and presses Enter, at which point the extension temporarily hides all the content except the two images that have the boat keywords.

Use case: Randy is easily distracted so he uses a browser add-in to filter each page, hiding information that it can recognize as not relevant to his current task. Using standard markup it can, for example, hide all content on a page excepting articles that meet certain criteria.

Use case: In order to help him understand and navigate a collection of HTML documents, Joshua runs a tool that generates and presents him with an index and table of contents. If the author has no way to recommend keywords and phrases for portions of content, the functionality can only provide a simplistic guess at what the primary topic of each section is, but if the author can explicitly associate key words or phrases with headings, tables, and even paragraphs, the tool can do a much better job. This can also give tools such as search engines data and hints to work with.

Recommendation: HTML5 should allow a keywords or tag attribute to be added to all or nearly all elements. This could be a set of space-separated tokens, and although this precludes including spaces in the keywords or key phrases, it would allow it to be processed in standard ways already used by other attributes. For example:

  • <img keywords="flags swan union-jack" src="western-australia.png" alt="Flag of Western Australia">.
  • <article keywords="computers apple news"…>

Alternatively, a keywords element could be defined that could be associated with another element, either by referencing an element ID or by surrounding content with which it's associated.

discuss, John to follow up with Tantek Çelik and Manu Sporny
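
As a rough illustration of how an add-in like the one in Marge's use case might consume such markup, here is a minimal Python sketch. Note the assumptions: the keywords attribute is the proposal made above, not an existing HTML5 feature; KeywordIndexer and find_by_keyword are hypothetical names; the sailboat image is made-up sample data; and matching is exact and case-sensitive, which a real design might not want.

```python
from html.parser import HTMLParser


class KeywordIndexer(HTMLParser):
    """Collects every start tag that carries the proposed keywords attribute."""

    def __init__(self):
        super().__init__()
        self.entries = []  # list of (tag name, attribute dict, set of keyword tokens)

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        raw = attr_map.get("keywords")
        if raw is not None:
            # Space-separated tokens: split on whitespace, ignore duplicates.
            self.entries.append((tag, attr_map, set(raw.split())))


def find_by_keyword(markup, tag, keyword):
    """Return the attribute dicts of tag elements whose keywords include keyword."""
    indexer = KeywordIndexer()
    indexer.feed(markup)
    return [attrs for name, attrs, tokens in indexer.entries
            if name == tag and keyword in tokens]


sample = (
    '<img keywords="flags swan union-jack" src="western-australia.png" '
    'alt="Flag of Western Australia">'
    '<img keywords="boats sailing" src="sailboat.png" alt="A sailboat">'
    '<article keywords="computers apple news"></article>'
)
matches = find_by_keyword(sample, "img", "boats")
```

Because the tokens are split the same way as other space-separated attributes, a tool needs no keyword-specific parsing, which is the processing benefit the recommendation points to.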

Recommendation: HTML5 should define a text-level author element that would identify its content as identifying the author of the associated content. The typical use would be in cases like <article><header><h1>The Very First Rule of Life</h1><header><p><time pubdate datetime="2009-10-09T14:28-08:00"><p>Posted by: <author><a href="/users/gwashington">George Washington</a></author>

discuss, John to follow up with Tantek Çelik and Manu Sporny


Under review:

Joseph Scheuhammer

First, a general comment about section 3.2.5 "WAI-ARIA".  There should be a statement that the implicit/default WAI-ARIA roles are not added to the DOM by a user agent.  That this is the case is implied, albeit loosely, in the definition of "no role", specifically in the parenthetical:

"When [no role is] used as a default implied ARIA semantic, it means the user agent has no default mapping to ARIA roles. (However, it probably will have its own mappings to the accessibility layer.)"

Here is a concrete example:  html5 has an <input type="date"> element that represents a date editor.  There is no corresponding "date editor" ARIA role.  However, there is, in IAccessible2, an IA2_ROLE_DATE_EDITOR.  It's likely that a user agent will expose an <input type="date"> as an IA2_ROLE_DATE_EDITOR if running on an IAccessible2 platform.

The point:  user agents will expose accessibility information via an accessibility API even when there are no ARIA attributes in the source markup.  The tables show the default/implicit mapping between html5 elements and accessibility APIs.  ARIA's use here is a way of expressing that default mapping without stating it for each accessibility API*.  ARIA is being used only as a lingua franca for all such APIs, but the implicit ARIA information is not added to the DOM.

There should be a clear statement to that effect.

* for comparison, see the explicit mapping between ARIA and six accessibility APIs as documented in the ARIA User Agent Implementation Guide. 


Some more specific issues follow.

Note: the following are good questions, but they have answers; ACTION-131 has been created to address them.

First table -- Strong native semantics and default implied ARIA semantics

  1. The <menu type="context"> element represents a context menu.  That is, a menu that is invoked typically with a right click on empty space.  Such menus are not triggered by a menu item in a menu bar, nor by pressing a button.  They are menus, nonetheless.  The table states that context menus have no implicit ARIA role.  Why isn't the implicit ARIA role "menu"?
  2. The <input type="search"> element represents a search field -- a text input field that is specifically used for searching.  The table suggests the implicit ARIA role is "text".  Why not use the ARIA role "search"?
  3. Conversely, <input type="time">, <input type="datetime">, <input type="datetime-local">, <input type="month"> have no implicit ARIA role.  Why isn't their implicit role "textbox"?  A possible answer is that there are appropriate roles in some accessibility APIs, and that the <input> should be exposed as the appropriate role where possible.  Is that the intent here?  If so, the implicit role mapping should be (using the time type as an example), "if the accessibility API supports a time role, expose the <input type="time"> as such; otherwise, fall back to a role of textbox."
  4. The <optgroup> element is documented as having no implicit ARIA role.  Is it not semantically similar to the "group" role?
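
The fallback rule suggested in item 3 could be sketched as follows. This is an illustration under stated assumptions, not a description of any user agent: exposed_role and the lookup table are hypothetical, IA2_ROLE_DATE_EDITOR is the only role name taken from the text above, and the time role name is an invented placeholder because no real one is named here.

```python
# Preferred platform role per input type. Only the "date" entry comes from
# the review text; the "time" entry is a made-up placeholder name.
PREFERRED_PLATFORM_ROLE = {
    "date": "IA2_ROLE_DATE_EDITOR",
    "time": "HYPOTHETICAL_TIME_EDITOR_ROLE",
}

FALLBACK_ROLE = "textbox"


def exposed_role(input_type, platform_roles):
    """Pick the role to expose for <input type=...>.

    platform_roles is the set of role names the accessibility API supports.
    If the API has a specific role for this input type, use it; otherwise
    fall back to a plain textbox role.
    """
    preferred = PREFERRED_PLATFORM_ROLE.get(input_type)
    if preferred is not None and preferred in platform_roles:
        return preferred
    return FALLBACK_ROLE


ia2_like_platform = {"IA2_ROLE_DATE_EDITOR"}
role_for_date = exposed_role("date", ia2_like_platform)
role_for_time = exposed_role("time", ia2_like_platform)
```

On this assumed platform the date input gets the specific role while the time input degrades to textbox, which is the behavior the question asks the spec to state explicitly.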

Second table  -- restrictions

  1. Possible overrides for the <a> element include "checkbox", "menuitemcheckbox", and "menuitemradio".  It's not clear why "radio" is not also allowed.  An <a> element can appear as a checkbox either inside or outside a menu; but, only as a radio button when within a menu.  Why is that a restriction?
  2. Similarly, a <button> can be overridden by a "menuitemcheckbox", but not a "checkbox".  Why a checkbox only when in a menu?
  3. This is more of a logical exercise: Heading elements, <hN>, can be overridden with the "link" role, but the <a> element cannot be overridden with the "heading" role.  Thus, a heading can become a link, but a link cannot be made into a heading.  Why the asymmetry?
  4. It's unclear what the implied default role (2nd column) is for <hN> elements that are within an <hgroup>.  Is it the case that the highest ranking <hN> within the <hgroup> determines the default ARIA role for the <hgroup> element, and all the <hN> elements within the group have no role?  Is that the rationale?  And, if it is, how are the <hN> elements represented in the accessibility tree when they are within an <hgroup>? See ISSUE-164
  5. Similarly, for <li> elements that are not children of an <ol> or <ul>: what is their default role?  None?  How are such "isolated" <li> elements to be represented in the accessibility tree?
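
The asymmetries raised in items 1 to 3 can be made concrete with a small lookup sketch. The table here is deliberately partial: it contains only the overrides named in this review, not the spec's full list, and may_override is a hypothetical helper, with "h1" standing in for any <hN> element.

```python
# Partial data: only the role overrides mentioned in the items above.
ALLOWED_OVERRIDES = {
    "a": {"checkbox", "menuitemcheckbox", "menuitemradio"},
    "button": {"menuitemcheckbox"},
    "h1": {"link"},
}


def may_override(element, role):
    """Return True if role is listed as a permitted override for element."""
    return role in ALLOWED_OVERRIDES.get(element, set())


heading_as_link = may_override("h1", "link")       # allowed per item 3
link_as_heading = may_override("a", "heading")     # not allowed per item 3
link_as_radio = may_override("a", "radio")         # the gap raised in item 1
button_as_checkbox = may_override("button", "checkbox")  # the gap raised in item 2
```

A conformance checker built on such a table would reject the last three cases, so the questions above amount to asking whether those rejections are intended.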


Under review:

Cynthia Shelly

Migrated from Cynthia's preliminary feedback

3.5.1 should describe what happens to the accessibility APIs when the existing document is destroyed. Maybe as part of unload? [9]

Cynthia to file bug against mapping document Bug 13509

This reminds me that document parsing should maybe also talk about setting up the accessibility APIs? [10]

Cynthia to file bug against mapping document Bug 13510

Cynthia to file bug to improve referencing of that document Bug 13511

I believe that this line covers any AT that is getting info directly from the DOM. Can anyone confirm? Unregister all event listeners registered on the Document node and its descendants.

Cynthia to file bug Bug 13654

3.5.3 Document.write(...) has this big warning on it: "This method has very idiosyncratic behavior. In some cases, this method can affect the state of the HTML parser while the parser is running, resulting in a DOM that does not correspond to the source of the document. In other cases, the call can clear the current page first, as if had been called. In yet more cases, the method is simply ignored, or throws an exception. To make matters worse, the exact behavior of this method can in some cases be dependent on network latency, which can lead to failures that are very hard to debug. For all these reasons, use of this method is strongly discouraged."

I believe this is the behavior that people have sometimes thought caused accessibility problems. I have never encountered problems with AT and document.write. Having a DOM that doesn't correspond to the source happens with lots of other script too, so I'm not sure why that is a problem. Does anyone have documented test cases of where document.write is a problem (with a specific page, browser and AT combination)?

If so, can we document that as part of the warning, and also discuss in WCAG 2.0 techniques?

discuss, Cynthia to discuss with Rich and Steve

3.5.5 and 3.5.6 innerHTML, outerHTML, insertAdjacentHTML: Document how these impact the Accessibility API tree, and what events get fired (API and OS events).

discuss with mapping team

I think DOM-based AT has to rely on mutation events here, which can be slow, and which they don't like. Can we talk to some AT vendors about how to improve this? Maybe by adding a specialized event.

Cynthia to file bug Bug 13512

These are REALLY useful for doing accessible scripting, since they can easily add new content into the DOM in the correct order (for screen-reader reading and tabbing). These are good candidates for HTML 5 WCAG 2.0 techniques. But, we need to make sure that they will be sufficiently performant in AT.

Cynthia to put starter techniques in WCAG Techniques wiki or raise agenda items to WCAG WG


Under review:

Léonie Watson

4.5.3. The pre element

The spec states:

"Authors are encouraged to consider how preformatted text will be experienced when the formatting is lost, as will be the case for users of speech synthesizers, braille displays, and the like. For cases like ASCII art, it is likely that an alternative presentation, such as a textual description, would be more universally accessible to the readers of the document."

This doesn't really seem to address the problem. One of the code examples demonstrates the issue quite clearly:

"The following shows a contemporary poem that uses the pre element to preserve its unusual formatting, which forms an intrinsic part of the poem itself."
<pre>                maxling

it is with a          heart

that i admit loss of a feline
        so           loved

a friend lost to the

~cdr 11dec07</pre>

(Apologies if the above code example doesn't come out right. Being a screen reader user, I have no way of telling whether it's displayed correctly or not. That's the problem right there really!)

Bug 10103 filed on 30th June 2011:

4.5.13. The div element

The spec states:

"Authors are strongly encouraged to view the div element as an element of last resort, for when no other element is suitable. Use of the div element instead of more appropriate elements leads to poor accessibility for readers and poor maintainability for authors."

What is the use case for suggesting a div leads to poor accessibility? Using other elements might enhance accessibility, but I'm not sure that's exactly the same thing.

Bug 13157 filed on 6th July 2011:


Under review:

Léonie Watson

No comments.


Under review:

Bruce Bailey

Allowing @title and figure elements as fallback for @alt will, I think, cause some confusion, but this is something I "can live with" and is not worth raising an objection, especially since it is technically consistent with WCAG 2.0. In any case, the thought, consideration, and work that has gone into the decision is quite evident. Thank you all!

I did not find any substantive accessibility related changes between the Working and Editors Draft.

Michael Cooper

[Draft Change Proposal on location of alt guidance]

Bug 13428 covers just the part on removing alt guidance

Judy Brewer

Figcaption is sometimes appropriate as fallback for @alt but not always. Need to figure out the situations. See Examples of figure captions that may be inappropriately long for alt.

Judy to file bug Bug 13651


Under review:

Rich Schwerdtfeger

Need to define model for focus order and how to keyboard out of frame. This may or may not be dealt with in spec.

Cynthia to file bug Bug 13665

Cynthia Shelly

Need to review changes to this model and how it impacts.

Cynthia to file bug Bug 13659, Bug 13660, Bug 13661, Bug 13662, Bug 13663, Bug 13664

Michael Cooper

Need to review for fallback mechanisms for object and embed.


Under review:

Recent Media Bugs

Bug 13400 - Audio: Change note on accessibility for the audio element

Bug 13432 - Editorial changes to The Video element [1 of 5]

Bug 13434 - Editorial changes to The Video element [2 of 5]

Bug 13435 - Editorial changes to The Video element [3 of 5]

Bug 13436 - Editorial changes to The Video element [4 of 5]

Bug 13437 - Editorial changes to The Video element [5 of 5]

Bug 13438 - Editorial changes to Track element [1 of 3]

Bug 13439 - Editorial changes to Track element [2 of 3]

Bug 13440 - Editorial changes to Track element [3 of 3]

Outstanding Media bugs:

Bug 12228 - All Media Elements should have the ability to have both short and longer textual descriptions associated to the element.

Bug 12544 - <video> MEDIA CONTROLLER requires track kind for in-band tracks

Bug 12794 - <video> Add a non-normative note on how to provide text alternatives for media elements

Bug 13383 - Feature request: pause media when hidden

Bug 12405 - <video> There's a problem with overlaying a sign-language video and using native controls, because the overlaid video overlaps the native controls

Bug 11391 - Provide examples of actual <track> usage, user agent implications

Bug 11593 - <video> the <track> @kind attribute should include the all of the identified accessibility content types

Bug 13184 - Allow nested chapters by having nested (time range-wise) cues

Bug 12283 - <video> No indication of parsing error

Bug 12662 - <video> Add a section with suggestions for the 'chapters' text track kind, demonstrating how nested time ranges can be used for hierarchical chapters. (possible duplication of Bug 13184)

Bug 12964 - <video>: Declarative linking of full-text transcripts to video and audio elements

Bug 12141 - <video> Specifically state that all <track> options be exposed to the end user

Bug 13359 - A way is needed to identify the type of data in a track element

Bug 13357 - Additional AudioTrack.kind categories are needed to identify tracks where audio descriptions are premixed with main dialogue.


Under review:

Rich has filed the following bugs:

Image Maps

Under review:

Michael Cooper review

MAP element

No accessibility comments on map element. I wonder though if the usage should be enhanced to phase in id attribute and phase out name attribute.

AREA element

On area element, the sentence "The alt attribute may be left blank if there is another area element in the same image map that points to the same resource and has a non-blank alt attribute." may be problematic. Under this condition the user agent is able to retrieve a text alternative from the other area element, but I wonder if it will. And is there a good use case for this rule? If author has provided alt once but has provided multiple areas, they know the alt and can provide it again. On the other hand, maybe it's easier to only have one pointer to a given resource have the alt, and the others be "merged" by the UA into a single area? The spec covers this idea in the next section, but only for browsers that don't process images; doesn't say what processing is for graphical browsers that also use alt text.

Michael to file bug on this, noting that the UA procedure doesn't seem realistic Bug 13449

Editorially, I had questions reading here that are answered only in the next section by inference from the processing procedure. There should be pointers forward to explain why certain requirements exist.

Image Maps

"In user agents that do not support images, or that have images disabled, object elements cannot represent images, and thus this section never applies (the fallback content is shown instead). The following steps therefore only apply to img elements." Why is this rule here? It renders imagemaps associated with OBJECT inaccessible, unnecessarily. Or, it forces authors to put a redundant copy of the links in the fallback content, but why do that? Unless I better understand the reason for this, I think this is a big accessibility issue.

Michael to file bug Bug 13451

"For historical reasons, the coordinates must be interpreted relative to the displayed image, even if it stretched using CSS or the image element's width and height attributes." If I read this right, changing the size of the image will cause the imagemap areas to be incorrectly sized, as they are not themselves resized but are just smacked onto the resized image. That will create a big accessibility problem, and also a usability one. And there is no reason for a new version of the spec to keep a problematic feature "for historical reasons". The spec should provide a transition to a more workable model.

Michael to file bug that this section is ambiguous and needs clarification; attempt to provide wording; work with Cynthia; also mention use case of browser zoom Bug 13453

Steve Faulkner

"The alt attribute may be left blank if there is another area element in the same image map that points to the same resource and has a non-blank alt attribute."

As implemented by browsers, this results in screen readers announcing the presence of a link but providing no information about the link target. It is noted that a bug has been filed about this issue: Bug 13449.
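
A minimal sketch of the alternative discussed above: every area element carries its own non-blank alt, even when two areas point to the same resource, so no area relies on the user agent borrowing text from a sibling. File names, coordinates and alt text are made up for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Image map sketch</title>
</head>
<body>
<!-- Hypothetical image and link targets. -->
<img src="campus-map.png" alt="Campus map" usemap="#campus" width="400" height="200">
<map name="campus">
  <!-- Two areas link to the same page; each still has its own alt. -->
  <area shape="rect" coords="0,0,200,200" href="library.html" alt="Library, main entrance">
  <area shape="circle" coords="300,50,40" href="library.html" alt="Library, side entrance">
  <area shape="rect" coords="250,120,400,200" href="cafe.html" alt="Cafe">
</map>
</body>
</html>
```

With this markup a screen reader has a name to announce for each link, regardless of how the browser handles the blank-alt rule.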


Under review:

John Gardner

"math" role

User agents should treat all <math> elements as having the "math" role unless explicitly overridden.

Janina to ask John to file bug Note Bug 10438 (wontfix) already exists - we decided to reopen that bug and request it be a tracker issue.

Alt text

User agents should treat a <math> element's "alttext" attribute as equivalent to an "aria-label". When present, this text should be added to the AT agent's speakable buffer in place of a plain-text rendering of the math symbols.

issue for MathML spec

Janina to ask John to file bug, not sure if this is for MathML spec to clarify or if it's particular to the integration of MathML in HTML Bug 13647


Under review:

John Gardner

"image" role

All <svg> elements should default to using the image role. This may already be defined somewhere.

Janina to ask John to file bug Bug 13648

Alt text via <title> and <desc> elements

We need eventually to define how the root-level <title> and <desc> tags are exposed to AT agents in an HTML document. It is not yet clear how the <desc> element should be exposed, but the <title> element is pretty obvious. The root <title> element should be exposed to AT agents exactly as the <img> element's "alt" attribute is exposed. However, an "aria-label" attribute on the <svg> element should take precedence.

Janina to ask John to file bug, not sure if this is for SVG spec to clarify or if it's particular to the integration of SVG in HTML Bug 13649
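
To make the proposal above concrete, here is a small sketch of an inline svg element with a root-level title and desc plus an aria-label. It assumes the ARIA "img" role is what is meant by the "image" role, and it only shows the markup; how each browser and AT combination actually exposes title, desc and aria-label is the open question in these notes. The chart content is invented.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Inline SVG sketch</title>
</head>
<body>
<!-- Under the proposal, aria-label would take precedence over the root <title>. -->
<svg role="img" aria-label="Sales by quarter" width="120" height="60" viewBox="0 0 120 60">
  <title>Bar chart</title>
  <desc>Four bars of increasing height, one per quarter.</desc>
  <rect x="5" y="40" width="20" height="20"/>
  <rect x="35" y="30" width="20" height="30"/>
  <rect x="65" y="20" width="20" height="40"/>
  <rect x="95" y="10" width="20" height="50"/>
</svg>
</body>
</html>
```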

Child <title> and <desc> elements

Exposure to AT needs eventually to be defined, but it is not yet clear what this should be.

Janina to ask John to file bug, not sure if this is for SVG spec to clarify or if it's particular to the integration of SVG in HTML

Matt May

There is no information on how SVG content is to be handled by or conveyed to AT, though this is consistent throughout the spec. Neither is there info on what ARIA role an SVG object should default to; whether setting it to "application" will offer access to the SVG DOM directly; whether setting it to something other than "application" would hide any accessibility metadata existing inside the SVG object; or whether HTML content inside an svg:foreignObject fragment is attached to the HTML DOM.

Bug 13541


Under review:

Joshue O Connor

4.9 – Tabular Data

Issues and Concerns

In section 4.9.1 the spec states:

“If a table element has a summary attribute, and the user agent has not classified the table as a layout table, the user agent may report the contents of that attribute to the user.”

The link that describes @summary then advises the author to use one of the ‘techniques for describing tables’ instead. This advice is contradictory. Our preference is that @summary is retained as a fully conforming attribute.

There are also some problems with the accessibility of the current examples and how they are supported in current screen readers and browsers. Also the impact this may have on older legacy user agents needs to be considered.

The first example has no programmatic connection between the table and the paragraph that it provides the summary of. So if the screen reader user focuses on the table by using the navigation features available within their screen reader, they will miss this information contained within the <p> element.

If they read the content in a linear way, then they will discover the paragraph before the table, but as outlined this very well may not happen.

The first couple of examples are fine, and can be considered to be accessible ( example as it is nested within the table, with <caption> in <details> element) as they are also nested within the table.

Note that in the example, ‘Table Caption, in a details element’ – the content of the <caption>, including the <details> element, is announced on focus using a screen reader. However, the use of the details element is currently unannounced in JAWS 12 and in VoiceOver, so the text string is read out. Note that using HTML 4 and adding a caption with a suitable @summary would retain the visual presentation of the <caption> contents within the browser but hide the @summary from non-users of AT. This is a desirable option to have, as the @summary content may not be useful to sighted people in every instance, so retaining the option to do this in a valid conforming way in HTML 5 would make sense.[1]

This is important functionality for a blind person, as they do not have to try to ‘discover’ any extra/supplementary information that could provide useful descriptions of the table’s purpose – using @summary does this already. Additional information that is provided in another element such as in the <figure>/<figcaption> may be missed by the screen reader user, as in the following example from the HTML 5 spec.

Whereas many of the examples are good for sighted users as an aid to comprehension, they are often not suitable for non-sighted users, so the option to use the @summary as a valid attribute in HTML5 is a common sense solution.

  • While the descriptions may be hidden by CSS etc., this defeats the purpose of the current spec’s stance of having the ability to display them for everyone. However, it is best to support both use cases where a description may not need to be visually displayed (of which there may be many use cases).

The use of the <figure> element does not currently provide the same level of accessibility in current User Agents.

When testing the example ‘Next to the table, in the same figure’ using JAWS 12 and VoiceOver none of the <figure> or <figcaption> contents were announced when the table received focus using the navigation functions of the screen reader. When a containing element first receives focus and the following <figure> content is read in a linear fashion then it is announced. However, this may not be the way a screen reader user will encounter the table.

<figure>
 <figcaption>Characteristics with positive and negative sides</figcaption>
 <p>Characteristics are given in the second column, with the
 negative side in the left column and the positive side in the right
 column.</p>
 <table>
  <thead>
   <tr>
    <th id="n"> Negative
    <th> Characteristic
    <th> Positive
  <tbody>
   <tr>
    <td headers="n r1"> Sad
    <th id="r1"> Mood
    <td> Happy
   <tr>
    <td headers="n r2"> Failing
    <th id="r2"> Grade
    <td> Passing
 </table>
</figure>

The same issue arises with the next example, ‘Next to the table, in a figure’s figcaption’.

For example:

<figure>
 <figcaption>
  <strong>Characteristics with positive and negative sides</strong>
  <p>Characteristics are given in the second column, with the
  negative side in the left column and the positive side in the right
  column.</p>
 </figcaption>
 <table>
  <thead>
   <tr>
    <th id="n"> Negative
    <th> Characteristic
    <th> Positive
  <tbody>
   <tr>
    <td headers="n r1"> Sad
    <th id="r1"> Mood
    <td> Happy
   <tr>
    <td headers="n r2"> Failing
    <th id="r2"> Grade
    <td> Passing
 </table>
</figure>

In current screen readers (tested using the latest version of VoiceOver on Mac OSX 10.6.7 and Safari 5.0.5, and JAWS 12 in IE 8 and IE 9), none of the figure or figcaption data was announced on focus.

Note: This testing was done with new browsers and AT. The mark up examples in the spec would not work with older browsers and AT.


While the above examples are to be considered conformant HTML5, there are accessibility issues with them. The above examples for attaching longer descriptions to tables may also illustrate why there is a need for @summary to be reinstated. The @summary attribute contents are read out by a screen reader as soon as the <table> element receives focus. The scope of what @summary is for may be changed in a suitable Change Proposal, but its support in existing user agents, its ease of use and the fact that it is announced as soon as the table element receives focus show that it would be unwise to maintain its obsolete but conforming status, especially when we consider that some of the above examples are neither suitable nor particularly accessible to the latest in screen reading technology.

A screen reader user can often navigate HTML content via explicit elements allowing the user to navigate quickly so explicit programmatic determination such as the current use of @summary is very useful as existing UAs can easily use this data to give the user a quick overview within the context of the current element focus.

Therefore the summary attribute should be reinstated as a full feature of HTML 5 or a suitable summary mechanism defined in a CP.


Yeliz Yesilada

Yeliz was invited to provide feedback by Simon Harper of the UAWG because of his expertise in accessibility of tabular data. His email is yyeliz at metu dot edu dot tr

The specification says that the following is the content model for the table element: "In this order: optionally a caption element, followed by zero or more colgroup elements, followed optionally by a thead element, followed optionally by a tfoot element, followed by either zero or more tbody elements or one or more tr elements, followed optionally by a tfoot element (but there can only be one tfoot element child in total)." As can be seen from this specification, having a caption element is *OPTIONAL*. That means one can author a table which does not have a caption. The caption element can be considered to provide a summary, a broad overview of the table (see Section 4.9.2). However, since it is an optional element, some tables can be created without any summary or overview of the content. Furthermore, the summary attribute on the table element is obsolete and should not be used by authors. That means one can author a table without a caption, and that table will not include a summary. From an assistive technology (AT) perspective, that means the underlying code might not provide any kind of summary of the table. That means, for example, that screen reader users will not have a broad overview/summary of the table. The best recommendation would be for them to reconsider the optionality aspect of captions (in this case they are considered to provide a broad overview) or to re-introduce the summary attribute.

Related to the topic above, the suggested use of the figcaption element is not currently supported by assistive technologies, but with future developments of assistive technologies figcaption can be very useful. In particular, the examples with the figure element and figcaption show how the captioning can be used and interpreted by assistive technologies.

part of overall table summary discussion

In the techniques for describing the table (Section, the first example shows how a table can be described with a paragraph located outside a table tag <p></p><table>…</table>. However, this is not a good idea because the assistive technologies will not be able to automatically detect if the paragraph located outside the table element is meant to describe the following table. Therefore, the best recommendation would be for them to reconsider this example.
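
One possible way to give the paragraph a programmatic connection to the table, sketched here as an assumption rather than as anything the spec examples do, is to keep a caption inside the table and point the table at the paragraph with the WAI-ARIA aria-describedby attribute. The id value is hypothetical and AT support for this pattern would need testing.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Table description sketch</title>
</head>
<body>
<!-- The paragraph stays outside the table but is referenced by id. -->
<p id="char-desc">Characteristics are given in the second column, with the
negative side in the left column and the positive side in the right column.</p>
<table aria-describedby="char-desc">
  <caption>Characteristics with positive and negative sides</caption>
  <thead>
    <tr><th id="n">Negative</th><th>Characteristic</th><th>Positive</th></tr>
  </thead>
  <tbody>
    <tr><td headers="n r1">Sad</td><th id="r1">Mood</th><td>Happy</td></tr>
    <tr><td headers="n r2">Failing</td><th id="r2">Grade</th><td>Passing</td></tr>
  </tbody>
</table>
</body>
</html>
```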


Bruce Bailey

I agree with Yeliz that adding @summary examples is a good idea. I can draft some if that would be helpful.

Below I describe my methodology.

I did not find any substantive accessibility related changes between the Working and Editors Draft.

To get up to speed on the most pressing background issues, I read Working Group Decision on ISSUE-31 / ISSUE-80 requirements survey and Working Group Decision on ISSUE-32 table-summary which in turn referenced HTML-A11Y Task Force Recommendation: ISSUE-32 Table-Summary.

Simon Harper

Content model

My main problem with Tables is the 'Content model: In this order: optionally a caption element, followed by zero or more colgroup elements, followed optionally by a thead element, followed optionally by a tfoot element, followed by either zero or more tbody elements or one or more tr elements, followed optionally by a tfoot element (but there can only be one tfoot element child in total).' in which a caption, column head, or the table foot elements are all optional. These provide the best way to describe a complex data structure and I am unsure as to why they are optional.

TF thinks there are valid use cases

Layout tables

If tables should not be used for layout (as stated) why are mechanisms provided to allow them to be used as layout, especially 'The use of the role attribute with the value presentation' seems to be the most pernicious as it implies layout usage is OK really.

We have a precarious balance with these features we want to maintain, backward compatibility and repair issues exist


Under review:

Joshue O Connor

  • Issue 1) Sections 4.10 (and the corresponding subsections) seem fine to me. My only main observation is editorial, in that it seems to be a missed opportunity to educate and inform authors about some of the reasons behind best practice and the benefits of well-formed markup. While the examples given are good, easy to follow and clear, the spec doesn’t state why form elements and controls are marked up in the way that is demonstrated, or even mention any of the practical reasons/benefits for doing so (interoperability, accessibility, form validation etc).
  • Issue 2) In section 4.10.6 The label element, the label is said to represent a 'caption' in a user interface. There is, however, no explanation of what “caption” means in this context. Is it the same as a 'caption' for a data table, or is it similar to the figcaption etc? I suggest linking to the section of the spec that describes what a caption is and what it is for, as well as its content model etc.

Michael to file bug (editorial, doesn't need a11ytf keyword) Bug 13620

Markku Hakkinen

Note: migrated from UA wiki 26 July 2011, may not have content added after that

HTML5 includes new input types and attributes, such as type="tel", pattern, and placeholder. These new types and attributes have usability and accessibility implications, and guidance within the HTML5 specification is at times contradictory.

The pattern attribute

The pattern attribute specifies a regular expression against which the control's value, or, when the multiple attribute applies and is set, the control's values, are to be checked.

The pattern attribute value, in regular expression notation, is not suitable as a hint or description for end-users. The HTML specification describes methods for presenting the pattern to end-users:

When an input element has a pattern attribute specified, authors should include a title attribute to give a description of the pattern. User agents may use the contents of this attribute, if it is present, when informing the user that the pattern is not matched, or at any other suitable time, such as in a tooltip or read out by assistive technology when the control gains focus.

The specification goes on to state:

When a control has a pattern attribute, the title attribute, if used, must describe the pattern. Additional information could also be included, so long as it assists the user in filling in the control. Otherwise, assistive technology would be impaired.

For instance, if the title attribute contained the caption of the control, assistive technology could end up saying something like The text you have entered does not match the required pattern. Birthday, which is not useful.

UAs may still show the title in non-error situations (for example, as a tooltip when hovering over the control), so authors should be careful not to word titles as if an error has necessarily occurred.

The lack of a keyboard accessible mechanism for displaying title content within all major UAs prevents keyboard users from accessing the pattern description. The statement "Otherwise, assistive technology would be impaired" fails to address the actual accessibility implications.

The use of the title, implied in the second paragraph above, as text for an error message, implies processing by the UA or assistive technology, and is in effect a special casing of the title attribute.

The definition of the placeholder attribute also contradicts the recommendation of using title.
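
For reference, a minimal sketch of the authoring pattern under discussion: a pattern attribute with a title that describes the pattern, plus a visible hint so that keyboard users who cannot reach the tooltip still get the format. The visible hint and its aria-describedby link are an assumption added for illustration, not something the spec text requires; the form action and ids are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>pattern and title sketch</title>
</head>
<body>
<form action="/parts" method="get">
  <div>
    <label for="part">Part number:</label>
    <!-- title describes the pattern, as the spec recommends. -->
    <input id="part" name="part" pattern="[0-9][A-Z]{3}"
           title="A part number is a digit followed by three uppercase letters."
           aria-describedby="part-hint">
    <!-- Visible hint: does not depend on hover or tooltips. -->
    <span id="part-hint">Format: one digit followed by three uppercase letters, for example 1ABC.</span>
  </div>
  <div><input type="submit" value="Look up"></div>
</form>
</body>
</html>
```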

Markku to file bug Bug 13635 Kelly Note: This is not the best bug just yet as it has no proposed changed text. I am entering this bug in place of Mark and thought it was higher priority to get the issues raised versus final text. Please advise if final text needs to be in by the deadline of today.

Greg Lowney

Tri-State Checkboxes

Why is there no true support for tri-state checkboxes (those with states of checked, unchecked, and mixed/undefined)? Not providing this means that authors and developers will still have to implement custom controls and behaviors for an extremely common feature.

Use case: Aidan uses speech recognition for input. When he views an interactive web page or web-based application that uses standard HTML5 controls, his speech recognition program can let him control them using standardized commands, such as checking or unchecking recognized checkboxes by saying "Check italic" or "Uncheck bold". However, when it encounters custom controls or controls with nonstandard behaviors, he has to resort to saying actual keystrokes, such as "Press tab. Press tab. Press tab. Press space." He uses a web-based text editor that provides a tri-state check box that is checked when the entire selection is italicized, unchecked when the entire selection is not italicized, and in a third, "mixed" state when only part of the selection is italicized. In one scenario, he accidentally checks the Italics check box then realizes he wants to change it back to the "mixed" state. Let's say it's implemented as an HTML5 input element with type=checkbox, but with scripting to handle the tristate behavior (perhaps as described in Shams' Blog: Tri-State Checkbox using Javascript); in this case the keyboard UI is not standardized, so the speech recognition utility cannot provide a corresponding voice command. In a second scenario, it's implemented as an entirely custom control.

Use case: Nadia uses a screen reader with the same web-based text editor that provides a tri-state checkbox for indicating and adjusting italics. Regardless of whether the author used an HTML5 input element with type=checkbox or if they used an entirely custom control, the screen reader has no way of determining which state it's really in, and so can't convey this to Nadia using speech.

Use case: Ryan, a keyboard user, is using the same web-based text editor that provides a tri-state checkbox for indicating and adjusting italics. Unfortunately, because each web site or web-based app has to implement its tri-state checkbox itself, they often implement entirely different keyboard UI, and so when Ryan comes to one he cannot easily figure out how to use it with the keyboard.

Recommendation: HTML5 should support tri-state check boxes and menu items so that user agents can provide standardized user interface for them and so assistive technology can provide alternative input and output for them.
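
For context, the closest existing hook is the indeterminate IDL attribute on checkboxes, which can only be set from script. The sketch below shows it, with a hypothetical id and label; as far as I understand it, the attribute affects only how the checkbox is presented, the checked state remains a boolean, and a user activation resets the mixed presentation. How it is exposed through accessibility APIs is not shown here.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Indeterminate checkbox sketch</title>
</head>
<body>
<label for="italic">Italic</label>
<input type="checkbox" id="italic">
<script>
  // Present the checkbox as "mixed" because only part of the
  // (hypothetical) selection is italicized.
  var box = document.getElementById("italic");
  box.indeterminate = true;

  // There is no markup-only way to return to the mixed state once the
  // user toggles the box; any such behavior has to be scripted by the author.
  box.addEventListener("change", function () {
    // box.checked is now true or false; box.indeterminate is false.
  });
</script>
</body>
</html>
```

This illustrates the gap in the use cases above: the third state exists only as a script-controlled presentation, with no standardized user interaction for reaching it.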

Greg to file bug (noting that there is backwards compatibility issue with the boolean checked attribute) Bug 13508

Cynthia Shelly


  • I’d like to see some discussion of using form elements (checkboxes, buttons, etc) outside of <form> tags, as is common for interactive applications.

Cynthia to file bug (will need to suggest edits later, not a11ytf) Bug 13656

Cynthia to file bug about form examples improvement DONE: Bug 13528

  • Does the spec have a glossary?  There are a lot of linked terms that don’t lead to a clear definition.  Some of the links seem to be circular.  (some specific examples in the notes below)

Janina to file bug on behalf of PF Bug 13641

Is it really common to wrap form elements in <p> tags?  I would have expected divs or just letting them flow within a fixed-width container.

Cynthia to file private bug DONE Bug 13530

  • example uses <label><input>text</label> which is the less-preferred way to do labeling.  Examples should use <label for> and ID.

Cynthia to file bug DONE Bug 13531, depends on examples bug
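
A short sketch of the explicit labeling style suggested above, using for and id instead of wrapping the control in the label. It reuses the field names from the spec example; the id values and form action are hypothetical.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Explicit label sketch</title>
</head>
<body>
<form action="/register" method="post">
  <div>
    <!-- The for attribute refers to the id of the control it labels. -->
    <label for="fn">Full name:</label>
    <input id="fn" name="fn">
  </div>
  <div>
    <label for="age">Age:</label>
    <input id="age" name="age" type="number" min="0">
  </div>
  <div><input type="submit" value="Send"></div>
</form>
</body>
</html>
```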

  • Uses <button> for submission.  Better practice to use <input type=submit> so script is not required.

Cynthia to file bug DONE Bug 13542 suggesting type=submit|reset for button element. button already has this feature. Look into whether it actually submits the form in any browsers. Janina to work with UAWG to see if there are keyboard issues with button or if there are UAs in which it does not submit by default

4.10.4 Fieldset

  • Why is fieldset barred from constraint validation?  Why does it support validation methods if it is?

Cynthia to file bug privately

  • Examples are all about disabling sections, which is a new feature in 5.  Add examples of grouping and labeling uses from prior versions.

Cynthia to file bug DONE Bug 13543 suggesting non-interactive fieldset example from HTML 4 spec
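
The kind of non-interactive grouping example being asked for might look like the sketch below: a fieldset groups related radio buttons and the legend names the group. The question text, names and ids are invented for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Fieldset grouping sketch</title>
</head>
<body>
<form action="/contact" method="post">
  <fieldset>
    <!-- The legend is plain text; it contains no form controls. -->
    <legend>Preferred contact method</legend>
    <div>
      <input type="radio" name="contact" id="c-email" value="email">
      <label for="c-email">Email</label>
    </div>
    <div>
      <input type="radio" name="contact" id="c-phone" value="phone">
      <label for="c-phone">Phone</label>
    </div>
  </fieldset>
  <div><input type="submit" value="Save"></div>
</form>
</body>
</html>
```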

4.10.5 Legend

  • This syntax is confusing.  Why do the only examples show legends that are interactive?


Discuss a11y and aapi issue of form controls in legend

4.10.6 Label

  • The second sentence here seems to contradict the first and the following example???

“The label element's exact default presentation and behavior, in particular what its behavior might be, if anything, should match the platform's label behavior. The activation behavior of a label element for events targetted at interactive content descendants of a label element, and any descendants of those interactive content descendants, must be to do nothing.”

Cynthia to file bug DONE Bug 13553 noting that LC version was as desired, later edit is problematic

  • Multiple labels (control.labels).  HTML 4 restricted this to one label per control.  How does this map to control name accessibility APIs?  How does it interact with ARIA name calculation?  Is it useful for accessibility? What is the use case for this?  Problems with hit testing and API mappings.

Discuss in the task force about how multiple labels should be handled

  • Need to think about how this example would map to Accessibility APIs
<p><label>Full name: <input name=fn> <small>Format: First Last</small></label></p> 
<p><label>Age: <input name=age type=number min=0></label></p> 
<p><label>Post code: <input name=pc> <small>Format: AB12 3CD</small></label></p> 
4.10.7 Input

Discuss AAPI handling of this use case

  • In general, having attributes that are only valid with certain types is confusing.  In HTML 4, it’s a common error to use alt on input types other than type=image.  The design seems likely to cause more of that sort of confusion.  Can any of the attributes be made applicable to all input types, with specified mapping behavior?

Cynthia to file bug DONE Bug 13547 acknowledging there's history to work with but would like identify a path forward

  • Alt.  Is it required? API mapping needs to cover interaction of label and alt. ***yes, it is required. covered later in the section***
  • Autocomplete:  how does this map to API?  Does this turn text controls into comboboxes?
  • Should autocomplete be able to take a URL for a service?  How about a datalist?
  • Autofocus:  good idea to get around script issues with screen readers.  Do we know if it works?
  • “Save Button” example for formnovalidate is a good candidate for a WCAG technique [11]
  • Why are there size, width and height attributes?  Shouldn’t this be handled with CSS?  Can we make these obsolete but conforming?

Cynthia to file bug DONE Bug 13548

  • Input.form what is the use case for this?  It claims to be a work-around for the lack of nested forms, but what is the use case for nested forms?  4.10.18 Association of controls and forms [12].  This looks like an authoring error. Specifying what the browser should do in case of this authoring error is fine, but why have an authoring feature to work around it? 

Cynthia to file bug DONE Bug 13549

  • List.  This makes any input that is a text control into a combobox.  Type table needs to reflect that these change control type when they have lists.

Cynthia to file bug DONE Bug 13550

  • Name isn’t in the table of what attributes are supported on what types.  Assume it’s on all types?  Should be listed to make it clear.
  • Pattern.  This is really cool.
  • Placeholder.  What should users put in here, vs. label, vs. title?  In accessibility API, I think this maps to accname, not value as current fake-placeholders do.  How does it play with other things in name calculation?  What should be read if there is both a label and a placeholder?


  • Required.  Also good.
  • What is a value sanitization algorithm?  Please define what this is supposed to do.  Links within the section go to the top of the section.

Cynthia to file bug DONE Bug 13551

  • Is the second sentence referring to immutable objects? It seems to apply to mutable, but doesn’t make sense  “Each input element is either mutable or immutable. Except where otherwise specified, an input element is always mutable. Similarly, except where otherwise specified, the user agent should not allow the user to modify the element's value or checkedness.”

Cynthia to file bug DONE Bug 13552

Cynthia to file bug DONE Bug 13554

Is the full list of cases defined somewhere?

  • “When an input element is first created, the element's rendering and behavior must be set to the rendering and behavior defined for the type attribute's state, and the sanitization algorithm, if one is defined for the type attribute's state, must be invoked.”

These are defined with the states.  This is not clear when reading the above text, and the link goes to the top of the same section.  Can we have a generic definition of a value sanitization algorithm?

  • Input type can be changed dynamically.  This is inconsistent with ARIA role.  Is that ok?  It seems unlikely that either will change at this point.  Will need to be considered for API mapping. 

Cynthia to file bug DONE Bug 13560 discuss how to handle in aapi guide and how hard we want to push this, given there's momentum from a year ago supporting this change and there's implementation - need to explore use cases

  • Not accessibility, but it would be nice if <input type=email multiple> could split on semi-colon as well as comma DONE 13559
  • Is there value to allowing some form of friendly name in email fields?  When you paste from email programs, this is often included, and it’s nice to be able to keep the friendly names around.  Cognitive, maybe?

Cynthia to file bug DONE Bug 13558 Discuss friendly name, cognitive load, copy/paste from email programs, etc.
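
A minimal sketch of the two requests above, for discussion only. It assumes a hypothetical helper named parseEmailList (not a spec or browser API) that splits on both commas and semicolons and keeps an optional friendly name of the form Name <address>. It deliberately ignores friendly names that themselves contain a comma or semicolon, which a real parser would have to handle.

```typescript
// Hypothetical illustration only; this is not the HTML5 algorithm for type=email.
interface Mailbox {
  address: string;
  friendlyName?: string;
}

function parseEmailList(raw: string): Mailbox[] {
  return raw
    .split(/[;,]/) // accept both separators
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part): Mailbox => {
      // Match "Friendly Name <user@example.com>" as pasted from mail programs.
      const match = /^(.*?)\s*<([^<>]+)>$/.exec(part);
      if (!match) {
        return { address: part };
      }
      const [, rawName = "", address = ""] = match;
      // Drop surrounding double quotes from the friendly name, if any.
      const friendlyName = rawName.replace(/^"|"$/g, "").trim();
      return friendlyName.length > 0
        ? { address: address.trim(), friendlyName }
        : { address: address.trim() };
    });
}
```

Keeping the friendly name alongside the address would let a user agent show the name to the user while submitting only the address.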

  • Security issues with autocomplete or pattern on password?
  • Still concerning that date and time elements don’t have controls specified for them.  This seems likely to lead to inconsistent UI, which may be confusing to AT users.

Cynthia to file bug DONE 13561 will want to provide info about making control keyboard friendly, stylable, calendar popup readable to screenreaders, etc. and may need to address in aapi implementation guide

  • Range says that the exact number is not important, but it renders as a slider.  There are sliders where the exact number is important. “The user agent could pick which one to display based on the dimensions given in the style sheet. This would allow it to maintain the same resolution for the tick marks, despite the differences in width.” This is very vague.  These different presentations should be controlled with CSS, not the browser.

Cynthia to file bug DONE 13657 about conflating number and range etc.

  • Number’s control is a “text box or spinner control.”  Need to pick one so we know what accessibility role to map it to

Cynthia to file bug Bug 13562

Discuss balance between UA freedom to handle things as it prefers, vs impacts that has in real world on testing, author control, aapi mapping, etc. (Cynthia, Greg, UAWG have input on that)

  • “The input element represents a two-state control that represents the element's checkedness state. If the element's checkedness state is true, the control represents a positive selection, and if it is false, a negative selection. If the element's indeterminate IDL attribute is set to true, then the control's selection should be obscured as if the control was in a third, indeterminate, state.


The control is never a true tri-state control, even if the element's indeterminate IDL attribute is set to true. The indeterminate IDL attribute only gives the appearance of a third state.”

Why isn’t there a 3rd state?  This is inconsistent with ARIA.  Is that ok?
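
To make the mismatch concrete, here is a small sketch of one possible API mapping. The function names are hypothetical and the mapping of indeterminate to the ARIA value "mixed" is an assumption for discussion, not something the quoted text specifies. It shows that the exposed state can have three values while the underlying checkedness stays two-state.

```typescript
// Assumed mapping for illustration; the real mapping belongs in the API mapping guide.
type AriaChecked = "true" | "false" | "mixed";

interface CheckboxState {
  checked: boolean; // the element's checkedness (two-state)
  indeterminate: boolean; // IDL attribute; affects presentation only
}

// What an accessibility API might expose for the control.
function exposedCheckedState(state: CheckboxState): AriaChecked {
  if (state.indeterminate) {
    return "mixed";
  }
  return state.checked ? "true" : "false";
}

// Form submission looks only at checkedness, so "mixed" is never submitted.
function isSubmitted(state: CheckboxState): boolean {
  return state.checked;
}
```

An AT user could therefore hear "mixed" for a control whose submitted value is still plainly checked or unchecked, which is the inconsistency raised above.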

  • Are radio and checkbox valid without associated labels?

Cynthia to file bug Bug 13563

  • Can accept be extensible, to allow other file type filters?
  • “If the src attribute is set, and the image is available and the user agent is configured to display that image, then: The element represents a control for selecting a coordinate from the image specified by the src attribute; if the element is mutable, the user agent should allow the user to select this coordinate. The activation behavior in this case consists of taking the user's selected coordinate, and then, if the element has a form owner, submitting the input element's form owner from the input element. If the user activates the control without explicitly selecting a coordinate, then the coordinate (0,0) must be assumed.

Otherwise, the element represents a submit button whose label is given by the value of the alt attribute; if the element is mutable, the user agent should allow the user to activate the button. The activation behavior in this case consists of setting the selected coordinate to (0,0), and then, if the element has a form owner, submitting the input element's form owner from the input element.”

Does this mean that it’s only a button if the image doesn’t download?

HTML 4 has this:

“Creates a graphical submit button. The value of the src attribute specifies
the URI of the image that will decorate the button. For accessibility reasons,
authors should provide alternate text for the image via the alt attribute.

When a pointing device is used to click on the image, the form is submitted and
the click coordinates passed to the server. The x value is measured in pixels
from the left of the image, and the y value in pixels from the top of the
image. The submitted data includes name.x=x-value and name.y=y-value where
"name" is the value of the name attribute, and x-value and y-value are the x
and y coordinate values, respectively. 

If the server takes different actions depending on the location clicked, users
of non-graphical browsers will be disadvantaged. For this reason, authors
should consider alternate approaches: 

Use multiple submit buttons (each with its own image) in place of a single
graphical submit button. Authors may use style sheets to control the
positioning of these buttons.

Use a client-side image map together with scripting.” 

Sounds like this functionality should be deprecated.  Is it often used?

Cynthia to file bug to deprecate imagemap behaviour on input type=image Bug 13566

Do some research to determine if imagemap on input type=image is used in real world
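
For reference while researching usage, the submitted data described in the two quotes above can be sketched as follows. The helper name imageButtonEntries is hypothetical. The (0,0) fallback comes from the HTML5 text quoted above; the handling of an empty name is an assumption.

```typescript
// Sketch of the name.x / name.y pairs submitted by an image button.
interface Point {
  x: number;
  y: number;
}

function imageButtonEntries(name: string, click?: Point): Array<[string, string]> {
  // Keyboard activation has no pointer position, so (0,0) is assumed.
  const { x, y } = click ?? { x: 0, y: 0 };
  // Assumption: with an empty name the entries are just "x" and "y".
  const prefix = name === "" ? "" : `${name}.`;
  return [
    [`${prefix}x`, String(x)],
    [`${prefix}y`, String(y)],
  ];
}
```

A server that branches on these values treats every keyboard or AT activation as a click on the top-left pixel, which is the accessibility problem with using this as an image map.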

  • “The alt attribute provides the textual label for the alternative button for users and user agents who cannot use the image. The alt attribute must also be present, and must contain a non-empty string.”  Needs a rewrite.  The alt attribute provides a text equivalent for the image button.  Not sure what an “alternative button” is…

Cynthia to file bug Bug 13567

  • “A user agent may allow the user to override the autocompletion state and set it to always on, always allowing values to be remembered and prefilled), or always off, never remembering values. However, the ability to override the autocompletion state to on should not be trivially accessible, as there are significant security implications for the user if all values are always remembered, regardless of the site's preferences.”  Implications for cognitive.  Some users will need to set this all the time.  Also, I’m not sure what “trivially accessible” means.  I think it is advice on UI design for the browser.  Should that be in a normative section?  Is this an RFC SHOULD?  Use a different word than accessible.

Cynthia to file bug about "trivially accessible" Bug 13568

Cynthia to file bug about user override of autocomplete Bug 13569

  • Is list just for making combo-boxes?  Are there other uses?  Examples using <label>foo<input></label> instead of <label for>
  • Note in 4.10.7 says that readonly sometimes makes element immutable, but The readonly attribute says “When specified, the element is immutable.”
  • “When a control has a pattern attribute, the title attribute, if used, must describe the pattern. Additional information could also be included, so long as it assists the user in filling in the control. Otherwise, assistive technology would be impaired.

For instance, if the title attribute contained the caption of the control, assistive technology could end up saying something like The text you have entered does not match the required pattern. Birthday, which is not useful.

UAs may still show the title in non-error situations (for example, as a tooltip when hovering over the control), so authors should be careful not to word titles as if an error has necessarily occurred.”

Wow.  Title must describe the pattern?  Title is last choice for a lot of AT, right?  Shouldn’t the pattern be described in the visible label for everyone?

  • Needs discussion
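
To support that discussion, a rough sketch of the behaviour in the quoted text. The function names are hypothetical, and the way the message is composed is an assumption based on the spec's own "Birthday" example, not a description of any particular user agent.

```typescript
// Sketch only: the pattern must match the whole value, and an empty value
// is never a pattern mismatch (the required attribute covers that case).
function suffersPatternMismatch(value: string, pattern: string): boolean {
  if (value === "") {
    return false;
  }
  const anchored = new RegExp(`^(?:${pattern})$`);
  return !anchored.test(value);
}

// Wording taken from the example in the quoted spec text.
const GENERIC_MESSAGE = "The text you have entered does not match the required pattern.";

// Assumed composition: the generic message followed by the title, if present.
function errorAnnouncement(title?: string): string {
  return title ? `${GENERIC_MESSAGE} ${title}` : GENERIC_MESSAGE;
}
```

With title set to a caption such as "Birthday" the announcement is unhelpful, while a title such as "Format: AB12 3CD" reads sensibly. The same hint in the visible label would reach all users rather than only those whose AT reads the title.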

Greg to file bug for new global attribute named helptext or hint that would meet this use case among others Bug 13630

Common event behaviors

When the input event applies, any time the user causes the element's value to change, the user agent must queue a task to fire a simple event that bubbles named input at the input element. User agents may wait for a suitable break in the user's interaction before queuing the task; for example, a user agent could wait for the user to have not hit a key for 100ms, so as to only fire the event when the user pauses, instead of continuously for each keystroke.

Examples of a user changing the element's value would include the user typing into a text field, pasting a new value into the field, or undoing an edit in that field. Some user interactions do not cause changes to the value, e.g. hitting the "delete" key in an empty text field, or replacing some text in the field with text from the clipboard that happens to be exactly the same text.

When the change event applies, if the element does not have an activation behavior defined but uses a user interface that involves an explicit commit action, then any time the user commits a change to the element's value or list of selected files, the user agent must queue a task to fire a simple event that bubbles named change at the input element.

An example of a user interface with a commit action would be a File Upload control that consists of a single button that brings up a file selection dialog: when the dialog is closed, if that the file selection changed as a result, then the user has committed a new file selection.

Another example of a user interface with a commit action would be a Date control that allows both text-based user input and user selection from a drop-down calendar: while text input might not have an explicit commit step, selecting a date from the drop down calendar and then dismissing the drop down would be a commit action.

When the user agent changes the element's value on behalf of the user (e.g. as part of a form prefilling feature), the user agent must follow these steps:

If the input event applies, queue a task to fire a simple event that bubbles named input at the input element.

If the change event applies, queue a task to fire a simple event that bubbles named change at the input element.

In addition, when the change event applies, change events can also be fired as part of the element's activation behavior and as part of the unfocusing steps.

The task source for these tasks is the user interaction task source.
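
The optional "wait for a pause" behaviour for the input event can be modelled as below. This is a sketch under stated assumptions: the function name is hypothetical, timestamps are in milliseconds and sorted, and the 100ms default is just the figure used in the quoted example.

```typescript
// Given the times at which the value changed, return the times at which a
// user agent that waits for a quiet period would fire the input event.
function coalescedInputEventTimes(changeTimes: number[], quietMs = 100): number[] {
  const fired: number[] = [];
  changeTimes.forEach((time, index) => {
    const next = changeTimes[index + 1];
    // Fire only if no further change arrives within the quiet period.
    if (next === undefined || next - time >= quietMs) {
      fired.push(time + quietMs);
    }
  });
  return fired;
}
```

For changes at 0, 50, 90, 400 and 420 ms this gives events at 190 and 520 ms, so scripts (and any AT that relies on them) see two updates rather than five.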

  • Why does color support autocomplete?  What does autocomplete on color do? 
  • Random Idea: I wonder if there’s any way to support validating contrast ratios with color pickers within the same form?  Could adding pattern to color help here?

Cynthia to file bug Bug 13570
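
On the contrast idea: pattern is unlikely to help because the check involves two controls, but a script could do it. The sketch below uses the WCAG 2.0 relative luminance and contrast ratio formulas; the function names are hypothetical, and it assumes values in the #rrggbb form that type=color produces.

```typescript
// Relative luminance of an sRGB colour given as "#rrggbb" (WCAG 2.0 formula).
function relativeLuminance(hex: string): number {
  const match = /^#([0-9a-f]{2})([0-9a-f]{2})([0-9a-f]{2})$/i.exec(hex);
  if (!match) {
    throw new Error(`Not a #rrggbb colour: ${hex}`);
  }
  const channels = match.slice(1, 4).map((pair) => {
    const c = parseInt(pair, 16) / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  const [r = 0, g = 0, b = 0] = channels;
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colours, from 1 (identical) to 21 (black on white).
function contrastRatio(first: string, second: string): number {
  const a = relativeLuminance(first);
  const b = relativeLuminance(second);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// WCAG 2.0 level AA threshold for normal-size text.
function meetsMinimumContrast(foreground: string, background: string): boolean {
  return contrastRatio(foreground, background) >= 4.5;
}
```

A form with foreground and background colour pickers could call meetsMinimumContrast on change and report the result through the constraint validation API.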

4.10.8 the button element

  • “The button element represents a button. If the element is not disabled, then the user agent should allow the user to activate the button.”  Should?

Cynthia to file bug Bug 13571

4.10.13 Textarea

  • Anyone see issues with wrap=hard?

Greg to file bug suggesting removing because hard-wrapped text could introduce a11y problems Bug 13513

4.10.18 association of controls and forms

  • Probably need some wcag techniques and failures here.  Seems like you could get pretty messy associating elements to different forms.

Cynthia to add to WCAG techniques wiki

4.10.19 Autofocusing a form control

“If the user has indicated (for example, by starting to type in a form control) that he does not wish focus to be changed, then optionally abort these steps.”

Should this really be optional?

Cynthia to file bug Bug 13572

4.10.21 Constraints

This is really powerful.  Consider how this can be used to improve accessibility, and create WCAG techniques to leverage it.

Cynthia to propose WCAG techniques

Implicit submissions

“User agents may establish a button in each form as being the form's default button. This should be the first submit button in tree order whose form owner is that form element, but user agents may pick another button if another would be more appropriate for the platform. If the platform supports letting the user submit a form implicitly (for example, on some platforms hitting the "enter" key while a text field is focused implicitly submits the form), then doing so must cause the form's default button's activation behavior, if any, to be run.


Consequently, if the default button is disabled, the form is not submitted when such an implicit submission mechanism is used. (A button has no activation behavior when disabled.)

If the form has no submit button, then the implicit submission mechanism must just submit the form element from the form element itself.”

This seems like it could cause different submission behavior on different platforms when the user hits enter.  Is that ok?
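
The default case in the quoted text can be sketched as below. The types and function name are hypothetical. The sketch models only the "first submit button in tree order" rule; the allowance for user agents to pick another button, which is the source of the cross-platform concern, is deliberately left out.

```typescript
// Simplified model of a form's controls, listed in tree order.
interface FormControl {
  id: string;
  kind: "submit" | "text" | "other";
  disabled: boolean;
}

type ImplicitSubmissionOutcome =
  | { action: "activate-button"; buttonId: string }
  | { action: "nothing" }
  | { action: "submit-form" };

function implicitSubmission(controlsInTreeOrder: FormControl[]): ImplicitSubmissionOutcome {
  // Default button: the first submit button in tree order.
  const defaultButton = controlsInTreeOrder.find((control) => control.kind === "submit");
  if (defaultButton === undefined) {
    // No submit button: the form is submitted from the form element itself.
    return { action: "submit-form" };
  }
  if (defaultButton.disabled) {
    // A disabled button has no activation behavior, so nothing happens.
    return { action: "nothing" };
  }
  return { action: "activate-button", buttonId: defaultButton.id };
}
```

Note that a disabled first submit button blocks implicit submission even when a later, enabled submit button exists, which may surprise keyboard users.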

Kelly to file bug suggesting leaving open to UAAG recommendations Bug 13638 Again a bug was entered but this is still not the best bug. I also know that UAAG doesn’t currently have full text covering the topic addressed in this area. We can develop such but I don’t know how that overlaps with asking to reference something that isn’t yet done. But if the objective is to ensure the concern is at least on the table, this bug should be heading toward that objective.

discuss wherefores and whynots of form self submission


Under review:

Michael Cooper

The naming of the summary and details elements suggests a broader use case than the spec seems to indicate. It appears that this is meant to function much like fieldset and legend, except that the contents can be hidden (and are by default) and shown. Although the content model doesn't appear to restrict this, it appears from the examples that this is meant to be used mainly for widgets.

I can think of accessibility use cases for this or a similar element that I would like to see expanded. Primarily for accessibility, a summary that provides an easier-to-read version of the content that is in the details would be very helpful for people with reading and learning disabilities. It could also speed up browsing for AT users by facilitating review of short content while skipping detailed content unless desired. I can also imagine cases like a news site that provides a sentence-or-two summary that can be expanded to a full article.

It's not clear to me whether these elements meet these use cases or not. If they do, perhaps they should be in a different section, rather than cordoned off in the interactive elements section. Examples of these use cases in action would be helpful as well.

Michael to file bug suggesting examples to demonstrate above use cases Bug 13460

Interactive Elements

Under review:

Note reference to Command in Focus#HTML5.

James Nurthen

Command Element

Icon - it should either be noted that the icon may only be decorative as there is no ability to specify alternative text or the ability must be given for a user to specify alternative text.

The example given for the toolbar with 3 buttons needs labelling which fully describes the functionality of the buttons. They either need to be grouped in some way (and an accessible name given for that group) or they should be labeled "Align Left", "Center" and "Align Right"

Menu Element

No issues found


No issues found

Pseudo Classes

No issues found

Greg Lowney

Facilitate grouping of menu items

The description of how user agents should or must handle the menu element and its contents might be overly prescriptive. There are cases where the user agent should be allowed to adjust the presentation, particularly for different screen resolutions or navigation methods, that would require going against the prescribed behaviors. It could also do more to facilitate hierarchical presentation and navigation by allowing labeled groups of items.

Use case: Nadia is blind and using a web browser with a screen reader. The document contains a menu structure created with the HTML5 menu element, and it includes some very long menus with many groups of menu items separated by horizontal rules into various groups or sections. As Nadia uses the down arrow key to navigate through the menu items, she has to pause for each one to be read to her, so traversing a long menu takes a long time and a lot of effort. She would prefer to have the menu presented to her in hierarchical fashion that uses progressive disclosure, so she could navigate through the short list of sections, and then through the short list of commands in the desired section, rather than through one long list of items.

Use case: Aidan is the opposite of Nadia. He uses an alternative input system and input is difficult for him, so he wants to reduce the number of actions he has to take. Therefore he prefers to see all the options visible at once so that he can choose one directly, rather than having to use mechanisms involving progressive disclosure. (He has even invested in a large, high-resolution monitor to support this work style.) Rather than choosing a sub-menu and then items from them, he'd rather have all the sub-menus and their items displayed together. Unfortunately, the HTML5 specification explicitly states that the menu element with a label must be presented as a sub-menu rather than displayed inline.

HTML5 Status: The current HTML5 draft specification has a few problems with this. First, it clearly spells out when user agents should render groups of menu items inline vs. as a submenu, seemingly giving the user agent no leeway to adjust for user preference. Instead, it should note that user agents may override the default presentation described. Second, much to Nadia's dismay, it strongly emphasizes the use of hr elements to divide menu items into groups, rather than recommending methods that allow providing labels for those groups. This prevents the user agent from giving her a meaningful list of sections to navigate through. It looks like some methods are probably supported, such as by using a nested menu element that's labeled by a label element rather than a label attribute, and so would by default be presented inline rather than as a submenu, but they're not discussed in detail or recommended.

Recommendation: The HTML5 specification should note that user agents may override the default presentation described in order to comply with user preferences.

Greg to file bug (cynthia looking for old bug that may be relevant) Bug 13489?

Recommendation: In order to facilitate structural or hierarchical navigation and progressive disclosure, the HTML5 specification should emphasize giving names to groupings, including to groups of menu items.

Greg to file bug (cynthia looking for old bug that may be relevant) Bug 13623
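
To illustrate why named groups matter for Nadia's use case, here is a rough sketch of how a user agent might derive navigable sections from a menu. The data model and function name are hypothetical and much simpler than the menu element's actual processing.

```typescript
// Simplified menu content: commands, hr-style separators, or labeled groups.
type MenuEntry =
  | { kind: "command"; label: string }
  | { kind: "separator" }
  | { kind: "group"; label: string; items: string[] };

interface MenuSection {
  label?: string; // absent when the section was delimited only by separators
  commands: string[];
}

function toSections(entries: MenuEntry[]): MenuSection[] {
  const sections: MenuSection[] = [];
  let current: MenuSection | undefined;
  for (const entry of entries) {
    if (entry.kind === "command") {
      if (!current) {
        current = { commands: [] };
        sections.push(current);
      }
      current.commands.push(entry.label);
    } else if (entry.kind === "separator") {
      current = undefined; // the next command starts a new unlabeled section
    } else {
      sections.push({ label: entry.label, commands: [...entry.items] });
      current = undefined;
    }
  }
  return sections;
}
```

Sections produced from separators have no label, so a screen reader could only offer "section 1, section 2", whereas labeled groups give a meaningful list to move through.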

Command element and images should support different resolutions

Why allow multiple icons in different resolutions for a page icon but not for a command element icon or an img element?

Use case: Todd is working on a high resolution monitor but has moderately low vision, so he uses his browser's zoom setting so that he can read the text easily and discern the details of images. He goes to a web page that includes multiple buttons implemented using command elements, each represented by an icon. Unfortunately, since HTML5 only allows a command element to link to a single image, the browser has to use brute force methods to enlarge the image, with a result that's blocky and difficult to understand. If the author had been able to link to multiple versions of the image, each optimized for different screen resolutions or sizes, Todd could have been presented with graphical buttons he could understand.

Recommendation: HTML5 should allow the author to link a command element to multiple images optimized for different screen resolutions and zoom ratios.

It would also be beneficial to allow static images (e.g. the img element) to link to multiple image files optimized for different resolutions, so that pages can better adapt to different screen resolutions and zoom ratios without having to resort to complex scripting. However, it could be argued that the priority differs between command elements, which the user always has to identify in order to interact with them, and static img elements.

Submit WCAG technique(s) note availability of CSS Image Values and Replaced Content

Greg to file bug about issue, noting there are solutions but we don't know which ones will shake out (CSS Image Values, multi-resolution images, script solutions) Bug 13514


Under review:

Michael Cooper

No accessibility-specific issues noticed for links. I found the difference between <link> vs <a> and <area> elements confusing, as well as the fact that some content impacting this section was located in the "inline markup" section. I also found it problematic that link types can be extended merely by editing a wiki, as I don't think the process provides enough assurance that the material there is quality, in spite of the procedures outlined in the spec. However, I don't think these issues are specific to accessibility.


Under review:

Léonie Watson

4.13.2 Bread crumb navigation

The spec states:

"This specification does not provide a machine-readable way of describing bread-crumb navigation menus. Authors are encouraged to just use a series of links in a paragraph."

Should the links within a bread crumb navigation be marked up as a list? Links within primary navigation blocks are usually represented as a list, so recommending the same for bread crumb navigation would be more consistent.

Using a list would also provide screen reader users with easier access to information available at a glance to sighted people. Most screen readers announce something to the effect of "list of x items" when they encounter a list. Knowledge of the number of links and/or steps within a bread crumb navigation is often very helpful.
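
As a sketch of what the list-based alternative could look like, the hypothetical helper below builds breadcrumb markup as an ordered list inside nav. The choice of ol and nav, and leaving the current page as plain text, are assumptions for discussion rather than anything the spec recommends.

```typescript
interface Crumb {
  label: string;
  href?: string; // omitted for the current page
}

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns markup in which each step of the trail is one list item.
function breadcrumbList(crumbs: Crumb[]): string {
  const items = crumbs.map((crumb) => {
    const label = escapeHtml(crumb.label);
    return crumb.href !== undefined
      ? `<li><a href="${escapeHtml(crumb.href)}">${label}</a></li>`
      : `<li>${label}</li>`;
  });
  return `<nav><ol>${items.join("")}</ol></nav>`;
}
```

With this structure a screen reader can announce the number of items in the trail, which the series-of-links-in-a-paragraph approach does not offer.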

4.13.3 Tag clouds

I'm not at all sure about the recommended technique for producing tag clouds. It relies on the title attribute, which may be problematic (see the footnotes section below). I'm also uncertain about the use of text descriptions to handle the weighting of tags within the cloud, and use of the <a> element.

Lastly, display:none; is used to hide the verbal descriptors. If this information is intended to be available to screen readers (but not visually) this technique won't work. It should use an offscreen approach instead (position:absolute; left:-999em;).

Strongly suggest that someone else take a look at this section. I may be worrying about nothing!

4.13.4 Conversations

The spec states:

"This specification does not define a specific element for marking up conversations, meeting minutes, chat transcripts, dialogues in screenplays, instant message logs, and other situations where different players take turns in discourse.

Instead, authors are encouraged to mark up conversations using p elements and punctuation. Authors who need to mark the speaker for styling purposes are encouraged to use span or b. Paragraphs with their text wrapped in the i element can be used for marking up stage directions."

My concern is that this offers no way for someone using a screen reader or other AT to understand the purpose of the content. Is there an ARIA solution that could be applied here perhaps?

4.13.5 Footnotes

The spec states:

"For short inline annotations, the title attribute should be used."

This could represent a problem for screen reader users. Most screen readers do not announce the value of a title attribute by default. Some can be configured to acknowledge title attributes, but not all.

In the example provided in the spec, neither JAWS 12 nor NVDA 2011.1 (in either IE8/9 or FF4/5) reports the footnote.

The spec then states:

"For longer annotations, the a element should be used, pointing to an element later in the document. The convention is that the contents of the link be a number in square brackets."

My suggestion would be for this to be the recommended method for providing a footnote. It's consistent with the way footnotes are provided in other environments (such as print or word processed documents), and has the advantage of being well supported by screen readers.

Submit WCAG techniques for these

Léonie Kelly to file bug that the section should be removed since it doesn't spec anything, or examples made better for a11y Bug 13650

Greg Lowney

Can you mark up content that is repeated across pages on the site or subsite? For example, an example given in the HTML5 spec for the <footer> element describes it as showing a "site wide footer", but there is no easy way to distinguish that from a page-specific footer. This is particularly, but not exclusively, an issue for <header>, <footer>, and <nav> elements. This is an issue because assistive technology may often want to hide or skip over things that are repeated on multiple pages, but not skip over equivalents that are unique to the current page. Note that repeated content may still vary somewhat from page to page, as in when items in <nav> are all links except the one representing the current page, or the fact that different pages may have different copyright dates. One mechanism might be to allow a string attribute on these elements that could be compared to elements on other pages of the same site, and if the strings match the user agent can assume that they are equivalent (e.g. <nav sitewide="toplevel">).
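
A sketch of how the proposed comparison might work. The sitewide attribute is the hypothetical one suggested above and is not part of HTML5; the types and function name are likewise invented for illustration.

```typescript
// One sectioning element on a page, with the hypothetical sitewide attribute.
interface SectionInfo {
  element: "header" | "footer" | "nav";
  sitewide?: string;
}

// True if the previous page had the same kind of element with the same
// sitewide string, so AT could offer to skip it on the current page.
function isRepeatedFromPreviousPage(section: SectionInfo, previousPage: SectionInfo[]): boolean {
  if (section.sitewide === undefined) {
    return false; // unmarked content is treated as page-specific
  }
  return previousPage.some(
    (other) => other.element === section.element && other.sitewide === section.sitewide,
  );
}
```

Because only the attribute string is compared, small differences in the content itself (the current-page link in nav, a different copyright date) would not prevent a match.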


User Agents

Under review:

Kelly Ford

From an accessibility perspective I think what is present here doesn't have issues. The vast majority of this section deals with DOM and web page access to browser elements. Terms like active window and such are used but they are very general and UAAG would really define how interactions within the browser itself were to be made accessible.


Under review:

Comments from NV Access (Mick Curran)

"I have read through both, and nothing bad jumps out at me accessibility-wise. Of course though we are really only qualified to look from a screen reader perspective.

The Offline applications section I don't think will affect us at all as in reality from the users' point of view its still the same web app whether or not its online or not. Though DOM events for going online and offline are in the spec, and I'm sure that the Browser can make these accessible if required."


Under review:


Under review:

UAWG Keyboard Use Cases and Recommendations

Migrated from UAWG wiki 26 July 2011

  • Complete and efficient keyboard access is critical for accessibility.
  • We examine high-level things that web protocols and formats can do to enable good keyboard UI:
    • Let users accomplish any task using the keyboard alone
    • Let content coexist and adapt to a wide range of user agents, browser add-ins, nested user agents, other content, and assistive technologies
    • Let the user take advantage of the widest range of keyboard commands and shortcuts
    • Provide the information needed to enable a wide range of keyboard features
    • Let the user retain ultimate control of their experience
    • Protect the user from badly behaved content and nested user agents
  • We present a number of specific topics with use cases, issues, and recommendations, as well as topics that have no clear recommendations.
  • Some of these are already covered by the latest HTML5 draft, while others are not.

Background and concepts used here are discussed below.


Content in this section migrated from UAWG wiki Keyboard Concepts for HTML5 Discussion, 26 July 2011

Basics of keyboard access

Good keyboard access means:

  • Make keyboard access universal:
    • for every task (letting the user do everything using only a keyboard or keyboard emulator), and
    • for every user (not limiting keyboard access to users with good vision, with a particular keyboard layout, etc.)
  • Make keyboard access usable:
    • easy,
    • efficient (keep the number of input steps as low as is feasible, and not disproportionately higher than for people using both mouse and keyboard),
    • reliable/predictable (making sure that keyboard commands act in ways the user expects, including consistency with standards, within related content, and between user agents)
    • easily learned and remembered

Techniques for keyboard access include:

  • sequential navigation : the ability to explore by moving the keyboard focus forward and backwards through all the items that can be visited (e.g. tab and shift+tab to move between controls, left and right arrow to move between characters, or up and down arrows to move between lines of text)
  • direct commands:
    • direct navigation: being able to move the focus directly to the target you want, rather than having to go through everything in between
    • direct activation: being able to trigger an element's action without having to move the focus to a corresponding element (e.g. pressing Ctrl+S at any time to activate the Save menu item)
  • structural navigation: moving the focus using the structure of the content (e.g. forward and backward by sentence, paragraph, group, page, section, frame), as well as being able to see the structure (e.g. choosing a destination by moving through a hierarchical view of the document headings)
  • spatial commands:
    • spatial navigation: moving the focus based on the multidimensional location of things on the screen (e.g. using arrow keys to move between spreadsheet cells)
    • spatial manipulation: commands that move an object in particular directions (e.g. scrolling a window, moving a pushpin on a map, moving the pointer on the screen)
  • textual navigation: moving the focus to destinations based on text (e.g. finding a search string on a page, or moving to a control by typing the first characters of its label)

Mission, Goals, and Principles

Mission: To ensure that software and web technologies can provide the user with full and efficient keyboard access even when working with multiple documents, web apps, add-ins, nested user agents, and accessibility aids.

Goals and Principles:

  • Let users do everything from the keyboard:
    • Access to all elements. Let the user discover, navigate to, and manipulate everything.
    • Access to all input operations. Provide the ability to simulate all non-keyboard input operations so they can be done using the keyboard and other input systems.
  • Let things give users maximum flexibility over the keyboard UI:
    • Put the user in control. User agents should have flexibility to alter or supplement author-specified keyboard shortcuts, tab stops, behaviors, etc. That is, as with many things, user agents should be able to give the user (rather than the author) ultimate control. (The downside to this approach is that a badly designed user agent can mess up both the author and the user, but at least the user can choose user agents, while in many cases they cannot choose their content.)
    • Let content provide hints that allow user agents to add sophisticated keyboard capabilities. For example, if the user agent wants to provide a command to navigate to the navigation bar, it needs to know which element(s) in the content are the navigation bar, and if it wants to provide commands to navigate forward and backward through the document, it needs to understand the content's recommended reading order.
    • Support the full range of keyboard inputs, including unmodified keys, key combinations using the full range of modifier keys, and key sequences.
    • Let things provide information to support enhanced keyboard UI. Specific information needs to be defined in content (etc.) and passed on by the host and platform. For example, accessibility aids and the user agent need to be able to determine an author's intended reading and navigation order, accessibility aids need to be able to determine properties of content elements such as name and keybinding (e.g. to present them to the user) and location (e.g. to click on it if it does not fully support programmatic control).
    • Let things provide methods to support enhanced keyboard UI. Programmatic interfaces need to be defined, and passed on by the host, so that other components can programmatically perform any actions on an element that can be done using a mouse, keyboard, or touchscreen, and so that they can alter the presentation of elements, such as to indicate focus more prominently or to make keybindings discoverable.
  • Let users work with multiple things at the same time:
    • Allow content, components and add-ins to coexist and cooperate, even when they were developed independently. For example, allowing them to negotiate allocation of the limited set of possible keyboard shortcuts, allowing the user to transition between the UI of the user agent, content, nested user agents, etc.
    • User agents should be able to prevent things from breaking keyboard access. For example, an embedded object should not be able to trap the keyboard focus or break the methods of exiting that the user is used to, and content should not be able to trap the focus on an input field until the user enters properly formatted data.
  • Let things adapt to keyboard restrictions, conventions, and conflicts:
    • Let things negotiate keybindings, determining and adapting to the set of possible keyboard inputs and which are impossible (e.g. not on the user's keyboard), reserved by the platform (inputs reserved by the host or operating system so they cannot be changed by the component, e.g. Alt+Tab), restricted by convention (e.g. standard keybindings for copy, paste, exit, etc. that should not be changed lightly), or in use. For example, allowing them to negotiate allocation of the limited set of possible keyboard shortcuts.
    • Let content reflect actual keyboard shortcuts, such as incorporating them into their instructions, even if they are changed to adapt to the environment (platform, user agent, other content, add-ins, etc.).
    • Allow components to conform to the host's keyboard conventions, such as allowing a custom control (e.g. a drop-down list box implemented entirely in JavaScript) to set its keyboard commands to match those its host browser provides for its equivalent controls, or allowing a form in HTML to emulate the navigation and access key behavior of native dialog boxes.
  • Let users discover keyboard UI:
    • Let things determine the active keybindings for all components so they can be presented to the user as a training aid or reference. This also allows the user to determine what command a keyboard input is currently mapped to, so they can identify what they accidentally did, or whether it is safe to repurpose a specific keyboard input because its current function is available through other means or is not one the user would ever need.


User agents are platforms. Most platforms are designed to support multitasking, but most only support it well when the user is only interacting directly with a single application at a time. That's sufficient for most users, most of the time, but is entirely insufficient for a lot of people who rely on assistive technology. Those products often need to modify input or output, or provide global commands that the user can activate regardless of what application or context they're working in, and on many platforms creating such products requires undocumented, unsupported, and unreliable hacks simply because the platforms don't provide the necessary infrastructure. As we are defining the future platform architecture we need to do better, not just for assistive technology, but because Web browsers have made add-ins and plug-ins more prominent and popular even among the mainstream.

Unfortunately, most users and developers—including platform and standards developers—only think about a very limited set of input and output options: graphical output with keyboard for text entry and a mouse or touchscreen for navigation, and users interacting directly with only one application at a time. Those limited views do not accommodate the variety of users and their needs.

In reality the user works in an environment made up of many things, including:

  • hardware (e.g. available primary and modifier keys),
  • platforms (e.g. operating systems, GUIs, and window managers that might use or reserve keys),
  • user agents (e.g. web browsers),
  • content, including documents (e.g. HTML pages) and web apps (e.g. dynamic HTML) rendered in a browser,
  • browser extensions and plug-ins (e.g. components that modify browser UI or content, or act within the browser on behalf of external utilities),
  • nested user agents (e.g. embedded media player), and
  • external utilities (e.g. accessibility aids that don't run within the browser, yet examine, modify, or provide alternate input or output for browsers and the content they render—and usually other applications as well).

These things often nest and coexist, as in:

  • Content hosting content, e.g. HTML rendered by the browser uses iframe to host HTML also rendered by the browser
  • Content hosting user agents, e.g. HTML rendered by the browser hosts a media player
  • User agent hosting multiple content frames, e.g. browser showing window split between two web pages, which could be static documents or web apps that may or may not interact with the content of other frames
  • User agent hosting add-ins, e.g. add-in creates sidebar providing separate view of or interacting with content being viewed in the browser, and creates its own keyboard commands that may function globally within the browser window or entire browser session
  • User agent interacting with external accessibility aids, e.g. a screen reader providing keyboard shortcuts to read portions of the text

Keyboard Commands

There are several types of keyboard commands:

  • Basic keys are simple keys that have dedicated functions (e.g. the A key enters the letter A; the Del key deletes something, be it a character, a file, contents of a cell; the Left arrow key moves something to the left, etc.)
  • Shortcut keys:
    • Access keys on some platforms are part of the visible label that can be used as a quick way to activate and/or navigate to the associated UI element (e.g. in a dialog box, an underlined S on the Save button indicates that you can press S to activate the button if the focus is not in a text input field, or Alt+S to activate it even if the focus is in a text input field)
    • Hotkeys trigger an action without necessarily triggering any particular UI element (e.g. Ctrl+S to save the current document works even if there is no Save menu item or it is not visible, and F1 brings up help).

Keyboard commands and shortcuts are particularly important for people with disabilities:

  • Many users with disabilities rely entirely on the keyboard or keyboard emulators because they have difficulty manipulating a mouse, seeing a screen, etc.
  • Users who cannot use a mouse often increase the number of shortcut keys in order to make tasks more efficient, especially people for whom each keypress is time-consuming, tiring, or painful.
  • Many users rely on accessibility aids that use their own set of keyboard commands, which can conflict with those defined by user agents, add-ins, or content.

While users with disabilities may need more keyboard shortcuts, the possible shortcuts are limited by:

  1. limitations in the hardware and system (e.g. no Command key on Windows or Windows key on a Macintosh, no Theta on non-Greek keyboards, no numeric plus key on compact keyboards),
  2. need to avoid conflicts with keys reserved by the platform and the user agent's user interface (e.g. Alt+Space displays the window menu on Windows, or Spotlight on OS X),
  3. need to avoid conflicts with keyboard conventions (e.g. if the user expects Ctrl+C to be Copy everywhere on the platform, content and applications can but shouldn't override that to make it trigger a different action),
  4. user agent limits on shortcut characters (e.g. whether the browser limits them to letters, allows compound characters, etc.), and
  5. user agent and format limits on using unmodified keys, different modifier keys, and key sequences (e.g. HTML4 and HTML5 accesskey only allows specifying a single base character)

Ways around these limitations include:

  1. allowing alternate modifiers (e.g. Ctrl+E does one thing, Ctrl+Shift+E another),
  2. allowing key sequences in addition to key combinations (e.g. Ctrl+E,a does one thing, while Ctrl+E,b does another), and
  3. allowing unmodified keys (e.g. F12, or the letter A), although this presents risk for users relying on alternate or automated input, or who have trouble perceiving mode-change indicators while working
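
The three workarounds above can be illustrated with a small matcher that accepts alternate modifiers, key sequences, and unmodified keys. This is only a sketch: the notation ("+" joins modifiers to a key, "," separates the steps of a sequence) and the names normalizeStroke and SequenceMatcher are assumptions made for illustration, not part of HTML5 or any user agent.

```typescript
// Hypothetical notation: "+" joins modifiers to a key, "," separates the
// steps of a key sequence (e.g. "Ctrl+E,a"). None of this comes from a spec.

// Normalize one keystroke so "shift+ctrl+e" and "Ctrl+Shift+E" compare equal.
function normalizeStroke(stroke: string): string {
  const parts = stroke.split("+").map((p) => p.trim().toLowerCase());
  const key = parts.pop() || "";
  // What remains in `parts` are the modifiers; sort them for a stable form.
  return parts.sort().concat(key).join("+");
}

class SequenceMatcher {
  // Maps a normalized sequence such as "ctrl+e,a" to a command name.
  private bindings: Record<string, string> = Object.create(null);
  // Keystrokes typed so far that form the start of some bound sequence.
  private pending: string[] = [];

  bind(sequence: string, command: string): void {
    const normalized = sequence
      .split(",")
      .map((s) => normalizeStroke(s))
      .join(",");
    this.bindings[normalized] = command;
  }

  // Feed one keystroke. Returns a command name when a binding completes,
  // otherwise null (either waiting for more keys, or nothing matched).
  feed(stroke: string): string | null {
    const s = normalizeStroke(stroke);
    // Try to extend the pending sequence first, then try the key on its own.
    const attempts: string[][] =
      this.pending.length > 0 ? [this.pending.concat(s), [s]] : [[s]];
    for (const attempt of attempts) {
      const typed = attempt.join(",");
      const command = this.bindings[typed];
      if (command !== undefined) {
        this.pending = [];
        return command;
      }
      const isPrefix = Object.keys(this.bindings).some(
        (bound) => bound.indexOf(typed + ",") === 0
      );
      if (isPrefix) {
        this.pending = attempt;
        return null;
      }
    }
    this.pending = [];
    return null;
  }
}
```

In this sketch a binding that is also the prefix of a longer sequence fires immediately, so "Ctrl+E" and "Ctrl+E,a" cannot both be bound usefully. A real design would need a timeout or a dedicated prefix key, and would still have to address the mode-indication risk noted in item 3.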

We want content and nested user agents to register their keybindings with their host in order to:

  1. allow negotiation to avoid conflicts (e.g. web app changes bindings that conflict with the browser)
  2. allow the user agent and tools to provide enhanced UI (e.g. keybinding lists and reconfiguration)

Shortcut Conflicts

It's worth noting that direct activation and direct navigation commands (shortcut keys) come in two flavors, which we'll call:

  • Access keys provide a quicker way to activate and/or navigate to a UI element in your current context, but they should never be the only way.
    • In some contexts accelerator characters may be usable without any modifier or prefix keys (e.g. S to press the Save button in a dialog box if the focus is not in a text input field) and/or with certain modifier or prefix keys (e.g. Alt+S to press the Save button in a dialog box even if the focus is in a text input field). However, neither works when the focus is in another context, such as an active menu.
    • HTML implements this using the accesskey attribute, which can be put on nearly any element.
  • Hotkeys trigger an action without necessarily triggering any particular UI element (e.g. Ctrl+S to save the current document works even if there is no Save menu item or it is not visible, although it still wouldn't work if the focus is in another context such as an active menu).
    • The platform should not change these because there is no UI informing the user of the change.
    • HTML5 implements this with the command element, which lets the author associate an accesskey attribute with any scripted action.

Note: Hotkeys are often called shortcut keys, but I'm avoiding that term because the HTML spec uses it to include both access keys and hotkeys.

How shortcut keys conflict

One difficulty with hotkeys is that you can have multiple sets active at the same time. For example, the browser might define Ctrl+F as Find, while an add-on defines F12 as displaying a particular sidebar, and the active document might define Ctrl+F as moving focus to one of its input fields. On the other hand, hotkeys can be disabled when the focus is in another context, such as when a menu is active.

By contrast, access keys are designed so that, for the most part, only one set can be active at any time. For example, when typing in a word processor the menu bar is visible and only the access keys for its menus work, along with the access keys of controls in the document. However, when you display the File menu only the access keys for its menu items work, and when a dialog box is active, only the access keys for its controls work.

Unfortunately this can break down at times, such as when the active window has a menu bar and also contains controls; in those cases it's often undefined what the proper behavior should be if the access key for a menu conflicts with that of a control. When an application designer controls both the menu and the window content they can choose access keys that don't conflict, but if the application designer isn't in charge of the window content (e.g. developing a word processor that can show documents with embedded controls) or a content designer isn't in charge of the menu (e.g. authoring a web page that can be viewed in a variety of browsers) it is difficult to avoid conflicts.

Three solutions for conflicts

One solution is to use different modifier keys for application vs. content; Firefox 4 does this by using Alt with access keys in the application but Shift+Alt with access keys in content. This avoids conflicts between application and content, but it's not perfect because users have to learn a new method of invoking access keys and constantly switch back and forth depending on context. It doesn't seem too terrible to make the user learn that content in a particular browser works in a different way than applications, but it can be a problem if an author can hide the browser's window controls, in which case the window may look like a native application window but still function like browser content.

Another solution would be for the application to simply not have access keys, but that obviously reduces usability of the application.

Neither of these methods prevents conflicts when using multiple pieces of content (e.g. with iframes) or multiple application components (e.g. a browser and its add-ins).

A third method is to allow access keys to conflict, but ensure that it doesn't make anything unavailable. For example, if there are multiple items with the same access key, pressing the access key can merely move the focus to the next item in that set without actually activating it. The user can then use the access key to quickly navigate between the items, and another key (such as Enter) to activate the one with the focus. This system is how Windows handles conflicts within a menu bar, menu, or form, and so it's already familiar to many users. It decreases efficiency (as things take more keystrokes) and usability (because access keys can behave differently depending on the combination of application, add-ins, and documents), but at least all the functionality is still available and the number of additional keystrokes is usually very small; in that way it's far better than having one access key win and the others be ignored, which could multiply the number of keystrokes required for a task many, many times.
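
The third method can be sketched as follows: a unique access key activates its item, while a shared access key only moves focus to the next item in the conflicting set. The names AccessItem and pressAccessKey are hypothetical, and the sketch assumes the items are already listed in navigation order.

```typescript
interface AccessItem {
  label: string;
  accessKey: string; // a single character, e.g. "s"
}

interface AccessResult {
  action: "activate" | "focus" | "none";
  index: number; // index of the item acted on, or the unchanged focus index
}

// focusedIndex is the index of the currently focused item, or -1 if none.
function pressAccessKey(
  items: AccessItem[],
  key: string,
  focusedIndex: number
): AccessResult {
  const wanted = key.toLowerCase();
  let count = 0;
  let first = -1; // first item using this access key
  let nextAfterFocus = -1; // first such item after the current focus
  items.forEach((item, i) => {
    if (item.accessKey.toLowerCase() !== wanted) return;
    count++;
    if (first === -1) first = i;
    if (nextAfterFocus === -1 && i > focusedIndex) nextAfterFocus = i;
  });
  if (count === 0) return { action: "none", index: focusedIndex };
  // No conflict: behave like an ordinary access key.
  if (count === 1) return { action: "activate", index: first };
  // Conflict: only move focus, wrapping around to the first match.
  return {
    action: "focus",
    index: nextAfterFocus !== -1 ? nextAfterFocus : first,
  };
}
```

With items Save (s), Search (s), and Cancel (c), pressing S repeatedly cycles focus between Save and Search, leaving Enter (or another key) to activate, while pressing C activates Cancel directly.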

This method does not work for hotkeys as they may not have any UI to move focus to, and so no way to give users feedback when the hotkey didn't have its expected effect.


The most significant change for keyboard support in HTML5 is the introduction of the command element, which can be used to create menu items with access keys and also to associate a hotkey with some other scripted action.

The introduction of native drag and drop also provides an opportunity to improve keyboard emulation of this activity for cases where the content doesn't implement an equivalent mechanism for the keyboard.


Sequential navigation to all elements that take focus or input

Users need to be sure they can explore and find all focusable and actionable elements, even if they cannot use a mouse.

  • Use case: Laurie is tabbing through a dynamic web page, but finds that there are certain buttons she cannot reach because the author, thinking only of mouse users, has specified that the buttons should not be included in the tab order by setting tabindex to a negative number. Therefore Laurie, who relies entirely on keyboard input, cannot access some functionality on the page.
  • Use case: Laurie is using a web page that contains a custom control, an image that does not take keyboard input or focus but does have an onClick handler. Therefore Laurie, who relies entirely on keyboard input, cannot click on the element to activate it, and even though her browser provides a context menu that would let her activate the image's onClick event, it does not let her move focus to it because that would violate the HTML5 specification.
  • Recommendation: Specification should explicitly state that user agents are allowed and encouraged to provide modes or commands that let the user move focus to all elements that take focus or input, even if the author has indicated that the element should not normally be included in sequential navigation, and even if the element takes input (e.g. has an onClick handler) but lacks other attributes that would normally render it focusable.

Greg to file bug though need to stay clear of deliberately non-focusable (aria-hidden, @hidden, display:none) Bug 13532

  • HTML5 Status: The current HTML5 specification (7.3.1 Sequential focus navigation and the tabindex attribute) says that if the tabindex value is a negative number, "The user agent must allow the element to be focused, but should not allow the element to be reached using sequential focus navigation." I'm not sure whether the use of "should" rather than "must" means that the user agent is allowed to include these in sequential navigation, or whether it is still forbidden. Likewise the spec says that if the tabindex value is zero the user agent "must" allow the element to be focused, but only "should" allow the element to be reached using sequential focus navigation; if the user agent doesn't provide the ability to sequentially focus all elements that take input, then the spec should be changed to read that elements with tabindex of zero "must" be included in sequential navigation.
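
As a rough illustration of the recommendation, the sketch below computes a sequential navigation order with an optional user-enabled mode that also visits elements the author excluded (negative tabindex) and elements that only take pointer input. PageItem, its fields, and sequentialOrder are hypothetical names; the ordering rule (positive tabindex values first in ascending order, then everything else in document order) is a simplification of the behavior described in the spec, and deliberately hidden elements are always skipped.

```typescript
interface PageItem {
  id: string;
  tabindex: number | null; // null means no tabindex attribute
  nativelyFocusable: boolean; // e.g. links, buttons, form controls
  hasClickHandler: boolean; // takes pointer input via script
  hidden: boolean; // deliberately hidden, e.g. hidden attribute or display:none
}

// includeAll models a user-enabled "visit everything" mode (an assumption).
function sequentialOrder(items: PageItem[], includeAll: boolean): string[] {
  const positive: { id: string; tabindex: number; position: number }[] = [];
  const documentOrder: string[] = [];
  items.forEach((item, position) => {
    if (item.hidden) return; // never navigate to deliberately hidden items
    const t = item.tabindex;
    if (t !== null && t > 0) {
      positive.push({ id: item.id, tabindex: t, position });
      return;
    }
    // What the author's markup puts in the tab order by default.
    const authorIncluded = t === 0 || (t === null && item.nativelyFocusable);
    // What the user's override adds: negative tabindex and click-only items.
    const userIncluded =
      includeAll && (t !== null || item.nativelyFocusable || item.hasClickHandler);
    if (authorIncluded || userIncluded) documentOrder.push(item.id);
  });
  // Ascending tabindex, ties broken by document position.
  positive.sort((a, b) => a.tabindex - b.tabindex || a.position - b.position);
  return positive.map((p) => p.id).concat(documentOrder);
}
```

In Laurie's two use cases, the button with a negative tabindex and the image with only an onClick handler would both be skipped by default but reachable once she turns the override on.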

Navigation to and through non-editable content

  • Issue: Is there anything that HTML5 should do to facilitate this feature? I haven’t thought of any. It seems like the user agent can do everything it needs without any explicit support in the source language.
  • Use case: Wayne needs to select and copy some content from a Web page. Pressing the Tab key would normally move the focus between controls, links, frames, and the browser UI, but it would not stop at blocks of read-only text and images. For this task Wayne turns on his browser's "caret browsing mode," which adds each block of read-only content to the tab order. He can then move focus to the appropriate block, move the text cursor through it, select a range, and copy it to the clipboard or invoke its shortcut menu, all using the keyboard.

Greg to file bug about UA handling of caret browsing Bug 13533

Discuss if caret keeps its position when going "back" to a page

Preventing validation from trapping focus

  • Use case: Svetlana is tabbing through the controls on a form and lands on a field that expects a telephone number, but when she tries to tab away the user agent puts up an error message saying that a valid telephone number is required. Even though she had no intention of completing the form, she is stuck until she makes up and enters a telephone number.
  • Use case: Etta brings up a web form showing her account information. As she tabs between the fields she lands on one containing her current password. Unfortunately, the security requirements for this web site have recently been increased and her password is no longer considered secure enough, so even though she's only tabbing through the fields, the current value fails to validate, and she is prevented from tabbing on with its current value or by clearing the field.
  • Recommendation: Any field that validates input should allow the user to exit the field. At minimum, allow them to exit with the field being empty or retaining its initial value.
  • HTML5 Status: Unknown
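
The minimum behavior in the recommendation can be expressed as a simple check run when the user tries to leave a field. This is only a sketch: FormField and mayLeaveField are hypothetical names, and a user agent could reasonably be more permissive (always allowing exit and merely flagging the error).

```typescript
interface FormField {
  value: string; // what the field currently contains
  initialValue: string; // what it contained when the form was loaded
  isValid: (value: string) => boolean; // the field's validation rule
}

// Returns true if focus may move away from the field.
function mayLeaveField(field: FormField): boolean {
  // An empty field never traps the user (Svetlana's case).
  if (field.value === "") return true;
  // An untouched field never traps the user, even if its original
  // value no longer validates (Etta's case).
  if (field.value === field.initialValue) return true;
  // Otherwise fall back to the field's own validation rule.
  return field.isValid(field.value);
}
```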

Greg to submit bug that spec be explicit about this

Reading and navigation order

Authors should be able to specify preferred direction and/or order for sequential navigation, even among things such as tables that would not normally have a tab order.

  • Use case: Masahiko is reading a web page, and uses browser commands to move the text cursor to the next and previous paragraphs. In most cases this works fine because the suggested reading order is that in which elements occur in the HTML. However, when Masahiko encounters a table that is designed to be read down the columns rather than across the rows, this simplistic navigation is entirely inappropriate. A similar problem occurs when CSS is used to rearrange blocks of text on the screen.
  • Recommendation: Allow marking up a table to indicate whether the preferred reading order is by columns, by rows, both, or neither. This could be done with a new attribute, such as orientation="columns".

Greg to file bug Bug 13539

  • Recommendation: Allow marking up any element with a reference to the logically next and/or previous elements, for use when those are not the next/previous elements in the source. This could be done with new general attributes. For example, when a browser implementing caret browsing moves the caret beyond the end of a paragraph marked up with next="story5", that attribute would be a hint to the browser to move the caret to the element with id="story5" rather than to the element that follows the paragraph in the HTML source.

Greg to file bug but not with a11ytf keyword until we shake out the suggested mechanism Bug 13540

  • HTML5 Status: HTML5 allows the user to mark up tables with column and row headings, but not specify whether the table's preferred reading order is by rows or by columns. HTML5 also provides the tabindex attribute to specify order, but it is optimized for a relatively small number of controls, and applies to controls that normally take keyboard focus but not to anchors or to text and other elements that are subject to navigation in caret browsing modes.
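
To make the proposed next hint concrete, the sketch below walks a list of content blocks in source order but follows a block's next reference when it has one. The next attribute is only the suggestion made above and does not exist in HTML5; ContentBlock and readingOrder are hypothetical names. The fallback rules (continue with the following unvisited block in the source, then append anything that was skipped) are assumptions of this sketch.

```typescript
interface ContentBlock {
  id: string;
  next?: string; // hypothetical hint: id of the logically next block
}

function readingOrder(blocks: ContentBlock[]): string[] {
  const indexById: Record<string, number> = Object.create(null);
  blocks.forEach((b, i) => {
    indexById[b.id] = i;
  });
  const visited: boolean[] = blocks.map(() => false);
  const order: string[] = [];
  let current = blocks.length > 0 ? 0 : -1;
  while (current !== -1) {
    const block = blocks[current];
    if (block === undefined) break;
    visited[current] = true;
    order.push(block.id);
    // Prefer the author's hint when it points at an unvisited block.
    const hinted = block.next !== undefined ? indexById[block.next] : undefined;
    if (hinted !== undefined && !visited[hinted]) {
      current = hinted;
      continue;
    }
    // Otherwise continue with the following unvisited block in the source.
    let following = -1;
    for (let j = current + 1; j < blocks.length; j++) {
      if (!visited[j]) {
        following = j;
        break;
      }
    }
    current = following;
  }
  // Blocks skipped over by a hint are appended in source order.
  blocks.forEach((b, i) => {
    if (!visited[i]) order.push(b.id);
  });
  return order;
}
```

Because every step moves to a block that has not been visited, circular hints cannot cause an endless loop.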

Facilitate navigating related pages

  • Use case: Jason wants to reduce the number of keystrokes he enters, so when reading web sites he doesn't want to have to tab through all the links and controls on a page just to use the link that takes him to the next page. Instead, he uses his browser's keyboard commands that load the next, previous, first, last, and main pages based on the link elements in the current page's header. Unfortunately, a significant number of sites, including some major news sites, fail to provide these elements, so Jason is forced to tab his way through their pages.
  • Recommendation: Provide a way to mark up a link or control to identify it unambiguously as leading to the next, previous, first, last page, etc. This could be an attribute on a link (e.g. rel="next") or other mechanism. Even though this would be redundant to the existing link elements, it may increase the number of sites that support automated navigation shortcuts.
  • HTML5 Status: HTML5 doesn't add anything beyond the existing link elements.
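
A browser command such as "go to next page" could look up its destination roughly as sketched below. PageLink and findRelatedPage are hypothetical names; the sketch assumes the user agent has already collected the rel and href values from the page's link elements (and, under the recommendation above, from marked-up anchors as well). It treats rel as a space-separated, case-insensitive list of tokens.

```typescript
interface PageLink {
  rel: string; // e.g. "next" or "prev index"
  href: string;
}

// Returns the href of the first link whose rel contains the wanted
// relation, or null if the page does not provide one.
function findRelatedPage(links: PageLink[], relation: string): string | null {
  const wanted = relation.toLowerCase();
  for (const link of links) {
    const tokens = link.rel.toLowerCase().split(/\s+/);
    if (tokens.indexOf(wanted) !== -1) return link.href;
  }
  return null;
}
```

When the function returns null, the user agent has nothing to act on, which is the situation Jason runs into on sites that omit these elements.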

Shortcut Keys

Greg to file bug catch-all and supporting bugs that depend on it for the issues in this section Bug 13555 is catch-all; supporting bugs are Bug 13564, Bug 13565, Bug 13575, Bug 13576, Bug 13576

Shortcut keys consist of both access keys (e.g. S or Alt+S to activate the Save button, or Alt+F to activate the File menu) where the user agent is trying to emulate behavior of the platform, and hotkeys (e.g. Ctrl+S to trigger the "save" action, Ctrl+Shift+S to trigger the "save as" action, or Ctrl+C to trigger the "copy" action). See detailed discussion above.

We want content and nested user agents to register their keybindings with their host in order to:

  1. allow negotiation to avoid conflicts (e.g. web app changes bindings that conflict with the browser)
  2. allow the user agent and tools to provide enhanced UI (e.g. list and/or modify keybindings)

Negotiating shortcut keybindings

There should be a mechanism for components (user agents, documents, web apps, embedded objects, accessibility aids, etc.) to negotiate which keyboard commands will be used by one or the other.

  • Why is this an accessibility issue:
    • Users with disabilities are much more likely to rely on keyboard access. For them, keyboard conflicts might present insurmountable barriers, while they'd only be minor inconveniences for users routinely using the mouse.
    • Users who cannot use a mouse often need to drastically increase the number of shortcut keys in order to make tasks more efficient, especially people for whom each keypress is time-consuming, tiring, or painful. An increased number of shortcuts increases the number of potential conflicts.
    • Users with some cognitive impairments have more difficulty adjusting when their accustomed methods suddenly fail to work, or when commands they use suddenly do something unexpected.
Negotiation between host and embedded object
  • Use case: An embedded object uses Shift+Esc for one of its control functions, but it's run on a browser that uses this same key combination as the method for returning focus to the browser. In the simple cases, either the user wouldn't be able to exit the object using the method they're familiar with, or else they couldn’t use some function in the object because the keystroke would exit the object instead.
  • Recommendation: I'd say that it is important that the user have a consistent way to exit all embedded objects, because without this they can become effectively trapped; even if there is a way out, the user may not know it or be able to look it up when needed. The embedded object needs to be able to determine that on this particular browser it cannot use a particular command (e.g. Shift+Esc) and adjust its command set, user interface and instructions accordingly. If the user does exit the embedded object using the host's command, the host should inform the embedded object so that it can "clean up" and handle the action gracefully.
  • HTML5 Status: Unknown
Negotiation with nested hosts
  • Use case: Pablo is used to pressing Ctrl+W to close a browser window. In his browser he's reading a page that contains an embedded user agent, and while browsing in THAT context he presses Ctrl+W to close the window. Unfortunately, the script being run by the inner user agent did not know that Ctrl+W was used by the outer user agent, so it grabs and consumes the keyboard input and carries out some action, so Pablo is unable to use his accustomed method to close the outer browser's window.
  • Recommendation: There should be some way for embedded objects, including nested user agents, to determine which shortcut keys are being used by all the layers hosting them, so they can modify their own shortcut keys to avoid conflicting with them.
  • HTML5 Status: Unknown
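
One way to picture this recommendation is a chain of hosting layers that an embedded component can walk to see whether a keybinding is already claimed. No such interface exists in HTML5; HostLayer, its fields, and findConflictingLayer are hypothetical names used only for this sketch, and keybindings are compared as plain case-insensitive strings.

```typescript
interface HostLayer {
  layerName: string;
  reservedKeys: string[]; // keybindings this layer uses for itself
  hostedBy: HostLayer | null; // the layer that hosts this one, if any
}

// Walks outward from the innermost host and returns the name of the first
// layer that already uses the binding, or null if no layer claims it.
function findConflictingLayer(
  binding: string,
  innermost: HostLayer
): string | null {
  const wanted = binding.toLowerCase();
  let layer: HostLayer | null = innermost;
  while (layer !== null) {
    const clash = layer.reservedKeys.some((k) => k.toLowerCase() === wanted);
    if (clash) return layer.layerName;
    layer = layer.hostedBy;
  }
  return null;
}
```

In Pablo's case the script in the inner user agent would learn that Ctrl+W belongs to the outer browser and could choose a different keybinding instead of consuming the keystroke.
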
Negotiation between host and content
  • Use case: Pablo is used to pressing Ctrl+F to find a string on the current page. However, an online encyclopedia grabs Ctrl+F and moves the focus to its own text input field, which is used to search the entire encyclopedia.
  • Recommendation: The script on the page should be able to query whether Ctrl+F is already assigned to something in the system (the host browser, a browser add-in, etc.), and if it is it can identify a new, unused keyboard input, map that to its control, and incorporate that into the instructions it presents to the user.
  • HTML5 Status: HTML5 allows the content to provide a list of suggested Unicode characters, but the user agent gets to decide on the actual key assignment, including base character and modifier keys. The content script can retrieve a user-friendly string representing the assignment. (That's enough for this use case but is too limited for some of the others.)
Disabling unmodified keys as shortcuts

Some components use unmodified letters, numbers, and punctuation marks as keyboard commands. This can be handy for users who want to make keyboard input as efficient as possible, including some users with disabilities, but for other users with disabilities it can be a significant problem because everyday text input can trigger a large number of seemingly random actions if it's entered in the wrong context. Therefore user agents should be permitted to make this available as a user option.

  • Use case: Tom uses speech recognition to input text and commands, and he's working in a Web-based word processor while in the background another Web app or browser add-in is downloading a large file. He's in the middle of dictating a letter when the background task steals the activation to notify him that the download has completed. Suddenly the text that was supposed to go into a letter is interpreted in the new context as dozens of commands. Tom looks at the browser and finds that his project in that context has been altered or deleted altogether, and also that display options have changed and he has no idea what command would serve to restore them.
  • Recommendation: Tom goes into his browser's preference settings and clears a check box to disable the use of unmodified keys as commands and shortcuts. When the Web app starts up, it asks the browser whether the letter "u" is available as a shortcut and is told that it is restricted by policy. Therefore the app goes down its list of preferred keystrokes, determines that Ctrl+U is available, and configures itself to use that instead. It may even display an indicator on its status bar warning the user that non-default keyboard commands are being used. The user can then go into the app's configuration screen to find out the current keybindings.
  • HTML5 Status: Unclear. The example in 7.4.2 shows that a user agent can use a key unmodified, but 7.4.3 merely says the user agent can assign its choice of "a combination of modifier keys", but does not specify whether no modifier key is a valid option.
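
The exchange described in the recommendation might look like the following. This is a sketch of an interface that does not exist today: HostKeyPolicy, queryAvailability, pickBinding, and the availability values are all hypothetical, and an "unmodified key" is approximated as a single-character binding.

```typescript
type Availability = "available" | "reserved" | "in-use" | "restricted-by-policy";

interface HostKeyPolicy {
  allowUnmodifiedKeys: boolean; // the user's preference setting
  reservedKeys: string[]; // reserved by the platform or browser
  keysInUse: string[]; // already claimed by other components
}

function queryAvailability(host: HostKeyPolicy, binding: string): Availability {
  // A single character with no modifier stands for an unmodified letter,
  // digit, or punctuation mark.
  if (!host.allowUnmodifiedKeys && binding.length === 1) {
    return "restricted-by-policy";
  }
  if (host.reservedKeys.indexOf(binding) !== -1) return "reserved";
  if (host.keysInUse.indexOf(binding) !== -1) return "in-use";
  return "available";
}

// The app walks its list of preferred keystrokes and takes the first one
// the host reports as available; null means it must do without a shortcut.
function pickBinding(host: HostKeyPolicy, preferred: string[]): string | null {
  for (const candidate of preferred) {
    if (queryAvailability(host, candidate) === "available") return candidate;
  }
  return null;
}
```

With Tom's setting in place, an app that prefers "u" and then "Ctrl+U" would be told that "u" is restricted by policy and would configure itself to use Ctrl+U.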

Retrieving actual keybindings

If the author can only suggest keyboard shortcuts using accesskey, how can they provide instructions to the user?

  • Why is this an accessibility issue?
    • Many users with disabilities rely entirely on the keyboard or keyboard emulators because they cannot physically manipulate a mouse, or see the screen, etc.
    • Many users have difficulty memorizing keybindings, and this is more extreme for users with some cognitive impairments.
    • Memorizing keybindings is more difficult when they are likely to be adjusted to avoid conflicts, and such adjustments are more common among people with disabilities. Users who cannot use a mouse often increase the number of shortcut keys in order to make tasks more efficient, especially people for whom each keypress is time-consuming, tiring, or painful, and many users rely on accessibility aids that use their own set of keyboard commands.
Retrieving user-friendly keybindings
  • Use case: In a web browser, Aaron views a Web page that has a button with accesskey="E". The author wants to incorporate instructions on the page or popup that explain the keyboard commands, but unfortunately they can't predict what keybinding the browser will use: the accesskey attribute is merely a suggestion, and the actual value will vary depending on both the browser and the platform it's running on.
  • Recommendation: HTML5 defines the new command object with a property called assignedaccesskey. To solve this we could define a new property, accessible through the DOM, that returns the actual keybinding associated with an element. In one potential implementation, the page script retrieves the anchor's accesskeystring= value, which Firefox 4 would set to "Shift+Alt+E", but Internet Explorer would set to "Alt+E", Opera would set to "Shift+Esc, E", Konqueror would set to "Ctrl+E", Safari 4 on MacBook Pro would set to "control+E" but on Windows would set to "Alt+E", etc. The script can then insert this string into the instruction paragraph on the page so that on Firefox it would read "To delete, press Shift+Alt+E", but on Internet Explorer it would read "To delete, press Alt+E", and so forth.
  • HTML5 Status: It looks like HTML5 defines a new command element with a programmatically-retrievable accessKeyLabel attribute, whose value the user agent calculates based on its accesskey attribute. The accessKeyLabel string is presumably human-friendly, but no specific guidance or examples are provided. (This also means that the string cannot be parsed by the script, as discussed below.)
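As a rough sketch of how a page script might use accessKeyLabel to build the instruction text described above: the function name and the fallback wording are assumptions, and plain objects stand in for DOM elements so the example runs outside a browser.

```javascript
// `element` is any object exposing accessKeyLabel, as a DOM element would in
// a user agent that implements the property.
function shortcutInstruction(action, element) {
  const label = element.accessKeyLabel;
  if (label) {
    return `To ${action}, press ${label}.`;
  }
  // No label reported: do not guess which modifiers the browser will add.
  return `To ${action}, use the "${action}" button.`;
}

// Stand-ins for the same button as two different user agents might report it.
const withLabel = { accessKey: "e", accessKeyLabel: "Shift+Alt+E" };
const withoutLabel = { accessKey: "e", accessKeyLabel: "" };

console.log(shortcutInstruction("delete", withLabel));
console.log(shortcutInstruction("delete", withoutLabel));
```

The script treats the label as opaque display text, which is all the draft guarantees.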
Retrieving machine-friendly keybindings
  • Use case: Aaron asks his web browser to display a list of the currently active keyboard shortcuts. In the draft HTML5 specification it would do this by enumerating the command elements and retrieving each one's accessKeyLabel property. The user agent wants to make the list more useful by offering views sorted and organized in different fashions, including a view that includes separate groupings for unmodified keys, Ctrl key combinations, Shift key combinations, and Ctrl+Shift key combinations. Unfortunately, it cannot easily parse the accessKeyLabel string, as it knows neither the names the user agent will use for modifier keys (e.g. "Ctrl", "control", "?", etc.) nor what method will be used to concatenate them (e.g. "Ctrl+a", "Ctrl/A", "^A", "A+Control", "?A", etc., all of which may depend on the platform, user agent, language, and/or locale).
  • Recommendation: I see two reasonable approaches. The first, analogous to that used in most native programming environments, would be for a property to return a programmatic representation of the keybinding, such as a list of codes representing keys and their modifiers (e.g. an integer representing the F5 key along with a mask of bits representing Alt and Shift modification). However, this should not be a substitute for returning a user-friendly string representation, as scripts could have a difficult time creating one from the programmatic representation. The machine-friendly version could either be returned by an element using a property analogous to accessKeyLabel, or the user agent could be required to provide a function that converts the string returned by accessKeyLabel into a machine-friendly form. The second approach would be to have a different function analogous to accessKeyLabel that returned a string compounded from strings that are language-, locale- and platform-neutral (e.g. "Ctrl+A" for the combination of the "A" key and the equivalent of the control key, even in environments where the latter is referred to as "control").
  • HTML5 Status: This is not supported; currently only a human-friendly string is returned (via the accessKeyLabel property).
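To make the two approaches in the recommendation concrete, here is a minimal sketch of a programmatic representation (a key plus a modifier bit mask) and a neutral string built from it. The MOD flag values, the binding shape, and both function names are invented for illustration; nothing like them is defined by HTML5.

```javascript
// Hypothetical modifier bit flags.
const MOD = { CTRL: 1, SHIFT: 2, ALT: 4 };

// Build a locale-neutral string such as "Ctrl+Shift+A" from a binding of the
// form { key: "A", modifiers: MOD.CTRL | MOD.SHIFT }.
function toNeutralString(binding) {
  const parts = [];
  if (binding.modifiers & MOD.CTRL) parts.push("Ctrl");
  if (binding.modifiers & MOD.SHIFT) parts.push("Shift");
  if (binding.modifiers & MOD.ALT) parts.push("Alt");
  parts.push(binding.key);
  return parts.join("+");
}

// Group bindings by modifier mask, e.g. to show unmodified keys, Ctrl
// combinations and Ctrl+Shift combinations as separate lists.
function groupByModifiers(bindings) {
  const groups = new Map();
  for (const binding of bindings) {
    if (!groups.has(binding.modifiers)) groups.set(binding.modifiers, []);
    groups.get(binding.modifiers).push(binding);
  }
  return groups;
}
```

With a mask available, the grouping in the use case is a simple lookup, whereas doing the same from a human-friendly label would require guessing at names and separators.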

Maximizing potential keyboard shortcuts

Specifying detailed hotkeys

Today's accesskey attribute is a single character, and it's up to the browser to decide whether that character will be used or another substituted, and what modifiers if any are required (e.g. accesskey="s" might map to Ctrl+S in one environment and Alt+Shift+S in another). That may make sense for access keys (e.g. S or Alt+S to press the Save button) where the user agent is trying to emulate behavior of the platform, but it is an unnecessary limitation for hotkeys (e.g. Ctrl+S to trigger the "save" action, or Ctrl+C to copy the current record). (See detailed discussion of access keys vs. hotkeys above.)

When defining hotkeys, developers of Web apps and authors of documents should be able to assign more specific keys, such as actually requesting Ctrl+I for one frequently used command and Ctrl+Shift+I for another less frequently used. This would let them assign shortcuts in a way that is more meaningful, both in terms of grouping and being easier to remember. For example, a page for reading email could assign shortcut Ctrl+R to the Reply button and Ctrl+Shift+R to the Reply All button, Ctrl+J to the Forward button and Ctrl+Shift+J to the Forward As Attachment button, and so forth. (Of course any specific key combination may already be used by the host or another component in the system, and some keys may not be available on the user's system, so the browser would either treat these as hints to be modified as needed or else the page's script could negotiate with the user agent as described elsewhere in this document.)

  • Use case: Roger is using a Web-based email client that has a row of buttons for things he can do to the selected message. The author wanted to make keyboard usage as efficient as possible and minimize the number of keystrokes that users such as Roger need to enter, so they assigned easily typed keyboard inputs to the most commonly used commands and more complex inputs to less frequently used commands. In this case, it assigns the shortcut Ctrl+R to the Reply button and Ctrl+Shift+R to the Reply All button, Ctrl+J to the Forward button and Ctrl+Shift+J to the Forward As Attachment button, and so forth.
  • Recommendation: Facilities to establish and adjust keybindings should allow content, add-ins, and the user to bind commands to specific key combinations, rather than merely specifying a single base character. HTML would let the author specify preferred keyboard inputs for accesskey and the like including recommended base keys and modifiers. For example, accesskey="ctrl+shift+i", using standardized, non-locale-specific names for keys and modifier keys. There should also be a way for the author to request that a key be unmodified, as discussed below in the section "Unmodified Keys as Shortcuts". These recommended keyboard inputs would be modified by the browser to accommodate impossible, reserved, restricted, or conflicting inputs. For example, when an HTML5 command element is used to create a menu item and the user agent wants to emulate native access key behavior on the Windows platform, it would limit the access key to a single character that could be used alone or with the Alt key depending on the situation, but when a command element is used to establish a hotkey for a scripted action the user agent could allow a wider range of modifiers and/or sequences.
  • HTML5 Status: Currently the author can only specify a single, unmodified Unicode code point, and the user agent assigns any combination of modifier keys it chooses with no further hinting from the author.
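A possible reading of the proposed accesskey="ctrl+shift+i" syntax, sketched as a small parser. The syntax itself is only a proposal in these notes, and the modifier names and the parseKeyCombo function are assumptions made for the example.

```javascript
// Hypothetical, non-locale-specific modifier names.
const MODIFIER_NAMES = new Set(["ctrl", "shift", "alt", "meta"]);

// Parse a value such as "ctrl+shift+i" into a base key and its modifiers.
// Returns null for values this sketch cannot interpret. The "+" key itself
// cannot be expressed in this simple syntax.
function parseKeyCombo(value) {
  const tokens = value.trim().toLowerCase().split("+");
  const key = tokens.pop();
  if (!key) return null; // e.g. "" or "ctrl+" has no base key
  const modifiers = new Set();
  for (const token of tokens) {
    if (!MODIFIER_NAMES.has(token) || modifiers.has(token)) return null;
    modifiers.add(token);
  }
  // Sort so that "shift+ctrl+i" and "ctrl+shift+i" compare as equal.
  return { key, modifiers: [...modifiers].sort() };
}

console.log(parseKeyCombo("ctrl+shift+i")); // { key: 'i', modifiers: [ 'ctrl', 'shift' ] }
console.log(parseKeyCombo("s")); // { key: 's', modifiers: [] }
```

An empty modifier list gives the author a way to say "unmodified", which the current single-character value cannot express.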
Sequences as shortcuts

Allowing key sequences to be used as commands greatly increases the number of keyboard commands that can be used, as well as the ability to make these commands mnemonic.

  • Use case: Jemiah is a keyboard user who wants to make her work as efficient as possible. She configures her Web-based word processor to set up shortcut keys for dozens of the built-in commands and macros she uses frequently, especially those that would normally take many keystrokes to carry out. However, she quickly runs out of good keystrokes, both because of the sheer number and because she needs to keep them mnemonic in order to easily remember them. Therefore she uses key sequences so she can group related commands together with the same prefix. For example, she uses Ctrl+R as the prefix for all the commands dealing with revisions, with Ctrl+R followed by S to show revisions, Ctrl+R followed by H to hide revisions, Ctrl+R followed by A to accept revisions, Ctrl+R followed by R to reject revisions, and so forth.
  • Recommendation: Facilities to establish and adjust keybindings should allow content, add-ins, and the user to bind commands to key sequences as well as single keys and key combinations. HTML would let the author specify preferred keyboard inputs for accesskey and the like including key sequences, such as accesskey="ctrl+r a" for Ctrl+R followed by the letter A.
  • HTML5 Status: Currently key sequences are not supported as shortcuts. [7 User interaction — HTML5 If the value is not a string exactly one Unicode code point in length, then skip the remainder of these steps for this value.]
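The prefix behaviour in Jemiah's use case can be sketched as a small matcher that remembers a pending prefix. This is a simplified illustration under assumed names (createSequenceMatcher, press) and an assumed space-separated notation for sequences; it fires a binding as soon as it matches and discards a keystroke that breaks a sequence.

```javascript
// `bindings` maps a space-separated sequence of strokes to a command name,
// e.g. { "ctrl+r a": "acceptRevisions" }.
function createSequenceMatcher(bindings) {
  const sequences = Object.keys(bindings).map((s) => s.split(" "));
  let pending = []; // strokes typed so far that form a valid prefix

  return function press(stroke) {
    const attempt = [...pending, stroke];
    const joined = attempt.join(" ");
    if (Object.prototype.hasOwnProperty.call(bindings, joined)) {
      pending = [];
      return { command: bindings[joined], waiting: false };
    }
    // Keep waiting only if some longer binding starts with these strokes.
    const isPrefix = sequences.some(
      (seq) => seq.length > attempt.length && attempt.every((s, i) => seq[i] === s)
    );
    pending = isPrefix ? attempt : [];
    return { command: null, waiting: isPrefix };
  };
}

const press = createSequenceMatcher({
  "ctrl+r s": "showRevisions",
  "ctrl+r h": "hideRevisions",
  "ctrl+r a": "acceptRevisions",
  "ctrl+r r": "rejectRevisions",
});
console.log(press("ctrl+r")); // { command: null, waiting: true }
console.log(press("a")); // { command: 'acceptRevisions', waiting: false }
```

A user agent would probably also want a timeout or an Escape key to cancel a pending prefix, which this sketch leaves out.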
Enabling unmodified keys as shortcuts

Some components use unmodified letters, numbers, and punctuation marks as keyboard commands. This can be handy for users who want to make keyboard input as efficient as possible, including some users with disabilities, but for other users with disabilities it can be a significant problem because everyday text input can trigger a large number of seemingly random actions if it's entered in the wrong context. Therefore user agents should be permitted to make this available as a user option.

  • Use case: Reggie finds it difficult to press key combinations, and even though many platforms provide the StickyKeys feature that lets her simulate them, she wants to author her own web site so that the shortcuts on the pages are accessed with single keys rather than key combinations.
  • Recommendation: See the section on Specifying detailed hotkeys.
  • HTML5 Status: See the section on Specifying detailed hotkeys.
Allowing non-character keys as shortcuts

Allowing non-character keys such as F12 and Del to be used as shortcuts significantly increases the number of keys and key combinations available.

  • Use case: Jeanine is developing a Web app and because most of the normal character keys are taken, she wants to have the F12 key activate the "Exit" link on the page, and Shift+F12 activate the "Save and Exit" button.
  • Recommendation: Provide a way to use keys that cannot be represented as a single Unicode code point. For example, define keywords such as "del", "num-del" and "f12", and allow their use in the accesskey attribute along with normal Unicode characters (e.g. accesskey="f12 del d").
  • HTML5 Status: Currently non-character keys such as F12 cannot be assigned as accesskey shortcuts, because only keys equating to a single Unicode code point are accepted (see 7.4.3 Processing model).
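The keyword idea in the recommendation could be processed along these lines. The keyword set contains only the names suggested above, and classifyAccessKeyTokens is a hypothetical helper; the single-code-point test mirrors the existing rule quoted from the HTML5 draft.

```javascript
// Hypothetical keywords for non-character keys, taken from the examples above.
const NAMED_KEYS = new Set(["del", "num-del", "f12"]);

// Split an accesskey value on ASCII whitespace and classify each token.
function classifyAccessKeyTokens(value) {
  return value
    .split(/[ \t\n\f\r]+/)
    .filter((token) => token !== "")
    .map((token) => {
      if (NAMED_KEYS.has(token.toLowerCase())) {
        return { token, kind: "named-key" };
      }
      // Array.from iterates by Unicode code point, not by UTF-16 code unit.
      if (Array.from(token).length === 1) {
        return { token, kind: "character" };
      }
      return { token, kind: "ignored" }; // what HTML5 does today for such tokens
    });
}

console.log(classifyAccessKeyTokens("f12 del d").map((t) => t.kind));
// [ 'named-key', 'named-key', 'character' ]
```

Because unknown multi-character tokens are ignored, content using the new keywords would degrade to its single-character alternatives in user agents that do not recognise them.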

Shortcuts for Navigation

Navigation shortcuts without visible links
  • Use case: Jeanine is creating a web page, and wants to define a shortcut that would assist users with disabilities by moving the keyboard focus and point of regard to a specific bookmark in the page. However, she doesn't want to have a link to that location be visible on the page.
  • Recommendation: It should be possible to define a keyboard shortcut for the sole purpose of navigation, rather than activation. It should be possible to target essentially any element in the document, and the command should be defined as navigating to without activating the element.
  • HTML5 Status: Unlike HTML4, HTML5 allows accesskey on all elements, and states that it creates "a keyboard shortcut that activates or focuses the element". However, it leaves the actual behavior undefined and thus up to the user agent, meaning that a user agent could implement it so that the keyboard command does nothing if the element does not take input. This should be clarified.
Distinguishing activation from navigation shortcuts

HTML4's specification for accesskey says "Pressing an access key assigned to an element gives focus to the element. The action that occurs when an element receives focus depends on the element." In actuality, whether the accesskey moves focus to and/or activates an element varies from one user agent to another. This means the same content or web app behaves differently in different browsers, and in some cases functionality may not be available at all.

  • Issue: Is it worthwhile to define a mechanism whereby authors can specify separate shortcuts for activating vs. navigating to an element, or one but not the other? Or do we merely expect user agents to provide some mechanism for activation and navigation without any input from the content?
  • Issue: Should the user be able to activate a button and afterward have the focus remain (or be restored to) where it was? Would this be the author's choice and/or the user's?
  • Use case: Juan is viewing a web page which has a link to a document, and the author has assigned an accesskey to that link. In one browser, Juan can press the accesskey to move the focus to the link, then press a key to display the link's shortcut menu, then select the command to show him the destination and title attributes of the link. However, in another browser pressing the accesskey activates the link, taking him away from the page he was on, so he cannot use the accesskey to navigate to the link, instead having to press the tab key a dozen times.
  • HTML5 Status: The HTML5 spec provides even less guidance than that for HTML4, saying merely "The accesskey attribute's value is used by the user agent as a guide for creating a keyboard shortcut that activates or focuses the element."
User choice between navigating and activating
  • Use case: Roger is using a Web-based email client that has a row of buttons for things he can do to the selected message. Roger can flag a message by clicking the Flag button or pressing the button's shortcut key. When he does either, the focus is left on the button, so after doing this he almost always has to navigate back to the message list before he can use the arrow key to select the next message. He would much rather have focus remain on the message, or move to the next message after activating the Delete button.
  • Recommendation: Let the author specify separate keyboard shortcuts for activating and navigating to an element. For example, HTML could replace or supplement the accesskey attribute with separate activationkey and navigationkey attributes; pressing the former would activate the element without moving focus, while pressing the latter would move the focus to the element without activating it, and if the values were the same then pressing the key would move focus to and activate the element.
  • HTML5 Status: Currently it appears that command elements can only be used for activation, not for navigation. It is implied, but not explicitly stated, that the target control should be activated without moving the focus.
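The split between activation and navigation keys proposed above might behave as in this sketch. The activationKey and navigationKey properties mirror the hypothetical attributes from the recommendation, and plain objects stand in for elements.

```javascript
// Decide what a key press should do for each element. `activationKey` and
// `navigationKey` are hypothetical; neither exists in HTML5.
function actionsForKey(elements, key) {
  const actions = [];
  for (const el of elements) {
    // Focus first, so that a key assigned to both moves focus, then activates.
    if (el.navigationKey === key) actions.push({ id: el.id, action: "focus" });
    if (el.activationKey === key) actions.push({ id: el.id, action: "activate" });
  }
  return actions;
}

const toolbar = [
  { id: "flag", activationKey: "f" }, // activate, focus stays on the message
  { id: "search", navigationKey: "s" }, // move focus only
  { id: "reply", activationKey: "r", navigationKey: "r" }, // both
];
console.log(actionsForKey(toolbar, "f")); // [ { id: 'flag', action: 'activate' } ]
```

In Roger's case the Flag button would carry only an activation key, so focus would never leave the message list.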

Emulating non-keyboard operations

Simulating drag and drop
  • Use case: June is interacting with a Web page that uses HTML5's drag and drop facilities. For users like June who can't use a mouse, the web browser provides a keyboard mechanism that lets her carry out all the drag and drop operations using only the keyboard, including not only the normal drag and drop but also the behavior of drag and drop modified by various modifier keys. June presses the tab key until the focus is on the element she wants to drag (and she may have had to turn on a special mode to add drag sources to the tab order), and from its shortcut menu chooses the "Drag and drop" submenu, then "Select for Shift + drag and drop". She then presses the tab or an access key to move the focus to the drag target, and from its shortcut menu chooses the "Drag and drop" submenu, then "Drop". The browser sends the proper events to the page's script to simulate all stages of the process, including dragstart, drag, dragenter, drop, etc.
  • Issue: Does HTML5 provide everything it needs to allow a user agent to enable this type of feature?
  • Issue: Can the content tell the user agent what commands are supported, e.g. drag for move, shift+drag for copy? Can there be a way to register these?
  • HTML5 Status: Unknown.
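As a sketch of what "simulate all stages of the process" might involve, the function below lists the events a user agent could dispatch for a keyboard-driven drag, in the order a pointer-driven drag normally produces them. The function name and the shape of each step are assumptions, and a real user agent would fire drop only if the target's handlers accepted the drag.

```javascript
// Plan the events for a keyboard-initiated drag from one element to another.
// `modifiers` records the simulated modifier keys, e.g. ["shift"] for the
// "Select for Shift + drag and drop" menu item in the use case.
function keyboardDragPlan(sourceId, targetId, modifiers = []) {
  const step = (type, at) => ({ type, at, modifiers });
  return [
    step("dragstart", sourceId),
    step("drag", sourceId),
    step("dragenter", targetId),
    step("dragover", targetId),
    step("drop", targetId),
    step("dragend", sourceId),
  ];
}

console.log(keyboardDragPlan("item-3", "trash", ["shift"]).map((s) => s.type));
```

Carrying the modifiers on every step matters because page scripts commonly inspect them to choose between move and copy.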

Greg to file bug to be sure drag and drop fully supported Bug 13591

Greg to file bug about allowing content to register what it supports Bug 13593

Other Keyboard Issues

  • Handling shortcut conflicts. While we can define mechanisms to let dynamic content, add-ins and the like negotiate to try to avoid conflicts, they are likely to still occur at times. With HTML4 and below, the behavior in such circumstances is left undefined, and consequently is handled differently by different user agents and so cannot be planned for.
    • Issue: Should the author be able to suggest ways to handle shortcut conflicts? If an author could specify that pressing a shortcut would move focus sequentially between the elements that have that accesskey, and wrap, the author could design keyboard UI to take advantage of this in a way that is not otherwise possible. On the other hand, issues such as whether sequential navigation wraps should ideally be under the user's control, since wrapping is a significant advantage to some users (e.g. those who need to minimize the number of input commands) but a disadvantage to others (such as those who may not be able to tell when wrapping has occurred).
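The sequential behaviour discussed in this issue, with wrapping as a setting, could look like the following. The function name and the convention of -1 for "nothing focused yet" are assumptions for the sketch.

```javascript
// Given `count` elements sharing one accesskey and the index of the one
// currently focused (-1 if none), return the index to focus next.
function nextMatchIndex(count, currentIndex, wrap) {
  if (count === 0) return -1; // no element uses this key
  const next = currentIndex + 1;
  if (next < count) return next;
  // At the last match: wrap to the first, or stay put if wrapping is off.
  return wrap ? 0 : currentIndex;
}

console.log(nextMatchIndex(3, 2, true)); // 0
console.log(nextMatchIndex(3, 2, false)); // 2
```

Keeping `wrap` as a parameter leaves room for either the author or the user to decide, which is exactly the open question above.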

Greg to file bug Bug 13594

  • Partially downloaded content. What if a web page pulls down content as needed (e.g. Google Maps), and elements in the content may have keyboard shortcuts, but some of those elements are in portions of the content that won't be downloaded until needed?
    • Issue: Should the Web page be able to download a complete list of the keyboard shortcuts associated with elements in the content, and have the host user agent notify it when those keys are pressed so that it can download and present the corresponding portion of the content? Is this necessary, or since the as-needed downloading is handled by scripting, is it simply the script's responsibility to handle these shortcuts as well?
    • Use case: A web page displays a list of tens of thousands of names in alphabetical order, with a heading for each initial letter, and a shortcut for each heading. Rather than downloading the entire list, it wants to download portions of the content only as they're needed, in response to the user scrolling the window, moving the text cursor through the content, or pressing a shortcut key associated with the headings.
  • Presenting access keys to the user.
    • Issue: Can there be ways to automatically incorporate the shortcuts into the way content is presented? For example, many GUIs underline the shortcut key in a text label, but that doesn't work with browsers that underline all link text. Do most browsers still not present accesskeys defined by the element? Is there anything that could or should be done in the protocols or formats to assist with this?
    • Recommendation: Where the label element is described in the HTML5 spec, it should specifically discuss and recommend use of the accesskey attribute, and give an example of how a user agent may not only use this value but also present it to the user. For example, when displaying label text a user agent could underline the first occurrence of the accesskey character in the displayed label text (called an implicit designator), or if the character does not appear in the label text, it could append a space and the underlined accesskey character in parentheses (called an explicit designator).
  • Distinguishing between access keys and hotkeys
    • I'm concerned that perhaps HTML5 should distinguish more clearly between access keys, hotkeys, and menu items. If they should behave differently, should the same element (command element) really be used for both? Even if the user agent can distinguish between them, treating them differently, will it confuse authors?
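The implicit and explicit designator rule from the label recommendation above can be sketched as a function that splits the label around the character to be emphasised. The function name and the returned shape are assumptions; the sketch expects a single-character accesskey and leaves the actual styling to the user agent.

```javascript
// Split label text around the access key so the designator can be styled.
// Assumes `accessKey` is a single character.
function splitLabelAtAccessKey(labelText, accessKey) {
  const index = labelText.toLowerCase().indexOf(accessKey.toLowerCase());
  if (index === -1) {
    // Explicit designator: append the key in parentheses, e.g. "Print (X)".
    return {
      before: labelText + " (",
      designator: accessKey.toUpperCase(),
      after: ")",
      explicit: true,
    };
  }
  // Implicit designator: emphasise the first occurrence within the label.
  return {
    before: labelText.slice(0, index),
    designator: labelText.slice(index, index + accessKey.length),
    after: labelText.slice(index + accessKey.length),
    explicit: false,
  };
}

console.log(splitLabelAtAccessKey("Save", "s"));
console.log(splitLabelAtAccessKey("Print", "x"));
```

Returning the parts rather than markup lets a user agent choose underlining, bold, or a spoken cue as appropriate.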

Greg Lowney

Defining generic commands that have associated keybindings is an extremely powerful mechanism that lets user agents give the user control over keybindings. One use of this is to automatically generate documentation for the user providing a reference and guide to the keyboard commands as they're actually configured. However, a long, unorganized list of keybindings, while better than nothing, is still extremely difficult to use. This could be much easier if the user agent (or tool) could organize the list, as well as allowing the user to filter and navigate it intelligently. This would be possible if the author could provide hints with each command, such as recommended categories or keywords.

Use case: Carlos relies on the keyboard, and command keybindings are a very important way for him to perform tasks efficiently. He is using a web-based application, and asks his web browser to present a list of all the commands defined by the web-app, which he can consult and print out for future reference. The browser has already processed all the commands defined in the HTML source, including those created by interactive elements with accesskey as well as those command elements that associate an action with a keyboard input. Unfortunately, this list is very long, especially if it's combined with the browser's own commands. Luckily, the browser's dialog box contains buttons for sorting the list alphabetically or by category (e.g. commands relating to tables, commands relating to view options, commands for formatting, etc.), and it's able to do that because the author was able to supply a user-friendly name and category (or keywords) for each command. When Carlos wants to look up a command but doesn't know the name assigned to it, or wants to look up a bunch of related commands, he can use the category view, just like those provided in printed software user guides. When he already knows the official name of the command, he can use the alphabetical view to find it quickly. Note that this is particularly important when Carlos moves between different user agents that assign different keybindings to the author's commands.

HTML5 Status: HTML5 lets the author provide a user-friendly name for command elements, but

Recommendation: HTML5 should allow the author to associate a category or, even better, multiple categories or keywords, with each author-defined command. My preferred method would be to allow a keywords or tags attribute on any element that can be used to define a command (that is, all the methods described in section 4.11.5 Commands), as discussed in more detail elsewhere in this document (see 11 Standard pieces of information should be automation-friendly). There would also be value in letting the author define a primary keyword or category, but I think that's less critical than allowing multiple keywords or categories.
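To illustrate how a user agent might build the category view from such keywords, here is a minimal sketch. The command objects, the keywords property, and the "uncategorized" bucket are all invented for the example.

```javascript
// Group command names under each of their keywords, then sort categories and
// the names within each. A command with several keywords appears several times.
function groupByKeyword(commands) {
  const groups = new Map();
  for (const command of commands) {
    const keywords = command.keywords.length > 0 ? command.keywords : ["uncategorized"];
    for (const keyword of keywords) {
      if (!groups.has(keyword)) groups.set(keyword, []);
      groups.get(keyword).push(command.name);
    }
  }
  return [...groups.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([keyword, names]) => [keyword, names.sort((a, b) => a.localeCompare(b))]);
}

const commands = [
  { name: "Insert row", keywords: ["tables"] },
  { name: "Bold", keywords: ["formatting"] },
  { name: "Delete row", keywords: ["tables", "editing"] },
];
console.log(groupByKeyword(commands));
```

Allowing multiple keywords means "Delete row" can be found under both tables and editing, which is the main argument for keywords over a single category.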

Greg to file bug Bug 13616

Andrew Kirkpatrick

The sections under 7.5.1 (Move the Caret and Change the selection) provide this as a vague indication of potential keyboard support:

"This could be triggered as the default action of keydown events with various key identifiers and as the default action of mousedown events."

Does this suffice as a requirement for keyboard access? Do we expect that the spec should indicate this more clearly, or is this expected to be handled by browsers wanting to comply with Section 508 or other accessibility standards?

Andrew to file bug Bug 13573


Under review:

Rich Schwerdtfeger

Title for section

Section 7.5 should be renamed "editing host" or something like that, as the current name implies the section only applies to the contenteditable attribute. The section should be reworked.

  • Where it becomes confusing is designMode. designMode places the entire document in an editing mode and should also be considered an editing host.
  • DesignMode (section 7.5.2) is a subsection of 7.5, the contenteditable attribute. Yet, 7.5.1 defines User Editing Actions. It is not clear that User Editing Actions apply to both contenteditable and designMode. I think we agree that they should and that this is intended, but that is not how section 7.5 is structured. Where this becomes important is that 7.5.1 describes the user functions that apply to an editing section. This is important for mainstream clarification and for accessibility to ensure access to Editing Hosts. These also need to apply to designMode. Fortunately, the user functions (move the caret, etc.) are defined in a device independent way. 7.5 should be restructured as follows:

7.5 Editing hosts

Introductory text should state that this applies to elements having contentEditable in the true state and designMode in the enabled state.

7.5.1 contentEditable
7.5.2 designMode
7.5.3 User Editing Actions for Editing Hosts

Another approach could be to state that designMode="on" means the equivalent of "contentEditable" being true as applied to the entire document.

Rich to file bug Bug 13416

Consistent navigation

Section 7.5 The specification must mandate and specify consistent navigation across browsers or at least across browsers on the same platform.

Specifically, one of the major problems we see between different browsers is how each one implements navigation and default editing (e.g. delete, backspace etc.) differently. This has resulted in CKEditor implementing special keyboard handling and strange DOM manipulations in order to try and get the experience the same (or close to) between browsers. Here's a common example: in IE, if the caret is at the end of a link and the user hits a backspace, IE deletes the link but leaves the caret in the same position with the text of the link unchanged. In FF, the backspace will delete the last character in the link, preserving the link element.

This problem impacts people with disabilities, mainstream keyboard users, and developers who use editing hosts in their web applications.

Rich to file bug Bug 13429

Spelling and grammar in designMode

7.6 Spelling and Grammar checking: Spelling and grammar checking should be made available for designMode as well. By mentioning only the contenteditable attribute when referring to an editing host, the text implies that designMode is not applicable, when in fact the entire document becomes an editing host.

This text:

"User agents can support the checking of spelling and grammar of editable text, either in form controls (such as the value of textarea elements), or in elements in an editing host (using contenteditable)."

Should change to: "User agents can support the checking of spelling and grammar of editable text, either in form controls (such as the value of textarea elements), or in elements in an editing host (using contenteditable or designMode)."

Rich to file bug Bug 13431

Greg Lowney

Use case: Nadia is editing text on a web page, in a textarea or an element with contenteditable=true. Her browser's spelling checker is turned on, and the region has spellcheck=true, so the browser's spelling checker marks up spans with a red, wavy underline to indicate what it thinks is a spelling error, and a green, wavy underline to indicate what it thinks is a grammar error. Following the advice in the HTML5 spec (4.6.19 The mark element), the browser marks up these phrases with u elements, distinguishing them using different classes. Unfortunately, these classes are meaningless to Nadia's screen reader, which is unable to distinguish them from ordinary underlines; that is, it has no idea what the classes mean, and whether they have semantic meaning or merely stylistic effects. If only they were marked up in a standardized way, her screen reader could use spoken phrases, audio icons, or inflections to indicate which ranges were flagged as spelling errors, etc.

Recommendation: HTML5 should define a standardized mechanism for marking up phrases flagged by a spelling checker or the like, so that content scripts or assistive technology could react to them intelligently. One option would be to extend the mark element with one or more new attributes, such as a flag attribute with enumerated type whose values could be spelling (e.g. "generral") or grammar (e.g. "He run"), or syntax (e.g. coding errors such as end if without preceding if), and also a meaning attribute that could be set to a user-friendly string that would convey the meaning of the markup to the user (e.g. <mark meaning="Character Set"> in a multilingual editor, or <mark meaning="Break Point"> in a debugger). Alternatively, a new element could be created for this purpose, rather than extending the use of the mark or u elements. (It's also worth noting that the important aspect of these indicators is *not* that they're underlines, but that they're spelling or grammar errors, and therefore using the u element is really inappropriate.)
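If such attributes existed, assistive technology could turn them into spoken cues roughly as sketched here. The flag and meaning properties mirror the hypothetical attributes proposed above, and the phrases and function name are invented for illustration.

```javascript
// Hypothetical spoken phrases for the proposed enumerated `flag` values.
const FLAG_PHRASES = new Map([
  ["spelling", "possible spelling error"],
  ["grammar", "possible grammar error"],
  ["syntax", "possible syntax error"],
]);

// Build the text a screen reader might speak for a marked range. An
// author-supplied `meaning` string takes precedence over the generic phrase.
function announceMark(mark) {
  const phrase = mark.meaning || FLAG_PHRASES.get(mark.flag);
  return phrase ? `${mark.text}, ${phrase}` : mark.text;
}

console.log(announceMark({ text: "generral", flag: "spelling" }));
// "generral, possible spelling error"
```

The point is that the screen reader keys off a standardized value rather than a class name it cannot interpret.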

Greg to file bug Bug 13504

Andrew Kirkpatrick

Spellcheck example

The current text reads:

The element with ID "b" in the following example would have checking enabled (the leading space character in the attribute's value on the input element causes the attribute to be ignored, so the ancestor's value is used instead, regardless of the default).

<label>Name: <input spellcheck=" false" id="b"></label>

Recommend change to “The element with ID "b" in the following example would have checking enabled when a user enters a value into the rendered input control”. It is a little confusing that the input is checkable but there is nothing to check.

Bug 13519 but did not add a11y keyword, appears to be editorial
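The inheritance behaviour behind the quoted spec example can be sketched as below. This is a simplified model, not the spec's algorithm verbatim: function names are invented, the chain lists the spellcheck attribute values from the element outward, and the enabling ancestor (not shown in the excerpt above) is assumed to carry spellcheck="true".

```javascript
// Map an attribute value to a state. A missing attribute or an invalid value,
// such as " false" with a leading space, falls back to "default".
function spellcheckState(value) {
  if (value === null || value === undefined) return "default";
  const v = value.toLowerCase();
  if (v === "" || v === "true") return "true";
  if (v === "false") return "false";
  return "default";
}

// `chain` holds attribute values from the element up through its ancestors
// (null where the attribute is absent). `uaDefault` is the user agent's own
// default, used when nothing in the chain gives a definite answer.
function resolveSpellcheck(chain, uaDefault) {
  for (const value of chain) {
    const state = spellcheckState(value);
    if (state !== "default") return state === "true";
  }
  return uaDefault;
}

// input spellcheck=" false" inside a label, inside an assumed ancestor
// with spellcheck="true":
console.log(resolveSpellcheck([" false", null, "true"], false)); // true
```

This shows why the leading space matters: the value is not trimmed, so it is treated as invalid and the ancestor's setting applies.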

Moving the caret with the keyboard

The sections under 7.5.1 Move the Caret and Change the selection provide this as a vague indication of potential keyboard support:

“This could be triggered as the default action of keydown events with various key identifiers and as the default action of mousedown events.”

It is desirable to make this more explicit, so UA's are aware that support for interaction without a mouse is a requirement.

Bug 13573

Drag and Drop

Under review:

Note reference to Drag and Drop in Focus#HTML5.

Comments from NV Access (Mick Curran)

"I have read through both, and nothing bad jumps out at me accessibility-wise. Of course though we are really only qualified to look from a screen reader perspective.


The drag and drop section is probably closer to home for us. But having read it, I feel that everything is covered. At least from a DOM point of view anyway. I'm not sure if it is the job of that document to mention things like the fact that the draggable attribute should be mapped to an accessibility API rather identical to how the ARIA drag and drop properties are?

Neither of these sections, and therefore I assume the entire document, mentions ARIA or accessibility specifically, so I guess these points should be left up to the Browser and other accessibility-specific documents."

Andrew Kirkpatrick

Drag and drop on systems that have pointing devices

The Drag and Drop description around pointing devices is too restrictive and suggests that browsers only need to support non-mouse operations on devices without a mouse. This functionality needs to be available for obligatory keyboard users even on systems that support a mouse.

Current: 7.7 Drag and Drop: On media without a pointing device, the user would probably have to explicitly indicate his intention to perform a drag-and-drop operation, stating what he wishes to drag and where he wishes to drop it, respectively.

Recommend change to: “Drag and drop operations must be able to be accomplished without a pointing device. The user would probably have to explicitly indicate his intention to perform a drag-and-drop operation, stating what he wishes to drag and where he wishes to drop it, respectively.”

Bug 13520

AAPI support for dropzone state

Is there an ability within IA2 or other APIs to support the "dropzone" state, so as to indicate to users that an area is available to drop content being dragged?

Bug 13574


Under review:

James Nurthen

Oracle uses @abbr on td/th; it is supported and should not be deprecated.

James to file bug Bug 13614

Greg Lowney

@axis related to some table navigation preferences stuff.

Greg to file bug instead, added comment to Bug 13539


User understandable rendering of Element Names and Roles

Migrated from UA wiki 26 July 2011

The addition of new semantic elements and roles within HTML5 implies that the elements must be understandable by users of assistive technologies. Authors can use a variety of techniques to supply human-readable labels that name or describe the element. In cases where the UA does not identify an authored label, it is suggested that the UA announce the element name. For example, the <details> element has an optional <summary> element that provides a label for or summary of the content of the details. For the case where the summary element is not used, the HTML5 specification states:

The first summary element child of the element, if any, represents the summary or legend of the details. If there is no child summary element, the user agent should provide its own legend (e.g. "Details").

To support internationalization, semantic element names or roles, reported by the UA in lieu of an author specified replacement text, must be localized by the UA prior to any visual rendering and before passing to the Accessibility API.
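
The fallback and localization behaviour described here might look roughly like the following. This is a sketch under stated assumptions: the table `FALLBACK_LEGENDS`, its translations, and the function `detailsLegend` are hypothetical and are not part of HTML5 or of any user agent.

```typescript
// Hypothetical table of localized fallback legends, keyed by primary language subtag.
const FALLBACK_LEGENDS: Record<string, string> = {
  en: "Details",
  fr: "Détails",
  de: "Details",
};

// Returns the legend a UA would render and pass to the accessibility API.
// authorSummary is null when the details element has no child summary element.
function detailsLegend(authorSummary: string | null, locale: string): string {
  // An author-supplied summary always takes precedence.
  if (authorSummary !== null) {
    return authorSummary;
  }
  // Otherwise localize the UA-provided legend, falling back to English.
  const [language = "en"] = locale.toLowerCase().split("-");
  return FALLBACK_LEGENDS[language] ?? "Details";
}
```

The important point is the ordering: localization happens before the string reaches either the visual rendering or the accessibility API, so both see the same text.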

Miscellaneous Comments on Specification (or applying Principle 3, Understandable)

Migrated from UA wiki 26 July 2011


The term "checkedness" is used 44 times in the HTML5 specification but is not defined. What is the definition of checkedness?

Global accessibility settings

Migrated from UA wiki 26 July 2011

Issue: Should the HTML5 spec define a standard, platform-independent way for content to query the user agent's accessibility settings, and by extension platform settings that are known to the user agent? Are there any equivalents today?

Use case: All major operating systems support a "high contrast" mode that tells software the user wants high contrast between foreground and background. Yev turns on this option, and in his browser he loads a web-based flow chart editor that displays all its document content in an HTML5 canvas element. The flow chart editor wants to detect when the user has high contrast mode turned on so it can adjust its graphical display appropriately. Because it's designed to run on any browser and any operating system, it needs a platform-independent means of querying this setting.
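
One possible shape for such a query is sketched below, using the CSS media features `forced-colors` and `prefers-contrast`. It is offered only as an illustration: the function `prefersHighContrast` and the `MatchMedia` type are hypothetical, and the matcher is passed in as a parameter so that the logic does not depend on any particular browser or platform.

```typescript
// Minimal shape of what a media-query matcher returns.
interface MediaQueryResult {
  matches: boolean;
}

// Hypothetical type for a matcher function such as a wrapper around window.matchMedia.
type MatchMedia = (query: string) => MediaQueryResult;

// Reports whether the user appears to have requested high contrast.
function prefersHighContrast(matchMedia: MatchMedia): boolean {
  return (
    matchMedia("(forced-colors: active)").matches ||
    matchMedia("(prefers-contrast: more)").matches
  );
}

// In a browser, the flow chart editor could call:
//   prefersHighContrast((query) => window.matchMedia(query))
// and redraw its canvas content with a higher-contrast palette when the result is true.
```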

Use case: Kevin has turned on the "Show Extra Keyboard Help" option in the Windows Control Panel, which tells all software that he wants any and all options that enhance keyboard access to be automatically enabled. His web browser responds to this setting by, for example, always displaying the underlined access keys in menu and control labels. He would like web pages and web apps to also respond to this setting, even if they're creating custom controls.

Greg to file bug to provide global accessibility preference settings Bug 13619

Greg to file bug about privacy issues of global accessibility preference querying, including inferring disability from, e.g., custom style applied to an element, detecting whether keyboard or mouse sent an activate event, etc. Bug 13617

Discuss global accessibility issue in task force, along with the privacy issues such a setting would raise

Coordinate with Privacy WG on ways of addressing privacy implications of detecting AT use