DRAFT UAWG comments about 6 May 2003 XHTML 2.0

Status of this document

These comments on the 6 May 2003 version of XHTML 2.0 were prepared for discussion at the 15 May 2003 UAWG teleconference. Ian has discussed many of these comments with Steven Pemberton, Chair of the HTML WG. The expectation is that the UAWG, in conjunction with the PF WG and possibly the QA WG, will use these comments as the basis for discussions with the HTML WG.

These comments incorporate earlier comments:

  1. Comments from 7 Mar 2003 UAWG ftf meeting
  2. Comments from Ian Jacobs
  3. Comments from Matt May

Last modified: $Id: xhtml2-comments.html,v 1.1 2003/05/14 20:46:59 ijacobs Exp $

How these comments are organized

  1. Comments related to user agent conformance
  2. Comments related to accessibility themes
  3. Miscellaneous comments
  4. New elements?

This list does not include editorial comments, although there are some requests for clarification.

1. Comments related to user agent conformance

  1. Identify user agent processing, behavior, rendering
  2. Conformance to WAI Guidelines
  3. Relation between hint attributes and protocol headers

1.1 Identify user agent processing, behavior, rendering

It will make the specification easier to read and implement if requirements and recommendations for user agent processing, behavior, and rendering are clearly identified. This should make the specification easier for implementers to use, and should clear up some ambiguities (not listed here). It will also allow the UAWG to help the HTML WG include appropriate references to UAAG 1.0 checkpoints.

The proposal is that each element and attribute definition (mostly element) uniformly identify the following information:

  1. Clearly identify required user agent processing.
  2. Clearly identify required user agent behavior that is perceivable by the user.
  3. Clearly identify default user agent behavior that is perceivable by the user.
  4. Clearly identify required user agent rendering.
  5. Clearly identify default user agent rendering.
  6. Clearly identify recommendations for processing/behavior/rendering (and clearly distinguish them from other requirements).
  7. Clearly identify requirements for configurability, what behavior must be available in some configuration (related to 2), and what may be accomplished through configuration (whether through the user interface or via a configuration file).

Thus, an element definition would include and distinguish (in addition to attribute definitions):

  1. The type of information identified by the element
  2. Processing
  3. Behavior that the user perceives
  4. Configuration
  5. Rendering

Furthermore, it is likely that some classification of user agents will be required, as rendering and behavior may vary according to input and output modality. Output modality is probably more relevant. Suggested classification:

  1. Visual output
  2. Audio output
  3. Paged mode output (if footer element adopted)
  4. Any output

Example for lists (section 11.3)

Here is a quick example of how the spec should distinguish processing/behavior/rendering:

Please specify required processing for ol:

  1. Generate an integer counter for each li that is available for rendering before the li content.

Please specify default processing for ol:

  1. For each ol, the counter value is initially one.

Please specify default rendering for visual user agents (ol):

  1. Render counters in the appropriate glyphs according to specification (including style sheets, mime headers, xml:lang, and user preferences).

Please specify default rendering for visual user agents (ul):

  1. Render a bullet glyph before the content of the li element according to specification (including style sheets, text direction, and user preferences).

    Thus, the shape of the bullet may vary according to available glyphs, user agent defaults, and style sheets. The position of the glyph depends on the writing direction.

This example doesn't include required or default behavior. A good example of where to include default behavior: "what happens when I click on a link established by an a element?"
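
To illustrate the split requested above for ol, here is a minimal Python sketch that keeps processing (generating counters) separate from one possible visual rendering. The names ListItem, process_ol, render_visual, and the start parameter are hypothetical and do not come from the XHTML 2.0 draft; the decimal "1." format is just one assumed default rendering.

```python
from dataclasses import dataclass


@dataclass
class ListItem:
    counter: int   # produced by processing, independent of output modality
    content: str


def process_ol(items, start=1):
    """Processing: attach an integer counter to each li.

    The default of 1 for `start` models the default-processing rule that
    the counter value is initially one (the parameter is hypothetical).
    """
    return [ListItem(start + i, text) for i, text in enumerate(items)]


def render_visual(processed):
    """One assumed default visual rendering: decimal glyphs before content."""
    return [f"{item.counter}. {item.content}" for item in processed]


items = process_ol(["Identify processing", "Identify rendering"])
for line in render_visual(items):
    print(line)
```

An audio user agent could reuse process_ol unchanged and supply a different rendering function, which is the point of stating the two kinds of requirement separately.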

1.2 Conformance to WAI Guidelines

  1. Include references to WCAG/ATAG/UAAG/XAG
  2. Include links to relevant checkpoints in context. Once requirements and recommendations are clearly identified (and, e.g., in bulleted lists), then finding the appropriate reference to checkpoints will be easy.

Steven suggested that it was more likely that accessibility requirements would be part of a separate appendix than in context.

1.3 Relation between hint attributes and protocol headers

Steven Pemberton clarified for me that the HTML WG's expectation is to remove some hint attributes, and make others part of client accept headers (e.g., "type"). Steven cited the use case of serving two URIs, one ending in a.html and the other in a.css. The author may only wish for the CSS to be "acceptable" in a particular context.

2. Comments related to accessibility themes

  1. Rendering and style sheets
  2. Content
  3. Navigation

2.1 Rendering and style sheets

Descriptions of rendering requirements
  1. Define in CSS terms (even if CSS implementation not required). Steven said the HTML WG was planning to do this.
User control of rendering
  1. Allow the user to override the default user agent rendering, and to change configurations per the rendering requirements of UAAG 1.0.
  2. User agents should allow users to create simplified views of content. For example, there are Mozilla extensions to generate an outline from headings, Amaya has an outline view, etc.
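
As a rough illustration of the outline view mentioned in item 2, the following Python sketch collects headings from a well-formed document into an indented outline. It assumes h1 to h6 heading elements and uses the standard library's ElementTree; the function name outline and the sample markup are made up for this example, and it does not describe how the Mozilla extensions or Amaya actually work.

```python
import xml.etree.ElementTree as ET

# Assumed heading vocabulary; a real user agent would follow the spec.
HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}


def outline(xhtml_source):
    """Return (level, text) pairs for each heading, in document order."""
    root = ET.fromstring(xhtml_source)
    result = []
    for element in root.iter():
        # Drop any namespace prefix of the form "{namespace-uri}name".
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in HEADINGS:
            text = "".join(element.itertext()).strip()
            result.append((HEADINGS[local_name], text))
    return result


SAMPLE = """<html><body>
<h1>Comments</h1><p>Intro text.</p>
<h2>Rendering</h2><p>More text.</p>
<h2>Navigation <em>and</em> focus</h2>
</body></html>"""

for level, text in outline(SAMPLE):
    print("  " * (level - 1) + text)
```
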
XHTML 2.0 does not remove all presentation elements
Steven and I discussed the reduction of presentation elements in XHTML 2.0. I argued that the block/inline distinction should be reduced if not removed, but Steven pointed out that some authors consider 'quote' and 'blockquote' to be semantically different. He also pointed out that 'pre' could be interpreted as affecting whitespace input, not just output (e.g., for a programming language where whitespace is significant). I argued that an element like 'em' was not inherently block or inline; I might want to emphasize 2 paragraphs. Could some content models be expanded? Why have 'div' and 'span'? I don't expect to pursue this further at this point. I have asked that the HTML WG give a better explanation of why the content model exists and, for each inline element, answer the following: if there's a corresponding block element, can the two be merged? If there isn't a corresponding block element, why not?
Requirements for style sheets
  1. User must be able to apply style sheets to conditional content (see below).
  2. User must be able to choose from alternative style sheets. (Steven agreed.)
  3. User must be able to turn off author and user style sheets (leaving user agent default style sheet). (Steven agreed.)

Finally, it is important for accessibility that the user be able to control rendering. It is, of course, also important that specifications be followed. The rendering of an XHTML 2.0 document depends on a combination of (at least):

  1. User agent default rendering requirements.
  2. Discretionary user agent rendering
  3. Style sheets
  4. User preferences (expressed via style sheets or in other ways).

I think that there needs to be a statement in the XHTML 2.0 spec that the user needs to be able to override user agent default rendering:

  1. Where required by UAAG 1.0 (e.g., control of text size, text color, etc.), and
  2. According to the appropriate specification (e.g., through user style sheets when rendering is done via CSS).
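
The override relationship described above can be pictured as a layered lookup in which user preferences take precedence over style sheets, which in turn take precedence over user agent defaults. The Python sketch below is a deliberate simplification under that assumed ordering; it is not the CSS cascade, and the function name and property values are invented for illustration.

```python
from collections import ChainMap


def effective_rendering(ua_default, style_sheets, user_prefs):
    """Resolve each property from the highest-priority layer that sets it.

    Assumed priority for this sketch only: user preferences first,
    then style sheets, then user agent defaults.
    """
    # ChainMap looks up keys in its maps from left to right.
    return dict(ChainMap(user_prefs, style_sheets, ua_default))


ua_default = {"font-size": "medium", "color": "black"}
style_sheets = {"color": "gray", "font-size": "small"}
user_prefs = {"font-size": "x-large"}

print(effective_rendering(ua_default, style_sheets, user_prefs))
# Passing empty mappings models turning style sheets off.
print(effective_rendering(ua_default, {}, {}))
```
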

2.2 Content

Definition of content
The definition of "content" is always a challenging issue (and it is used slightly differently among the various WAI Guidelines). The XHTML 2.0 spec says that the "title" element is not part of content (presumably because it's in the head of the document), whereas UAAG 1.0 considers this to be content since users must have access to it. The UAAG 1.0 approach is to call "content" what is in the DOM. It would be helpful if the HTML WG defined content clearly (e.g., in terms of the Infoset).
Conditional content
Please explicitly identify the elements and attributes that in UAAG 1.0 are referred to as "conditional content"; cf. checkpoint 2.3. Conditional content must always be available to the user through a viewport in some configuration (but not in every viewport, and not in every configuration). I believe the list is:
noscript*, content of object element, content of any element with src attribute
table/caption, table/summary, */title, */abbr, object/standby

*'noscript' may not survive.

Additional notes:

Inform the user of conditional content
  1. Inform the user that conditional content (e.g., summary) is available but has not been rendered.
  2. Allow the user to query elements that have associated conditional content (cf. UAAG 1.0 checkpoint 2.3).
  3. In the past, user agents have presented different pieces of conditional content in different ways (e.g., alt v. title). Authors are unsure of how their content will be presented (e.g., in tooltips). Can we do better in XHTML 2.0?
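
To make the "query" idea in item 2 concrete, here is a hedged Python sketch that reports the conditional content associated with a single element. It covers only part of the list above, treats title, abbr, summary, and standby as attributes (an assumption), and uses a hypothetical function name; how a user agent would present the result is left open.

```python
import xml.etree.ElementTree as ET


def conditional_content(element):
    """Return {source: text} for conditional content attached to element."""
    found = {}
    # */title and */abbr: assumed here to be attributes on any element.
    for name in ("title", "abbr"):
        if name in element.attrib:
            found[name] = element.attrib[name]
    if element.tag == "table":
        if "summary" in element.attrib:
            found["summary"] = element.attrib["summary"]
        caption = element.find("caption")
        if caption is not None:
            found["caption"] = "".join(caption.itertext()).strip()
    if element.tag == "object":
        if "standby" in element.attrib:
            found["standby"] = element.attrib["standby"]
        fallback = "".join(element.itertext()).strip()
        if fallback:
            found["content"] = fallback
    return found


table = ET.fromstring(
    '<table summary="Votes per checkpoint" title="Results">'
    "<caption>Poll</caption><tr><td>1</td></tr></table>"
)
print(conditional_content(table))
```

A user agent could use a non-empty result to signal that unrendered conditional content exists, per item 1.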
Identification of important content
  1. Users benefit from being able to navigate directly to "important content" on a page. Today, there is no way for the author to indicate what the author considers important; user agents rely on generic markup (e.g., headings) to guess the author's intent. It would be useful if the author could annotate content in a way that would allow user agents to:
    1. Present important content differently, depending on user configuration.
    2. Allow the user to navigate directly to important content.
    3. Allow the user to hide (i.e., not render) unimportant content (e.g., useful for some users with cognitive disabilities).
  2. Relation of "important content" to navigation?
  3. Need to define carefully what we mean by "important content".
Links
  1. An important part of navigation for some users with disabilities is the ability to jump from link to link, with the presumption that link text is short. Since every element can be a link in XHTML 2.0, the spec should say something about the disadvantages of, say, having an entire paragraph be a link.
  2. Nested links may lead to accessibility problems if there are not rendering requirements to ensure that the user can distinguish them in various modalities (especially when rendered as speech). Even today, adjacent links that are not separated by an obvious barrier may be perceived as only one link by users. Nested anchors don't seem to cause any issues. Please explain in the spec use cases of nested links, the benefits to the author of using nested links, and requirements that the user agent render distinct links in a manner that the user can perceive. Steven pointed out that image maps already allow nested links. While true, this can be confusing if the regions are not clearly distinguishable.
  3. Allow users to query links for information such as:
    1. Is there conditional content associated with this link?
    2. Is link internal or external to current resource?
    3. Was link recently visited?
    4. Are there hints about linked resource?
  4. Section 5.5 Attribute types: About "rel=redirect". Don't reintroduce this problem; have people configure servers properly. I don't think it's widely accepted that the "redirect" instruction is for SERVERS not user agents. Don't make user agents do the redirect; that may confuse users. Have the redirect done on the server side. Or, if it is part of the spec, include a requirement that the user agent offer a configuration to make it manual rather than automatic.
  5. Please show an example of including an image with both a short description and an out-of-band long description. Show that the object element is itself a link by including the href attribute. For both visual and auditory user agents, since there may be multiple links available, need to ensure that the user agent conveys to the user that multiple links are available and that multiple renderings go together.
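
For the internal/external query in item 3 above, one possible test is whether the link target, once resolved against the current URI and stripped of its fragment identifier, equals the URI of the current resource. The Python sketch below assumes that definition of "internal", which these comments do not themselves give; the function name and the example.org URIs are illustrative only.

```python
from urllib.parse import urldefrag, urljoin


def is_internal_link(base_uri, href):
    """True if href resolves to the same resource as base_uri.

    Fragment identifiers are ignored on both sides, so "#sec2" and
    "doc.html#top" both count as internal to ".../doc.html".
    """
    target, _fragment = urldefrag(urljoin(base_uri, href))
    current, _ = urldefrag(base_uri)
    return target == current


BASE = "http://example.org/doc.html"
for href in ("#sec2", "doc.html#top", "other.html", "http://www.w3.org/"):
    print(href, is_internal_link(BASE, href))
```
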

2.3 Navigation

Definition of focus
  1. Adopt the UAAG 1.0 focus and selection definitions. (See related UAAG 1.0 requirements about selection and focus.)
  2. Define navigation in terms of moving the content focus among enabled elements.
    1. When a user agent first renders XHTML content in a viewport, no element yet has content focus.
    2. If the user chooses to set the content focus (e.g., by moving it forward or in reverse), the user agent SHOULD assign content focus to the next enabled element (i.e., one capable of taking focus) in the viewport after the beginning of the viewport (or before the beginning of the viewport for reverse navigation).
    3. Please see related UAAG 1.0 checkpoints 5.1, 5.2, 5.4, 7.1, 9.3, 9.4, 9.7, and others. Checkpoint 5.4 in particular addresses viewport behavior when the user changes content focus.
  3. Distinguish moving the focus from activating the default behavior of an element with focus.
  4. The "href" attribute is not well-defined since one does not "actuate a URI". Instead, for example, say something like:
    1. The behavior of an element with href set depends on the URI scheme of the URI value.
    2. That behavior is what is triggered when the user interacts with that element (e.g., after it has received focus, or when it has been designated by a pointing device).
  5. Need to clarify the relation between focus as defined in UAAG 1.0 and tabindex. Most likely, elements with tabindex would simply be defined as "interactive" elements ("interactive element" is defined in UAAG 1.0).
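
The focus model in items 2 and 3 can be sketched as a small state machine: no element has focus at first, moving focus skips elements that are not enabled, and activation is a separate step. In the Python sketch below, the class name, the (name, enabled) pairs, and the viewport_start index are all hypothetical simplifications; a real viewport is not a single index into a list.

```python
class FocusNavigator:
    """Toy model of content focus moving among enabled elements."""

    def __init__(self, elements, viewport_start=0):
        # elements: (name, enabled) pairs in document order.
        self.elements = elements
        # Index of the first element rendered in the viewport (assumed).
        self.viewport_start = viewport_start
        self.focus = None  # None means no element has content focus yet

    def _move(self, step):
        if self.focus is not None:
            index = self.focus + step
        elif step > 0:
            index = self.viewport_start      # first candidate in viewport
        else:
            index = self.viewport_start - 1  # first candidate before it
        while 0 <= index < len(self.elements):
            name, enabled = self.elements[index]
            if enabled:
                self.focus = index
                return name
            index += step
        return None  # nothing further to focus; focus is unchanged

    def forward(self):
        return self._move(+1)

    def reverse(self):
        return self._move(-1)

    def activate(self):
        """Activation is distinct from moving focus; no focus, no action."""
        if self.focus is None:
            return None
        return "activated " + self.elements[self.focus][0]


nav = FocusNavigator([("p", False), ("a1", True), ("img", False), ("a2", True)])
print(nav.activate(), nav.forward(), nav.forward(), nav.activate())
```
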
Definition of point of regard and viewport
Navigation bars
  1. Move the point of regard (e.g., top of viewport, but also for speech synthesizers) to the first element after a navigation bar. We (and section 508) are asking user agents to implement what authors have done in the past, namely putting an anchor on the element that follows a navigation bar, and allowing the user to jump to that anchor. Screen readers have also implemented this functionality in the past.
  2. User should be able to sequentially move focus to first link of each navigation bar (i.e., jump from one navigation bar to the next).
Navigation lists (nl)
  1. Can this be accomplished with xforms?
  2. Is it decided how nl will control XFrames?
  3. The spec reads "The behavior of navigation lists in nonvisual user agents is unspecified." Instead, I suggest defining nl in terms of interaction requirements, not presentation requirements. I suspect that the result will be that the interaction requirements apply for any output modality. For example:
    1. For each NL, the user agent MUST make available to the user the contents of the label element.
    2. The user agent MUST allow the user to navigate (move the focus) to the label. When the label takes focus, the user agent MUST make available to the user the contents of each li element in the list.
    3. The user agent MUST allow the user to navigate to each rendered li element in an nl.
  4. Define the visual presentation separately:
    1. Default behavior for a graphical user agent: When the user moves focus to the label element of an nl element, the user agent MUST display the contents of the li elements. The label and li elements MUST be displayed as long as the user moves focus within the nl element. When the user moves focus away from the nl element, the user agent MUST display the label.

    Of course, rendering may be changed by style sheets, but that fact should be addressed elsewhere, and the relationship between the default behavior MUSTs and proper interpretation of style sheets needs to be covered so there is no conflict. Navigation lists should have user-specifiable (style-based) static rendering to accommodate people with motor disabilities.
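
The interaction requirements suggested above for nl can be expressed without reference to any output modality. The following Python sketch is one reading of them, not text from the XHTML 2.0 draft: the class and method names are hypothetical, and "available" stands for whatever the user agent displays, speaks, or otherwise presents.

```python
class NavigationList:
    """Toy model of the suggested nl interaction requirements."""

    def __init__(self, label, items):
        self.label = label
        self.items = list(items)
        # None: focus is outside the nl; "label": on the label;
        # an integer: on the li at that position.
        self.focus = None

    def focus_label(self):
        self.focus = "label"

    def focus_item(self, index):
        if not 0 <= index < len(self.items):
            raise IndexError("no such li in this nl")
        self.focus = index

    def blur(self):
        """Focus moves away from the nl element."""
        self.focus = None

    def available(self):
        """Content made available to the user in the current state."""
        if self.focus is None:
            return [self.label]            # only the label
        return [self.label] + self.items   # label plus every li


nl = NavigationList("Contents", ["Introduction", "Comments"])
print(nl.available())
nl.focus_label()
print(nl.available())
nl.blur()
print(nl.available())
```
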

Specific navigation requirements
  1. Navigate to "important" parts of content in current viewport. Having navigated to important content, dig down from there.
  2. Allow navigation among links internal to this resource (i.e., distinguish internal from external navigation).
Access keys
  1. Does this model really work? Need WAI input.
  2. Need to communicate to the user which author-supplied shortcuts are available.
  3. Is there a way to improve cross-platform interoperability of prefix key for accesskeys, or is solution to require configurability? See related UAAG 1.0 requirements.
  4. Allow users to override author-supplied bindings if they need to (e.g., for users with physical disabilities).
Navigation order
  1. Does this model really work? Need WAI input.
  2. Should there be the possibility of additional scoping (e.g., navigation order within a table)?
  3. 6.4: Indicate whether rules for calculating nav order are required or recommended processing rules.
Focus and plug-ins
  1. How to ensure that a user agent hands off focus to a plug-in.
Image maps
  1. Please include a statement from WCAG that authors should use client-side image maps except when the map semantics depend on pixels.
  2. For the question of whether UAs should provide feedback on regions, see UAAG 1.0 checkpoint 10.2, provision 4, for advice on doing this in a way that respects the granularity of the image format.
  3. Indicate in example what happens when image map not rendered.

3. Miscellaneous comments

Comments here are listed by section number in the XHTML 2.0 spec.

General comments:

  1. Please don't organize the spec in terms of block / inline, which is a visual rendering view of the elements. Instead, group by element semantics. If the semantics are presentation-based, that's a problem.
  2. Choose one between "default value is unspecified" and "default value is user agent-dependent".
  3. Often several ways to accomplish the same goal; is that desirable or will it lead to confusion? Example of MAP content with AREA or not. Get rid of AREA element?
  4. UAAG 1.0 requires that all UA functionality be possible through the keyboard. How to include this and other general user agent requirements (i.e., not strictly tied to the format) in XHTML 2.0?

4. New elements?

  1. For bibliographies?
  2. For glossaries?