DRAFT UAWG comments about 6 May 2003 XHTML 2.0

Status of this document

These comments on the 6 May 2003 version of XHTML 2.0 were prepared for discussion at the 15 May 2003 UAWG teleconference. Ian has discussed many of these comments with Steven Pemberton, Chair of the HTML WG. The expectation is that the UAWG, in conjunction with the PF WG and possibly the QA WG, will use these comments as the basis for discussions with the HTML WG.

These comments incorporate earlier comments:

Last modified: $Id: xhtml2-comments.html,v 1.1 2003/05/14 20:46:59 ijacobs Exp $

How these comments are organized

Comments related to user agent conformance
Comments related to accessibility themes
Miscellaneous comments
New elements?

This list does not include editorial comments, although there are some requests for clarification.

1. Comments related to user agent conformance

Identify user agent processing, behavior, rendering
Conformance to WAI Guidelines
Relation between hint attributes and protocol headers

1.1 Identify user agent processing, behavior, rendering

It will make the specification easier to read and implement if requirements and recommendations for user agent processing, behavior, and rendering are clearly identified. This should make the specification easier for implementers to use, and should clear up some ambiguities (not listed here). It will also allow the UAWG to help the HTML WG include appropriate references to UAAG 1.0 checkpoints.

The proposal is that each element and attribute definition (mostly element) uniformly identify the following information:

Clearly identify required user agent processing.
Clearly identify required user agent behavior that is perceivable by the user.
Clearly identify default user agent behavior that is perceivable by the user.
Clearly identify required user agent rendering.
Clearly identify default user agent rendering.
Clearly identify recommendations for processing/behavior/rendering (and clearly distinguish them from other requirements).
Clearly identify requirements for configurability, what behavior must be available in some configuration (related to 2), and what may be accomplished through configuration (whether through the user interface or via a configuration file).

Thus, an element definition would include and distinguish (in addition to attribute definitions):

The type of information identified by the element
Processing
Behavior that the user perceives
Configuration
Rendering

Furthermore, it is likely that some classification of user agents will be required, as rendering and behavior may vary according to input and output modality. Output modality is probably more relevant. Suggested classification:

Visual output
Audio output
Paged mode output (if footer element adopted)
Any output

Example for lists (section 11.3)

Here is a quick example of how the spec should distinguish processing/behavior/rendering:

Please specify required processing for ol:

Generate an integer counter for each li that is available for rendering before the li content.

Please specify default processing for ol:

For each ol, the counter value is initially one.

Please specify default rendering for visual user agents (ol):

Render counters in the appropriate glyphs according to specification (including style sheets, MIME headers, xml:lang, and user preferences).

Please specify default rendering for visual user agents (ul):

Render a bullet glyph before the content of the li element according to specification (including style sheets, text direction, and user preferences).
Thus, the shape of the bullet may vary according to available glyphs, user agent defaults, and style sheets. The position of the glyph depends on the writing direction.

This example doesn't include required or default behavior. A good example of where to include default behavior: "what happens when I click on a link established by an a element?"
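The split between processing and rendering requested for ol can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not text from XHTML 2.0: the function names `number_list_items` and `render_visual`, the `start` parameter, and the decimal-plus-period format are all hypothetical.

```python
from typing import List, Tuple


def number_list_items(items: List[str], start: int = 1) -> List[Tuple[int, str]]:
    """Processing step: pair each li's content with an integer counter.

    The counter starts at `start` (one by default, matching the proposed
    default processing) and is available before the li content is rendered.
    """
    return [(start + offset, content) for offset, content in enumerate(items)]


def render_visual(numbered: List[Tuple[int, str]]) -> List[str]:
    """Rendering step for a visual user agent (hypothetical default).

    Decimal glyphs followed by a period. A style sheet or user preference
    could replace this step without touching the processing step.
    """
    return [f"{counter}. {content}" for counter, content in numbered]
```

Keeping the two steps separate is the point of the sketch: a speech user agent could reuse `number_list_items` unchanged and supply its own rendering.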

1.2 Conformance to WAI Guidelines

Include references to WCAG/ATAG/UAAG/XAG
Include links to relevant checkpoints in context. Once requirements and recommendations are clearly identified (and, e.g., in bulleted lists), then finding the appropriate reference to checkpoints will be easy.

Steven suggested that it was more likely that accessibility requirements would be part of a separate appendix than in context.

1.3 Relation between hint attributes and protocol headers

Steven Pemberton clarified for me that the HTML WG's expectation is to remove some hint attributes, and make others part of client accept headers (e.g., "type"). Steven cited the use case of serving two URIs, one ending in a.html and the other in a.css. The author may only wish for the CSS to be "acceptable" in a particular context.

2. Comments related to accessibility themes

Rendering and style sheets
Content
Navigation

2.1 Rendering and style sheets

Descriptions of rendering requirements

Define in CSS terms (even if CSS implementation not required). Steven said the HTML WG was planning to do this.

User control of rendering

Allow the user to override the default user agent rendering, and to change configurations per the rendering requirements of UAAG 1.0.
User agents should allow users to create simplified views of content. For example, there are Mozilla extensions to generate an outline from headings, Amaya has an outline view, etc.

XHTML 2.0 does not remove all presentation elements

Steven and I discussed the reduction of presentation elements in XHTML 2.0. I argued that the block/inline distinction should be reduced if not removed, but Steven pointed out that some authors consider 'quote' and 'blockquote' to be semantically different. He also pointed out that 'pre' could be interpreted as affecting whitespace input, not just output (e.g., for a programming language where whitespace is significant). I argued that an element like 'em' was not inherently block or inline; I might want to emphasize 2 paragraphs. Could some content models be expanded? Why have 'div' and 'span'? I don't expect to pursue this further at this point. I have asked that the HTML WG give a better explanation of why the content model exists. Also, for each inline element, if there's a corresponding block element, can the two be merged? If there isn't a corresponding block element, why not?

Requirements for style sheets

User must be able to apply style sheets to conditional content (see below).
User must be able to choose from alternative style sheets. (Steven agreed.)
User must be able to turn off author and user style sheets (leaving user agent default style sheet). (Steven agreed.)

Finally, it is important for accessibility that the user be able to control rendering. It is, of course, also important that specifications be followed. The rendering of an XHTML 2.0 document depends on a combination of (at least):

User agent default rendering requirements.
Discretionary user agent rendering
Style sheets
User preferences (expressed via style sheets or in other ways).

I think that there needs to be a statement in the XHTML 2.0 spec that the user needs to be able to override user agent default rendering:

Where required by UAAG 1.0 (e.g., control of text size, text color, etc.), and
According to the appropriate specification (e.g., through user style sheets when rendering is done via CSS).
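One way to picture the combination of rendering sources and the user's ability to override them is a simple layered lookup. The sketch below is deliberately simplified and is not the CSS cascade: the function name `resolve_property`, the layer order (user agent default, then author style sheet, then user preference), and the two on/off switches are assumptions made for illustration.

```python
from typing import Dict, Optional

Styles = Dict[str, str]


def resolve_property(
    name: str,
    ua_default: Styles,
    author: Styles,
    user: Styles,
    author_enabled: bool = True,
    user_enabled: bool = True,
) -> Optional[str]:
    """Return the value for `name`, letting later layers override earlier ones.

    Assumed order (lowest to highest): user agent default, author style
    sheet, user preference. Disabling a layer removes it from consideration,
    which models turning off author and user style sheets.
    """
    layers = [ua_default]
    if author_enabled:
        layers.append(author)
    if user_enabled:
        layers.append(user)
    value = None
    for layer in layers:
        if name in layer:
            value = layer[name]
    return value
```

With both switches off, only the user agent default remains, which corresponds to the requirement that the user can fall back to the user agent default style sheet.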

2.2 Content

Definition of content

The definition of "content" is always a challenging issue (and it is used slightly differently among the various WAI Guidelines). The XHTML 2.0 spec says that the "title" element is not part of content (presumably because it's in the head of the document), whereas UAAG 1.0 considers this to be content since users must have access to it. The UAAG 1.0 approach is to call "content" what is in the DOM. It would be helpful if the HTML WG defined content clearly (e.g., in terms of the Infoset).

Conditional content

Please explicitly identify the elements and attributes that in UAAG 1.0 are referred to as "conditional content"; cf checkpoint 2.3. Conditional content must always be available to the user through a viewport in some configuration (but not in every viewport, and not in every configuration). I believe the list is:

Elements: noscript*, content of object element, content of any element with src attribute
Attributes: table/caption, table/summary, */title, */abbr, object/standby

*'noscript' may not survive.

Additional notes:

The title element is required by spec to be available to the user. In the definition of "title", include in the list of example renderings: "(in the title bar of a window, as a caption, spoken)".
The following elements and attributes are not defined in XHTML 2.0: alt, longdesc, noframes
XHTML 2.0 has a uniform mechanism for including alternative content: the src attribute. What is missing from this approach is the classification of alternative content. Classes that appear in UAAG 1.0 are: summary, title, alternative, description, and expansion (e.g., for abbr/title). We discussed a possible "role" attribute that might be used in this way.

Inform the user of conditional content

Inform the user that conditional content (e.g., summary) is available but has not been rendered.
Allow the user to query elements that have associated conditional content (cf. UAAG 1.0 checkpoint 2.3).
In the past, user agents have presented different pieces of conditional content in different ways (e.g., alt v. title). Authors are unsure of how their content will be presented (e.g., in tooltips). Can we do better in XHTML 2.0?
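As a rough illustration of what "querying elements for conditional content" could mean, the Python sketch below walks a document and reports candidate conditional content. It is an assumption-laden simplification: the function name `find_conditional_content` is hypothetical, namespaces are ignored, the attribute check covers only title, abbr, summary, and standby on any element, and "content" means the text inside noscript, object, or any element with a src attribute.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Attribute names taken from the list above; applied to any element here,
# which is looser than the per-element list in the comments.
CONDITIONAL_ATTRIBUTES = ("title", "abbr", "summary", "standby")


def find_conditional_content(markup: str) -> List[Tuple[str, str, str]]:
    """Return (element name, source, text) triples for conditional content.

    `source` is "@attr" for attribute values, or "content" for the fallback
    content of noscript, object, and any element carrying a src attribute.
    """
    found = []
    for element in ET.fromstring(markup).iter():
        for attr in CONDITIONAL_ATTRIBUTES:
            if attr in element.attrib:
                found.append((element.tag, "@" + attr, element.attrib[attr]))
        is_fallback_holder = (
            element.tag in ("noscript", "object") or "src" in element.attrib
        )
        text = "".join(element.itertext()).strip()
        if is_fallback_holder and text:
            found.append((element.tag, "content", text))
    return found
```

A user agent could use such a list both to signal that unrendered conditional content exists and to answer a user's query about a specific element.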

Identification of important content

Users benefit from being able to navigate directly to "important content" on a page. Today, there is no way for the author to indicate what the author considers important; user agents rely on generic markup (e.g., headings) to guess the author's intent. It would be useful if the author could annotate content in a way that would allow user agents to:
1. Present important content differently, according to user configuration.
2. Allow the user to navigate directly to important content.
3. Allow the user to hide (i.e., not render) unimportant content (e.g., useful for some users with cognitive disabilities).
Relation of "important content" to navigation?
Need to define carefully what we mean by "important content".

Links

An important part of navigation for some users with disabilities is the ability to jump from link to link, with the presumption that link text is short. Since every element can be a link in XHTML 2.0, the spec should say something about the disadvantages of, say, having an entire paragraph be a link.
Nested links may lead to accessibility problems if there are not rendering requirements to ensure that the user can distinguish them in various modalities (especially when rendered as speech). Even today, adjacent links that are not separated by an obvious barrier may be perceived as only one link by users. Nested anchors don't seem to cause any issues. Please explain in the spec use cases of nested links, the benefits to the author of using nested links, and requirements that the user agent render distinct links in a manner that the user can perceive. Steven pointed out that image maps already allow nested links. While true, this can be confusing if the regions are not clearly distinguishable.
Allow users to query links for information such as:
1. Is there conditional content associated with this link?
2. Is link internal or external to current resource?
3. Was link recently visited?
4. Are there hints about linked resource?
Section 5.5 Attribute types: About "rel=redirect". Don't reintroduce this problem; have people configure servers properly. I don't think it's widely accepted that the "redirect" instruction is for SERVERS not user agents. Don't make user agents do the redirect; that may confuse users. Have the redirect done on the server side. Or, if it is part of the spec, include a requirement that the user agent offer a configuration to make it manual rather than automatic.
Please show an example of including an image with both a short description and an out-of-band long description. Show that the object element is itself a link by including the href attribute. For both visual and auditory user agents, since there may be multiple links available, need to ensure that the user agent conveys to the user that multiple links are available and that multiple renderings go together.
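The "internal or external" link query listed above can be approximated by resolving the link against the current document's URI and comparing the results without their fragment identifiers. This is a minimal sketch using Python's standard `urllib.parse`; the function name `is_internal_link` and the example URIs are hypothetical.

```python
from urllib.parse import urldefrag, urljoin


def is_internal_link(document_uri: str, href: str) -> bool:
    """True when `href`, resolved against `document_uri`, identifies the
    same resource (ignoring any fragment identifier)."""
    target, _fragment = urldefrag(urljoin(document_uri, href))
    current, _ = urldefrag(document_uri)
    return target == current
```

For instance, `is_internal_link("http://example.org/doc.html", "#sec2")` is true, while the same call with `"other.html"` is false.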

2.3 Navigation

Definition of focus

Adopt the UAAG 1.0 focus and selection definitions. (See related UAAG 1.0 requirements about selection and focus).
Define navigation in terms of moving the content focus among enabled elements.
1. When a user agent first renders XHTML content in a viewport, no element yet has content focus.
2. If the user chooses to set the content focus (e.g., by moving it forward or in reverse), the user agent SHOULD assign content focus to the next enabled element (i.e., one capable of taking focus) in the viewport after the beginning of the viewport (or before the beginning of the viewport for reverse navigation).
3. Please see related UAAG 1.0 checkpoints 5.1, 5.2, 5.4, 7.1, 9.3, 9.4, 9.7, and others. Checkpoint 5.4 in particular addresses viewport behavior when the user changes content focus.
Distinguish moving the focus from activating the default behavior of an element with focus.
The "href" attribute is not well-defined since one does not "actuate a URI". Instead, for example, say something like:
1. The behavior of an element with href set depends on the URI scheme of the URI value.
2. That behavior is what is triggered when the user interacts with that element (e.g., after it has received focus, or when it has been designated by a pointing device)
Need to clarify the relation between focus as defined in UAAG 1.0 and tabindex. Most likely, elements with tabindex are simply defined as "interactive" elements ("interactive element" is defined in UAAG 1.0).
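The focus model described above (no content focus on first render, then movement among enabled elements, with movement kept separate from activation) can be sketched as a small state machine. The class name `FocusNavigator` is hypothetical, and two simplifications are assumptions of this sketch rather than anything in UAAG 1.0: the viewport is treated as starting at the first enabled element, and navigation wraps around at either end.

```python
from typing import List, Optional


class FocusNavigator:
    """Tracks content focus among the enabled elements of one viewport."""

    def __init__(self, enabled_elements: List[str]) -> None:
        self.enabled = list(enabled_elements)
        # On first render, no element has content focus.
        self.index: Optional[int] = None

    @property
    def focused(self) -> Optional[str]:
        return None if self.index is None else self.enabled[self.index]

    def move(self, forward: bool = True) -> Optional[str]:
        """Move content focus by one element; never activates the element."""
        if not self.enabled:
            return None
        if self.index is None:
            # First move: first element going forward, last going in reverse
            # (the reverse case relies on the wrap-around assumption).
            self.index = 0 if forward else len(self.enabled) - 1
        else:
            step = 1 if forward else -1
            self.index = (self.index + step) % len(self.enabled)
        return self.focused
```

Activation would be a separate operation on whatever `focused` returns, which keeps the distinction between moving focus and triggering default behavior explicit.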

Definition of point of regard and viewport

Review UAAG 1.0 definitions of viewport and point of regard.
Discuss relation between point of regard and focus (cf. UAAG 1.0 about keeping focus in viewport after moving viewport).

Navigation bars

Move the point of regard (e.g., top of viewport, but also for speech synthesizers) to the first element after a navigation bar. We (and section 508) are asking user agents to implement what authors have done in the past, namely putting an anchor on the element that follows a navigation bar, and allowing the user to jump to that anchor. Screen readers have also implemented this functionality in the past.
User should be able to sequentially move focus to first link of each navigation bar (i.e., jump from one navigation bar to the next).

Navigation lists (nl)

Can this be accomplished with XForms?
Is it decided how nl will control XFrames?
The spec reads "The behavior of navigation lists in nonvisual user agents is unspecified." Instead, I suggest defining nl in terms of interaction requirements, not presentation requirements. I suspect that the result will be that the interaction requirements apply for any output modality. For example:
1. For each NL, the user agent MUST make available to the user the contents of the label element.
2. The user agent MUST allow the user to navigate (move the focus) to the label. When the label takes focus, the user agent MUST make available to the user the contents of each li element in the list.
3. The user agent MUST allow the user to navigate to each rendered li element in an nl.
Define the visual presentation separately:
1. Default behavior for a graphical user agent: When the user moves focus to the label element of an nl element, the user agent MUST display the contents of the li elements. The label and li elements MUST be displayed as long as the user moves focus within the nl element. When the user moves focus away from the nl element, the user agent MUST display the label.
Of course, rendering may be changed by style sheets, but that fact should be addressed elsewhere, and the relationship between the default behavior MUSTs and proper interpretation of style sheets needs to be covered so there is no conflict. Navigation lists should have user-specifiable (style-based) static rendering to accommodate people with motor disabilities.
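The interaction requirements suggested for nl can be expressed without reference to any output modality. The sketch below models only the state that matters (whether focus is inside the nl) and what must be available to the user in each state. The class and method names are hypothetical, and the sketch ignores style sheets and static rendering options.

```python
from typing import List


class NavigationList:
    """Interaction state for one nl element: a label plus its li items."""

    def __init__(self, label: str, items: List[str]) -> None:
        self.label = label
        self.items = list(items)
        self.focus_inside = False

    def on_focus_change(self, inside_nl: bool) -> None:
        """Record whether content focus is on the label or any li of this nl."""
        self.focus_inside = inside_nl

    def available_content(self) -> List[str]:
        """Content the user agent makes available in the current state,
        independent of output modality."""
        if self.focus_inside:
            return [self.label] + self.items
        return [self.label]
```

A graphical user agent would display what `available_content` returns, while a speech user agent would speak it; the interaction rule itself stays the same.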

Specific navigation requirements

Navigate to "important" parts of content in current viewport. Having navigated to important content, dig down from there.
Allow navigation among links internal to this resource (i.e., distinguish internal from external navigation).

Accesskey

Does this model really work? Need WAI input.
Need to communicate to the user which author-supplied shortcuts are available.
Is there a way to improve cross-platform interoperability of the prefix key for accesskeys, or is the solution to require configurability? See related UAAG 1.0 requirements.
Allow users to override author-supplied bindings if they need to (e.g., for users with physical disabilities).

Navindex

Does this model really work? Need WAI input.
Should there be the possibility of additional scoping (e.g., navigation order within a table)?
6.4: Indicate whether rules for calculating nav order are required or recommended processing rules.

Focus and plug-ins

How can the spec ensure that a user agent hands off focus to a plug-in?

Image maps

Please include a statement from WCAG that authors should use client-side image maps except when the map semantics depend on pixels.
For the question of whether UAs should provide feedback on regions: See UAAG 1.0 checkpoint 10.2, provision 4 for advice on doing this in a way that respects the granularity of the image format.
Indicate in an example what happens when the image map is not rendered.

3. Miscellaneous comments

Comments here are listed by section number in the XHTML 2.0 spec.

6.2, 6.3: Why aren't I18N attribute collection and Bidi text collection both part of one I18N collection?
7: About a "security" element: This seems to be a perfect opportunity to use generic XML mechanisms defined by W3C Recommendations.
8.2: The blockcode element. This seems to be going in the direction of including more style in HTML. This is a convenience element. Rather than include it, just expand the content model of the code element. If you keep it, please also state more clearly that whitespace is handled differently between blockquote and blockcode.
8.5: Instead of "The visual presentation of headers can render more important headings in larger fonts than less important ones." suggested: "Default rendering behavior for graphical user agents: Render H1 using a larger text size than H2, H2 using a larger text size than H3, etc."
8.6: The hr element (and br): What is the semantic purpose of a separator element? Are these strictly about presentation? Why not do this through style (e.g., borders, :before/:after)?
9.7: The l element. Please explain its relation to the P element. Must a line always appear as part of a paragraph? I realize that there are situations where one is not using "paragraphs" in a traditional sense (e.g., when writing computer code) and "line" is a more natural starting point. But if you can use L instead of a one-line P, there should be some explanation why to do so or not do so.
1. Don't define "l" in terms of visual rendering. The concept of "line" may be meaningful in a braille context as well.
2. A "line" may include other content than text characters; see CSS box model for more information.
3. Does this element make sense in an I18N context?
9.12, 9.13: Combine these sections. Don't say "should be regarded as"; say "is". [See general comment above about stating clearly what the information type identified by an element is.] State rendering requirements for visual user agents clearly (in terms of CSS 'vertical-align').
10.1: Under the a element, in the example, don't say "will retrieve" since that, I believe, is not required behavior. Perhaps rephrase in terms of default behavior.
10.1 end: Don't put required user agent behavior in an informative Note. List as part of required user agent behavior for all user agents. Also, unclear what "find anchors" means. Please define (e.g., user agent must include these anchors in focus navigation order, user agents must treat these anchors as they would others when handling URI references with fragment ids, etc.).
11: The spec says that authors must not use lists just for formatting purposes. I suggest this be deleted (in particular since it suggests that authors can rely on visual indentation even though this rendering behavior is not specified as default rendering behavior). Instead, in separate discussion on using style sheets for presentation, talk about the fact that authors should not use ANY markup just to achieve a formatting effect, and to use style sheets instead.
11.3: The sentence "User agents may present those numbers in a variety of ways" can be deleted since, as long as there is no default or other presentation requirements, the user agent can do what it wants for ANY element.
12.1: In the definition of the media attribute: "Unlike a, it may only appear". Change to "Only appears in the head" since that's per the DTD/Schema.
13.1: Not sure that it's true that for the common collection of attributes, the interpretation is profile-dependent.
14: This section seems to have a lot of redundant text, and it's a section that needs to clearly define required processing.
14: For the sentence "Authors should not include content in object elements that appear in the head element." what about saying instead: "Default rendering: The user agent MUST NOT render the content of an OBJECT element in the document head."
14.1: The "data" attribute says "If you use a relative URI...". I.e., it slips into "you" rather than "the author should/must".
14.1.3: For the example: "When the user agent encounters objectdataA AND IS ABLE TO PROCESS THAT OBJECT ELEMENT", please define what is meant by "is able to process."
14.2: "The user agent or the external application can utilize the param element name/value pairs to pass unique datapoints to trigger specific functions or actions." This seems wrong. It probably is required behavior that they do (or at least try to). So much stronger than "can utilize."
14.2: "If the URI is relative, it may be based from the referring document location or from the xml:base attribute location." Who chooses?
16.2: The user agent must allow the user to turn off scripts; cf UAAG 1.0 checkpoint 3.4.
16.2.1: the script element description should distinguish processing requirements from rendering requirements as described above. Define processing success and failure (including user configuration, availability of external script, etc.), and describe rendering in terms of success and failure.
17: "The Style Attribute Module defines the style attribute. When this module is selected, it activates the Style Collection." What does "activate" mean here?
17: Instead of speaking out so strongly against inline style, describe the costs and benefits of the three approaches to specifying style. As written, it seems like the spec shouldn't even have inline style.
18.1.2: "User agents must respect media descriptors when applying any style sheet." How do they respect these descriptors?
19.2.1: What does this mean: "Once the user agent has calculated the number of columns in the table, it may group them into a colgroup."
19.3: Indicate that "summary" is useful to all users, not just users with disabilities.
19.4.1: Visual rendering: Is this default rendering, required rendering for visual user agents, or recommended rendering for visual user agents? Same question for heading algorithm in 19.4.3.3: Is this default/required/recommended processing?
19.4.2: Please rewrite this section in terms of user agent required/recommended behavior (with SHOULD/MUST/MAY). Also, see UAAG 1.0 checkpoint 10.1 for table requirements.
19.4.1: Change "In order for a user agent to format a table in one pass, authors must tell the user agent:" to "When authors specify the number and width of columns, they help visual user agents format the table in one pass."
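The 14.2 comment about relative URIs ("Who chooses?") is easier to see with concrete values, since the two candidate bases give different results. The sketch below uses Python's standard `urljoin`. The function name `resolve_reference` and the example URIs are hypothetical, and the rule that xml:base, when present, takes precedence over the document location is an assumption of this sketch (it follows the XML Base model), not something the XHTML 2.0 draft text quoted above settles.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_reference(
    reference: str, document_uri: str, xml_base: Optional[str] = None
) -> str:
    """Resolve `reference` against xml:base if given, else the document URI.

    A relative xml:base is itself resolved against the document URI first.
    """
    base = urljoin(document_uri, xml_base) if xml_base else document_uri
    return urljoin(base, reference)
```

With the document at `http://example.org/docs/page.html`, the reference `img/a.png` resolves to `http://example.org/docs/img/a.png` without xml:base, but to `http://example.org/media/img/a.png` when xml:base is `../media/`. A spec that leaves the choice open would allow both outcomes.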

General comments:

Please don't organize the spec in terms of block / inline, which is a visual rendering view of the elements. Instead, group by element semantics. If the semantics are presentation-based, that's a problem.
Choose one between "default value is unspecified" and "default value is user agent-dependent".
There are often several ways to accomplish the same goal; is that desirable or will it lead to confusion? Example: MAP content with or without AREA. Get rid of the AREA element?
UAAG 1.0 requires that all UA functionality be possible through the keyboard. How to include this and other general user agent requirements (i.e., not strictly tied to the format) in XHTML 2.0?

4. New elements?

For bibliographies?
For glossaries?