From: Wendy A Chisholm <wendy@w3.org>. This was originally sent in October 2000 to the member-only archive of the PFWG as http://lists.w3.org/Archives/Member/w3c-wai-pf/2000OctDec/0082, and was made public in August 2002 by permission.

These are some review notes rather than a comprehensive review.

Subject: Using XML Accessibility Guidelines to review Speech Synthesis Markup Language

Hello, I used the XML Accessibility Guidelines (XMLAG) [1] to check the accessibility of Speech Synthesis Markup Language Specification for the Speech Interface Framework (SSML) [2]. I have several comments and questions about both documents, but I focus on XMLAG in this message.

1. There are several typos, but I won't list those here.

2. It seems that there are two levels of detail, Guidelines and Checkpoints, to use the terminology of other WAI Guidelines. As such, I will refer to the "checkpoints" in the rest of this message. E.g., I refer to the first bullet under Guideline 2 as checkpoint 2.1. Guideline 1 does not have any checkpoints, just suggestions for techniques. It would be good to number these in the document.

3. Overall, my assessment is that SSML contains a lot of presentation elements. Primarily: sayas, phoneme, voice, prosody. Unfortunately, ACSS is not rich enough (as far as I can tell) to completely cover all that is being described in SSML. I am following up with others in the VBWG to find out more about the lack of separation of content and structure from presentation. I know there has been discussion on this list and that Raman sent the URI for the paper "Proper Relation between SABLE and Aural Cascaded Style Sheets" that he and Richard Sproat wrote.

4. Checkpoint 2.1 says, "Do not define presentation elements or attributes but enable the use of style sheets." I propose this should be reworded to say, "For structural languages, do not define presentation elements..." Then add checkpoints about what is necessary for style languages such as CSS and XSL. The WCAG WG and ERT WG have discussed using standardized classes in HTML/XHTML to provide some meaning to classes. For example, there is some meaning in the presentation that gets lost because there is not a way to associate a human-readable label with classes and styles. Is that something we should advocate for in the XMLAG? For more info on that thread, refer to:
http://www.w3.org/WAI/GL/wcag20-issues.html#4
http://lists.w3.org/Archives/Public/w3c-wai-gl/2000JulSep/0288.html (this was also sent to the PF list)

5. Checkpoint 2.3 can be generalized a bit to say something like, "Define structural element types (classifying and grouping elements like headers, sections, etc.)" I propose generalizing it for "structure" because SSML has "sentence" and "paragraph" elements. While these are "grouping" elements in some sense, because they group words, or letters, or sentences, I like the emphasis on structure. It fits well with the way WCAG 2.0 is headed. I'm not too fussed about it.

6. Checkpoint 2.4 says, "Define element types that identify important text content." What is "important"? (<evil grin> I get this question all the time with WCAG stuff.) We are trying to eliminate the use of terms like "important," "applicable," etc. and phrase checkpoints so that it is easier for an author to determine when he/she has satisfied the checkpoint.

7. Some of these checkpoints seem to apply to every element and/or attribute while others seem to apply to the language as a whole (i.e. the specifics vs. the big picture). You might want to specify the scope in each checkpoint or group them in some way. For example, 3.2 "Provide schemas and mean to access it" applies to the whole language while 1.0 says, "Ensure that authors can associate a text description with any non-text content..." which would require a language developer to go through each element to determine if the equivalents/alternatives can be associated.

Thanks, --wendy

[1] http://www.w3.org/WAI/PF/xmlgl 27 July 2000 WD

[2] http://www.w3.org/TR/speech-synthesis 8 August 2000 WD