W3C

Results of Questionnaire Review of Media Accessibility Requirements

The results of this questionnaire are available to anybody.

This questionnaire was open from 2010-05-27 to 2010-06-07.

16 answers have been received.

Jump to results for question:

Audio Description
Texted Audio Description
Extended audio description
Clear audio
Content Navigation by Content Structure
Captioning
Extended Captioning
Sign Translation
Transcripts
Keyboard Access to interactive controls / menus
Granularity Level Control for Structural Navigation
Time Scale Modification
Production practice and resulting requirements
Discovery and activation/deactivation of available alternative content by the user
Requirements on making properties available to the accessibility interface
Requirements on the use of the viewport
Requirements on the parallel use of alternate content on potentially multiple devices in parallel

1. Audio Description

Refer to 2.1 Audio Description.

Audio descriptions (also known as video descriptions or simply descriptions) make visual media accessible to people who are blind or visually impaired by providing descriptive narration of key visual elements. These elements include actions, costumes, gestures, scene changes or any other important visual information that someone who cannot see the screen might ordinarily miss. Descriptions are usually timed and recorded to fit into natural pauses in the program-audio track. (See the section on Extended Descriptions for an alternative.) The descriptions are usually read by a narrator with a voice that cannot be easily confused with other voices in the primary audio track. They are written to convey objective information (e.g., a yellow flower) rather than subjective judgments (e.g., a beautiful flower).
As with captions, descriptions can be open or closed.

Open descriptions are merged with the program-audio track and cannot be turned off by the viewer.
Closed descriptions can be recorded as a separate track containing descriptions only, timed to play at specific spots in the timeline and played in parallel with the program-audio track.
Some audio descriptions can be given as a separate audio channel mixed in at the player.
Other options include a computer-generated ‘text to speech’ track (texted audio descriptions) - this is described in the next subsection.

Audio descriptions provide benefits that reach beyond blind or visually impaired viewers: e.g., students grappling with difficult materials or concepts. Descriptions can be used to give supplemental information about what is on screen—the structure of lengthy mathematical equations or the intricacies of a painting, for example.
Audio description is available on some television programs and in movie theaters in the U.S. and in some other countries; however, regulation in the U.S. and Europe is increasingly focusing on description, especially for television, reflecting its priority with citizens who have visual impairment. The technology needed to deliver and render basic audio description is in fact relatively straightforward, being an extension of the common audio-processing solutions. Playback products must support the multiple audio channels required for description, and any product dealing with broadcast TV content must provide adequate support for description.
Requirements
Systems supporting audio/video descriptions that are not open must:

(AD-1) Provide an indication that descriptions are available, and are active/non-active.
(AD-2) Render descriptions in a time synchronized manner, using the media resource as the timebase master.
(AD-3) Support multiple description tracks (e.g., discrete tracks containing different levels of detail).
(AD-4) Support recordings of real human speech as part of a media resource, or as an external file.
(AD-5) Allow the author to independently adjust the volumes of the audio-description and original soundtracks.
(AD-6) Allow the user to independently adjust the volumes of the audio-description and original soundtracks, with the user's settings overriding the author's.
(AD-7) Permit smooth changes in volume rather than stepped changes. The degree and speed of volume change should be under provider control.
(AD-8) Allow the author to provide fade and pan controls to be accurately synchronised with the original soundtrack.
(AD-9) Allow the author to use a codec which is optimised for voice only, rather than requiring the same codec as the original soundtrack.
(AD-10) Allow the user to select from among different languages of descriptions, if available, even if they are different from the language of the main soundtrack.
(AD-11) Support the simultaneous playback of both the described and non-described audio tracks so that one may be directed at separate outputs (e.g., a speaker and headphones).
(AD-12) Provide a means to prevent descriptions from carrying over from one program or channel when the user switches to a different program or channel.
(AD-13) Allow the author and the user to relocate the description track within the audio field, with the user setting overriding the author setting. The setting should be readjustable as the media plays. [what is the difference to AD-8?]
(AD-14) Support metadata, such as copyright information, usage rights, language, etc.
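
The precedence rule in AD-6 and the smooth volume change in AD-7 can be illustrated with a minimal sketch. It is not taken from the requirements document: the function names, the 0.0 to 1.0 volume scale, and the choice of a linear ramp are all assumptions made for illustration.

```python
from typing import List, Optional


def effective_volume(author_volume: float, user_volume: Optional[float]) -> float:
    """Return the user's volume when one is set, otherwise the author's (AD-6)."""
    return author_volume if user_volume is None else user_volume


def volume_ramp(start: float, end: float, steps: int) -> List[float]:
    """Linear ramp from start to end, inclusive, in `steps` equal increments (AD-7)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [start + (end - start) * i / steps for i in range(steps + 1)]


# Hypothetical settings: the author lowers the main soundtrack to 0.5 while a
# description plays; the user has asked for louder descriptions but has not
# changed the main soundtrack.
main_target = effective_volume(author_volume=0.5, user_volume=None)
description_volume = effective_volume(author_volume=0.8, user_volume=1.0)
fade = volume_ramp(1.0, main_target, steps=2)
```

A real player would apply the ramp over time in its audio pipeline; the list of intermediate values here only shows that the change is gradual rather than a single step.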

Accept this section?

Summary

Choice	Results (all responders)
Accept this section as proposed	5
Accept this section with the following changes	11
Do not accept this section for the following reasons
Abstain

Details

Responder	Audio Description	Comments
Philip Jägenstedt	Accept this section with the following changes	(AD-4) is a requirement to synchronize audio/video resources from potentially different servers. This is potentially much more difficult than synchronizing text to a media resource. The current work on <track> is not trying to solve this, so if this is important then it should be stated more clearly. (AD-5) and (AD-6) assumes that several audio tracks can play in parallel at all. This is unlikely to be true in the near future. (AD-7) seems like a quality of implementation issue, I would remove it. (AD-8) A balance attribute has been requested on the WHATWG list on several occasions but failed to gather much support. Is it important? (AD-9) Is too vague, isn't it enough that the audio codec works well for voice rather than it was specifically designed for it? If not, is it Speex we want to require? (AD-11) Is the requirement the same as (AD-3), or is it a requirement on UA to be able to select the audio output device? This seems to be something usually controlled by the OS, not the UA. (AD-12) There is no concept of "program" or "channel" in the spec, clarify or remove this requirement. (AD-13) "what is the difference to AD-8?" (AD-14) Is too vague, should metadata be exposed in the DOM? Only language seems actually relevant for meeting the other requirements.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	This content type will be very difficult or impossible to support on some resource-constrained devices because it requires the media engine to decode and play more than one audio stream simultaneously. For example, with the audio chips used in many modern cell phones it is impossible to decode and play more than one stream of digital audio data at a time. (AD-4) Any digital audio format should be able to handle "real human speech", it doesn't need to be called out. Synchronizing audio and video loaded from different files can be very difficult to do well. (AD-5) and (AD-6) These require multiple audio streams to play simultaneously. This is not possible on all devices. (AD-7) What does "The degree and speed of volume change should be under provider control" mean? Is it really crucial for this feature? (AD-9) We have not been able to agree on required audio and video codecs; will we be able to agree on a codec specialized for speech? Is it really required? (AD-11) This requires multiple audio streams to play simultaneously. See above. (AD-12) What does this mean? What are programs and channels? (AD-13) I have no idea what this means. (AD-14) What does "support metadata" mean? Digital media files can already carry metadata, is the requirement to expose it to the DOM?
aurélien levy	Accept this section with the following changes	AD-3 is sometimes impossible to achieve since you can't give more details in the same time interval, so you can only do it when you use extended audio description mechanism
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section with the following changes	AD-9 Could be rephrased to say simply "Allow the author to use different codecs for the original and audio description tracks"? Also I am not sure what some of the requirements mean: AD-7 What does it mean: "The degree and speed of volume change should be under provider control." AD-8 What does it mean: "fade and pan controls" AD-11 Is this assuming a mixed-viewing situation (like TAD-3), or not?
Denis Boudreau	Accept this section with the following changes	(AD-3) Depending on the available time interval, it may be impossible to go beyond a certain level of details without resorting to some kind of extended audio description mechanism. (AD-7) I also have a hard time understanding what "degree and speed of volume change should be under provider control" means. (AD-9) Isn't this opening the door to more cat fights as to which codecs should be supported or not? (AD-12) I agree with Philip that undefined concepts need to be either defined or avoided (AD-13) Seems to be redundant with AD-8
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section with the following changes	It seems these requirements are useful for accessibility, but they could be prioritized to make sure the most important features are considered first. Priority 1: AD-1, AD-2, AD-6, AD-10, AD-12 Priority 2: AD-3, AD-4, AD-5, AD-7 Priority 3: AD-8, AD-9, AD-11, AD-13, AD-14
Richard Schwerdtfeger	Accept this section with the following changes	(AD-3) should be a SHOULD and not a MUST General comments from the beginning of the document that do not impact this survey question but should be addressed: - In the introduction place an is before generally in this sentence: To a certain extent cognitive problems also come into play, but this generally better addressed in the production of the material itself, rather than on providing access mechanisms to the content, although some access mechanisms may come into play for this audience. - Important: Define what you mean by media. It sounds like you were only talking about audio and video, yet you then mention virtual reality? How is that a media and web content is not. On the web everything delivered has a media type. Under blindness: - Does an alternative mode mean the accessibility API as well? An interesting thing about tickers is that in ARIA we needed to know if there was one and mark it so that a screen reader could ignore it. Under low vision - this needs a lot of rework as their are way too many generalizations: I think you need to be more precise here. For example, people who require say 1.5 times magnification do not have similar issues to those who are blind. I think you mean People with extremely low vision (first sentence). Here is a suggested rewrite to see what I mean: People with severely low vision can use some visual information; although they will have similar issues as people who are blind, but depending on their visual ability might have specific issues such as difficulty discriminating foreground information from background information, or discriminating colors. Often, the elderly experience a yellowing of the cornea. This results in excessive glare caused by scattering of light in the eye when exposed to very bright content or surroundings. 
Consequently, these users may be unable to react quickly to transient information, and may have a narrow angle of view and may not pick up key information presented temporarily where they are not looking, or in text that is moving or scrolling. A person using a low vision assistive technology aid, such as a screen magnifier, will only be viewing a portion of the screen, and depend on their assistive technology to managed tracking media. Low vision users may have difficulty reading when text is too small, has contrast with background that is too low, or when outline or other fancy font types are used. They may be using an AT that adjusts all the colors of the screen, such as inverting the colors, so the media content must be viewable through the AT. - Atypical color perception You use the word colour. Did you want U.K. English or U.S. English? - Deafness <change> People who are deaf cannot access audio transmissions. Thus, an alternative representation is required, typically through synchronized captions. </change> <to> People who are deaf cannot access audio transmissions. Thus, an alternative representation is required in the form of synchronized text captions, a text transcript, or via a synchronized sign language avatar. </to> - Hard of hearing <change (editorial)> People who are hard of hearing may be able to use some audio material, but might not be able to discriminate certain types of sound, and may miss any information presented as audio only if it contains frequencies they can't hear, or is masked by background noise or distortion, They may miss audio which is too quiet, or of poor quality. Speech may be problematic if it is too fast and cannot be played back more slowly. Information presented using multichannel audio (e.g. stereo) may not be perceived by people who are deaf in one ear. </change> <to> People who are hard of hearing may be able to use some audio content, but might be unable to discriminate between certain types of sound. 
These users may miss information presented as audio if it contains frequencies they can't hear, or is masked by background noise or distortion, They may miss audio which is too quiet, or of poor quality. Speech may be problematic if it is too fast and cannot be played back more slowly. Information presented using multichannel audio (e.g. stereo) may not be perceived by people who are deaf in one ear. </to>
Laura Carlson	Accept this section with the following changes	It could be a huge time sink to agree on a codec specialized for speech. Codec has been a time drain in the full working group.
Sean Hayes	Accept this section with the following changes	AD3 should mention it is to allow for user choice for those with and without visual memory. Note that the choice could be to a separate mixed file, which may be of different length to the undescribed media. AD-4 is mostly about the fact that audio description of media is better with a real human rendering, so that a text track is not considered sufficient. Should be clear that these can be pre-mixed and not require client side mixing. AD 5 & 6 should be couched in terms of "where encoding media allows". AD7 should be reworded to indicate that precise control is required where the client is doing the mixing, so that the descriptions are not overridden by the main soundtrack, and vice-versa. AD8 should mention pan for moving the descriptions away from the main audio; again "where encoding allows". AD9 is really a technical requirement so that where descriptions are being sent as separate channels bandwidth can be optimised. I'd save this for later. It's not really about what codec is used; merely that they are not correlated. AD11 should mention that it is for mixed audiences. AD12 should be re-phrased to address the user need of not wrongly associating captions with the wrong media if they are too close to a media source changing. AD8 could be merged with AD13 as the latter is the more general user requirement to place descriptions away from other sounds in the audio space; AD8 is a technical means to partially achieve it. AD14 needs to include the fact that this information should be available to the user so they can determine what is useful to them, and to AT to assist in rendering.
John Foliot	Accept this section as proposed	The justification for "AD-7: Permit smooth changes in volume rather than stepped changes." is not clear.
Silvia Pfeiffer	Accept this section with the following changes	I would remove AD-13 since I think it's a duplicate of AD-8.
Judy Brewer	Accept this section with the following changes	Requesting clarification of meaning and rationale for AD-12. Might this be intended to refer to settings for descriptions, instead of the descriptions themselves?

2. Texted Audio Description

Refer to 2.2 Texted Audio Description.

Audio descriptions that are given as text rather than recorded voice create specific requirements.
Texted audio descriptions are delivered to the client as text and rendered locally by e.g. a screen-reader or through a braille device. This can have advantages for users who are frequent screen-reader users and are thus given detailed control of the preferred voice and speaking rate, and many more options to control the speech synthesis.
Texted audio descriptions are provided as text files with a start time for each description cue. Since the time a screen reader takes to read out a description cue cannot be determined while the cues are being authored, it is difficult to make sure that cues don't run over into the main audio or into other description cues. There are at least three reasons for this:

A typical author of textual audio descriptions does not have a screen reader. This means s/he cannot check if the description fist within the time frame. Even if s/he has a screen reader, a different screen reader may take longer to read out the same sentence;
Some screen reader users (e.g. elderly and people with learning disabilities) may slow down the speech rate; or
A visually-complicated scene (e.g. figures on a blackboard in an online physics class) may not be sufficiently described within any time interval in the original audio track.

Requirements
Systems supporting texted audio/video descriptions must:

(TAD-1) Support to present a text audio description through a screen-reader or braille device with playback speed control and voice control and synchronisation points with the video.
(TAD-2) Textual ADs or EADs need to be provided in a format that contains the following information:
start time, text per description cue (the duration is determined dynamically, though an end time could provide a cut point)
possibly a speech synthesis markup to improve the quality of the AD (SSML or Speech CSS)
accompanying metadata providing labeling for speakers, copyright information, usage rights, language, etc.

(TAD-3) Support to present a text or separate audio track privately to those that need it in a mixed-viewing situation, e.g. through headphones.
(TAD-4) Support different options for the overflow case: continue reading, stop reading, pause the video.
(TAD-5) Support the control over speech synthesis playback speed, volume, voice, and provide synchronisation points with the video.
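
The three overflow options named in TAD-4 can be compared with a small sketch. It is only an illustration under stated assumptions: the names `Overflow`, `CuePlan`, and `plan_cue` are hypothetical, and speech length is estimated from a words-per-minute rate, which a real screen reader does not necessarily expose.

```python
from dataclasses import dataclass
from enum import Enum


class Overflow(Enum):
    CONTINUE_READING = "continue reading"
    STOP_READING = "stop reading"
    PAUSE_VIDEO = "pause the video"


@dataclass
class CuePlan:
    spoken_seconds: float       # how long the synthesiser speaks
    video_pause_seconds: float  # how long the video is held
    overlaps_next: bool         # speech runs past the end of the window


def estimated_speech_seconds(text: str, words_per_minute: float) -> float:
    """Rough estimate only; actual synthesis time depends on voice and engine."""
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    return len(text.split()) * 60.0 / words_per_minute


def plan_cue(text: str, start: float, window_end: float,
             words_per_minute: float, policy: Overflow) -> CuePlan:
    """Decide what happens when a cue does not fit between start and window_end."""
    window = window_end - start
    needed = estimated_speech_seconds(text, words_per_minute)
    if needed <= window:
        return CuePlan(needed, 0.0, False)          # no overflow
    if policy is Overflow.STOP_READING:
        return CuePlan(window, 0.0, False)          # description is cut short
    if policy is Overflow.PAUSE_VIDEO:
        return CuePlan(needed, needed - window, False)
    return CuePlan(needed, 0.0, True)               # keep reading over the audio


cue_text = "A woman in a red coat crosses the empty square"  # 10 words
paused = plan_cue(cue_text, 12.0, 15.0, 120, Overflow.PAUSE_VIDEO)
```

With these made-up numbers the cue needs about 5 seconds at 120 words per minute but only 3 seconds are available, so the pause policy holds the video for the remaining 2 seconds. A user who slows the speech rate would lengthen that pause accordingly.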

Accept this section?

Summary

Choice	Results (all responders)
Accept this section as proposed	10
Accept this section with the following changes	4
Do not accept this section for the following reasons	2
Abstain

Details

Responder	Texted Audio Description	Comments
Philip Jägenstedt	Accept this section with the following changes	(TAD-2) Do there already exist formats that can embed speech synthesis markup? Is copyright information and usage rights really relevant to users of this technology? It seems like a general requirements unrelated to accessibility which should be suggested for <audio>/<video>, if at all. (TAD-3) Sounds like a quality of implementation issue for UAs, not something all UAs will be able to do on all platforms. (TAD-4) Is this an author option, or a user setting?
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	If we want people to be able to create interoperable pages, this requirement must specify a TAD file format. (TAD-1), (TAD-2), (TAD-4), and (TAD-5) Aren't these all requirements for the TAD file format? (TAD-3) This won't be possible on all platforms, for example devices that only allow one mode of audio output.
aurélien levy	Accept this section as proposed
Marco Ranon	Accept this section as proposed	Typo: "check if the description fist within the time frame" should be "check if the description fits within the time frame" (fits, not fist).
Masatomo Kobayashi	Accept this section with the following changes	TAD-2 Could label the three specific requirements so that we could easily refer to each item? TAD-4 Could say "with the user setting overriding the author setting", or specify which side sets this option?
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Do not accept this section for the following reasons	<change TAD-1 to> Provide support to present a text audio description that can be accessed and presented in speech or Braille by an assistive technology with playback speed control and voice control and synchronization points with the video. </change> You need to expand EAD as it is not done until later. I would return this as it does not clearly state how the AD or EAD is provided to the screen reader or Braille device. Has the group considered allowing the screen reader an interface to control the Audio and EAD or AD programmatically and how - even if it is not a must?
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	TAD-2 needs to be reworded to present the user requirement, its currently a mix of user and technical requirement. TAD-4 needs to be generalised a bit and couched more in terms of the user need. (It can hardly be a user need to not read part of a description) The output of a text rendering to audio should be subject to the same requirements as a pre-recorded audio; so we should just say that and not repeat reqts from above.
John Foliot	Accept this section as proposed	I see no mention of Emotional ML (http://www.w3.org/TR/2009/WD-emotionml-20091029/), but note that TAD-2 is non-specific
Silvia Pfeiffer	Accept this section as proposed	General comment on the document: I think that it is a good idea to collect the media a11y requirements from an a11y user and a11y industry perspective in a single document. While some of these requirements seem to be nice-to-haves, most of them are actual requirements. In either case, it is good to "know what the customer wants" before we design the technical solutions. I foresee changes will still happen to all of these sections and this document should continue to be improved. I personally don't have any more changes to suggest at this point in time. I would, however, want to see the feedback that we receive included before we freeze a first version of this document.
Judy Brewer	Accept this section with the following changes	Needs copy-edits: - "Support to present" should be "Support presentation of" - "Textual ADs or EADS need to be provided in a format" should be spelled out and rephrased, e.g. "Support presentation of textual audio descriptions and extended audio descriptions in a format..."

3. Extended audio description

Refer to 2.3 Extended audio description.

In some types of material the pace of delivery is such that there is insufficient time to represent all of the necessary information in the gaps in the existing audio. To meet this case, the concept of extended description was developed. This extends the overall playback time – typically by inserting pauses at key moments - and then uses the additional time to deliver longer descriptions. For technical reasons this has not been possible in broadcast television. However hard-disk recording and on-demand internet systems can make this a practical possibility.
Extended audio description (EAD) has been reported to have benefits for cognitive disabilities; for example it may be of benefit for Aspergers Syndrome and other autistic spectrum problems, in that it can make connections between cause and effect, point out what is important to look at, or explain moods that might otherwise be missed.
Requirements
Systems supporting extended audio/video descriptions must:

(EAD-1) Support the use of extended descriptions with detailed user control.
(EAD-2) Support automatically pausing the video and main audio tracks in order to play a lengthy description.
(EAD-3) Support resuming playback of video and main audio tracks when the description is finished.
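
The effect of EAD-2 and EAD-3 on overall playback time can be worked through with a short sketch. The `Description` record and both functions are hypothetical, and the numbers are invented; the only idea taken from the text is that the video pauses for whatever part of a description does not fit in the available gap.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Description:
    media_time: float  # position in the program where the description starts
    gap: float         # seconds of natural pause available at that position
    duration: float    # seconds the description takes to play


def pause_needed(description: Description) -> float:
    """Seconds the video and main audio must be held for this description."""
    return max(0.0, description.duration - description.gap)


def extended_duration(media_duration: float, descriptions: List[Description]) -> float:
    """Total playback time once every required pause has been inserted."""
    return media_duration + sum(pause_needed(d) for d in descriptions)


descriptions = [
    Description(media_time=10.0, gap=2.0, duration=5.0),  # needs a 3 s pause
    Description(media_time=30.0, gap=4.0, duration=3.0),  # fits in the gap
]
total = extended_duration(60.0, descriptions)
```

Here a 60-second clip plays for 63 seconds, because only the first description overruns its gap.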

Accept this section?

Summary

Choice	Results (all responders)
Accept this section as proposed	11
Accept this section with the following changes	3
Do not accept this section for the following reasons	2
Abstain

Details

Responder	Extended audio description	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	This is a very complicated use case requiring deep integration between the UA (browser), media framework and speech synthesis software. While certainly nice if it were possible, this is too big an implementation burden to be realistic. If any browser vendor feels differently then I'd be keen to hear about it. (This is not an official position of Opera Software; Opera Software does not have an official position on much of anything.)
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	This is impossible to evaluate without a specification for an EAD file format. WCAG2 recommends using SMIL 1 or 2, which would be an enormous implementation burden on a UA.
aurélien levy	Accept this section as proposed
Marco Ranon	Accept this section with the following changes	The term "Extended audio description" may generate confusion. It suggests that it could be similar to "Audio description". However extended description is not suitable for streamed media. I noted that in the Audio Description section there is the text "See the section on Extended Descriptions for an alternative", where the word 'audio' is missing. Alternatives: drop 'audio' or replace it with 'visual'.
Masatomo Kobayashi	Accept this section with the following changes	EAD-1 Please clarify "detailed user control" Note that, without considering the strictness of synchronization, implementing EAD is not so difficult even in a HTML 4 document at least for human-narrated AD (e.g., simply using JavaScript and two media player objects). The extended TAD is also relatively easy for a self-voicing browser. If the UA depends on screen reader software for speech synthesis, some feedback mechanism will be required to meet EAD-3. I think officially-supported EAD will highly benefit end-users and content creators while the implementation will require less effort for UA developers (than implementing SMIL)
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	EAD-1 should be split into 2. Support of extended descriptions, and user control. The former could be satisfied with a pre-mixed alternate media stream; while the latter could not. The type of control needs to be made explicit here. EAD-2&3 are technical reqt. and should be held till later.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

4. Clear audio

Refer to 2.4 Clear audio.

A relatively recent development in television accessibility is the concept of clear audio, which takes advantage of the increased adoption of multichannel audio. This is primarily aimed at audiences who are hard of hearing, and consists of isolating the audio channel containing the spoken dialog and important non dialog audio that can then be amplified, or otherwise modified, while the other channels (containing music or ambient sounds) are attenuated.
Using the isolated audio track may make it possible to apply more sophisticated audio processing, such as pre-emphasis filters, pitch shifting and so on, to tailor the audio to the user's needs, since hearing loss is typically frequency dependent, and the user may have reasonable hearing in some bands, yet none at all in others.
Requirements
Systems supporting clear audio must:

(CA-1) Support speech as a separate audio track from other sounds.
(CA-2) Support the synchronisation of multitrack audio either within the same file or from separate files - preferably both.
(CA-3) Support separate volume control of the different audio tracks.
(CA-4) Potentially support pre-emphasis filers, pitch shifting, and other audio processing algorithms.
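
Separate volume control per track (CA-3) amounts to applying one gain to each track before the tracks are summed. The sketch below assumes already synchronised tracks given as lists of samples in the range -1.0 to 1.0; the `mix` function and the track names are hypothetical and stand in for whatever a media framework actually provides.

```python
from typing import Dict, List


def mix(tracks: Dict[str, List[float]], gains: Dict[str, float]) -> List[float]:
    """Sum the tracks sample by sample, each scaled by its own gain.

    Tracks without an entry in `gains` are left at unity gain, and the result
    is clipped to the range -1.0 to 1.0.
    """
    if not tracks:
        return []
    length = min(len(samples) for samples in tracks.values())
    mixed = []
    for i in range(length):
        sample = sum(gains.get(name, 1.0) * samples[i]
                     for name, samples in tracks.items())
        mixed.append(max(-1.0, min(1.0, sample)))
    return mixed


tracks = {
    "speech": [0.5, 0.5],
    "music": [0.5, -0.5],
}
# Keep the speech track as it is and attenuate the music by half.
clear = mix(tracks, {"speech": 1.0, "music": 0.5})
```

The pre-emphasis filtering and pitch shifting mentioned in CA-4 would be applied to the speech track before this mixing step and are not shown.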

Accept this section?

Summary

Choice	Results (all responders)
Accept this section as proposed	10
Accept this section with the following changes	3
Do not accept this section for the following reasons	3
Abstain

Details

Responder	Clear audio	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	It sounds like the introduction is talking about the front center channel in 5.1, 6.1 or 7.1 speaker setups. As far as I know this channel usually doesn't only contain voice, even though that's possible. The specific requirements talk about a separate track, rather than a specific channel of the audio, a track which may be in a different resource on a different server. This seems like an unnecessary implementation burden, I would suggest removing the requirements and simply allowing multi-channel audio tracks where one track may or may not contain only voice. Fine-grained control over the volume of each channel is not something I've ever seen in a media player, but could be a nice feature of UAs that have the resources to support it.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	(CA-1) - (CA-3) These require a UA to play multiple tracks of audio simultaneously. As previously noted, this is not possible on all devices.
aurélien levy	Do not accept this section for the following reasons	as we already have requirement for multitracks supports and separate volume I think the only relevant requirement here is the CA-4
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section with the following changes	This is fine however I would be concerned about cognitively impaired users having to manage volume control on multiple sound tracks. Perhaps a vehicle to control all volumes with one volume control would be beneficial? or a single control to increase the clear voice volume and decreasing the other audio tracks.
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	CA-1 and 2 should be couched in terms of where media encoding allows, and the speech is encoded in a separate track. CA4 - potentially is not a good word to use in a requirement.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section with the following changes	Is CA-4 intended to reference "filters" instead of "filers"?

5. Content Navigation by Content Structure

Refer to 2.5 Content Navigation by Content Structure.

Most people are familiar with fast forward and rewind in media content. However, fast forward and rewind, because they progress through content based only on time, are ineffective particularly when the content is being used for other than entertainment purposes. People with disabilities are also particularly disadvantaged if forced to rely solely on time-based forward and rewind to study content.
Fortunately, most content is structured, and appropriate markup can expose this structure to forward and rewind controls:

Books generally have chapters and perhaps subsections within those chapters. They also have structures such as page numbers, side-bars, tables, footnotes, tables of contents, glossaries, etc.
Short music selections tend to have verses and repeating choruses.
Larger musical works have movements which are further divisible into component parts such as "Exposition, Development, and Recapitulation" or "Theme and Variations."
Operas, theatrical plays, and movies have acts and scenes within those acts.
Television programs generally have clear divisions, e.g. newscasts have individual stories usually wrapped within a larger structure called "News, Weather, and Sports."
A lecturer may first lay out the issue, then consider a series of approaches or illustrative examples, and finally present a conclusion.

Support for effective structural navigation will require an additional control not typically available on current media players. This realtime control will allow the user to adjust the level of granularity applied to "next" and "previous" controls.
Two Examples of Granularity Levels
1. In a news broadcast, the most global level (analogous to <h1>) might be "News, Weather, and Sports." The second level (analogous to <h2>) would identify each individual news (or sports) story. With the granularity control set to level 1, "next" and "previous" would cycle among "News, Weather, and Sports." Set at level 2, it would cycle among individual news (or sports) stories.
2. In a bilingual "Audiobook Plus Text" production of Dante Alighieri's "La Divina Commedia," the user would choose whether to listen to the original medieval Italian or its modern language translation--possibly toggling between them. Meanwhile, both the original and translated texts might appear on screen, with both the original and translated text highlighted, line by line, in sync with the audio narration.

The most global (<h1>) level would be each individual "book," "Inferno," "Purgatorio," and "Paradiso."
The second (<h2>) level would be each individual "Canto."
The third (<h3>) level would be each individual "Verso."
The fourth (<h4>) level would be each individual line of poetry.

With granularity set at level 1, "Next" and "Previous" would cycle among the three books of "The Divine Comedy." Set at level 2, they would cycle among its "Cantos," at level 3 among its "Versos," and at level 4 among the individual lines of poetry text.
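The granularity control described in the two examples above can be sketched in a few lines. This is only an illustration, not part of the reviewed requirements: the `Marker` type, the function names and the newscast timings are hypothetical, and it assumes that "next" and "previous" stop only at markers whose level equals the selected granularity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    """Hypothetical navigation marker: a titled point on the media timeline."""
    start: float  # seconds from the start of the media resource
    level: int    # 1 = most global level (analogous to <h1>)
    title: str


def next_marker(markers: List[Marker], now: float, granularity: int) -> Optional[Marker]:
    """First marker at the chosen level that starts after the current time."""
    candidates = [m for m in markers if m.level == granularity and m.start > now]
    return min(candidates, key=lambda m: m.start, default=None)


def previous_marker(markers: List[Marker], now: float, granularity: int) -> Optional[Marker]:
    """Last marker at the chosen level that starts before the current time."""
    candidates = [m for m in markers if m.level == granularity and m.start < now]
    return max(candidates, key=lambda m: m.start, default=None)


# Made-up newscast: level 1 is "News, Weather, and Sports", level 2 the stories.
NEWSCAST = [
    Marker(0, 1, "News"), Marker(0, 2, "Story A"), Marker(60, 2, "Story B"),
    Marker(120, 1, "Weather"), Marker(120, 2, "Forecast"),
    Marker(180, 1, "Sports"), Marker(180, 2, "Match report"),
]
```

With playback at 10 seconds, "next" at level 1 lands on "Weather", while "next" at level 2 lands on "Story B". Whether a coarser boundary should also stop navigation at a finer level is a design choice this sketch leaves open.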
Note that, just as printed books may have footnotes, sidebars, and other ancillary content structures, media productions may also contain ancillary content. Newscasts will have commercials. Audio productions of "The Divine Comedy" may well include reproductions of famous frescoes or paintings interspersed throughout the text, though these are not properly part of the text/content.
Just as the structures introduced particularly by nonfictional titles make books more usable, media is more usable when its inherent structure is exposed by markup. And markup-based access to structure is critical for persons with disabilities who possess less ability to infer structure from purely presentational cues.
Structural navigation has proven highly effective internationally in various programs of electronic book publication for persons with print disabilities. Nowadays, these programs are based on the ANSI/NISO Z39.86 specifications. Z39.86 structural navigation is also supported by e-publishing industry specifications.
The user can navigate along the timebase using a continuous scale, and by relative time units within rendered audio and animations (including video and animated images) that last three or more seconds at their default playback rate. (UAAG 2.0 4.9.6?)
The user can navigate by semantic structure within the time-based media, such as by chapters or scenes, if present in the media (UAAG 2.0 4.9.7).
Requirements
Systems supporting content navigation must:

(CN-1) Generally, provide accessible keyboard controls for navigating a media resource in lieu of clicking on the transport bar, e.g. 5sec forward/back, 30sec forward/back, beginning, end.
(CN-2) Provide a means to structure a media resource by semantic content structure, e.g. through adding a track to the video that contains navigation markers (in table-of-content style). Such a track can be provided as part of the media resource or externally and synchronised with the media resource.
(CN-3) The navigation track should provide for hierarchical structures with titles for the sections.
(CN-4) Support both global navigation by the larger structural elements of a media work, and also the most localized atomic structures of that work, even though authors may not have marked-up all levels of navigational granularity.
(CN-5) Be possible through third-party provided navigational markup files.
(CN-6) Keep all content representations in sync, so that moving to any particular structural element in media content also moves to the corresponding point in all provided alternate media representations (captions, described video, transcripts, etc) associated with that work.
(CN-7) Support direct access to any structural element, possibly through URIs.
(CN-8) Support pausing primary content traversal to provide access to such ancillary content in line.
(CN-9) Support skipping of ancillary content in order to not interrupt content flow.
(CN-10) Support direct access to each ancillary content item, including with "next" and "previous" controls, apart from accessing the primary content of the title.
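As an illustration of the time-based controls named in (CN-1), the following minimal sketch computes the new playback position for a key press. The key names and their bindings are assumptions made for this example only; the requirement itself does not prescribe any particular keys.

```python
# Hypothetical key bindings for the steps named in (CN-1); not taken from any specification.
SEEK_STEPS = {
    "ArrowRight": 5.0,   # 5 seconds forward
    "ArrowLeft": -5.0,   # 5 seconds back
    "PageDown": 30.0,    # 30 seconds forward
    "PageUp": -30.0,     # 30 seconds back
}


def seek_target(key: str, current: float, duration: float) -> float:
    """Return the playback position (in seconds) after a key press.

    The result is clamped to the range [0, duration]; unknown keys leave
    the position unchanged.
    """
    if key == "Home":
        return 0.0
    if key == "End":
        return duration
    step = SEEK_STEPS.get(key)
    if step is None:
        return current
    return min(max(current + step, 0.0), duration)
```

Clamping keeps a "5 seconds back" press near the beginning from producing a negative position. The same function could be driven by input methods other than a keyboard.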

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	9
Accept this section with the following changes	3
Do not accept this section for the following reasons	2
Abstain	2

Details

Responder	Content Navigation by Content Structure	Comments
Philip Jägenstedt	Accept this section with the following changes	If the chapter markers are a tree, are "previous" and "next" sufficient to navigate them? While one could simply flatten the tree, what benefit is there to the user of a tree structure in the markup if it cannot be represented as such in a UI? (CN-7) is a requirement for integration with Media Fragments, which I support; this is a good idea. In (CN-8)-(CN-10), what is the ancillary content being referred to? I don't understand these 3 requirements at all.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	This will require a mandated CN file format if we want interoperable content. The two formats mentioned are quite complex and will require enormous implementation effort. Do we actually require that level of sophistication? (CN-2) - This may not be possible with all CN formats as not all are self-contained. (CN-3) - (CN-5) These are functions of a CN file format. (CN-8) - (CN-10) I don't know what these mean.
aurélien levy	Do not accept this section for the following reasons	I don't understand CN-8 to CN-10
Marco Ranon	Accept this section as proposed	In the second example, shouldn't "La Divina Commedia" be the H1?
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Abstain	(CN-8) (CN-9) (CN-10) how could we possibly determine what content is primary or secondary?
Joshue O'Connor	Accept this section as proposed	Great Idea!
Jon Gunderson	Accept this section as proposed	This looks a lot like DAISY
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	CN-1 This needs to be re-written. Generally is not a good word to use in a requirement. We should require the navigation control be accessible, and what that means, rather than necessarily refer to the means of effecting it (keyboard). I assume we would still want the navigation to be possible on form factors that do not have a keyboard. CN-2 should be about the user's ability to navigate through a media resource in a structured way, rather than technical constraints on the realisation of that functionality. Not sure CN-5 is a user need.
John Foliot	Abstain	I have reservations/concerns about referencing books in a discussion around the <video> and <audio> elements, and fear that it will be held out as scope-creep. I also muse aloud about: “Short music selections tend to have verses and repeating choruses” - and specific use-cases when a user would want to jump around inside a media piece such as that, as none have been provided. I question whether “CN-2: Provide a means to structure a media resource by semantic content structure, e.g. through adding a track to the video that contains navigation markers (in table-of-content style).” – is in scope here, as it appears to be a media content authoring requirement: a topic that is not addressed in any other way here. It always seems to also stray towards prescriptive, rather than descriptive, in expression of need.
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section with the following changes	Requesting clarification of CN-5

6. Captioning

Refer to 2.6 Captioning.

For people who are deaf or hard of hearing, captioning is the main alternative representation of audio. Captions are always written in the same language as the main audio track. They not only render the audio as text on the screen, they also indicate important non-speech information, such as sound effects, music and laughter. Captions are either closed or open. Closed captions are transmitted as data along with the video but are not visible until the user elects to turn them on, usually by invoking an on-screen control or menu selection. Open captions are always visible; they have been merged with the video track and cannot be turned off.
Ideally captions should be a verbatim representation of the audio; however, captions are sometimes edited for various reasons-- for example, for reading speed or for language level. In general, consumers of captions have expressed that the text should represent exactly what is in the audio track. If edited captions are provided, then they should be clearly marked as such, and the full verbatim version should also be available as an option.
The timing of caption text can coincide with the mouth movement of the speaker (where visible), but it is not strictly necessary. For timing purposes, captions may sometimes precede or extend slightly after the audio they represent. Captioning should also use adequate means to distinguish between speakers as turn-taking occurs during conversation; this is commonly done by positioning the text near the speaker, although in some countries color is used to indicate a change in speaker.
Captions are useful to a wide array of users in addition to their originally intended audiences. Gyms, bars and restaurants regularly employ captions as a way for patrons to watch television while in those establishments. People learning to read or learning English as a second language also benefit from captions: research has shown that captions help reinforce vocabulary and language. Captions can also provide a powerful search capability, allowing users to search the caption text to locate a specific video or an exact point in a video.
Requirements
Formats for captions, subtitles or foreign-language subtitles must:

(CC-1) Render text in a time-synchronized manner, using the audio track as the timebase master.
(CC-2) Allow the author to specify erasures, when no text is displayed on the screen.
(CC-3) Allow the author to assign timestamps so that one caption/subtitle follows another, with no perceivable gap in between.
(CC-4) Be available in a text encoding.
(CC-5) Support positioning in all parts of the screen.
(CC-6) Support the display of multiple regions of text simultaneously.
(CC-7) Display multiple rows of text when rendered as text in a right-to-left or left-to-right language.
(CC-8) Allow the author to specify line breaks.
(CC-9) Permit a range of font faces and sizes.
(CC-10) Render a background in a range of colors, supporting a full range of opacities.
(CC-11) Render text in a range of colors.
(CC-12) Render text with an outline or drop shadow.
(CC-13) Allow the background to be removed or to remain on screen when no text is displayed.
(CC-14) Allow the use of mixed display styles-- e.g., mixing paint-on captions with pop-on captions-- within a single caption or in the caption stream as a whole.
(CC-15) Support positioning such that the lowest line of captions appears at least 1/12 of the total screen height above the bottom of the screen, when rendered as text in a right-to-left or left-to-right language.
(CC-16) Use conventions that include inserting left-to-right and right-to-left segments within a vertical run (e.g. Tate-chu-yoko in Japanese), when rendered as text in a top-to-bottom oriented language.
(CC-17) Represent content of different natural languages, where the content of distinct languages may be segregated into separate documents or integrated into a single document.
(CC-18) Represent content of at least those specific natural languages that may be represented with [Unicode 3.2], including common typographical conventions of that language (e.g., through the use of furigana and other forms of ruby text).
(CC-19) Present the full range of typographical glyphs, layout and punctuation marks normally associated with the natural language's print writing system.
(CC-20) Permit in-line mark-up for foreign words or phrases.
(CC-21) Permit the distinction between different speakers.

Further, systems that support captions must:

(CC-22) Support captions provided inside media resources as tracks, or in external files.
(CC-23) Ensure that captions are displayed in sync with the media resource.
(CC-24) Support user activation/deactivation of caption tracks.
(CC-25) Support edited and full verbatim captions.
(CC-26) Support multiple tracks of captions in different languages.
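The timing behaviour asked for in (CC-1) to (CC-3) can be illustrated with a small lookup over timed cues. This is a sketch under stated assumptions rather than a description of any caption format: the `Cue` type and `active_text` function are hypothetical, and each cue is assumed to cover the half-open interval from its start up to, but not including, its end.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cue:
    """Hypothetical caption cue, timed against the playback timebase."""
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive
    text: str


def active_text(cues: List[Cue], now: float) -> Optional[str]:
    """Return the caption text to show at time `now`, or None for an erasure."""
    for cue in cues:
        if cue.start <= now < cue.end:
            return cue.text
    return None  # nothing is displayed between cues


CUES = [
    Cue(1.0, 4.0, "Hello."),
    Cue(4.0, 6.5, "[laughter]"),   # starts exactly where the previous cue ends
    Cue(9.0, 11.0, "Goodbye."),    # preceded by an erasure from 6.5 to 9.0
]
```

Because the end time is exclusive, a cue that ends at 4.0 and a cue that starts at 4.0 follow one another with no gap and no overlap, which is one way to read (CC-3). The interval from 6.5 to 9.0 is an erasure in the sense of (CC-2).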

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	6
Accept this section with the following changes	10
Do not accept this section for the following reasons
Abstain

Details

Responder	Captioning	Comments
Philip Jägenstedt	Accept this section with the following changes	(CC-1) The reference time doesn't have to be the audio track, let's just say it's the media resource. (Video without audio but with captions is possible.) (CC-5) Is this a requirement for pixel-perfect positioning, or relative positioning? Must it be possible to give a bounding box for the text, or is it enough to say where it starts? (CC-13) Sounds very peculiar and not something that is possible with most formats I have seen. Please give a rationale for why this is important (or remove the requirement if it isn't). (CC-14) How can a single caption mix "display styles", e.g. both be "paint-on" and "pop-up"? (I don't know exactly what any of the quoted words mean.) (CC-17) If there are separate caption files, is it expected that these should be displayed together and not overlap? This sounds rather difficult to implement, why not simply have a single caption file? (CC-20) Is supporting italics enough to differentiate between languages, or should it be possible to mark up the actual language. If yes, why? (CC-26) Sounds like a bonus, not an essential requirement.
Leif Halvard Silli	Accept this section with the following changes	The first paragraph of the introductory text says: "Captions are always written in the same language as the main audio track." Whereas the Requirements section says: "Formats for captions, subtitles or foreign-language subtitles must:" Obviously the Requirements section knows about subtitles in other languages than the language of the main audio track ... Please bring the subject of subtitling in other languages into the introductory text.
Gregory Rosmaita	Accept this section with the following changes	plus 1 to Leif's request that the subject of subtitling in multiple languages be brought into the introductory text
Eric Carlson	Accept this section with the following changes	(CC-1) Not all media files have audio. (CC-17) Does this mean that captions are rendered from more than one external caption file simultaneously? (CC-25) What must a UA do differently for edited vs full verbatim captions?
aurélien levy	Accept this section with the following changes
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section with the following changes	CC-1 The timebase master should be the media resource, instead of the audio track?
Denis Boudreau	Accept this section with the following changes	(CC-1) What if said media has no audio track? (CC-17) How would multiple caption files coexist for the same media? Based on user preferences?
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section with the following changes	The user should also have final control over rendering styles like color and fonts
Richard Schwerdtfeger	Accept this section with the following changes	There should be a way to find and switch caption languages on the fly. How is that done if the caption document is found at a different URI? cc-5: Do you really want to say ALL parts of the screen? Placing captions to the far right of the screen may not be the best thing to allow.
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	We should be careful not to conflate the terminology between captions and subtitles as some people get upset about that. CC-1 A media file may have a separate time encoding which is used for both video and audio. However captions are defined as a text representation of audio; so captions and video only don't make sense. CC9-12 should be clear that the effects must be mixable within one caption. CC-13 is to allow the user to see as much of the underlying video as possible where captions are infrequent. Where captions are frequent, it is preferable to keep the caption background so that it minimises distraction. CC-17 is really a requirement on subtitles (foreign language).
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

7. Extended Captioning

Refer to 2.7 Extended Captioning.

Extended captioning adds metadata to captions so that a richer, more hypertext-like experience is possible – for example, adding glossary definitions for acronyms and other initialisms, foreign terms (for example Latin), jargon or other difficult language. This may be age-graded, so that multiple caption tracks are supplied, or the glossary function may be added dynamically. Glossary information can be added in the normal time allotted for the caption (e.g. as a callout or other overlay), or it might take the form of a hyperlink that pauses the main content and allows access to more complete explanatory material.
Extended captioning can be useful for those with restricted reading skills. In addition, captioning can be provided for the audio-description track. This may prove helpful for comprehension, especially when extended descriptions are being used.
Requirements
Systems that support extended captions must:

(ECC-1) Support metadata markup for (sections of) caption text.
(ECC-2) Support hyperlinks and other activation mechanisms for supplementary data for (sections of) caption text.
(ECC-3) Support alternative extended caption tracks for different purposes.
(ECC-4) Support captions that may be longer than the space to the next caption and thus provide overlapping caption cues. In this case, a feature should be provided to decide whether the overlap is acceptable, the caption should be cut, or the media resource should be paused while the caption is displayed.
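The three outcomes named in (ECC-4) can be made concrete with a small sketch. Everything here is an assumption for illustration: the policy names, the `resolve` function and the idea of a "needed" display time are hypothetical, and the sketch deliberately does not say whether the author or the user chooses the policy.

```python
from enum import Enum
from typing import Tuple


class OverlapPolicy(Enum):
    """Hypothetical names for the three options listed in (ECC-4)."""
    OVERLAP = "overlap"  # let the long caption run into the next one
    CUT = "cut"          # remove the long caption when the next one starts
    PAUSE = "pause"      # hold the media until the long caption has been shown


def resolve(start: float, needed: float, next_start: float,
            policy: OverlapPolicy) -> Tuple[float, float]:
    """Return (caption end on the media timeline, seconds the media is paused).

    `needed` is the display time the extended caption requires, and
    `next_start` is when the following caption begins.
    """
    natural_end = start + needed
    if natural_end <= next_start:
        return natural_end, 0.0           # the caption fits, so no conflict arises
    if policy is OverlapPolicy.OVERLAP:
        return natural_end, 0.0           # both captions are on screen for a while
    if policy is OverlapPolicy.CUT:
        return next_start, 0.0            # the caption is shortened to fit
    # PAUSE: the media stops at next_start for the remaining display time
    return next_start, natural_end - next_start
```

For a caption that starts at 10 seconds, needs 8 seconds and is followed by another caption at 15 seconds, the pause policy holds the media for 3 seconds at the 15-second mark.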

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	9
Accept this section with the following changes	4
Do not accept this section for the following reasons	3
Abstain

Details

Responder	Extended Captioning	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	(ECC-1) Much too vague. What are the actual requirements and use cases? (ECC-2) I would like to see links, but "other activation mechanisms" is again much too vague. What are the actual requirements and use cases? (ECC-4) Is the feature part of the format, or a feature of the UA?
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	(ECC-1) What does this mean? What is the format of "metadata markup"? (ECC-2) What are "other activation mechanisms"? (ECC-4) How does this work? Who decides if "overlap is OK ..."?
aurélien levy	Accept this section with the following changes	ECC-1,2 and 4 need to be more explicit
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section with the following changes	ECC-4 Could say "with the user setting overriding the author setting", or specify which side sets this option? Also I am not sure what ECC-1 and -2 actually mean.
Denis Boudreau	Do not accept this section for the following reasons	(ECC-1), (ECC-2) and (ECC-4) need to be defined more clearly.
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section with the following changes	Regarding ECC-4 what about providing a feature to configure wrapping text with smaller fonts? This seems better than cutting off the text.
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	General, these should probably be called additional text information or something, as captions are a representation of the audio track; and this text is not in the audio. This text should not be marked as caption type. ECC-1 - what user need is this addressing? ECC 2 - split into two requirements, one for additional information that can be displayed in the same time as a caption (Ruby for example could be considered this kind of information); and another for information which is presented independently outside the timeline of the media. ECC-4 - needs to be distinguished from the normal caption case of two speech acts that overlap in time (e.g. an interruption). Also the required display time of extended text information is dependent on the user's reading speed. The user should have a means to control the display time and size so that if they can keep up they avoid pausing the video.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

8. Sign Translation

Refer to 2.8 Sign Translation.

Sign language shares the same concept as captioning: it presents both speech and non-speech information in an alternative format. Note that due to the wide regional variation in signing systems (e.g., American Sign Language vs British Sign Language), sign translation may not be appropriate for content with a global audience unless localized variants can be made available.
From a technology point of view, signing can be open (mixed with the video and offered as an entirely alternate stream) or closed (using some form of picture-in-picture or alpha blending technology). It is possible to use quite low bit rates for much of the signing track, but it is important that facial, arm, hand and other body gestures be delivered at sufficient resolution to support legibility. Animated avatars may not currently be sufficient as a substitute for human signers, although research continues in this area and it may become practical at some point in the future.
Requirements
Systems supporting sign language must:

(SL-1) Support sign language video either as a track as part of a media resource or as an external file.
(SL-2) Support the synchronised playback of the sign language video with the media resource.
(SL-3) Support the display of sign language video either as picture-in-picture or alpha-blended overlay, as parallel video, or as the main video with the original video as picture-in-picture or alpha-blended overlay.
(SL-4) Support multiple sign language tracks in several sign languages.
(SL-5) Support the interactive activation/deactivation of a sign language track by the user.

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	12
Accept this section with the following changes	3
Do not accept this section for the following reasons	1
Abstain

Details

Responder	Sign Translation	Comments
Philip Jägenstedt	Accept this section with the following changes	(SL-3) What is "as parallel video"? "as picture-in-picture or alpha-blended overlay" and "as the main video with the original video as picture-in-picture or alpha-blended overlay" also seem to mean exactly the same thing, am I missing something? Note that <track> may not be the solution for this, rather linking several <video> elements together seems like a more reasonable solution if the video tracks are from separate resources.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	Decoding and rendering multiple streams is even more difficult with video than with audio. Video decoding hardware used in many mobile devices is not able to decode more than one video stream simultaneously. (SL-1) and (SL-3) Playing multiple media files in sync is not currently supported by HTML5, and will be a significant amount of work for some UAs.
aurélien levy	Accept this section as proposed
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Do not accept this section for the following reasons	I think you need to support multiple tracks for different sign languages if they are available and allow the user to specify which one is presented.
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	describe better the two cases: of picture in picture (where the background portion of the sign video is visible to the user), and alpha blended; where only the signer and no background is mixed onto the main video. Note that SL could also be in a separate window frame to the main video (main video for example being resized, this is sometimes done in TV broadcast)
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

9. Transcripts

Refer to 2.9 Transcripts.

While synchronized captions are preferable for people with hearing impairments, for some users they are not viable – those who are deaf-blind, for example, or those with cognitive or reading impairments that make it impossible to follow synchronized captions. And even with ordinary captions, it is possible to miss some information as the captions and the video require two separate loci of attention. Providing a full transcript is a good option in addition to, but not as a replacement for, timed captioning. A transcript can be presented simultaneously with the media material, which can assist slower readers or those who need more time to reference context, but it should also be made available independently of the media.
A full text transcript should include information that would be in both the caption and audio description, so that it is a complete representation of the material, as well as containing any interactive options.
Requirements
Systems supporting transcripts must:

(T-1) Support the provisioning of a full text transcript for the media asset in a separate but linked resource, where the linkage is programmatically accessible to AT.
(T-2) Support the provisioning of a full text transcript with the media resource, e.g. in a scrolling area next to the video or underneath the video, which is also AT accessible.

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	10
Accept this section with the following changes	5
Do not accept this section for the following reasons	1
Abstain

Details

Responder	Transcripts	Comments
Philip Jägenstedt	Accept this section with the following changes	(T-2) seems incompatible with the content model of <video>. Is the intention that the text be in some kind of overlay that is not part of the page layout, or should the size of the <video> element change when transcripts are activated. In general it seems that a transcript is already supported: simply include it next to the <video>. What's missing is some way of highlighting the current position.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	(T-2) Does this affect page layout, e.g. does it change the size of the <video> element or appear above other page content? How will this work on a device with a small screen or a device that only allows fullscreen playback?
aurélien levy	Accept this section as proposed
Marco Ranon	Do not accept this section for the following reasons	There are two requirements, one for a link to the text version and the other to scrolling text adjacent to the playing video. It appears that both are required, if this is the case, some users with distractibility conditions may find it difficult to concentrate on the video due to distraction from the scrolling text. Could it also be required that there's a mechanism for hiding the scrolling text?
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Accept this section with the following changes	Consider adjusting the intro a bit. Synchronized captions may not always be the sole preference for all people with hearing impairments if cognitive disabilities are also involved. Some prefer transcripts with the video. Transcripts also have advantages over captioned video for some eg Deaf-blind, people with slow internet connections (I'm currently on dialup, can attest to that one), low-income people whose main/only net access is at free public terminals eg local library (browsing sessions will be unavoidably short - eg half or one hour - and probably not daily. Can print transcript to take away, leaves more time in session for other things). A transcript for a media asset in separate but linked resource is extremely important to all people with disabilities and those without. If all else fails a transcript often saves the day.
Sean Hayes	Accept this section with the following changes	T-2 is not clear whether scrolling is a requirement or not. The key things here are that the text is provided, it is available to AT through adequate labelling, and it can be included in the layout by the page author if desired.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section with the following changes	Suggesting spelling out AT to assistive technology.

10. Keyboard Access to interactive controls / menus

Refer to 3.1 Keyboard Access to interactive controls / menus.

Ideally, the user agent would have a rich set of native controls for media operation. This would ensure that the controls are discoverable by the user and assistive technology. These controls include but are not limited to: play, pause, stop, jump to beginning, jump to end, scale player size (up to full screen), adjust volume, mute, captions on/off, descriptions on/off, selection of audio language, selection of caption language, selection of audio description language, location of captions, size of captions, video contrast/brightness, playback rate, content navigation on same level (next/prev) and between levels (up/down) etc.
The author would be able to choose any/all of the controls, skin them and position them. The user would have the ability to override all author settings. If the author chose to create a new interface using scripting, that interface MUST map to the native controls in the user agent, so the user can ignore author controls and use accessible native controls.
Requirements
Systems supporting keyboard accessibility must:

(KA-1) Support operation of all functionality via the keyboard using sequential or direct keyboard commands that do not require specific timings for individual keystrokes, except where the underlying function requires input that depends on the path of the user's movement and not just the endpoints (e.g., free hand drawing). This does not forbid and should not discourage providing mouse input or other input methods in addition to keyboard operation. (UAAG 2.0 4.1.1)

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	11
Accept this section with the following changes	5
Do not accept this section for the following reasons
Abstain

Details

Responder	Keyboard Access to interactive controls / menus	Comments
Philip Jägenstedt	Accept this section with the following changes	The requirement is sound, but the introductory text is strange. "If the author chose to create a new interface using scripting, that interface MUST map to the native controls in the user agent, so the user can ignore author controls and use accessible native controls." Scripted controls do not "map" to native controls, rather they both control ("map to") the same underlying interface. However, this is completely unrelated to the possibility of ignoring author controls. I suggest removing the section completely and adding a requirement that it should be possible to enable native controls regardless of the author preference (stated through the controls attribute on <video>).
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	I don't understand "If the author chose to create a new interface using scripting, that interface MUST map to the native controls in the user agent". Controls implemented with scripting should use the same interface to control a media file, but they do not use the native controls. Like Philip, I suggest we add a requirement that it must be possible to enable native controls even if a page has custom controls.
aurélien levy	Accept this section as proposed
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section with the following changes	In the introductory text, could we say "MUST NOT interfere with the native controls" instead of "MUST map to the native controls"?
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section with the following changes	Question: How do mobile devices operate these controls without a keyboard? Are you going to require that devices support a keyboard to access controls/menus? I would investigate this issue with the mobile device manufacturers.
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section as proposed
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section with the following changes	Seeking confirmation from UAWG that this is the only keyboard support requirement that needs to be referenced?

11. Granularity Level Control for Structural Navigation

Refer to 3.2 Granularity Level Control for Structural Navigation.

As explained in Section "Content Navigation" above, a realtime control mechanism is required for adjusting the granularity of what specific structural navigation point "next" and "previous" controls will access.
Requirements

(CNS-1) All identified structures, including ancillary content as defined in "Content Navigation" above, must be accessible with the use of "next" and "previous," as refined by the granularity control.
(CNS-2) Ancillary structures must provide the possibility to be skipped, played in line, or accessed directly apart from primary content.
(CNS-3) This control must be input device agnostic.
(CNS-4) Producers and authors may optionally provide additional access options to identified structures, such as direct access to any node in a Table of Contents.
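One possible reading of the granularity control behind CNS-1 can be sketched as follows. This is a minimal Python illustration, not part of the requirements: the `NavPoint` type, the numeric `level` field (1 for the coarsest structure, larger numbers for finer ones) and both function names are assumptions made for the example, and ancillary content (CNS-2) is not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NavPoint:
    """Hypothetical structural navigation point in a media resource."""
    time: float  # start time in seconds
    level: int   # 1 = coarsest structure (e.g. chapter), larger = finer
    label: str


def next_point(points: List[NavPoint], now: float, granularity: int) -> Optional[NavPoint]:
    """Return the earliest point after `now` at or above the chosen granularity."""
    candidates = [p for p in points if p.level <= granularity and p.time > now]
    return min(candidates, key=lambda p: p.time) if candidates else None


def previous_point(points: List[NavPoint], now: float, granularity: int) -> Optional[NavPoint]:
    """Return the latest point before `now` at or above the chosen granularity."""
    candidates = [p for p in points if p.level <= granularity and p.time < now]
    return max(candidates, key=lambda p: p.time) if candidates else None
```

With granularity 1 the "next" control would jump from chapter to chapter; with granularity 2 it would also stop at sections. The functions take only a time and a level, so they do not depend on which input device triggers them (CNS-3).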

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	12
Accept this section with the following changes	1
Do not accept this section for the following reasons	3
Abstain

Details

Responder	Granularity Level Control for Structural Navigation	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	(CNS-1) "ancillary content" isn't defined above, it is merely mentioned without explaining what it is. As such, I don't understand what this requirement is about. (CNS-2) I also don't know what this is about. (CNS-4) Is not a requirement at all, just a statement of something that might be possible. Voting to remove this completely unless the requirements can be made more specific and understandable for someone who hasn't taken part in the discussion leading up to them (such as myself).
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	I don't understand this requirement at all.
aurélien levy	Do not accept this section for the following reasons	I don't understand ancillary structures
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section as proposed	(CNS-1) Would ancillary content share the same level of importance or hierarchy as primary content when going from the previous to the next?
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	Not sure why this is called out as a separate section. And it appears to repeat some of the KN-* requirements.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

12. Time Scale Modification

Refer to 3.3 Time Scale Modification.

While all devices may not support the capability, a standard control API must support the ability to speed up or slow down content presentation without altering audio pitch.
Requirements
The user can adjust the playback rate of prerecorded time-based media content, such that all of the following are true (UAAG 2.0 4.9.5):

(TSM-1) The user can adjust the playback rate of the time-based media tracks to between 50% and 250% of real time.
(TSM-2) Speech whose playback rate has been adjusted by the user maintains pitch in order to limit degradation of the speech quality.
(TSM-3) Audio and video tracks remain synchronized across this required range of playback rates.
(TSM-4) The user agent provides a function that resets the playback rate to normal (100%).
(TSM-5) The user can stop, pause, and resume rendered audio and animation content (including video and animated images) that last three or more seconds at their default playback rate. (UAAG 2.0 4.9.6)
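The rate range in TSM-1 and the reset function in TSM-4 can be illustrated with a minimal Python sketch. All names here (`clamp_playback_rate`, `PlaybackRateControl`, the three constants) are hypothetical and do not come from any user agent API. Treating 50% and 250% as hard limits is an assumption of the sketch; the requirement only states that this range must be available. Pitch preservation (TSM-2) and track synchronization (TSM-3) are not modeled.

```python
MIN_RATE = 0.5     # 50% of real time (lower bound named in TSM-1)
MAX_RATE = 2.5     # 250% of real time (upper bound named in TSM-1)
NORMAL_RATE = 1.0  # 100%, the value restored by the reset function (TSM-4)


def clamp_playback_rate(requested: float) -> float:
    """Limit a requested rate to the range named in TSM-1."""
    return max(MIN_RATE, min(MAX_RATE, requested))


class PlaybackRateControl:
    """Hypothetical holder for a user-adjustable playback rate."""

    def __init__(self) -> None:
        self.rate = NORMAL_RATE

    def set_rate(self, requested: float) -> float:
        """Store the clamped rate and return the value actually applied."""
        self.rate = clamp_playback_rate(requested)
        return self.rate

    def reset(self) -> float:
        """Return playback to normal speed."""
        self.rate = NORMAL_RATE
        return self.rate
```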

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	10
Accept this section with the following changes	2
Do not accept this section for the following reasons	2
Abstain	2

Details

Responder	Time Scale Modification	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	These requirements are redundant with what is already required by the spec, so including them only hides things that are actually a problem. The unconditional requirement to maintain pitch is not reasonable and ought to be both optional and controlled by a DOM attribute as it isn't always desired.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	(TSM-2) As I have noted before, it is not possible to change speed without changing pitch on all devices that are capable of playing digital audio. (TSM-3) - (TSM-5) These are already required by the HTML5 spec.
aurélien levy	Accept this section with the following changes	TSM-2 seems to be quite difficult to achieve
Marco Ranon	Accept this section as proposed	Is TSM-2 achievable? Is it possible to anticlimactically insert pauses between words? (just asking since I have no real experience with multimedia.)
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Abstain	How could (TSM-1) and (TSM-2) be achieved without altering audio pitch?
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Abstain
Sean Hayes	Accept this section with the following changes	TSM-3 should include captions and descriptions.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

13. Production practice and resulting requirements

Refer to 3.4 Production practice and resulting requirements.

One of the biggest problems to date has been the lack of a universal system for media access. In response to user requirements various countries and groups have defined systems to provide accessibility, especially captioning for television. However these systems are typically not compatible. In some cases the formats can be inter-converted, but some formats – for example DVD sub-pictures – are image based and are difficult to convert to text.
Caption formats are often geared towards delivery of the media, for example as part of a television broadcast. They are not well suited to the production phases of media creation. Media creators have developed their own internal formats which are more amenable to the editing phase, but to date there has been no common format that allows interchange of this data.
Any media based solution should attempt to reduce as far as possible layers of translation between production and delivery.
In general captioners use a proprietary workstation to prepare caption files; these can often export to various standard broadcast ingest formats, but in general files are not inter-convertible. Most video editing suites are not set up to preserve captioning, and so this has typically to be added after the final edit is decided on; furthermore since this work is often outsourced, the copyright holder may not hold the final editable version of the captions. Thus when programming is later re-purposed, e.g. a shorter edit is made, or a ‘director's cut’ produced, the captioning may have to be redone in its entirety. Similarly, and particularly for news footage, parts of the media may go to web before the final TV edit is made, and thus the captions that are produced for the final TV edit are not available for the web version.
It is important when purchasing or commissioning media that captioning and audio description are taken into account and given equal priority in terms of ownership, rights of use, etc., as the video and audio itself.
Requirements
Systems supporting accessibility needs for media must:

(PP-1) Support existing production practice for alternative content resources, in particular allow for the association of separate alternative content resources to media resources.
(PP-2) Support the association of authoring and rights metadata with alternative content resources.
(PP-3) Support the simple replacement of alternative content resources even after publishing.

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	7
Accept this section with the following changes	1
Do not accept this section for the following reasons	2
Abstain	5

(1 response didn't contain an answer to this question)

Details

Responder	Production practice and resulting requirements	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	What is a "proprietary workstation"? Is it a workstation that runs proprietary software, or something else? I have no major issue with any of the requirements, but am voting to reject this as it seems like advice to authors and organizations rather than something the HTML WG needs to consider when developing the technical solutions. The exception might be (PP-2), but the metadata issue has also been mentioned in other requirements, so there's no need to repeat it.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	This is advice for people authoring media, not HTML. It isn't something the HTML WG needs to consider.
aurélien levy	Accept this section as proposed
Marco Ranon	Abstain	Abstaining since I believe that this issue is not strictly related to accessibility but related to proprietary formats and copyright - therefore not really my remit.
Masatomo Kobayashi	Abstain	I am not sure whether this kind of requirements should be included or not.
Denis Boudreau	Abstain	I agree that this does not seem to be something directly related to accessibility or even HTML5.
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson
Sean Hayes	Accept this section with the following changes	I think this probably needs to be recast to indicate that specific technical choices for HTML are not made in a vacuum, and need to consider wider implications of the production of media. So this section is not really a user need (apart from a general requirement to maximise the amount of accessible content being produced), but may influence specific technical choices later on.
John Foliot	Abstain	While I understand and support the requirements, I am concerned again that this is out of scope for the delivery of <audio> and <video> content, as it feels more like authoring agent requirements
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Abstain

14. Discovery and activation/deactivation of available alternative content by the user

Refer to 3.5 Discovery and activation/deactivation of available alternative content by the user.

As described above, individuals need a variety of media (alternative content) in order to perceive and understand the content. The author or some web mechanism provides the alternative content. This alternative content may be part of the original content, embedded within the media container as 'fallback content', or linked from the original content. The user is faced with discovering the availability of alternative content.
Requirements
The user agent can facilitate the discovery of alternative content by following the criteria:

(DAC-1) The user has the ability to have indicators rendered along with rendered elements that have alternative content (e.g. visual icons rendered in proximity of content which has short text alternatives, long descriptions, or captions). In cases where the alternative content has different dimensions than the original content, the user has the option to specify how the layout/reflow of the document should be handled. (UAAG 2.0 3.1.1).
(DAC-2) The user has a global option to specify which types of alternative content are rendered by default and, in cases where the alternative content has different dimensions than the original content, how the layout/reflow of the document should be handled. (UAAG 2.0 3.1.2).
(DAC-3) The user can browse the alternatives and switch between them.
(DAC-4) Synchronized alternatives for time-based media (e.g., captions, audio descriptions, sign language) can be rendered at the same time as their associated audio tracks and visual tracks (UAAG 2.0 3.1.3).
(DAC-5) Non-synchronized alternatives (e.g., short text alternatives, long descriptions) can be rendered as replacements for the original rendered content (UAAG 2.0 3.1.3).
(DAC-6) Provide the user with the global option to configure a cascade of types of alternatives to render by default, in case a preferred alternative content type is unavailable (UAAG 2.0 3.1.4).
(DAC-7) During time-based media playback, the user can determine which tracks are available and select or deselect tracks. These selections may override global default settings for captions, audio descriptions, etc. (UAAG 2.0 4.9.8)
(DAC-8) Provide the user with the option to load time-based media content such that the first frame is displayed (if video), but the content is not played until explicit user request. (UAAG 2.0 4.9.2)
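The "cascade" in DAC-6 can be read as an ordered preference list that is walked until an available alternative is found. The following Python sketch shows that reading only; the function name `pick_alternative` and the type labels used in the usage note are hypothetical and are not defined by UAAG 2.0 or by this document.

```python
from typing import Optional, Sequence


def pick_alternative(preferences: Sequence[str], available: Sequence[str]) -> Optional[str]:
    """Return the first preferred alternative type that is available, else None.

    `preferences` is the user's global cascade, most preferred first;
    `available` lists the alternative types offered for one media resource.
    """
    available_set = set(available)
    for kind in preferences:
        if kind in available_set:
            return kind
    return None
```

For example, a user whose cascade is captions first and transcript second, viewing a resource that offers only a transcript and an audio description, would get the transcript by default. Per-resource selections made during playback (DAC-7) would still override this default.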

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	12
Accept this section with the following changes	4
Do not accept this section for the following reasons
Abstain

Details

Responder	Discovery and activation/deactivation of available alternative content by the user	Comments
Philip Jägenstedt	Accept this section with the following changes	(DAC-1) "the user has the option to specify how the layout/reflow of the document should be handled" This sounds very strange and something I think we should absolutely not allow for. (DAC-4) seems redundant with what has already been said, I suggest removing it. (DAC-5) is incompatible with how <video> is specified, I suggest removing it or outlining in detail the suggested content model of <video>. (DAC-6) Does "cascade" refer to the same kind of list that is configurable for Content-Language? (DAC-7) seems redundant with many requirements in the above, I suggest removing it. (DAC-8) Is this a user overload for the preload and autoplay attributes?
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	(DAC-1) - (DAC-1) How will the user control reflow and layout? (DAC-4) Isn't this fundamental to the concept of an alternative? If it needs to be mentioned at all, it should be in the section(s) about specific alternatives. (DAC-5) This needs to be fully specified, eg. what happens when alternative content doesn't fit into the <video> region, etc. (DAC-7) This should be mentioned in the section(s) about alternative content. (DAC-8) This changes behavior of 'autoplay' and 'preload', say so.
aurélien levy	Accept this section with the following changes	add a requirement on device independence to activate/deactivate the alternative content
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	User agent guidelines should be introduced as recommended practice in the HTML spec (consistent with how CSS rendering is introduced)
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

15. Requirements on making properties available to the accessibility interface

Refer to 3.6 Requirements on making properties available to the accessibility interface.

Often forgotten in media systems, especially with the newer forms of packaging such as DVD menus and on screen program guides, is the fact that the user needs to actually get to the content, and turn on any accessibility options they require. For players running on platforms that support an accessibility API any media controls need to be connected to that API.
On self-contained products that do not support assistive technology, any menus in the content need to provide information in alternative formats (e.g., talking menus). Products with a separate remote control, or that are self-contained boxes, should ensure the physical design does not block access, and should make accessibility controls, such as the closed-caption toggle, as prominent as the volume or channel controls.
Requirements

(API-1) Support to expose the alternative content tracks for a media resource to the user, i.e. to the browser.
(API-2) Since authors will need access to the alternative content tracks, too, the structure needs to be exposed to authors, too, which requires a dynamic interface.
(API-3) Accessibility APIs need to gain access to alternative content tracks, too, no matter whether those content tracks come from within a resource or are combined through markup on the page.
(API-4) External control devices, such as remote controls, need to be enabled to access alternative content tracks, too.

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	9
Accept this section with the following changes	7
Do not accept this section for the following reasons
Abstain

Details

Responder	Requirements on making properties available to the accessibility interface	Comments
Philip Jägenstedt	Accept this section with the following changes	(API-4) Strike this requirement, it isn't relevant for the web or the HTML WG.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	(API-4) Why is this an issue for the HTML WG?
aurélien levy	Accept this section as proposed
Marco Ranon	Accept this section as proposed
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section with the following changes	I also think we should remove (API-4).
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section with the following changes	It would be nice if industry would be consistent here. The U.S. 508 refresh refers to the accessibility interface as accessibility services. You cannot assume that media player is going to use accessibility services bundled with the platform, so: <change> For players running on platforms that support an accessibility API any media controls need to be connected to that API. </change> <to> For players running supporting accessibility API implemented for a platform any media controls need to be connected to that API. </to>
Laura Carlson	Accept this section with the following changes	I agree with Rich's change.
Sean Hayes	Accept this section with the following changes	API-1-4 I can see the need to know of the existence of the tracks, and select between them; but I don't see the need to expose the content of the tracks. Clarify the requirement to be the former. API-1 Don't understand
John Foliot	Accept this section with the following changes	I again voice my concern over scope-creep: "(API-4) External control devices, such as remote controls, need to be enabled to access alternative content tracks, too." - appears to be a user-agent device requirement and should already be addressed in the UAAG
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

16. Requirements on the use of the viewport

Refer to 3.7 Requirements on the use of the viewport.

The viewport, in particular that of the video, plays an important role with respect to alternative content technologies. Mostly it provides a bounding box for many of the visually represented alternative content technologies (e.g. captions, hierarchical navigation points, sign language), although some alternative content does not rely on that (e.g. audio descriptions, full transcripts).
One key principle to remember when designing player ‘skins’ is that the lower-third of the video may be needed for caption text (either in the video or as closed captions). Caption consumers rely on being able to make fast eye movements between the captions and the video content; if the captions are in a non-standard place, this may slow them down and cause them to miss information. The use of this area for other controls such as transport controls, whilst appealing aesthetically, may lead to accessibility issues.
Requirements

(VP-1) If alternative content has a different height or width to the media content, then the user agent will reflow the viewport. (UAAG 2.0 3.1.4).
(VP-2) The user can globally set the following characteristics of visually rendered text content, overriding any specified by the author or user agent defaults (UAAG 2.0 3.6.1). (Note: this should include captions):
(a) text scale (i.e., the general size of text),
(b) font family, and
(c) text color (i.e., foreground and background).

(VP-3) Provide the user with the ability to adjust the size of the time-based media up to the full height or width of the containing viewport, with the ability to preserve aspect ratio and to adjust the size of the playback viewport to avoid cropping, within the scaling limitations imposed by the media itself. (UAAG 2.0 4.9.9)
(VP-4) Provide the user with the ability to control the contrast and brightness of the content within the playback viewport. (UAAG 2.0 4.9.11)
(VP-5) Captions traditionally occupy the lower-third of the video - the use of this area for other controls or content needs to be avoided.
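The scaling behaviour described in VP-3 (enlarge up to the full height or width of the viewport, preserve aspect ratio, avoid cropping) amounts to a "fit inside" calculation. A minimal Python sketch follows; the function name `fit_to_viewport` and its parameters are hypothetical, and the "scaling limitations imposed by the media itself" mentioned in VP-3 are not modeled.

```python
from typing import Tuple


def fit_to_viewport(media_w: float, media_h: float,
                    viewport_w: float, viewport_h: float) -> Tuple[float, float]:
    """Scale media to the largest size that fits the viewport without cropping.

    The same scale factor is applied to width and height, so the aspect
    ratio of the media is preserved.
    """
    if min(media_w, media_h, viewport_w, viewport_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # The smaller of the two ratios is the largest scale that still fits.
    scale = min(viewport_w / media_w, viewport_h / media_h)
    return media_w * scale, media_h * scale
```

For a 640 by 360 video in a 1280 by 1024 viewport, the width is the limiting dimension, so the video is scaled by a factor of 2 to 1280 by 720 and the remaining vertical space is left free.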

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	10
Accept this section with the following changes	4
Do not accept this section for the following reasons
Abstain	1

(1 response didn't contain an answer to this question)

Details

Responder	Requirements on the use of the viewport	Comments
Philip Jägenstedt	Accept this section with the following changes	(VP-1) Text tracks don't have dimensions, they depend on the video track. If there are text formats where this is not true, those formats will not be used on the web. Remove this requirement. (VP-3) Can this be made more specific? Is it about being able to scale overlay sign language video tracks, or also about some other kind of track? (VP-4) Is not currently possible since the <video> can be drawn on a <canvas>, where the colors cannot be altered because of user preference. No UA has such an option for images, why is it a requirement for video? Suggest to remove this requirement. (VP-5) All browsers use this area for controls. What is the suggested alternative?
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Accept this section with the following changes	(VP-3) Need details about how this is supposed to work, eg. does the video pop up over the page content or does the rest of the page reflow? Is page zoom enough? "with the ability to preserve aspect ratio" - when would the user ever not want to preserve aspect ratio? (VP-5) All existing browsers, and many stand-alone media player applications, place controls along the bottom edge of the movie. Where should they go instead?
aurélien levy	Accept this section with the following changes	VP-5 may not be relevant in some cases, for example if the video contains essential information in this area, overlapping it with captions makes it impossible to see
Marco Ranon
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Accept this section as proposed
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Accept this section as proposed
Laura Carlson	Accept this section as proposed
Sean Hayes	Accept this section with the following changes	Again UA guidelines need to be introduced as recommended practice in a content spec.
John Foliot	Abstain	“(VP-1) If alternative content has a different height or width to the media content, then the user agent will reflow the viewport.” – this does not seem to account for a scenario when the view-port has already been maximized, but remains small due to device limitations. “(VP-4) Provide the user with the ability to control the contrast and brightness of the content within the playback viewport.” - appears to be a user-agent device requirement and should already be addressed in the UAAG. This is also to me a clear candidate for “SHOULD” language as it does not account for limitations of various devices
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

17. Requirements on the parallel use of alternate content on potentially multiple devices in parallel

Refer to 3.8 Requirements on the parallel use of alternate content on potentially multiple devices in parallel.

Multiple user devices must be directly addressable. It must be assumed that many users will have multiple video displays and/or multiple audio output devices attached to an individual computer, or addressable via LAN. It must be possible to configure certain types of media for presentation on specific devices, and these configuration settings must be readily overwritable on a case by case basis by users.
Requirements
Systems supporting multiple devices for accessibility must:

(MD-1) Support a platform accessibility architecture relevant to the operating environment. (UAAG 2.0 2.1.1)
(MD-2) Ensure accessibility of all user interface components including the user interface, rendered content, and alternative content, make available the name, role, state, value, and description via a platform accessibility architecture. (UAAG 2.0 2.1.2)
(MD-3) If a feature is not supported by the accessibility architecture(s), provide an equivalent feature that does support the accessibility architecture(s). Document the equivalent feature in the conformance claim. (UAAG 2.0 2.1.3)
(MD-4) If the user agent implements one or more DOMs, they must be made programmatically available to assistive technologies. (UAAG 2.0 2.1.4) This assumes the video element will write to the DOM.
(MD-5) If the user can modify the state or value of a piece of content through the user interface (e.g., by checking a box or editing a text area), the same degree of write access is available programmatically (UAAG 2.0 2.1.5).
(MD-6) If any of the following properties are supported by the accessibility platform architecture, make the properties available to the accessibility platform architecture (UAAG 2.0 2.1.6):
(a) the bounding dimensions and coordinates of rendered graphical objects
(b) font family of text
(c) font size of text
(d) foreground color of text
(e) background color of text.
(f) change state/value notifications

(MD-7) Ensure that programmatic exchanges between APIs proceed at a rate such that users do not perceive a delay. (UAAG 2.0 2.1.7).

Accept this section?

Summary

Choice	All responders
Choice	Results
Accept this section as proposed	9
Accept this section with the following changes	1
Do not accept this section for the following reasons	4
Abstain	1

(1 response didn't contain an answer to this question)

Details

Responder	Requirements on the parallel use of alternate content on potentially multiple devices in parallel	Comments
Philip Jägenstedt	Do not accept this section for the following reasons	(MD-1) Is not a reasonable requirement. What if the platform only has one accessibility architecture but it is too bad to be worth supporting? Overall, these requirements seem to simply be copied from UAAG and not specific to <video> or media at all. I suggest rewriting the section to be more specific about precisely how to integrate which features with accessibility frameworks.
Leif Halvard Silli	Accept this section as proposed
Gregory Rosmaita	Accept this section as proposed
Eric Carlson	Do not accept this section for the following reasons	(MD-3) I don't understand this, an example would be helpful. (MD-4) "This assumes the video element will write to the DOM" - does this mean the media element's properties are accessible via the DOM? (MD-5) "e.g., by checking a box ..." because this document is about media accessibility, the example should be about a media element (MD-7) I don't understand this.
aurélien levy	Accept this section as proposed
Marco Ranon
Masatomo Kobayashi	Accept this section as proposed
Denis Boudreau	Do not accept this section for the following reasons	I agree with Philip that these requirements don't seem to be related to media or <video> at all. More details need to be provided.
Joshue O'Connor	Accept this section as proposed
Jon Gunderson	Accept this section as proposed
Richard Schwerdtfeger	Do not accept this section for the following reasons	Again, now we call change from accessibility interfaces and API to architecture. Simply state: <change> If any of the following properties are supported by the accessibility platform architecture, make the properties available to the accessibility platform architecture (UAAG 2.0 2.1.6): </change> <to> If any of the following properties are supported by the accessibility API implemented for the platform, make the properties available to the accessibility platform architecture (UAAG 2.0 2.1.6): </to> UAAG 2.0 is not a recommendation and has not gone through full WAI review even once. The wording is not consistent with what is being specified for the 508 refresh. This adds to the confusion. Consequently, I have issues with a lot of the wording from UAAG 2.0 and have never had a chance to review it. While I support the intent of this section, tying it to UAAG 2.0 at this stage in the game is problematic.
Laura Carlson	Accept this section with the following changes	Needs to be reworked to be more applicable to media.
Sean Hayes	Abstain	Didn't get to this section.
John Foliot	Accept this section as proposed
Silvia Pfeiffer	Accept this section as proposed
Judy Brewer	Accept this section as proposed

More details on responses

Philip Jägenstedt: last responded on 28, May 2010 at 11:35 (UTC)
Leif Halvard Silli: last responded on 1, June 2010 at 17:37 (UTC)
Gregory Rosmaita: last responded on 1, June 2010 at 20:37 (UTC)
Eric Carlson: last responded on 2, June 2010 at 04:23 (UTC)
aurélien levy: last responded on 2, June 2010 at 09:29 (UTC)
Marco Ranon: last responded on 2, June 2010 at 14:59 (UTC)
Masatomo Kobayashi: last responded on 3, June 2010 at 04:47 (UTC)
Denis Boudreau: last responded on 3, June 2010 at 05:30 (UTC)
Joshue O'Connor: last responded on 3, June 2010 at 10:08 (UTC)
Jon Gunderson: last responded on 4, June 2010 at 15:57 (UTC)
Richard Schwerdtfeger: last responded on 6, June 2010 at 00:51 (UTC)
Laura Carlson: last responded on 6, June 2010 at 21:24 (UTC)
Sean Hayes: last responded on 7, June 2010 at 10:42 (UTC)
John Foliot: last responded on 7, June 2010 at 23:55 (UTC)
Silvia Pfeiffer: last responded on 8, June 2010 at 01:21 (UTC)
Judy Brewer: last responded on 8, June 2010 at 03:58 (UTC)

Everybody has responded to this questionnaire.
