Multimedia-related UA Techniques

Textual equivalents for audio and video

Textual equivalents present the information carried by audio and video (which usually includes audio tracks) in textual form. They let the user understand the communicated information without hearing the audio, seeing the video, or both. Occasionally, textual equivalents also offer the information in several languages.

Speech and sound effects presented as audio, or as a single audio track of a video, can readily be written down as captions or transcripts. When these are presented, the user needs to be able to synchronize the text equivalent with the original audio or video material. The user also needs to be able to control the position of the captions or transcripts so that they are comfortable to read and do not obscure the original material.

[Checkpoint 5.2.5] Allow the user to specify that captions (MRK: or transcripts) for audio (MRK: or audio track in a video) be rendered at the same time as the audio. [Priority 2]
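
One way a user agent might keep captions in step with the audio is to look up the caption that covers the current playback time. The following is a minimal sketch, not a prescribed implementation. The Cue type, the active_cue function, and the assumption that cues are sorted by start time and do not overlap are all hypothetical and are introduced here only for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cue:
    """One caption with its display interval, in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


def active_cue(cues: List[Cue], t: float) -> Optional[Cue]:
    """Return the cue to render at playback time t, or None if no cue applies.

    Assumes cues are sorted by start time and do not overlap.
    """
    starts = [c.start for c in cues]
    # Index of the last cue whose start time is <= t.
    i = bisect_right(starts, t) - 1
    if i >= 0 and t < cues[i].end:
        return cues[i]
    return None
```

A player would call active_cue with the current playback time on each timer tick and render the returned text, or clear the caption area when the result is None.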

[Checkpoint 5.1.10] Allow the user to control the position of captions (MRK: and transcripts? What about descriptions? Or textual equivalents?). [Priority 1]
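
Position control can be sketched as a function that turns a user preference into a caption rectangle. The example below is illustrative only. The Rect type, the caption_rect function, the "top" and "bottom" values, and the overlay flag are assumptions made for this sketch. With overlay turned off, the caption is placed outside the video area so that it does not obscure the original material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Screen rectangle in pixels, origin at the top-left (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def caption_rect(video: Rect, position: str, caption_height: int,
                 overlay: bool = False) -> Rect:
    """Compute where to draw captions relative to the video area.

    position is the user's choice, "top" or "bottom". With overlay=False the
    caption sits just outside the video; with overlay=True it is drawn over
    the corresponding edge of the video.
    """
    if position not in ("top", "bottom"):
        raise ValueError("position must be 'top' or 'bottom'")
    if position == "top":
        y = video.y if overlay else video.y - caption_height
    else:
        bottom = video.y + video.height
        y = bottom - caption_height if overlay else bottom
    return Rect(video.x, y, video.width, caption_height)
```

For a video at Rect(0, 100, 640, 360), a user who chooses "bottom" without overlay gets a 40-pixel caption area starting directly under the video at y = 460.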

In addition to speech and sound, the visual information in a video can also be written down as a textual description, which can then be presented as text or as audio. Descriptions make matters more complex because they add more combinations for presenting the original information. Several textual equivalents may need to be presented to the user at the same time. The following example concerns a video of a mathematics lecture.

The video shows a mathematics professor who writes complex equations and graphs on the overhead while discussing them with the students. She does not say aloud what she writes on the overhead. People with visual impairments, and anyone who cannot see the video for other reasons, badly need a description of the video.

In the previous example, both the textual description and the original audio need to be presented as audio. In other cases it might be beneficial to have the textual description synchronized with the presentation of the original video. For instance, the description could explain cultural habits or draw attention to details that are otherwise hard to understand. In this case the user needs to be able to specify that the description be synchronized with the visual information in the video.

[Checkpoint 5.2.8] Allow the user to specify that captions or descriptions for video be rendered at the same time as the video. [Priority 1]


Audio equivalents for video

A user of an audio description might choose either to watch the video at the same time or only to hear the description. If the video is shown, the user should be able to control the synchronization of the audio description with the video.

[Checkpoint 5.2.9] Allow the user to specify that audio descriptions of video be rendered at the same time as the video. [Priority 2]

If the user needs to hear both the audio description and the original audio tracks of the video, the user should be given easy controls for repeating the last part of the audio, muting or raising the volume of the individual tracks, and listening to the tracks sequentially in suitable blocks, e.g. paragraph after paragraph.
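
The controls described above could be modelled roughly as follows. This is a sketch under stated assumptions, not a definitive design. The Track and Mixer classes, the replay_position and sequential_order functions, and the track names "original" and "description" are hypothetical. Actual audio output is left out, and only the control logic is shown.

```python
from dataclasses import dataclass
from itertools import zip_longest
from typing import Dict, List, Tuple


@dataclass
class Track:
    """Per-track playback settings (hypothetical type)."""
    name: str
    volume: float = 1.0   # 0.0 (silent) to 1.0 (full)
    muted: bool = False

    def gain(self) -> float:
        """Effective volume after applying the mute switch."""
        return 0.0 if self.muted else self.volume


class Mixer:
    """Holds independent volume and mute settings for each audio track."""

    def __init__(self, names: List[str]) -> None:
        self.tracks: Dict[str, Track] = {n: Track(n) for n in names}

    def set_volume(self, name: str, volume: float) -> None:
        # Clamp to the valid range instead of rejecting the request.
        self.tracks[name].volume = min(1.0, max(0.0, volume))

    def mute(self, name: str, muted: bool = True) -> None:
        self.tracks[name].muted = muted

    def gains(self) -> Dict[str, float]:
        return {n: t.gain() for n, t in self.tracks.items()}


def replay_position(current: float, seconds_back: float) -> float:
    """Playback time to seek to when repeating the last part of the audio."""
    return max(0.0, current - seconds_back)


def sequential_order(original: List[str],
                     description: List[str]) -> List[Tuple[str, str]]:
    """Alternate blocks of the two tracks: original block, then its description."""
    order: List[Tuple[str, str]] = []
    for o, d in zip_longest(original, description):
        if o is not None:
            order.append(("original", o))
        if d is not None:
            order.append(("description", d))
    return order
```

Here the blocks are plain identifiers such as paragraph labels. A real user agent would map each block to a time range in the corresponding track and play the ranges in the returned order.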


(MRK: These checkpoints also relate to these issues, but I guess we only refer to them once?)

[Checkpoint 5.2.7] If a technology allows for more than one audio track for video, allow the user to choose from among tracks. [Priority 1]

[Checkpoint 5.2.4] If a technology allows for more than one caption or description track for audio, allow the user to choose from among tracks. [Priority 1]
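
Track selection might be sketched as follows: the user agent lists the available tracks of a given kind so that the user can choose, and preselects one based on the user's language preferences. The MediaTrack type, the kind values, and the fallback to the first available track are assumptions made for illustration. An explicit choice by the user would always take precedence over the preselection.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class MediaTrack:
    """One selectable track (hypothetical type)."""
    kind: str       # e.g. "audio", "caption", or "description"
    language: str   # language code such as "en" or "fi"
    label: str      # name shown to the user


def available_tracks(tracks: Sequence[MediaTrack], kind: str) -> List[MediaTrack]:
    """Tracks of one kind, in document order, for presentation in a menu."""
    return [t for t in tracks if t.kind == kind]


def choose_track(tracks: Sequence[MediaTrack], kind: str,
                 preferred_languages: Sequence[str]) -> Optional[MediaTrack]:
    """Preselect a track of the given kind by language preference.

    Falls back to the first track of that kind when no preferred language
    matches, and returns None when there is no track of that kind at all.
    """
    candidates = available_tracks(tracks, kind)
    for lang in preferred_languages:
        for t in candidates:
            if t.language == lang:
                return t
    return candidates[0] if candidates else None
```

For example, a user whose preferences are Finnish followed by English would get the Finnish audio track when one exists and the English one otherwise.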