Open audio descriptions are integrated into
the program audio track and are heard by everyone. They cannot
be turned off.
Closed audio descriptions can be turned on
and off by viewers.
Audio descriptions are usually timed to play during pauses or
breaks in narration or dialog, although extended
audio descriptions may be implemented where necessary.
In cases where no pauses are available, a single summary,
called a pre-description, can be inserted at the beginning of
the video.
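The timing rule above can be illustrated with a short sketch. It assumes the narration and dialog are already available as a list of start and end times in seconds; the names `DialogSpan`, `PauseGap`, `findPauses`, and `minPauseSeconds` are hypothetical and not part of any tool mentioned in this tutorial.

```typescript
// Hypothetical shape: one stretch of narration or dialog, in seconds.
interface DialogSpan {
  start: number;
  end: number;
}

// A silent gap between stretches of dialog, in seconds.
interface PauseGap {
  start: number;
  end: number;
  length: number;
}

// Return every pause of at least minPauseSeconds.
// Spans may arrive unsorted and may overlap.
function findPauses(
  dialog: DialogSpan[],
  videoDuration: number,
  minPauseSeconds: number
): PauseGap[] {
  const sorted = dialog.slice().sort((a, b) => a.start - b.start);
  const found: PauseGap[] = [];
  let cursor = 0; // end of the latest dialog seen so far

  for (const span of sorted) {
    if (span.start - cursor >= minPauseSeconds) {
      found.push({ start: cursor, end: span.start, length: span.start - cursor });
    }
    cursor = Math.max(cursor, span.end);
  }
  // Trailing silence after the last line of dialog.
  if (videoDuration - cursor >= minPauseSeconds) {
    found.push({ start: cursor, end: videoDuration, length: videoDuration - cursor });
  }
  return found;
}

const examplePauses = findPauses(
  [{ start: 2, end: 10 }, { start: 14, end: 20 }, { start: 20.5, end: 30 }],
  35,
  2
);
console.log(examplePauses);
```

An empty result would correspond to the case where no pauses are available, which is when a pre-description or extended descriptions come into play.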
Audio-description tracks can be presented as pre-recorded
human speech or text-to-speech (TTS) audio, or they can
be invisible text tracks that are delivered during playback and
read aloud by screen readers.
Most described content today is presented with open descriptions,
using one of two options:
Two separate videos, one with open descriptions, and the other
with no descriptions. Authors give users a link or some other
method to choose one or the other.
A single video that contains two audio tracks, one with
descriptions and one without. Authors give users a button or
menu to switch from one track to the other.
Pre-produced audio descriptions
Describing a video can be a time-consuming and complex process,
depending on the subject matter. Before beginning, take a look at
the description decision tree to
determine if descriptions are even necessary. For longer videos,
it may be more time- and cost-efficient to hire a professional
audio-description service provider to write and record
the descriptions.
Descriptions are usually recorded as human narration before being
integrated into the video presentation, but technology and markup
now exist to convey descriptions as text that is read aloud on
the fly by screen readers or other text-to-speech (TTS) methods.
Read more about text-to-speech descriptions.
Production workflow: audio descriptions
Basic workflow for creating pre-produced audio descriptions:
When recording the descriptions, it will pay to create the
highest-quality audio files possible. Keep these points in mind:
Use the highest-quality microphone and recording software
available.
Use a microphone stand and speak clearly into the microphone.
Record the descriptions in a room that is isolated from all
external noise.
Avoid rooms with hard surfaces (e.g., tile or wood floors).
When mixing the descriptions into the program audio, lower the
program-audio level when the description plays while
simultaneously raising the description's audio level. When the
description is finished playing, lower the description audio
level and raise the program-audio level to its proper setting.
Repeat this process (known as "ducking") for every description.
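As a rough illustration of ducking, the sketch below computes the program-audio gain at any moment from a list of description times. The gain values, the half-second fade, and the names `DescriptionClip` and `programGainAt` are assumptions chosen for the example; in practice the levels are set by ear in the audio editor.

```typescript
// Hypothetical shape: when one recorded description plays, in seconds.
interface DescriptionClip {
  start: number;
  end: number;
}

// Assumed levels (linear gain, 0 to 1) and fade time; tune by ear.
const FULL_PROGRAM_GAIN = 1.0;
const DUCKED_PROGRAM_GAIN = 0.3;
const FADE_SECONDS = 0.5;

// Program-audio gain at time t: full level away from descriptions,
// ducked while one plays, with a linear fade on either side.
function programGainAt(t: number, clips: DescriptionClip[]): number {
  const range = FULL_PROGRAM_GAIN - DUCKED_PROGRAM_GAIN;
  let gain = FULL_PROGRAM_GAIN;

  for (const clip of clips) {
    let clipGain = FULL_PROGRAM_GAIN;
    if (t >= clip.start && t <= clip.end) {
      // The description is playing: hold the ducked level.
      clipGain = DUCKED_PROGRAM_GAIN;
    } else if (t > clip.start - FADE_SECONDS && t < clip.start) {
      // Fading down just before the description starts.
      const progress = (t - (clip.start - FADE_SECONDS)) / FADE_SECONDS;
      clipGain = FULL_PROGRAM_GAIN - progress * range;
    } else if (t > clip.end && t < clip.end + FADE_SECONDS) {
      // Fading back up just after the description ends.
      const progress = (t - clip.end) / FADE_SECONDS;
      clipGain = DUCKED_PROGRAM_GAIN + progress * range;
    }
    // If fades from neighboring clips overlap, keep the lower level.
    gain = Math.min(gain, clipGain);
  }
  return gain;
}

const exampleClips: DescriptionClip[] = [{ start: 4, end: 8 }];
console.log(programGainAt(0, exampleClips)); // 1
console.log(programGainAt(5, exampleClips)); // 0.3
```

The description's own level would follow the opposite pattern: raised while the clip plays and lowered otherwise.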
Production workflow: audio descriptions (TTS)
TTS descriptions are not pre-recorded. Instead, they are
transmitted at the proper intervals to users during playback, and
are read aloud by the user's screen reader. Think of them as an
invisible text track that screen readers can read aloud as the
text is delivered. See examples
of TTS descriptions. The basic workflow for TTS audio
descriptions generally follows this pattern:
Basic workflow for creating pre-produced TTS audio descriptions:
[Image: a caption editor being used to timestamp an audio description.]
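A timestamped description file can also be produced programmatically. The sketch below assumes the descriptions are stored in WebVTT, the timed-text format commonly used with the track element; this tutorial does not name a format, and `TextDescription`, `toVttTimestamp`, and `buildDescriptionVtt` are hypothetical names.

```typescript
// Hypothetical shape: one timed text description, times in seconds.
interface TextDescription {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(value: number, width: number): string {
  let out = String(value);
  while (out.length < width) {
    out = "0" + out;
  }
  return out;
}

// Format seconds as a WebVTT timestamp, hh:mm:ss.ttt.
function toVttTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${zeroPad(h, 2)}:${zeroPad(m, 2)}:${zeroPad(s, 2)}.${zeroPad(ms, 3)}`;
}

// Build the file contents: a WEBVTT header, then one cue per description,
// with a blank line between blocks.
function buildDescriptionVtt(items: TextDescription[]): string {
  const cues = items.map(
    (d) => `${toVttTimestamp(d.start)} --> ${toVttTimestamp(d.end)}\n${d.text}`
  );
  return ["WEBVTT"].concat(cues).join("\n\n") + "\n";
}

// Placeholder description text, for illustration only.
const exampleVtt = buildDescriptionVtt([
  { start: 3.5, end: 7, text: "A woman enters the lab." },
]);
console.log(exampleVtt);
```

This sketch does not validate the description text; WebVTT cue text cannot contain blank lines or the string `-->`, so real input would need checking.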
Using the track element with its kind attribute set to
descriptions, the descriptions can be delivered at the time of
playback so that a screen reader can read them aloud.
This kind value causes the description file
to be received invisibly (e.g., off-screen) so sighted users will
not see it, but screen readers will be aware of it. Screen readers
then read the description text as it is delivered,
synchronized with playback. Read more about techniques
for delivering TTS descriptions.
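To make the markup concrete, the following sketch assembles a video element with a descriptions track as a string. The file names, label, and the helper names `DescribedVideoOptions`, `escapeAttribute`, and `buildDescribedVideoMarkup` are placeholders invented for the example; only the track element and its `kind`, `src`, `srclang`, and `label` attributes come from HTML itself.

```typescript
// All file names and labels used with this type are placeholders.
interface DescribedVideoOptions {
  videoSrc: string;
  descriptionsSrc: string; // timed text file holding the descriptions
  language: string; // language tag such as "en"
  label: string; // human-readable track name
}

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttribute(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build markup for a video whose descriptions arrive as a text track.
function buildDescribedVideoMarkup(options: DescribedVideoOptions): string {
  const src = escapeAttribute(options.descriptionsSrc);
  const lang = escapeAttribute(options.language);
  const label = escapeAttribute(options.label);
  return [
    `<video controls src="${escapeAttribute(options.videoSrc)}">`,
    `  <track kind="descriptions" src="${src}" srclang="${lang}" label="${label}">`,
    `</video>`,
  ].join("\n");
}

const exampleMarkup = buildDescribedVideoMarkup({
  videoSrc: "lecture.mp4", // placeholder
  descriptionsSrc: "lecture-descriptions.vtt", // placeholder
  language: "en",
  label: "English descriptions",
});
console.log(exampleMarkup);
```

Writing the same markup by hand works equally well; the helper only shows which attributes are involved.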
Unfortunately, practical screen-reader/native-browser support for
delivery of TTS descriptions using track and kind
does not exist. See functioning
examples of TTS descriptions using the track element [...]
will read off-screen descriptions aloud.
Typically, descriptions are written to fit into natural pauses in
narration or dialog. However, there will be circumstances where
the pauses are not long enough to accommodate a full description.
In these cases, extended descriptions may be implemented. In an
extended description, the video and audio tracks are
programmatically paused when the description begins playing. When
the description is finished playing, the video and audio tracks
are programmatically resumed. At the next instance of an extended
description, the process is repeated. Note that extended and
"regular" descriptions may be mixed in a single multimedia
presentation.
These tutorials provide best-practice guidance on
implementing accessibility in different situations. This page
combines the following WCAG 2.0 success criteria and
techniques from different conformance levels:
Audio Description or Media Alternative (Prerecorded): An
alternative for time-based media or audio description of
the prerecorded video content is provided for synchronized
media, except when the media is a media alternative for
text and is clearly labeled as such. (Level A)
Extended Audio Description (Prerecorded): Where pauses
in foreground audio are insufficient to allow audio
descriptions to convey the sense of the video, extended
audio description is provided for all prerecorded video
content in synchronized media. (Level AAA)