Audio Content and Video Content

in Making Audio and Video Media Accessible


This page addresses accessibility considerations when planning, scripting, storyboarding, recording, and producing audio and video.

Some of the guidance below is related to requirements in Web Content Accessibility Guidelines (WCAG) and has links to a separate resource. (The Planning page of this resource introduces the WCAG Standard.) Other guidance is good practice.

Additional guidance is in the resource Making Events Accessible - Checklist for meetings, conferences, training, and presentations that are remote/virtual, in-person, or hybrid.


Create high-quality audio – recording setup

Use low background audio – recording, post-production (WCAG AAA)

When the main audio is a person speaking and you have background music, set the levels so people with hearing or cognitive disabilities can easily distinguish the speaking from the background.

Specifically, make the background sounds at least 20 decibels lower than the foreground speech content (with the exception of occasional sounds that last for only one or two seconds).
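As a rough illustration of the 20 decibel guideline, the sketch below compares the level of a speech track with the level of a background track. It assumes both tracks are available as separate lists of sample amplitudes and that the background is not silent. The function names and sample values are hypothetical. Measuring a single RMS level over the whole clip is a simplification: it does not model the exception for occasional short sounds, and production tools usually use dedicated loudness metering.

```python
import math

def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(speech, background):
    """Decibels by which the speech level exceeds the background level."""
    # Assumes the background is not completely silent (non-zero RMS).
    return 20 * math.log10(rms(speech) / rms(background))

def background_is_low_enough(speech, background, required_db=20.0):
    """True if the background is at least `required_db` below the speech."""
    return level_difference_db(speech, background) >= required_db

# Hypothetical sample values, not real recordings.
speech = [0.8, -0.8, 0.8, -0.8]
quiet_music = [0.04, -0.04, 0.04, -0.04]  # amplitude ratio 20:1, roughly 26 dB lower
loud_music = [0.4, -0.4, 0.4, -0.4]       # amplitude ratio 2:1, roughly 6 dB lower

print(background_is_low_enough(speech, quiet_music))
print(background_is_low_enough(speech, loud_music))
```

An amplitude ratio of 10:1 corresponds to 20 decibels, so the background needs to be about one tenth of the speech amplitude or lower.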

Avoid sounds that can be distracting or irritating, such as some high pitches and repeating patterns.

More information is in Understanding Success Criterion 1.4.7: Low or No Background Audio (AAA).

Speak clearly and slowly – speakers

Speak clearly. This is important for people wanting to understand the content, and for captioners.

Speak as slowly as appropriate. This will enable listeners to understand better, and make the timing better for captions and sign language.

Give people time to process information – speakers, post-production

Pause between topics.

Use clear language – script

Avoid or explain jargon, acronyms, and idioms. For example, expressions such as “raising the bar” can be interpreted literally by some people with cognitive disabilities and can be confusing.

Provide redundancy for sensory characteristics – script (WCAG A)

Make your information work for people who cannot see and/or cannot hear.

For example, instead of saying:

Attach this to the green end.

say:

Attach the small ring to the green end, which is the larger end.

More information that primarily addresses web pages, yet is relevant to audio and video, is in Understanding Success Criterion 1.3.3: Sensory Characteristics (A).


Avoid causing seizures – storyboarding, post-production (WCAG A)

Avoid anything that flashes more than three times in any one-second period.
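The counting part of this rule can be sketched as a sliding-window check. The example assumes the flashes have already been detected and are given as a list of onset times in seconds; how flashes are detected is outside this sketch, and the function name is hypothetical. It only counts flashes. The WCAG success criteria also define thresholds based on flash area and brightness, which are not modeled here.

```python
def exceeds_flash_limit(flash_times, max_flashes=3, window=1.0):
    """Return True if more than `max_flashes` fall within any `window` seconds.

    `flash_times` is a list of flash onset times in seconds (assumed input).
    """
    times = sorted(flash_times)
    start = 0
    for end, t in enumerate(times):
        # Drop flashes that are a full window or more before the current one.
        while t - times[start] >= window:
            start += 1
        # Count the flashes left in the current window.
        if end - start + 1 > max_flashes:
            return True
    return False

# Hypothetical timings for illustration.
print(exceeds_flash_limit([0.0, 0.2, 0.4, 0.6]))  # four flashes within 0.6 seconds
print(exceeds_flash_limit([0.0, 0.5, 1.0, 1.5]))  # never more than two per second
```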

More information is in Understanding Success Criterion 2.3.2: Three Flashes (AAA) and Understanding Success Criterion 2.3.1: Three Flashes or Below Threshold (A).

Consider speaker visibility – storyboarding, recording, post-production

Some people use mouth movement to help understand spoken language. When feasible, ensure that the speaker’s face is visible and in good light.

Make overlay text readable – storyboarding, post-production (WCAG AA, AAA)

For any text, consider the font family, size, and contrast between the text and background.

More information is in Understanding Success Criterion 1.4.3: Contrast (Minimum) (AA) and Understanding Success Criterion 1.4.6: Contrast (Enhanced) (AAA).
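The contrast success criteria are based on a contrast ratio computed from the relative luminance of the text and background colors. The sketch below follows the WCAG 2 formula and checks a color pair against the 4.5:1 (AA) and 7:1 (AAA) thresholds for normal-size text. The function names are hypothetical. For video, it assumes you sample a representative background color behind the overlay text; a moving background may need checking at several points.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0-255."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors, from 1 (none) up to about 21."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)

def contrast_level(text_rgb, background_rgb):
    """Return 'AAA', 'AA', or 'fail' for normal-size text."""
    ratio = contrast_ratio(text_rgb, background_rgb)
    if ratio >= 7:
        return "AAA"
    if ratio >= 4.5:
        return "AA"
    return "fail"

# Hypothetical overlay colors for illustration.
print(contrast_level((255, 255, 255), (0, 0, 0)))        # white text on black
print(contrast_level((118, 118, 118), (255, 255, 255)))  # mid-gray text on white
print(contrast_level((119, 119, 119), (255, 255, 255)))  # slightly lighter gray
```

Large text has lower thresholds (3:1 for AA and 4.5:1 for AAA), which this sketch does not cover.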

Plan for sign language – storyboarding, script, recording (WCAG AAA)

Often sign languages are provided as an overlay in the bottom right corner of videos. For example: NHS 111 British Sign Language (BSL) Advert (YouTube)

Plan for the video not to include important information that would be obstructed by a sign language overlay.

For other guidance, including guidance on recording, see another page of this resource: Sign Languages.

Plan for description of visual information – storyboarding, recording (WCAG A, AA)

Description provides content to people who are blind and others who cannot see the video adequately. It describes the visual information needed to understand the content, including text displayed in the video.

Plan to either:

Integrate description

For some videos, such as presentations and instructional videos, the best way to handle description is not to need it at all — that is, all the visual information that users need to understand the content is integrated in the main audio. This is called “integrated description”. When planned in advance, this is fairly simple for many videos on the web. For example:

| Instead of the speaker saying: | The speaker can say: |
| --- | --- |
| As you can see on this chart, sales increased significantly from the first quarter to the second quarter. | This chart shows that sales increased significantly, from 1 million in the first quarter to 1.3 million in the second quarter. |
| Whip the mixture until it looks like this. | Whip the mixture until the oil, vinegar, and spices are well combined. |
| Attach this to the green end. | Attach the small ring to the green end, which is the larger end. |

Here is an example training video with the description integrated in what the trainer is saying (YouTube)

If you want guidance on what to include in description, see the “Description of Visual Information” page, Tips for Writing Description section.

Time for description

For some types of videos, such as dramas, the description of the visual information cannot be smoothly handled by the speakers in the main video. For those videos, the description will be separate.

Where the description is fairly short, plan space in the audio to add the description.

Where the description is longer than the space you want to leave in the main audio, you can record extra footage for the scene so that the description fits without pausing the video. That is, the scene is shorter in the main video, and in the described version the same scene runs a little longer at its beginning or end. For example:

| Narration | Main Video Scene Duration | Described Video Scene Duration | Description |
| --- | --- | --- | --- |
| Captions are also handy for people who want to watch video in loud environments. | 3 seconds | 7 seconds | A man is watching the captioned video with a group of people chatting away next to him. |
| Or where you need to be very, very quiet. | 2 seconds | 5 seconds | Turns out that they are in a library. The group is shushed by the librarian. |

An example of this is the Web Accessibility Perspectives: Video Captions video. The main video is 48 seconds long. The described version is 1 minute and 18 seconds long, yet there are no pauses in the visual aspect of the video.
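When planning a shoot, it can help to work out how much extra footage each scene needs for the described version. The sketch below does that arithmetic using the two scene durations and the overall video lengths given above. The `Scene` class and the scene labels are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Scene:
    label: str
    main_seconds: float       # scene length in the main video
    described_seconds: float  # scene length in the described version

    @property
    def extra_seconds(self) -> float:
        """Extra footage to record so the description fits."""
        return self.described_seconds - self.main_seconds

# Durations taken from the two example scenes above.
scenes = [
    Scene("loud environment", 3, 7),
    Scene("library", 2, 5),
]

for scene in scenes:
    print(f"{scene.label}: record {scene.extra_seconds} extra seconds")

# Extra footage across the two example scenes.
total_extra = sum(scene.extra_seconds for scene in scenes)

# Whole-video difference: 1 minute 18 seconds described versus 48 seconds main.
whole_video_extra = (60 + 18) - 48
print(f"Whole video: {whole_video_extra} extra seconds in the described version")
```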

More about description

More information is in the next page of this resource: Description of Visual Information.
