Techniques for WCAG 2.0

Skip to Content (Press Enter)


G203: Using a static text alternative to describe a talking head video

Important Information about Techniques

See Understanding Techniques for WCAG Success Criteria for important information about the usage of these informative techniques and how they relate to the normative WCAG 2.0 success criteria. The Applicability section explains the scope of the technique, and the presence of techniques for a specific technology does not imply that the technology can be used in all situations to create content that meets WCAG 2.0.


Videos of only a speaker

This technique relates to:


The purpose of this technique is to provide an alternative to audio description for synchronized media that has no important time based information contained in the video portion of the media. This particularly applies to "talking head" videos where a person is talking in front of an unchanging background, such as a press conference, company president talk, or government announcement, etc. In this case there are no "important visual details" which would warrant audio description.

Audio description is not necessary when there is one person speaking against an unchanging background because there is no time-based visual information in the video that is "important" to the understanding of the content. The environment is static and therefore can be described in a non-multimedia static format such as alternative text that is programmatically associated with the video.

All that is necessary in this case is a static text alternative which would contain a general description of the context of the environment, any opening/closing credits, any text that appears in the bottom of the video with the name of the speaker, and other basic information, if these are seen on the screen and cannot be heard in the audio.

This technique does NOT apply to a situation where there are multiple speakers and where the identity of each new speaker is not evident in the audio track but is identified on screen with visual text as they speak. In this case, audio description should be used, and this technique would not apply.


Example 1: A video of a CEO speaking to shareholders

A CEO is speaking to shareholders from his office. The video has a title page at the beginning of the video giving the date. When the speaker begins, there is a strip of text at the bottom of the video saying "John Doe, President of XYZ Cooperation". At the end of the video are title credits that say "produced by the Honest TV Productions Ltd."

As an alternative, there is a paragraph below the video which is associated with the video file using aria-describedby which says: "July 22, 2011, John Doe, President of XYZ cooperation, speaking from his office. Video produced by produced by the Honest TV Productions Ltd."



  1. Check that there is no important time-based information in the video track

  2. Check that the programmatically associated description of the media contains any context of the content that is not contained in the audio track (e.g. speaker identification, credits, context)

Expected Results

If this is a sufficient technique for a success criterion, failing this test procedure does not necessarily mean that the success criterion has not been satisfied in some other way, only that this technique has not been successfully implemented and can not be used to claim conformance.