This Wiki page is edited by participants of the WCAG Working Group. It does not necessarily represent consensus and it may have incorrect information or information that is not supported by other Working Group participants, WAI, or W3C. It may also have some very useful information.

Using the track element to provide captions

Revision as of 16:50, 2 April 2012 by Lguarino


  • New technique.

This technique would be listed as one of the technology-specific techniques implementing G87, listed in the third bullet of sufficient techniques in Understanding 1.2.2.


  • HTML5

WCAG references

Note: This technique is tentatively listed as applicable to SC 1.2.4 Captions (Live). I do not think it will prove to be applicable to live captions, so I have not listed that success criterion here. Live captions will be supported by the HTML5 technique for using the TextTrack JavaScript API.

User Agent and Assistive Technology Support Notes

In addition to evaluating the support for the track element in user agents, do we also need to assess the conformance of the player used to activate captions? If custom controls are needed to provide access, do we need to create a technique for the controls and include them as part of the sufficient technique?


The objective of this technique is to use the HTML5 track element to specify a caption timed text track for a video element. Caption timed text tracks contain transcription or translation of the dialogue, sound effects, relevant musical cues, and other relevant audio information, suitable for when sound is unavailable or not clearly audible.

The src attribute of the track element is a URL that is the address of the text track data.

The kind attribute of the track element indicates the kind of information in the timed text. A captions text track provides the dialogue and also includes other sounds important to understanding the video. A subtitles text track contains only the dialogue. If other audio information is important to understanding the video, a subtitles track will not be sufficient to meet the success criterion. Note that both attribute values are plural: kind=captions and kind=subtitles.

Note: Some cultures use the term "subtitle" for any visible text representation of the audio track. An author may mark up a timed text track in the language of the audio track as kind=subtitles, instead of kind=captions, and may include additional relevant audio information. It is not best practice to use subtitles in this situation, since it may confuse users who are trying to find captions. But such a timed text track would meet the requirements of Success Criterion 1.2.2.
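
The distinction between the two kind values can be sketched as a small review helper. This is only an illustration, not part of the technique: the function names and the outcome strings are hypothetical, and the fallback rules (a missing kind is treated as subtitles, an unrecognized value as metadata) reflect my reading of the current HTML specification, which may differ from the draft that was current when this page was written.

```javascript
// Keywords defined for the kind attribute of the track element.
const TRACK_KINDS = ["subtitles", "captions", "descriptions", "chapters", "metadata"];

// Hypothetical helper: resolve a raw kind attribute value to the state a
// user agent would use (assumption: defaults as in the current HTML spec).
function effectiveTrackKind(rawKind) {
  if (rawKind === null || rawKind === undefined) {
    return "subtitles"; // missing value default
  }
  const keyword = String(rawKind).toLowerCase(); // keywords match case-insensitively
  return TRACK_KINDS.includes(keyword) ? keyword : "metadata"; // invalid value default
}

// Hypothetical helper: map a kind value to a review outcome for SC 1.2.2.
function captionReviewOutcome(rawKind) {
  switch (effectiveTrackKind(rawKind)) {
    case "captions":
      return "caption track";
    case "subtitles":
      // May still meet SC 1.2.2, but only a person can confirm that
      // non-dialogue audio information is included.
      return "manual review needed";
    default:
      return "not a caption track";
  }
}

console.log(captionReviewOutcome("captions"));  // caption track
console.log(captionReviewOutcome("subtitles")); // manual review needed
console.log(captionReviewOutcome("caption"));   // not a caption track
```

Under these assumptions, a misspelled value such as the singular "caption" does not fall back to captions, so the track would not be offered to users as captions.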


Example 1

A video element for a video in the English language with an English captions track. SRT (srt) is a common caption file format.

  <video poster="myvideo.png" controls>
    <source src="myvideo.mp4" lang="en" type="video/mp4">
    <track src="" kind="captions" srclang="en" label="English">
  </video>

Example 2

A video element for a video with both an English and a French language source element, and with an English captions track and a French captions track. WebVTT (vtt) and Timed Text Markup Language (ttml) are different caption file formats.

  <video poster="myvideo.png" controls>
    <source src="myvideo.mp4" lang="en" type="video/mp4">
    <source src="myvideo.webm" lang="fr" type="video/webm">
    <track src="myvideo_en.vtt" kind="captions" srclang="en" label="English">
    <track src="myvideo_fr.ttml" kind="captions" srclang="fr" label="French">
  </video>


Related Techniques



For the video element used to play a video:

  1. Check that the video element contains a track element of kind captions in the language of the video.

Expected Results

  • Check that #1 is true
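
As a rough aid for evaluators, step 1 of the procedure could be partly automated along the following lines. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the language of the video has to be supplied by the evaluator, and languages are compared on their primary subtag only (so "en-US" matches "en"). It does not check that the caption file exists or that its content is accurate, so it does not replace a manual review.

```javascript
// Hypothetical helper: primary language subtag in lower case ("en-GB" -> "en").
function primaryLanguage(tag) {
  return String(tag || "").split("-")[0].toLowerCase();
}

// Hypothetical check for step 1: true when at least one track has kind
// "captions" and an srclang matching the language of the video.
// `tracks` is a list of plain objects of the form { kind, srclang }.
function hasCaptionTrack(tracks, videoLanguage) {
  const wanted = primaryLanguage(videoLanguage);
  if (wanted === "") {
    return false; // the language of the video must be known
  }
  return tracks.some(
    (track) =>
      String(track.kind || "").toLowerCase() === "captions" &&
      primaryLanguage(track.srclang) === wanted
  );
}

// In a browser, the list could be collected from a video element with:
//   Array.from(video.querySelectorAll("track"), (t) => ({
//     kind: t.getAttribute("kind"),
//     srclang: t.getAttribute("srclang"),
//   }));
const exampleTracks = [
  { kind: "captions", srclang: "en" },
  { kind: "captions", srclang: "fr" },
];

console.log(hasCaptionTrack(exampleTracks, "en-US")); // true
console.log(hasCaptionTrack(exampleTracks, "de"));    // false
```

The check reads the kind attribute as written rather than the state resolved by the user agent, so a track with a missing or misspelled kind is reported as failing, which seems the safer outcome for a review tool.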