Media elements (audio and video, in this specification) implement the following interface:
interface HTMLMediaElement : HTMLElement {
  // error state
  readonly attribute MediaError? error;

  // network state
  attribute DOMString src;
  readonly attribute DOMString currentSrc;
  attribute DOMString crossOrigin;
  const unsigned short NETWORK_EMPTY = 0;
  const unsigned short NETWORK_IDLE = 1;
  const unsigned short NETWORK_LOADING = 2;
  const unsigned short NETWORK_NO_SOURCE = 3;
  readonly attribute unsigned short networkState;
  attribute DOMString preload;
  readonly attribute TimeRanges buffered;
  void load();
  DOMString canPlayType(DOMString type);

  // ready state
  const unsigned short HAVE_NOTHING = 0;
  const unsigned short HAVE_METADATA = 1;
  const unsigned short HAVE_CURRENT_DATA = 2;
  const unsigned short HAVE_FUTURE_DATA = 3;
  const unsigned short HAVE_ENOUGH_DATA = 4;
  readonly attribute unsigned short readyState;
  readonly attribute boolean seeking;

  // playback state
  attribute double currentTime;
  readonly attribute unrestricted double duration;
  readonly attribute Date startDate;
  readonly attribute boolean paused;
  attribute double defaultPlaybackRate;
  attribute double playbackRate;
  readonly attribute TimeRanges played;
  readonly attribute TimeRanges seekable;
  readonly attribute boolean ended;
  attribute boolean autoplay;
  attribute boolean loop;
  void play();
  void pause();

  // media controller
  attribute DOMString mediaGroup;
  attribute MediaController? controller;

  // controls
  attribute boolean controls;
  attribute double volume;
  attribute boolean muted;
  attribute boolean defaultMuted;

  // tracks
  readonly attribute AudioTrackList audioTracks;
  readonly attribute VideoTrackList videoTracks;
  readonly attribute TextTrackList textTracks;
  TextTrack addTextTrack(DOMString kind, optional DOMString label, optional DOMString language);
};
The media element attributes, src, crossorigin, preload, autoplay, mediagroup, loop, muted, and controls, apply to all media elements. They are defined in this section.
Media elements are used to present audio data, or video and audio data, to the user. This is referred to as media data in this section, since this section applies equally to media elements for audio or for video. The term media resource is used to refer to the complete set of media data, e.g. the complete video file, or complete audio file.
A media resource can have multiple audio and
video tracks. For the purposes of a media element, the video data of the
media resource is only that of the
currently selected track (if any) given by the element's
videoTracks
attribute, and the audio data of the
media resource is the result of mixing
all the currently enabled tracks (if any) given by the element's
audioTracks
attribute.
Both audio
and video
elements can be used for both audio and video. The main difference
between the two is simply that the audio
element has no playback area for visual content (such as video or
captions), whereas the video
element does.
error
Returns a MediaError
object representing the
current error state of the element.
Returns null if there is no error.
interface MediaError {
  const unsigned short MEDIA_ERR_ABORTED = 1;
  const unsigned short MEDIA_ERR_NETWORK = 2;
  const unsigned short MEDIA_ERR_DECODE = 3;
  const unsigned short MEDIA_ERR_SRC_NOT_SUPPORTED = 4;
  readonly attribute unsigned short code;
};
error.code
Returns the current error's error code, from the list below.
MEDIA_ERR_ABORTED (numeric value 1)
MEDIA_ERR_NETWORK (numeric value 2)
MEDIA_ERR_DECODE (numeric value 3)
MEDIA_ERR_SRC_NOT_SUPPORTED (numeric value 4)
The media resource indicated by the src attribute was not suitable.
The src content attribute on media elements gives the address of the media resource (video, audio) to show. The attribute, if present, must contain a valid non-empty URL potentially surrounded by spaces.
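As an illustration of reading the error state, the following sketch maps the code of a MediaError object to the constant names listed above. The helper name describeMediaError and the lookup table are assumptions made for this example; they are not part of the interface.

```javascript
// Numeric values mirror the MediaError constants defined in this section.
const MEDIA_ERROR_NAMES = {
  1: "MEDIA_ERR_ABORTED",
  2: "MEDIA_ERR_NETWORK",
  3: "MEDIA_ERR_DECODE",
  4: "MEDIA_ERR_SRC_NOT_SUPPORTED"
};

// describeMediaError is a hypothetical helper, not part of the API.
// `media` is any object with an `error` attribute shaped like a media element's.
function describeMediaError(media) {
  if (media.error === null) {
    return "no error"; // the error attribute is null when there is no error
  }
  const name = MEDIA_ERROR_NAMES[media.error.code];
  return name || "unknown error code " + media.error.code;
}
```

A page could call this from an error event handler to decide what message to show the user.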
The crossorigin
content
attribute on media elements is a
CORS settings attribute.
currentSrc
Returns the address of the current media resource.
Returns the empty string when there is no media resource.
There are two ways to specify a media resource: the src attribute, or source elements. The attribute overrides the elements.
A media resource can be described in terms of
its type, specifically a MIME type, in some cases with a codecs
parameter. (Whether the codecs
parameter is allowed or not depends on the MIME
type.) [RFC4281]
Types are usually somewhat incomplete descriptions; for example "video/mpeg" doesn't say anything except what the container type is, and even a type like "video/mp4; codecs="avc1.42E01E, mp4a.40.2"" doesn't include information like the actual bitrate (only the maximum bitrate). Thus, given a type, a user agent can often only know whether it might be able to play media of that type (with varying levels of confidence), or whether it definitely cannot play media of that type.
A type that the user agent knows it cannot render is one that describes a resource that the user agent definitely does not support, for example because it doesn't recognize the container type, or it doesn't support the listed codecs.
The MIME type "application/octet-stream" with no parameters is never a type that the user agent knows it cannot render. User agents must treat that type as equivalent to the lack of any explicit Content-Type metadata when it is used to label a potential media resource.
"application/octet-stream" is special-cased here; if any parameter appears with it, it should be treated just like any other MIME type. This is a deviation from the rule that unknown MIME type parameters should be ignored.
canPlayType(type)
Returns the empty string (a negative response), "maybe", or "probably" based on how confident the user agent is that it can play media resources of the given type.
This script tests to see if the user agent supports a
(fictional) new format to dynamically decide whether to use a
video
element or a plugin:
<section id="video">
 <p><a href="playing-cats.nfv">Download video</a></p>
</section>
<script>
 var videoSection = document.getElementById('video');
 var videoElement = document.createElement('video');
 var support = videoElement.canPlayType('video/x-new-fictional-format;codecs="kittens,bunnies"');
 if (support != "probably" && "New Fictional Video Plugin" in navigator.plugins) {
   // not confident of browser support
   // but we have a plugin
   // so use plugin instead
   videoElement = document.createElement("embed");
 } else if (support == "") {
   // no support from browser and no plugin
   // do nothing
   videoElement = null;
 }
 if (videoElement) {
   while (videoSection.hasChildNodes())
     videoSection.removeChild(videoSection.firstChild);
   videoElement.setAttribute("src", "playing-cats.nfv");
   videoSection.appendChild(videoElement);
 }
</script>
The type
attribute of the source
element allows the user agent to
avoid downloading resources that use formats it cannot render.
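A script can apply the same idea by hand when it chooses between several encodings. The sketch below prefers a "probably" answer from canPlayType() and falls back to the first "maybe". The helper name pickPlayableSource and the { src, type } candidate shape are assumptions for illustration only.

```javascript
// pickPlayableSource is a hypothetical helper. `media` needs only a
// canPlayType(type) method; `candidates` is an array of { src, type }.
function pickPlayableSource(media, candidates) {
  let firstMaybe = null;
  for (const candidate of candidates) {
    const answer = media.canPlayType(candidate.type);
    if (answer === "probably") {
      return candidate; // most confident answer: stop looking
    }
    if (answer === "maybe" && firstMaybe === null) {
      firstMaybe = candidate; // remember the first uncertain match
    }
  }
  return firstMaybe; // null when every type got the empty string
}
```

Returning null when nothing matches leaves the caller free to offer a download link instead, as the earlier example does.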
networkState
Returns the current state of network activity for the element, from the codes in the list below.
NETWORK_EMPTY (numeric value 0)
NETWORK_IDLE (numeric value 1)
NETWORK_LOADING (numeric value 2)
NETWORK_NO_SOURCE (numeric value 3)
load()
Causes the element to reset and start selecting and loading a new media resource from scratch.
The preload
attribute is an
enumerated attribute. The following
table lists the keywords and states for the attribute — the
keywords in the left column map to the states in the cell in the
second column on the same row as the keyword. The attribute can be
changed even once the media resource is being buffered or played;
the descriptions in the table below are to be interpreted with that
in mind.
Keyword | State | Brief description |
---|---|---|
none | None | Hints to the user agent that either the author does not expect the user to need the media resource, or that the server wants to minimise unnecessary traffic. This state does not provide a hint regarding how aggressively to actually download the media resource if buffering starts anyway (e.g. once the user hits "play"). |
metadata | Metadata | Hints to the user agent that the author does not expect the user to need the media resource, but that fetching the resource metadata (dimensions, track list, duration, etc), and maybe even the first few frames, is reasonable. If the user agent precisely fetches no more than the metadata, then the media element will end up with its readyState attribute set to HAVE_METADATA; typically though, some frames will be obtained as well and it will probably be HAVE_CURRENT_DATA or HAVE_FUTURE_DATA. When the media resource is playing, hints to the user agent that bandwidth is to be considered scarce, e.g. suggesting throttling the download so that the media data is obtained at the slowest possible rate that still maintains consistent playback. |
auto | Automatic | Hints to the user agent that the user agent can put the user's needs first without risk to the server, up to and including optimistically downloading the entire resource. |
The empty string is also a valid keyword, and maps to the Automatic state. The attribute's missing value default is user-agent defined, though the Metadata state is suggested as a compromise between reducing server load and providing an optimal user experience.
Authors might switch the attribute from "none" or "metadata" to "auto" dynamically once the user begins playback. For example, on a page with many videos this might be used to indicate that the many videos are not to be downloaded unless requested, but that once one is requested it is to be downloaded aggressively.
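The switch described above might be scripted roughly as follows. The helper name promotePreloadOnPlay is an assumption, and the sketch relies only on the preload attribute and on a play event being delivered through addEventListener().

```javascript
// promotePreloadOnPlay is a hypothetical helper. `videos` is an array of
// objects with `preload` and `addEventListener`, as media elements have.
function promotePreloadOnPlay(videos) {
  for (const video of videos) {
    video.preload = "none"; // hint: do not download until requested
    video.addEventListener("play", function () {
      video.preload = "auto"; // requested: hint that aggressive download is fine
    });
  }
}
```

Since preload is only a hint, the user agent remains free to buffer more or less than the keyword suggests.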
The autoplay
attribute can override the
preload
attribute (since if the media
plays, it naturally has to buffer first, regardless of the hint
given by the preload
attribute). Including both is not
an error, however.
buffered
Returns a TimeRanges
object that represents the
ranges of the media resource that the user agent has
buffered.
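For instance, a progress display could add up the buffered ranges. This sketch assumes the TimeRanges object exposes length, start(index), and end(index); the helper names timeRangesToArray and bufferedSeconds are made up for the example.

```javascript
// Both helpers are hypothetical. `ranges` needs `length`, `start(i)` and
// `end(i)`, which is the assumed shape of a TimeRanges object.
function timeRangesToArray(ranges) {
  const out = [];
  for (let i = 0; i < ranges.length; i += 1) {
    out.push([ranges.start(i), ranges.end(i)]);
  }
  return out;
}

// Sums the length, in seconds, of every buffered range of `media`.
function bufferedSeconds(media) {
  let total = 0;
  for (const [start, end] of timeRangesToArray(media.buffered)) {
    total += end - start;
  }
  return total;
}
```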
duration
Returns the length of the media resource, in seconds, assuming that the start of the media resource is at time zero.
Returns NaN if the duration isn't available.
Returns Infinity for unbounded streams.
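Because duration can be NaN or Infinity, display code has to treat those values separately. A minimal sketch, in which the function name formatDuration and the strings "unknown" and "live" are arbitrary choices for illustration:

```javascript
// formatDuration is a hypothetical helper that turns a duration value
// into text for a user interface.
function formatDuration(duration) {
  if (Number.isNaN(duration)) {
    return "unknown"; // duration is not available yet
  }
  if (duration === Infinity) {
    return "live"; // unbounded stream
  }
  const total = Math.floor(duration);
  const minutes = Math.floor(total / 60);
  const seconds = total % 60;
  return minutes + ":" + String(seconds).padStart(2, "0");
}
```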
currentTime [ = value ]
Returns the official playback position, in seconds.
Can be set, to seek to the given time.
Will throw an InvalidStateError exception if there is no selected media resource or if there is a current media controller.
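A caller that cannot rule out these conditions might guard the assignment. In this sketch the helper name trySeek is an assumption, and the exception is recognised by its name property being "InvalidStateError".

```javascript
// trySeek is a hypothetical helper. It returns true if the assignment
// succeeded and false if the element rejected it with InvalidStateError.
function trySeek(media, time) {
  try {
    media.currentTime = time;
    return true;
  } catch (e) {
    if (e && e.name === "InvalidStateError") {
      return false; // no selected media resource, or a controller is present
    }
    throw e; // anything else is unexpected, so let it propagate
  }
}
```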
The loop
attribute is a boolean attribute that, if specified,
indicates that the media element is to seek back to the start
of the media resource upon reaching the end.
The loop
attribute has no effect while the element
has a
current media controller.
readyState
Returns a value that expresses the current state of the element with respect to rendering the current playback position, from the codes in the list below.
HAVE_NOTHING (numeric value 0)
No information regarding the media resource is available. No data for the current playback position is available. Media elements whose networkState attribute is set to NETWORK_EMPTY are always in the HAVE_NOTHING state.
HAVE_METADATA (numeric value 1)
Enough of the resource has been obtained that the duration of
the resource is available. In the case of a video
element, the dimensions of the video are also available. The API
will no longer throw an exception when seeking. No media data is available for the immediate
current playback position.
HAVE_CURRENT_DATA (numeric value 2)
Data for the immediate
current playback position is available, but either not enough
data is available that the user agent could successfully advance
the
current playback position in the
direction of playback at all without immediately reverting to
the HAVE_METADATA
state, or there is no
more data to obtain in the
direction of playback. For example, in video this corresponds
to the user agent having data from the current frame, but not the
next frame, when the
current playback position is at the end of the current frame;
and to when playback has ended.
HAVE_FUTURE_DATA (numeric value 3)
Data for the immediate
current playback position is available, as well as enough data
for the user agent to advance the
current playback position in the
direction of playback at least a little without immediately
reverting to the HAVE_METADATA
state, and the text tracks are ready. For
example, in video this corresponds to the user agent having data
for at least the current frame and the next frame when the
current playback position is at the instant in time between the
two frames, or to the user agent having the video data for the
current frame and audio data to keep playing at least a little when
the
current playback position is in the middle of a frame. The user
agent cannot be in this state if playback has ended, as the
current playback position can never advance in this case.
HAVE_ENOUGH_DATA (numeric value 4)
All the conditions described for the HAVE_FUTURE_DATA
state are met,
and, in addition, either of the following conditions is also
true:
In practice, the difference between HAVE_METADATA and HAVE_CURRENT_DATA is negligible. Really the only time the difference is relevant is when painting a video element onto a canvas, where it distinguishes the case where something will be drawn (HAVE_CURRENT_DATA or greater) from the case where nothing is drawn (HAVE_METADATA or less). Similarly, the difference between HAVE_CURRENT_DATA (only the current frame) and HAVE_FUTURE_DATA (at least this frame and the next) can be negligible (in the extreme, only one frame). The only time that distinction really matters is when a page provides an interface for "frame-by-frame" navigation.
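The two distinctions above reduce to simple comparisons on readyState. The helper names hasFrameToDraw and hasNextFrame are assumptions; the numeric constants are copied from the interface definition.

```javascript
// Values copied from the HTMLMediaElement ready state constants.
const HAVE_CURRENT_DATA = 2;
const HAVE_FUTURE_DATA = 3;

// Hypothetical helper: true when painting the video onto a canvas
// would draw something.
function hasFrameToDraw(video) {
  return video.readyState >= HAVE_CURRENT_DATA;
}

// Hypothetical helper: true when a frame-by-frame interface could
// step forward by at least a little.
function hasNextFrame(video) {
  return video.readyState >= HAVE_FUTURE_DATA;
}
```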
It is possible for the ready state of a media
element to jump between these states discontinuously. For example,
the state of a media element can jump straight from HAVE_METADATA
to HAVE_ENOUGH_DATA
without passing
through the HAVE_CURRENT_DATA
and
HAVE_FUTURE_DATA
states.
The autoplay
attribute is a
boolean attribute. When present, the
user agent will automatically begin playback of the media resource as soon as it can do so
without stopping.
Authors are urged to use the autoplay
attribute rather than using
script to trigger automatic playback, as this allows the user to
override the automatic playback when it is not desired, e.g. when
using a screen reader. Authors are also encouraged to consider not
using the automatic playback behavior at all, and instead to let
the user agent wait for the user to start playback explicitly.
paused
Returns true if playback is paused; false otherwise.
ended
Returns true if playback has reached the end of the media resource.
defaultPlaybackRate [ = value ]
Returns the default rate of playback, for when the user is not fast-forwarding or reversing through the media resource.
Can be set, to change the default rate of playback.
The default rate has no direct effect on playback, but if the user switches to a fast-forward mode, when they return to the normal playback mode, it is expected that the rate of playback will be returned to the default rate of playback.
When the element has a
current media controller, the
defaultPlaybackRate
attribute is ignored and the
current media controller's
defaultPlaybackRate
is used instead.
playbackRate [ = value ]
Returns the current rate of playback, where 1.0 is normal speed.
Can be set, to change the rate of playback.
When the element has a
current media controller, the
playbackRate
attribute is ignored and the
current media controller's
playbackRate
is used instead.
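Code that displays the rate actually in use therefore has to check for a controller first. A minimal sketch of that rule, with effectivePlaybackRate being a hypothetical helper name:

```javascript
// effectivePlaybackRate is a hypothetical helper. When the element has a
// controller, the controller's playbackRate is the one that applies.
function effectivePlaybackRate(media) {
  const source = media.controller ? media.controller : media;
  return source.playbackRate;
}
```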
played
Returns a TimeRanges
object that represents the
ranges of the media resource that the user agent has
played.
play()
Sets the paused attribute to false, loading the media resource and beginning playback if necessary. If the playback had ended, will restart it from the start.
pause()
Sets the paused attribute to true, loading the media resource if necessary.
seeking
Returns true if the user agent is currently seeking.
seekable
Returns a TimeRanges
object that represents the
ranges of the media resource to which it is possible for
the user agent to seek.
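A script could consult seekable before offering a jump target to the user. This sketch assumes the TimeRanges object exposes length, start(index), and end(index); canSeekTo is a hypothetical helper name.

```javascript
// canSeekTo is a hypothetical helper: true when `time` (in seconds)
// falls inside one of the element's seekable ranges.
function canSeekTo(media, time) {
  const ranges = media.seekable;
  for (let i = 0; i < ranges.length; i += 1) {
    if (time >= ranges.start(i) && time <= ranges.end(i)) {
      return true;
    }
  }
  return false;
}
```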
A media resource can have multiple embedded audio and video tracks. For example, in addition to the primary video and audio tracks, a media resource could have foreign-language dubbed dialogues, director's commentaries, audio descriptions, alternative angles, or sign-language overlays.
audioTracks
Returns an AudioTrackList
object representing
the audio tracks available in the media resource.
videoTracks
Returns a VideoTrackList
object representing
the video tracks available in the media resource.
In this example, a script defines a function that takes a URL to a video and a reference to an element where the video is to be placed. That function then tries to load the video, and, once it is loaded, checks to see if there is a sign-language track available. If there is, it also displays that track. Both tracks are just placed in the given container; it's assumed that styles have been applied to make this work in a pretty way!
<script>
 function loadVideo(url, container) {
   var controller = new MediaController();
   var video = document.createElement('video');
   video.src = url;
   video.autoplay = true;
   video.controls = true;
   video.controller = controller;
   container.appendChild(video);
   video.onloadedmetadata = function (event) {
     for (var i = 0; i < video.videoTracks.length; i += 1) {
       if (video.videoTracks[i].kind == 'sign') {
         var sign = document.createElement('video');
         sign.src = url + '#track=' + video.videoTracks[i].id;
         sign.autoplay = true;
         sign.controller = controller;
         container.appendChild(sign);
         return;
       }
     }
   };
 }
</script>
AudioTrackList and VideoTrackList objects
The AudioTrackList and VideoTrackList interfaces are used by attributes defined in the previous section.
interface AudioTrackList : EventTarget {
  readonly attribute unsigned long length;
  getter AudioTrack (unsigned long index);
  AudioTrack? getTrackById(DOMString id);

  attribute EventHandler onchange;
  attribute EventHandler onaddtrack;
  attribute EventHandler onremovetrack;
};

interface AudioTrack {
  readonly attribute DOMString id;
  readonly attribute DOMString kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;
  attribute boolean enabled;
};

interface VideoTrackList : EventTarget {
  readonly attribute unsigned long length;
  getter VideoTrack (unsigned long index);
  VideoTrack? getTrackById(DOMString id);
  readonly attribute long selectedIndex;

  attribute EventHandler onchange;
  attribute EventHandler onaddtrack;
  attribute EventHandler onremovetrack;
};

interface VideoTrack {
  readonly attribute DOMString id;
  readonly attribute DOMString kind;
  readonly attribute DOMString label;
  readonly attribute DOMString language;
  attribute boolean selected;
};
audioTracks.length
videoTracks.length
Returns the number of tracks in the list.
audioTracks[index]
videoTracks[index]
Returns the specified AudioTrack or VideoTrack object.
audioTracks.getTrackById( id )
videoTracks.getTrackById( id )
Returns the AudioTrack or VideoTrack object with the given identifier, or null if no track has that identifier.
id
id
Returns the ID of the given track. This is the ID that can be
used with a fragment identifier if the format supports the
Media Fragments URI syntax, and that can be used with
the getTrackById()
method. [MEDIAFRAG]
kind
kind
Returns the category the given track falls into. The possible track categories are given below.
label
label
Returns the label of the given track, if known, or the empty string otherwise.
language
language
Returns the language of the given track, if known, or the empty string otherwise.
enabled [ = value ]
Returns true if the given track is active, and false otherwise.
Can be set, to change whether the track is enabled or not. If multiple audio tracks are enabled simultaneously, they are mixed.
videoTracks.selectedIndex
Returns the index of the currently selected track, if any, or −1 otherwise.
selected [ = value ]
Returns true if the given track is active, and false otherwise.
Can be set, to change whether the track is selected or not. Either zero or one video track is selected; selecting a new track while a previous one is selected will unselect the previous one.
Category | Definition | Applies to... | Examples |
---|---|---|---|
"alternative" | A possible alternative to the main track, e.g. a different take of a song (audio), or a different angle (video). | Audio and video. | Ogg: "audio/alternate" or "video/alternate"; DASH: "alternate" without "main" and "commentary" roles, and, for audio, without the "dub" role (other roles ignored). |
"captions" | A version of the main video track with captions burnt in. (For legacy content; new content would use text tracks.) | Video only. | DASH: "caption" and "main" roles together (other roles ignored). |
"description" | An audio description of a video track. | Audio only. | Ogg: "audio/audiodesc". |
"main" | The primary audio or video track. | Audio and video. | Ogg: "audio/main" or "video/main"; WebM: the "FlagDefault" element is set; DASH: "main" role without "caption", "subtitle", and "dub" roles (other roles ignored). |
"main-desc" | The primary audio track, mixed with audio descriptions. | Audio only. | AC3 audio in MPEG-2 TS: bsmod=2 and full_svc=1. |
"sign" | A sign-language interpretation of an audio track. | Video only. | Ogg: "video/sign". |
"subtitles" | A version of the main video track with subtitles burnt in. (For legacy content; new content would use text tracks.) | Video only. | DASH: "subtitle" and "main" roles together (other roles ignored). |
"translation" | A translated version of the main audio track. | Audio only. | Ogg: "audio/dub". DASH: "dub" and "main" roles together (other roles ignored). |
"commentary" | Commentary on the primary audio or video track, e.g. a director's commentary. | Audio and video. | DASH: "commentary" role without "main" role (other roles ignored). |
"" (empty string) | No explicit kind, or the kind given by the track's metadata is not recognised by the user agent. | Audio and video. | Any other track type, track role, or combination of track roles not described above. |
The
audioTracks
and
videoTracks
attributes allow scripts to select which
track should play, but it is also possible to select specific
tracks declaratively, by specifying particular tracks in the
fragment identifier of the URL of the media resource. The format of the fragment
identifier depends on the MIME type of the media resource. [RFC2046]
[RFC3986]
In this example, a video that uses a format that supports the Media Fragments URI fragment identifier syntax is embedded in such a way that the alternative angles labeled "Alternative" are enabled instead of the default video track. [MEDIAFRAG]
<video src="myvideo#track=Alternative"></video>
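The script-based route through audioTracks and videoTracks might look like the following sketch. The helper names selectVideoTrackByKind and enableOnlyLanguage are assumptions; the code touches only the kind, language, selected, and enabled attributes defined above.

```javascript
// Hypothetical helper: selects the first video track of the given kind
// and returns it, or returns null if there is none.
function selectVideoTrackByKind(videoTracks, kind) {
  for (let i = 0; i < videoTracks.length; i += 1) {
    if (videoTracks[i].kind === kind) {
      videoTracks[i].selected = true; // the user agent unselects the previous track
      return videoTracks[i];
    }
  }
  return null;
}

// Hypothetical helper: enables exactly the audio tracks in the given
// language and returns how many were enabled.
function enableOnlyLanguage(audioTracks, language) {
  let count = 0;
  for (let i = 0; i < audioTracks.length; i += 1) {
    const match = audioTracks[i].language === language;
    audioTracks[i].enabled = match;
    if (match) {
      count += 1;
    }
  }
  return count;
}
```

If enableOnlyLanguage() returns zero, every audio track has been disabled, so a caller may want to re-enable a default track in that case.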
Each media element can have a MediaController. A MediaController is an object that coordinates the playback of multiple media elements, for instance so that a sign-language interpreter track can be overlaid on a video track, with the two being kept in sync.
By default, a media element has no MediaController. An implicit MediaController can be assigned using the mediagroup content attribute. An explicit MediaController can be assigned directly using the controller IDL attribute.
Media elements with a MediaController are said to be slaved to their controller. The MediaController modifies the playback rate and the playback volume of each of the media elements slaved to it, and ensures that when any of its slaved media elements unexpectedly stall, the others are stopped at the same time.
When a media element is slaved to a MediaController, its playback rate is fixed to that of the other tracks in the same MediaController, and any looping is disabled.
enum MediaControllerPlaybackState { "waiting", "playing", "ended" };

[Constructor]
interface MediaController : EventTarget {
  readonly attribute unsigned short readyState; // uses HTMLMediaElement.readyState's values

  readonly attribute TimeRanges buffered;
  readonly attribute TimeRanges seekable;
  readonly attribute unrestricted double duration;
  attribute double currentTime;

  readonly attribute boolean paused;
  readonly attribute MediaControllerPlaybackState playbackState;
  readonly attribute TimeRanges played;
  void pause();
  void unpause();
  void play(); // calls play() on all media elements as well

  attribute double defaultPlaybackRate;
  attribute double playbackRate;

  attribute double volume;
  attribute boolean muted;

  attribute EventHandler onemptied;
  attribute EventHandler onloadedmetadata;
  attribute EventHandler onloadeddata;
  attribute EventHandler oncanplay;
  attribute EventHandler oncanplaythrough;
  attribute EventHandler onplaying;
  attribute EventHandler onended;
  attribute EventHandler onwaiting;

  attribute EventHandler ondurationchange;
  attribute EventHandler ontimeupdate;
  attribute EventHandler onplay;
  attribute EventHandler onpause;
  attribute EventHandler onratechange;
  attribute EventHandler onvolumechange;
};
MediaController()
Returns a new MediaController object.
controller [ = controller ]
Returns the current MediaController for the media element, if any; returns null otherwise.
Can be set, to set an explicit MediaController. Doing so removes the mediagroup attribute, if any.
readyState
Returns the state that the MediaController
was in the last
time it fired events as a result of reporting the controller
state. The values of this attribute are the same as for the
readyState
attribute of media elements.
buffered
Returns a TimeRanges
object that represents the
intersection of the time ranges for which the user agent has all
relevant media data for all the slaved media elements.
seekable
Returns a TimeRanges
object that represents the
intersection of the time ranges into which the user agent can seek
for all the slaved media elements.
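The intersection used for buffered and seekable can be pictured with plain arrays. In the sketch below each element's ranges are modelled as a list of [start, end] pairs that is assumed to be sorted and non-overlapping; intersectTwo and intersectAll are hypothetical names, and this is an illustration of the idea rather than the user agent's actual algorithm.

```javascript
// Intersects two sorted, non-overlapping lists of [start, end] pairs.
function intersectTwo(a, b) {
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    const start = Math.max(a[i][0], b[j][0]);
    const end = Math.min(a[i][1], b[j][1]);
    if (start < end) {
      out.push([start, end]); // the two ranges overlap here
    }
    // Advance whichever range finishes first.
    if (a[i][1] < b[j][1]) {
      i += 1;
    } else {
      j += 1;
    }
  }
  return out;
}

// Intersects the range lists of every slaved element.
function intersectAll(lists) {
  if (lists.length === 0) {
    return [];
  }
  return lists.reduce((acc, next) => intersectTwo(acc, next));
}
```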
duration
Returns the difference between the earliest playable moment and the latest playable moment (not considering whether the data in question is actually buffered or directly seekable, but not including time in the future for infinite streams). Will return zero if there is no media.
currentTime [ = value ]
Returns the current playback position, in seconds, as a position between zero time and the current duration.
Can be set, to seek to the given time.
paused
Returns true if playback is paused; false otherwise. When this attribute is true, any media element slaved to this controller will be stopped.
playbackState
Returns the state that the MediaController was in the last time it fired events as a result of reporting the controller state. The value of this attribute is either "playing", indicating that the media is actively playing, "ended", indicating that the media is not playing because playback has reached the end of all the slaved media elements, or "waiting", indicating that the media is not playing for some other reason (e.g. the MediaController is paused).
pause()
Sets the paused attribute to true.
unpause()
Sets the paused attribute to false.
play()
Sets the paused attribute to false and invokes the play() method of each slaved media element.
played
Returns a TimeRanges
object that represents the
union of the time ranges in all the slaved media elements that
have been played.
defaultPlaybackRate [ = value ]
Returns the default rate of playback.
Can be set, to change the default rate of playback.
This default rate has no direct effect on playback, but if the user switches to a fast-forward mode, when they return to the normal playback mode, it is expected that the rate of playback (playbackRate) will be returned to this default rate.
playbackRate [ = value ]
Returns the current rate of playback.
Can be set, to change the rate of playback.
volume [ = value ]
Returns the current playback volume multiplier, as a number in the range 0.0 to 1.0, where 0.0 is the quietest and 1.0 the loudest.
Can be set, to change the volume multiplier.
Throws an IndexSizeError if the new value is not in the range 0.0 .. 1.0.
muted [ = value ]
Returns true if all audio is muted (regardless of other attributes either on the controller or on any media elements slaved to this controller), and false otherwise.
Can be set, to change whether the audio is muted or not.
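Since an out-of-range volume throws IndexSizeError, a script driving the volume from arbitrary input might clamp the value first. The helper name setControllerVolume is an assumption, as is the choice to ignore non-finite input.

```javascript
// setControllerVolume is a hypothetical helper. It clamps `value` into
// the 0.0 .. 1.0 range before assigning, and returns the value in effect.
function setControllerVolume(controller, value) {
  if (!Number.isFinite(value)) {
    return controller.volume; // ignore NaN and infinities, keep the old volume
  }
  const clamped = Math.min(1.0, Math.max(0.0, value));
  controller.volume = clamped;
  return clamped;
}
```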
The mediagroup
content
attribute on media elements can be
used to link multiple media elements
together by implicitly creating a MediaController
. The value is text;
media elements with
the same value are automatically linked by the user agent.
Multiple media elements
referencing the same media resource will share a single network
request. This can be used to efficiently play two (video) tracks
from the same media resource in two different places on
the screen. Used with the mediagroup
attribute, these elements
can also be kept synchronised.
In this example, a sign-language interpreter track from a movie
file is overlaid on the primary video track of that same video file
using two video
elements, some CSS, and an implicit MediaController
:
<article>
 <style scoped>
  div { margin: 1em auto; position: relative; width: 400px; height: 300px; }
  video { position: absolute; bottom: 0; right: 0; }
  video:first-child { width: 100%; height: 100%; }
  video:last-child { width: 30%; }
 </style>
 <div>
  <video src="movie.vid#track=Video&track=English" autoplay controls mediagroup=movie></video>
  <video src="movie.vid#track=sign" autoplay mediagroup=movie></video>
 </div>
</article>
A media element can have a group of associated text tracks, known as the media element's list of text tracks. The text tracks are sorted as follows:
The text tracks corresponding to track element children of the media element, in tree order.
Any text tracks added using the addTextTrack() method, in the order they were added, oldest first.
A text track consists of:
This decides how the track is handled by the user agent. The kind is represented by a string. The possible strings are:
subtitles
captions
descriptions
chapters
metadata
The kind of track can
change dynamically, in the case of a text track corresponding to a track
element.
This is a human-readable string intended to identify the track for the user.
The label of a
track can change dynamically, in the case of a text track corresponding to a track
element.
When a text track label is the empty string, the user agent should automatically generate an appropriate label from the text track's other properties (e.g. the kind of text track and the text track's language) for use in its user interface. This automatically-generated label is not exposed in the API.
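How such a label is generated is left to the user agent. Purely as an illustration, one possible fallback built from the kind and language is sketched below; the helper name fallbackLabel and the output format are assumptions.

```javascript
// fallbackLabel is a hypothetical helper showing one way a user interface
// could name a track whose label is the empty string.
function fallbackLabel(track) {
  if (track.label !== "") {
    return track.label; // an explicit label always wins
  }
  const parts = [track.kind];
  if (track.language !== "") {
    parts.push("(" + track.language + ")");
  }
  return parts.join(" ");
}
```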
This is a string extracted from the media resource specifically for in-band metadata tracks to enable such tracks to be dispatched to different scripts in the document.
For example, a traditional TV station broadcast streamed on the Web and augmented with Web-specific interactive features could include text tracks with metadata for ad targeting, trivia game data during game shows, player states during sports games, recipe information during food programs, and so forth. As each program starts and ends, new tracks might be added or removed from the stream, and as each one is added, the user agent could bind them to dedicated script modules using the value of this attribute.
Other than for in-band metadata text tracks, the in-band metadata track dispatch type is the empty string. How this value is populated for different media formats is described in steps to expose a media-resource-specific text track.
This is a string (a BCP 47 language tag) representing the language of the text track's cues. [BCP47]
The language of
a text track can change dynamically, in the case of a text track corresponding to a track
element.
One of the following:
Indicates that the text track's cues have not been obtained.
Indicates that the text track is loading and there have been no fatal errors encountered so far. Further cues might still be added to the track by the parser.
Indicates that the text track has been loaded with no fatal errors.
Indicates that the text track was enabled, but when the user agent attempted to obtain it, this failed in some way (e.g. URL could not be resolved, network error, unknown text track format). Some or all of the cues are likely missing and will not be obtained.
The readiness state of a text track changes dynamically as the track is obtained.
A mode
One of the following:
Disabled
Indicates that the text track is not active. Other than for the purposes of exposing the track in the DOM, the user agent is ignoring the text track. No cues are active, no events are fired, and the user agent will not attempt to obtain the track's cues.
Hidden
Indicates that the text track is active, but that the user agent is not actively displaying the cues. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly.
Showing
Indicates that the text track is active. If no attempt has yet been made to obtain the track's cues, the user agent will perform such an attempt momentarily. The user agent is maintaining a list of which cues are active, and events are being fired accordingly. In addition, for text tracks whose kind is subtitles or captions, the cues are being overlaid on the video as appropriate; for text tracks whose kind is descriptions, the user agent is making the cues available to the user in a non-visual fashion; and for text tracks whose kind is chapters, the user agent is making available to the user a mechanism by which the user can navigate to any point in the media resource by selecting a cue.
A list of text track cues, along with rules for updating the text track rendering. For example, for WebVTT, the rules for updating the display of WebVTT text tracks. [WEBVTT]
The list of cues of a text track can change dynamically, either because the text track has not yet been loaded or is still loading, or due to DOM manipulation.
Each text track has a corresponding TextTrack object.
Each media element has a list of pending text tracks, which must initially be empty, a blocked-on-parser flag, which must initially be false, and a did-perform-automatic-track-selection flag, which must also initially be false.
When the user agent is required to populate the list of pending text tracks of a media element, the user agent must add to the element's list of pending text tracks each text track in the element's list of text tracks whose text track mode is not disabled and whose text track readiness state is loading.
Whenever a track element's parent node changes, the user agent must remove the corresponding text track from any list of pending text tracks that it is in.
Whenever a text track's text track readiness state changes to either loaded or failed to load, the user agent must remove it from any list of pending text tracks that it is in.
When a media element is created by an HTML parser or XML parser, the user agent must set the element's blocked-on-parser flag to true. When a media element is popped off the stack of open elements of an HTML parser or XML parser, the user agent must honor user preferences for automatic text track selection, populate the list of pending text tracks, and set the element's blocked-on-parser flag to false.
The text tracks of a media element are ready when both the element's list of pending text tracks is empty and the element's blocked-on-parser flag is false.
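The bookkeeping described above can be sketched with plain objects. This is only an illustrative model, not a user agent implementation: the names `createMediaElementModel`, `populatePendingTextTracks`, `setReadinessState`, and `textTracksAreReady`, and the `mode` / `readinessState` property names, are assumptions made for this sketch.

```javascript
// Hypothetical plain-object model of a media element's text track state.
function createMediaElementModel(textTracks) {
  return { textTracks, pendingTextTracks: [], blockedOnParser: false };
}

// "Populate the list of pending text tracks": add every track whose mode is
// not disabled and whose readiness state is loading.
function populatePendingTextTracks(element) {
  for (const track of element.textTracks) {
    const qualifies =
      track.mode !== "disabled" && track.readinessState === "loading";
    if (qualifies && !element.pendingTextTracks.includes(track)) {
      element.pendingTextTracks.push(track);
    }
  }
}

// A track leaves the pending list once it is loaded or has failed to load.
function setReadinessState(element, track, state) {
  track.readinessState = state;
  if (state === "loaded" || state === "failed to load") {
    element.pendingTextTracks = element.pendingTextTracks.filter(
      (pending) => pending !== track
    );
  }
}

// Ready means: nothing pending and the parser is no longer blocking.
function textTracksAreReady(element) {
  return element.pendingTextTracks.length === 0 && !element.blockedOnParser;
}
```

In this model a disabled track never delays readiness, because it is never added to the pending list in the first place.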
A text track cue is the unit of time-sensitive data in a text track, corresponding for instance for subtitles and captions to the text that appears at a particular time and disappears at another time.
Each text track cue consists of:
An identifier
An arbitrary string.
A start time
The time, in seconds and fractions of a second, that describes the beginning of the range of the media data to which the cue applies.
An end time
The time, in seconds and fractions of a second, that describes the end of the range of the media data to which the cue applies.
A pause-on-exit flag
A boolean indicating whether playback of the media resource is to pause when the end of the range to which the cue applies is reached.
A writing direction
A writing direction, either horizontal (a line extends horizontally and is positioned vertically, with consecutive lines displayed below each other), vertical growing left (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the left of each other), or vertical growing right (a line extends vertically and is positioned horizontally, with consecutive lines displayed to the right of each other).
If the writing direction is horizontal, then line position percentages are relative to the height of the video, and text position and size percentages are relative to the width of the video.
Otherwise, line position percentages are relative to the width of the video, and text position and size percentages are relative to the height of the video.
A snap-to-lines flag
A boolean indicating whether the line's position is a line position (positioned to a multiple of the line dimensions of the first line of the cue), or whether it is a percentage of the dimension of the video.
A line position
Either a number giving the position of the lines of the cue, to be interpreted as defined by the writing direction and snap-to-lines flag of the cue, or the special value auto, which means the position is to depend on the other active tracks.
A text track cue has a text track cue computed line position whose value is that returned by the following algorithm, which is defined in terms of the other aspects of the cue:
If the text track cue line position is numeric, the text track cue snap-to-lines flag of the text track cue is not set, and the text track cue line position is negative or greater than 100, then return 100 and abort these steps.
If the text track cue line position is numeric, return the value of the text track cue line position and abort these steps. (Either the text track cue snap-to-lines flag is set, so any value, not just those in the range 0..100, is valid, or the value is in the range 0..100 and is thus valid regardless of the value of that flag.)
If the text track cue snap-to-lines flag of the text track cue is not set, return the value 100 and abort these steps. (The text track cue line position is the special value auto.)
Let cue be the text track cue.
If cue is not in a list of cues of a text track, or if that text track is not in the list of text tracks of a media element, return −1 and abort these steps.
Let track be the text track whose list of cues the cue is in.
Let n be the number of text tracks whose text track mode is showing and that are in the media element's list of text tracks before track.
Increment n by one.
Negate n.
Return n.
A text position
A number giving the position of the text of the cue within each line, to be interpreted as a percentage of the video, as defined by the writing direction.
A size
A number giving the size of the box within which the text of each line of the cue is to be aligned, to be interpreted as a percentage of the video, as defined by the writing direction.
An alignment
An alignment for the text of each line of the cue, either start alignment (the text is aligned towards its start side), middle alignment (the text is aligned centered between its start and end sides), or end alignment (the text is aligned towards its end side). Which sides are the start and end sides depends on the Unicode bidirectional algorithm and the writing direction. [BIDI]
The text of the cue
The raw text of the cue, and rules for its interpretation, allowing the text to be rendered and converted to a DOM fragment.
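The text track cue computed line position algorithm given earlier in this list can be sketched as a pure function. This is a minimal sketch under stated assumptions: the cue is modelled as a plain object with hypothetical `line`, `snapToLines`, and `track` properties, and `mediaTracks` stands in for the media element's list of text tracks (or null if there is none).

```javascript
// cue: { line: number | "auto", snapToLines: boolean, track: object | null }
// mediaTracks: ordered array of track objects with a `mode` property, or null.
function computedLinePosition(cue, mediaTracks) {
  const numeric = typeof cue.line === "number";

  // Percentage positions outside 0..100 are treated as 100.
  if (numeric && !cue.snapToLines && (cue.line < 0 || cue.line > 100)) {
    return 100;
  }
  // Any other numeric value is returned unchanged.
  if (numeric) return cue.line;

  // From here on the line position is the special value auto.
  if (!cue.snapToLines) return 100;

  // The cue must belong to a track that is in the media element's list.
  const index = mediaTracks ? mediaTracks.indexOf(cue.track) : -1;
  if (index === -1) return -1;

  // Count the showing tracks that come before this cue's track.
  let n = 0;
  for (let i = 0; i < index; i++) {
    if (mediaTracks[i].mode === "showing") n++;
  }
  n += 1; // Increment n by one.
  return -n; // Negate n and return it.
}
```

With snap-to-lines set and a line position of auto, each showing track earlier in the list pushes the result one line further from the edge, so cues from different showing tracks are not given the same computed line.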
Each text track cue has a corresponding TextTrackCue object. A text track cue's in-memory representation can be dynamically changed through this TextTrackCue API.
In addition, each text track cue has two pieces of dynamic information:
An active flag
This flag must be initially unset. The flag is used to ensure events are fired appropriately when the cue becomes active or inactive, and to make sure the right cues are rendered.
The user agent must synchronously unset this flag whenever the text track cue is removed from its text track's text track list of cues; whenever the text track itself is removed from its media element's list of text tracks or has its text track mode changed to disabled; and whenever the media element's readyState is changed back to HAVE_NOTHING. When the flag is unset in this way for one or more cues in text tracks that were showing prior to the relevant incident, the user agent must, after having unset the flag for all the affected cues, apply the rules for updating the text track rendering of those text tracks. For example, for text tracks based on WebVTT, the rules for updating the display of WebVTT text tracks. [WEBVTT]
A display state
This is used as part of the rendering model, to keep cues in a consistent position. It must initially be empty. Whenever the text track cue active flag is unset, the user agent must empty the text track cue display state.
The text track cues of a media element's text tracks are ordered relative to each other in the text track cue order, which is determined as follows: first group the cues by their text track, with the groups being sorted in the same order as their text tracks appear in the media element's list of text tracks; then, within each group, cues must be sorted by their start time, earliest first; then, any cues with the same start time must be sorted by their end time, latest first; and finally, any cues with identical end times must be sorted in the order they were last added to their respective text track list of cues, oldest first (so e.g. for cues from a WebVTT file, that would initially be the order in which the cues were listed in the file). [WEBVTT]
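The ordering rules above map naturally onto a sort comparator. The following is a sketch only: the property names `trackIndex` (position of the cue's track in the media element's list of text tracks) and `addedOrder` (a counter recording when the cue was last added to its list) are assumptions introduced for the illustration.

```javascript
// Comparator implementing the text track cue order on plain cue records:
// { trackIndex, startTime, endTime, addedOrder }
function compareCueOrder(a, b) {
  // 1. Group by track, in the order of the media element's track list.
  if (a.trackIndex !== b.trackIndex) return a.trackIndex - b.trackIndex;
  // 2. Within a track, earliest start time first.
  if (a.startTime !== b.startTime) return a.startTime - b.startTime;
  // 3. Same start time: latest end time first.
  if (a.endTime !== b.endTime) return b.endTime - a.endTime;
  // 4. Same end time too: oldest addition first.
  return a.addedOrder - b.addedOrder;
}

// Returns a new array sorted in text track cue order.
function sortInCueOrder(cues) {
  return cues.slice().sort(compareCueOrder);
}
```

Sorting the end time in descending order means that, of two cues starting together, the longer one comes first.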
A media-resource-specific text track is a text track that corresponds to data found in the media resource.
interface TextTrackList : EventTarget { readonly attribute unsigned long length; getter TextTrack (unsigned long index); attribute EventHandler onaddtrack; attribute EventHandler onremovetrack; };
textTracks . length
Returns the number of text tracks associated with the media element (e.g. from track elements). This is the number of text tracks in the media element's list of text tracks.
textTracks[ n ]
Returns the TextTrack object representing the nth text track in the media element's list of text tracks.
track
Returns the TextTrack object representing the track element's text track.
enum TextTrackMode { "disabled", "hidden", "showing" }; interface TextTrack : EventTarget { readonly attribute DOMString kind; readonly attribute DOMString label; readonly attribute DOMString language; readonly attribute DOMString inBandMetadataTrackDispatchType; attribute TextTrackMode mode; readonly attribute TextTrackCueList? cues; readonly attribute TextTrackCueList? activeCues; void addCue(TextTrackCue cue); void removeCue(TextTrackCue cue); attribute EventHandler oncuechange; };
addTextTrack( kind [, label [, language ] ] )
Creates and returns a new TextTrack object, which is also added to the media element's list of text tracks.
kind
Returns the text track kind string.
label
Returns the text track label, if there is one, or the empty string otherwise (indicating that a custom label probably needs to be generated from the other attributes of the object if the object is exposed to the user).
language
Returns the text track language string.
inBandMetadataTrackDispatchType
Returns the text track in-band metadata track dispatch type string.
mode [ = value ]
Returns the text track mode, represented by a string from the following list:
"disabled"
The text track disabled mode.
"hidden"
The text track hidden mode.
"showing"
The text track showing mode.
Can be set, to change the mode.
cues
Returns the text track list of cues, as a TextTrackCueList object.
activeCues
Returns the text track cues from the text track list of cues that are currently active (i.e. that start before the current playback position and end after it), as a TextTrackCueList object.
addCue( cue )
Adds the given cue to textTrack's text track list of cues.
removeCue( cue )
Removes the given cue from textTrack's text track list of cues.
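The relationship between addCue(), removeCue() and activeCues described above can be illustrated with a small stand-in class. This is a sketch, not the TextTrack interface itself: `TextTrackModel` and `activeCuesAt` are hypothetical names, and treating a cue as active from its start time up to but not including its end time is an assumption about the boundary cases.

```javascript
// Hypothetical stand-in for a text track's list of cues.
class TextTrackModel {
  constructor() {
    this.cues = [];
  }

  // Adds the cue to the list (ignored if it is already present).
  addCue(cue) {
    if (!this.cues.includes(cue)) this.cues.push(cue);
  }

  // Removes the cue from the list.
  removeCue(cue) {
    this.cues = this.cues.filter((existing) => existing !== cue);
  }

  // Cues whose time range covers the given playback position.
  // Assumption: the start bound is inclusive and the end bound exclusive.
  activeCuesAt(currentTime) {
    return this.cues.filter(
      (cue) => cue.startTime <= currentTime && cue.endTime > currentTime
    );
  }
}
```

Under this boundary assumption, two back-to-back cues that share a boundary time are never active simultaneously.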
In this example, an audio element is used to play a specific sound-effect from a sound file containing many sound effects. A cue is used to pause the audio, so that it ends exactly at the end of the clip, even if the browser is busy running some script. If the page had relied on script to pause the audio, then the start of the next clip might be heard if the browser was not able to run the script at the exact time specified.
var sfx = new Audio('sfx.wav');
var sounds = sfx.addTextTrack('metadata');

// add sounds we care about
function addFX(start, end, name) {
  var cue = new TextTrackCue(start, end, '');
  cue.id = name;
  cue.pauseOnExit = true;
  sounds.addCue(cue);
}
addFX(12.783, 13.612, 'dog bark');
addFX(13.612, 15.091, 'kitten mew');

// getCueById() is defined on TextTrackCueList, so look the cue up via cues
function playSound(id) {
  sfx.currentTime = sounds.cues.getCueById(id).startTime;
  sfx.play();
}

// play a bark as soon as we can
sfx.oncanplaythrough = function () {
  playSound('dog bark');
};

// meow when the user tries to leave
window.onbeforeunload = function () {
  playSound('kitten mew');
  return 'Are you sure you want to leave this awesome page?';
};
interface TextTrackCueList { readonly attribute unsigned long length; getter TextTrackCue (unsigned long index); TextTrackCue? getCueById(DOMString id); };
length
Returns the number of cues in the list.
cuelist[ index ]
Returns the text track cue with index index in the list. The cues are sorted in text track cue order.
getCueById( id )
Returns the first text track cue (in text track cue order) with text track cue identifier id.
Returns null if none of the cues have the given identifier or if the argument is the empty string.
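The lookup behaviour described for getCueById() can be sketched over a plain array. This is an illustration only: `findCueById` is a hypothetical helper, and the array is assumed to be already sorted in text track cue order.

```javascript
// cuesInOrder: array of { id, ... } records, assumed sorted in cue order.
function findCueById(cuesInOrder, id) {
  // The empty string never matches, even if a cue has an empty identifier.
  if (id === "") return null;
  for (const cue of cuesInOrder) {
    if (cue.id === id) return cue; // first match in cue order wins
  }
  return null;
}
```

Because null is a possible result, callers that immediately read a property of the returned cue (such as startTime) may want to check for null first.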
enum AutoKeyword { "auto" }; [Constructor(double startTime, double endTime, DOMString text)] interface TextTrackCue : EventTarget { readonly attribute TextTrack? track; attribute DOMString id; attribute double startTime; attribute double endTime; attribute boolean pauseOnExit; attribute DOMString vertical; attribute boolean snapToLines; attribute (long or AutoKeyword) line; attribute long position; attribute long size; attribute DOMString align; attribute DOMString text; DocumentFragment getCueAsHTML(); attribute EventHandler onenter; attribute EventHandler onexit; };
TextTrackCue( startTime, endTime, text )
Returns a new TextTrackCue object, for use with the addCue() method.
The startTime argument sets the text track cue start time.
The endTime argument sets the text track cue end time.
The text argument sets the text track cue text.
track
Returns the TextTrack object to which this text track cue belongs, if any, or null otherwise.
id [ = value ]
Returns the text track cue identifier.
Can be set.
startTime [ = value ]
Returns the text track cue start time, in seconds.
Can be set.
endTime [ = value ]
Returns the text track cue end time, in seconds.
Can be set.
pauseOnExit [ = value ]
Returns true if the text track cue pause-on-exit flag is set, false otherwise.
Can be set.
vertical [ = value ]
Returns a string representing the text track cue writing direction, as follows:
If it is horizontal: The empty string.
If it is vertical growing left: The string "rl".
If it is vertical growing right: The string "lr".
Can be set.
snapToLines [ = value ]
Returns true if the text track cue snap-to-lines flag is set, false otherwise.
Can be set.
line [ = value ]
Returns the text track cue line position. In the case of the value being auto, the string "auto" is returned.
Can be set.
position [ = value ]
Returns the text track cue text position.
Can be set.
size [ = value ]
Returns the text track cue size.
Can be set.
align [ = value ]
Returns a string representing the text track cue alignment, as follows:
If it is start alignment: The string "start".
If it is middle alignment: The string "middle".
If it is end alignment: The string "end".
Can be set.
text [ = value ]
Returns the text track cue text in raw unparsed form.
Can be set.
getCueAsHTML()
Returns the text track cue text as a DocumentFragment of HTML elements and other DOM nodes.
Chapters are segments of a media resource with a given title. Chapters can be nested, in the same way that sections in a document outline can have subsections.
Each text track cue in a text track being used for describing chapters has three key features: the text track cue start time, giving the start time of the chapter, the text track cue end time, giving the end time of the chapter, and the text track cue text giving the chapter title.
The following snippet of a WebVTT file shows how nested chapters can be marked up. The file describes three 50-minute chapters, "Astrophysics", "Computational Physics", and "General Relativity". The first has three subchapters, the second has four, and the third has two. [WEBVTT]
WEBVTT

00:00:00.000 --> 00:50:00.000
Astrophysics

00:00:00.000 --> 00:10:00.000
Introduction to Astrophysics

00:10:00.000 --> 00:45:00.000
The Solar System

00:00:00.000 --> 00:10:00.000
Coursework Description

00:50:00.000 --> 01:40:00.000
Computational Physics

00:50:00.000 --> 00:55:00.000
Introduction to Programming

00:55:00.000 --> 01:30:00.000
Data Structures

01:30:00.000 --> 01:35:00.000
Answers to Last Exam

01:35:00.000 --> 01:40:00.000
Coursework Description

01:40:00.000 --> 02:30:00.000
General Relativity

01:40:00.000 --> 02:00:00.000
Tensor Algebra

02:00:00.000 --> 02:30:00.000
The General Relativistic Field Equations
The controls attribute is a boolean attribute. If present, it indicates that the author has not provided a scripted controller and would like the user agent to provide its own set of controls.
volume [ = value ]
Returns the current playback volume, as a number in the range 0.0 to 1.0, where 0.0 is the quietest and 1.0 the loudest.
Can be set, to change the volume.
Throws an IndexSizeError if the new value is not in the range 0.0 .. 1.0.
muted [ = value ]
Returns true if audio is muted, overriding the volume attribute, and false if the volume attribute is being honored.
Can be set, to change whether the audio is muted or not.
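Because setting volume to a value outside 0.0 .. 1.0 throws, script that computes a volume (for example from a slider or a fade calculation) may want to clamp the value first. The helper below is a sketch of one way to do that; `setVolumeClamped` is a hypothetical name, and it works on any object exposing a `volume` property.

```javascript
// Clamps the requested volume into the range 0.0 .. 1.0 before assigning it,
// so the assignment cannot throw for being out of range.
function setVolumeClamped(media, requested) {
  // NaN cannot be clamped meaningfully; leave the current volume untouched.
  if (Number.isNaN(requested)) return media.volume;
  media.volume = Math.min(1, Math.max(0, requested));
  return media.volume;
}
```

Clamping is a design choice rather than a requirement: a page could equally let the exception propagate in order to surface a bug in its own volume arithmetic.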
The muted attribute on media elements is a boolean attribute that controls the default state of the audio output of the media resource, potentially overriding user preferences.
This attribute has no dynamic effect (it only controls the default state of the element).
This video (an advertisement) autoplays, but to avoid annoying users, it does so without sound, and allows the user to turn the sound on.
<video src="adverts.cgi?kind=video" controls autoplay loop muted></video>
Objects implementing the TimeRanges interface represent a list of ranges (periods) of time.
interface TimeRanges { readonly attribute unsigned long length; double start(unsigned long index); double end(unsigned long index); };
length
Returns the number of ranges in the object.
start( index )
Returns the time for the start of the range with the given index.
Throws an IndexSizeError if the index is out of range.
end( index )
Returns the time for the end of the range with the given index.
Throws an IndexSizeError if the index is out of range.
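A TimeRanges object is read by index, so a common pattern is to loop from 0 to length - 1 and never call start() or end() with an index outside that range. The sketch below shows that pattern; `timeRangesToArray`, `isTimeInRanges`, and `makeTimeRangesStub` are hypothetical helpers, and the stub only imitates the interface (throwing a RangeError in place of IndexSizeError) so the example can run outside a browser.

```javascript
// Copies the ranges into a plain array of { start, end } records.
function timeRangesToArray(ranges) {
  const result = [];
  for (let i = 0; i < ranges.length; i++) {
    result.push({ start: ranges.start(i), end: ranges.end(i) });
  }
  return result;
}

// True if the given time falls inside any of the ranges (bounds inclusive).
function isTimeInRanges(ranges, time) {
  return timeRangesToArray(ranges).some(
    (range) => time >= range.start && time <= range.end
  );
}

// Stand-in with the same shape as TimeRanges, built from [start, end] pairs.
function makeTimeRangesStub(pairs) {
  const at = (index) => {
    if (!Number.isInteger(index) || index < 0 || index >= pairs.length) {
      throw new RangeError("IndexSizeError: index out of range");
    }
    return pairs[index];
  };
  return {
    length: pairs.length,
    start: (index) => at(index)[0],
    end: (index) => at(index)[1],
  };
}
```

In a page, the same helpers could be given the buffered, played, or seekable attribute of a media element instead of the stub.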
[Constructor(DOMString type, optional TrackEventInit eventInitDict)] interface TrackEvent : Event { readonly attribute object? track; }; dictionary TrackEventInit : EventInit { object? track; };
track
Returns the track object (TextTrack, AudioTrack, or VideoTrack) to which the event relates.
This section is non-normative.
The following events fire on media elements as part of the processing model described above:
Event name | Interface | Fired when... | Preconditions |
---|---|---|---|
loadstart | Event | The user agent begins looking for media data, as part of the resource selection algorithm. | networkState equals NETWORK_LOADING |
progress | Event | The user agent is fetching media data. | networkState equals NETWORK_LOADING |
suspend | Event | The user agent is intentionally not currently fetching media data. | networkState equals NETWORK_IDLE |
abort | Event | The user agent stops fetching the media data before it is completely downloaded, but not due to an error. | error is an object with the code MEDIA_ERR_ABORTED. networkState equals either NETWORK_EMPTY or NETWORK_IDLE, depending on when the download was aborted. |
error | Event | An error occurs while fetching the media data. | error is an object with the code MEDIA_ERR_NETWORK or higher. networkState equals either NETWORK_EMPTY or NETWORK_IDLE, depending on when the download was aborted. |
emptied | Event | A media element whose networkState was previously not in the NETWORK_EMPTY state has just switched to that state (either because of a fatal error during load that's about to be reported, or because the load() method was invoked while the resource selection algorithm was already running). | networkState is NETWORK_EMPTY; all the IDL attributes are in their initial states. |
stalled | Event | The user agent is trying to fetch media data, but data is unexpectedly not forthcoming. | networkState is NETWORK_LOADING. |
loadedmetadata | Event | The user agent has just determined the duration and dimensions of the media resource and the text tracks are ready. | readyState is newly equal to HAVE_METADATA or greater for the first time. |
loadeddata | Event | The user agent can render the media data at the current playback position for the first time. | readyState newly increased to HAVE_CURRENT_DATA or greater for the first time. |
canplay | Event | The user agent can resume playback of the media data, but estimates that if playback were to be started now, the media resource could not be rendered at the current playback rate up to its end without having to stop for further buffering of content. | readyState newly increased to HAVE_FUTURE_DATA or greater. |
canplaythrough | Event | The user agent estimates that if playback were to be started now, the media resource could be rendered at the current playback rate all the way to its end without having to stop for further buffering. | readyState is newly equal to HAVE_ENOUGH_DATA. |
playing | Event | Playback is ready to start after having been paused or delayed due to lack of media data. | readyState is newly equal to or greater than HAVE_FUTURE_DATA and paused is false, or paused is newly false and readyState is equal to or greater than HAVE_FUTURE_DATA. Even if this event fires, the element might still not be potentially playing, e.g. if the element is blocked on its media controller (e.g. because the current media controller is paused, or another slaved media element is stalled somehow, or because the media resource has no data corresponding to the media controller position), or the element is paused for user interaction or paused for in-band content. |
waiting | Event | Playback has stopped because the next frame is not available, but the user agent expects that frame to become available in due course. | readyState is equal to or less than HAVE_CURRENT_DATA, and paused is false. Either seeking is true, or the current playback position is not contained in any of the ranges in buffered. It is possible for playback to stop for other reasons without paused being false, but those reasons do not fire this event (and when those situations resolve, a separate playing event is not fired either): e.g. the element is newly blocked on its media controller, or playback ended, or playback stopped due to errors, or the element has paused for user interaction or paused for in-band content. |
seeking | Event | The seeking IDL attribute changed to true. | |
seeked | Event | The seeking IDL attribute changed to false. | |
ended | Event | Playback has stopped because the end of the media resource was reached. | currentTime equals the end of the media resource; ended is true. |
durationchange | Event | The duration attribute has just been updated. | |
timeupdate | Event | The current playback position changed as part of normal playback or in an especially interesting way, for example discontinuously. | |
play | Event | The element is no longer paused. Fired after the play() method has returned, or when the autoplay attribute has caused playback to begin. | paused is newly false. |
pause | Event | The element has been paused. Fired after the pause() method has returned. | paused is newly true. |
ratechange | Event | Either the defaultPlaybackRate or the playbackRate attribute has just been updated. | |
volumechange | Event | Either the volume attribute or the muted attribute has changed. Fired after the relevant attribute's setter has returned. | |
The following events fire on MediaController objects:
Event name | Interface | Fired when... |
---|---|---|
emptied | Event | All the slaved media elements newly have readyState set to HAVE_NOTHING or greater, or there are no longer any slaved media elements. |
loadedmetadata | Event | All the slaved media elements newly have readyState set to HAVE_METADATA or greater. |
loadeddata | Event | All the slaved media elements newly have readyState set to HAVE_CURRENT_DATA or greater. |
canplay | Event | All the slaved media elements newly have readyState set to HAVE_FUTURE_DATA or greater. |
canplaythrough | Event | All the slaved media elements newly have readyState set to HAVE_ENOUGH_DATA or greater. |
playing | Event | The MediaController is no longer a blocked media controller. |
ended | Event | The MediaController has reached the end of all the slaved media elements. |
waiting | Event | The MediaController is now a blocked media controller. |
ended | Event | All the slaved media elements have newly ended playback. |
durationchange | Event | The duration attribute has just been updated. |
timeupdate | Event | The media controller position changed. |
play | Event | The paused attribute is newly false. |
pause | Event | The paused attribute is newly true. |
ratechange | Event | Either the defaultPlaybackRate attribute or the playbackRate attribute has just been updated. |
volumechange | Event | Either the volume attribute or the muted attribute has just been updated. |
This section is non-normative.
Playing audio and video resources on small devices such as set-top boxes or mobile phones is often constrained by limited hardware resources in the device. For example, a device might only support three simultaneous videos. For this reason, it is a good practice to release resources held by media elements when they are done playing, either by being very careful about removing all references to the element and allowing it to be garbage collected, or, even better, by removing the element's src attribute and any source element descendants, and invoking the element's load() method.
Similarly, when the playback rate is not exactly 1.0, hardware, software, or format limitations can cause video frames to be dropped and audio to be choppy or muted.
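The resource-release practice recommended above (remove the src attribute, remove any source element descendants, then invoke load()) can be wrapped in a small helper. This is a sketch: `releaseMediaElement` is a hypothetical name, and it relies only on the standard DOM methods removeAttribute(), querySelectorAll(), and remove() being available on the element it is given.

```javascript
// Releases the media resource held by an audio or video element.
function releaseMediaElement(media) {
  // Drop the src attribute so the element no longer points at a resource.
  media.removeAttribute("src");

  // Remove every <source> descendant; copy the list first because it is
  // being modified while we iterate.
  for (const source of Array.from(media.querySelectorAll("source"))) {
    source.remove();
  }

  // Re-run resource selection; with nothing left to select, the element
  // lets go of the previously loaded media data.
  media.load();
}
```

A page might call this from an ended event handler for clips that will not be replayed, rather than waiting for garbage collection.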