Autoplay Policy Detection

W3C Working Draft,

This version:
https://www.w3.org/TR/2022/WD-autoplay-detection-20220407/
Latest published version:
https://www.w3.org/TR/autoplay-detection/
Editor's Draft:
https://w3c.github.io/autoplay/
Previous Versions:
History:
https://www.w3.org/standards/history/autoplay-detection
Feedback:
GitHub
Editor:
(Mozilla)

Abstract

This specification provides web developers the ability to detect if automatically starting the playback of a media file is allowed in different situations.

Status of this document

This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

Feedback and comments on this specification are welcome. GitHub Issues are preferred for discussion on this specification. Alternatively, you can send comments to the Media Working Group’s mailing-list, public-media-wg@w3.org (archives). This draft highlights some of the pending issues that are still to be discussed in the working group. No decision has been taken on the outcome of these issues including whether they are valid.

This document was published by the Media Working Group as a Working Draft using the Recommendation track. This document is intended to become a W3C Recommendation.

Publication as a Working Draft does not imply endorsement by W3C and its Members.

This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 2 November 2021 W3C Process Document.

1. Introduction

Most user agents have their own mechanisms to block autoplaying media, and those mechanisms are implementation-specific. Web developers need a way to detect whether autoplaying media is allowed so that they can take action, such as selecting alternate content or improving the user experience while media is not allowed to autoplay. For instance, if a user agent only blocks audible autoplay, then web developers can replace audible media with inaudible media to keep media playing, instead of showing blocked media that looks like a still image to users. If the user agent does not allow any media to autoplay, then web developers could stop loading media resources and related tasks to save bandwidth and CPU usage for users.

Currently, this specification only handles HTMLMediaElement (video and audio) and the Web Audio API. It does not handle the Web Speech API or animated images (such as GIF animations).

2. The Autoplay Detection API

Autoplay detection can be performed through the Navigator object. The result lets authors know either whether media of the given media type, in the document contained in the Window object associated with the queried Navigator object, are allowed to autoplay, or whether a specific element is allowed to autoplay.

2.1. Autoplay Policy Enum

enum AutoplayPolicy {
  "allowed",
  "allowed-muted",
  "disallowed"
};
Enumeration description
"allowed" Media are allowed to autoplay.
"allowed-muted" Inaudible media are allowed to autoplay.
Note: Currently, this attribute will only be returned when the given media type or element is a type of HTMLMediaElement or its extensions, such as HTMLVideoElement or HTMLAudioElement.
An inaudible media element is an HTMLMediaElement that meets any of the following conditions:
  • media’s volume is equal to 0
  • media’s muted is true
  • media’s resource does not have an audio track
"disallowed" No media is allowed to autoplay.
Note: The autoplay policy represents the current status of whether a user agent allows media to autoplay, which can change over time. Therefore, it is recommended that authors query the policy each time they need an up-to-date result.
For example, if a user agent uses the user activation described in HTML § 6.4.1 Data model to determine whether autoplay is allowed, and its default policy is to block all autoplay (disallowed), then the policy could change to allowed or allowed-muted after a user performs a supported user gesture on the page or the media.
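
The enumeration values and the definition of an inaudible media element can be combined into a small helper. The following is a minimal sketch and not part of this specification: isInaudible and mayAutoplay are hypothetical names, and detecting a missing audio track through the audioTracks attribute is an assumption, since that attribute is not exposed by every user agent and is only meaningful once the tracks of the resource are known.

```javascript
// Hypothetical helper: true if `media` meets any of the three
// "inaudible media element" conditions listed above.
function isInaudible(media) {
  if (media.volume === 0 || media.muted === true) {
    return true;
  }
  // Assumption: `audioTracks` is exposed by the user agent. When it is
  // missing, this sketch cannot tell and treats the media as audible.
  const tracks = media.audioTracks;
  return tracks !== undefined && tracks !== null && tracks.length === 0;
}

// Hypothetical helper: interprets an AutoplayPolicy value for `media`.
function mayAutoplay(policy, media) {
  switch (policy) {
    case "allowed":
      return true;
    case "allowed-muted":
      return isInaudible(media);
    default:
      // "disallowed", or any value this sketch does not know about.
      return false;
  }
}
```

Because the policy can change, a caller would pass a freshly queried policy value on each call rather than a cached one.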

2.2. The Autoplay Detection Methods

enum AutoplayPolicyMediaType { "mediaelement", "audiocontext" };

[Exposed=Window]
partial interface Navigator {
  AutoplayPolicy getAutoplayPolicy(AutoplayPolicyMediaType type);
  AutoplayPolicy getAutoplayPolicy(HTMLMediaElement element);
  AutoplayPolicy getAutoplayPolicy(AudioContext context);
};
Enumeration description
mediaelement It’s used to query a status for HTMLMediaElement and its extensions, such as HTMLVideoElement and HTMLAudioElement.
audiocontext It’s used to query a status for AudioContext.

2.2.1. Query by a Media Type

The getAutoplayPolicy(type) method returns the rough status of whether media elements or audio contexts, which exist in the document contained in the Window object associated with the queried Navigator object, are allowed to autoplay. The rough status here means that the returned result isn’t always correct for every element of the given media type.
Note: Depending on the implementation, some media in the same document may still be allowed to autoplay when the result of querying by a media type is disallowed. In this situation, it is recommended that authors also query by a specific element in order to get an accurate result.
Some user agents may not allow any media element to autoplay by default, but allow autoplay on those media elements which have been clicked by users.

For example, at first, the results of querying by a media type and querying by an object would both be disallowed. After a user clicks on a media element, querying by that media element would return allowed if the user agent decides to bless that element because the behavior seems intended by the user, but querying by a media type or by other media elements, which haven’t been clicked yet, would still return disallowed.

When the getAutoplayPolicy(type) method is called, the user agent MUST run the following steps:

  1. If type is mediaelement, return a result that represents the current status for HTMLMediaElement and its extensions, such as HTMLVideoElement and HTMLAudioElement, which exist in the document contained in the Window object associated with the queried Navigator object.
  2. If type is audiocontext, return a result that represents the current status for AudioContext, which exist in the document contained in the Window object associated with the queried Navigator object.
If the return value is allowed
All media, corresponding with the given type, are allowed to autoplay.
If the return value is allowed-muted
All inaudible media, corresponding with the given type, are allowed to autoplay.
Note: Currently, this attribute will only be returned when the given media type is mediaelement. The inaudible media means inaudible media element.
If the return value is disallowed
No media corresponding with the given type are allowed to autoplay.
Note: Depending on the implementation, if a document has child documents, then the result queried from the Navigator object associated with the parent document could be different from the result queried from the Navigator object associated with the child documents.
For example, assume that the top-level document A in foo.com returns allowed and embeds an iframe containing another document B from bar.com. A user agent could either make child document B return the same result, inherited from the top-level document A, or make document B return a different result, e.g. disallowed.

Doing the former helps to lower complexity and makes the behavior of blocking autoplay more consistent. The latter helps provide finer-grained autoplay control.
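
Since this document is a Working Draft, a user agent might not implement getAutoplayPolicy() at all. The following is a minimal sketch of querying by a media type with feature detection. The name queryPolicyByType and the "unknown" return value are assumptions of this sketch and are not defined by this specification.

```javascript
// Hypothetical wrapper around getAutoplayPolicy(type).
// `nav` is a Navigator-like object (pass `navigator` in a page).
// Returns "unknown" when the method is not implemented. "unknown" is a
// sentinel of this sketch, not a value of the AutoplayPolicy enum.
function queryPolicyByType(nav, type) {
  if (!nav || typeof nav.getAutoplayPolicy !== "function") {
    return "unknown";
  }
  // Query on every call, since the policy can change over time.
  return nav.getAutoplayPolicy(type);
}
```

In a page, this would be called as queryPolicyByType(navigator, "mediaelement") or queryPolicyByType(navigator, "audiocontext"). When the result is "unknown", authors would fall back to whatever detection they used before, such as attempting playback and handling the failure.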

2.2.2. Query by an Element

The getAutoplayPolicy(element) and getAutoplayPolicy(context) methods return the current status of whether the given element is allowed to autoplay or not.
If the return value is allowed
This element is allowed to autoplay within the current execution context.
If the return value is allowed-muted
This element will only be allowed to autoplay if it’s inaudible.
Note: Currently, this attribute will only be returned when the given element is HTMLMediaElement or its extensions, such as HTMLVideoElement or HTMLAudioElement. The inaudible media means inaudible media element.

In addition, if authors make an inaudible media element audible right after it starts playing, then it is recommended for a user agent to pause that media element immediately because it’s no longer inaudible.

If the return value is disallowed
This element is not allowed to autoplay.
Note: For HTMLMediaElement, if authors call its play(), the promise returned by play() will be rejected with a NotAllowedError exception.

For AudioContext, this means its AudioContextState would remain in the suspended state.

If the result of querying by a media type is different from the result of querying by an element, authors should take the latter as the correct result. The example in 2.2.1, in which a user clicks on a media element, shows a possible scenario.

Note: If the object that authors pass is neither an HTMLMediaElement (or one of its extensions, such as HTMLVideoElement and HTMLAudioElement) nor an AudioContext, then these methods will throw a TypeError.
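
The guidance above, that the per-element result takes precedence over the per-type result, can be sketched as follows. This is a minimal illustration and not part of this specification: effectivePolicy is a hypothetical name, and falling back to the per-type query when the TypeError described above is thrown is a design choice of this sketch.

```javascript
// Hypothetical helper: prefers the per-element result and falls back to
// the per-type result only when no usable element or context is given.
// `nav` is a Navigator-like object (pass `navigator` in a page).
function effectivePolicy(nav, type, target) {
  if (target !== undefined && target !== null) {
    try {
      return nav.getAutoplayPolicy(target);
    } catch (error) {
      // A TypeError means `target` is neither a media element nor an
      // AudioContext. Anything else is unexpected and is rethrown.
      if (!(error instanceof TypeError)) {
        throw error;
      }
    }
  }
  return nav.getAutoplayPolicy(type);
}
```

For instance, effectivePolicy(navigator, "mediaelement", video) would return the result for the video element itself, which may be allowed even while the per-type result is still disallowed.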

3. Examples

An example for checking whether authors can autoplay a media element.
// Query once, so that both branches test the same result.
const policy = navigator.getAutoplayPolicy("mediaelement");
if (policy === "allowed") {
  // Create and play a new media element.
} else if (policy === "allowed-muted") {
  // Create a new media element, and play it muted.
} else {
  // Autoplay is disallowed, maybe show a poster instead.
}
An example for checking whether authors can start an audio context. Web Audio uses sticky activation to determine if AudioContext can be allowed to start.
if (navigator.getAutoplayPolicy("audiocontext") === "allowed") {
  let ac = new AudioContext();
  ac.onstatechange = function() {
    if (ac.state === "running") {
      // Start running audio app.
    }
  }
} else {
  // Audio context is not allowed to start. Display a bit of UI to ask
  // users to start the audio app. Audio starts via calling ac.resume()
  // from a handler, and 'onstatechange' allows knowing when the audio
  // stack is ready.
}
Example of querying by a specific media element.
function handlePlaySucceeded() {
  // Update the control UI to playing.
}
function handlePlayFailed() {
  // Show a button to allow users to explicitly start the video and
  // display an image element as poster to replace the video.
}

let video = document.getElementById("video");
switch (navigator.getAutoplayPolicy(video)) {
  case "allowed":
    video.src = "video.webm";
    video.play().then(handlePlaySucceeded, handlePlayFailed);
    break;
  case "allowed-muted":
    video.src = "video.webm";
    video.muted = true;
    video.play().then(handlePlaySucceeded, handlePlayFailed);
    break;
  default:
    // Autoplay is not allowed, no need to download the resource.
    handlePlayFailed();
    break;
}
Example of querying by a specific audio context. Web Audio uses sticky activation to determine if AudioContext can be allowed to start.
let ac = new AudioContext();
if (navigator.getAutoplayPolicy(ac) === "allowed") {
  ac.onstatechange = function() {
    if (ac.state === "running") {
      // Start running audio app.
    }
  }
} else {
  // Display a bit of UI to ask users to start the audio app.
  // Audio starts via calling ac.resume() from a handler, and
  // 'onstatechange' allows knowing when the audio stack is ready.
}
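
The two audio context examples above describe the user-gesture path only in comments. The following is a minimal sketch of that path, under stated assumptions: startAudioApp is a hypothetical name, `target` stands for any element the page shows to the user (such as a button), and a click is assumed to be a gesture the user agent accepts for resuming an audio context.

```javascript
// Hypothetical helper. `nav` is a Navigator-like object, `ac` an
// AudioContext, `target` an EventTarget such as a button, and
// `onRunning` a callback that starts the audio app.
function startAudioApp(nav, ac, target, onRunning) {
  let started = false;
  function startOnce() {
    // Guard against starting the app more than once.
    if (!started && ac.state === "running") {
      started = true;
      onRunning();
    }
  }
  ac.onstatechange = startOnce;
  startOnce(); // The context may already be running.

  if (!started && nav.getAutoplayPolicy(ac) !== "allowed") {
    // Not allowed to start yet: resume from a user gesture instead.
    target.addEventListener("click", () => {
      ac.resume();
    }, { once: true });
  }
}
```

In a page, this could be called as startAudioApp(navigator, new AudioContext(), button, startApp), where button and startApp are supplied by the author. A real application would also handle a rejected promise from resume().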

4. Security and Privacy Considerations

Per the Self-Review Questionnaire: Security and Privacy § questions.

The API introduced in this specification has very low impact with regard to security and privacy. It does not expose any sensitive information that can be used to identify users. It does not expose any ability to control sensors or users' devices. It does not introduce any new state for an origin that will persist across browsing sessions. It does not allow an origin to send any data to the underlying platform. It does not introduce or enable any new script execution or loading mechanism. It does not allow an origin to draw over a user agent’s native UI. It does not allow an origin to detect whether users are in private or non-private browsing mode.

5. Acknowledgments

This specification is the collective work of the W3C Media Working Group.

The editors would like to thank Alastor Wu, Becca Hughes, Christoph Guttandin, Chris Needham, Chris Pearce, Dale Curtis, Eric Carlson, François Daoust, Frank Liberato, Gary Katsevman, Jer Noble, Mattias Buelens, Mounir Lamouri, Paul Adenot and Tom Jenkinson for their contributions to this specification.

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

References

Normative References

[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[SVG2]
Amelia Bellamy-Royds; et al. Scalable Vector Graphics (SVG) 2. 4 October 2018. CR. URL: https://www.w3.org/TR/SVG2/
[WEBAUDIO]
Paul Adenot; Hongchan Choi. Web Audio API. 17 June 2021. REC. URL: https://www.w3.org/TR/webaudio/
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/

Informative References

[SPEECH-API]
Web Speech API. cg-draft. URL: https://wicg.github.io/speech-api/
