This document defines how a stream of media can be captured from a DOM element, such as a <video>, <audio>, or <canvas> element, in the form of a MediaStream [GETUSERMEDIA].

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

It is partially based on existing implementation experience in Firefox. It is nevertheless still an early proposal and, while early experimentation is encouraged, it is not intended for implementation.

This document was published by the Web Real-Time Communication Working Group and Device APIs Working Group as a First Public Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org. All comments are welcome.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Web Real-Time Communication Working Group, Device APIs Working Group) made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 August 2014 W3C Process Document.

1. Introduction

This section is non-normative.

This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.

The captured media is formed into a MediaStream [GETUSERMEDIA], which can then be consumed by the various APIs that process streams of media, such as WebRTC [WEBRTC], or Web Audio [WEBAUDIO].

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words MAY, MUST, and SHOULD are to be interpreted as described in [RFC2119].

This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.

Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.

3. HTML Media Element Media Capture Extensions

The captureStream() and captureStreamUntilEnded() methods are defined on HTML [HTML5] media elements.

Both MediaStream and HTMLMediaElement expose the concept of a track. Since there is no common type used for HTMLMediaElement, this document uses the term track to refer to either VideoTrack or AudioTrack. MediaStreamTrack is used to identify the media in a MediaStream.

partial interface HTMLMediaElement {
    MediaStream captureStream ();
    MediaStream captureStreamUntilEnded ();
};

3.1 Methods


captureStream
The captureStream() method produces a real-time capture of the media that is rendered to the media element.

The captured MediaStream comprises MediaStreamTracks that render the content from the set of selected (for VideoTracks, or other exclusively selected track types) or enabled (for AudioTracks, or other track types that support multiple selections) tracks from the media element. If the media element does not have a selected or enabled track of a given type, then no MediaStreamTrack of that type is present in the captured stream.

A <video> element can therefore capture a video MediaStreamTrack and any number of audio MediaStreamTracks. An <audio> element can capture any number of audio MediaStreamTracks. In both cases, the set of captured MediaStreamTracks could be empty.
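As a rough illustration of the track counts described above, the following TypeScript sketch captures an element and tallies the resulting tracks by kind. The `...Like` interfaces and the `countCapturedTracks` helper are hypothetical names introduced here so that the sketch type-checks outside a browser; `getTracks()` and `kind` are assumed to behave as defined for MediaStream and MediaStreamTrack in [GETUSERMEDIA].

```typescript
// Minimal structural types (hypothetical names). A real media element that
// implements captureStream() as drafted here is assumed to satisfy them.
interface CapturedTrackLike {
  kind: string; // "audio" or "video", as on MediaStreamTrack
}
interface CapturedStreamLike {
  getTracks(): CapturedTrackLike[];
}
interface CapturableMediaLike {
  captureStream(): CapturedStreamLike;
}

// Hypothetical helper: capture the element and count the tracks of each kind.
function countCapturedTracks(
  element: CapturableMediaLike
): { audio: number; video: number } {
  const stream = element.captureStream();
  const counts = { audio: 0, video: 0 };
  for (const track of stream.getTracks()) {
    if (track.kind === "audio") {
      counts.audio += 1;
    } else if (track.kind === "video") {
      counts.video += 1;
    }
  }
  return counts;
}
```

Either count can be zero, since the set of captured MediaStreamTracks could be empty at the time of the call.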

captureStream() produces a MediaStream that captures any media that is currently playing on the element. Changes in the media element source do not cause the stream to terminate, though the set of MediaStreamTracks might change over time. If the source stream for the media element ends, or a different source is selected, the MediaStream captures the state of the media element. This means that there could be periods where the captured stream has no active media content.

If the selected VideoTrack changes or enabled AudioTracks change for the media element, MediaStreamTracks are added or removed as necessary to ensure that the MediaStreamTracks in the MediaStream correctly reflect the changes. Necessary addtrack and removetrack events are generated to notify applications of these changes.
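An application could follow these changes by listening for the addtrack and removetrack events. The sketch below is one possible way to do so, not a prescribed pattern: `followTrackIds` and the `...Like` interfaces are hypothetical names, and the event is assumed to carry the affected track in a `track` property, as MediaStreamTrackEvent does in [GETUSERMEDIA].

```typescript
// Minimal structural types (hypothetical names) for a captured stream.
interface StreamTrackLike {
  id: string;
}
interface StreamTrackEventLike {
  track: StreamTrackLike;
}
interface TrackChangeSourceLike {
  getTracks(): StreamTrackLike[];
  addEventListener(
    type: "addtrack" | "removetrack",
    listener: (event: StreamTrackEventLike) => void
  ): void;
}

// Hypothetical helper: returns an array of track ids that is kept in step
// with the captured stream as tracks are added and removed.
function followTrackIds(stream: TrackChangeSourceLike): string[] {
  const ids: string[] = [];
  for (const track of stream.getTracks()) {
    ids.push(track.id);
  }
  stream.addEventListener("addtrack", (event) => {
    if (ids.indexOf(event.track.id) === -1) {
      ids.push(event.track.id);
    }
  });
  stream.addEventListener("removetrack", (event) => {
    const index = ids.indexOf(event.track.id);
    if (index !== -1) {
      ids.splice(index, 1);
    }
  });
  return ids;
}
```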

A captured MediaStreamTrack ends when the track that it captures ends. Captured MediaStreamTracks also end when the tracks that are rendered to the media element change in a way that causes the MediaStreamTrack to be removed from the captured stream, that is, when a different VideoTrack is selected or the corresponding AudioTrack is disabled.

If media playback is paused, the captured stream continues to produce whatever is being actively rendered to the element. What is rendered to the captured stream will vary based on the type of media; a VideoTrack might capture a still frame, or an AudioTrack might capture silence.

Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding the media element suppress captured video.

No parameters.
Return type: MediaStream

captureStreamUntilEnded
A stream captured using captureStreamUntilEnded() captures the rendered output from a single media resource. The resulting stream ends when the media element has ended playback, or when the media element is changed to render a different resource.

captureStreamUntilEnded() operates in the same way that captureStream() does, except that when a captured MediaStreamTrack is removed from the MediaStream, no further tracks are added.

A stream captured with captureStreamUntilEnded() MAY still start with fewer tracks than the media element permits. The first track of any given type that becomes selected or enabled results in a MediaStreamTrack being added to the captured stream. Only the first track of a given type is added; new AudioTracks that are enabled are not added to the capture.

Once all tracks in the captured stream have been removed, the captured stream becomes permanently inactive. New tracks are not added to the capture, even if they are the first of their type.

This allows for a media element that renders multiple media types to be captured without a complete set of media being present when the capture is initiated. For instance, a <video> element might initially only render audio, but have a VideoTrack added (or selected) after the capture commences. A late-starting VideoTrack would consequently be added to the capture. However, if the VideoTrack commences after the AudioTrack ends, then the VideoTrack will not be added to the captured MediaStream.
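The track bookkeeping for captureStreamUntilEnded() can be modelled, very roughly, as a small state machine. The TypeScript sketch below is an illustration of the rules as described in this section, not an implementation of a user agent: `UntilEndedCaptureModel` and its methods are hypothetical names, and the model simplifies matters by tracking at most one captured track per media type.

```typescript
type CapturedKind = "audio" | "video";

// Hypothetical model of the rules described above (one track per kind).
class UntilEndedCaptureModel {
  // Kinds that have already contributed a track to the capture.
  private usedKinds: CapturedKind[] = [];
  // Kinds whose captured track is currently present in the stream.
  private liveKinds: CapturedKind[] = [];
  // Set once every captured track has been removed.
  private inactive = false;

  // A track of this kind became selected or enabled on the media element.
  // Returns true if a MediaStreamTrack would be added to the capture.
  trackStarted(kind: CapturedKind): boolean {
    if (this.inactive || this.usedKinds.indexOf(kind) !== -1) {
      return false; // only the first track of each kind is added
    }
    this.usedKinds.push(kind);
    this.liveKinds.push(kind);
    return true;
  }

  // The captured track of this kind ended or was deselected or disabled.
  trackRemoved(kind: CapturedKind): void {
    const index = this.liveKinds.indexOf(kind);
    if (index === -1) {
      return;
    }
    this.liveKinds.splice(index, 1);
    if (this.liveKinds.length === 0) {
      this.inactive = true; // all tracks removed: permanently inactive
    }
  }

  isPermanentlyInactive(): boolean {
    return this.inactive;
  }
}
```

In this model a video track that starts while the audio track is still live is added, whereas one that starts after the audio track has been removed is not.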

No parameters.
Return type: MediaStream

4. HTML Canvas Element Media Capture Extensions

The captureStream() method is added to the HTML [HTML5] canvas element. The resulting CanvasCaptureMediaStream provides methods that allow for controlling when frames are sampled from the canvas.

partial interface HTMLCanvasElement {
    CanvasCaptureMediaStream captureStream (optional double frameRate);
};

4.1 Methods


captureStream
The captureStream() method produces a real-time video capture of the surface of the canvas. The resulting media stream has a single video MediaStreamTrack that matches the dimensions of the canvas element.

This method throws a SecurityError exception if the canvas is not origin-clean. The captured stream immediately ceases to capture content from the canvas if the origin-clean flag of the canvas becomes false at any time.
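An application that cannot be sure its canvas is origin-clean might guard the call. The following TypeScript sketch shows one way to do that; `captureIfOriginClean` and `OriginCheckedCanvasLike` are hypothetical names, and the sketch assumes that the exception is surfaced as an object whose `name` is "SecurityError", as is usual for DOM exceptions.

```typescript
// Minimal structural type (hypothetical name) for a canvas whose
// captureStream() may throw, with S standing in for the stream type.
interface OriginCheckedCanvasLike<S> {
  captureStream(frameRate?: number): S;
}

// Hypothetical helper: returns the captured stream, or null when capture is
// refused because the canvas is not origin-clean. Other errors are rethrown.
function captureIfOriginClean<S>(canvas: OriginCheckedCanvasLike<S>): S | null {
  try {
    return canvas.captureStream();
  } catch (error) {
    // Assumption: a SecurityError is identified by its `name` property.
    if (
      typeof error === "object" &&
      error !== null &&
      (error as { name?: unknown }).name === "SecurityError"
    ) {
      return null;
    }
    throw error;
  }
}
```

A successful call does not guarantee continued capture, because the stream ceases to capture content if the canvas later stops being origin-clean.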

A user agent SHOULD await a stable state in the script execution of the current page or worker that has control of the canvas before capturing a frame.

In order to support manual control of frame capture, user agents MUST support a value of 0 for frameRate. The captured stream always captures at least one frame, even if frameRate is zero.

If the frameRate value is omitted, the user agent SHOULD capture new frames each time that the content of the canvas changes.

Return type: CanvasCaptureMediaStream
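The two frameRate cases described above, a value of 0 and an omitted value, might be selected as in the following TypeScript sketch. The names `startCanvasCapture`, `ManualFrameStreamLike`, and `RateControlledCanvasLike` are hypothetical, and the shapes follow the IDL as drafted in this document; later revisions or implementations may differ.

```typescript
// Minimal structural types (hypothetical names) following the IDL above.
interface ManualFrameStreamLike {
  requestFrame(): void;
}
interface RateControlledCanvasLike {
  captureStream(frameRate?: number): ManualFrameStreamLike;
}

// Hypothetical helper: start a capture either under manual frame control
// or with frames captured as the canvas content changes.
function startCanvasCapture(
  canvas: RateControlledCanvasLike,
  manual: boolean
): ManualFrameStreamLike {
  if (manual) {
    // frameRate of 0: beyond the initial frame, the application is
    // expected to request frames itself with requestFrame().
    return canvas.captureStream(0);
  }
  // frameRate omitted: the user agent SHOULD capture a new frame each
  // time the content of the canvas changes.
  return canvas.captureStream();
}
```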

4.2 The CanvasCaptureMediaStream

The CanvasCaptureMediaStream is an extension of MediaStream that provides a single requestFrame() method. Applications that depend on tight control over the rendering of content to the media stream can use this method to control when frames from the canvas are captured.

interface CanvasCaptureMediaStream : MediaStream {
    readonly    attribute HTMLCanvasElement canvas;
    void requestFrame ();
};

4.2.1 Attributes

canvas of type HTMLCanvasElement, readonly
The canvas element that this media stream captures.

4.2.2 Methods


requestFrame
The requestFrame() method allows applications to manually request that a frame from the canvas be captured and rendered into the media stream. In cases where applications progressively render to a canvas, this allows applications to avoid capturing a partially rendered frame.

No parameters.
Return type: void
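For progressive rendering, an application might run all of its drawing steps and only then request a frame. The TypeScript sketch below illustrates that ordering; `renderThenRequestFrame` and `FrameRequestableLike` are hypothetical names, and the sketch assumes the stream was captured with a frameRate of 0 so that intermediate canvas states are not sampled automatically.

```typescript
// Minimal structural type (hypothetical name) for a stream that offers
// requestFrame() as drafted in this document.
interface FrameRequestableLike {
  requestFrame(): void;
}

// Hypothetical helper: run every drawing step, then request a single frame
// so that only the fully rendered canvas is captured.
function renderThenRequestFrame(
  stream: FrameRequestableLike,
  drawSteps: Array<() => void>
): void {
  for (const drawStep of drawSteps) {
    drawStep();
  }
  stream.requestFrame();
}
```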

5. Security Considerations

Media elements can render media resources from origins that differ from the origin of the media element. In those cases, the contents of the resulting MediaStream MUST be protected from access by the document origin.

How this protection manifests will differ, depending on how the content is accessed. For instance, rendering inaccessible video to a canvas element [2DCONTEXT] causes the origin-clean property of the canvas to become false; attempting to create a Web Audio MediaStreamAudioSourceNode [WEBAUDIO] succeeds, but produces no information to the document origin (that is, only silence is transmitted into the audio context); attempting to transfer the media using WebRTC [WEBRTC] results in no information being transmitted.

The origin of the media that is rendered by a media element can change at any time. This is even the case for a single media resource. User agents MUST ensure that a change in the origin of media doesn't result in exposure of cross-origin content.

6. Change Log

This section will be removed before publication.

Changes since 2015-tbd-tbd

A. Acknowledgements

This document is based on the stream processing specification [streamproc] originally developed by Robert O'Callahan.

B. References

B.1 Normative references

[GETUSERMEDIA] Daniel Burnett; Adam Bergkvist; Cullen Jennings; Anant Narayanan. Media Capture and Streams. 12 February 2015. W3C Working Draft. URL: http://www.w3.org/TR/mediacapture-streams/
[HTML5] Ian Hickson; Robin Berjon; Steve Faulkner; Travis Leithead; Erika Doyle Navara; Edward O'Connor; Silvia Pfeiffer. HTML5. 28 October 2014. W3C Recommendation. URL: http://www.w3.org/TR/html5/
[RFC2119] S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[WEBIDL] Cameron McCormack. Web IDL. 19 April 2012. W3C Candidate Recommendation. URL: http://www.w3.org/TR/WebIDL/

B.2 Informative references

[2DCONTEXT] Rik Cabanier; Jatinder Mann; Jay Munro; Tom Wiltzius. HTML Canvas 2D Context. 21 August 2014. W3C Candidate Recommendation. URL: http://www.w3.org/TR/2dcontext/
[WEBAUDIO] Paul Adenot; Chris Wilson; Chris Rogers. Web Audio API. 10 October 2013. W3C Working Draft. URL: http://www.w3.org/TR/webaudio/
[WEBRTC] Adam Bergkvist; Daniel Burnett; Cullen Jennings; Anant Narayanan. WebRTC 1.0: Real-time Communication Between Browsers. 10 February 2015. W3C Working Draft. URL: http://www.w3.org/TR/webrtc/
[streamproc] Robert O'Callahan. MediaStream Processing API. 31 May 2012. W3C Note. URL: http://www.w3.org/TR/streamproc/