Copyright © 2015-2023 World Wide Web Consortium. W3C® liability, trademark and permissive document license rules apply.
        This document defines how a stream of media can be captured from a DOM element, such as a
        video, audio, or canvas
        element, in the form of a MediaStream [GETUSERMEDIA].
      
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is not complete. It is subject to major changes and, while early experimentations are encouraged, it is therefore not intended for implementation.
This document was published by the Web Real-Time Communications Working Group as a Working Draft using the Recommendation track.
Publication as a Working Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 03 November 2023 W3C Process Document.
This section is non-normative.
This document describes an extension to both HTML media elements and the HTML canvas element that enables the capture of the output of the element in the form of streaming media.
        The captured media is formed into a MediaStream [GETUSERMEDIA], which can
        then be consumed by the various APIs that process streams of media, such as WebRTC
        [WEBRTC], or Web Audio [WEBAUDIO].
      
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MUST and MUST NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
        The method captureStream is added on HTML [HTML5] media elements.
        Methods for capture are added to both HTMLMediaElement
        and HTMLCanvasElement.
      
        Both MediaStream and HTMLMediaElement expose the concept
        of a track. Since there is no common type used for
        HTMLMediaElement, this document uses the term track to refer
        to either VideoTrack
        or AudioTrack.
        MediaStreamTrack
        is used to identify the media in a MediaStream.
      
partial interface HTMLMediaElement {
    MediaStream captureStream ();
};
        captureStream
            
                The captureStream() method produces a real-time capture of
                the media that is rendered to the media element.
              
                The captured MediaStream consists of MediaStreamTracks
                that render the content from the set of 
                selected (for VideoTracks, or other exclusively selected
                track types) or enabled
                (for AudioTracks, or other track types that support
                multiple selections) tracks from the media element. If the media element
                does not have any selected or enabled tracks of a given type, then no
                MediaStreamTrack of that type is present in the captured stream.
              
                A video element can therefore capture a video
                MediaStreamTrack and any number of audio
                MediaStreamTracks. An audio element can capture
                any number of audio MediaStreamTracks. In both cases, the set of
                captured MediaStreamTracks could be empty.
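
                The following non-normative sketch models the selection rule described
                above. The ElementTrack type and the capturedTrackIds function are
                hypothetical names introduced for illustration only; they are not part of
                any API defined here.

```typescript
// Hypothetical model of a media element track; not a real DOM type.
type ElementTrack =
  | { kind: "video"; id: string; selected: boolean }
  | { kind: "audio"; id: string; enabled: boolean };

// Returns the ids of the element tracks that a capture would mirror:
// the selected video track and every enabled audio track.
function capturedTrackIds(tracks: ElementTrack[]): string[] {
  return tracks
    .filter((t) => (t.kind === "video" ? t.selected : t.enabled))
    .map((t) => t.id);
}

const exampleTracks: ElementTrack[] = [
  { kind: "video", id: "v1", selected: true },
  { kind: "video", id: "v2", selected: false },
  { kind: "audio", id: "a1", enabled: true },
  { kind: "audio", id: "a2", enabled: true },
];
console.log(capturedTrackIds(exampleTracks)); // v1, a1 and a2
console.log(capturedTrackIds([])); // empty: nothing selected or enabled
```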
              
                Unless and until there is a track of a given type that is selected or enabled,
                no MediaStreamTrack of that type is present in the captured stream. In
                particular, if the media element does not have a source assigned, then the captured
                MediaStream has no tracks. Consequently, a media element with a ready
                state of HAVE_NOTHING
                produces no captured MediaStreamTrack instances. Once metadata is
                available and the selected or enabled tracks are determined, new captured
                MediaStreamTrack instances are created and added to the
                MediaStream.
              
                A captured MediaStreamTrack ends when playback
                ends (and the ended event fires) or when the track that it
                captures is no longer selected or enabled for playback. A track is no longer
                selected or enabled if the source is changed by setting the src or
                srcObject attributes of the media element.
              
                The set of captured MediaStreamTracks changes if the source of the
                media element changes. If the source for the media element ends, a different source
                is selected.
              
                If the selected VideoTrack or enabled AudioTracks for the media
                element change, an addtrack
                event with a new MediaStreamTrack is generated for each track
                that was not previously selected or enabled, and a removetrack
                event is generated for each track that ceases to be selected or enabled. A
                MediaStreamTrack MUST end prior to being removed from the
                MediaStream.
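
                As a non-normative illustration, the event generation described above can
                be modelled as a comparison of two selections. The TrackChange type and the
                diffSelection function are hypothetical; a user agent fires real events on
                the captured MediaStream instead of returning records.

```typescript
// Hypothetical event record used only by this sketch.
type TrackChange = { type: "addtrack" | "removetrack"; sourceId: string };

// Compares the previously and currently selected or enabled element
// tracks and lists the events the captured stream would observe.
function diffSelection(before: string[], after: string[]): TrackChange[] {
  const changes: TrackChange[] = [];
  for (const id of before) {
    // The captured track for this source ends, then it is removed.
    if (after.indexOf(id) === -1) {
      changes.push({ type: "removetrack", sourceId: id });
    }
  }
  for (const id of after) {
    // A new captured track is created for each newly chosen source.
    if (before.indexOf(id) === -1) {
      changes.push({ type: "addtrack", sourceId: id });
    }
  }
  return changes;
}

// Switching the enabled audio track from a1 to a2 while v1 stays selected
// yields one removetrack (a1) and one addtrack (a2).
console.log(diffSelection(["v1", "a1"], ["v1", "a2"]));
```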
              
                Since a MediaStreamTrack can only end once, a track that is enabled,
                disabled and re-enabled will be captured as two separate tracks. Similarly,
                restarting playback after playback ends causes a new set of captured
                MediaStreamTrack instances to be created. Seeking during playback
                without changing track selection does not generate events or cause a captured
                MediaStreamTrack to end.
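
                The consequence for a track that is disabled and later re-enabled can be
                sketched as follows. This is a non-normative model; the CaptureLedger class
                and its identifier scheme are assumptions made for illustration.

```typescript
// Hypothetical model: one captured track per enable period of a source track.
class CaptureLedger {
  private nextSerial = 1;
  private live: { [sourceId: string]: string } = {};
  readonly ended: string[] = [];

  // Called when a source track becomes selected or enabled.
  enable(sourceId: string): string {
    const capturedId = sourceId + "#" + this.nextSerial++;
    this.live[sourceId] = capturedId;
    return capturedId;
  }

  // Called when a source track stops being selected or enabled.
  disable(sourceId: string): void {
    const capturedId = this.live[sourceId];
    if (capturedId !== undefined) {
      this.ended.push(capturedId); // an ended track never restarts
      delete this.live[sourceId];
    }
  }
}

const ledger = new CaptureLedger();
const firstCapture = ledger.enable("a1");
ledger.disable("a1");
const secondCapture = ledger.enable("a1");
console.log(firstCapture !== secondCapture); // true: two separate captured tracks
```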
              
                The MediaStreamTracks that comprise the captured
                MediaStream become muted or unmuted as the tracks they capture change
                state. At any time, a media element might not have active content available for
                capture on a given track for a variety of reasons:
              
                A MediaStreamTrack that is acting as a source could be muted or disabled.
                
                Absence of content is reflected in captured tracks through the muted
                attribute. A captured MediaStreamTrack MUST have a
                muted attribute set to true if its corresponding
                source track does not have available and accessible content. A
                mute event is raised on the MediaStreamTrack when content
                availability changes.
              
                What output a muted capture produces as a result will vary based on the type of
                media: a VideoTrack ceases to capture new frames when muted,
                causing the captured stream to show the last captured frame; a muted
                AudioTrack produces silence.
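
                A non-normative sketch of the observable output of a muted capture is given
                below. Frames are represented as strings and audio as arrays of samples;
                both representations, and the function names, are assumptions of this
                sketch.

```typescript
// Hypothetical model of what a consumer observes from a captured video track.
function observedVideo(
  lastFrame: string | null,
  newFrame: string,
  muted: boolean
): string | null {
  // A muted video capture takes no new frames, so the last one persists.
  return muted ? lastFrame : newFrame;
}

// Hypothetical model of what a consumer observes from a captured audio track.
function observedAudio(samples: number[], muted: boolean): number[] {
  // A muted audio capture produces silence of the same duration.
  return muted ? samples.map(() => 0) : samples;
}

console.log(observedVideo("frame 1", "frame 2", true)); // frame 1
console.log(observedAudio([0.5, -0.25], true)); // two zero samples
```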
              
Whether a media element is actively rendering content (e.g., to a screen or audio device) has no effect on the content of captured streams. Muting the audio on a media element does not cause the capture to produce silence, nor does hiding a media element cause captured video to stop. Similarly, the audio level or volume of the media element does not affect the volume of captured audio.
Captured audio from an element with an effective playback rate other than 1.0 MUST be time-stretched. An unplayable playback rate causes the captured audio track to become muted.
        The captureStream method is added to the HTML [HTML5] canvas
        element. The resulting CanvasCaptureMediaStreamTrack provides methods
        that allow for controlling when frames are sampled from the canvas.
      
partial interface HTMLCanvasElement {
    MediaStream captureStream (optional double frameRequestRate);
};
        captureStream
            
                The captureStream() method produces a real-time video
                capture of the surface of the canvas. The resulting media stream has a single video
                CanvasCaptureMediaStreamTrack that matches the dimensions of
                the canvas element.
              
                Content from a canvas that is not 
                origin-clean MUST NOT be captured. This method throws a SecurityError
                exception if the canvas is not origin-clean.
              
                A captured stream MUST immediately cease to capture content if the
                origin-clean flag of the source canvas becomes false after the stream is
                created by captureStream(). The captured MediaStreamTrack
                MUST become muted, producing no new content while the canvas remains in this
                state.
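
                The two origin-clean requirements above can be sketched as follows. This
                model is non-normative; CanvasModel and captureStreamModel are hypothetical
                stand-ins for the canvas element and its captureStream() method.

```typescript
// Hypothetical stand-in for a canvas; only the origin-clean flag is modelled.
interface CanvasModel {
  originClean: boolean;
}

function captureStreamModel(canvas: CanvasModel): { readonly muted: boolean } {
  if (!canvas.originClean) {
    // Content from a canvas that is not origin-clean is never captured.
    const err = new Error("canvas is not origin-clean");
    err.name = "SecurityError";
    throw err;
  }
  return {
    // Evaluated on each read so that it follows later changes to the flag.
    get muted(): boolean {
      return !canvas.originClean;
    },
  };
}

const demoCanvas: CanvasModel = { originClean: true };
const demoTrack = captureStreamModel(demoCanvas);
demoCanvas.originClean = false; // e.g. cross-origin content was drawn
console.log(demoTrack.muted); // true: the capture stops producing content
```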
              
                Each track that captures a canvas has a
                [[frameCaptureRequested]] internal slot that is set to true when a
                new frame is requested from the canvas.
              
                The value of [[frameCaptureRequested]] on all new
                tracks is set to true when the track is created. On creation of the
                captured track with a specific, non-zero frameRequestRate, the user
                agent starts a periodic timer at an interval of 1/frameRequestRate
                seconds. At each activation of the timer,
                [[frameCaptureRequested]] is set to true.
              
                In order to support manual control of frame capture with the
                requestFrame() method, browsers MUST support a value of 0 for
                frameRequestRate. However, a captured stream MUST request capture of a
                frame when created, even if frameRequestRate is zero.
              
                This method throws a NotSupportedError if frameRequestRate is negative.
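
                The handling of frameRequestRate can be summarised by the following
                non-normative sketch. The timerIntervalMs function is hypothetical; it
                returns the timer interval in milliseconds, or null when no periodic timer
                is started.

```typescript
// Hypothetical helper that maps a frameRequestRate argument to a timer interval.
function timerIntervalMs(frameRequestRate?: number): number | null {
  if (frameRequestRate === undefined) {
    return null; // no specific rate was given, so no timer is started
  }
  if (frameRequestRate < 0) {
    const err = new Error("frameRequestRate must not be negative");
    err.name = "NotSupportedError";
    throw err;
  }
  if (frameRequestRate === 0) {
    return null; // manual control through requestFrame() only
  }
  // The timer fires every 1/frameRequestRate seconds.
  return 1000 / frameRequestRate;
}

console.log(timerIntervalMs(25)); // 40
console.log(timerIntervalMs(0)); // null
```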
              
                A new frame is requested from the canvas when
                [[frameCaptureRequested]] is true and the canvas is painted. Each
                time that the captured canvas is painted, the following steps are executed:
              
                    1. If the [[frameCaptureRequested]] internal slot of
                    track is set, add a new frame to track containing what
                    was painted to the canvas.
                    2. Set the [[frameCaptureRequested]] internal slot of track
                    to false.
                    When adding new frames to track containing what was painted to the canvas, the alpha channel content of the canvas must be captured and preserved if the canvas is not fully opaque. The consumers of this track might not preserve the alpha channel.
This algorithm results in a captured track not starting until something changes in the canvas.
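
The paint steps can be modelled as a small state machine. The sketch below is non-normative and assumes manual capture, that is, a frameRequestRate of 0; CanvasTrackModel and its members are hypothetical names, and painted content is represented as a string.

```typescript
// Hypothetical model of a track that captures a canvas.
class CanvasTrackModel {
  // Mirrors the [[frameCaptureRequested]] internal slot; true on creation.
  private frameCaptureRequested = true;
  readonly captured: string[] = [];

  // Models requestFrame() or an activation of the periodic timer.
  requestFrame(): void {
    this.frameCaptureRequested = true;
  }

  // Runs each time the captured canvas is painted.
  onPaint(painted: string): void {
    if (this.frameCaptureRequested) {
      this.captured.push(painted);
    }
    this.frameCaptureRequested = false;
  }
}

const trackModel = new CanvasTrackModel();
trackModel.onPaint("frame A"); // captured: a frame is requested on creation
trackModel.onPaint("frame B"); // skipped: no request is pending
trackModel.requestFrame();
trackModel.onPaint("frame C"); // captured
console.log(trackModel.captured); // frame A and frame C
```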
| Parameter | Type | Nullable | Optional | Description | 
|---|---|---|---|---|
| frameRequestRate | double | ✘ | ✔ | |
Return type: MediaStream
              CanvasCaptureMediaStreamTrack
        
          The CanvasCaptureMediaStreamTrack is an extension of
          MediaStreamTrack that provides a single requestFrame() method.
          Applications that depend on tight control over the rendering of content to the media
          stream can use this method to control when frames from the canvas are captured.
        
[Exposed=Window] interface CanvasCaptureMediaStreamTrack : MediaStreamTrack {
    readonly attribute HTMLCanvasElement canvas;
    undefined requestFrame ();
};
          canvas of type HTMLCanvasElement, readonly
              requestFrame
              
                  The requestFrame() method allows applications to manually
                  request that a frame from the canvas be captured and rendered into the track. In
                  cases where applications progressively render to a canvas, this allows
                  applications to avoid capturing a partially rendered frame.
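
                  A non-normative sketch of this pattern follows. It assumes a stream
                  that was created with a frameRequestRate of 0, so that frames are only
                  requested manually. FrameRequester and drawThenCapture are hypothetical
                  names; any object with a requestFrame() method, such as a
                  CanvasCaptureMediaStreamTrack, fits the interface.

```typescript
// Minimal structural type; a CanvasCaptureMediaStreamTrack satisfies it.
interface FrameRequester {
  requestFrame(): void;
}

// Runs every drawing step first and requests a single capture afterwards,
// so that this helper never requests a frame between two steps.
function drawThenCapture(track: FrameRequester, steps: Array<() => void>): void {
  for (const step of steps) {
    step();
  }
  track.requestFrame();
}

// Stand-in objects that record the order of operations.
const callOrder: string[] = [];
const fakeTrack: FrameRequester = {
  requestFrame: () => {
    callOrder.push("requestFrame");
  },
};
drawThenCapture(fakeTrack, [
  () => {
    callOrder.push("draw background");
  },
  () => {
    callOrder.push("draw foreground");
  },
]);
console.log(callOrder); // both drawing steps, then requestFrame
```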
                
                  As currently specified, this results in no SecurityError or other
                  error feedback if the canvas is not origin-clean. In part, this is because we
                  don't track where requests for frames come from. Do we want to highlight that?
                
Return type: undefined
                
        Media elements can render media resources from origins that differ from the origin of the
        media element. In those cases, the contents of the resulting MediaStreamTrack
        MUST be protected from access by the document origin.
      
        How this protection manifests will differ, depending on how the content is accessed. For
        instance, rendering inaccessible video to a canvas element [HTML]
        causes the origin-clean
        flag of the canvas to become false; attempting to create a Web Audio
        MediaStreamAudioSourceNode [WEBAUDIO] succeeds, but produces no information
        to the document origin (that is, only silence is transmitted into the audio context);
        attempting to transfer the media using WebRTC [WEBRTC] results in no information being
        transmitted.
      
The origin of the media that is rendered by a media element can change at any time. This is even the case for a single media resource. User agents MUST ensure that a change in the origin of media doesn't result in exposure of cross-origin content.
This section will be removed before publication.
This document is based on the stream processing specification [streamproc] originally developed by Robert O'Callahan.