W3C First Public Working Draft
Copyright © 2021-2022 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and permissive document license rules apply.
        This document defines how a browser viewport can be used as the source of
        a media stream using getViewportMedia, an extension to the
        Screen Capture API [screen-capture].
      
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document is not complete.
This document was published by the Web Real-Time Communications Working Group as a First Public Working Draft using the Recommendation track.
Publication as a First Public Working Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 2 November 2021 W3C Process Document.
This section is non-normative.
This document describes an extension to the Screen Capture API [screen-capture] that enables the acquisition of the browser viewport (the current tab), in the form of a video track. In some cases, tab audio is also captured in the form of an audio track. This enables use cases such as: recording an ongoing WebRTC [WEBRTC] video meeting, or a user in a video meeting sharing their presentation without having to locate it in a picker, by instead clicking a button in their presentation application.
This feature is only available to "cross-origin isolated" documents. This prevents applications from using this API to access potentially confidential information from other origins, content that should remain inaccessible due to the protections offered by the user agent sandbox.
This feature has security implications, and requires a permission prompt. Sharing the rendered viewport may expose user information such as browsing history (through link purpling), personal details like address or payment info (through user agent or web extension features like form autofill), or personal preferences (like font size).
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, and MUST NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.
Implementations that use ECMAScript [ECMA-262] to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.
        The following example demonstrates a request for viewport capture using the
        navigator.mediaDevices.getViewportMedia method defined in this
        document.
      
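try {
  // Call from a cross-origin isolated document, in response to a user gesture.
  const stream = await navigator.mediaDevices.getViewportMedia();
  // videoElement is assumed to be an existing <video> element on the page.
  videoElement.srcObject = stream;
} catch (e) {
  console.log('Unable to acquire viewport capture: ' + e);
}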
        This document uses the definitions of MediaStream, MediaStreamTrack,
        MediaStreamConstraints and ConstrainablePattern
        from [GETUSERMEDIA], and the definitions of
        display surface
        and browser display surface
        from [screen-capture].
      
        Capture of the viewport is enabled through the addition of a new
        getViewportMedia method on the MediaDevices
        interface, that is similar to getDisplayMedia(),
        except it only captures the top-level document's viewport (current tab),
        using a permission prompt instead of presenting the user with a picker.
        For security reasons, it also only works from "cross-origin isolated"
        documents that opt in with a document policy.
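As a rough illustration of this opt-in, the sketch below collects the response headers a server might attach to the top-level document. The two Cross-Origin headers are the usual way to make a document cross-origin isolated. The Document-Policy and Require-Document-Policy values are taken from the algorithm in this document, which still marks the exact check as a TODO, so treat them as provisional. The function name viewportCaptureHeaders is hypothetical.

```javascript
// Hypothetical helper: headers a server might send with the top-level document.
function viewportCaptureHeaders() {
  return {
    // The usual pair of headers for making a document cross-origin isolated.
    'Cross-Origin-Opener-Policy': 'same-origin',
    'Cross-Origin-Embedder-Policy': 'require-corp',
    // Both named in this draft's algorithm; the exact check is marked TODO there.
    'Document-Policy': 'viewport-capture',
    'Require-Document-Policy': 'viewport-capture',
  };
}
```

In a Node.js server, each entry could be applied with response.setHeader(name, value) before the document is sent.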
      
partial interface MediaDevices {
  Promise<MediaStream> getViewportMedia(
      optional DisplayMediaStreamConstraints constraints = {});
};
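Because the method may be missing in user agents that do not implement this extension, and is rejected in documents that are not cross-origin isolated, a page might check both conditions before offering a capture button. The following is a minimal sketch under that assumption. The helper name viewportCaptureBlocker and its reason strings are hypothetical; crossOriginIsolated is the standard global property.

```javascript
// Returns a reason string when viewport capture is clearly unavailable,
// or null when a call is worth attempting. `scope` is the global object
// (window in a page); passing it in keeps the check easy to test.
function viewportCaptureBlocker(scope) {
  const mediaDevices = scope.navigator && scope.navigator.mediaDevices;
  if (!mediaDevices || typeof mediaDevices.getViewportMedia !== 'function') {
    return 'getViewportMedia is not implemented';
  }
  if (scope.crossOriginIsolated !== true) {
    return 'document is not cross-origin isolated';
  }
  return null;
}
```

A null result does not guarantee success: the document policy, transient activation, focus and permission checks are only performed by the user agent when the method is called.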
        getViewportMedia
          Prompts the user for permission to live-capture the viewport (current tab).
The user agent MUST apply any provided constraints to the produced media after permission has been granted.
              In the case of audio, the user agent MAY present the end-user with an option to
              include audio from the current viewport in the capture, if available. Like
              getDisplayMedia() with regard to audio+video, the user agent is
              allowed to not return audio even if the audio constraint is present. If the user
              agent knows no audio will be shared for the lifetime of the stream it MUST NOT
              include an audio track in the resulting stream. The user agent MAY accept a
              request for audio and video by only returning a video track in the resulting
              stream, or it MAY accept the request by returning both an audio track and a video
              track in the resulting stream. The user agent MUST reject audio-only requests.
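Since audio may be omitted even when requested, calling code should not assume an audio track exists. The sketch below shows one way to handle that. The helper names pickTracks and captureViewportWithOptionalAudio are hypothetical, and the mediaDevices argument stands in for navigator.mediaDevices.

```javascript
// Splits a MediaStream into its single video track and optional audio track.
function pickTracks(stream) {
  const [videoTrack] = stream.getVideoTracks();
  // The user agent may return no audio track even if audio was requested.
  const [audioTrack = null] = stream.getAudioTracks();
  return { videoTrack, audioTrack };
}

// Requests video plus audio; audio-only requests are rejected,
// so video is always part of the request.
async function captureViewportWithOptionalAudio(mediaDevices) {
  const stream = await mediaDevices.getViewportMedia({ video: true, audio: true });
  return { stream, ...pickTracks(stream) };
}
```

A caller can then enable audio-dependent features, such as recording with sound, only when audioTrack is not null.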
            
              Like getDisplayMedia(), the "granted"
              permission cannot be persisted.
            
When the getViewportMedia()
            method is called, the user agent MUST run the following
            steps:
If the current settings object's
                cross-origin isolated capability
                is false, return a promise rejected
                with a DOMException object whose name
                attribute has the value SecurityError.
If the relevant global object's associated Document's
                top-level browsing context's
                
                required document policy does not contain
                Require-Document-Policy: viewport-capture and
                Document-Policy: viewport-capture (TODO: use correct algorithm),
                return a promise rejected
                with a DOMException object whose name
                attribute has the value SecurityError.
              If the relevant global object of this does not have
          transient activation, return a promise rejected
                with a DOMException object whose name
                attribute has the value InvalidStateError.
Let constraints be the method's first argument.
If constraints.video is false,
                  return a promise rejected with a newly
                  created TypeError.
For each existing member in constraints whose value, CS, is a dictionary, run the following steps:
If CS contains a member named advanced,
                    return a promise rejected with a newly
                    created TypeError.
If CS contains a member whose name specifies a
                    constrainable property applicable to
                    display surfaces,
                    and whose value in turn is a dictionary containing a member
                    named either min or exact, return
                    a promise rejected with a newly
                    created TypeError.
If CS contains a member whose name specifies a
                    constrainable property applicable to
                    display surfaces,
                    and whose value in turn is a dictionary containing a member
                    named max, and that member's value in turn is
                    less than the constrainable property's
                    floor value,
                    then let failedConstraint be the name of the
                    member, let message be either
                    undefined or an informative human-readable
                    message, and return a promise rejected with a new
                    OverconstrainedError created by calling
                    OverconstrainedError(failedConstraint,
                    message).
                  
Let requestedMediaTypes be the set of media
                types in constraints with either a dictionary
                value or a value of true.
If the relevant global object's associated Document
                is NOT fully active or does NOT
                have focus, return
                  a promise rejected with a
                  DOMException object whose
                  name attribute has the value
                  InvalidStateError.
Let p be a new promise.
Run the following steps in parallel:
For each media type T in requestedMediaTypes,
If no sources of type T are available,
                        reject p with a new
                        DOMException object whose
                        name attribute has the value
                        NotFoundError.
Read the current permission state for obtaining
                        sources of type T in the current browsing
                        context. If the permission state is "denied", jump to
                        the step labeled Permission Failure below.
Optionally, e.g., based on a previously-established user preference, for security reasons, or due to platform limitations, jump to the step labeled Permission Failure below.
Request permission to use
                    viewport capture, for a PermissionDescriptor with its
                    name set to "viewport-capture", resulting in
                    a set of provided media.
The provided media MUST include precisely one video track, which
                    MUST be a live-capture of the
                    browser
                    display surface
                    of the relevant global object's associated Document's
                    top-level browsing context's
                    viewport.
The provided media MUST include at most one audio track, which, if
                    provided, MUST be the combined audio produced by the sum of
                    documents that consist of the
                    relevant global object's associated Document's
                    top-level browsing context's active document, and
                    all active documents in nested browsing contexts of
                    the relevant global object's associated Document's
                    top-level browsing context. This audio track
                    MUST NOT be included if audio was not specified in
                    requestedMediaTypes, or if it was specified as
                    false.
The source of a MediaStreamTrack MUST NOT change.
If the result of the request is "granted", then for
                    each device that is sourcing the provided media, using
                    a stable and private id for the device, deviceId,
                    set [[devicesLiveMap]][deviceId] to
                    true, if it isn’t already true,
                    and set the
                    [[devicesAccessibleMap]][deviceId] to
                    true, if it isn’t already
                    true.
The user agent MUST NOT
                    store a "granted" permission entry.
                    
If the result is "denied", jump to the step labeled
                    Permission Failure below. If the user never
                    responds, this algorithm stalls on this step.
If the user grants permission but a hardware error
                    such as an OS/program/webpage lock prevents access,
                    reject p with a new
                    DOMException object whose
                    name attribute has the value
                    NotReadableError and abort these steps.
If the result is "granted" but device access fails for
                    any reason other than those listed above, reject
                    p with a new DOMException
                    object whose name attribute has the
                    value AbortError and abort these steps.
Let stream be the
                    MediaStream object for which the user
                    granted permission.
Run the ApplyConstraints algorithm on all
                    tracks in stream with the appropriate
                    constraints. Should this fail, let failedConstraint
                    be the result of the algorithm that failed, and let
                    message be either undefined or an
                    informative human-readable message, and then reject
                    p with a new OverconstrainedError
                    created by calling
                    OverconstrainedError(failedConstraint,
                    message).
Resolve p with stream and abort these steps.
Permission Failure: Reject
                    p with a new DOMException
                    object whose name attribute has the
                    value NotAllowedError.
Return p.
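From the caller's side, the steps above surface as a small set of exception names. The following sketch maps each name used in the algorithm to a short explanation, which may help when logging or reporting failures. The table and the function name explainViewportCaptureError are hypothetical; the wording of each explanation paraphrases the steps above.

```javascript
// Exception names used by the algorithm, with a paraphrase of their cause.
const VIEWPORT_CAPTURE_ERRORS = {
  SecurityError: 'not cross-origin isolated, or document policy opt-in missing',
  InvalidStateError: 'no transient activation, or document not fully active or not focused',
  TypeError: 'video was false, or a constraint used advanced, min or exact',
  OverconstrainedError: 'a max constraint is below its floor, or constraints could not be applied',
  NotFoundError: 'no source of a requested media type is available',
  NotAllowedError: 'permission was denied',
  NotReadableError: 'permission was granted but a lock prevented access',
  AbortError: 'permission was granted but access failed for another reason',
};

// Returns the explanation for a rejection value, or a generic fallback.
function explainViewportCaptureError(error) {
  const name = error && error.name;
  return Object.prototype.hasOwnProperty.call(VIEWPORT_CAPTURE_ERRORS, name)
    ? VIEWPORT_CAPTURE_ERRORS[name]
    : 'unexpected error';
}
```

Such a function would typically be called from the catch block around the getViewportMedia() call.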
The user agent MUST NOT capture content that's behind a partially transparent captured display surface.
The user agent MUST NOT share any audio other than that emitted from the captured tab, and MUST NOT share audio of the entire system.
The constraints relevant to getViewportMedia are only
        those relevant to getDisplayMedia(), as defined in
        
          5.4 Constrainable Properties for Captured Display Surfaces.
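The constraint checks that getViewportMedia performs before prompting the user can be sketched as a pure function. This is an illustration only, not the normative algorithm: the function name precheckConstraints is hypothetical, only three constrainable properties are covered, and the floor values in ASSUMED_FLOORS are placeholders rather than the values defined in [screen-capture].

```javascript
// Placeholder floor values for a few display-surface properties (assumed).
const ASSUMED_FLOORS = new Map([['width', 1], ['height', 1], ['frameRate', 1]]);

// Returns null when the checks pass, otherwise { name, constraint }.
function precheckConstraints(constraints = {}, floors = ASSUMED_FLOORS) {
  if (constraints.video === false) return { name: 'TypeError' };
  for (const cs of Object.values(constraints)) {
    // Only members whose value is a dictionary are inspected.
    if (cs === null || typeof cs !== 'object') continue;
    if ('advanced' in cs) return { name: 'TypeError' };
    // Keep known properties whose value is itself a dictionary.
    const entries = Object.entries(cs).filter(
      ([prop, value]) => floors.has(prop) && value !== null && typeof value === 'object');
    // min and exact are rejected before any max value is examined.
    if (entries.some(([, value]) => 'min' in value || 'exact' in value)) {
      return { name: 'TypeError' };
    }
    const tooLow = entries.find(
      ([prop, value]) => 'max' in value && value.max < floors.get(prop));
    if (tooLow) return { name: 'OverconstrainedError', constraint: tooLow[0] };
  }
  return null;
}
```

For example, a request such as { video: { width: { max: 1280 } } } passes these checks, while { video: { width: { exact: 1280 } } } is reported as a TypeError.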
      
          Viewport Capture is a powerful feature which is identified by the
          name "viewport-capture",
          requiring express permission to be used.
        
As required for integration with the Permissions specification, this specification defines the following:
The valid values for this descriptor's permission state are "prompt" and "denied". The user agent MUST NOT
            ever set this descriptor's permission state to "granted".
          This specification defines a policy-controlled feature
        identified by the string "viewport-capture".
        Its default allowlist is "self".
        
A document's
          permissions policy
          determines whether any content in that document is allowed to use
          getViewportMedia. If disabled in any document, no content in
          the document will be
          allowed to use
          getViewportMedia. This is enforced by the
          request permission to use algorithm.
          
This specification extends the 
        Privacy Indicator Requirements of
      getDisplayMedia() to include getViewportMedia.
This section is informative; however, it notes some serious risks to platform security if the advice it contains is not adhered to.
TBD.