"MediaStream Image Capture"

1. Introduction

The API defined in this document captures images from a photographic device referenced through a valid MediaStreamTrack. The produced image can be in the form of a Blob (see the takePhoto() method) or an ImageBitmap (see grabFrame()).

Reading capabilities and settings and applying constraints is done in one of two ways depending on whether it impacts the video MediaStreamTrack or not. Photo-specific capabilities and current settings can be retrieved via getPhotoCapabilities()/getPhotoSettings() and configured via takePhoto()'s PhotoSettings argument. Manipulating video-related capabilities, current settings and constraints is done via the MediaStreamTrack extension mechanism.

2. Security and Privacy Considerations

The privacy and security considerations discussed in [GETUSERMEDIA] apply to this extension specification.

Moreover, implementors should take care to prevent additional leakage of privacy-sensitive data from captured images. For instance, embedding the user’s location in the metadata of the digitized image (e.g. EXIF) might transmit more private data than the user is expecting.

3. Image Capture API

The User Agent must support Promises in order to implement the Image Capture API. Any Promise object is assumed to have a resolver object, with resolve() and reject() methods associated with it.

[Constructor(MediaStreamTrack videoTrack)]
interface ImageCapture {
   Promise<Blob>              takePhoto(optional PhotoSettings photoSettings);
   Promise<PhotoCapabilities> getPhotoCapabilities();
   Promise<PhotoSettings>     getPhotoSettings();

   Promise<ImageBitmap>       grabFrame();

   readonly attribute MediaStreamTrack track;
};

takePhoto() returns a captured image encoded in the form of a Blob, whereas grabFrame() returns a snapshot of the track video feed in the form of a non-encoded ImageBitmap.

3.1. Attributes

track, of type MediaStreamTrack, readonly: The MediaStreamTrack passed into the constructor.

3.2. Methods

ImageCapture(MediaStreamTrack videoTrack)

Parameter	Type	Nullable	Optional	Description
videoTrack	`MediaStreamTrack`	✘	✘	The `MediaStreamTrack` to be used as source of data. This will be the value of the `track` attribute. The `MediaStreamTrack` passed to the constructor MUST have its `kind` attribute set to `"video"` otherwise a `DOMException` of type `NotSupportedError` will be thrown.
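
The following non-normative sketch illustrates the constructor requirement above by checking the track's kind before construction. The helper name `createImageCapture` and the injected `ImageCaptureCtor` parameter are assumptions made for illustration only; in a page the constructor would simply be the global `ImageCapture`.

```javascript
// Hypothetical helper (not part of this specification).
// `ImageCaptureCtor` is injected so the sketch can run outside a browser.
function createImageCapture(track, ImageCaptureCtor) {
  if (!track || track.kind !== 'video') {
    // Same error name as the exception the constructor is specified to throw.
    const err = new Error('ImageCapture needs a MediaStreamTrack whose kind is "video"');
    err.name = 'NotSupportedError';
    throw err;
  }
  return new ImageCaptureCtor(track);
}
```

Checking up front lets a page report a wrong track kind in its own words instead of relying on the thrown exception.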

takePhoto(optional PhotoSettings photoSettings)

takePhoto() produces the result of a single photographic exposure using the video capture device sourcing the track and including any PhotoSettings configured, returning an encoded image in the form of a Blob if successful. When this method is invoked:

If the readyState of track provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps in parallel:
1. Gather data from the track underlying source with the defined photoSettings and into a Blob containing a single still image. The method of doing this will depend on the underlying device.
  Devices MAY temporarily stop streaming data, reconfigure themselves with the appropriate photo settings, take the photo, and then resume streaming. In this case, the stopping and restarting of streaming SHOULD cause onmute and onunmute events to fire on the track in question.
2. If the UA is unable to execute the takePhoto() method for any reason (for example, upon invocation of multiple takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException whose name is UnknownError.
3. Return a promise resolved with the Blob object.

Parameter	Type	Nullable	Optional	Description
photoSettings	`PhotoSettings`	✘	✔	The `PhotoSettings` dictionary to be applied.
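
As a non-normative illustration of the steps above, a page might guard a call to takePhoto() roughly as follows. The wrapper name `takePhotoOrNull` is hypothetical; `imageCapture` stands for any object exposing `track` and `takePhoto()` as described in this section.

```javascript
// Hypothetical wrapper (not part of this specification).
async function takePhotoOrNull(imageCapture, photoSettings) {
  // takePhoto() rejects with InvalidStateError when the track is not live,
  // so the call is skipped in that case.
  if (imageCapture.track.readyState !== 'live') {
    return null;
  }
  try {
    // Resolves with a Blob containing a single encoded still image.
    return await imageCapture.takePhoto(photoSettings);
  } catch (err) {
    // For example UnknownError, when calls are made in rapid succession.
    console.error('takePhoto() failed:', err.name);
    return null;
  }
}
```

Returning null on failure is only one possible design; a page may prefer to let the rejection propagate to its caller.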

getPhotoCapabilities()

getPhotoCapabilities() is used to retrieve the ranges of available configuration options, if any. When this method is invoked:

If the readyState of track provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps in parallel:
1. Gather data from track into a PhotoCapabilities object containing the available capabilities of the device, including ranges where appropriate. The method of doing this will depend on the underlying device.
2. If the UA is unable to gather the data for any reason (for example, the MediaStreamTrack being ended asynchronously), then the UA MUST return a promise rejected with a new DOMException whose name is OperationError.
3. Return a promise resolved with the PhotoCapabilities object.

getPhotoSettings()

getPhotoSettings() is used to retrieve the current configuration settings values, if any. When this method is invoked:

If the readyState of track provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps in parallel:
1. Gather data from track into a PhotoSettings containing the current conditions in which the device is found. The method of doing this will depend on the underlying device.
2. If the UA is unable to gather the data for any reason (for example, the MediaStreamTrack being ended asynchronously), then the UA MUST return a promise rejected with a new DOMException whose name is OperationError.
3. Return a promise resolved with the PhotoSettings object.

grabFrame()

grabFrame() takes a snapshot of the live video being held in track, returning an ImageBitmap if successful. grabFrame() returns data only once upon being invoked. When this method is invoked:

If the readyState of track provided in the constructor is not live, return a promise rejected with a new DOMException whose name is InvalidStateError.
Otherwise it MUST queue a task, using the DOM manipulation task source, that runs the following steps in parallel:
1. Gather data from track into an ImageBitmap object. The width and height of the ImageBitmap object are derived from the constraints of track.
2. If the UA is unable to execute the grabFrame() method for any reason (for example, upon invocation of multiple grabFrame()/takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException whose name is UnknownError.
3. Return a promise resolved with the newly created ImageBitmap object.

4. PhotoCapabilities

interface PhotoCapabilities {
  readonly attribute RedEyeReduction            redEyeReduction;
  readonly attribute MediaSettingsRange         imageHeight;
  readonly attribute MediaSettingsRange         imageWidth;
  readonly attribute FrozenArray<FillLightMode> fillLightMode;
};

4.1. Attributes

redEyeReduction, of type RedEyeReduction, readonly: The red eye reduction capacity of the source.
imageHeight, of type MediaSettingsRange, readonly: This reflects the image height range supported by the UA.
imageWidth, of type MediaSettingsRange, readonly: This reflects the image width range supported by the UA.
fillLightMode, of type FrozenArray<FillLightMode>, readonly: This reflects the supported fill light mode (flash) settings, if any.

The supported resolutions are presented as segregated imageWidth and imageHeight ranges to prevent increasing the fingerprinting surface and to allow the UA to make a best-effort decision with regards to actual hardware configuration.

5. PhotoSettings

dictionary PhotoSettings {
  FillLightMode   fillLightMode;
  double          imageHeight;
  double          imageWidth;
  boolean         redEyeReduction;
};

5.1. Members

redEyeReduction, of type boolean: This reflects whether camera red eye reduction is desired.
imageWidth, of type double: This reflects the desired image width. The UA MUST select the closest width value to this setting if it supports a discrete set of width options.
imageHeight, of type double: This reflects the desired image height. The UA MUST select the closest height value to this setting if it supports a discrete set of height options.
fillLightMode, of type FillLightMode: This reflects the desired fill light mode (flash) setting.
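
The following non-normative sketch shows one way a page could derive a PhotoSettings dictionary from the values reported by getPhotoCapabilities(). The helper name `buildPhotoSettings` and the `wanted` parameter are assumptions made for illustration; the member names are those defined above.

```javascript
// Hypothetical helper (not part of this specification).
// `capabilities` mimics a PhotoCapabilities object, `wanted` holds the
// page's preferences using PhotoSettings member names.
function buildPhotoSettings(capabilities, wanted) {
  const settings = {};
  const clamp = (value, range) => Math.min(range.max, Math.max(range.min, value));

  if (wanted.imageWidth !== undefined && capabilities.imageWidth) {
    settings.imageWidth = clamp(wanted.imageWidth, capabilities.imageWidth);
  }
  if (wanted.imageHeight !== undefined && capabilities.imageHeight) {
    settings.imageHeight = clamp(wanted.imageHeight, capabilities.imageHeight);
  }
  // Only request a fill light mode that the device lists as supported.
  const modes = capabilities.fillLightMode || [];
  if (wanted.fillLightMode && modes.includes(wanted.fillLightMode)) {
    settings.fillLightMode = wanted.fillLightMode;
  }
  // Red eye reduction can only be chosen when it is "controllable".
  if (wanted.redEyeReduction !== undefined &&
      capabilities.redEyeReduction === 'controllable') {
    settings.redEyeReduction = wanted.redEyeReduction;
  }
  return settings;
}
```

The resulting dictionary could then be passed to takePhoto(). Members that the device cannot honour are left out rather than guessed.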

6. `MediaSettingsRange`

interface MediaSettingsRange {
    readonly attribute double max;
    readonly attribute double min;
    readonly attribute double step;
};

6.1. Attributes

max, of type double, readonly: The maximum value of this setting.
min, of type double, readonly: The minimum value of this setting.
step, of type double, readonly: The minimum difference between consecutive values of this setting.
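
As a non-normative illustration of how max, min and step relate, the sketch below snaps a requested value onto the grid described by a MediaSettingsRange. The function name `snapToRange` is hypothetical.

```javascript
// Hypothetical helper (not part of this specification).
// `range` is any object with numeric `min`, `max` and `step` members.
function snapToRange(value, range) {
  const clamped = Math.min(range.max, Math.max(range.min, value));
  if (!(range.step > 0)) {
    return clamped; // no usable step reported: clamp only
  }
  // Round to the nearest multiple of `step`, counted from `min`.
  const steps = Math.round((clamped - range.min) / range.step);
  // Guard against overshooting `max` when it does not sit on the grid.
  return Math.min(range.max, range.min + steps * range.step);
}
```

For a range of `{min: 1, max: 4, step: 0.5}`, a request for 2.3 becomes 2.5 and a request for 10 becomes 4.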

7. `RedEyeReduction`

enum RedEyeReduction {
  "never",
  "always",
  "controllable"
};

7.1. Values

never: Red eye reduction is not available in the device.
always: Red eye reduction is available in the device and it is always configured to true.
controllable: Red eye reduction is available in the device and it is controllable by the user via redEyeReduction.

8. `FillLightMode`

enum FillLightMode {
  "auto",
  "off",
  "flash"
};

8.1. Values

auto: The video device’s fill light will be enabled when required (typically low light conditions). Otherwise it will be off. Note that auto does not guarantee that a flash will fire when takePhoto() is called. Use flash to guarantee firing of the flash for the takePhoto() method.
off: The source’s fill light and/or flash will not be used.
flash: This value will always cause the flash to fire for the takePhoto() method.

9. Extensions

This section defines a new set of constrainable properties for MediaStreamTrack that can be applied in order to make its behavior more suitable for taking pictures. Use of these constraints via MediaStreamTrack's methods getCapabilities(), getSettings(), getConstraints() and applyConstraints() will modify the behavior of the ImageCapture object’s track.

9.1. `MediaTrackSupportedConstraints` dictionary

MediaTrackSupportedConstraints is extended here with the list of constraints that a User Agent recognizes for controlling the photo capabilities. This dictionary can be retrieved using the MediaDevices getSupportedConstraints() method.

partial dictionary MediaTrackSupportedConstraints {
  boolean whiteBalanceMode = true;
  boolean exposureMode = true;
  boolean focusMode = true;
  boolean pointsOfInterest = true;

  boolean exposureCompensation = true;
  boolean colorTemperature = true;
  boolean iso = true;

  boolean brightness = true;
  boolean contrast = true;
  boolean saturation = true;
  boolean sharpness = true;
  boolean focusDistance = true;
  boolean zoom = true;
  boolean torch = true;
};

9.1.1. Members

whiteBalanceMode, of type boolean, defaulting to true: Whether white balance mode constraining is recognized.
colorTemperature, of type boolean, defaulting to true: Whether color temperature constraining is recognized.
exposureMode, of type boolean, defaulting to true: Whether exposure constraining is recognized.
exposureCompensation, of type boolean, defaulting to true: Whether exposure compensation constraining is recognized.
iso, of type boolean, defaulting to true: Whether ISO constraining is recognized.
focusMode, of type boolean, defaulting to true: Whether focus mode constraining is recognized.
pointsOfInterest, of type boolean, defaulting to true: Whether points of interest are supported.
brightness, of type boolean, defaulting to true: Whether brightness constraining is recognized.
contrast, of type boolean, defaulting to true: Whether contrast constraining is recognized.
saturation, of type boolean, defaulting to true: Whether saturation constraining is recognized.
sharpness, of type boolean, defaulting to true: Whether sharpness constraining is recognized.
focusDistance, of type boolean, defaulting to true: Whether focus distance constraining is recognized.
zoom, of type boolean, defaulting to true: Whether configuration of the zoom level is recognized.
torch, of type boolean, defaulting to true: Whether configuration of torch is recognized.
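
The following non-normative sketch lists which of the members above a User Agent reports as recognized. The constant and function names are hypothetical; `supported` stands for the dictionary returned by getSupportedConstraints().

```javascript
// Member names added to MediaTrackSupportedConstraints by this section.
const IMAGE_CAPTURE_CONSTRAINTS = [
  'whiteBalanceMode', 'exposureMode', 'focusMode', 'pointsOfInterest',
  'exposureCompensation', 'colorTemperature', 'iso',
  'brightness', 'contrast', 'saturation', 'sharpness',
  'focusDistance', 'zoom', 'torch',
];

// Hypothetical helper (not part of this specification).
// In a page: recognizedImageConstraints(
//   navigator.mediaDevices.getSupportedConstraints());
function recognizedImageConstraints(supported) {
  return IMAGE_CAPTURE_CONSTRAINTS.filter(name => supported[name] === true);
}
```

A recognized constraint only means the User Agent understands the name. Whether a particular track offers the feature is reported by getCapabilities().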

9.2. `MediaTrackCapabilities` dictionary

MediaTrackCapabilities is extended here with the capabilities specific to image capture. This dictionary is produced by the UA via getCapabilities() and represents the supported ranges and enumerations of the supported constraints.

partial dictionary MediaTrackCapabilities {
  sequence<DOMString>  whiteBalanceMode;
  sequence<DOMString>  exposureMode;
  sequence<DOMString>  focusMode;

  MediaSettingsRange   exposureCompensation;
  MediaSettingsRange   colorTemperature;
  MediaSettingsRange   iso;

  MediaSettingsRange   brightness;
  MediaSettingsRange   contrast;
  MediaSettingsRange   saturation;
  MediaSettingsRange   sharpness;

  MediaSettingsRange   focusDistance;
  MediaSettingsRange   zoom;

  boolean              torch;
};

9.2.1. Members

whiteBalanceMode, of type sequence<DOMString>: A sequence of supported white balance modes. Each string MUST be one of the members of MeteringMode.
colorTemperature, of type MediaSettingsRange: This range reflects the supported correlated color temperatures to be used for the scene white balance calculation.
exposureMode, of type sequence<DOMString>: A sequence of supported exposure modes. Each string MUST be one of the members of MeteringMode.
exposureCompensation, of type MediaSettingsRange: This reflects the supported range of exposure compensation. The supported range can be, and usually is, centered around 0 EV.
iso, of type MediaSettingsRange: This reflects the permitted range of ISO values.
focusMode, of type sequence<DOMString>: A sequence of supported focus modes. Each string MUST be one of the members of MeteringMode.
brightness, of type MediaSettingsRange: This reflects the supported range of brightness setting of the camera. Values are numeric. Increasing values indicate increasing brightness.
contrast, of type MediaSettingsRange: This reflects the supported range of contrast. Values are numeric. Increasing values indicate increasing contrast.
saturation, of type MediaSettingsRange: This reflects the permitted range of saturation setting. Values are numeric. Increasing values indicate increasing saturation.
sharpness, of type MediaSettingsRange: This reflects the permitted sharpness range of the camera. Values are numeric. Increasing values indicate increasing sharpness, and the minimum value always implies no sharpness enhancement or processing.
focusDistance, of type MediaSettingsRange: This reflects the focus distance value range supported by the UA.
zoom, of type MediaSettingsRange: This reflects the zoom value range supported by the UA.
torch, of type boolean: A boolean indicating whether the camera supports torch mode; true means it is supported.
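
The members above come in three shapes: sequences of mode strings, MediaSettingsRange objects and a boolean. The non-normative sketch below classifies a capability accordingly; the function name `describeCapability` and the returned object layout are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of this specification).
// `capabilities` stands for the dictionary returned by track.getCapabilities().
function describeCapability(capabilities, name) {
  const value = capabilities[name];
  if (value === undefined) {
    return { kind: 'unsupported' };                 // member absent
  }
  if (Array.isArray(value)) {
    return { kind: 'modes', modes: value };         // e.g. focusMode
  }
  if (typeof value === 'boolean') {
    return { kind: 'flag', available: value };      // torch
  }
  // Everything else in this section is a MediaSettingsRange.
  return { kind: 'range', min: value.min, max: value.max, step: value.step };
}
```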

9.3. `MediaTrackConstraintSet` dictionary

The MediaTrackConstraintSet dictionary is used both for reading the current constraints with getConstraints() and for applying a set of constraints with applyConstraints().

MediaTrackSettings can be retrieved to verify the effect of the application by the user agent of the requested MediaTrackConstraints. Some constraints, such as zoom, might not be immediately applicable.

partial dictionary MediaTrackConstraintSet {
  ConstrainDOMString whiteBalanceMode;
  ConstrainDOMString exposureMode;
  ConstrainDOMString focusMode;
  ConstrainPoint2D   pointsOfInterest;

  ConstrainDouble    exposureCompensation;
  ConstrainDouble    colorTemperature;
  ConstrainDouble    iso;

  ConstrainDouble    brightness;
  ConstrainDouble    contrast;
  ConstrainDouble    saturation;
  ConstrainDouble    sharpness;

  ConstrainDouble    focusDistance;
  ConstrainDouble    zoom;

  ConstrainBoolean   torch;
};

9.3.1. Members

whiteBalanceMode, of type ConstrainDOMString: This string MUST be one of the members of MeteringMode. See white balance mode constrainable property.
exposureMode, of type ConstrainDOMString: This string MUST be one of the members of MeteringMode. See exposure constrainable property.
focusMode, of type ConstrainDOMString: This string MUST be one of the members of MeteringMode. See focus mode constrainable property.
colorTemperature, of type ConstrainDouble: See color temperature constrainable property.
exposureCompensation, of type ConstrainDouble: See exposure compensation constrainable property.
iso, of type ConstrainDouble: See iso constrainable property.
pointsOfInterest, of type ConstrainPoint2D: See points of interest constrainable property.
brightness, of type ConstrainDouble: See brightness constrainable property.
contrast, of type ConstrainDouble: See contrast constrainable property.
saturation, of type ConstrainDouble: See saturation constrainable property.
sharpness, of type ConstrainDouble: See sharpness constrainable property.
focusDistance, of type ConstrainDouble: See focus distance constrainable property.
zoom, of type ConstrainDouble: See zoom constrainable property.
torch, of type ConstrainBoolean: See torch constrainable property.
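
The following non-normative sketch assembles an argument for applyConstraints() that only contains members the track reports in getCapabilities(). The function name `buildImageConstraints` is hypothetical. Note that pointsOfInterest has no entry in MediaTrackCapabilities, so this sketch leaves it out.

```javascript
// Hypothetical helper (not part of this specification).
// `capabilities` mimics track.getCapabilities(); `desired` maps constraint
// names to the values the page would like to apply.
function buildImageConstraints(capabilities, desired) {
  const set = {};
  for (const [name, value] of Object.entries(desired)) {
    const capability = capabilities[name];
    if (capability === undefined) {
      continue;                                    // not offered by this track
    }
    if (Array.isArray(capability)) {
      if (capability.includes(value)) set[name] = value;   // mode strings
    } else if (typeof capability === 'boolean') {
      if (capability) set[name] = Boolean(value);           // torch
    } else {
      // Numeric settings: keep the request inside [min, max].
      const n = Number(value);
      set[name] = Math.min(capability.max, Math.max(capability.min, n));
    }
  }
  return { advanced: [set] };
}
```

The result has the same shape as the argument used in the examples of this document, e.g. `track.applyConstraints(buildImageConstraints(track.getCapabilities(), {zoom: 2}))`.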

9.4. `MediaTrackSettings` dictionary

When the getSettings() method is invoked on a video stream track, the user agent must return the extended MediaTrackSettings dictionary, representing the current status of the underlying user agent.

partial dictionary MediaTrackSettings {
  DOMString         whiteBalanceMode;
  DOMString         exposureMode;
  DOMString         focusMode;
  sequence<Point2D> pointsOfInterest;

  double            exposureCompensation;
  double            colorTemperature;
  double            iso;

  double            brightness;
  double            contrast;
  double            saturation;
  double            sharpness;

  double            focusDistance;
  double            zoom;

  boolean           torch;
};

9.4.1. Members

whiteBalanceMode, of type DOMString: Current white balance mode setting. The string MUST be one of the members of MeteringMode.
exposureMode, of type DOMString: Current exposure mode setting. The string MUST be one of the members of MeteringMode.
colorTemperature, of type double: Color temperature in use for the white balance calculation of the scene. This field is only significant if whiteBalanceMode is manual.
exposureCompensation, of type double: Current exposure compensation setting. A value of 0 EV is interpreted as no exposure compensation.
iso, of type double: Current camera ISO setting.
focusMode, of type DOMString: Current focus mode setting. The string MUST be one of the members of MeteringMode.
pointsOfInterest, of type sequence<Point2D>: A sequence of Point2Ds in use as points of interest for other settings, e.g. Focus, Exposure and Auto White Balance.
brightness, of type double: This reflects the current brightness setting of the camera.
contrast, of type double: This reflects the current contrast setting of the camera.
saturation, of type double: This reflects the current saturation setting of the camera.
sharpness, of type double: This reflects the current sharpness setting of the camera.
focusDistance, of type double: This reflects the current focus distance setting of the camera.
zoom, of type double: This reflects the current zoom setting of the camera.
torch, of type boolean: Current camera torch configuration setting.
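
Because some constraints might not take effect immediately, a page may want to compare what it requested with what getSettings() reports. The sketch below is non-normative; the function name `unappliedSettings` and the `tolerance` parameter are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of this specification).
// `requested` maps setting names to requested values; `settings` mimics
// the dictionary returned by track.getSettings().
function unappliedSettings(requested, settings, tolerance = 1e-6) {
  const mismatches = {};
  for (const [name, want] of Object.entries(requested)) {
    const got = settings[name];
    const same = (typeof want === 'number' && typeof got === 'number')
      ? Math.abs(want - got) <= tolerance   // numeric settings are doubles
      : want === got;                       // modes and booleans
    if (!same) {
      mismatches[name] = { requested: want, actual: got };
    }
  }
  return mismatches;
}
```

An empty result indicates that every requested value is currently in effect; otherwise each entry shows the requested and the actual value.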

9.5. Additional Constrainable Properties

dictionary ConstrainPoint2DParameters {
  sequence<Point2D> exact;
  sequence<Point2D> ideal;
};

typedef (sequence<Point2D> or ConstrainPoint2DParameters) ConstrainPoint2D;

9.5.1. Members

exact, of type sequence<Point2D>: The exact required value of points of interest.
ideal, of type sequence<Point2D>: The ideal (target) value of points of interest.

10. Photo Capabilities and Constrainable Properties

Many of the mentioned photo and video capabilities mirror hardware features that are hard to define since they can be implemented in a number of ways. Moreover, manufacturers tend to publish vague definitions to protect their intellectual property.

White balance mode is a setting that cameras use to adjust for different color temperatures. Color temperature is the temperature of background light (usually measured in Kelvin). This setting can usually be automatically and continuously determined by the implementation, but it’s also common to offer a manual mode in which the estimated temperature of the scene illumination is hinted to the implementation. Typical temperature ranges for popular modes are provided below:

Mode	Kelvin range
incandescent	2500-3500
fluorescent	4000-5000
warm-fluorescent	5000-5500
daylight	5500-6500
cloudy-daylight	6500-8000
twilight	8000-9000
shade	9000-10000
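
As a non-normative illustration, the typical ranges in the table above can be used to label a color temperature with a mode name. The constant and function names are hypothetical, and the ranges are copied from the table as typical values only.

```javascript
// Rows of the table above: [mode, minimum Kelvin, maximum Kelvin].
const WHITE_BALANCE_PRESETS = [
  ['incandescent', 2500, 3500],
  ['fluorescent', 4000, 5000],
  ['warm-fluorescent', 5000, 5500],
  ['daylight', 5500, 6500],
  ['cloudy-daylight', 6500, 8000],
  ['twilight', 8000, 9000],
  ['shade', 9000, 10000],
];

// Hypothetical helper (not part of this specification).
// Returns the first mode whose range contains `kelvin`, or null when no row
// matches (for example 3800 K, which falls between two rows).
function presetForTemperature(kelvin) {
  const match = WHITE_BALANCE_PRESETS.find(
    ([, min, max]) => kelvin >= min && kelvin <= max);
  return match ? match[0] : null;
}
```

Adjacent rows share their boundary values, so a temperature such as 5000 K is attributed to the first matching row.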

Exposure is the amount of time during which light is allowed to fall on the photosensitive device. Auto-exposure mode is a camera setting where the exposure levels are automatically adjusted by the implementation based on the subject of the photo.
Focus mode describes the focus setting of the capture device (e.g. auto or manual).
Points of interest describe the metering area centers used in other settings, e.g. exposure, white balance mode and focus mode, each one being a Point2D (usually these three controls are modified simultaneously by the so-called 3A algorithm: auto-focus, auto-exposure, auto-white-balance).
A Point2D Point of Interest is interpreted to represent a pixel position in a normalized square space (|{x,y} ∈ [0.0, 1.0]|). The origin of coordinates |{x,y} = {0.0, 0.0}| represents the upper leftmost corner whereas the |{x,y} = {1.0, 1.0}| represents the lower rightmost corner: the x coordinate (columns) increases rightwards and the y coordinate (rows) increases downwards. Values beyond the mentioned limits will be clamped to the closest allowed value.
Exposure Compensation is a numeric camera setting that adjusts the exposure level from the current value used by the implementation. This value can be used to bias the exposure level enabled by auto-exposure, and usually is a symmetric range around 0 EV (the no-compensation value).
The ISO setting of a camera describes the sensitivity of the camera to light. It is a numeric value, where the higher the value the greater the sensitivity. This value should follow the [iso12232] standard.
Red Eye Reduction is a feature in cameras that is designed to limit or prevent the appearance of red pupils ("Red Eye") in photography subjects due to prolonged exposure to a camera’s flash.
[LIGHTING-VOCABULARY] defines brightness as "the attribute of a visual sensation according to which an area appears to emit more or less light" and in the context of the present API, it refers to the numeric camera setting that adjusts the perceived amount of light emitting from the photo object. A higher brightness setting increases the intensity of darker areas in a scene while compressing the intensity of brighter parts of the scene. The range and effect of this setting is implementation dependent but in general it translates into a numerical value that is added to each pixel with saturation.
Contrast is the numeric camera setting that controls the difference in brightness between light and dark areas in a scene. A higher contrast setting reflects an expansion in the difference in brightness. The range and effect of this setting is implementation dependent but it can be understood as a transformation of the pixel values so that the luma range in the histogram becomes larger; the transformation is sometimes as simple as a gain factor.
[LIGHTING-VOCABULARY] defines saturation as "the colourfulness of an area judged in proportion to its brightness" and in the current context it refers to a numeric camera setting that controls the intensity of color in a scene (i.e. the amount of gray in the scene). Very low saturation levels will result in photos closer to black-and-white. Saturation is similar to contrast but referring to colors, so its implementation, albeit being platform dependent, can be understood as a gain factor applied to the chroma components of a given image.
Sharpness is a numeric camera setting that controls the intensity of edges in a scene. Higher sharpness settings result in higher contrast along the edges, while lower settings result in less contrast and blurrier edges (i.e. soft focus). The implementation is platform dependent, but it can be understood as the linear combination of an edge detection operation applied on the original image and the original image itself; the relative weights being controlled by this sharpness.
Brightness, contrast, saturation and sharpness are specified in [UVC].
Image width and image height represent the supported/desired resolution of the resulting photographic image after any potential sensor corrections and other algorithms are run.
The supported resolutions are exposed separately, as imageWidth and imageHeight values/ranges, to prevent increasing the fingerprinting surface and to allow the UA to make a best-effort decision with regards to actual hardware configuration vis-a-vis requested constraints.
Focus distance is a numeric camera setting that controls the focus distance of the lens. The setting usually represents distance in meters to the optimal focus distance.
Zoom is a numeric camera setting that controls the focal length of the lens. The setting usually represents a ratio, e.g. 4 is a zoom ratio of 4:1. The minimum value is usually 1, to represent a 1:1 ratio (i.e. no zoom).
Fill light mode describes the flash setting of the capture device (e.g. auto, off, on). Torch describes the setting of the source’s fill light as continuously connected, staying on as long as track is active.
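
The normalized coordinate space described above for points of interest can be illustrated with the following non-normative sketch, which converts a pixel position into a Point2D. The function name `toPointOfInterest` and its parameters are hypothetical.

```javascript
// Hypothetical helper (not part of this specification).
// Converts a pixel position inside a frame of `width` x `height` pixels into
// normalized coordinates, with {0.0, 0.0} in the upper leftmost corner and
// {1.0, 1.0} in the lower rightmost corner.
function toPointOfInterest(pixelX, pixelY, width, height) {
  // Values beyond the limits are clamped to the closest allowed value.
  const clamp01 = v => Math.min(1, Math.max(0, v));
  return { x: clamp01(pixelX / width), y: clamp01(pixelY / height) };
}
```

A page could, for instance, pass the coordinates of a click on the preview and apply the result with `track.applyConstraints({advanced: [{pointsOfInterest: [point]}]})`.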

11. `MeteringMode`

enum MeteringMode {
  "none",
  "manual",
  "single-shot",
  "continuous"
};

11.1. Values

none: This source does not offer focus/exposure/white balance mode. For setting, this is interpreted as a command to turn off the feature.
manual: The capture device is set to manually control the lens position/exposure time/white balance, or such a mode is requested to be configured.
single-shot: The capture device is configured for single-sweep autofocus/one-shot exposure/white balance calculation, or such a mode is requested.
continuous: The capture device is configured for continuous focusing for near-zero shutter-lag/continuous auto exposure/white balance calculation, or such continuous focus hunting/exposure/white balance calculation mode is requested.
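
The following non-normative sketch selects a mode from a list of preferences, restricted to the values above and to what a track reports as supported. The names `METERING_MODES`, `isMeteringMode` and `pickMeteringMode` are hypothetical.

```javascript
// The members of the MeteringMode enumeration.
const METERING_MODES = ['none', 'manual', 'single-shot', 'continuous'];

// Hypothetical helpers (not part of this specification).
function isMeteringMode(value) {
  return METERING_MODES.includes(value);
}

// `supportedModes` mimics e.g. track.getCapabilities().focusMode;
// `preferences` lists the page's choices, most preferred first.
function pickMeteringMode(supportedModes, preferences) {
  const choice = preferences.find(
    mode => isMeteringMode(mode) && supportedModes.includes(mode));
  return choice || null;
}
```

For a track reporting `['continuous', 'manual']`, the preference list `['single-shot', 'continuous']` yields `'continuous'`.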

12. `Point2D`

A Point2D represents a location in a two dimensional space. The origin of coordinates is situated in the upper leftmost corner of the space.

dictionary Point2D {
  double x = 0.0;
  double y = 0.0;
};

12.1. Members

x, of type double, defaulting to 0.0: Value of the horizontal (abscissa) coordinate.
y, of type double, defaulting to 0.0: Value of the vertical (ordinate) coordinate.

13. Examples

Slightly modified versions of these examples can be found in e.g. this codepen collection.

13.1. Update camera zoom and `takePhoto()`

The following example can also be found in e.g. this codepen with minimal modifications.

<html>
<body>
<video autoplay></video>
<img>
<button onclick="takePhoto()">Take photo</button>
<input type="range" hidden>
<script>
  var imageCapture;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    const video = document.querySelector('video');
    video.srcObject = mediastream;

    const track = mediastream.getVideoTracks()[0];
    imageCapture = new ImageCapture(track);

    const capabilities = track.getCapabilities();
    // Check whether zoom is supported or not.
    if (!capabilities.zoom) {
      return;
    }

    // Map zoom to a slider element.
    const input = document.querySelector('input[type="range"]');
    input.min = capabilities.zoom.min;
    input.max = capabilities.zoom.max;
    input.step = capabilities.zoom.step;
    input.value = track.getSettings().zoom;

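    input.oninput = function(event) {
      // The slider value is a string; zoom is a numeric setting.
      track.applyConstraints({advanced : [{zoom: Number(event.target.value)}] })
        .catch(err => console.error('applyConstraints() failed: ', err));
    };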
    input.hidden = false;
  }

  function takePhoto() {
    imageCapture.takePhoto()
      .then(blob => {
        console.log('Photo taken: ' + blob.type + ', ' + blob.size + 'B');

        const image = document.querySelector('img');
        image.src = URL.createObjectURL(blob);
      })
      .catch(err => console.error('takePhoto() failed: ', err));
  }
</script>
</body>
</html>

13.2. Repeated grabbing of a frame with `grabFrame()`

The following example can also be found in e.g. this codepen with minimal modifications.

<html>
<body>
<canvas></canvas>
<button onclick="stopGrabFrame()">Stop frame grab</button>
<script>
  const canvas = document.querySelector('canvas');

  var interval;
  var track;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    track = mediastream.getVideoTracks()[0];
    var imageCapture = new ImageCapture(track);
    interval = setInterval(function () {
      imageCapture.grabFrame()
        .then(processFrame)
        .catch(err => console.error('grabFrame() failed: ', err));
    }, 1000);
  }

  function processFrame(imgData) {
    canvas.width = imgData.width;
    canvas.height = imgData.height;
    canvas.getContext('2d').drawImage(imgData, 0, 0);
  }

  function stopGrabFrame(e) {
    clearInterval(interval);
    track.stop();
  }
</script>
</body>
</html>

13.3. Grabbing a Frame and Post-Processing

The following example can also be found in e.g. this codepen with minimal modifications.

<html>
<body>
<canvas></canvas>
<script>
  const canvas = document.querySelector('canvas');

  var track;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    track = mediastream.getVideoTracks()[0];
    var imageCapture = new ImageCapture(track);
    imageCapture.grabFrame()
      .then(processFrame)
      .catch(err => console.error('grabFrame() failed: ', err));
  }

  function processFrame(imageBitmap) {
    track.stop();

    // |imageBitmap| pixels are not directly accessible: we need to paint
    // the grabbed frame onto a <canvas>, then getImageData() from it.
    const ctx = canvas.getContext('2d');
    canvas.width = imageBitmap.width;
    canvas.height = imageBitmap.height;
    ctx.drawImage(imageBitmap, 0, 0);

    // Read back the pixels from the <canvas>, and invert the colors.
    const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);

    var data = imageData.data;
    for (var i = 0; i < data.length; i += 4) {
      data[i]     = 255 - data[i];     // red
      data[i + 1] = 255 - data[i + 1]; // green
      data[i + 2] = 255 - data[i + 2]; // blue
    }
    // Finally, draw the inverted image to the <canvas>
    ctx.putImageData(imageData, 0, 0);
  }
</script>
</body>
</html>

13.4. Update camera focus distance and `takePhoto()`

<html>
<body>
<video autoplay></video>
<img>
<button onclick="takePhoto()">Take photo</button>
<input type="range" hidden>
<script>
  var imageCapture;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    const video = document.querySelector('video');
    video.srcObject = mediastream;

    const track = mediastream.getVideoTracks()[0];
    imageCapture = new ImageCapture(track);

    const capabilities = track.getCapabilities();
    // Check whether focus distance is supported or not.
    if (!capabilities.focusDistance) {
      return;
    }

    // Map focus distance to a slider element.
    const input = document.querySelector('input[type="range"]');
    input.min = capabilities.focusDistance.min;
    input.max = capabilities.focusDistance.max;
    input.step = capabilities.focusDistance.step;
    input.value = track.getSettings().focusDistance;

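    input.oninput = function(event) {
      // The slider value is a string; focusDistance is a numeric setting.
      track.applyConstraints({
        advanced : [{focusMode: "manual", focusDistance: Number(event.target.value)}]
      }).catch(err => console.error('applyConstraints() failed: ', err));
    };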
    input.hidden = false;
  }

  function takePhoto() {
    imageCapture.takePhoto()
      .then(blob => {
        console.log('Photo taken: ' + blob.type + ', ' + blob.size + 'B');

        const image = document.querySelector('img');
        image.src = URL.createObjectURL(blob);
      })
      .catch(err => console.error('takePhoto() failed: ', err));
  }
</script>
</body>
</html>

W3C Working Draft, 21 June 2017
