Copyright © 2012-2016 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document specifies methods and camera settings to produce photographic image capture. The source of images is, or can be referenced via, a MediaStreamTrack.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
Comments on this document are welcomed.
This document was published by the Device and Sensors Working Group and the Web Real-Time Communications Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All comments are welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Device and Sensors Working Group) and a public list of any patent disclosures (Web Real-Time Communications Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
The API defined in this document captures images from a photographic device referenced through a valid [GETUSERMEDIA] MediaStreamTrack. The produced image can be in the form of a [FILEAPI] Blob (see the takePhoto() method) or a [HTML51] ImageBitmap (see grabFrame()). Picture-specific settings can be optionally provided as arguments that can be applied to the device for the capture.
The User Agent must support Promises in order to implement the Image Capture API. Any Promise object is assumed to have a resolver object, with resolve() and reject() methods associated with it.
[Constructor(MediaStreamTrack track)]
interface ImageCapture {
    readonly attribute MediaStreamTrack videoStreamTrack;
    Promise<Blob> takePhoto();
    Promise<PhotoCapabilities> getPhotoCapabilities();
    Promise<void> setOptions(PhotoSettings? photoSettings);
    Promise<ImageBitmap> grabFrame();
};
takePhoto() returns a captured image encoded in the form of a Blob, whereas grabFrame() returns a snapshot of the MediaStreamTrack video feed in the form of a non-encoded ImageBitmap.
ImageCapture
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
track | MediaStreamTrack | ✘ | ✘ | The MediaStreamTrack to be used as source of data. This will be the value of the videoStreamTrack attribute. The MediaStreamTrack passed to the constructor MUST have its kind attribute set to "video", otherwise a DOMException of type NotSupportedError will be thrown. |
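The constructor requirement above can be checked before an ImageCapture object is created. The following is a minimal sketch rather than part of the API: trackProblem is a hypothetical helper name, and it assumes only the kind and readyState attributes that [GETUSERMEDIA] defines on MediaStreamTrack. The readyState check anticipates the "InvalidStateError" rejections described for the methods of this interface.

```javascript
// Hypothetical helper (not defined by this specification): returns the name
// of the DOMException that would be expected for this track, or null if the
// track looks usable.
function trackProblem(track) {
  // The constructor requires a track whose kind is "video".
  if (track.kind !== 'video') {
    return 'NotSupportedError';
  }
  // The ImageCapture methods reject unless the track is live.
  if (track.readyState !== 'live') {
    return 'InvalidStateError';
  }
  return null;
}

// Plain objects stand in for MediaStreamTrack instances here.
console.log(trackProblem({kind: 'audio', readyState: 'live'}));  // NotSupportedError
console.log(trackProblem({kind: 'video', readyState: 'ended'})); // InvalidStateError
console.log(trackProblem({kind: 'video', readyState: 'live'}));  // null
```

In a page, the same function could be called with the track obtained from getVideoTracks() before invoking new ImageCapture(track).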
videoStreamTrack of type MediaStreamTrack, readonly
The MediaStreamTrack passed into the constructor.
takePhoto()
takePhoto() produces the result of a single photographic exposure using the video capture device sourcing the videoStreamTrack, applying any PhotoSettings previously configured, and returning an encoded image in the form of a Blob if successful. When this method is invoked:
1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError". Otherwise:
2. Gather data from the MediaStreamTrack into a Blob containing a single still image. The method of doing this will depend on the underlying device. Capturing the still image may cause mute and unmute events to fire on the Track in question.
3. If the UA is unable to execute the takePhoto() method for any reason (for example, upon invocation of multiple takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException ([WebIDL]) whose name is "UnknownError".
getPhotoCapabilities()
getPhotoCapabilities() is used to retrieve the ranges of available configuration options and their current setting values, if any. When this method is invoked:
1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. Gather data from the MediaStreamTrack into a PhotoCapabilities object containing the available capabilities of the device, including ranges where appropriate. The resolved PhotoCapabilities will also include the current conditions in which the capabilities of the device are found. The method of doing this will depend on the underlying device.
3. If the UA is unable to execute the getPhotoCapabilities() method for any reason (for example, the MediaStreamTrack being ended asynchronously), then the UA MUST return a promise rejected with a new DOMException ([WebIDL]) whose name is "OperationError".
4. Otherwise, return a promise resolved with the PhotoCapabilities object.
setOptions()
setOptions() is used to configure a number of settings affecting the image capture and/or the current video feed in videoStreamTrack. When this method is invoked:
1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. If an invalid PhotoSettings object is passed as argument, return a promise rejected with a new DOMException ([WebIDL]) whose name is "SyntaxError".
3. Configure the underlying image capture device with the settings parameter.
4. If the settings cannot be applied, return a promise rejected with a new DOMException ([WebIDL]) whose name is "OperationError".
5. The effect of the settings MAY be reflected in videoStreamTrack. The result of applying some of the settings MAY force the latter to not satisfy its constraints (e.g. the frame rate). The result of applying some of these settings might not be immediate.
Parameter | Type | Nullable | Optional | Description |
---|---|---|---|---|
settings | PhotoSettings | ✔ | ✘ | The PhotoSettings dictionary to be applied. |
Some PhotoSettings fields represent hardware capabilities that cannot be modified instantaneously, e.g. zoom or focus. setOptions() will resolve the Promise as soon as possible. The actual status of any field can be monitored using getPhotoCapabilities().
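Because a setting may take effect some time after setOptions() resolves, a page can compare the values it requested with the current values reported by getPhotoCapabilities(). The following is a minimal sketch under stated assumptions: pendingSettings and its tolerance parameter are hypothetical, and plain objects stand in for a PhotoSettings dictionary and a resolved PhotoCapabilities object.

```javascript
// Hypothetical helper: lists the numeric settings in `requested` whose
// `current` value in `capabilities` has not yet reached the requested value.
function pendingSettings(requested, capabilities, tolerance = 1e-6) {
  const pending = [];
  for (const [name, value] of Object.entries(requested)) {
    const range = capabilities[name];
    // Only compare fields exposed as ranges with a numeric current value.
    if (typeof value !== 'number' || !range ||
        typeof range.current !== 'number') {
      continue;
    }
    if (Math.abs(range.current - value) > tolerance) {
      pending.push(name);
    }
  }
  return pending;
}

// Stand-ins for the argument of setOptions() and for the object resolved
// by getPhotoCapabilities().
const requested = {zoom: 2.0, iso: 400};
const capabilities = {
  zoom: {min: 1, max: 4, current: 1.5, step: 0.1},
  iso: {min: 100, max: 800, current: 400, step: 100},
};
// Here zoom has not reached 2.0 yet, so it is the only pending setting.
console.log(pendingSettings(requested, capabilities));
```

A page could call getPhotoCapabilities() again after a short delay and repeat the comparison until the returned list is empty.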
grabFrame()
grabFrame() takes a snapshot of the live video being held in the videoStreamTrack, returning an ImageBitmap if successful. grabFrame() returns data only once upon being invoked. When this method is invoked:
1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. Gather data from the MediaStreamTrack into an ImageBitmap object (as defined in [HTML51]). The width and height of the ImageBitmap object are derived from the constraints of the MediaStreamTrack. grabFrame() is affected by any options set by setOptions() if those are reflected in videoStreamTrack.
3. Return a promise resolved with the ImageBitmap object.
4. If the UA is unable to execute the grabFrame() method for any reason (for example, upon invocation of multiple grabFrame()/takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException ([WebIDL]) whose name is "UnknownError".
PhotoCapabilities
interface PhotoCapabilities {
    readonly attribute MeteringMode whiteBalanceMode;
    readonly attribute MediaSettingsRange colorTemperature;
    readonly attribute MeteringMode exposureMode;
    readonly attribute MediaSettingsRange exposureCompensation;
    readonly attribute MediaSettingsRange iso;
    readonly attribute boolean redEyeReduction;
    readonly attribute MeteringMode focusMode;
    readonly attribute MediaSettingsRange brightness;
    readonly attribute MediaSettingsRange contrast;
    readonly attribute MediaSettingsRange saturation;
    readonly attribute MediaSettingsRange sharpness;
    readonly attribute MediaSettingsRange imageHeight;
    readonly attribute MediaSettingsRange imageWidth;
    readonly attribute MediaSettingsRange zoom;
    readonly attribute FillLightMode fillLightMode;
};
whiteBalanceMode of type MeteringMode
colorTemperature of type MediaSettingsRange
exposureMode of type MeteringMode
exposureCompensation of type MediaSettingsRange
iso of type MediaSettingsRange
redEyeReduction of type boolean
focusMode of type MeteringMode
brightness of type MediaSettingsRange
contrast of type MediaSettingsRange
saturation of type MediaSettingsRange
sharpness of type MediaSettingsRange
imageHeight of type MediaSettingsRange
imageWidth of type MediaSettingsRange
zoom of type MediaSettingsRange
fillLightMode of type FillLightMode
Supported resolutions are presented as separate imageWidth and imageHeight ranges to prevent increasing the fingerprinting surface and to allow the UA to make a best-effort decision with regards to actual hardware configuration.
This section is non-normative.
The PhotoCapabilities interface provides the photo-specific settings and their current values. Many of these fields mirror hardware capabilities that are hard to define since they can be implemented in a number of ways. Moreover, hardware manufacturers tend to publish vague definitions to protect their intellectual property. The following definitions are assumed for individual settings and are provided for information purposes:
Color temperature applies to the manual white balance mode, in which the estimated temperature of the scene illumination is hinted to the implementation. Typical temperature ranges for popular modes are provided below:
Mode | Kelvin range |
---|---|
incandescent | 2500-3500 |
fluorescent | 4000-5000 |
warm-fluorescent | 5000-5500 |
daylight | 5500-6500 |
cloudy-daylight | 6500-8000 |
twilight | 8000-9000 |
shade | 9000-10000 |
auto or manual).
sharpness.
auto, off, on).
PhotoSettings
The PhotoSettings object is optionally passed into the setOptions() method in order to modify capture device settings specific to still imagery. Each of the attributes in this object is optional.
dictionary PhotoSettings {
    MeteringMode whiteBalanceMode;
    double colorTemperature;
    MeteringMode exposureMode;
    double exposureCompensation;
    double iso;
    boolean redEyeReduction;
    MeteringMode focusMode;
    sequence<Point2D> pointsOfInterest;
    double brightness;
    double contrast;
    double saturation;
    double sharpness;
    double zoom;
    double imageHeight;
    double imageWidth;
    FillLightMode fillLightMode;
};
whiteBalanceMode of type MeteringMode
colorTemperature of type double
Only significant if whiteBalanceMode is manual.
exposureMode of type MeteringMode
exposureCompensation of type double
iso of type double
redEyeReduction of type boolean
focusMode of type MeteringMode
pointsOfInterest of type sequence<Point2D>
A sequence of Point2Ds to be used as metering area centers for other settings, e.g. Focus, Exposure and Auto White Balance.
A Point2D Point of Interest is interpreted to represent a pixel position in a normalized square space ({x,y} ∈ [0.0, 1.0]). The origin of coordinates {x,y} = {0.0, 0.0} represents the upper leftmost corner whereas {x,y} = {1.0, 1.0} represents the lower rightmost corner: the x coordinate (columns) increases rightwards and the y coordinate (rows) increases downwards. Values beyond the mentioned limits will be clamped to the closest allowed value.
brightness of type double
contrast of type double
saturation of type double
sharpness of type double
zoom of type double
imageHeight of type double
imageWidth of type double
fillLightMode of type FillLightMode
MediaSettingsRange
interface MediaSettingsRange {
    readonly attribute double max;
    readonly attribute double min;
    readonly attribute double current;
    readonly attribute double step;
};
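A page that lets the user pick a value may want to fit it to a MediaSettingsRange before passing it to setOptions(). The following is a minimal sketch, not behaviour mandated by this document: snapToRange is a hypothetical helper, and counting steps from min is an assumption, since the interface above does not state where the step grid starts.

```javascript
// Hypothetical helper: clamps `value` to [range.min, range.max] and snaps it
// to the nearest multiple of range.step, counted from range.min (assumed).
function snapToRange(value, range) {
  const clamped = Math.min(range.max, Math.max(range.min, value));
  if (!(range.step > 0)) {
    return clamped; // No usable step: clamp only.
  }
  const steps = Math.round((clamped - range.min) / range.step);
  // Clamp again: rounding up can overshoot max when the span between min
  // and max is not a whole number of steps.
  return Math.min(range.max, range.min + steps * range.step);
}

// A plain object stands in for a MediaSettingsRange.
const zoomRange = {min: 1, max: 4, current: 1, step: 0.5};
console.log(snapToRange(2.3, zoomRange)); // 2.5
console.log(snapToRange(9, zoomRange));   // 4
console.log(snapToRange(0, zoomRange));   // 1
```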
max of type double, readonly
min of type double, readonly
current of type double, readonly
step of type double, readonly
FillLightMode
enum FillLightMode {
"unavailable",
"auto",
"off",
"flash",
"torch"
};
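unavailable
auto
Firing of the flash is not guaranteed when takePhoto() is called. Use flash to guarantee firing of the flash for the takePhoto() or grabFrame() methods.
off
flash
The flash always fires for the takePhoto() or grabFrame() methods.
torch
The fill light stays on while the source MediaStreamTrack is active.
MeteringMode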
Note that MeteringMode
is used for both status enumeration and for setting options for capture(s).
enum MeteringMode {
"none",
"manual",
"single-shot",
"continuous"
};
none
manual
single-shot
continuous
Point2D
A Point2D represents a location in a two dimensional space. The origin of coordinates is situated in the upper leftmost corner of the space.
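When a Point2D is used as a point of interest, its coordinates are interpreted in the normalized [0.0, 1.0] space described for pointsOfInterest. The following is a minimal sketch of converting a pixel position (for instance, a click on a video element) into such a point; toNormalizedPoint and its parameters are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: converts a pixel position inside a frame of the given
// size into a normalized point, clamping each coordinate to [0.0, 1.0].
function toNormalizedPoint(pixelX, pixelY, frameWidth, frameHeight) {
  const clamp01 = v => Math.min(1, Math.max(0, v));
  return {
    x: clamp01(pixelX / frameWidth),   // columns grow rightwards
    y: clamp01(pixelY / frameHeight),  // rows grow downwards
  };
}

// The centre of a 640x480 frame maps to x = 0.5, y = 0.5.
console.log(toNormalizedPoint(320, 240, 640, 480));
// A position outside the frame is clamped to x = 0, y = 1.
console.log(toNormalizedPoint(-10, 600, 640, 480));
```

The resulting objects could then be placed in the pointsOfInterest sequence of a PhotoSettings dictionary.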
x of type double
y of type double
This section is non-normative.
takePhoto()
<html>
<body>
<video autoplay></video>
<img>
<input type="range" hidden>
<script>
  var imageCapture;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    const video = document.querySelector('video');
    video.srcObject = mediastream;

    const track = mediastream.getVideoTracks()[0];
    imageCapture = new ImageCapture(track);

    imageCapture.getPhotoCapabilities()
      .then(photoCapabilities => {
        // Check whether zoom is supported or not.
        if (!photoCapabilities.zoom.min && !photoCapabilities.zoom.max) {
          return;
        }

        // Map zoom to a slider element.
        const input = document.querySelector('input[type="range"]');
        input.min = photoCapabilities.zoom.min;
        input.max = photoCapabilities.zoom.max;
        input.step = photoCapabilities.zoom.step;
        input.value = photoCapabilities.zoom.current;
        input.oninput = function(event) {
          imageCapture.setOptions({zoom: event.target.value});
        };
        input.hidden = false;
      })
      .catch(err => console.error('getPhotoCapabilities() failed: ', err));
  }

  function takePhoto() {
    imageCapture.takePhoto()
      .then(blob => {
        console.log('Photo taken: ' + blob.type + ', ' + blob.size + 'B');

        const image = document.querySelector('img');
        image.src = URL.createObjectURL(blob);
      })
      .catch(err => console.error('takePhoto() failed: ', err));
  }
</script>
</body>
</html>

<html>
<body>
<canvas></canvas>
<button onclick="stopGrabFrame()">Stop frame grab</button>
<script>
  const canvas = document.querySelector('canvas');

  var interval;
  var track;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    track = mediastream.getVideoTracks()[0];
    var imageCapture = new ImageCapture(track);
    // Grab one frame per second until stopGrabFrame() is called.
    interval = setInterval(function () {
      imageCapture.grabFrame()
        .then(processFrame)
        .catch(err => console.error('grabFrame() failed: ', err));
    }, 1000);
  }

  function processFrame(imgData) {
    canvas.width = imgData.width;
    canvas.height = imgData.height;
    canvas.getContext('2d').drawImage(imgData, 0, 0);
  }

  function stopGrabFrame(e) {
    clearInterval(interval);
    track.stop();
  }
</script>
</body>
</html>

<html>
<body>
<canvas></canvas>
<script>
  const canvas = document.querySelector('canvas');

  var track;

  navigator.mediaDevices.getUserMedia({video: true})
    .then(gotMedia)
    .catch(err => console.error('getUserMedia() failed: ', err));

  function gotMedia(mediastream) {
    track = mediastream.getVideoTracks()[0];
    var imageCapture = new ImageCapture(track);
    imageCapture.grabFrame()
      .then(processFrame)
      .catch(err => console.error('grabFrame() failed: ', err));
  }

  function processFrame(imageBitmap) {
    track.stop();

    // |imageBitmap| pixels are not directly accessible: we need to paint
    // the grabbed frame onto a <canvas>, then getImageData() from it.
    const ctx = canvas.getContext('2d');
    canvas.width = imageBitmap.width;
    canvas.height = imageBitmap.height;
    ctx.drawImage(imageBitmap, 0, 0);

    // Read back the pixels from the <canvas>, and invert the colors.
    const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
    var data = imageData.data;
    for (var i = 0; i < data.length; i += 4) {
      data[i] = 255 - data[i];         // red
      data[i + 1] = 255 - data[i + 1]; // green
      data[i + 2] = 255 - data[i + 2]; // blue
    }

    // Finally, draw the inverted image to the <canvas>.
    ctx.putImageData(imageData, 0, 0);
  }
</script>
</body>
</html>