Copyright © 2012-2017 W3C® (MIT, ERCIM, Keio, Beihang). W3C liability, trademark and document use rules apply.
This document specifies methods and camera settings for producing photographic image capture. The source of the images is a MediaStreamTrack, or can be referenced via one.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
Comments on this document are welcomed.
This document was published by the Device and Sensors Working Group and the Web Real-Time Communications Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-media-capture@w3.org (subscribe, archives). All comments are welcome.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures (Device and Sensors Working Group) and a public list of any patent disclosures (Web Real-Time Communications Working Group) made in connection with the deliverables of each group; these pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 1 September 2015 W3C Process Document.
The API defined in this document captures images from a photographic device referenced through a valid [GETUSERMEDIA] MediaStreamTrack. The produced image can be in the form of a [FILEAPI] Blob (see the takePhoto() method) or a [HTML51] ImageBitmap (see grabFrame()). Picture-specific settings can optionally be provided as arguments to be applied to the device for the capture.
The User Agent must support Promises in order to implement the Image Capture API. Any Promise object is assumed to have a resolver object, with resolve() and reject() methods, associated with it.
[Constructor(MediaStreamTrack track)]
interface ImageCapture {
readonly attribute MediaStreamTrack videoStreamTrack;
Promise<Blob> takePhoto();
Promise<PhotoCapabilities> getPhotoCapabilities();
Promise<void> setOptions(PhotoSettings? photoSettings);
Promise<ImageBitmap> grabFrame();
};
takePhoto() returns a captured image encoded in the form of a Blob, whereas grabFrame() returns a snapshot of the MediaStreamTrack video feed in the form of a non-encoded ImageBitmap.
Constructor parameters:

| Parameter | Type | Nullable | Optional | Description |
|---|---|---|---|---|
| track | MediaStreamTrack | ✘ | ✘ | The MediaStreamTrack to be used as the source of data. This will be the value of the videoStreamTrack attribute. The MediaStreamTrack passed to the constructor MUST have its kind attribute set to "video"; otherwise a DOMException of type NotSupportedError will be thrown. |
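The kind requirement above can be sketched as a plain guard function. This is a hypothetical helper mirroring the check the ImageCapture constructor itself performs, not part of the API:

```javascript
// Hypothetical guard mirroring the constructor's requirement: only tracks
// whose kind is "video" may be used; anything else yields a
// NotSupportedError-style error.
function assertVideoTrack(track) {
  if (track.kind !== 'video') {
    const err = new Error('ImageCapture requires a video MediaStreamTrack');
    err.name = 'NotSupportedError';
    throw err;
  }
  return track;
}
```

In a browser one would simply write `new ImageCapture(mediastream.getVideoTracks()[0])` and let the constructor throw.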
videoStreamTrack of type MediaStreamTrack, readonly
The MediaStreamTrack passed into the constructor.

takePhoto()
takePhoto() produces the result of a single photographic exposure using the video capture device sourcing the videoStreamTrack, applying any PhotoSettings previously configured, and returning an encoded image in the form of a Blob if successful. When this method is invoked:

1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. Otherwise, gather data from the MediaStreamTrack into a Blob containing a single still image. The method of doing this will depend on the underlying device. The device MAY temporarily stop streaming in order to take the photo and then resume; in that case, the stopping and restarting of streaming SHOULD cause mute and unmute events to fire on the Track in question.
3. If the UA is unable to execute the takePhoto() method for any reason (for example, upon invocation of multiple takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException ([WebIDL]) whose name is "UnknownError". Otherwise, resolve the promise with the Blob object.

getPhotoCapabilities()
getPhotoCapabilities() is used to retrieve the ranges of available configuration options and their current setting values, if any. When this method is invoked:

1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. Otherwise, gather data from the MediaStreamTrack into a PhotoCapabilities object containing the available capabilities of the device, including ranges where appropriate. The resolved PhotoCapabilities will also include the current conditions in which the capabilities of the device are found. The method of doing this will depend on the underlying device.
3. If the UA is unable to execute the getPhotoCapabilities() method for any reason (for example, the MediaStreamTrack being ended asynchronously), then the UA MUST return a promise rejected with a new DOMException ([WebIDL]) whose name is "OperationError". Otherwise, resolve the promise with the PhotoCapabilities object.

setOptions()
setOptions() is used to configure a number of settings affecting the image capture and/or the current video feed in videoStreamTrack. When this method is invoked:

1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. If no PhotoSettings object is passed as argument, return a promise rejected with a new DOMException ([WebIDL]) whose name is "SyntaxError".
3. Otherwise, attempt to configure the device according to the settings parameter. If the UA is unable to apply the settings, return a promise rejected with a new DOMException ([WebIDL]) whose name is "OperationError"; otherwise, return a resolved promise.

Note that the configuration MAY affect the videoStreamTrack. The result of applying some of the settings MAY force the latter to not satisfy its constraints (e.g. the frame rate). The result of applying some of these settings might not be immediate.
| Parameter | Type | Nullable | Optional | Description |
|---|---|---|---|---|
| settings | PhotoSettings | ✔ | ✘ | The PhotoSettings dictionary to be applied. |
Some PhotoSettings represent hardware capabilities that cannot be modified instantaneously, e.g. zoom or focus. setOptions() will resolve the Promise as soon as possible; the actual status of any field can be monitored using getPhotoCapabilities().
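Since only values inside the advertised MediaSettingsRange are meaningful, a caller can pre-clamp a requested value before passing it to setOptions(). A minimal sketch, with a hypothetical helper name:

```javascript
// Hypothetical helper: clamp a requested value into a MediaSettingsRange
// ({min, max, step}) and snap it to the nearest step offset from min.
function clampToRange(value, range) {
  const clamped = Math.min(Math.max(value, range.min), range.max);
  const steps = Math.round((clamped - range.min) / range.step);
  return range.min + steps * range.step;
}
```

Typical usage, assuming `capabilities` was obtained from getPhotoCapabilities(): `imageCapture.setOptions({zoom: clampToRange(5.3, capabilities.zoom)})`.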
grabFrame()
grabFrame() takes a snapshot of the live video being held in the videoStreamTrack, returning an ImageBitmap if successful. grabFrame() returns data only once upon being invoked. When this method is invoked:

1. If the readyState of the MediaStreamTrack provided in the constructor is not live, return a promise rejected with a new DOMException ([WebIDL]) whose name is "InvalidStateError".
2. Otherwise, gather data from the MediaStreamTrack into an ImageBitmap object (as defined in [HTML51]). The width and height of the ImageBitmap object are derived from the constraints of the MediaStreamTrack. Note that grabFrame() is affected by any options set by setOptions() if those are reflected in videoStreamTrack.
3. If the UA is unable to execute the grabFrame() method for any reason (for example, upon invocation of multiple grabFrame()/takePhoto() method calls in rapid succession), then the UA MUST return a promise rejected with a new DOMException ([WebIDL]) whose name is "UnknownError". Otherwise, resolve the promise with the ImageBitmap object.

PhotoCapabilities
interface PhotoCapabilities {
readonly attribute MeteringMode whiteBalanceMode;
readonly attribute MediaSettingsRange colorTemperature;
readonly attribute MeteringMode exposureMode;
readonly attribute MediaSettingsRange exposureCompensation;
readonly attribute MediaSettingsRange iso;
readonly attribute boolean redEyeReduction;
readonly attribute MeteringMode focusMode;
readonly attribute MediaSettingsRange brightness;
readonly attribute MediaSettingsRange contrast;
readonly attribute MediaSettingsRange saturation;
readonly attribute MediaSettingsRange sharpness;
readonly attribute MediaSettingsRange imageHeight;
readonly attribute MediaSettingsRange imageWidth;
readonly attribute MediaSettingsRange zoom;
readonly attribute FillLightMode fillLightMode;
};
whiteBalanceMode of type MeteringMode
The current white balance mode.
colorTemperature of type MediaSettingsRange
The range of available color temperatures, used when whiteBalanceMode is manual.
exposureMode of type MeteringMode
The current exposure mode.
exposureCompensation of type MediaSettingsRange
The range of supported exposure compensation values.
iso of type MediaSettingsRange
The range of supported ISO values.
redEyeReduction of type boolean
Whether red eye reduction is available.
focusMode of type MeteringMode
The current focus mode.
brightness of type MediaSettingsRange
The range of supported brightness values.
contrast of type MediaSettingsRange
The range of supported contrast values.
saturation of type MediaSettingsRange
The range of supported saturation values.
sharpness of type MediaSettingsRange
The range of supported sharpness values.
imageHeight of type MediaSettingsRange
The range of supported image heights.
imageWidth of type MediaSettingsRange
The range of supported image widths.
zoom of type MediaSettingsRange
The range of supported zoom values.
fillLightMode of type FillLightMode
The current fill light mode; see FillLightMode.

The supported resolutions are presented as imageWidth and imageHeight ranges to prevent increasing the fingerprinting surface and to allow the UA to make a best-effort decision with regards to actual hardware configuration.
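MediaSettingsRange attributes are interface objects rather than plain dictionaries, so serializing them for logging requires copying their fields explicitly. A small sketch, with a hypothetical helper name:

```javascript
// Hypothetical helper: copy the four fields of a MediaSettingsRange into a
// plain, JSON-friendly object.
function rangeToObject(range) {
  return { min: range.min, max: range.max, current: range.current, step: range.step };
}
```

For example, `console.log(JSON.stringify(rangeToObject(capabilities.zoom)))` prints the zoom range, whereas stringifying the range object directly would typically yield `{}`.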
This section is non-normative.
The PhotoCapabilities interface provides the photo-specific settings and their current values. Many of these fields mirror hardware capabilities that are hard to define, since they can be implemented in a number of ways. Moreover, hardware manufacturers tend to publish vague definitions to protect their intellectual property. The following definitions are assumed for individual settings and are provided for informational purposes:
White balance mode is typically offered either as an automatic mode, in which the implementation estimates the scene illumination, or as a manual mode in which the estimated temperature of the scene illumination is hinted to the implementation. Typical temperature ranges for popular modes are provided below:
| Mode | Kelvin range |
|---|---|
| incandescent | 2500-3500 |
| fluorescent | 4000-5000 |
| warm-fluorescent | 5000-5500 |
| daylight | 5500-6500 |
| cloudy-daylight | 6500-8000 |
| twilight | 8000-9000 |
| shade | 9000-10000 |
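For manual white balance, a caller might hint a temperature drawn from the table above via colorTemperature. The midpoint values and helper below are illustrative only, not mandated by the specification:

```javascript
// Illustrative Kelvin midpoints for the modes tabulated above.
const TYPICAL_TEMPERATURE_K = {
  'incandescent':     3000,
  'fluorescent':      4500,
  'warm-fluorescent': 5250,
  'daylight':         6000,
  'cloudy-daylight':  7250,
  'twilight':         8500,
  'shade':            9500
};

// Hypothetical helper: look up a typical temperature for a named mode.
function temperatureFor(mode) {
  if (!(mode in TYPICAL_TEMPERATURE_K)) {
    throw new Error('unknown white balance mode: ' + mode);
  }
  return TYPICAL_TEMPERATURE_K[mode];
}
```

Typical usage: `imageCapture.setOptions({whiteBalanceMode: 'manual', colorTemperature: temperatureFor('daylight')})`.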
Focus mode describes the focus setting of the capture device (e.g. auto or manual). Brightness, contrast and saturation are image processing adjustments, as is sharpness, which refers to the intensity of edges in the image.

Fill light mode describes the flash setting of the capture device (e.g. auto, off, on).

PhotoSettings
The PhotoSettings object is optionally passed into the setOptions() method in order to modify capture device settings specific to still imagery. Each of the attributes in this object is optional.
dictionary PhotoSettings {
MeteringMode whiteBalanceMode;
double colorTemperature;
MeteringMode exposureMode;
double exposureCompensation;
double iso;
boolean redEyeReduction;
MeteringMode focusMode;
sequence<Point2D> pointsOfInterest;
double brightness;
double contrast;
double saturation;
double sharpness;
double zoom;
double imageHeight;
double imageWidth;
FillLightMode fillLightMode;
};
whiteBalanceMode of type MeteringMode
The white balance mode to use; see MeteringMode.
colorTemperature of type double
The color temperature to use; applied only when whiteBalanceMode is manual.
exposureMode of type MeteringMode
The exposure mode to use; see MeteringMode.
exposureCompensation of type double
The exposure compensation to use.
iso of type double
The ISO setting to use.
redEyeReduction of type boolean
Whether red eye reduction should be used, if available.
focusMode of type MeteringMode
The focus mode to use; see MeteringMode.
pointsOfInterest of type sequence<Point2D>
A sequence of Point2Ds to be used as metering area centers for other settings, e.g. Focus, Exposure and Auto White Balance.
Point2D Point of Interest is interpreted to represent a pixel position in a normalized square space ({x,y} ∈ [0.0, 1.0]). The origin of coordinates {x,y} = {0.0, 0.0} represents the upper leftmost corner whereas the {x,y} = {1.0, 1.0} represents the lower rightmost corner: the x coordinate (columns) increases rightwards and the y coordinate (rows) increases downwards. Values beyond the mentioned limits will be clamped to the closest allowed value.
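The clamping behaviour described above can be sketched as follows (the helper name is hypothetical; in practice the UA performs the clamping):

```javascript
// Clamp a Point2D into the normalized [0.0, 1.0] square; out-of-range
// coordinates are clamped to the closest allowed value.
function clampPointOfInterest(point) {
  const clamp = v => Math.min(Math.max(v, 0.0), 1.0);
  return { x: clamp(point.x), y: clamp(point.y) };
}
```

Typical usage: `imageCapture.setOptions({pointsOfInterest: [clampPointOfInterest({x: 1.2, y: 0.4})]})`.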
brightness of type double
The brightness setting to use.
contrast of type double
The contrast setting to use.
saturation of type double
The saturation setting to use.
sharpness of type double
The sharpness setting to use.
zoom of type double
The zoom setting to use.
imageHeight of type double
The desired image height.
imageWidth of type double
The desired image width.
fillLightMode of type FillLightMode
The fill light mode to use; see FillLightMode.

MediaSettingsRange
interface MediaSettingsRange {
readonly attribute double max;
readonly attribute double min;
readonly attribute double current;
readonly attribute double step;
};
max of type double, readonly
The maximum value of this setting.
min of type double, readonly
The minimum value of this setting.
current of type double, readonly
The current value of this setting.
step of type double, readonly
The minimum difference between consecutive values of this setting.

FillLightMode
enum FillLightMode {
"unavailable",
"auto",
"off",
"flash",
"torch"
};
unavailable
The source's fill light is not available.
auto
The fill light will be enabled when required (typically in low light conditions); otherwise it will not fire. Note that auto does not guarantee that the flash will fire when takePhoto() is called; use flash to guarantee firing of the flash for the takePhoto() or grabFrame() methods.
off
The source's fill light and/or flash will not be used.
flash
The flash will always fire for the takePhoto() or grabFrame() methods.
torch
The source's fill light will be turned on, if available, while the MediaStreamTrack is active.

MeteringMode
Note that MeteringMode is used for both status enumeration and for setting options for capture(s).
enum MeteringMode {
"none",
"manual",
"single-shot",
"continuous"
};
none
The corresponding metering mode is not used or is unavailable.
manual
The capture device is set to manually control the corresponding setting.
single-shot
The capture device will automatically adjust the corresponding setting once, then lock it.
continuous
The capture device will continuously adjust the corresponding setting.

Point2D
A Point2D represents a location in a two dimensional space. The origin of coordinates is situated in the upper leftmost corner of the space.
x of type double
The horizontal (column) coordinate.
y of type double
The vertical (row) coordinate.

This section is non-normative.

Update camera zoom and takePhoto()
<html>
<body>
<video autoplay></video>
<img>
<input type="range" hidden>
<script>
var imageCapture;
navigator.mediaDevices.getUserMedia({video: true})
.then(gotMedia)
.catch(err => console.error('getUserMedia() failed: ', err));
function gotMedia(mediastream) {
const video = document.querySelector('video');
video.srcObject = mediastream;
const track = mediastream.getVideoTracks()[0];
imageCapture = new ImageCapture(track);
imageCapture.getPhotoCapabilities()
.then(photoCapabilities => {
// Check whether zoom is supported or not.
if (!photoCapabilities.zoom.min && !photoCapabilities.zoom.max) {
return;
}
// Map zoom to a slider element.
const input = document.querySelector('input[type="range"]');
input.min = photoCapabilities.zoom.min;
input.max = photoCapabilities.zoom.max;
input.step = photoCapabilities.zoom.step;
input.value = photoCapabilities.zoom.current;
input.oninput = function(event) {
imageCapture.setOptions({zoom: event.target.value});
}
input.hidden = false;
})
.catch(err => console.error('getPhotoCapabilities() failed: ', err));
}
function takePhoto() {
imageCapture.takePhoto()
.then(blob => {
console.log('Photo taken: ' + blob.type + ', ' + blob.size + 'B');
const image = document.querySelector('img');
image.src = URL.createObjectURL(blob);
})
.catch(err => console.error('takePhoto() failed: ', err));
}
</script>
</body>
</html>
Repeated grabbing of a frame with grabFrame()
<html>
<body>
<canvas></canvas>
<button onclick="stopGrabFrame()">Stop frame grab</button>
<script>
const canvas = document.querySelector('canvas');
var interval;
var track;
navigator.mediaDevices.getUserMedia({video: true})
.then(gotMedia)
.catch(err => console.error('getUserMedia() failed: ', err));
function gotMedia(mediastream) {
track = mediastream.getVideoTracks()[0];
var imageCapture = new ImageCapture(track);
interval = setInterval(function () {
imageCapture.grabFrame()
.then(processFrame)
.catch(err => console.error('grabFrame() failed: ', err));
}, 1000);
}
function processFrame(imgData) {
canvas.width = imgData.width;
canvas.height = imgData.height;
canvas.getContext('2d').drawImage(imgData, 0, 0);
}
function stopGrabFrame(e) {
clearInterval(interval);
track.stop();
}
</script>
</body>
</html>
Grabbing a frame for post-processing
<html>
<body>
<canvas></canvas>
<script>
const canvas = document.querySelector('canvas');
var track;
navigator.mediaDevices.getUserMedia({video: true})
.then(gotMedia)
.catch(err => console.error('getUserMedia() failed: ', err));
function gotMedia(mediastream) {
track = mediastream.getVideoTracks()[0];
var imageCapture = new ImageCapture(track);
imageCapture.grabFrame()
.then(processFrame)
.catch(err => console.error('grabFrame() failed: ', err));
}
function processFrame(imageBitmap) {
track.stop();
// |imageBitmap| pixels are not directly accessible: we need to paint
// the grabbed frame onto a <canvas>, then getImageData() from it.
const ctx = canvas.getContext('2d');
canvas.width = imageBitmap.width;
canvas.height = imageBitmap.height;
ctx.drawImage(imageBitmap, 0, 0);
// Read back the pixels from the <canvas>, and invert the colors.
const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
var data = imageData.data;
for (var i = 0; i < data.length; i += 4) {
data[i] = 255 - data[i]; // red
data[i + 1] = 255 - data[i + 1]; // green
data[i + 2] = 255 - data[i + 2]; // blue
}
// Finally, draw the inverted image to the <canvas>
ctx.putImageData(imageData, 0, 0);
}
</script>
</body>
</html>