HTML Media Capture

W3C Working Draft 28 September 2010

Editors:
Ilkka Oksanen, Nokia
Dominique Hazaël-Massieux, W3C


This specification defines HTML form enhancements that provide access to the audio, image and video capture capabilities of the device.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is the first part of the split of the previous version of this document. It focuses on the integration of media capture in HTML forms, based on an extension to the FileAPI. The second part of the split, focused on programmatic access to the capture devices, will be published separately.

The Working Group is looking for feedback on the general approach of this new version, and will coordinate with the HTML and Web Applications Working Groups to ensure the proper progress of this document.

Issues and editors' notes in the document highlight some of the points on which the group is still working and would particularly like to get feedback.

This document was published by the Device APIs and Policy Working Group as a Working Draft. This document is intended to become a W3C Recommendation. If you wish to make comments regarding this document, please send them to public-device-apis@w3.org (subscribe, archives). All feedback is welcome.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

1. Introduction

The HTML Media Capture specification defines a new interface for media files, a new parameter for the accept attribute of the HTML input element in the File Upload state, and recommendations for providing optimized access to the microphone and camera of a hosting device.

Providing streaming access to these capabilities is outside of the scope of this specification.

The Working Group is investigating the opportunity to specify streaming access via the proposed <device> element.

2. Conformance

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC2119].

This specification defines conformance criteria that apply to a single product: the user agent that implements the interfaces that it contains.

Implementations that use ECMAScript to implement the APIs defined in this specification must implement them in a manner consistent with the ECMAScript Bindings defined in the Web IDL specification [WEBIDL], as this specification uses that specification and terminology.

3. Security and Privacy Considerations

This specification builds upon the security and privacy protections provided by the [HTML5] <input type="file"> and the [FILE-API] specifications; in particular, it is expected that any offer to start capturing content from the user’s device would require a specific user interaction on an HTML element that is entirely controlled by the user agent.

In addition to the requirements already highlighted in the [HTML5] and [FILE-API] specifications, implementors should take care to avoid additional leakage of privacy-sensitive data from captured media. For instance, embedding the user’s location in the metadata of captured media (e.g. EXIF) might transmit more private data than the user expects.

4. Capture aware file-select control

This section is normative.

[HTML5] links <input type="file"> to the File interface. This specification defines a refined MediaFile interface to be used when the accept attribute takes certain values; this will require coordination with the HTML5 Working Group.

If an input element in the File Upload state [HTML5] contains an accept attribute with the value image/*, audio/*, or video/*, the user agent can invoke a file picker that allows the user, respectively, to take a picture, record a sound file, or record a video, in addition to selecting an existing file from the file system.

See the User Interface Examples appendix for the illustration.

In case the user chooses to capture video, audio, or image content, the user agent creates media files on the fly as specified in [HTML5].

If the user selects files whose MIME types match image/*, audio/*, or video/* (on the filesystem or via a successful media capture), the relevant files in the files attribute [HTML5] must implement the MediaFile interface.

<input type="file" accept="image/*" id="capture"> 
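As a non-normative sketch of the matching described in this section, the function below derives which capture options a file picker might offer from an accept value. The name captureOptionsFor, the CAPTURE_OPTIONS table and the option labels are illustrative assumptions and are not defined by this specification. Treating both audio and sound as the top-level type for sound recordings is also an assumption of the sketch.

```javascript
// Hypothetical helper: maps the top-level type of each accept token to a
// capture option the file picker could offer. Names are illustrative only.
var CAPTURE_OPTIONS = {
  image: "take a picture",
  audio: "record a sound file",
  sound: "record a sound file", // assumption: tolerate this spelling too
  video: "record a video"
};

function captureOptionsFor(accept) {
  var options = [];
  String(accept || "").split(",").forEach(function (token) {
    // Drop any MIME parameters (";name=value") and surrounding whitespace.
    var mediaType = token.split(";")[0].trim().toLowerCase();
    var parts = mediaType.split("/");
    // Only wildcard types such as "image/*" trigger capture in this sketch.
    if (parts.length === 2 && parts[1] === "*" &&
        Object.prototype.hasOwnProperty.call(CAPTURE_OPTIONS, parts[0])) {
      var option = CAPTURE_OPTIONS[parts[0]];
      if (options.indexOf(option) === -1) {
        options.push(option);
      }
    }
  });
  return options;
}
```

A user agent remains free to offer only file selection; the sketch merely lists what the accept value would permit.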

5. The capture parameter

This section is normative.

The capture parameter may be specified on the media type values of the accept attribute to give user agents a hint that the file picker should be in media capturing mode by default.

[HTML5] defines the accept attribute to take no parameters on MIME types. This specification proposes to use a MIME type parameter — this will require coordination with the HTML5 Working Group.

The capture parameter can take one of the following values: camera, camcorder, microphone, filesystem. These values indicate which source the file picker interface should preferably present to the user by default.

The values and their exact meaning are still very much in flux.

For example, the following code indicates that the user is expected to upload an image from the device camera:

<input type="file" accept="image/*;capture=camera" id="capture"> 

A possible rendering of a file picker taking this parameter into account is offered in the User Interface Examples appendix.
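The following non-normative sketch shows one way the capture hint could be read from a single accept token. The function name parseCaptureHint and the returned object shape are hypothetical. Lower-casing the value and ignoring unknown values are assumptions of the sketch, since the values and their meaning are still in flux.

```javascript
// The four capture values listed in this section.
var CAPTURE_VALUES = ["camera", "camcorder", "microphone", "filesystem"];

// Hypothetical helper: splits one accept token into its media type and its
// capture hint. Returns capture: null when the hint is absent or unknown.
function parseCaptureHint(acceptToken) {
  var pieces = String(acceptToken).split(";");
  var result = { type: pieces[0].trim().toLowerCase(), capture: null };
  for (var i = 1; i < pieces.length; i++) {
    var eq = pieces[i].indexOf("=");
    if (eq === -1) {
      continue; // not a name=value parameter
    }
    var name = pieces[i].slice(0, eq).trim().toLowerCase();
    var value = pieces[i].slice(eq + 1).trim().toLowerCase();
    if (name === "capture" && CAPTURE_VALUES.indexOf(value) !== -1) {
      result.capture = value;
    }
  }
  return result;
}
```

Because the parameter is only a hint, a caller would fall back to the ordinary file picker whenever capture is null.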

6. WebIDL interfaces

6.1 Example

After the user has successfully captured or selected an existing media file, the format properties of the file can be retrieved as follows:

var captureInput = document.getElementById('capture');
// Accessing the file object from the input element with id capture
var file = captureInput.files[0];
if (file) {
  // getting format data asynchronously
  file.getFormatData(displayFormatData, errorHandler);
}

// success callback when getting format data
function displayFormatData(formatData) {
  var mainType = file.type.split("/")[0]; // "image", "video" or "audio"
  var mediaDescriptionNode = document.createElement("p");
  if (mainType === "image") {
    mediaDescriptionNode.appendChild(document.createTextNode("This is an image of dimensions " + formatData.width + "x" + formatData.height));
  } else {
    mediaDescriptionNode.appendChild(document.createTextNode("Duration: " + formatData.duration + "s"));
  }
  captureInput.parentNode.insertBefore(mediaDescriptionNode, captureInput);
}

// error callback if getting format data fails
function errorHandler(error) {
  alert("Couldn’t retrieve format properties for the selected file (error code " + error.code + ")");
}
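The example above assumes that the selected file implements MediaFile. As a non-normative sketch, a page could check for the method before calling it. The names supportsFormatData and requestFormatData are hypothetical, and passing null to the success callback when the method is missing is a convention of this sketch only.

```javascript
// Returns true only when the object exposes a callable getFormatData member.
function supportsFormatData(file) {
  return !!file && typeof file.getFormatData === "function";
}

// Hypothetical wrapper: calls getFormatData when available, otherwise signals
// that no format data can be obtained by passing null to the callback.
function requestFormatData(file, onData, onError) {
  if (supportsFormatData(file)) {
    file.getFormatData(onData, onError);
  } else {
    onData(null);
  }
}
```

With this wrapper, a plain File object that lacks the method takes the same code path as a MediaFile, and the callback decides what to display when it receives null.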

6.2 MediaFileData interface

MediaFileData encapsulates format information of a media file.

The relationship between this MediaFileData interface and the properties made available through the API for Media Resource 1.0 [MEDIAONT-API] needs further investigation.

interface MediaFileData {
    attribute DOMString     codecs;
    attribute unsigned long bitrate;
    attribute unsigned long height;
    attribute unsigned long width;
    attribute float         duration;
};

6.2.1 Attributes

bitrate of type unsigned long
The bitrate attribute represents the average bitrate of the content. The codecs attribute specifies only the profile and level of the encoded content, which determine the maximum encoded bitrate but not the actual bitrate. In the case of an image this attribute has value 0.
No exceptions.
codecs of type DOMString
The type attribute of the Blob interface (inherited from the File interface) is not sufficient to determine the format of the content since it only specifies the container type. The codecs attribute represents the actual format of the audio and video of the content. The codecs attribute must conform to [RFC4281]. For example, a valid value for H.263 video and AAC low complexity audio would be codecs="s263, mp4a.40.2".

This could be turned into a list of DOMString rather than keeping it as a comma-separated values list; this needs some care with regard to the RFC ref.

No exceptions.
duration of type float
The duration attribute represents the length of the video or sound clip in seconds. In the case of an image this attribute has value 0.
No exceptions.
height of type unsigned long
The height attribute represents the height of the image or video in pixels. In the case of a sound clip this attribute has value 0.
No exceptions.
width of type unsigned long
The width attribute represents the width of the image or video in pixels. In the case of a sound clip this attribute has value 0.
No exceptions.
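Since the codecs attribute is a comma-separated list, script authors may want its individual entries. The following non-normative sketch splits the value; splitCodecs is a hypothetical name. The sketch assumes entries contain no commas of their own, and it does not attempt to handle quoted or encoded forms that [RFC4281] may allow.

```javascript
// Hypothetical helper: turns a comma-separated codecs value into an array.
// Assumption: no entry contains a comma; quoting and encoding are ignored.
function splitCodecs(codecs) {
  return String(codecs || "")
    .split(",")
    .map(function (entry) { return entry.trim(); })
    .filter(function (entry) { return entry.length > 0; });
}
```

For the example value given for the codecs attribute, splitCodecs("s263, mp4a.40.2") yields the two entries s263 and mp4a.40.2.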

Some of the proposed attributes of the MediaFileData interface could possibly be integrated as parameters of the MIME type, or as MIME options object.

6.3 MediaFile interface

MediaFile encapsulates a single photo, video or sound from the device. It inherits from File [FILE-API].

interface MediaFile : File {
    void getFormatData (in MediaFileDataSuccessCallback successCallback, in optional MediaFileDataErrorCallback errorCallback);
};

6.3.1 Methods

The getFormatData() method takes one or two arguments. When called, it returns immediately and then asynchronously attempts to obtain the format data of the given media file. If the attempt is successful, the successCallback is invoked with a new MediaFileData object, reflecting the format data of the file. If the attempt fails, the errorCallback is invoked with a new MediaFileDataError object, reflecting the reason for the failure.
No exceptions.
Return type: void
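The callback pair described above can be adapted to a promise-based style. The following is a non-normative sketch, assuming an environment that provides Promise. The wrapper name getFormatDataAsPromise is hypothetical and is not part of the MediaFile interface.

```javascript
// Hypothetical wrapper, not defined by this specification.
function getFormatDataAsPromise(mediaFile) {
  return new Promise(function (resolve, reject) {
    // successCallback receives a MediaFileData object;
    // errorCallback receives a MediaFileDataError object.
    mediaFile.getFormatData(resolve, reject);
  });
}
```

A caller would then write getFormatDataAsPromise(file).then(displayFormatData, errorHandler), where the two handlers play the same roles as the success and error callbacks.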

6.4 MediaFileDataSuccessCallback interface

[Callback=FunctionOnly, NoInterfaceObject]
interface MediaFileDataSuccessCallback {
    void onSuccess (in MediaFileData formatData);
};

6.4.1 Methods

The onSuccess() method takes one argument, formatData, of type MediaFileData: the MediaFileData object describing the relevant properties of the given media file.
No exceptions.
Return type: void

6.5 MediaFileDataErrorCallback interface

[Callback=FunctionOnly, NoInterfaceObject]
interface MediaFileDataErrorCallback {
    void onError (in MediaFileDataError error);
};

6.5.1 Methods

The onError() method takes one argument, error, of type MediaFileDataError: the MediaFileDataError object describing the error encountered while retrieving the format data.
No exceptions.
Return type: void

6.6 MediaFileDataError interface

The MediaFileDataError interface encapsulates all errors in the retrieval of format data associated with a MediaFile object.

interface MediaFileDataError {
    const unsigned short UNKNOWN_ERROR = 0;
    const unsigned short TIMEOUT_ERROR = 1;
    readonly attribute unsigned short code;
};

6.6.1 Attributes

code of type unsigned short, readonly
An error code assigned by an implementation when an error has occurred in retrieving format data.
No exceptions.

6.6.2 Constants

TIMEOUT_ERROR of type unsigned short
The requested method timed out before it could be completed.
UNKNOWN_ERROR of type unsigned short
An unknown error occurred.
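An error callback can branch on the code attribute using the two constants above. The following non-normative sketch mirrors the constant values locally. The function name describeFormatDataError and the message strings are illustrative assumptions.

```javascript
// Constant values as defined on the MediaFileDataError interface.
var UNKNOWN_ERROR = 0;
var TIMEOUT_ERROR = 1;

// Hypothetical helper: the message strings are illustrative only.
function describeFormatDataError(error) {
  switch (error.code) {
    case TIMEOUT_ERROR:
      return "Retrieving the format data timed out.";
    case UNKNOWN_ERROR:
      return "An unknown error occurred while retrieving the format data.";
    default:
      // Codes other than the two defined constants are not specified.
      return "Unrecognized error code " + error.code + ".";
  }
}
```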

A. User Interface Examples

A media capture file picker might render as:

A File picker with camera support

B. References

B.1 Normative references

[FILE-API] Arun Ranganathan. File API. 17 November 2009. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2009/WD-FileAPI-20091117/
[HTML5] Ian Hickson; David Hyatt. HTML 5. 4 March 2010. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2010/WD-html5-20100304/
[RFC2119] S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[RFC4281] R. Gellens, D. Singer, P. Frojdh. The Codecs Parameter for "Bucket" Media Types. November 2005. Internet RFC 4281. URL: http://www.ietf.org/rfc/rfc4281.txt
[WEBIDL] Cameron McCormack. Web IDL. 19 December 2008. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2008/WD-WebIDL-20081219

B.2 Informative references

[MEDIAONT-API] WonSuk Lee; Florian Stegmaier; Chris Poppe. API for Media Resource 1.0. 8 June 2010. W3C Working Draft. (Work in progress.) URL: http://www.w3.org/TR/2010/WD-mediaont-api-1.0-20100608