Audio Processing API

W3C Working Draft 15 December 2011

This version:
Latest version:
Editors:
Robert O'Callahan, Mozilla Corporation <robert@ocallahan.org>
Chris Rogers, Google <crogers@google.com>


This specification introduces and compares two client-side APIs for processing and synthesizing real-time audio streams in the browser.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This is the First Public Working Draft of the Audio Processing API specification. It has been produced by the W3C Audio Working Group, which is part of the W3C Rich Web Client Activity.

This publication references two proposals for advanced audio API functionality: Google's Web Audio API specification and Mozilla's MediaStream Processing API specification.

Please send comments about this document to the public-audio@w3.org mailing list (public archive).

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

  1. Introduction
  2. API Proposals
    1. Web Audio API proposal
    2. MediaStream Processing API
  3. Acknowledgements

1. Introduction

This section is informative.

The HTML5 audio and video elements allow for the playback of prerecorded audio files. For many purposes, this functionality is not enough. The specifications described here extend the capabilities of the browser to add more advanced audio capabilities than are currently offered by these simple multimedia elements.

These APIs will support the features required by advanced interactive applications, including the ability to process and synthesize audio streams directly in script. They can be used in interactive applications, games, 3D environments, musical applications, educational applications, and for accessibility. Used in conjunction with graphics APIs, they make it possible to synchronize, visualize, or enhance sound information. Sound synthesis can be used to enhance user interfaces, or to produce music.

2. API Proposals

This section is informative.

The W3C Audio Working Group has produced two proposals for the Audio API: Google's Web Audio API specification and Mozilla's MediaStream Processing API specification.

Each specification covers many of the same use cases, but in some areas the specifications are complementary. The group is evaluating the technical merits of each specification, in terms of its ease of use, implementability, and satisfaction of use cases and requirements of the Audio and WebRTC Working Groups. The group welcomes feedback from interested Web developers and audio experts, as well as implementers of browsers and authoring tools, and hardware manufacturers.

2.1 Web Audio API proposal

Google's Web Audio API proposal describes a high-level JavaScript API for processing and synthesizing audio in web applications. The primary paradigm is an audio routing graph, in which a number of AudioNode objects are connected together to define the overall audio rendering. The actual processing will primarily take place in the underlying implementation (typically optimized Assembly / C / C++ code), but direct JavaScript processing and synthesis is also supported.

This API is designed to be used in conjunction with other APIs and elements on the web platform, notably: XMLHttpRequest (using the responseType and response attributes). For games and interactive applications, it is anticipated to be used with the canvas 2D and WebGL 3D graphics APIs.
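To make the routing-graph paradigm concrete, the following sketch builds the smallest useful graph: a source node feeding a gain node feeding the output destination. Method names here follow the Web Audio API as later standardized (the 2011 proposal draft differs in minor details, e.g. createGainNode rather than createGain), so this should be read as illustrative rather than normative:

```javascript
// Minimal audio routing graph: source -> gain -> speakers.
// Names follow the Web Audio API as later standardized; the 2011
// proposal draft used slightly different spellings in places.
const context = new AudioContext();

const source = context.createOscillator(); // an AudioNode that generates a tone
const gain = context.createGain();         // an AudioNode that scales volume

source.connect(gain);               // wire the source into the gain node
gain.connect(context.destination);  // wire the gain node into the audio output

gain.gain.value = 0.25; // play at quarter volume
source.start();         // begin playback; the graph renders in native code
```

The point of the example is the shape of the API: script constructs and connects nodes, while the per-sample processing happens inside the implementation.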

To learn more, refer to the Web Audio API proposal.

2.2 MediaStream Processing API

Mozilla's MediaStream Processing API addresses a number of existing or proposed features of the Web platform that deal with continuous real-time media:

  • HTML media elements
  • Synchronization of multiple HTML media elements (e.g. proposed HTML MediaController)
  • Capture and recording of local audio and video input (e.g. proposed HTML Streams)
  • Peer-to-peer streaming of audio and video streams (e.g. proposed WebRTC and HTML Streams)
  • Advanced audio APIs that allow complex mixing and effects processing (e.g. Mozilla's AudioData, Chrome's AudioNode)

Many use cases require these features to work together. This proposal makes HTML Streams the foundation for integrated Web media processing by creating a mixing and effects processing API for HTML Streams.
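As an illustration of that integration, the following sketch routes a media element's audio through a Worker-based processing stream. The names used (captureStream, createProcessor, and the worker file effects-worker.js) are reconstructed from Mozilla's draft proposal and should be treated as proposal-specific assumptions, not shipping browser API:

```javascript
// Sketch based on Mozilla's MediaStream Processing API proposal.
// These interfaces were proposed, not shipped; names are illustrative.
const audioElement = document.querySelector("audio");

// The proposal extends media elements to expose their output as a stream.
const inputStream = audioElement.captureStream();

// Processing runs in a Worker, keeping sample manipulation off the main
// thread. The worker script here is a hypothetical effects processor.
const worker = new Worker("effects-worker.js");
const processedStream = inputStream.createProcessor(worker);

// The processed stream can then be played back, recorded, or sent
// peer-to-peer, because it is an ordinary stream to the rest of the platform.
const output = new Audio();
output.src = URL.createObjectURL(processedStream);
output.play();
```

The design choice this illustrates is the proposal's central one: effects processing produces a stream, so the same object model serves playback, capture, and peer-to-peer transport.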

To learn more, refer to the MediaStream Processing API proposal.

3. Acknowledgements

This section is informative.

This document is the work of the W3C Audio Working Group.

Members of the Working Group are (at the time of writing, and by alphabetical order):

Berkovitz, Joe (public Invited expert); Gregan, Matthew (Mozilla Foundation); Jägenstedt, Philip (Opera Software); Kalliokoski, Jussi (public Invited expert); Lowis, Chris (British Broadcasting Corporation); MacDonald, Alistair (W3C Invited Expert); Michel, Thierry (W3C/ERCIM); Raman, T.V. (Google, Inc.); Rogers, Chris (Google, Inc.); Schepers, Doug (W3C/MIT); Shires, Glen (Google, Inc.); Smith, Michael (W3C/Keio); and Thereaux, Olivier (British Broadcasting Corporation).

The co-chairs of the Working Group are Alistair MacDonald and Olivier Thereaux.

The people who have contributed to discussions on public-audio@w3.org are also gratefully acknowledged.