Audio Working Group update 2025
By Hongchan Choi (Google), co-chair of the Web Audio WG
The Audio Working Group advanced two key specifications in 2025. The Web Speech API was transferred from the WICG, with Chrome shipping its implementation and Firefox beginning theirs; spec work focused on on-device recognition and TAG privacy feedback. For the core Web Audio API, the AudioContext interrupted state was specified and entered a Chromium Origin Trial. The group also initiated a new Playout Statistics API and continued collaboration with the Web Performance WG on high-resolution time in worklets.
Transcript
Hello everyone.
I'm Hongchan Choi and it is a pleasure to present the annual highlights and achievements of the audio working group. As one of the co-chairs, alongside Paul Adenot and Matthew Paradis, I'm thrilled to share our progress in advancing audio capabilities on the web platform.
This past year, covering late 2024 through September 2025, the audio working group prioritized cross-pollination across different technology areas, emphasizing cross-browser collaboration and coordination with other working groups within W3C.
Our goal remains a commitment to one open platform, collaborative API design and fostering implementation diversity.
We also took the first step to expand our charter to own more audio-centric features, signaling an evolution for the group.
The audio working group benefits from small but robust participation.
We are also supported by W3C staff contact Chris Lilley.
Our membership includes representatives from major implementers and industry partners such as Apple, BBC, Google, Microsoft, Mozilla Foundation, and Netflix.
We have made significant strides across several key specifications focusing on developer utility.
A major effort this year has been advancing the Web Speech API.
This API, originally somewhat abandoned, has been retrofitted with new AI technologies for better speech recognition and synthesis capabilities.
The goal is to incorporate features for on-device speech recognition and support for contextual biasing, using phrases provided by developers to improve recognition accuracy.
The API is designed to be agnostic of the underlying speech recognition and synthesis implementation, supporting both server-side and client-side recognition.
This is a strong example of cross-browser cooperation, with Chrome on track to ship its implementation, including on-device recognition and contextual biasing.
Firefox has also begun its implementation, initially supporting English, with plans for concurrent stream recognition.
The Web Speech API was officially transferred from the WICG to become a technical report of the audio community group. The audio working group has since volunteered to keep this API under its charter, ensuring its continuous maintenance.
We have also dedicated time to refining the specification and addressing critical privacy and security feedback received from the W3C TAG.
We successfully specified and merged a major feature, AudioContext Interrupted State.
This feature addresses a critical need identified by partners like Microsoft, by providing a mechanism for the user agent to interrupt the audio playback on its own.
This allows web applications to respond appropriately when, for example, another application requires exclusive access to audio hardware, a laptop lid is closed, or a phone call comes in.
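As an illustration of how an application might react to such an interruption, here is a minimal TypeScript sketch. It assumes the context reports the new state as the string "interrupted" and fires the existing statechange event when it changes; the hint names and the helper functions hintForState and watchState are hypothetical and exist only for this example.

```typescript
// Hypothetical UI hints for this sketch; an app would pick its own.
type UiHint = "playing" | "paused-by-system" | "needs-user-resume" | "closed";

// Map a context state string to a UI hint. Pure, so it can be tested
// without a browser. "interrupted" is assumed to be the new state name.
function hintForState(state: string): UiHint {
  switch (state) {
    case "running":
      return "playing";
    case "interrupted":
      return "paused-by-system"; // the user agent stopped playback on its own
    case "closed":
      return "closed";
    default:
      return "needs-user-resume"; // "suspended" or any unknown value
  }
}

// Minimal structural type covering only the members this sketch touches.
interface StateSource {
  readonly state: string;
  addEventListener(type: string, listener: () => void): void;
}

// Report the current hint once, then again on every state change.
function watchState(source: StateSource, onHint: (hint: UiHint) => void): void {
  onHint(hintForState(source.state));
  source.addEventListener("statechange", () => onHint(hintForState(source.state)));
}
```

In a browser, an AudioContext instance could be passed as the source; the callback might then pause UI animations or show a notice while the state is "interrupted".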
The group initiated the work on a new API to provide developers with statistics on audio playout quality including measuring glitches and average latency.
The proposal involves adding an audio playout stats object to the audio context interface.
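To make the idea of glitch statistics concrete, the following TypeScript sketch derives a glitch ratio from two snapshots of cumulative counters. The PlayoutSnapshot shape and its field names are hypothetical placeholders chosen for this example, not the names used in the proposal.

```typescript
// HYPOTHETICAL snapshot shape: both fields are cumulative since playback began.
interface PlayoutSnapshot {
  glitchDurationMs: number; // time during which playout fell back to filler audio
  totalDurationMs: number;  // total time of audio played out
}

// Fraction of the interval between two snapshots that was glitched.
// Returns 0 when no audio was played in the interval.
function glitchRatio(prev: PlayoutSnapshot, curr: PlayoutSnapshot): number {
  const played = curr.totalDurationMs - prev.totalDurationMs;
  if (played <= 0) {
    return 0;
  }
  return (curr.glitchDurationMs - prev.glitchDurationMs) / played;
}
```

Working with deltas between snapshots, rather than raw totals, keeps the metric meaningful for long sessions and still works if the values update only occasionally.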
Discussions are ongoing regarding the definition of latency metrics and anti-fingerprinting mitigations, such as limiting the update frequency of the values.
We are actively discussing the longstanding issue of exposing performance.now() in AudioWorkletGlobalScope.
This discussion is ongoing in collaboration with the web performance working group.
Having high-resolution time available is highly desirable for accurate performance measurement within audio worklet operations.
Implementers are taking input from internal security and privacy teams and planning to inform the W3C TAG about this addition.
A dedicated meeting is scheduled at TPAC to resolve this issue.
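The kind of measurement this would enable can be sketched in TypeScript as follows. Because the availability of performance.now() in the worklet scope is still under discussion, the clock is passed in as a parameter; timeCallback and loadRatio are hypothetical helper names for this example.

```typescript
// A clock returning milliseconds. It is injected by the caller, since a
// high-resolution clock inside the audio worklet is not yet guaranteed.
type Clock = () => number;

// Run `work` once and return how long it took according to `now`.
function timeCallback(now: Clock, work: () => void): number {
  const start = now();
  work();
  return now() - start;
}

// Fraction of the available render budget that was consumed.
// A value above 1 means the work took longer than the budget allows.
function loadRatio(elapsedMs: number, budgetMs: number): number {
  return elapsedMs / budgetMs;
}
```

A coarse clock would make short callbacks appear to take zero time, which is why a high-resolution source matters for this kind of check.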
The audio working group is known for its community interaction and collaboration, which are central to the group's operation.
The 9th Web Audio Conference, marking the 10-year anniversary of the event, will be held in Paris, France, from November 19th to 21st, 2025.
It is co-organized by IRCAM and Mozilla.
We are pleased to report that many key working group members are involved in the conference planning.
The Chromium Google Summer of Code program provides a real project with tangible impact.
For GSoC 2025, there is an array of exciting projects, including one focused on Chromium web audio testing to improve testing methodologies.
In 2024, contributors also successfully enhanced the developer resources for the audio worklet.
To ensure the continuity of our mission, we have done some crucial administrative tasks as well.
The audio working group was successfully rechartered until November 2026.
The new charter period officially began on November 8th 2024.
We have also consistently maintained our operational health by running steady biweekly teleconferences, with meetings alternating between US and European time zones to facilitate wider participation.
Browser implementors have pushed important features to production or field trials.
Here are some updates.
These updates are from Chrome.
We focused on improving audio stability and performance.
AudioContext Interrupted State: the implementation of this feature is currently in Origin Trial in Chromium.
The Output Buffer Bypass feature has shipped; it removes one buffer of latency and prevents the latency from growing over time by disabling the adaptive buffer growth mechanism.
We also shipped two critical performance improvements.
First, resampler removal: significant performance gains were sought by removing the redundant resampler in the audio infrastructure.
Garbage collection impacting audio stream stability was also improved.
We successfully addressed the audio glitches caused by the garbage collector synchronously blocking the real-time audio thread in MediaStreamAudioDestinationNode.
Looking ahead, we are focusing on advanced capabilities that will significantly enhance the developer experience for professional and real-time audio applications.
The next revision of the Web Audio API is expected to be completed in Q4 2026, incorporating several key features.
First, Configurable Render Quantum.
This major feature will allow developers to configure the size of the render quantum.
This directly addresses a request from partners like Google Meet and Soundtrack, by eliminating the major source of complexity and latency, leading to higher performance web audio applications.
The spec work is already done, but the working group will monitor the implementors' progress and the initial feedback from developers next year.
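A short worked example in TypeScript shows why the render quantum size matters. The current Web Audio API renders in fixed blocks of 128 frames; the 48 kHz sample rate and the 480-frame device buffer below are hypothetical values chosen only for illustration, and both helper functions are names invented for this sketch.

```typescript
// Duration of one render quantum in milliseconds.
function quantumDurationMs(frames: number, sampleRate: number): number {
  return (frames * 1000) / sampleRate;
}

// Frames that must be rendered to cover one device buffer when rendering
// is only possible in whole quanta.
function framesToCover(deviceFrames: number, quantumFrames: number): number {
  return Math.ceil(deviceFrames / quantumFrames) * quantumFrames;
}

// With a fixed 128-frame quantum, a hypothetical 480-frame device buffer
// needs 4 quanta (512 frames), so 32 surplus frames have to be buffered.
const surplusFixed = framesToCover(480, 128) - 480;

// With a quantum configured to match the device buffer, nothing is left over.
const surplusMatched = framesToCover(480, 480) - 480;
```

At 48 kHz, one 128-frame quantum lasts about 2.67 ms, while a 480-frame quantum lasts 10 ms. Matching the quantum to the device buffer removes the surplus frames that would otherwise need extra buffering.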
Second, performance.now() for AudioWorklet.
As mentioned earlier, finalizing the availability of a high-resolution timer within the audio thread is a key deliverable for this working group.
We are keeping a close watch on emerging object-based spatial audio rendering technologies.
We expect to discuss topics like the BBC's next generation audio and a related spec proposal being drafted by several companies at TPAC.
That's all for the updates from the audio working group.
Thank you for watching and we look forward to another productive year of collaboration with all of you at the W3C.