W3C

– DRAFT –
WebRTC December 2025 meeting

09 December 2025

Attendees

Present
Carine, Dom, Guido, Harald, Jan-Ivar, JasonMesser, MinhLe, PeterT, ThomasNguyen, Youenn
Regrets
-
Chair
Guido, Jan-Ivar, Youenn
Scribe
dom

Meeting minutes

Recording: https://www.youtube.com/watch?v=NUfd2YzfTXk

Slideset: https://docs.google.com/presentation/d/1YW1Ump4IFnQEkTVlgI5rEzJVuWycVSVDDqj3S0s__9k/ (archived PDF copy)

usermedia element 🎞︎

[Slide 10]

minh: the <geolocation> element seems to work well; <usermedia> is a bit more complex, we're iterating on it based on input from Jan-Ivar and feedback from adopters, in particular with regard to muting

Jan-Ivar: generally supportive; there are different versions of PEPC with fairly different shapes - we need to look at the details, to make sure we can have PEPC buttons that can replace existing buttons
… I suspect this WG might need to own the problem, should at least help

Youenn: confirming interest as shared in TPAC; still some concerns
… some aspects don't belong to WebRTC, but the interactions of <usermedia> with device enumeration would best be handled in this WG
… it would be good to understand how to work together on this
… I'm not sure we're ready to dive into all the details, esp since it's still evolving
… from what I've heard <geolocation> is about to ship in Chrome
… for <usermedia>, I really want the WebRTC WG to have input to it before it ships

Guido: mute/unmute is being added to the <usermedia> element - can you say more? Can you share more about VC app developers' feedback on this?

Thomas: when people click on <usermedia>, they get prompted for permission; then later it can be used to mute/unmute

Discussion on <usermedia> element in PEPC repo

Jan-Ivar: at Mozilla, we wanted the PEPC button to replace existing buttons in apps
… for <usermedia>, we identified the unmute/mute button as the most likely affordance (matching .enabled from an MST perspective)
… an alternative is relying on .muted state in MST (not under JS control)
… a bit of a double-edged sword: stronger user guarantees vs ways for developers to provide additional value (e.g. detect speak-on-mute)
… also, developers need to be on board with the approach, otherwise they can simply switch out to a regular button if the PEPC button is too restrictive
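
For context, a minimal sketch of the kind of in-app mute button Jan-Ivar describes, toggling .enabled on a MediaStreamTrack obtained via getUserMedia(); a PEPC <usermedia> button would need to be able to stand in for this affordance (the muteButton element is assumed to exist in the page; this is not part of the PEPC proposal itself):

  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const [micTrack] = stream.getAudioTracks();

  // Existing in-app mute toggle: flips .enabled, which is under JS control,
  // unlike .muted, which reflects a state the page cannot set directly.
  muteButton.addEventListener('click', () => {
    micTrack.enabled = !micTrack.enabled;
    muteButton.textContent = micTrack.enabled ? 'Mute' : 'Unmute';
  });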

MinhLe: re pre-prompt vs in-app UI, this is mostly up to VC apps' preferences as to which approach they take with PEPC

youenn: re mute and unmute, media session solves this, with different levels of user gesture requirements
… if mute/unmute gets integrated with PEPC, the main difference would be the constraints under which this gets callable
… in media session, we felt that user gesture was sufficient for unmuting given that permission was granted previously - we only wanted to protect against e.g. a web page unmuting while hidden
… Device enumeration is a pain point for Web pages given differences among browsers - it would be great if PEPC could help solve that
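
A minimal sketch of the Media Session pattern Youenn refers to, using the togglemicrophone action and setMicrophoneActive() from the Media Session spec (browser support varies; micTrack is assumed to be an audio MediaStreamTrack obtained earlier via getUserMedia()):

  // The UA can surface its own microphone toggle (e.g. in PiP or OS-level UI);
  // the page reacts in the handler and reports its actual state back.
  navigator.mediaSession.setActionHandler('togglemicrophone', () => {
    const active = !micTrack.enabled;
    micTrack.enabled = active;
    navigator.mediaSession.setMicrophoneActive(active);
  });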

Jan-Ivar: +1 to improving device selection (possibly making enumerateDevices less important)
… but maybe that's longer term

youenn: I'm thinking short term; is <usermedia> starting capture immediately? is it only giving permission for a potential later capture? how does it relate to device selection?

jan-ivar: my perspective is that it should be equivalent to a getUserMedia() call

youenn: so equivalent to a synchronous single-click getUserMedia
… would be good to compare how different UAs deal with these situations; Safari doesn't behave the same with or without user gesture

jan-ivar: another piece of feedback we gave was to have <microphone> / <camera> elements, to avoid polymorphic buttons

guido: that doesn't allow for asking for both

jan-ivar: we could have <usermedia> for this, while making sure we keep the types clear

guido: so 3 elements

jan-ivar: which is already the case, but hidden behind the type attribute

dom: we should discuss how to split the work - are there PEPC hooks that could help with this?

Minh: we are very open to whatever format helps find creative solutions, e.g. a workshop, to fulfill the needs of both users and app developers

Youenn: the closest effort we had was with the Permissions spec, which had different integrations across different specs
… e.g. media capture defines how the media capture permissions are defined/granted
… I wonder if we could do the same here

Minh: we're open to anything that helps coordination

Youenn: we should be OK from a charter perspective to have some of the work done in our group

Minh: today <usermedia> is in an origin trial, which isn't very comfortable; it would be great to find a way to reach an MVP to get some of it out of origin trial

Youenn: having one API ship in one browser before the spec is ready creates compat issues down the line
… we can't commit to a specific schedule, but I'd be happy to support allocating WebRTC WG time for this

Jan-Ivar: +1 - happy to have seen positive changes to PEPC based on our engagement

Dom: +1 for being careful with interop - if you have a clear idea of what an MVP could look like, this could help prioritize the discussion of the WG

Minh: no clear description of this, but we could build a proposal with backwards compat in mind

Jan-Ivar: for me, an MVP would be buttons that can be used as toggles
… we're flexible on styling if that helps with adoption

RESOLUTION: Use Media Capture and Streams (or an extension of it) to provide a PEPC integration section as an anchor for further discussions

DataChannel close event 🎞︎

[Slide 13]

[Slide 14]

Harald: +1 to the general rule and the specific application
… will this be done via a queued task or done synchronously?

Youenn: we should check what browsers do; I suspect we should queue a task

Jan-Ivar: +1 to aligning with implementations; not sure about agreeing on the general rule, but it sounds nice
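
A minimal sketch of the observable difference under discussion, assuming the close event fires from a queued task (the Firefox/Chromium behavior being aligned on) rather than synchronously from within close():

  const pc = new RTCPeerConnection();
  const dc = pc.createDataChannel('chat');
  dc.addEventListener('close', () => console.log('close event'));

  pc.close();
  console.log('after pc.close()');
  // With a queued task, "after pc.close()" logs first and "close event"
  // fires later; a synchronous dispatch would reverse that order.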

dom: re general rule, maybe worth submitting it to the Design Principles repo?

Youenn: we should check whether other specs have hit that pattern
… I'll file an issue

RESOLUTION: Align close event spec with Firefox/Chromium implementation

Mediacapture-transform track transfer 🎞︎

Developer feedback 🎞︎

[Slide 17]

[Slide 18]

[Slide 19]

[Slide 20]

<hta> Side remark: I checked the Chrome code, and there's a dispatch (PostTask) in between the pc.Close() and the firing of the event.

[Slide 21]

[Slide 22]

[Slide 23]

Youenn: re synchronous stopping of tracks - is this the case where you have a camera track which you're transforming (e.g. background blur), and if you stop one, you have to stop the other synchronously?

Guido: the shim is meant to hide the complexity of managing the track, but it turns the synchronous stop() operation into an asynchronous one
… e.g. makes it harder to reason about whether all the tracks have been stopped

Youenn: so this isn't about having frames leaking

Guido: this may create situations where a track is stopped in one module but not in another

Youenn: when a track is stopped, there can still be frames that get processed after the stop
… I'm hearing that with postMessage there is more latency than with a native implementation

Guido: after stop(), except for frames already in flight to sinks in another thread, no further frames will be delivered to a sink; esp. since there is no guarantee a postMessage will always be delivered
… this level of complexity should be fixed in the API

jan-ivar: I hope people don't rely on the synchronicity of stop() to stop frames being processed if they're already in a queue

Guido: the problem of synchronicity also applies to .enabled

jan-ivar: stop() is synchronous on the media thread, not on the main thread
… with workers, there are 3 ways to get things to and from a worker: postMessage, transferred streams, RTCRtpScriptTransform
… we as a WG have invented one API and are talking about inventing a second one
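
A condensed sketch of the transferred-streams shim pattern being discussed: the main thread creates a MediaStreamTrackProcessor (main-thread exposure is currently Chromium-only) and transfers its readable stream to a worker via postMessage; cameraTrack and worker are assumed to exist already. Note that track.stop() on the main thread does not synchronously halt frames already queued on the worker side, which is the asynchronicity Guido describes:

  // main thread
  const processor = new MediaStreamTrackProcessor({ track: cameraTrack });
  worker.postMessage({ readable: processor.readable }, [processor.readable]);
  // ... later:
  cameraTrack.stop();   // frames already in flight may still reach the worker

  // worker.js
  onmessage = async ({ data }) => {
    const reader = data.readable.getReader();
    for (;;) {
      const { value: frame, done } = await reader.read();
      if (done) break;
      // process the VideoFrame here ...
      frame.close();
    }
  };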

Guido: this isn't a first for media processing
… given the feedback from developers, it seems clear to me that this is something we should fix

jan-ivar: the shim can be fixed to communicate with multiple clones

Youenn: I can understand the value for very complex applications to avoid replacing a track with another one
… although web apps have to deal with replacing tracks dynamically in case of capture failure

Guido: workers are targeted at complex applications in the first place; the fix is easy, we should fix it

dom: we've asked for feedback, we should listen to that feedback

jan-ivar: we are listening and it's important to reach consensus for interop, so we're willing to compromise on this

MediaStreamTrackHandle 🎞︎

[Slide 26]

[Slide 27]

[Slide 28]

[Slide 29]

[Slide 30]

[Slide 31]

[Slide 32]

Guido: worth it, yes; good enough, maybe
… this would address the developers concerns I highlighted
… so definitely in the right direction
… I think GC-handling needs more discussion
… I think the ScriptTransform approach is better as it doesn't have to expose a new object and allows for more optimizations
… and it shouldn't be hard to replicate in other browsers already implementing encoded transform
… we could compromise to that approach for the processor, for interop

Youenn: so you would be OK with implementing Generator and Processor in workers, and track transfer?

Guido: eventually

Harald: a problem with MST transfer is the size of the API surface it exposes in workers
… an MSTHandle exposes a bit more surface than an EncodedTransform lookalike (which exposes only streams)
… and it might need some more still (e.g. a closed state?)
… so it's slightly more complex, with more API surface to manage (e.g. the lifecycle of the object)

Jan-Ivar: this feels redundant, but it's the cleanest solution among the proposed ones
… I believe we would be OK with implementing it, but I'll have to check with the team
… will there be source support for handle? I wouldn't support this; it's important that track transfer be supported
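
For illustration only, a hypothetical shape for the handle-based approach; the name MediaStreamTrackHandle comes from the slides, but the getHandle() method, transfer semantics, and handle-accepting processor constructor below are assumptions, not an agreed design:

  // main thread (hypothetical API)
  const handle = track.getHandle();           // assumed: obtain a transferable handle
  worker.postMessage({ handle }, [handle]);   // assumed: the handle is transferable

  // worker.js (hypothetical API)
  onmessage = ({ data }) => {
    // assumed: construct a processor in the worker from the transferred handle
    const processor = new MediaStreamTrackProcessor({ handle: data.handle });
    // then read frames from processor.readable as usual ...
  };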

Youenn: so PR to mediacapture-transform? should we do a CfC?

Guido: we should include audio in such a CfC

RESOLUTION: start a PR to add MediaStreamTrackHandle to mediacapture-transform as a basis for a future CfC

Mediacapture-transform for audio 🎞︎

MediaStreamTrackProcessor for Audio 🎞︎

[Slide 35]

[Slide 36]

[Slide 37]

[Slide 38]

[Slide 39]

[Slide 40]

[Slide 41]

[Slide 42]

[Slide 43]

[Slide 44]

[Slide 45]

[Slide 46]

Youenn: thanks for the presentation and prototyping; for MSTP, we could improve performance, but there may be a performance penalty for some browsers, and web developers can reasonably want something simpler
… don't see a big drawback to exposing audio like we're doing for video frames
… the feedback from Zoom seems to match a pretty natural pattern that the API should allow
… for MSTG, it's useful when the sink is a PC, but a footgun for other sinks (e.g. recorder, video/audio elements)
… this is problematic, but I'm not sure that MSTG is the right place to fix this
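
A minimal sketch of exposing audio "like we're doing for video frames", matching Chromium's current MediaStreamTrackProcessor behavior where an audio track yields WebCodecs AudioData chunks (not yet part of the mediacapture-transform spec; micTrack is assumed to be an audio MediaStreamTrack):

  const processor = new MediaStreamTrackProcessor({ track: micTrack });
  const reader = processor.readable.getReader();

  for (;;) {
    const { value: audioData, done } = await reader.read();
    if (done) break;
    // audioData is a WebCodecs AudioData: sampleRate, numberOfFrames, timestamp, ...
    const samples = new Float32Array(audioData.numberOfFrames);
    audioData.copyTo(samples, { planeIndex: 0, format: 'f32-planar' });
    // ... process samples, then release the backing buffer
    audioData.close();
  }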

Guido: re PC as a sink - any other network-based API would have the same characteristics
… note that AudioWorklet is conversely harmful to use with PC
… without a good alternative

Youenn: this is worth addressing, but we shouldn't limit ourselves to MSTG; we should ask the Audio WG if they have thoughts about how this should be addressed from an audio processing perspective
… it could be that a different audio context would work

Guido: OfflineAudioContext is that, but it doesn't have integration with MST

Youenn: definitely worth discussing the issue more

Jan-Ivar: thanks a lot for investigating this - were the measurements done in Chrome?

Guido: the shim runs on all browsers; we ran the native comparison in Chrome

Jan-Ivar: we'll want to look more into these; re timestamp, that's worth filing an issue on audio worklet
… we'll need a bit more time on the measurements to build a position on our end

Harald: there is a good reason for a worklet not to have a timestamp - it's real-time/synchronous

Guido: a worklet is dealing with a rendering quantum, not necessarily a full audio sample with its timestamp
… there could be a situation where a quantum corresponds to more than one timestamped sample

Jan-Ivar: I'll follow up on the github issue once we've analysed the measurements in Firefox
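
For reference, the AudioWorklet shape Guido and Harald are contrasting with MSTP: process() runs once per 128-frame rendering quantum, and the worklet scope exposes only rendering-clock values (currentTime/currentFrame), not the capture timestamp of the underlying samples:

  // capture-tap-processor.js (runs in the AudioWorkletGlobalScope)
  class CaptureTap extends AudioWorkletProcessor {
    process(inputs, outputs) {
      const input = inputs[0];   // one 128-frame rendering quantum per call
      if (input.length > 0) {
        // currentTime is the rendering clock, not a capture timestamp
        this.port.postMessage({ time: currentTime, samples: input[0].slice(0) });
      }
      return true;               // keep the processor alive
    }
  }
  registerProcessor('capture-tap', CaptureTap);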

Summary of resolutions

  1. Use Media Capture and Streams (or an extension of it) to provide a PEPC integration section as an anchor for further discussions
  2. Align close event spec with Firefox/Chromium implementation
  3. start a PR to add MediaStreamTrackHandle to mediacapture-transform as a basis for a future CfC
Minutes manually created (not a transcript), formatted by scribe.perl version 235 (Thu Sep 26 22:53:03 2024 UTC).