W3C

– DRAFT –
WebRTC September 2025 meeting

16 September 2025

Attendees

Present
Carine, DiegoPerezBotero, dom, Fippo, Guido, harald, Jan-Ivar, JasperHugo, KonradHofbauer, NishitaDey, PeterT, SergeySilkin, SteveBecker, SunShin, TimP, TonyHerre, Youenn
Regrets
-
Chair
Guido, Jan-Ivar, Youenn
Scribe
dom

Meeting minutes

Recording: https://www.youtube.com/watch?v=bDovLB_In-8

Slideset: https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/ (archived PDF copy)

Audio Output Devices API 🎞︎

Ask for user gesture to call setSinkId #84 🎞︎

[Slide 10]

Jan-Ivar: a bit concerned that this adds variance among implementations and might lead to compat issues
… what's the motivation for this change?

Youenn: similar to gUM or gDM: when a web site starts playing audio, we put a user activation check - which I think the media spec requires
… setSinkId is starting to play audio on a given speaker - it seems logical to have a user gesture check there as well

Harald: the most common use case where a device is no longer available is when people unplug their earbuds
… when they do and the app notices, will there be a user activation event available or not?

Youenn: in that situation in Safari, setSinkId would be available given the devicechange event

harald: so a devicechange event counts as user activation in this case?

Youenn: that's how we implemented it in Safari
… and we propagate to the async enumerateDevices call as well since it's likely to be called afterwards

harald: so harmless if it includes the devicechange event

Youenn: that's why the spec should allow it; we think we've solved web compat concerns, we'll see if more emerge

Fippo: a common use case is to play the ring tone on the speaker and the call on the headset - would that still be supported?

Youenn: in a call, you call gUM, you get the devices and then call setSinkId - microphone starting also counts as activation in our heuristics
… When we call gUM, the enumerateDevices list is changing with a devicechange event, which is used as the trigger to do the device setup

Jan-Ivar: so user activation or devicechange event?

Youenn: that's what we've implemented

Jan-Ivar: how does this deal with multiple gUM?

Youenn: this hasn't been a concern so far

Jan-Ivar: I'm not sure Firefox would be able to implement this; I'm also not a big fan of SHOULD

TimP: I'm supportive of this change, with a bit more work on details
… as it can protect against misuses

Dom: what would it take to turn this into a MUST?

Youenn: I thought of making this a first step and getting implementation experience before requiring it

Guido: I'm ok with adding it as a MAY, but making it a MUST will require a more extensive list of circumstances

Youenn: I can start with a MAY and open a separate issue to make it stronger

RESOLUTION: Proceed with a MAY require user activation
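
The conditions Youenn describes for Safari can be modelled roughly as below. This is only a sketch of the discussion, not spec text: `SinkRequestContext` and `maySetSinkId` are hypothetical names, and the three flags simply mirror the triggers mentioned above (user gesture, devicechange event, microphone start).

```typescript
// Hypothetical model of the conditions discussed above; not spec text.
interface SinkRequestContext {
  hasTransientUserActivation: boolean; // a recent user gesture
  insideDeviceChangeHandler: boolean; // call made in reaction to a devicechange event
  microphoneJustStarted: boolean; // a getUserMedia() capture recently began
}

// Returns true when a user agent applying this heuristic would let
// setSinkId() proceed, and false when it would refuse the call.
function maySetSinkId(ctx: SinkRequestContext): boolean {
  return (
    ctx.hasTransientUserActivation ||
    ctx.insideDeviceChangeHandler ||
    ctx.microphoneJustStarted
  );
}
```

Under this model, Harald's unplugged-earbuds case passes through the devicechange flag, and Fippo's in-call case passes through the microphone flag.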

Should media capture output define an explicit default speaker device? #151 🎞︎

[Slide 11]

[Slide 12]

Jan-Ivar: I agree the situation is unfortunate for Web compat; is this only for speakers, or also for microphone?

Youenn: the problem seems less prominent for microphones

Jan-Ivar: the problem with that approach is that it clashes with the rest of the spec in terms of the devicechange event

Youenn: if you want the default, you call setSinkId("")

Jan-Ivar: but this doesn't let you detect when the default OS device changes

Youenn: we've done that change for webcompat and seems to be working well

Guido: the default for the output device is a good idea; I think Firefox does it in some circumstances with selectAudioOutput

Jan-Ivar: yes, we have it in specific conditions for selectAudioOutput

Youenn: default speaker is also a widely used concept across OS

Jan-Ivar: Firefox needs to improve its compat on devicechange event in any case; so I'm in favor if we can clarify the situation with devicechange event

Youenn: ready for PR then?

Jan-Ivar: we can continue on the issue but not opposed to a PR

RESOLUTION: proceed with proposed change with additional discussion expected
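
On the application side, the `setSinkId("")` fallback Youenn mentions might look like the sketch below. `DeviceLike` and `chooseSinkId` are hypothetical names; `DeviceLike` only mimics the fields of the entries returned by `enumerateDevices()` so that the helper can run outside a browser.

```typescript
// Minimal stand-in for the entries returned by enumerateDevices().
interface DeviceLike {
  deviceId: string;
  kind: "audioinput" | "audiooutput" | "videoinput";
}

// Hypothetical helper: keep the preferred speaker while it is still listed,
// otherwise fall back to "", which setSinkId() treats as the default device.
function chooseSinkId(devices: DeviceLike[], preferredId: string): string {
  const stillPresent = devices.some(
    (d) => d.kind === "audiooutput" && d.deviceId === preferredId
  );
  return stillPresent ? preferredId : "";
}
```

A page would typically run this from its devicechange handler and pass the result to `setSinkId()`. As Jan-Ivar notes, this alone does not tell the page when the OS default itself changes.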

Expose the type of device in MediaDeviceInfo #1 🎞︎

[Slide 13]

Jan-Ivar: LGTM, doesn't seem to bring privacy issues since they're only exposed when the device is exposed
… should there be a headset category?

Youenn: we can try this

Harald: I'm a bit worried about the specifics of the enumeration
… e.g. many mics are USB even if they're built-in

Youenn: I can bring more info on what Windows / macOS expose

RESOLUTION: Proceed with a pull request with additional discussion on enumerated values

Speaker devices may not always work with all microphones #149 🎞︎

[Slide 14]

Guido: what would be the alternative to failing gUM/setSinkId?

Youenn: I don't have a specific proposal; I was first trying to get a sense if that's a problem worth fixing (e.g. if it affects other OS)

Guido: maybe let's focus first on identifying how widespread an issue it is

Jan-Ivar: this reminds me of the issue where you can't open multiple mics on phones; don't have a good solution off the top of my head either

Decoder exposure and software fallback 🎞︎

[Slide 17]

[Slide 18]

[Slide 19]

[Slide 20]

[Slide 21]

Youenn: the Privacy Working Group had raised concerns - we should ask them if we're looking at this again
… if media capabilities already provide that info through polling - is that good enough?
… polling instead of an event might be a feature here

Nishita: media capabilities doesn't actually reflect what's happening during streaming

Diego: mc signals the potential to have the hardware decode enabled, but it doesn't say if it is happening
… e.g. we can't get info on situations where MC says there is hardware decode but we're not seeing it used
… there are software-fallback situations where errors occur that can't be monitored via telemetry

Youenn: MC solves hardware support, but not temporary lack of availability. Maybe it's a shortcoming of MC?

Diego: streams get negotiated during SDP O/A with a codec profile, format and characteristics that can lead to a decoder giving up in the middle of the stream; with MC, detecting these situations requires predicting all possible cases, whereas this event could give much more specific direction

Youenn: if the Privacy Working Group is fine with this, I'm fine too; but these issues might arise in WebCodecs as well, so having a single solution would be nice

Jan-Ivar: I understand the event proposal comes from feedback from the Privacy WG (vs stats)
… It's not really clear what fallback means
… e.g. would that event be fired in a system without hardware decode support?
… there are also situations (e.g. small frames) where software decode wouldn't be a sign of a problem
… in terms of API shape, I prefer B rather than A that makes it harder to distinguish fatal errors
… supportive of direction but with more clarification on situation of failures

TimP: supportive of this, but less supportive of the "fallback" concept and the hardware/software dichotomy
… I think the event we want is "decoder implementation changed"
… I think what we really care about is latency, not whether it's hardware or software
… it would be nice to have stats on average frame decode time if we don't have one

Fippo: we do

TimP: then trigger on implementation change + stats would work

Diego: detecting device type is really hard for (good) privacy protection reasons, so we can't really figure out the characteristics of the devices on which the stream is running, in particular to detect regressions

TimP: but knowing the decoder has changed under your feet, would that help?

Diego: CPU decode is not only about latency: it also has an impact on battery and thermals
… the event would be useful, but less useful

dom: if we want to do this, we should do this and get feedback from Privacy WG
… events can enable privacy attacks when they surface on two different origins

Harald: why on Transceiver vs Receiver?

Fippo: +1 to Receiver

RESOLUTION: discuss proposal in more depth and prepare for Privacy review
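
The stats-based approach TimP and Fippo allude to (implementation change plus average frame decode time) could be polled roughly as below. This is a sketch, not the proposal under discussion: `InboundVideoSnapshot`, `DecoderDelta` and `compareSnapshots` are hypothetical names, while the three fields mirror `decoderImplementation`, `framesDecoded` and `totalDecodeTime` from the inbound-rtp stats.

```typescript
// Subset of inbound-rtp video stats fields used by this sketch.
interface InboundVideoSnapshot {
  decoderImplementation?: string; // may be absent when the UA withholds it
  framesDecoded: number;
  totalDecodeTime: number; // cumulative, in seconds
}

interface DecoderDelta {
  implementationChanged: boolean;
  avgDecodeTimeMs: number | null; // null when no frame was decoded in the interval
}

// Compares two snapshots taken from successive getStats() calls.
function compareSnapshots(
  prev: InboundVideoSnapshot,
  curr: InboundVideoSnapshot
): DecoderDelta {
  const frames = curr.framesDecoded - prev.framesDecoded;
  const seconds = curr.totalDecodeTime - prev.totalDecodeTime;
  return {
    implementationChanged:
      prev.decoderImplementation !== curr.decoderImplementation,
    avgDecodeTimeMs: frames > 0 ? (seconds / frames) * 1000 : null,
  };
}
```

As Diego points out, polling of this kind covers latency but says nothing about why the decoder changed, nor about battery or thermal cost.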

generateKeyFrame() API consolidation (Jan-Ivar) 🎞︎

Issue #273 / PR #274: Remove sender.generateKeyFrame() 🎞︎

[Slide 25]

TimP: I don't like the first API - encoding parameters should be less dynamic than that, this is not an encoding parameter; the second API makes much more sense

Jan-Ivar: the reason we went for this API is that it allows combining a change to all parameters with sending a keyframe at the same time

RESOLUTION: Proceed with removing unimplemented API

Issue #147: expose rid as metadata on outgoing frames 🎞︎

[Slide 26]

Fippo: we would like the encoding index in addition to the rid; we would like the mid since it isn't available in workers

Jan-Ivar: the mid can be passed as an option

Youenn: or in the Transformer itself
… there will be one per mid
… not exposing it in frames makes it more lightweight
… we should file an issue on this

Jan-Ivar: adding an encodingIndex can also be filed as an issue

Fippo: having it in addition to rid has ergonomic value

RESOLUTION: Proceed with pull request

PR #276: Default the generate key frame algorithm to all layers 🎞︎

[Slide 27]

Youenn: the main use case is changing the encryption key in which case you want to generate keyframes for all layers

RESOLUTION: proceed with merging PR

Issue #143: should transform.generateKeyFrame() take an array of rids? 🎞︎

[Slide 28]

[Slide 29]

[Slide 30]

Youenn: no strong preference, but a slight preference to keep it as is since it matches the design requirements (encryption, per-rid keyframe); not sure what the use cases would be for different subsets; it adds complexity (e.g. what happens if one is invalid)

Fippo: there are use cases which only require 2 layers, e.g. on this call

Youenn: but this is a use case for Transformer - we have setParameters otherwise

Fippo: what would the return value be? Originally, it returned a timestamp which wouldn't work for an array

Jan-Ivar: already changed to undefined

Dom: let's leave it as is; we can change it to DOMString or Array if there is an important use case

RESOLUTION: Leave current API with single DOMString argument
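
With the single-rid signature kept, an application that wants key frames on a subset of layers can loop over the rids itself. The sketch below assumes a promise-returning `generateKeyFrame(rid)` that resolves with undefined, as Jan-Ivar indicates; `KeyFrameGenerator` and `generateKeyFramesFor` are hypothetical names standing in for the transformer so the code runs outside a worker.

```typescript
// Hypothetical stand-in for the transformer method under discussion.
interface KeyFrameGenerator {
  generateKeyFrame(rid?: string): Promise<void>;
}

// Requests a key frame for each listed rid, one call per rid, in order.
// Returns the rids whose request was rejected (e.g. an invalid rid), which is
// one possible answer to "what happens if one is invalid".
async function generateKeyFramesFor(
  transformer: KeyFrameGenerator,
  rids: string[]
): Promise<string[]> {
  const failed: string[] = [];
  for (const rid of rids) {
    try {
      await transformer.generateKeyFrame(rid);
    } catch (_err) {
      failed.push(rid);
    }
  }
  return failed;
}
```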

RTCDataChannel (SDP and stats) 🎞︎

Always negotiate datachannels 🎞︎

[Slide 33]

Jan-Ivar: the problem is that BUNDLE attaches to the first m-line by default?

Fippo: right; since datachannels can't be rejected, they're the right target for BUNDLE

Jan-Ivar: thanks, makes sense to me

Youenn: +1
… would be good to look into a JSEP revision given this is a second item on the revision list

Fippo: I have a bunch of issues against JSEP, I can talk with Justin on a third revision

Jan-Ivar: that'd be great
… are there any concerns on compat issues?

Fippo: sounds unlikely

Dom: let's file a JSEP issue at the same time

RESOLUTION: Proceed with PR and JSEP issue
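
To check which m-section a given offer uses as its BUNDLE target, the SDP can be inspected as in the sketch below. `bundleTagMediaType` is a hypothetical helper; it assumes the first mid listed in `a=group:BUNDLE` identifies that m-section, and it does not reproduce the proposal on the slide.

```typescript
// Returns the media type ("audio", "video", "application", ...) of the
// m-section whose mid is listed first in the a=group:BUNDLE line, or null
// when the SDP has no BUNDLE group or no m-section with that mid.
function bundleTagMediaType(sdp: string): string | null {
  const lines = sdp.split(/\r?\n/);
  const group = lines.find((l) => l.startsWith("a=group:BUNDLE "));
  if (!group) return null;
  const firstMid = group.split(" ")[1];
  let currentMedia: string | null = null;
  for (const line of lines) {
    if (line.startsWith("m=")) {
      // "m=<media> <port> <proto> <fmt>": keep only the media type.
      currentMedia = line.slice(2).split(" ")[0];
    } else if (currentMedia !== null && line === `a=mid:${firstMid}`) {
      return currentMedia;
    }
  }
  return null;
}
```

For an offer where the datachannel m-section comes first and carries the first bundled mid, this returns "application".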

What is the lifetime of stats? 🎞︎

[Slide 34]

[Slide 35]

Youenn: will changes there create web compat issues? hopefully there is still room for making the right decisions

Fippo: I have web compat concerns for inboundrtp, we'll see

Jan-Ivar: thanks a lot for the analysis, showing diversity across stats, implementations (not clear what the spec asks for)
… when implementations have a shared behavior, that's hopefully a good direction to go
… +1 to documenting and cleaning as much as web compat enables

Fippo: the problem is creating stat objects before the relevant object is indeed created
… Documenting rules and their motivation would be good, before seeing what we can change

Jan-Ivar: another parameter to take into account is rollback

Harald: there was a specific situation with candidate pairs: some pairs contain IP addresses that are considered sensitive
… I don't know if that impacts on when they're exposed

Fippo: they're hidden, so it shouldn't matter

Harald: the number of outgoing datachannels you create should be equal to the number reflected in stats; I would be in favor of having them show up early

Fippo: let's document the behavior and then disagree on the right one :)

TimP: I'm happy with it being early; it's unpleasant but necessary

Jan-Ivar: having it late means having it more useful; more generally, for early stats, we should be clear on what data they expose

Fippo: will report on this at the next meeting

data channel ids set before SCTP init #3071 🎞︎

[Slide 36]

TimP: in theory, you could request more datachannels than the other party would accept

Fippo: good point, we need to look into that

Jan-Ivar: another aspect is workers
… if all browsers do the same thing, documenting it sounds good to me

RESOLUTION: Proceed with a PR

Bring Your Own Degradation Adaptation 🎞︎

[Slide 39]

[Slide 40]

[Slide 41]

Jan-Ivar: SGTM

Fippo: +1
… would it make sense to expose the QP on the insertable stream as well? (rather than using getStats)

Sergey: exposing QP per frame sounds good

Guido: please file an issue

TimP: also in favor; as Fippo said, there may be more data needed outside of stats

Fippo: maintain-framerate / maintain-resolution - this adds the 3rd point of the triangle (with balance in the center)

Guido: there is an existing PR where the conversation can continue

RESOLUTION: proceed with PR
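
For reference, the getStats route to QP that Fippo contrasts with per-frame exposure amounts to differencing two outbound-rtp snapshots. The sketch below is illustrative only: `OutboundVideoSnapshot` and `averageQp` are hypothetical names, and the two fields mirror `qpSum` and `framesEncoded` from the outbound-rtp stats.

```typescript
// Subset of outbound-rtp video stats fields used by this sketch.
interface OutboundVideoSnapshot {
  qpSum: number; // cumulative sum of QP values over encoded frames
  framesEncoded: number; // cumulative count of encoded frames
}

// Average QP over the interval between two getStats() snapshots,
// or null when no frame was encoded in that interval.
function averageQp(
  prev: OutboundVideoSnapshot,
  curr: OutboundVideoSnapshot
): number | null {
  const frames = curr.framesEncoded - prev.framesEncoded;
  if (frames <= 0) return null;
  return (curr.qpSum - prev.qpSum) / frames;
}
```

This yields an interval average rather than a per-frame value, which is the gap Fippo's question about the insertable stream points at.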

SFrameEncrypterStream rename 🎞︎

[Slide 44]

Youenn: the behavior is not undefined
… that isn't to say having different objects wouldn't be useful - e.g. for decrypting/encrypting
… initially, one object for everything was sufficient

Harald: initially, we thought SFrameTransform would be added as a first or last step in a chain of transforms
… if we're abandoning that model (which I think we should since nobody has implemented it), we should look at how to apply it to a sender/receiver

[Slide 45]

Youenn: +1
… we will be able to add key management APIs dedicated to decryption and encryption
… we might be able to duplicate these APIs in SFrameTransform as well
… letting the UA do the encryption sounds like a good thing in general

Harald: I would like to see an example of ScriptTransform and SFrameTransform together

Jan-Ivar: see slide

Harald: that makes sense

Jan-Ivar: that wouldn't work for SPacket though

Harald: SGTM

RESOLUTION: Proceed with PR

Summary of resolutions

  1. Proceed with a MAY require user activation
  2. proceed with proposed change with additional discussion expected
  3. Proceed with a pull request with additional discussion on enumerated values
  4. discuss proposal in more depth and prepare for Privacy review
  5. Proceed with removing unimplemented API
  6. Proceed with pull request
  7. proceed with merging PR
  8. Leave current API with single DOMString argument
  9. Proceed with PR and JSEP issue
  10. Proceed with a PR
  11. proceed with PR
  12. Proceed with PR
Minutes manually created (not a transcript), formatted by scribe.perl version 235 (Thu Sep 26 22:53:03 2024 UTC).