WebRTC April 2025 meeting

Meeting minutes

Recording: https://www.youtube.com/watch?v=UlndQj7vIdc

Slideset: https://docs.google.com/presentation/d/1weRIFjbNC0Nf-8xUIKP1Fs98HHVB-JyaVmxDXg_xVbg/edit#slide=id.g2bb12bc23cb_0_0 (archived PDF copy)

Media Capture and Streams 🎞︎

Issue #1019 What is the purpose of requiring a successful gUM call before enumerateDevices? 🎞︎

[Slide 11]

[Slide 12]

Jan-Ivar: the proposal is a bit different from what I expected
… it's common to ask for camera & microphone in a lobby UX prior to the meeting
… the Zoom flow is a good use case to solve
… I'm not sure about tying it to permissions with all browsers supporting one-time permission
… some users may prefer to never persist a permission

Youenn: you're suggesting to relax the rules further?

Jan-Ivar: right - this would solve the Zoom flow not just in Chrome but also in Firefox
… the issue is to detect a video conferencing web site where mic/camera sharing is kind of expected; less so on other sites
… e.g. it could be a Web site where mic & camera have been exposed in the past
… (or expose both always)
… I don't like the dependency on persistent permissions

Youenn: my goal was trying to solve a specific issue, not opening new privacy holes - we had something like you described before and got push back that this created surprises
… this proposal was narrowly focused on improving interop

Jan-Ivar: what if we added "UA MAY expose camera devices if they can detect cameras have been used in the past"?

Youenn: I could go with that

Guido: Youenn's proposal is an improvement; I think exposing cameras would be surprising to end users, but would be OK if it's a MAY
… clarification: is the condition active usage of gUM, or that a successful gUM call has occurred?

Youenn: I meant the latter (with edge cases to take into account as the current spec does)

Guido: +1 to the change, but let's make sure we leave UA freedom on determining when to persist permissions
… let's adopt this and leave time for implementation experience

Dom: so you would be comfortable with implementing this for gathering implementation experience

Guido: yes

Jan-Ivar: we have a compat issue, where the "Zoom" experience behaves differently between browsers with and without persistent permission
… that's where my proposal for the "MAY" clause comes from
… I'm happy to iterate on more specific language

Youenn: we can also discuss with the Zoom folks to understand their needs

Guido: I'm not sure the status of "returning users" differs when there is no permission

Jan-Ivar: my focus is on the differences between users with persistent and one-time permissions (and reducing them)

Guido: I think persistent permission is a signal of trust that we shouldn't try to override

Jan-Ivar: in order to get a user to give permission to a mic/camera, you need to let the user pick it, but if they can only be listed if permission has been granted, this is a catch-22

Jan-Ivar: we end up getting user reports on subpar experience for the non-persistent permission workflow
… hence my desire to improve it

RESOLUTION: #1019 is ready for a PR with a partial solution as per the proposal, with a MAY for additional relaxing and clarification on successful gUM
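
A minimal sketch of the lobby flow discussed above, assuming the behavior under discussion, where enumerateDevices() exposes no labels or device ids until a gUM call has succeeded. The interfaces and function names (DeviceInfoLike, MediaDevicesLike, isPlaceholderList, listDevicesForLobby) are hypothetical stand-ins so the sketch compiles without DOM typings; in a browser one would pass navigator.mediaDevices.

```typescript
// Hypothetical minimal shapes mirroring MediaDeviceInfo and navigator.mediaDevices.
interface DeviceInfoLike { kind: string; deviceId: string; label: string; }
interface StreamLike { getTracks(): { stop(): void }[]; }
interface MediaDevicesLike {
  enumerateDevices(): Promise<DeviceInfoLike[]>;
  getUserMedia(constraints: { audio: boolean; video: boolean }): Promise<StreamLike>;
}

// True when the list has entries but nothing a device picker could show.
function isPlaceholderList(devices: DeviceInfoLike[]): boolean {
  return devices.length > 0 &&
    devices.every((d) => d.label === "" && d.deviceId === "");
}

// Lobby flow: enumerate first, and only prompt when the list is not usable yet.
async function listDevicesForLobby(md: MediaDevicesLike): Promise<DeviceInfoLike[]> {
  const before = await md.enumerateDevices();
  if (!isPlaceholderList(before)) return before;
  // The catch-22: the user has to grant access before they can pick a device.
  const stream = await md.getUserMedia({ audio: true, video: true });
  const after = await md.enumerateDevices();
  stream.getTracks().forEach((t) => t.stop()); // we only wanted the list
  return after;
}
```

With persistent permission the first enumeration may already be usable and no prompt is needed; with one-time permission a returning user goes through the prompt again, which is the difference between browsers that the discussion aims to reduce.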

Issue #1035 - Support multiple echoCancellation modes 🎞︎

[Slide 13]

[Slide 14]

Youenn: a prototype would be nice
… I think there was discussion about removing some audio sources from echo cancellation
… which would require a more complex Web Audio API

Guido: the problem is that this only works for audio sources within the Web app - it could come from another web app or another application altogether
… so it wouldn't solve all the use cases, and would come with more complexity
… my main question is whether supporting the use case of cancelling remote participants or all audio sources is worth doing

Youenn: that seems worth pursuing

Jan-Ivar: it seems useful
… I'll want to consult with more audio folks at Mozilla

Guido: I'll prepare an initial proposal to iterate on

Youenn: in mediacapture-extensions

RESOLUTION: Proceed with a mediacapture-extensions pull request to address this use case

Issue #1036 - HTMLVideoElement.currentTime is increasing differently on different UAs 🎞︎

[Slide 15]

Guido: my impression is that this is a Chromium bug
… I think it's supposed to work the way Safari and Firefox are implementing it

Youenn: I'll file a bug

RESOLUTION: Firefox and Safari's behavior is correct and the spec doesn't need an update

WebRTC Encoded Transform 🎞︎

Issue #230 - Clarification on "not processing video packets" requested 🎞︎

[Slide 19]

[Slide 20]

Harald: the case for rejecting on an audio receiver is that its state will never change
… the inactive transceiver is problematic; not sure there is a point in rejecting a request for a key frame

Youenn: I don't have a use case

Jan-Ivar: this LGTM

RESOLUTION: proceed with proposal A

Issue #244 - Add audioLevel to RTCEncodedAudioFrameMetadata 🎞︎

[Slide 21]

Youenn: SGTM - it would apply to both Sender and Transceiver

Guido: right

RESOLUTION: prepare a PR

SFrame 🎞︎

[Slide 24]

[Slide 25]

Richard: we're interested in SFrameTransform for our WebEx Web client

Issue #214 - Evaluate how to expose SFrame packetization format to RTCScriptTransform and SFrameTransform 🎞︎

[Slide 26]

Harald: a year ago we had a proposal for adding encoded frames to O/A through a property to indicate the packetization mode to use
… if we have a specific packetization mode for SFrame, using that encoding format allows using a ScriptTransform that is compatible with SFrame
… and the native SFrameTransform would work

Youenn: a single value per transform sounds better than per frame

Harald: the proposal for multiple payload types in ScriptTransform describes how to deal with this

Youenn: this sounds promising, we should revive that effort

Richard: ScriptTransform could impact packetization in many other ways beyond what can be taxonomized
… I wonder if the better solution is not to make this possible via ScriptTransform at all, and leave the proper approach to SFrameTransform

Youenn: having a way to signal the packetization allows for a nicer migration path from ScriptTransform

Richard: wrt the dichotomy between payload type vs media type, I think both need to be exposed
… the transform needs to know both

Youenn: indeed; right now we're exposing the payload type which is equivalent to the media type; but SFrame changes this

Richard: +1 to attaching it to the Transform

Jan-Ivar: supportive of working on SFrame, and of favoring per-transform over per-frame

Youenn: let's try to make Harald's proposal work

Harald: I've been working on this - it has required major refactoring on the payload code in libwebrtc
… I don't think starting with a Boolean would help

Youenn: maybe an enum we can extend with additional values to support more use cases

Harald: the list of possible transforms is shorter than the list of codecs
… it would express to the media stack the class of frame it should assume for packetization

PR #186: Add description of an API for controlling SDP codec negotiation

RESOLUTION: use PR #186 as the starting point for this discussion

[Slide 27]

Harald: installing an SFrameTransform should obey normal SDP O/A rules
… ie if you want to send SFrame, that should be included in your SDP
… and if the other side doesn't support receiving it, you should not send it
… SFrame would be treated like any other codec

Youenn: when you remove a codec, the engine will select the next one; SFrameTransform can be set before or after negotiation

Harald: we should require that SFrameTransform sets the negotiation needed flag

Richard: we need to ensure that if an app expects to use SFrame content, content is only sent if it is SFrame-protected

Jan-Ivar: treating SFrame as a codec for negotiation makes sense, but you wouldn't want the UA to choose whether it's on or off

Richard: but that's consistent with requiring re-negotiation

Youenn: but you'll need to negotiate both the media type and the SFrame payload type

Harald: on the sender side, we have a similar case with RED
… if you negotiate both RED and Opus with RED first in your priority list, you get RED-encoded Opus
… if it comes second, it's only signaling ability to receive
… for SFrame, we would have to say it has to appear before the media type it would be used on
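
The ordering rule Harald describes can be sketched as a small helper. This is only an illustration under stated assumptions: sendModeFor and the SendMode values are hypothetical names, codecs are represented as bare names in negotiated priority order, and treating "sframe" as a codec name follows the idea floated in this discussion rather than any agreed SDP syntax.

```typescript
// Hypothetical result type for the sending side.
type SendMode = "wrapped" | "plain" | "not-negotiated";

// priority: codec names in negotiated priority order, e.g. ["red", "opus"].
// media: the media codec to send; wrapper: the wrapping format ("red" by default).
function sendModeFor(priority: string[], media: string, wrapper: string = "red"): SendMode {
  const names = priority.map((n) => n.toLowerCase());
  const mediaIdx = names.indexOf(media.toLowerCase());
  if (mediaIdx === -1) return "not-negotiated";
  const wrapIdx = names.indexOf(wrapper.toLowerCase());
  // Wrapper listed ahead of the media codec: send wrapped payloads.
  // Wrapper listed after it (or absent): it only signals ability to receive.
  return wrapIdx !== -1 && wrapIdx < mediaIdx ? "wrapped" : "plain";
}
```

For example, sendModeFor(["red", "opus"], "opus") yields "wrapped" while sendModeFor(["opus", "red"], "opus") yields "plain"; the same helper with wrapper set to "sframe" models the rule suggested for SFrame on the sending side.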

Youenn: I agree on the sending side; on the receiving side, not sure

Harald: I'm not sure how to describe the risk of receiving non-encrypted frames

Richard: this feels symmetric though - there are also expectations that what I receive is encrypted

Jan-Ivar: endpoints in an SFrameTransform are allowed to do many kinds of things; the new API Harald is proposing would be optional in any case

Youenn: so we seem to have agreement on the sender side with dropping frames; not clear on the receiving side yet
… we could start working on that part, before coming back to the WG with the receiving side

Richard: I don't think there are any valid use cases for negotiating both encrypted and not encrypted

Youenn: both the payload type and media types would need to be negotiated in the current proposal

Peter: we could have a new a-line to say we're only interested in sframe (e.g. a=sframe-only)

Richard: this sounds worth raising the topic to AVTCore

Harald: with the payload type negotiation for SFrame - it's a hop-by-hop payload type
… we need to resurface this discussion

Richard: with an SFrameTransform, you should only send sframe-encrypted packets / only process sframe-encrypted received packets

Youenn: ideally we would define SFrameTransform as a subcase of ScriptTransform

Dom: SFrameTransform gets you more (better packetization, possibly better trust signal) over what's currently possible with ScriptTransform

Youenn: and possibly we can make these properties also available to ScriptTransform

Harald: we may want to consider requiring setting an SFrameTransform ahead of negotiation

Youenn: please bring your ideas to issue #214

[Slide 28]

Youenn: SFrame can be used per frame or per packet
… currently SFrameTransform only works per frame, but there is a value to the per packet approach

Richard: in WebEx, we do it per media payload for a couple of reasons:
… * it's simpler to set up (?)
… * esp. for video, the per-frame approach requires getting the whole frame - losing one packet means losing the frame, and frames can only be processed when all packets have been received
… the proposal would be to add support for per-packet to ScriptTransform
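
To illustrate the loss behavior Richard mentions, here is a toy model, not taken from the slides: decryptableUnits is a hypothetical helper that counts how many protected units of one frame can be decrypted given which of its packets arrived. It deliberately ignores whether a decoder can make use of a partially received frame.

```typescript
// received[i] is true when packet i of the frame arrived.
function decryptableUnits(received: boolean[], mode: "per-frame" | "per-packet"): number {
  const arrived = received.filter(Boolean).length;
  // Per-packet: each packet is protected independently, so each arrival counts.
  if (mode === "per-packet") return arrived;
  // Per-frame: the frame is one protected unit and needs every packet.
  return arrived > 0 && arrived === received.length ? 1 : 0;
}
```

With one of three packets lost, decryptableUnits([true, false, true], "per-frame") is 0, whereas the per-packet mode still yields 2 decryptable packets.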

[Slide 29]

Jan-Ivar: leaving ScriptTransform aside, couldn't this be handled transparently with SFrameTransform with a constructor switch?

Youenn: that's the SFrameTransformOptions proposal
… this would work from an implementation perspective, but it kind of breaks the model of WebRTC encoded Transform

Peter: the API might be more complicated - both options may need to be negotiable
… a bit like bundling in WebRTC-pc

Jan-Ivar: that ties back to my question on whether SDP negotiation is really needed vs making it possible out of band

Youenn: my question is first and foremost on the model of WebRTC Encoded Transform which a per-packet approach changes, possibly adding more complexity

Jan-Ivar: we're supportive of supporting per-packet SFrame, possibly through a switch
… pending more clarity on the additional complexity

Harald: I'm not supportive of mixing these things - shoehorning per-packet in particular in the ScriptTransform model
… per-packet things should be hammered out in the RTPTransport API
… When it comes to the SFrameTransform, I'm not as critical - I don't think a switch is the right way; a different SPacketTransform API would work better

Jan-Ivar: is it not possible to have SPacket feeding input into a ScriptTransform pipeline?

Youenn: yes, this would work: the depacketizer would decrypt, construct the frame and pass it to ScriptTransform

Youenn: re SPacketTransform, would this fit into the existing sender transform API, or something completely different?

Harald: I think the former should work

Youenn: the two interfaces would share a lot of similar needs / error management

Jan-Ivar: this feels like something we can bikeshed

Youenn: probably with a mix-in interface for shared properties and methods

Youenn: separately, it seems to me it would be much simpler if SFrame packetization was codec specific
… I'll raise an issue to continue that discussion

– DRAFT –

22 April 2025

Attendees