W3C

– DRAFT –
WebRTC November 2023 meeting

21 November 2023

Attendees

Present
AlfredHeggestad, Bernard, Carine, Dom, EeroHäkkinen, Elad, Fippo, Florent, Guido, Harald, Henrik, Jan-Ivar, PatrickRockhull, PaulAdenot, PeterThatcher, Riju, Sameer, StefanHolmer, SunShin, TimPanton, TovePetersson, Youenn
Regrets
-
Chair
Bernard, HTA, Jan-Ivar
Scribe
dom

Meeting minutes

Recording: https://youtu.be/xJMXnf3Qwh8

Slideset: https://lists.w3.org/Archives/Public/www-archive/2023Nov/att-0005/Copy_of_WEBRTCWG-2023-11-21.pdf

Mediacapture-extensions 🎞︎

Issue #121: Background Blur: Unprocessed video should be mandatory 🎞︎

[Slide 11]

Hta: related to "three thumbs up" that will be presented later in the agenda

Elad: that one will focus on the mute state

Hta: it would be much easier when there are multiple layers that can have an opinion on whether an effect will be applied if there was a way to disable the effect

Riju: this sounds nice, but it's not clear that this can be supported on macOS at the platform level

Youenn: it's correct that there are OSes where this would not be possible at the moment
… this "MUST" would not be implementable on these OSes
… I would suggest making it a SHOULD, dependent on it being feasible (e.g. some cameras may not allow it)

HTA: we could have a MUST with a note that this won't be implemented everywhere

Youenn: the risk would be to discourage implementation of background blur

Youenn: SHOULD would be preferable

HTA: I'll try to phrase something making it clear that deviating would only be acceptable if there is a good reason not to implement it

Elad: background blur is a problem we want to solve, but it is one of several issues related to these effects and post-processing steps that can be applied by the UA or the OS
… it's a complicated space, and we shouldn't tie our hands too early; we could come back with a more generalized approach in December

HTA: next time I expect there will be a specific proposal to review

Bernard: for audio, we had a way to request specifically unprocessed audio
… it helps in that it is forward compatible

Jan-Ivar: capabilities should reflect what can and cannot be done
… not try to impose what the OS can do

HTA: we've helped push VP8 support at the OS level, so this wouldn't be unprecedented

TimP: if I had a system setting to enforce background blur, I wouldn't want the UA to override it
… this has privacy impact
… this also illustrates that there may be different rules across capabilities

HTA: discussion to continue on the bug with considerations on privacy, OS-implementability, and harmonization across effects/post-processing
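
[Scribe note: for illustration only — a minimal sketch of what "disabling the effect" could look like for a page, using the backgroundBlur constraint from mediacapture-extensions; whether the request can be honored on every OS/camera is exactly the MUST vs SHOULD question above.]

  // Open the camera, then ask for blur to be turned off if the capability
  // reports an unblurred mode; under a SHOULD, this may still fail
  // (e.g. OverconstrainedError) on some platforms.
  const stream = await navigator.mediaDevices.getUserMedia({ video: true });
  const [track] = stream.getVideoTracks();
  const cap = track.getCapabilities().backgroundBlur;
  if (cap && cap.includes(false)) {
    await track.applyConstraints({ backgroundBlur: { exact: false } });
  }
  console.log("backgroundBlur setting:", track.getSettings().backgroundBlur);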

Issue #129: [Audio-Stats] Disagreement about audio dropped counters 🎞︎

[Slide 12]

Henrik: in previous meetings, we agreed on a set of audio stats to measure glitches

[Slide 13]

Henrik: looking for a decision on whether to expose audio frame drops to JS

[Fippo emoji-supports]

TimP: +1 on exposing it; there are ways to react too (e.g. turning up FEC, changing p-time, ...)
… it's useful for the app to see this

Youenn: it's fine as long as there are no fingerprinting issues - we should look at that
… e.g. maybe audio drop patterns may identify a device
… we should have the same discussion with video frame drops
… on macOS there is no way for video frames to be dropped

Paul: this is about local things - no network or packets involved
… i.e. microphone to the page only
… re fingerprinting, if there is a certain device showing issues, the UA is responsible for dealing with it
… FF cannot drop audio between microphone and the Web page - it cannot happen
… except if the machine is overloaded with real-time threads, but that's an edge case since at that point other threads wouldn't work either
… so this feels like UA architectural bugs, and thus not something that should be exposed to the Web
… for videos, dropping frames would be expected since the perceptual constraints are different

Henrik: any time we do a perf-related experiment, we see issues
… I don't think we can mandate that these issues can't happen
… even if it was only a UA bug, there would still be value in exposing this, e.g. for bug reports

Harald: 3 parties in this game: the platform, the UA, the app
… all 3 have opportunities to mess up audio
… independently
… many UAs run on multiple OSes and multiple OS versions, with different features and different bugs
… there will be many cases of these combinations
… Paul mentioned instrumentation and telemetry
… the use case of WebRTC is so small that you have to have dedicated telemetry to make it show up at all
… having the ability to have the application report on what happens when *it* runs, and not in the general case when the UA runs is important
… my conclusion is that we should expose this to JS

TimP: is it really the case that changing the framerate and encoder settings would have no impact?

Paul: this is before encoding

Harald: this would have an impact in the PC

Henrik: adding encoder load could have an impact

TimP: surely the sample rate from the microphone is affected by p-time?
… overall, it's not implausible you could influence it from the app; but in any case, would be good to have the information

youenn: native apps have access to this info
… the app could decide e.g. mute capture in some circumstances
… unless there are security or privacy issues, I don't see a reason not to expose it to Web apps as well
… in terms of telemetry, the app could have telemetry of its own
… I still support implementing it

Paul: there is nothing you can do in a Web page that will change the complexity of the CPU usage on the real-time thread that is used for input, except disabling @@@

Henrik: we have real-world data from Google showing playout glitches due to lost frames; the software/hardware encoder has an impact, as does device quality

HTA: I'm hearing rough consensus except for Paul

Jan-Ivar: I support Paul; FF would not implement this API since we don't think it's needed
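
[Scribe note: not shown at the meeting — a hypothetical sketch of what consuming such counters from JS could look like; the attribute names (track.stats, totalFrames, deliveredFrames) are placeholders taken from the audio-stats proposal under discussion, not agreed API.]

  // Hypothetical: poll the capture track's audio stats and treat the gap
  // between produced and delivered frames as local glitches, which the app
  // could then report in its own telemetry.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const [track] = stream.getAudioTracks();
  setInterval(() => {
    const s = track.stats;                            // placeholder attribute
    const dropped = s.totalFrames - s.deliveredFrames;
    console.log(`locally dropped audio frames: ${dropped}/${s.totalFrames}`);
  }, 10000);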

WebRTC Grab Bag 🎞︎

Issue 146 Exposing decode errors/SW fallback as an event 🎞︎

[Slide 14]

youenn: looking at the privacy comment, I'm not reading it as "this is fine"

youenn: I don't see how we could isolate these events across origins
… we could try to go with a PR, but this will need a closer privacy look before merging

Issue 92: Align exposing scalabilityMode with WebRTC “hardware capabilities” check 🎞︎

[Slide 15]

[Slide 16]

[Slide 17]

Bernard: some of the modes are supported, but not as "smooth" or "powerEfficient", despite the machine being high-spec
… the hardware acceleration is not exposed for SVC

Henrik: on ChromeOS we can do some of the SVC modes power-efficiently; L1T1 on Windows
… there is what the device can do, what the UA has implemented, and how accurately this is represented in powerEfficient
… on Windows there are lots of devices where L1T2 is available but not exposed in the UA yet

Bernard: webrtc-pc doesn't expose whether something is powerEfficient, only if it is supported

[Slide 18]

Bernard: the proposal is to bring something back to SVC if/when Media Capabilities limits this exposure

Jan-Ivar: hardware support being limited today doesn't mean it will be tomorrow
… but in general, +1 to bringing this to media capabilities
… the "capture check" is not necessarily a good fit for all use cases (e.g. games over data channel)
… it also risks driving apps toward escalating permissions

Henrik: I'm hearing agreement that media capabilities should solve this
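
[Scribe note: for reference — roughly what the Media Capabilities query discussed here looks like; the "webrtc" configuration type and the scalabilityMode member come from the WebRTC extensions to Media Capabilities, and the values below are only an example.]

  // Ask whether VP9 L1T3 encoding for WebRTC is supported, smooth and
  // power-efficient on this device; webrtc-pc's getCapabilities() only
  // answers "supported", not "powerEfficient".
  const info = await navigator.mediaCapabilities.encodingInfo({
    type: "webrtc",
    video: {
      contentType: "video/VP9",
      width: 1280,
      height: 720,
      bitrate: 1500000,
      framerate: 30,
      scalabilityMode: "L1T3",
    },
  });
  console.log(info.supported, info.smooth, info.powerEfficient);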

SDP Issue 186: New API for SDP negotiation 🎞︎

[Slide 24]

HTA: same functionality, different API shape

[Slide 25]

[Slide 26]

[Slide 27]

HTA: with the API on the transform, the app would have to talk to the transform, and the transform would have to talk to the SDP, which seems bad

Jan-Ivar: the Transform between the encoder and the packetizer is not the one we're talking about
… we're talking about RTCScriptTransform, which runs on the main thread and which would say "here is how this transform should be exposed in terms of codecs"
… it's an inherent property of the Transform
… I don't think we should organize APIs around SDP but around the functionalities as perceived by Web developers

HTA: the purpose of isolating SDP is to not entangle SDP with stuff that we might want to use without SDP
… keeping a distinction is a good thing in that sense

Jan-Ivar: transceiver.sender/receiver are always created at the same time

HTA: at the moment yes

Youenn: at some point we'll have SFrame packetization that we'll likely want to use in connection with SFrameTransform
… I wonder if looking at how SFrame would work in this model would help establish the underlying infrastructure

HTA: when installing an SFrameTransform, SDP has to be affected, but the sframe spec doesn't say how yet

Bernard: the SFrame packetization spec covers RTP

Peter: it's a work in progress

HTA: not sure we can depend on that progress to drive our architectural decisions

Henrik: in my mind, a transceiver and an m-section map one-to-one
… in Jan-Ivar's proposal where the Transform contains information about SDP - would the only difference be that the transceiver asks the Transform what its payload type is?
… is there a huge difference between the two? e.g. would it be the same number of methods?

Jan-Ivar: my proposed API is much simpler - it's one step done through the constructor of the transform
… it's not true that Transceiver encompasses all the negotiation needs (e.g. addTrack / createDataChannel)
… using two codec names is confusing - the packetization should be a subproperty of the codec

HTA: I look forward to a more fleshed out proposal in that direction

[Slide 28]

Fippo: I would say the PT
… since the mime type is the result of the lookup of the payload type

Jan-Ivar: I was going to say the opposite, but Fippo's point makes sense
… how would you find the PT?

Harald: go through the codecs from getParameters

Henrik: if it's a one-to-one mapping, the ergonomics would favor the mime type

HTA: if we move a frame between PCs, the PT may not have the same meaning, while the mime type does

youenn: I would go with PT; PT and mime going out of sync may suggest there should be a different field for packetization

HTA: you could specify that once you enqueue a frame, it sets one based on the other and ignores it if it can't find the mapping

[fippo: +1]

HTA: I'm hearing slight favor for the mime type, but this needs more discussion - I will summarize it on the GitHub issue
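
[Scribe note: a small sketch of the lookup Harald mentioned — using the existing RTCRtpSender.getParameters() codec list to translate a mime type into the payload type negotiated on this connection; which of the two a packetization API should take is the open question above.]

  // Find the payload type currently mapped to a codec mime type on a sender;
  // the same codec can get a different PT on another RTCPeerConnection,
  // which is the portability concern HTA raised.
  function payloadTypeFor(sender, mimeType) {
    const codec = sender.getParameters().codecs.find(
      (c) => c.mimeType.toLowerCase() === mimeType.toLowerCase());
    return codec ? codec.payloadType : undefined;
  }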

RtpTransport 🎞︎

[Slide 31]

[Slide 33]

Bandwidth Estimation 🎞︎

[Slide 35]

[Slide 36]

Bernard: the RtpTransport is both for sending and receiving - right?

Peter: right

Stefan: have you thought about what it would look like for a user of this API who wants to implement their own bandwidth control in the Web app?
… would this be done through this API or something else?

Peter: I have thought about that, think we should discuss it, but not today :)

Jan-Ivar: the other transports have back-pointers from the transport; shouldn't this be under the dtlsTransport?

Peter: the dtlsTransport should be under the rtpTransport, but we can't change this at this point

Orphis: with Bundle semantics, I think you really need it on each media section

Forwarding 🎞︎

[Slide 38]

Using existing m-lines 🎞︎

[Slide 40]

[Slide 41]

[Slide 42]

HTA: this crosses multiplexing
… I would rather not make it too easy to go over these bumper lanes

Jan-Ivar: this is already lower level than I'm comfortable with; this is a very low-level API
… if every sender has an RtpTransport, does it also show up if it has a track?
… I was imagining more of a new type of datachannel
… this would be mutually exclusive with the track API
… rather than exposing a JS API to programmatically send packets
… the benefit of using a writable, as we're seeing for WebTransport, is that it helps let the UA manage the throughput and keep performance
… I'm looking for an API that allows to change the source of RTP rather than control of RTP

Florent: how would the API work with the various ways of encrypting data in an RTP packet, e.g. cryptex?

Peter: we haven't talked yet about encryption of header extensions - a good topic to cover, like Stefan's; this would need more time

Henrik: you could register to send for a specific payload

Peter: I'd need to see examples of what you're thinking to help
… I think a more in-depth design meeting would be useful

Bernard: the point about cryptex makes it clear that there needs to be a facility to create a fully-formed packet for the wire
… WHATWG streams won't do SRTP when you call write()

Dom: so the Chairs should set up a separate meeting for more in-depth design discussions

Multi-mute (Three Thumbs Up - Setup) 🎞︎

[Slide 58]

Elad: we'll focus on muting today

[Slide 59]

[Slide 60]

[Slide 61]

[Slide 62]

Elad: this is a problem worth solving

[Slide 63]

[Slide 64]

Elad: I think exposing upstream state is more important than changing it

[Slide 65]

Elad: the mute event doesn't suffice - it is defined as a temporary inability to provide media
… muted refers to the input to the MediaStreamTrack

[Slide 66]

[Slide 67]

[Slide 68]

[Slide 69]

[Slide 70]

Jan-Ivar: I hear 3 proposals: muteReasons, potentiallyActionable, requestUnmute
… I support requestUnmute
… I think muteReasons should only have 2 states; things outside of the browser can be correlated across origins
… regarding requestUnmute, we already have a toggleMicrophone action in the Media Session API
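
[Scribe note: purely hypothetical — a sketch of the shapes being debated; muteReason, its values, and requestUnmute() are placeholders from the slides and this discussion, not agreed API, and track / showMutedBanner / unmuteButton stand in for the app's capture track and UI.]

  // Hypothetical: tell the user why the track is muted and, on a user
  // gesture, ask the UA (and where possible the OS) to unmute it.
  track.addEventListener("mute", () => {
    showMutedBanner(track.muteReason);   // placeholder attribute and app UI helper
  });
  unmuteButton.addEventListener("click", async () => {
    try {
      await track.requestUnmute();       // placeholder method; may be gated on user activation
    } catch (e) {
      // UA/OS declined - fall back to pointing the user at OS-level controls
    }
  });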

Elad: re muteReasons, having more granularity would be nice to deal with the diversity of OSes

Jan-Ivar: the UA would be responsible for dealing with the upstream OS when it can't handle requestUnmute on its own

Youenn: I see convergence on requestUnmute
… depending on how scary calling that method would be for the Web app, it may impact how much information to expose in muteReasons
… in terms of the boolean, some of the definitions wouldn't be implementable e.g. in iOS
… hence why we should focus on requestUnmute first
… requestMute would be nice for later

Elad: requestMute would risk muting other apps - but it feels orthogonal anyway
… re boolean, see [slide 66]
… if we don't expose the distinction between UA and OS (e.g. call it "upstream"), would that work for you?

Youenn: I would want to better understand requestUnmute
… I believe that will help drive the discussion on the boolean value - I'm not opposed to it

Elad: I would argue that the MuteSource is useful even if requestUnmute is never called

Youenn: the case that is interesting is the one where the user muted

Guido: right now, the spec doesn't define muted the way Youenn suggests; any interruption in the capture causes a muted event
… it doesn't reflect user intent

Jan-Ivar: the examples from the spec refer to user-intended actions
… maybe we should fix the spec to allow the distinction between a "temporary" mute and a user-intended mute

Elad: changing the spec and risking breaking existing implementations would be painful compared to just adding a boolean

Jan-Ivar: I would be happy to propose slides with toggleMic / toggleCamera
… muteSource has value (but not with so many values)

Harald: OS capabilities change over time; we shouldn't limit ourselves to the current capabilities

Guido: re media session, would this be re-using the event or interacting with the media session API?

Jan-Ivar: the event
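
[Scribe note: the existing hook Jan-Ivar refers to — a minimal sketch of the Media Session "togglemicrophone" action and setMicrophoneActive(), both already defined in the Media Session spec; micTrack stands in for the app's capture track.]

  // React to a mic toggle surfaced by the UA/OS and keep the UA's
  // microphone indicator in sync with what the page actually did.
  navigator.mediaSession.setActionHandler("togglemicrophone", () => {
    micTrack.enabled = !micTrack.enabled;
    navigator.mediaSession.setMicrophoneActive(micTrack.enabled);
  });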

Bernard: given how much content we didn't cover, should we schedule another meeting in December?

Minutes manually created (not a transcript), formatted by scribe.perl version 208 (Wed Dec 21 15:03:26 2022 UTC).