Meeting minutes
Recording: https://
Slideset: https://
Should the remote track mute in response to replaceTrack(null)? 🎞︎
Jan-Ivar: I agree with your reading of the spec
… track.enabled is for the Web site, track.muted for the user agent
… media only flows if both are turned on
… the mute is a signal from the UA to say "this is why you're not seeing frames"
[support for this view from Youenn and Harald]
[support for proposal from Jan-Ivar, Youenn, Harald]
RESOLUTION: Proceed with the two proposals presented in the slides
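The enabled/muted model Jan-Ivar describes can be sketched as follows; the helper and wiring are illustrative, not spec text, but `track.enabled`, `track.muted`, and the `mute`/`unmute` events are the standard MediaStreamTrack API:

```javascript
// Media only flows when the site has enabled the track AND the
// user agent has not muted it.
function mediaFlows(enabled, muted) {
  return enabled && !muted;
}

// Illustrative wiring for a remote track (browser-only).
// The mute event is the UA's signal: "this is why you're not seeing frames".
function watchRemoteTrack(track) {
  const report = () =>
    console.log(`flowing: ${mediaFlows(track.enabled, track.muted)}`);
  track.addEventListener("mute", report);
  track.addEventListener("unmute", report);
  report();
}
```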
WebRTC-extensions receiver.on[c/s]srcchange event 🎞︎
Jan-Ivar: ssrc change based on decode vs RTP?
Henrik: they're based on the last decoded frame; you want the timing closest to reception time to take the jitter buffer into account, e.g. if you want to adjust the volume
Jan-Ivar: so these events allow apps to avoid polling. Can we make the timing more explicit in the PR?
… otherwise supportive
… timeline on fixing unmute?
Henrik: we had to revert due to a bug, but it's still on track to be fixed
RESOLUTION: Merge the PR with clarification on decode
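The polling the proposed events would replace can be sketched as below; `RTCRtpReceiver.getSynchronizationSources()` is an existing API, while the event-based replacement and its exact name are still under discussion in the PR:

```javascript
// Pick the most recently observed SSRC from the entries returned by
// receiver.getSynchronizationSources() (each has source + timestamp).
function latestSsrc(sources) {
  let latest = null;
  for (const s of sources) {
    if (!latest || s.timestamp > latest.timestamp) latest = s;
  }
  return latest ? latest.source : null;
}

// Browser-only polling sketch; the proposed ssrcchange-style event
// would make this loop unnecessary.
function pollSsrc(receiver, onChange, intervalMs = 100) {
  let last = null;
  return setInterval(() => {
    const ssrc = latestSsrc(receiver.getSynchronizationSources());
    if (ssrc !== null && ssrc !== last) {
      last = ssrc;
      onChange(ssrc);
    }
  }, intervalMs);
}
```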
WebRTC-extensions 5G network slicing 🎞︎
Peter: is there any risk to the UA enabling this and something going wrong with the app?
Youenn: it's a trade-off - using 5G network slices for latency might reduce your bandwidth (although it's usually not the case)
… there might be slices for preserving energy, so if the UA is doing it wrong, it might have downsides
… but in general, for typical webrtc apps, this should be fine
Peter: having the app be able to opt in and out is one thing; what the default should be is another question
Youenn: my assumption is that for PeerConnection, the default should be opt-in
TimP: This doesn't tie in with our experience of how carriers are delivering it to most customers
… network slices are on demand and cost money
… I'm not sure we can come to a good conclusion yet
Youenn: 5G network slices can be used in many different contexts - I'm focusing on the much narrower set of things exposed in iOS (and probably Android)
Jan-Ivar: I'm in support of keeping the UA in control and letting web apps declare their preference in terms of low-latency needs
Youenn: I tend to agree on a hint approach
… not sure we should tie this to 5G vs "best low latency possible" (for which the UA would pick a 5G network slice if available)
… WebTransport has a congestionControl attribute to guide the UA
Harald: to make 5G network slices usable for the Web app, there needs to be visibility on which slices are available under what constraints, or let the UA deal with it based on a declaration of needs from the app
… we've had this discussion about control ownership between app and UA any number of times
… big apps tend to want and need control, so do browsers
… given the pace of 5G rollout, I don't think we're in a hurry
Henrik: if you change which 5G network slice you use, does that change the ICE candidate you need to use?
Youenn: I don't think so; typically, for iOS, it's at the time you instantiate the connection (the UDP socket) that you'll need to tell that this particular connection should use a low-latency slice - and it will remain like this for the rest of the connection
Henrik: if you're changing from low-latency to bandwidth, would you need an ICE restart?
Youenn: I'm thinking of an immutable configuration here
Jan-Ivar: an important use case for WebTransport is MOQ - so not all WebTransport is low latency
… I support a hint, wouldn't want the app to learn about a slice being used
Youenn: so 1 or 3
… low-latency could be the default, with an opt-out
Jan-Ivar: but if it's a scarce resource, opt-in might be better
… we could use a 3 value enum ("default", opt-in value, opt-out value)
TimP: when we tried slicing last year, it got you a completely different IP address (although that might have changed since then)
… so I agree it would be hard to make it a dynamic setting
… a concern with the enum is that you might have different needs for uplink and downlink (which slices in theory can support)
Youenn: worth digging into this - if you have more details on uplink/downlink settings, that'd be useful; with an enum, we can add values over time
Peter: having an enum to say "I really really care about latency" feels a bit awkward given that the whole stack is built for latency
… esp. if it's just a synonym for enabling network slicing
Youenn: this could enable other optimizations later on
SunShin: NVIDIA is interested in taking advantage of 5G network slicing; we've enabled this on our Android client and would like to see it expanded to the Web client
Harald: to match TimP's point, should this be moved to Receiver/Sender instead of the PC?
Youenn: I'm hearing interest in a hint-based API, possibly with 3 values ("default", "low-latency", "not-low-latency"), a separation between uplink and downlink; I can come back with a concrete proposal along these lines
RESOLUTION: Craft a PR to webrtc-extensions to reflect discussion
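A sketch of what the hint discussed above might look like; the enum values follow Youenn's summary, but the member name `networkHint` and the shape of the API are hypothetical placeholders, not an agreed design:

```javascript
// Hypothetical 3-value enum per the discussion: "default" leaves the
// choice to the UA, the other two explicitly opt in or out.
const NETWORK_HINTS = ["default", "low-latency", "not-low-latency"];

function validateNetworkHint(hint) {
  if (!NETWORK_HINTS.includes(hint)) {
    throw new TypeError(`unknown network hint: ${hint}`);
  }
  return hint;
}

// Hypothetical usage - `networkHint` is a placeholder member name:
// const pc = new RTCPeerConnection({ networkHint: "low-latency" });
```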
SFrame processing model 🎞︎
Jan-Ivar: how do you get SFrameOptions to the ScriptTransform?
Youenn: it could be a type to the options, or an additional argument - something we can bikeshed on
Henrik: if a=sframe is not present but you wanted to use SFrame, you would renegotiate on the receiver side
… if it is present, is the SFrameTransform created for you?
Youenn: you will need to create it yourself (or a ScriptTransform), otherwise the packets will be dropped as they can't be decrypted
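A minimal sketch of the receiver-side setup Youenn describes, assuming the `SFrameTransform` interface from webrtc-encoded-transform (`role` option, `setEncryptionKey()`, assignment to `receiver.transform`); key provisioning here is illustrative:

```javascript
// Browser-only sketch: attach an SFrameTransform to the receiver so
// incoming media can be decrypted - otherwise packets are dropped.
async function setUpSFrameReceiver(receiver, key, keyID) {
  const transform = new SFrameTransform({ role: "decrypt" });
  await transform.setEncryptionKey(key, keyID);
  receiver.transform = transform;
  return transform;
}
```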
Henrik: so until the transform is set up, there is a race condition where the first few frames can be dropped because they can't be decrypted yet
Youenn: on the sender, if you start with no transform, you need to renegotiate; until the negotiation goes back to stable, the packets won't be able to flow - there will be delay in the switch
Henrik: [realizing there may not be a race condition after all]
… re a=sframe
Youenn: if A sends a=sframe, and B doesn't support it, B will respond without the a=sframe, and A will understand B doesn't support it
Jan-Ivar: how will that be exposed to the app?
Youenn: the UA will reject the m-line
Jan-Ivar: should we open the possibility for the app to fallback to no-sframe?
Youenn: this could be exposed with a reason why the m-line was rejected
Henrik: this isn't exposed with existing m-line rejections today
Youenn: the web app will have to react to the logic of rejection implemented by the UA
Brian: that's the trade-off of locking the sframe association to the m-line (which avoids the race condition Henrik was worrying about)
Youenn: yes, that rigidity helps avoid situations where e.g. the app would think sframe is set up when it wasn't actually
Brian: we should document the recommended way to support a fallback scenario (where the app prefers sframe but is happy to go without it) by providing two m-lines, with and without sframe
Henrik: if B doesn't support sframe, it might end with a receiver where nothing comes in - which is probably fine
Youenn: if we see web apps needing to parse SDP to understand sframe rejection, this might suggest we need an API to surface it
Henrik: maybe you can detect it through a stopped transceiver?
Youenn: right, but there could be multiple reasons
Kacpper: how should SFrame per-packet vs per-frame be negotiated? In SFrameOptions? In the Transceiver?
Youenn: the SFrameTransform object would have it set with an options object
… for ScriptTransform, there are ways to make it work, but we haven't received requests to support per-packet in ScriptTransform so far
Harald: the SDP rule is "ignore what you don't understand"
… if you want to offer "communicate in the clear or in sframe", you have to send an offer with sframe, receive an answer where it's removed, and then turn off sframe on that transceiver
… (in most cases, falling back to non-encrypted would at least require going back to the user, and so probably a different PC)
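The fallback dance Harald describes (offer with sframe, receive an answer where it's removed, then turn sframe off) might look roughly like this; the `a=sframe` string check is a naive illustration, the `signal` callback is app-defined, and whether the transform can be set back to null is exactly the open question discussed next:

```javascript
// Naive check for illustration only - real code would parse the SDP
// per m-section rather than scan the whole string.
function answerAcceptedSFrame(answerSdp) {
  return answerSdp.includes("a=sframe");
}

// Browser-only sketch of "offer sframe, fall back if rejected".
async function negotiateWithSFrameFallback(pc, sender, transform, signal) {
  sender.transform = transform;        // prefer SFrame
  const offer = await pc.createOffer();
  await pc.setLocalDescription(offer);
  const answer = await signal(offer);  // app-defined signaling round-trip
  await pc.setRemoteDescription(answer);
  if (!answerAcceptedSFrame(answer.sdp)) {
    sender.transform = null;           // fall back to no SFrame
  }
}
```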
Youenn: one situation that we'll need to consider is starting with no sframe, rolling back, then switching to sframe - we might need to allow switching the transform to null
Jan-Ivar: not sure why we would not support SPacket with ScriptTransform
Youenn: on the receiver side, you receive an SFrameChunk, either several RTP packets concatenated or a single packet, based on the payload
… if you're receiving per packet, on the receiver side you could receive an encoded videoframe
… the ScriptTransform would decode it; it would then feed it to its depacketizer until it has a full frame, which can then be passed to the WritableStream
Jan-Ivar: why would the JS even see the encryption? why can't the UA decrypt it for me, whether at the packet or frame level
Youenn: if we do that, we need to expose key management to ScriptTransform
… my thinking is that ScriptTransform would be used e.g. for crypto suites not supported by the UA
… It is possible to add SPacket support to ScriptTransform, but it requires new API on the sender side
… you need to tell the SFrame packetizer where to split chunks (either by enqueuing several frames, or by providing delimiters to the packetizer) - it's feasible, but it will require changes either to the API or the processing model
… since nobody has requested it, I think we can leave it for later
Jan-Ivar: maybe so, but we still need to clarify what gets exposed on the receiver side - in particular that it would need to go through the SFrameDecrypter (which isn't clear on the slide) rather than done transparently by the UA
Youenn: we could change "type" to "packetizationFormat" to clarify that
Henrik: making sframe fail fast sounds good, but the answerer may still think the m-section exists until the next negotiation
… the fallback scenario would not need a negotiation
Youenn: for simplicity's sake, rejection seemed easier; otherwise, this needs additional API surface
… we could add it later
Harald: is there any use case for migration? this would avoid the whole rollback discussion
… this would also help with the timing of the transform object
Henrik: +1
… I think it should be a prerequisite that the SFrame transform is set when you do the offer and the answer
… if we don't have to support migrations, we avoid the problems with rollback, but also race conditions during negotiation
Youenn: I like this, it would be simpler; the transform setter would throw if this wasn't negotiated
Brian: we would be interested in supporting both no-sframe-to-sframe scenarios and vice versa
Youenn: sframe comes with new payload types (based on Jonathan's feedback) which can help with disambiguating m-lines
… I think we should start with the simpler model Harald suggested, and extend later if we see real benefits for migration
Jan-Ivar: in my mind, SFrame or SPacket isn't really a matter of use case, it's only the underlying technology; I'll file an issue to follow up on this
Youenn: we could have a different setter, but that seems more complex
Youenn: I'll update the PR with the feedback, with a more constrained model; we'll discuss it in the editors meeting and see if it needs to come back to the WG, but I hear overall consensus on the direction
RESOLUTION: Update the Pull Request to align with the feedback at the meeting
Media Capture: Clarify what "system default" means 🎞︎
Henrik: do we need the app to care about browser or OS default?
Jan-Ivar: e.g. if you change the system default microphone in system settings in macOS, you'll get a devicechange event in Safari and Firefox with a changed order in enumerated devices
… this allows web sites to learn about user choices at the OS level, but it doesn't behave that way with the picker approach in Chrome
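The Safari/Firefox behavior Jan-Ivar describes can be sketched as follows; `devicechange` and `enumerateDevices()` are standard Media Capture APIs, and treating the first enumerated device of a kind as the default reflects the ordering convention from the minutes:

```javascript
// First enumerated device of a given kind - per the behavior
// described above, this reflects the (system) default.
function firstOfKind(devices, kind) {
  return devices.find((d) => d.kind === kind) || null;
}

// Browser-only: re-enumerate on devicechange to observe OS-level
// default changes (e.g. the user switching mics in system settings).
function watchDefaultMic(onDefaultChange) {
  navigator.mediaDevices.addEventListener("devicechange", async () => {
    const devices = await navigator.mediaDevices.enumerateDevices();
    onDefaultChange(firstOfKind(devices, "audioinput"));
  });
}
```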
Youenn: for microphone, there is an OS default - in that case, FF's behavior follows my reading of the spec
… there is no OS default for cameras, so that leaves us in a bit of limbo with an undefined behavior
… it's a real issue - some web sites get a track and, if they don't get the device id they expect, re-ask with the first enumerated device
… I'm not sure how to handle the camera case
Jan-Ivar: that's an issue we have with Teams (we're working on it with them); Firefox has had that behavior for years, so most web sites should be OK
… aligning with Firefox should help increase web compat
Guido: I agree with Youenn that if interpreting system default as OS default, Chrome has a bug for microphones here
… the intent from the Permissions team was to show the device chosen by the user to be "more default" (featured more prominently)
… I agree we should fix it for microphones; for camera, since there is no system default, we can't assume there is one, and apps shouldn't assume there is one, which is why the "default" entry is useful
Jan-Ivar: there is often a "primary" camera that would be good to list first
Guido: but "primary" might also apply to the one chosen by the user
… we would have to define it, and it's not obvious what this would be
Youenn: +1 to filing an issue on camera, informed by what Chrome is doing for primary
Jan-Ivar: what about support for devicechange event on change to system default?
Youenn: that would benefit from understanding the underlying approach to picking the primary camera in Chrome - e.g. if several pages are capturing, this may trigger a devicechange event
RESOLUTION: Confirm the spec is as expected for mics (and Chrome needs a fix), camera needs more discussion on interop in a dedicated spec issue
Detect speech on muted microphone 🎞︎
Youenn: there is a solution to that in the Media Session API: a voice-activity action handler that is called when voice activity is detected while the mic is muted
… this is already implemented in Safari
… it's not at the track or getUserMedia level
… that allows muting the microphone while letting the web site use Media Session to request an unmute
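A rough sketch of the Media Session approach described above, assuming a `"voiceactivity"` action name per the minutes; using `track.enabled` as the mute mechanism is a simplification for illustration (the actual feature targets UA-level capture mute), and `showUnmutePrompt` is a hypothetical app callback:

```javascript
// Browser-only sketch: the page mutes capture but registers a handler
// that the UA calls when it detects speech while the mic is muted,
// letting the app prompt the user to unmute.
function enableVoiceActivityPrompt(track, showUnmutePrompt) {
  track.enabled = false; // simplified page-level mute
  navigator.mediaSession.setActionHandler("voiceactivity", () => {
    showUnmutePrompt(() => {
      track.enabled = true; // user chose to unmute
    });
  });
}
```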
Guido: I like the direction, and it's good to hear media session has it (would need to check support for multiple mics)
… I'm not completely sure having that will suffice to get web sites to stop keeping the mic open - one use case is keeping the audio processing model working correctly with recent audio
… e.g. to avoid echo
… even if it is done in the browser, it needs to be warmed up to do the cancelling properly
Jan-Ivar: good info; in any case, I'll take a look at Media Session
Henrik: there may also be apps doing more advanced voice detection than what the UA would provide with such a mechanism
… but I agree the discrepancy between app UI and device signals is creepy