W3C

– DRAFT –
WebRTC May 2023 meeting

16 May 2023

Attendees

Present
Bernard Aboba, Carine Bournez, Dom, Elad Alon, Florent Castelli, Guido Urdaneta, Harald Alvestrand, Henrik Bostrom, Jan-Ivar Bruaroey, Jared Siskin, Patrick Rockhill, Peter Thatcher, Philipp Hancke, Sameer Vijaykar, Sun Shin, Tim Panton, Tony Herre, Youenn Fablet
Regrets
-
Chair
Bernard, HTA, Jan-Ivar
Scribe
dom

Meeting minutes

Recording: https://www.youtube.com/watch?v=XqYcdxWvlVw

Slideset: https://lists.w3.org/Archives/Public/www-archive/2023May/att-0000/WEBRTCWG-2023-05-16.pdf

WebRTC-NV Use Cases 🎞︎

[Slide 10]

TimPanton: we're renaming the "NV" Use Cases document to "WebRTC Extended Use Cases"
… questions were raised about its usefulness, which I think are worth discussing
… looking for guidance and some sort of agreement of possible improvements

[Slide 11]

RFC7478: Web Real-Time Communication Use Cases and Requirements

[Slide 12]

TimP: "NV" is no longer what we're doing, so we renamed it to "WebRTC Extended Use Cases"
… different level of consensus across use cases and requirements

[Slide 13]

[Slide 14]

Bernard: the Machine Learning use case is a weird one - it does raise a question of what use cases are for
… in IETF, we distinguish applicability docs from use cases doc
… the applicability tells you things the standard doesn't apply for
… documenting that something can be done with the tech would not show up in use cases, but in applicability doc

TimP: worth discussing - will be covered somewhat later

[Slide 15]

[Slide 16]

[Slide 17]

TimP: WISH is addressing 3.9 and 3.10, and yet the use cases don't have consensus

[Slide 18]

[Slide 19]

[Slide 20]

[Slide 21]

[Slide 22]

TimP: the proposal to rename it is in progress
… I want to focus on things that we can only or best do with WebRTC - that implies refocusing on P2P
… since that's what WebRTC uniquely does
… similar to what RFC7478 did
… we should take out use cases & requirements done by other standards
… I think we need to include use cases that don't have new requirements but extend what RFC7478 describes
… e.g. IoT
… We should remove use cases or requirements that don't get consensus in a few months - they can always be added back
… we should remove use cases that otherwise don't add new requirements
… proposed API changes should all come with changes to the use case doc if there is no existing use case for them - otherwise, why are we changing the API?
… this also raises the question of the relationship between explainers and the use case doc
… we've also been struggling with where the input comes from; happy to use webrtc.nu to that end if that's of interest

Bernard: good suggestions - my opinion on them
… +1 to the importance of P2P; this has been a point of confusion in other WGs' use cases
… Streaming use cases often require P2P operations
… All of the major cloud streaming services use WebRTC also for P2P among clients
… "met by other standards" - I would like to see wide usage; WebRTC often wins over other standards because of its reach
… re removing non-consensus content, +1
… A big question: are we implying that we need to give blessing to a use case to make it "valid"?

TimP: I think for a small developer, getting some level of reassurance about whether your usage works by accident or is something that can be relied on makes a big difference
… as illustrated in the quote in slide 21

Bernard: but even if you put in the doc - it doesn't create a requirement for browsers to support it properly

TimP: True, but when somebody comes up with a change to the API that removes the ability to do this, there is no structural way to push back on the change

Harald: one of the thing that surprises me in this handling of the doc is the handling of consensus
… e.g. the "one way use case" - the developers are eager to use it, but the WG is pushing back on consensus
… likewise for trusted JS conferencing - everyone is doing it, but we removed it for lack of consensus
… I'm worried by the distance between the use case doc and the use cases that I need to support in the real world

Peter: +1 to getting our use cases in order, +1 to having a place where developers can ask for things and get a status on "yes, it is in", "yes, it will be soon", "we can't figure out how"

TimP: asking for things shouldn't be asking for features

Jan-Ivar: +1 this needs a clean up
… we should be clear about the scope for this - this is only for this WG
… NV was supposed to be WebRTC 2.0
… instead, we've been unbundling WebRTC into other specs - which makes some of these use cases no longer in scope for our WG
… the purpose of this doc is to drive discussions for our WG and in our github
… to help decide what's in or out

Tony: hope we don't silo P2P vs alternatives; there may be use cases for integration with other standards

Youenn: in terms of scope, +1 to Jan-Ivar - this doc is for the WebRTC WG
… +1 to harald that we haven't had a lot of success with it
… explainers allow combining use cases, requirements, API and examples - that provides a better structure from my perspective

Youenn: maybe we should migrate towards that more than a standalone use cases doc

TimP: What remains in the use case doc then?
… is it a list of pointers to use cases in explainers?
… I'll take this to the list - I can put some time on this once I understand what we want to achieve
… I want to expand the scope a bit more than just for the WG

Bernard: I'll try to create PRs (esp removal PRs) for our call next month

WebRTC Extensions 🎞︎

Issue #134 / PR #164: Remove JSEP Modifications 🎞︎

[Slide 25]

Harald: the RFC errata process is not the process to make changes to an IETF document; only the IETF process does that
… now that the IETF process has turned the crank with the updated draft, we're in a better situation

[Slide 26]

Issue #158 / PR #167: Requesting a key frame via setParameters 🎞︎

[Slide 26]

[Slide 27]

Jan-Ivar: I see a requestFrame boolean - a bit odd for a parameter; why not make it a method? maybe it could be a counter?

Fippo: I didn't want to separate it into two methods as that creates all kinds of issues
… e.g. when deactivating a simulcast layer

Jan-ivar: this could be done with a Promise.all (vs a boolean attribute); but that's bikeshedding

Fippo: it could cause 2 keyframes to be generated

TimP: are there situations where setParameters isn't the right place for it?
… e.g. in encryption situations

Fippo: that's why it has been added to encoded transform
… but we haven't needed it so far for E2EE

Harald: I don't like setParameters setting something that isn't a parameter
… I would prefer something like sLD
… getParameters should return what setParameters set
… requestKeyFrame shouldn't be set

Fippo: maxBitrate is typically unset in the first call of setParameters

Harald: yes, but once it's set, it's set

Florent: I understand the need to have it synchronized with setParameters, but I'm not sure we want it in the encoding parameters
… we might want to have another member in the structure passed to setParameters
… that may also make it work for getParameters
… that would require rid validation - not sure that would necessarily be hard to do
… the symmetric API on a receiver would be very different since we don't have setParameters on receivers
… how would we handle this there?
… We do have getParameters on receivers

Fippo: there are use cases for this indeed, e.g. to adjust jitter buffer

Henrik: I support this use case

Jan-Ivar: just because the browser isn't doing the right thing doesn't necessarily imply that it needs to be exposed to JS
… e.g. it could be done automatically when active is set

Youenn: in encoded transform, there is sender/receiver API, and a transformer API
… they solve different issues
… this one is mostly targeting the sender API
… it makes sense to have it sync'd with setParameters
… the transform API would still remain, right?

Fippo: it's probably necessary for some E2EE use cases

Jared: this makes sense to me - two reasons to deactivate the top layer: the person left the call or they're switching down
… you'll almost always want to send a key frame when deactivating a higher layer, except for the last participant
… this may not require an API change for most cases

Peter: I think this is a good approach to solve the real-world need in a simple way

RESOLUTION: refine PR #167 based on input during the call

JIB: I'd like to explore whether we need an API at all
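The shape discussed for PR #167 can be sketched with a mock sender (the requestKeyFrame member name and its placement on the encoding are assumptions here, not the agreed design). The point of the setParameters placement is atomicity: the keyframe request lands in the same call that, e.g., reactivates a simulcast layer, and the flag is consumed rather than stored, so getParameters would not echo it back:

```javascript
// Mock of the proposed shape (names are assumptions, not final). For
// simplicity this mock patches only the encodings that are passed in,
// unlike the real API, which requires the full encodings list.
class MockRtpSender {
  constructor(encodings) {
    this.encodings = encodings;   // e.g. [{ rid: "hi", active: false }, ...]
    this.keyFrameRequests = [];   // rids that asked for a keyframe
  }
  setParameters(params) {
    for (const enc of params.encodings) {
      const current = this.encodings.find(e => e.rid === enc.rid);
      Object.assign(current, enc);
      if (enc.requestKeyFrame) {
        this.keyFrameRequests.push(enc.rid);
        // Consumed, not stored: addresses the concern that a keyframe
        // request isn't really a "parameter" to be read back later.
        delete current.requestKeyFrame;
      }
    }
    return Promise.resolve();
  }
}

const sender = new MockRtpSender([
  { rid: "hi", active: false },
  { rid: "lo", active: true },
]);
// Reactivate the "hi" simulcast layer and request a keyframe atomically,
// avoiding the two-keyframes problem a separate method call could cause.
sender.setParameters({
  encodings: [{ rid: "hi", active: true, requestKeyFrame: true }],
});
```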

Issue #159: RTCRtpEncodingParameters: scaleResolutionDownTo 🎞︎

[Slide 28]

[Slide 29]

Peter: what happens if you set the top layer to 360, and the next layer scale down by 2?

Henrik: if you mix the two, we should throw an exception

Peter: so either a factor, or a specific value
… do we need any specific value?

Henrik: same as for scaleResolutionDownBy - forbidding upscaling

Peter: what happens when the top layer gets dropped?

Henrik: generally, do the same as scaleResolutionDownBy

Youenn: are there existing workarounds with setting active to false?
… may create a short glitch

Henrik: there are workarounds, but I'm not sure how good they are in practice

Florent: an API like this would be great
… there is a problem with orientation (portrait vs landscape) in terms of interpreting that value
… also, as Peter asked, what happens if you feed frames that are smaller into layers that expect bigger frames
… this may result in several layers with a single lower resolution

Henrik: should I provide a PR?

Youenn: I would want to be careful if this adds a lot of complexity

RESOLUTION: Overall support in the direction, but details need to be fleshed out and complexity assessed
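The validation semantics discussed can be sketched as follows. The member name scaleResolutionDownTo and treating its value as a target height are illustrative assumptions; the real shape is not settled. Mixing the existing scaleResolutionDownBy factor with an absolute target throws, and upscaling past the source resolution is forbidden, mirroring the scaleResolutionDownBy rule:

```javascript
// Hypothetical validation for issue #159 (names and height-based target
// are assumptions for illustration).
function validateEncodings(encodings, sourceHeight) {
  const usesFactor = encodings.some(e => e.scaleResolutionDownBy !== undefined);
  const usesTarget = encodings.some(e => e.scaleResolutionDownTo !== undefined);
  if (usesFactor && usesTarget) {
    // Peter's question: top layer set to 360, next layer scaled by 2 -
    // Henrik's answer: mixing the two styles should throw.
    throw new TypeError("cannot mix scaleResolutionDownBy and scaleResolutionDownTo");
  }
  for (const e of encodings) {
    if (e.scaleResolutionDownTo !== undefined && e.scaleResolutionDownTo > sourceHeight) {
      throw new RangeError("upscaling is not allowed");
    }
  }
}

// Mixing an absolute target with a factor in one encodings list throws:
let mixedThrew = false;
try {
  validateEncodings(
    [{ rid: "hi", scaleResolutionDownTo: 360 },
     { rid: "lo", scaleResolutionDownBy: 2 }],
    720,
  );
} catch (e) {
  mixedThrew = e instanceof TypeError;
}
```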

Media Capture Extensions 🎞︎

Issue #98 track.getFrameStats() allocates memory, adding to the GC pile 🎞︎

[Slide 30]

[Slide 31]

[Slide 32]

[Slide 33]

[Slide 34]

[Slide 35]

[Slide 36]

[Slide 37]

[Slide 38]

Youenn: the use case is to get stats, not real-time information
… that's why you call an API, you get a result; if you need it 10s later, you call it again
… with an interface, you get the object and then the UA starts the processing whether or not you're going to get data from it
… that's more appropriate for real-time usage
… it seems an ill-fit for the usage implied by the name getStats
… so I would go with getStats()
… if we need real-time info later, they can come through different APIs
… WebRTC already has this pattern: getStats vs requestVideoFrameCallback
… so +1 to Proposal A

Jan-Ivar: unfortunately Paul couldn't join us today
… Paul commented on the issue about IPC - the data has to be in the content process eventually
… "real-time" is a spectrum (e.g. audio vs graphics)
… this WG has fallen into a pattern - it would be useful to compare e.g. with the Media WG; not everything has to be a stat
… there are ways to implement this without locking (which the design guide asks not do in a getter)
… it all depends on the use case
… other WGs with real-time have APIs already - media.currentTime or audioCtx.outputLatency

Henrik: the use case is to call it at 1Hz
… I fail to see use cases to call it more frequently on the main thread
… if we're confident we're not getting other metrics, this would be fine
… I don't see a lot of ergonomics difference in using await for async
… if we do get requests for additional metrics, we would have painted ourselves into a corner

Youenn: with a promise-based API, it's clear that the results will be gathered asynchronously
… with a getter, it requires updating it very frequently or be specific on the update frequency in the spec
… the contract is clearer with async

Jan-Ivar: "we should make decision out of love not out of fear"
… i.e. based on the best information we have now, not anticipating all future uses

Youenn: if we have to design it as a real-time API - would the sync API still be the best choice?
… we don't have a real strong use case to guide us

Jan-Ivar: a variant of Proposal B is to have the attributes directly on the track

Henrik: it comes down to async vs sync; could you describe what you meant by "lockless"?

JanIvar: there is code that allows reading values from another process without a lock
… Paul would be the expert on this - he has implemented this for AudioContext (not yet landed in Firefox)

Harald: implementing it as proposal A seems faster and easier

Henrik: not detecting consensus at this point

Jan-Ivar: I'm not hearing a lot of interest in our attribute-based API - will discuss internally and get back to the group on this
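The two shapes debated above can be contrasted with a mock track. The getFrameStats name matches the issue title, but the fields and contract here are illustrative assumptions. Proposal A returns a one-shot snapshot per call, matching the ~1 Hz polling use case Henrik describes; an attribute-based interface (Proposal B) would instead require the UA to keep values fresh whether or not they are read:

```javascript
// Mock track sketching Proposal A (promise-based snapshot). Field names
// are assumptions for illustration.
class MockVideoTrack {
  constructor() {
    this._delivered = 0;
    this._discarded = 0;
  }
  // Test hook standing in for the UA's frame pipeline.
  _onFrame(discarded) {
    discarded ? this._discarded++ : this._delivered++;
  }
  // Proposal A: each call gathers a fresh, immutable snapshot; the async
  // contract leaves the UA free to hop processes to collect the data.
  getFrameStats() {
    return Promise.resolve({
      deliveredFrames: this._delivered,
      discardedFrames: this._discarded,
    });
  }
}

const track = new MockVideoTrack();
track._onFrame(false);
track._onFrame(false);
track._onFrame(true);
// App-side usage: poll at ~1 Hz, e.g.
//   setInterval(async () => render(await track.getFrameStats()), 1000);
```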

IceController: Prevent removal of candidate pairs w3c/webrtc-extensions#166 🎞︎

[Slide 41]

[Slide 42]

[Slide 43]

[Slide 44]

[Slide 45]

[Slide 46]

[Slide 47]

Peter: this makes sense as a first item to tackle - although it's more useful once combined with candidate pair selection
… I like the "removable" attribute and a setRemoval method
… the idleTimeout is interesting, but there may be underlying complexity there; also, the ergonomics of a max integer value aren't great

Harald: removing candidates isn't timer driven - candidates are removed when they fail
… the end-of-candidate spec'd in ICE would throw away everything when we reach completed
… ICE implementations wouldn't fit well with a timeout
… because of this, cancelable event is a better fit - it allows the ICE Agent to match the spec'd protocol

Jan-Ivar: not an ICE expert - but I prefer preventDefault()
… still not clear to me what the UA should do once it's prevented though?
… ICE Transport doesn't let you look at all the pairs... is this a problem of insight into the state of things?
… could you talk a bit more on what happens then?

Sameer: this is a first set of improvements - listing all candidate pairs would be another improvement
… keeping a candidate pair alive allows switching to a different pair later (via another additional API)

Peter: in practice, the UA would continue sending checks on that candidate pair

TimP: but then what happens when it SHOULD remove it - e.g. because it's no longer working; you don't want to prevent THAT default

Sameer: removal would be only for redundant pairs; failed candidates would be "deleted"

JanIvar: that seems confusing - maybe it's a bikeshedding issue

Sameer: the UA can also send cancelable events that aren't systematically cancelable

Dom: I think it's primarily a bikeshedding issue - important to keep separate "deleted" and "removed", possibly with better names

Jared: the RFC uses "prune"

Sameer: my initial proposal had this; we could revert to that
… I'm hearing most support for the cancelable approach, happy to write a PR for that

RESOLUTION: Write a PR for a cancelable event approach
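The resolved cancelable-event approach can be sketched as follows (event and member names are hypothetical, modeled on DOM cancelable events): the ICE agent fires an event before pruning a redundant pair, and the application calls preventDefault() to keep that pair alive as a failover target. Pairs that have actually failed would be "deleted" via a separate, non-cancelable path, which this mock omits:

```javascript
// Minimal stand-in for a DOM cancelable event.
class CandidatePairRemovalEvent {
  constructor(pair) {
    this.pair = pair;
    this.defaultPrevented = false;
  }
  preventDefault() { this.defaultPrevented = true; }
}

// Mock ICE controller: fires the event, then prunes unless prevented.
class MockIceController {
  constructor() {
    this.pairs = [];
    this.onremovecandidatepair = null;
  }
  _maybePrune(pair) {
    const ev = new CandidatePairRemovalEvent(pair);
    if (this.onremovecandidatepair) this.onremovecandidatepair(ev);
    if (!ev.defaultPrevented) {
      this.pairs = this.pairs.filter(p => p !== pair);
    }
  }
}

const controller = new MockIceController();
const primary = { id: "pair-a" };
const backup = { id: "pair-b" };
controller.pairs.push(primary, backup);

// App keeps the backup pair alive as a fast failover target; the UA would
// then continue sending checks on it, per Peter's point above.
controller.onremovecandidatepair = ev => {
  if (ev.pair === backup) ev.preventDefault();
};

controller._maybePrune(backup);   // kept: the app prevented the default
controller._maybePrune(primary);  // pruned: no preventDefault for this pair
```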

RtpTransport Follow-up 🎞︎

[Slide 50]

[Slide 51]

TimP: I'm not convinced these are use cases :)

Peter: fair - these are things that people want to do :)

[Slide 52]

[Slide 53]

[Slide 54]

[Slide 55]

[Slide 56]

Peter: Depacketization would be built into the jitter buffer

[Slide 57]

[Slide 58]

[Slide 59]

[Slide 60]

Peter: what use cases would be needed to make this proposal more appealing? are there other types of examples people would like to see?
… how much of the gap should be filled by us vs libraries?
… would an explainer be useful?

Bernard: very helpful to see examples, thanks
… how would SSRCs be handled? would that be another gap?
… in a conference system, keeping track and routing ssrcs is a lot of work
… would there be some help to manage this in the API?
… Another question is about the depacketizer - would there be a generic video buffer distinct from the packetizer?

Peter: you're highlighting that an RTP demuxer would also be part of the gap
… it is indeed in the gap, it could be provided

Bernard: it would provide a handler for a particular stream - that would enable handling them differently

Peter: for the depacketizer, audio and video need different handling since depacketization is so trivial for audio

Bernard: how would audio/video sync work? I regularly get that question

Peter: worth adding to the gap indeed - thanks!

Jan-Ivar: looking at the use cases - it feels like we're abandoning the original WebRTC API with this, which worries me a bit
… maybe we should solve time sync on the datachannel e.g. with timecodes
… I'm worried that focusing on the new API means we don't solve it with the existing API
… likewise, maybe we should look at exposing the same codecs as WebCodecs without having to use this low level API
… This API seems to be using events (vs WhatWG Streams) and operating at very low level
… I'm not sure all these use cases should or need to be solved with a new API
… it feels this may be stepping on WebTransport (although of course WT isn't P2P)

Peter: WT is not P2P (yet), doesn't have real-time congestion control (yet), doesn't interoperate with RTP endpoints (ever)
… in terms of WebCodecs - I designed an alternative with media senders / receivers for the media half of RTPSender/Receiver - it ended up being almost a duplicate of WebCodecs
… in terms of abandoning the WebRTC API - that's part of this "filling the gap" approach I'm suggesting
… overall, I wouldn't be sad if we end up abandoning RTCPeerConnection because something better emerges
… there are a lot of things that we can continue to do incrementally, but I don't think we're going to get to all of it, esp at the pace we're going

Youenn: I've had people asking for similar features
… for HEVC, there are some niche usages where it would be hard to get UA support, and an RtpTransport would allow fulfilling these
… what is the future though? would we still evolve RTCPeerConnection? it comes with some consistency and performance advantages
… it would be interesting to see if you can get good performance with a WebCodecs / RtpTransport approach compared to RTCPC
… An explainer would be good
… Wrt packetizer, jitter buffer - WT apps would likely want something like that
… we should check if a pure JS based approach isn't sufficient - this should be doable
… we should focus on RTPTransport for P2P
… it would still be useful to design packetizer API / jitter buffer API to help with protocol integration, but it should not be the focus

Harald: I do want to explode the PeerConnection object into smaller objects
… but I would rather concentrate on defining APIs à la the Transform API - it defines a dividing point to split the encoder from the packetizer
… I'm not happy with the API we've proposed for this and am trying to reshape it
… but I think we can go faster to where we want if we make it possible to plug your own object into an existing PeerConnection
… breakout box did a good job at that
… breaking things apart in a way that leaves working systems will allow us to get there faster

Jared: RTPTransport makes me picture a nicely packaged component for SFUs
… I would rather let people use this without having to fully unbundle everything
… I'm worried by leaking API boundaries
… like congestion window pushback, where congestion control messages can end up telling the encoder to drop frames
… how many of these leaky abstractions will we need to support?

Bernard: both mediacapture transform and encoded transform use WHATWG streams
… Youenn had raised issues about that usage, not all of them have been put to bed
… the danger of not using Streams is likely to create issues when using it in workers
… transferring individual chunks or frames to a worker isn't going to work very well

Youenn: FWIW, the Streams API issues are being addressed (for transferring ownership)

Bernard: but not for the one-object per pipeline

Youenn: might be the next thing worth addressing there indeed

Peter: I'm hearing support for an incremental approach towards something like RTPTransport
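One concrete item from the "gap" discussion above is packetization. A minimal RTP fixed-header serializer per RFC 3550 (version 2, no padding, no extension, no CSRCs) shows the kind of building block a JS library on top of a future RtpTransport would need; this is a sketch of the wire format, not a proposed API:

```javascript
// Serialize the 12-byte RTP fixed header (RFC 3550, section 5.1).
function buildRtpHeader({ payloadType, sequenceNumber, timestamp, ssrc, marker = false }) {
  const header = new Uint8Array(12);
  const view = new DataView(header.buffer);
  header[0] = 0x80;                                 // V=2, P=0, X=0, CC=0
  header[1] = (marker ? 0x80 : 0x00) | (payloadType & 0x7f); // M bit + 7-bit PT
  view.setUint16(2, sequenceNumber & 0xffff);       // 16-bit sequence number
  view.setUint32(4, timestamp >>> 0);               // 32-bit RTP timestamp
  view.setUint32(8, ssrc >>> 0);                    // synchronization source
  return header;
}

const header = buildRtpHeader({
  payloadType: 96,        // a dynamic PT, as negotiated in SDP
  sequenceNumber: 0x1234,
  timestamp: 0xdeadbeef,
  ssrc: 0x12345678,
  marker: true,           // e.g. last packet of a video frame
});
```

A real packetizer would additionally split frames to fit the MTU and apply codec-specific payload formats, which is part of why the group debated whether this belongs in the platform or in libraries.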

Summary of resolutions

  1. refine PR #167 based on input during the call
  2. Overall support in the direction, but details need to be fleshed out and complexity assessed
  3. Write a PR for a cancelable event approach
Minutes manually created (not a transcript), formatted by scribe.perl version 208 (Wed Dec 21 15:03:26 2022 UTC).