14:47:00 RRSAgent has joined #webrtc
14:47:00 logging to https://www.w3.org/2022/04/26-webrtc-irc
14:47:01 Zakim has joined #webrtc
14:47:07 Meeting: WebRTC April 2022 VI
14:47:16 Chair: Harald, Jan-Ivar, Bernard
14:47:16 Recording: https://www.youtube.com/watch?v=qSlXLqouxCs
14:47:29 Agenda: https://www.w3.org/2011/04/webrtc/wiki/April_26_2022
14:49:44 Slideset: https://lists.w3.org/Archives/Public/www-archive/2022Apr/att-0004/WEBRTCWG-2022-04-26.pdf
14:50:33 agenda+ -> https://github.com/webmachinelearning/webnn/issues/226 WebNN Integration with real-time video processing
14:51:01 agenda+ -> https://github.com/w3c/webrtc-extensions WebRTC Extensions
14:51:23 agenda+ WebRTC-PC Simulcast Issues
14:52:03 agenda+ -> https://github.com/w3c/mediacapture-extensions/issues/47 Voice Isolation Constraint
14:52:33 agenda+ -> https://github.com/w3c/mediacapture-handle/issues/35 support for contentHint in Capture Handle
14:52:52 agenda+ -> https://github.com/w3c/mediacapture-screen-share/issues/219 Avoid user-confusion by avoiding offering undesired audio sources
14:53:17 agenda+ -> https://github.com/w3c/mediacapture-region Region Capture
14:53:37 agenda+ CaptureController
15:01:15 Present+ Sergio, Jan-Ivar, TimP, Elad, PatrickRockhill, Anssi, Ningxin, Dom
15:01:32 Recording is starting
15:02:03 Present+ Youenn
15:02:11 Present+ Harald
15:02:40 Present+ PhilippHancke
15:03:33 ningxin_hu has joined #webrtc
15:03:46 jib has joined #webrtc
15:04:05 youenn has joined #webrtc
15:04:19 Zakim, next item
15:04:19 agendum 1 -- -> https://github.com/webmachinelearning/webnn/issues/226 WebNN Integration with real-time video processing -- taken up [from dom]
15:05:01 [slide 10]
15:05:05 [slide 11]
15:06:03 [slide 12]
15:06:04 [slide 12]
15:06:23 Present+ Florent
15:06:32 [slide 13]
15:07:00 [slide 15]
15:08:15 ningxin: slide 15 is high level pipeline to build a background blur video pipeline
15:09:22 Present+ MichaelSeydl
15:09:42 Present+ Carine
15:10:24 Two implementations: WebGL and
WebGPU/WebNN.
15:10:37 texture uploads to GPU in both cases.
15:12:40 Last shader is taking 3 inputs: original image, blurred image, and computed segmentation map.
15:15:05 TimPanton has joined #webrtc
15:15:46 q+
15:17:12 Description of the perf issues, in particular CPU usage and GC.
15:18:52 Bernard: is there a copy on the output at offscreencanvas level?
15:18:57 Ningxin: not sure.
15:19:29 ack TimPanton
15:19:49 Tim: is the perf acceptable? or do we need to make massive improvements?
15:20:17 ningxin: we need to measure battery impact
15:21:46 dom: we are doing this prototype to evaluate what HW acceleration can bring us. And identify potential roadblocks when trying to do video processing on media capture
15:22:28 for instance cooler conversion or pixel format.
15:22:34 s/cooler/color/
15:22:39 scribe+
15:23:13 youenn: looking at 20% CPU on GC - can that be fixed by implementations, or is it an architectural issue with having lots of objects created per frame?
15:23:23 ... on native, there is usually a buffer pool to help with that
15:23:41 ... does that need to be surfaced to the JS, or can that be dealt with solely by the UA?
15:25:01 ningxin: GPUBuffers are created beforehand. Some objects are created for every frame, like textures.
15:25:17 There are ways to avoid many object allocations.
15:25:44 at JS level. Maybe UA optimisations might help.
15:26:01 dom: what are the next steps for this project?
15:26:18 ningxin: 1. enable WebGPU backend.
15:27:14 2. new APIs that allow importing frames as GPU textures and see whether that will improve efficiency.
15:27:43 3. Improve VideoFrame GC PR: we will try it out when it is merged in Chrome.
15:28:11 Present+ ChrisCunningham
15:28:37 youenn: re CPU efficiency - this is moving between main thread and worker thread, that may have a small perf impact
15:28:50 ...
doing everything in the worker might be helpful once that's possible
15:30:52 Zakim, next item
15:30:52 agendum 2 -- -> https://github.com/w3c/webrtc-extensions WebRTC Extensions -- taken up [from dom]
15:31:02 [slide 19]
15:31:12 Issue 95
15:31:46 -> https://github.com/w3c/media-capabilities/issues/185 Media Capabilities issue 185
15:32:16 [slide 20]
15:33:00 [slide 21]
15:33:48 [slide 22]
15:34:03 [slide 23]
15:34:21 [slide 24]
15:34:42 Bernard: question to WG is: is it a goal for MC to deprecate getCapabilities?
15:35:44 youenn: my understanding is that media capabilities is really about audio/video capabilities
15:35:53 ... so it doesn't make sense to expose e.g. CN there
15:36:03 ... they should stay in WebRTC getCapabilities
15:36:37 ... getCapabilities() being sync is problematic; that's less of an issue for software capabilities such as CN
15:36:56 ... so deprecating getCapabilities fully is not a goal, but partially, yes
15:39:12 Florent: +1 on the approach; usability of the resulting split is a concern
15:39:43 chris: seems fine to use that split; do we want to return rtc codec info from media capabilities?
15:40:00 ... if so, please take a look at https://github.com/w3c/media-capabilities/issues/185
15:40:35 youenn: +1 on disambiguating the outcome of this situation
15:40:47 ... listing all codecs is a non goal
15:41:03 ... an SFU is typically only interested in a few codecs
15:41:25 ... for P2P, setCodecPreferences is probably not needed in the first place - you can deal with a generic codec negotiation
15:41:47 s/all codecs/all codecs in just one call/
15:43:00 jib: would be good to clarify if we want to deprecate "real" codecs from getCapabilities? this sounds like a good long term goal for me
15:43:26 harald: I worry that RTX/RED/FEC info needs to be available somewhere
15:43:40 ... getCapabilities has known problems and would be the only way to get it
15:44:11 ... changing getCapabilities is actually harder than deprecating it
15:45:23 ...
in the long run, it's best to deprecate getCapabilities and replace it with a better dedicated API
15:46:17 Florent: two different scenarios for setCodecPreferences: talking with an SFU in which case you can make specific codec queries; in a P2P scenario, if you can't enumerate all the codecs, you won't be able to call setCodecPreferences
15:46:29 ... this would require hardcoding a list of codecs
15:46:44 ... is there a way to make getCapabilities evolve in a shape that would satisfy everyone?
15:47:10 ... getCapabilities+setCodecPreferences has a lot of current usage, will be hard to deprecate
15:47:19 Issue 100
15:47:22 [slide 25]
15:47:46 s/Issue 100/Subtopic: Issue #100
15:48:04 s/Issue 95/Subtopic: Issue #95
15:48:32 [slide 27]
15:49:09 youenn: might be fine, but I worry about the defaults? would they be the same across browsers?
15:49:22 ... there are current codecs that are defaults, but that may need to evolve over time
15:49:28 ... this could create Web compat issues
15:49:48 Sergio: some of the codecs are receive-only
15:50:21 ... the list would be based on common sense, but I don't have a strong opinion
15:50:46 youenn: my worry is about P2P - if the defaults aren't the same across UAs, the negotiation will fail
15:51:09 sergio: my suggestion was to use defaults in the offer, and adapt the answer based on the offer
15:52:07 harald: two interfaces needed: the list of codecs currently willing to offer, the set of codecs you can offer
15:52:22 ... the 1st one might be getCapabilities, the proposal on the slide for the 2nd
15:53:04 ... in terms of interop, MTI codecs should be the safety net, and they should be in the mandatory-to-offer
15:53:38 florent: the proposal seems to have a lot of overlap with setCodecPreferences / getParameters - could we improve these instead of coming up with a new API
15:53:50 [Philipp supports this on the chat]
15:54:42 Sergio: would be fine; I started from the rtp header extensions, maybe that should apply there?
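The getCapabilities + setCodecPreferences usage that Florent describes as widespread typically enumerates the codec list and reorders it before handing it back. A minimal sketch of that reordering step follows, assuming the list comes from `RTCRtpReceiver.getCapabilities()`; the helper name `preferCodec` and the local `CodecLike` type are hypothetical and only illustrate the pattern discussed above, not any proposal made in the meeting.

```typescript
// Hypothetical local type: only the field this sketch needs from a codec
// capability entry (real entries also carry clockRate, sdpFmtpLine, ...).
interface CodecLike {
  mimeType: string;
}

// Hypothetical helper: move every entry matching `preferredMimeType` to the
// front, keeping the relative order of all other entries. Nothing is dropped,
// so resiliency entries such as RTX/RED/FEC stay in the list.
function preferCodec<T extends CodecLike>(
  codecs: readonly T[],
  preferredMimeType: string
): T[] {
  const wanted = preferredMimeType.toLowerCase();
  const preferred = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const others = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...preferred, ...others];
}

// Browser-only usage (not executed here):
//   const caps = RTCRtpReceiver.getCapabilities("video");
//   transceiver.setCodecPreferences(preferCodec(caps?.codecs ?? [], "video/VP9"));
```

The sketch depends on being able to enumerate the full list in one call, which is exactly the P2P dependency Florent raises: without enumeration, the application would have to hardcode the list.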
15:54:55 florent: the difference is that there is already an API to set codec preferences
15:55:03 sergio: but header extensions could be added there too?
15:55:44 Bernard: let's continue the discussion in the issue
15:55:49 ... or work on a matching PR
15:56:09 Zakim, pick agendum 4
15:56:09 I don't understand 'pick agendum 4', dom
15:56:16 Zakim, take agendum 4
15:56:16 I don't understand 'take agendum 4', dom
15:56:20 Zakim, take item 4
15:56:20 I don't understand 'take item 4', dom
15:56:24 Zakim, item 4
15:56:24 I don't understand 'item 4', dom
15:56:28 Zakim, agendum 4
15:56:28 I don't understand 'agendum 4', dom
15:56:48 Topic: https://github.com/w3c/mediacapture-extensions/issues/47 Voice Isolation Constraint
15:56:56 [slide 41]
15:57:07 [slide 42]
15:57:33 Resolution for issue 95: mark issue as ready for PR
15:58:11 [slide 43]
15:59:33 [slide 44]
16:00:24 [slide 45]
16:01:09 youenn: it makes sense; reasonable to ignore `noiseSuppression`
16:01:28 ... there is also `echoCancellation` in the audio pipeline
16:01:43 ... does it make sense to do `echoCancellation` when this is set?
16:02:03 harald: I think it's mostly orthogonal
16:02:23 youenn: so `echoCancellation: false` is compatible with `voiceIsolation: true`
16:02:45 ... it may be challenging for some implementations to support these combinations
16:03:44 jan-ivar: I like this too; what should the default be? that may bring concerns
16:03:49 harald: we can discuss this in the PR
16:04:03 ... conservatively, the default should be the current behavior (false)
16:04:53 dom: instead of boolean, we could use strings for extra flexibility.
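The combinations discussed above can be sketched as a plain constraints object. This is only an illustration under stated assumptions: `voiceIsolation` is the boolean constraint proposed in issue #47 and was not agreed API at the time of this meeting, and the `AudioProcessingConstraints` type and `buildAudioConstraints` helper are hypothetical names introduced for the example.

```typescript
// Hypothetical local type covering only the constraints named in the
// discussion; `voiceIsolation` follows the boolean proposal in issue #47.
interface AudioProcessingConstraints {
  echoCancellation?: boolean;
  noiseSuppression?: boolean;
  voiceIsolation?: boolean;
}

// Hypothetical helper. Echo cancellation is treated as orthogonal to voice
// isolation, so the two are set independently. `noiseSuppression` is only
// requested when voice isolation is off, since the discussion suggests it
// would reasonably be ignored when voice isolation is on.
function buildAudioConstraints(
  wantVoiceIsolation: boolean,
  wantEchoCancellation: boolean
): AudioProcessingConstraints {
  const constraints: AudioProcessingConstraints = {
    echoCancellation: wantEchoCancellation,
    voiceIsolation: wantVoiceIsolation,
  };
  if (!wantVoiceIsolation) {
    constraints.noiseSuppression = true;
  }
  return constraints;
}

// Browser-only usage (not executed here):
//   navigator.mediaDevices.getUserMedia({ audio: buildAudioConstraints(true, false) });
```

If the string-valued variant dom suggests were adopted instead, the `voiceIsolation` field in this sketch would change type; the sketch does not take a position on that.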
16:05:39 Resolution: mark issue as ready for PR
16:05:53 s/issue/voiceIsolation issue #47/
16:05:59 Zakim, take up item 5
16:05:59 agendum 5 -- -> https://github.com/w3c/mediacapture-handle/issues/35 support for contentHint in Capture Handle -- taken up [from dom]
16:06:09 [slide 48]
16:07:31 [slide 49]
16:07:55 [slide 50]
16:09:39 [slide 51]
16:11:06 youenn: setting the track hint is unnecessary - if the capturer is setting the hint on its side, the UA knows that the track being captured is text - there is no need to transmit it to the capturer
16:11:16 ... except maybe if WebCodecs is in the picture
16:11:43 ... having the UA taking care of this seems preferable
16:12:31 elad: the suggestion would be that the captured content self-declare its type and the UA uses it?
16:12:59 ... but that removes the liberty of the capturer to decide whether to use the hint or not
16:13:44 ... which could be based on e.g. an allowlist
16:14:04 ... autodetection by the UA would have its own limitation
16:14:46 bernard: re the WebCodecs case - contentHint is not automatically consumed by WebCodecs, it's up to the app to apply it as codec setting
16:15:56 jib: I agree with youenn that the UA is in a good place to short-circuit the capturer part
16:16:13 ... the proposal could be useful for the capturee side
16:17:04 ...
exposing further metadata to the controller might be an interesting addition to my capturecontroller proposal
16:17:23 youenn: it could be exposed at the videoframe level
16:17:32 jib: I see agreement on the need, not yet on the API shape
16:17:43 Zakim, next item
16:17:43 agendum 2 -- -> https://github.com/w3c/webrtc-extensions WebRTC Extensions -- taken up [from dom]
16:17:46 Zakim, next item
16:17:46 agendum 2 was just opened, dom
16:17:55 Zakim, take up item 6
16:17:55 agendum 6 -- -> https://github.com/w3c/mediacapture-screen-share/issues/219 Avoid user-confusion by avoiding offering undesired audio sources -- taken up [from dom]
16:18:03 [slide 54]
16:18:55 [slide 55]
16:19:25 [slide 56]
16:21:26 Tim: is this only applicable for echo management?
16:21:56 elad: it could be that an application is interested in recording a specific tab, no more than that.
16:22:21 Tim: this use case does not seem addressed: identifying the desired tab would be needed.
16:24:50 Elad: some VC applications usually do not want to capture system audio.
16:25:38 Jan Ivar: supportive, how about reusing displaySurface constraint here?
16:25:43 Elad: Might work for me.
16:26:12 Jan Ivar: I would like to remove monitor from here.
16:28:04 dom: if we do not include monitor here, audio: true might capture system audio. But applications would not be able to explicitly ask for system audio.
16:28:38 dom: displaySurface would be a strange name for audio.
16:33:08 youenn: let's enumerate the different approaches: avoidSystemAudio, displaySurface, sources
16:33:16 youenn: scope is unclear, we need to clarify this before going to PR.
16:34:08 youenn: different properties allow feature detection on what kind of recording the UA can do
16:34:47 elad: my focus is only limiting access to system audio, but I also think flexibility is helpful
16:35:12 timp: back to my echocancellation point - the constraint could be linked to whether the source can be echocancelled
16:35:47 Harald: source being echo cancellable is a second concern. Biggest point is avoiding system audio.
16:35:52 Tim: as well as window audio.
16:35:57 harald: echoCancellation is a secondary concern - capturing system audio could disclose info from a 3rd party
16:39:57 Zakim, take up item 7
16:39:57 agendum 7 -- -> https://github.com/w3c/mediacapture-region Region Capture -- taken up [from dom]
16:40:00 Resolution: continue discussions on GitHub.
16:40:15 [slide 59]
16:40:39 youenn: #11 is an issue on the shape of the CropTarget API
16:40:53 ... given current chrome implementation work, feels it's useful to converge on the API shape
16:41:17 [slide 60]
16:41:41 youenn: do we want to attach the API to element or to MediaDevices?
16:44:04 ... element feels like a better path
16:44:15 jib: +1
16:44:54 elad: I prefer mediaDevices given its linkage to screen capture
16:45:29 youenn: cropTarget is linked to MediaStreamTrack, not mediaDevices
16:45:34 ... and it's really tied to an element
16:46:03 elad: it can be used through an object you get from getDisplayMedia
16:47:50 youenn: but with a detached mediaDevices, you can't reject the promise
16:49:31 dom: prefer element option.
16:52:32 youenn: next question is attribute vs method
16:52:45 ... slight pref for attribute, but no strong feeling
16:53:19 elad: there is a cost to minting a crop target - we mark the element in the rendering pipeline in specific ways that we shouldn't abuse
16:53:33 youenn: I thought you were going to use a lazy approach to reduce that cost
16:54:29 elad: lazy tagging might help, but this needs more thinking
16:55:12 jib: +1 to attribute
16:55:29 ...
developer value trumps implementer value
16:55:50 elad: I don't think it matters much to developers in the first place
16:56:22 harald: disagree with messing with the element interface, and on hiding the fact that the operation has a cost
16:57:36 ... also async (promises) may be needed for some implementations
16:57:45 ... let's not hide the reality of the situation
16:58:03 jib: the cost seems to be Chrome-specific
16:58:17 ... the real goal of this API is a transferable reference
16:59:11 youenn: +1
17:00:10 ... other APIs in the past have re-used the element interface, have made similar decisions on methods / attributes, async vs sync
17:00:24 ... we should follow existing implemented platterns
17:01:24 dom: is there any other API that may use this transferable reference?
17:02:31 youenn: that's something I bring up in the issue
17:02:44 elad: this may create unsafe usage for this well-defined targett
17:02:53 s/platt/patt/
17:03:00 s/targett/target/
17:03:16 jan-ivar: this could be evaluated
17:03:31 hta: but this shouldn't block progress on the specific narrow goal we have
17:07:07 youenn: my focus is aligning with current API patterns for this API
17:08:15 elad: the TAG will chime in; but if they don't give a clear specific suggestion
17:09:00 ... we could move with the current design that can be polyfilled
17:12:14 RRSAgent, draft minutes
17:12:14 I have made the request to generate https://www.w3.org/2022/04/26-webrtc-minutes.html dom
17:12:24 RRSAgent, make log public
17:12:49 i/Meeting:/ScribeNick: youenn
17:12:51 RRSAgent, draft minutes
17:12:51 I have made the request to generate https://www.w3.org/2022/04/26-webrtc-minutes.html dom