04:50:57 RRSAgent has joined #mediawg
04:51:01 logging to https://www.w3.org/2025/11/13-mediawg-irc
04:51:01 present+ Chris_Needham
04:51:01 Zakim has joined #mediawg
04:51:11 alwu has joined #mediawg
04:51:18 mjwilson has joined #mediawg
04:51:34 Meeting: Media and WebRTC WG joint meeting
04:51:55 eric-carlsone has joined #mediawg
04:52:20 Present+
04:52:21 kota1 has joined #mediawg
04:52:22 handellm has joined #mediawg
04:52:44 Present+ Guido, Youenn, Jan-Ivar, MarkusH, Paul, Marcos, SongXu, Francois
04:53:00 present+ Alastor Wu
04:53:01 Present+ SunShin
04:53:14 Agenda: https://github.com/w3c/media-wg/wiki/TPAC-2025#-thursday-13-november
04:53:16 scribe+
04:53:19 song has joined #mediawg
04:53:35 Topic: -> https://github.com/w3c/mediacapture-transform/issues/29 MSTP for Audio / AudioTrackGenerator
04:54:00 Slideset: https://docs.google.com/presentation/d/1sd5zEnvlXO5Sk3ENQorUUIQiRz65sv0KZKxDMMYHM3I/edit?
04:54:04 [slide 101]
04:54:16 Guido: Proposal to add audio support in the mediacapture-transform spec.
04:54:28 ... The spec allows working with raw media and MediaStreamTracks.
04:54:37 ... Given a track, access to raw media.
04:54:38 present+ Eric Carlson
04:54:43 ... Given raw media, produce a track.
04:55:10 ... Raw media uses interfaces from WebCodecs, with consensus to use VideoFrame.
04:55:29 -> https://github.com/w3c/mediacapture-transform/ MediaCapture Transform repo
04:55:33 [slide 102]
04:55:33 ... No consensus yet on AudioData for audio, but this is shipped in Chrome and I'm going to report experience there.
04:55:55 [slide 103]
04:56:01 present+
04:56:13 Guido: VideoTrackSource allows you to produce a track, given frames.
04:56:34 Present+ Harald, ChrisBlume, PeterThatcher, XIaohanWang, KenMomatsu, LeiZhao, EricCarlson
04:56:34 [slide 104]
04:56:48 Guido: Both can be combined to produce a transformed track.
04:57:23 nigel has joined #mediawg
04:57:28 [slide 105]
04:57:49 Guido: My proposal is to try to achieve consensus on audio.
04:58:18 ... We don't have consensus yet because it's kind of redundant with AudioWorklet, which allows you to do audio processing.
04:58:42 present+ Ken_Komatsu, Xiaohan_Wang, Peter_Thatcher, Chris_Blume, Markus_Handell, Nishitha_Dey, Eric_Carlson, Jan-Ivar_Bruaroey
04:59:04 ... Audio processing sometimes requires a realtime thread all the way through, otherwise you will see glitches.
04:59:08 [slide 106]
04:59:34 Guido: There are good use cases where the transform thread is a good fit, for example if buffering is acceptable.
04:59:43 ... if there is no audio playback.
05:00:05 ... If the processing has high variance. You may want the processing to have very low CPU.
05:00:22 ... In the worker, if you use buffering, you may have tolerance for variance.
05:00:44 ... You may want to combine audio and video processing in the same worker, which you can do with WebCodecs.
05:01:05 Present+ GabrielBrito
05:01:19 ... If you use the data that comes directly from AudioData, you can directly access the input metadata such as capture timestamps.
05:01:33 ... It also saves you a realtime thread, which is more expensive.
05:02:40 ... Some examples of applications include audio analysis, encoding (leaving the main thread untouched), or transforms with no playback, for example if you want to send the result to the network.
05:02:54 [slide 107]
05:03:37 Guido: Comparing usage of the stream track processor, generator, and audio worklet, in general (not audio specific).
05:03:43 ... Both are popular APIs.
05:03:48 [slide 108]
05:04:41 Guido: If you look at mediacapture-transform usage, it's at 50% of AudioWorklet's; audio is ~27% of all Processor usage. Widely used, and is used mainly without connecting to a generator.
05:04:51 ... E.g., to compute metrics about the audio.
05:05:26 ... Generator usage for audio is ~2.6%, less used than Processor. Main use case is transformations that output only to the network.
05:06:07 ... For video, the generator is used 47% more than the processor. For audio, it's the opposite, by hundreds of percent.
05:06:11 [slide 109]
05:07:00 Guido: Processor for Audio is widely used. Generator has more limited use cases. It's not good for playback. The use cases are more restricted, but experience suggests that people understand that.
05:07:12 q?
05:07:14 ... My proposal is to add it to the spec, with guidance on when to use it and when not to use it.
05:07:49 Youenn: I would see a use case for live recording. Sending data through WebSockets for example. In many cases, they are using MediaRecorder, but not always.
05:08:11 ... Usage to generate PCM, which works but is a bit of an abuse.
05:08:26 Sun has joined #mediawg
05:08:28 ... The other is that avoiding realtime threads would help.
05:08:39 ... It would be interesting to do measurements.
05:08:58 ... For generator, I feel that the only use case is doing processing before encoding the data in peer connections.
05:09:12 q+
05:09:13 ... You have your microphone, doing some processing, and then sending over.
05:09:25 ... If you're using generator elsewhere, that's not very great.
05:09:46 Markus: Recording audio is also a use case for generator.
05:09:48 ack h
05:10:02 ... Doing realtime processing on the audio.
05:10:22 Youenn: If you have a peer connection approach.
05:10:22 q+
05:10:51 ... I will concentrate on a peerconnection approach for generator.
05:11:06 ... We're only talking about making things more CPU-friendly. We have some time.
05:11:23 Guido: It's a niche use case, but it's quite good for it.
05:11:29 q?
05:11:32 ack jib
05:11:34 Youenn: We should put priority on the Processor first.
05:11:55 q+
05:12:00 jib: The whole thing can easily be shimmed over AudioWorklet.
05:12:24 q+ guido
05:12:40 ... The interop gap is more problematic.
05:12:50 ... We don't see a lot of traction as a result.
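A minimal sketch of the audio path under discussion, assuming the shape shipped in Chrome: MediaStreamTrackProcessor accepts an audio track and yields WebCodecs AudioData chunks on a ReadableStream. The wiring and the `applyGain` transform are illustrative, not part of a settled spec.

```javascript
// Pure helper (illustrative transform): scale one channel's samples.
function applyGain(samples, gain) {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] * gain;
  }
  return out;
}

// Browser-only wiring, assuming the Chrome-shipped audio path. The function
// is only ever invoked in a browser, so the file also loads elsewhere.
async function processAudioTrack(track) {
  const processor = new MediaStreamTrackProcessor({ track });
  const reader = processor.readable.getReader();
  for (;;) {
    const { value: audioData, done } = await reader.read();
    if (done) break;
    // Input metadata such as the capture timestamp is directly available
    // on the AudioData (audioData.timestamp).
    const samples = new Float32Array(audioData.numberOfFrames);
    audioData.copyTo(samples, { planeIndex: 0 });
    const transformed = applyGain(samples, 0.5);
    // ... analyze, encode, or send `transformed` to the network ...
    audioData.close();
  }
}
```

This matches the Processor-without-Generator pattern Guido reports as the dominant usage: the raw audio is consumed in a worker (analysis, encoding, network) without producing a new track.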
05:12:59 ack handellm
05:13:06 Markus: I'm not sure that you can observe timestamps in the right way.
05:13:31 Youenn: That's something that we can look at in the AudioWorklet context.
05:13:47 Markus: But then you pass a high priority thread
05:13:57 Youenn: For the use cases considered, that seems reasonable.
05:15:01 Youenn: One issue with Generator is usage in contexts where it should not be used, and one implementation may be optimized, leading to applications that ship because they work well in one particular browser, but then not as well in others.
05:15:42 ack guido
05:15:44 jib: Even once we get implementations, I worry that it might negatively impact progress on web types.
05:16:20 Guido: We saw that people were smart enough to use the processor and not the generator, so people understand the constraints.
05:16:48 ... The argument for not supporting a very good use case, because you can also use it for other use cases where it's less good, is weak.
05:16:58 Youenn: I'm more thinking about it in terms of priorities.
05:17:15 q?
05:17:32 q+
05:17:42 Guido: Re. spec compliance, as long as we fix the issues, we're happy to align.
05:18:01 q+ jib
05:18:02 q+
05:18:14 dom: We want to hear feedback from audio folks.
05:18:17 ack me
05:18:45 Paul: As mentioned by jib, we shipped a shim, it's easy and it works. There is no need to do something more specific.
05:18:51 q+ mjwilson
05:18:54 q-
05:19:04 hta has joined #mediawg
05:19:35 ... In modern OSes, the energy consumption gap is between 0 and 1 realtime threads, and there is almost no gap between 1 and 2.
05:20:06 ... Careful measurements have shown that improvements exist between 0 and 1, not above.
05:20:17 q+
05:20:28 ... You do have a realtime thread.
05:21:03 Guido: If you want to do processing such as encoding, why would I want to spin up a realtime thread?
05:21:33 Paul: If you have some audio running, you already have a realtime thread. If you have an audio context, you don't need another one.
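One piece of the AudioWorklet-based shim that jib and Paul mention can be sketched as follows. An AudioWorkletProcessor receives fixed 128-sample render quanta; a hypothetical `Chunker` (the name and design are illustrative, not the shipped polyfill) regroups them into larger chunks before posting them to a worker, trading latency for the tolerance to processing-time variance that Guido raised earlier.

```javascript
// Hypothetical buffering helper for an AudioWorklet-based shim: collects
// fixed-size render quanta (128 samples each) into larger chunks.
class Chunker {
  constructor(chunkSize, onChunk) {
    this.chunkSize = chunkSize; // samples per emitted chunk
    this.onChunk = onChunk;     // callback receiving a full Float32Array
    this.buffer = new Float32Array(chunkSize);
    this.filled = 0;
  }

  // Push one render quantum; emit complete chunks as the buffer fills.
  push(quantum) {
    let offset = 0;
    while (offset < quantum.length) {
      const n = Math.min(quantum.length - offset, this.chunkSize - this.filled);
      this.buffer.set(quantum.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.chunkSize) {
        // Copy out; in a worklet this would be port.postMessage(...).
        this.onChunk(this.buffer.slice());
        this.filled = 0;
      }
    }
  }
}
```

Inside an AudioWorkletProcessor's `process(inputs)`, one would call `chunker.push(inputs[0][0])` each quantum and have `onChunk` post the assembled buffer to the worker doing the heavy processing.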
05:22:39 q?
05:22:41 dom: The point is not only about performance, more about audio and video processing being combined and requiring separate contexts to run as a result.
05:22:58 Present+ VincentScheib
05:23:03 Youenn: In Safari, we have a limit on the number of audio contexts that you can create.
05:23:15 ... If we can reuse some of them, that's good.
05:23:35 ... This proposal for Processor would allow us to reuse, which seems good and measurable.
05:24:15 ack handellm
05:24:43 Markus: The type of processing we're considering here is WebGPU, WebNN.
05:24:55 ack mjwilson
05:25:01 Paul: In our polyfill, we send the data zero copy, that's fine.
05:25:41 mjwilson: [refers to a proposal worth considering as part of this discussion]
05:26:05 GabrielBrito has joined #mediawg
05:26:21 https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/OfflineAudioContext/explainer.md
05:26:25 Youenn: Having performance measurements would help convince people.
05:26:39 ... The Web Audio proposal would be worth hearing.
05:27:43 Topic: -> https://github.com/w3c/webrtc-extensions/issues/146 Decoder failure API
05:27:47 [slide 111]
05:28:12 Steve: Edge team, talking about adding decoder fallback and failure events to WebRTC.
05:28:23 ... We're continuing to operate on feedback that we received since last year.
05:28:36 [slide 112]
05:29:22 Steve: Game streaming platforms rely on hardware decoding, but they don't have access to mic/cameras, so they don't know when a decoder falls back from hardware to software.
05:29:43 ... We want applications to know when this happens and help them analyze what happens.
05:29:57 [slide 113]
05:30:16 Steve: We don't want to expose any vendor-specific hardware information.
05:30:21 [slide 114]
05:30:36 Steve: Current proposal is two new events on the RTCRtpReceiver.
05:30:50 [slide 115]
05:31:07 Steve: Here's the proposal for the decoder state change event.
05:31:36 ... The codecParameters and powerEfficientDecoder are nullable to allow implementations to determine whether to report them.
05:31:42 [slide 116]
05:32:18 Steve: You can send telemetry for example.
05:32:21 [slide 117]
05:32:40 Steve: The error event could be useful if you have a hardware decoder that cannot fall back to software.
05:32:46 ... Questions?
05:33:30 Youenn: I talked with our privacy folks. For the nullable properties, it should be pretty easy to get PING's attention through their repository.
05:33:42 q+
05:34:06 ... That would be a good next step to do for this proposal.
05:34:19 ack jib
05:34:39 jib: Some nits on the API. First, why on the peerconnection rather than the transceiver?
05:35:00 mjwilson has joined #mediawg
05:35:04 ... The example is not entirely clear on when the event fires.
05:35:16 Steve: The proposal is that there will be an initial event when it starts decoding.
05:35:29 jib: It may be worth exposing another attribute then.
05:35:32 hta1 has joined #mediawg
05:35:45 q+
05:36:13 jib: Agree with the need to do a privacy review.
05:36:15 ack hta
05:36:35 hta1: Since this is a decoder event, the only place it belongs is on the receiver.
05:36:45 Steve: That's what we did!
05:36:51 hta1: You did it right.
05:37:44 Youenn: When I'm looking at errorText, I'm wondering whether it will be platform specific. How do you make use of that error code or error text? Are you trying to achieve "I should retry?", or is it more for stats purposes?
05:37:44 lei_zhao has joined #mediawg
05:38:00 Steve: That's a good question. We copied this from the WebRTC error.
05:38:20 ... It's mainly for error reporting and troubleshooting.
05:38:30 q+
05:38:32 ... There may be some more errors where this could be interesting.
05:39:31 q+
05:39:45 q-
05:39:57 Nishitha: Another scenario is telemetry for the app. Codec negotiations or showing the user some report.
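The telemetry scenario above can be sketched as below. The event names (`decoderstatechange`, `decodererror`) and field names are assumptions inferred from the minutes (codecParameters, powerEfficientDecoder, errorText on the RTCRtpReceiver events); this is a proposal under discussion, not a shipped API.

```javascript
// Hypothetical wiring of app telemetry to the two proposed receiver events.
// Event and field names are assumptions based on the discussion.
function attachDecoderTelemetry(receiver, sendTelemetry) {
  receiver.addEventListener("decoderstatechange", (e) => {
    // Both fields are nullable in the proposal, so an implementation
    // can decline to report them.
    sendTelemetry({
      kind: "decoder-state-change",
      powerEfficientDecoder: e.powerEfficientDecoder ?? null,
      codecParameters: e.codecParameters ?? null,
    });
  });
  receiver.addEventListener("decodererror", (e) => {
    // Per the discussion, errorText is meant for reporting and
    // troubleshooting, not for retry decisions.
    sendTelemetry({ kind: "decoder-error", errorText: e.errorText ?? "" });
  });
}
```

An app would call `attachDecoderTelemetry(receiver, report)` for each receiver, where `report` forwards to its analytics backend (or, per Youenn's later remark, to something like the Reporting API).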
05:40:18 Youenn: For telemetry, there's a proposal from Guido to use the Reporting API. That may be a useful place for this as well.
05:40:33 -> https://www.w3.org/2025/11/13-webrtc-minutes.html#f2b1 Discussion on Reporting API for WebRTC this morning
05:40:38 ... What can I do with it for my users might be the right question for this proposal.
05:41:22 cpn: Wondering whether PING already reviewed RTCRtpCodecParameters? I'm wondering whether we're opening the discussion to a much broader scope.
05:42:08 hta1: We switched to Media Capabilities as a result of previous discussions. What PING thinks about it right now, I do not know.
05:42:36 Topic: Media Playback Quality
05:42:45 [slide 119]
05:43:05 Markus: Wanted to collect feedback on a proposal for improved metrics for temporal video quality.
05:43:12 [slide 120]
05:43:44 Markus: To recap, the video conferencing problem is you capture a bunch of frames, with timestamps, and the job of the receiver is to represent that accordingly.
05:43:56 [slide 121]
05:44:11 Markus: One way to measure that is through the framerate. It doesn't use timestamps at all.
05:44:25 ... You don't see glitches with this.
05:44:44 ... Good for understanding video. For screen sharing, it's not so good anymore because it's variable framerate.
05:44:51 ... Accuracy is very good.
05:44:54 [slide 122]
05:44:59 Markus: More complicated territory.
05:45:33 ... The harmonic mean: the way it averages is across frame durations.
05:45:53 ... This is unfortunately not available for video tiles, because we don't expose them.
05:46:26 ... What's good is that since the weights are the frame durations themselves, lengthy durations will be over-represented.
05:46:33 ... We will see the glitches more.
05:46:41 ... But not so good for accuracy.
05:46:56 ... WebRTC does not know when the frames will be rendered.
05:47:02 ... It also cannot see frame drops.
05:47:05 [slide 123]
05:47:23 Markus: Proposal is to add getVideoPlaybackQuality()
05:47:29 ... Harmonic FPS.
05:47:42 ... And then we have the info for local video tiles.
05:47:47 ... Great accuracy.
05:47:54 ... We can still not measure frame drops.
05:48:03 [slide 124]
05:48:23 Markus: We also propose to add another measure, which is reproduction jitter RMSE.
05:49:00 ... Difference should be 0 but fluctuates.
05:49:29 ... Root mean square. With that, you can measure how accurate it is.
05:49:37 q?
05:49:38 ... It can actually miss frame drops.
05:49:40 ack c
05:50:02 [slide 125]
05:50:15 Markus: And then add a reproduction jitter metric.
05:50:38 ... Accuracy for all current codecs is 90 kHz.
05:51:11 [slide 126]
05:51:24 q+
05:51:30 Markus: Here's the idea, with a few new fields.
05:51:56 [showing vibe coded demo]
05:52:45 Markus: I'm doing video processing here in terms of burning the CPU.
05:52:54 ... I can then play with it and look at the graphs.
05:53:51 ... [going through different demo settings displaying harmonic FPS, WebRTC harmonic FPS]
05:54:47 ... [and jitter reproduction metric]
05:57:09 cpn: Last TPAC, we talked about the Video Playback Quality API. Your proposal just adds more things, right?
05:57:13 Markus: Yes.
05:57:28 cpn: Do the metrics belong on this or on getStats()?
05:57:40 Markus: I don't think getStats() is the right place for this.
05:57:47 q?
05:57:48 ... E.g., YouTube playback quality.
05:57:50 ack cpn
05:58:16 Youenn: Using RTP timestamps, I guess? What if there are none?
05:58:23 Markus: Then I don't [missed]
05:58:47 cpn: Are there comments on the definitions of the stats?
05:59:08 dom: Are these standard metrics, for some definition of standard?
05:59:16 ... Are we paving the cowpath or are we trying something new?
05:59:27 Markus: A bit of both.
05:59:48 Youenn: Some A/V stats may be computed by native apps.
05:59:54 Markus: Only Chrome does it, I think.
06:00:11 ... There is a performance hit as well, as you need every frame.
06:00:47 ... In getStats(), there's "jitter squared duration" or something.
06:01:51 https://w3c.github.io/webrtc-stats/#dom-rtcinboundrtpstreamstats-totalsquaredinterframedelay
06:01:56 jib: I don't think provisional stats have gone through a standardization process.
06:02:12 dom: totalSquaredInterFrameDelay, it is.
06:02:48 cpn: There's a question around the Media Playback Quality spec as it is.
06:02:57 ... A few years ago, we agreed to hand it over to the WHATWG.
06:03:46 ... What you're proposing now is to make it a more lively document.
06:03:57 ... And the question is whether the Media WG should handle it.
06:04:05 ... That requires discussion with the WHATWG.
06:04:53 Francois: The charter suggests that the Media WG will not do any actual work on the spec other than to transition it.
06:05:13 cpn: Let's discuss as part of the Media Playback Quality repository.
06:05:33 https://w3c.github.io/media-playback-quality/
06:05:39 https://github.com/w3c/media-playback-quality/
06:05:39 RRSAgent, draft minutes
06:05:40 I have made the request to generate https://www.w3.org/2025/11/13-mediawg-minutes.html tidoust
06:21:56 nigel has joined #mediawg
06:22:30 nigel has joined #mediawg
06:23:27 nigel has joined #mediawg
06:23:48 nigel has joined #mediawg
06:28:02 RRSAgent, make log public
07:03:45 nigel has joined #mediawg
07:07:22 nigel has joined #mediawg
07:12:23 nigel has joined #mediawg
07:29:38 nigel has joined #mediawg
07:50:32 nigel has joined #mediawg
08:09:25 Zakim has left #mediawg
08:11:12 nigel has joined #mediawg
08:22:45 dom has left #mediawg
08:29:54 nigel has joined #mediawg
08:52:50 nigel has joined #mediawg
23:53:55 RRSAgent has joined #mediawg
23:53:55 logging to https://www.w3.org/2025/11/13-mediawg-irc
23:53:57 Zakim has joined #mediawg
23:54:05 RRSAgent, make log public
23:54:37 Meeting: Media Working Group TPAC meeting
23:54:46 Chair: Chris, Marcos
23:54:51 Agenda: https://github.com/w3c/media-wg/wiki/TPAC-2025
23:54:57 RRSAgent, draft minutes
23:54:59 I have made the request to generate https://www.w3.org/2025/11/13-mediawg-minutes.html tidoust
23:58:18 Lei_Zhao has joined #mediawg
23:58:29 nigel has joined #mediawg
23:58:51 present+ Chris_Needham, Francois_Daoust, Markus_Handell, Nigel_Megitt, Dom_HazaelMassieux, Wolfgang_Schildbach, Bernd_Czelhan
00:00:12 present+ Mark Foltz, Alastor Wu
00:00:12 cpn has joined #mediawg
00:00:25 scribe+ nigel
00:01:36 alwu has joined #mediawg
00:01:44 present+ Andy Estes, Xiaohan Wang, Fredrik Hubinette, Eric Carlson, Jean-Yves Avenard, Gabriel Brito
00:01:51 present+ Alastor Wu
00:02:05 RRSAgent, draft minutes
00:02:06 I have made the request to generate https://www.w3.org/2025/11/13-mediawg-minutes.html tidoust
00:04:10 RRSAgent, bye
00:04:10 I see no action items