00:04:27 <RRSAgent> RRSAgent has joined #mediawg
00:04:31 <RRSAgent> logging to https://www.w3.org/2025/11/14-mediawg-irc
00:04:39 <tidoust> Meeting: Media Working Group TPAC meeting
00:04:39 <nigel> Agenda: https://github.com/w3c/media-wg/wiki/TPAC-2025
00:04:42 <tidoust> Chair: Chris, Marcos
00:04:45 <tidoust> Agenda: https://github.com/w3c/media-wg/wiki/TPAC-2025
00:04:48 <tidoust> present+ Chris_Needham, Francois_Daoust, Markus_Handell, Nigel_Megitt, Dom_HazaelMassieux, Wolfgang_Schildbach, Bernd_Czelhan
00:04:55 <tidoust> present+ Mark Foltz, Alastor Wu
00:05:00 <tidoust> present+ Andy Estes, Xiaohan Wang, Fredrik Hubinette, Eric Carlson, Jean-Yves Avenard, Gabriel Brito
00:05:04 <tidoust> RRSAgent, draft minutes
00:05:06 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html tidoust
00:05:08 <nigel> scribe+ nigel
00:05:25 <tidoust> present+ Youenn Fablet
00:05:28 <nigel> Previous meeting: https://www.w3.org/2025/11/13-mediawg-minutes.html
00:05:36 <nigel> Topic: Agenda
00:05:38 <Lei_Zhao> Lei_Zhao has joined #mediawg
00:05:43 <nigel> cpn: Quick review of agenda
00:05:54 <tidoust> present+ Guido Urdaneta
00:05:55 <nigel> .. EME - I have some extra info from CTA WAVE
00:06:40 <nigel> .. Then Markus on Media Pipeline performance
00:06:43 <nigel> .. Then the morning break
00:06:50 <eric-carlson> eric-carlson has joined #mediawg
00:06:55 <nigel> .. AB visit to talk about Process
00:07:01 <tidoust> present+ Dana Estra, Steven Becker
00:07:38 <nigel> .. WebCodecs Reference Frame Control and Encoded Transform, following up from joint meeting with WebRTC yesterday
00:07:43 <nigel> .. iframe media parsing
00:07:49 <nigel> .. lunch
00:07:58 <nigel> .. Media Capabilities
00:07:59 <tidoust> present+ Yuichi Morioka, Greg Freedman
00:08:02 <nigel> .. Audio Session
00:08:12 <nigel> .. Registries
00:08:17 <nigel> .. MSE
00:08:20 <nigel> .. wrap-up
00:08:52 <tidoust> present+ Erik Sprang
00:09:03 <nigel> .. Any changes to the agenda?
00:09:07 <nigel> no changes requested
00:09:18 <tidoust> present+ Eugene Zemtsov
00:09:48 <nigel> cpn: [Health and safety rules, Code of conduct reminder, antitrust and competition policy]
00:10:04 <nigel> .. IRC channel: #mediawg
00:10:17 <Dana> Dana has joined #mediawg
00:10:17 <youenn> youenn has joined #mediawg
00:10:19 <aestes> aestes has joined #mediawg
00:10:23 <xhwang> xhwang has joined #mediawg
00:10:32 <nigel> .. Please join - we would like to use that for the speaker queue, q+ yourself
00:10:36 <xhwang> q+
00:10:44 <eugene> eugene has joined #mediawg
00:10:53 <wschildbach> wschildbach has joined #mediawg
00:10:54 <GabrielBrito> GabrielBrito has joined #mediawg
00:10:57 <nigel> .. I will also try to respond to people raising a hand on zoom
00:10:57 <tidoust> ack xhwang
00:11:00 <eugene> present+ Eugene Zemtsov
00:11:01 <Bernd> Bernd has joined #mediawg
00:11:04 <GabrielBrito> present+
00:11:05 <Lei_Zhao> present+
00:11:07 <nigel> xhwang: I was just seeing if it worked!
00:11:08 <aestes> present+
00:11:09 <Bernd> present+
00:11:15 <nigel> Topic: Charter
00:11:19 <Dana> present+
00:11:23 <nigel> cpn: We've just rechartered
00:11:35 <nigel> .. Main changes are looking at a protected media pipeline including WebCodecs
00:11:39 <nigel> .. Nothing on that today
00:12:08 <nigel> .. We're v specific about what we intend to do with EME, we have coverage for  features Xiohang will take us through
00:12:33 <nigel> .. We have 10 specifications in progress, all in WD except Media Playback Quality in ED.
00:12:46 <nigel> .. Question if we should hand that over to WHATWG or keep working on it here.
00:13:05 <handellm> handellm has joined #mediawg
00:13:05 <nigel> .. My goal as Chair is to help us make progress and advance specs to their next maturity status, eventually to Rec.
00:13:19 <nigel> .. Next after WD is Candidate Recommendation Snapshot.
00:13:35 <Hubbe> present+
00:13:36 <nigel> .. That needs us to address feedback and complete Horizontal Review
00:13:42 <nigel> and Wide Review
00:13:58 <nigel> cpn: Want to maximise time taken today to move what we can forward.
00:14:03 <nigel> Topic: EME
00:14:25 <nigel> xhwang: [shares screen]
00:14:43 <nigel> .. Suggest following the order of issues on the agenda
00:14:50 <tidoust> -> https://github.com/w3c/encrypted-media/issues/573 Key System divergence on setServerCertificate() fallback behavior
00:15:06 <nigel> Subtopic: Key System divergence
00:15:16 <nigel> issue #573
00:15:30 <nigel> xhwang: Recent issue
00:15:50 <nigel> .. Some key systems do XXX but not all.
00:16:02 <nigel> .. Overall I feel we should not regulate this in the spec but relax it a bit.
00:16:25 <nigel> .. There's a note saying it's intended to be an optimization, and is not required.
00:16:43 <nigel> .. There are existing applications that do not call `setServerCertificate()`
00:17:01 <nigel> .. I think it's a nice optimization but the spec shouldn't need to enforce it.
00:17:09 <nigel> .. I added details about Fairplay and Widevine
00:17:19 <SteveBeckerMicrosoft> SteveBeckerMicrosoft has joined #mediawg
00:17:25 <nigel> cpn: If it's not a required step, the developer needs to know which system they're using to know whether to call it?
00:17:34 <nigel> xhwang: I actually have this situation.
00:18:06 <nigel> .. The key system is required to support requesting cert from the server via a message
00:18:10 <nigel> .. so why bother doing it?
00:18:18 <nigel> q?
00:18:33 <nigel> .. Our current spec makes it look like there's optimization built in but it can still work if not called,
00:18:39 <nigel> .. in reality I don't think that's the case.
00:18:55 <nigel> s/XXX/setServerCertificate
00:19:26 <nigel> xhwang: My proposal is to say SHOULD call `setServerCertificate()` otherwise it may fail.
00:19:45 <nigel> .. and a note to say that the message option is not supported by all
00:19:51 <nigel> cpn: Does this look okay to people?
00:20:07 <nigel> .. You had a question around Fairplay behaviour. Is the answer to that going to change the proposal?
00:20:20 <nigel> xhwang: I don't think so. I don't see a reason to regulate it, it's too strict.
00:20:27 <nigel> .. Unless Fairplay or Apple really wants to implement it.
00:20:40 <nigel> eric-carlson: It seems like a reasonable change, and not have a MUST when not every system can support it.
00:20:43 <nigel> xhwang: Agreed
00:20:46 <nigel> cpn: Agreed
00:20:56 <nigel> .. If it's not a PR already please open one, it seems uncontroversial
00:21:02 <nigel> xhwang: I will work on that
00:21:18 <nigel> Subtopic: Mixed encrypted and unencrypted content
00:21:22 <cpn> https://github.com/w3c/encrypted-media/issues/251
00:21:40 <nigel> xhwang: Not all systems support specifying mixed encrypted and unencrypted content
00:21:54 <tidoust> -> Specify mixed encrypted/unencrypted content https://github.com/w3c/encrypted-media/issues/251
00:22:03 <nigel> .. There are a lot of discussions in the issue.
00:22:23 <nigel> .. My summary of where we agreed:
00:22:45 <nigel> .. UAs must support switching between clear and encryped streams if MediaKeys is set before playback starts
00:22:53 <nigel> .. Sets clear expectations
00:23:02 <nigel> .. Ad insertion is a use case
00:23:13 <nigel> .. The UA can set up the media pipeline correctly if this is set.
00:23:32 <nigel> .. Otherwise the media pipeline assumes clear playback and when the stream switches to encrypted then in many cases it causes a failure
00:23:46 <nigel> .. For UAs to support this especially with hardware decoding it is complicated
00:23:54 <nigel> .. Having this text in the spec will make the implementation a lot easier
00:23:58 <nigel> .. That's the proposal
00:24:08 <nigel> .. There are some questions about the detail of the spec
00:24:24 <nigel> cpn: I haven't had time to review your answers.
00:24:29 <nigel> .. Do any of them change the proposed text?
00:24:36 <nigel> xhwang: I think it's more about the notes.
00:24:50 <nigel> .. You ask a good question what does it mean "before playback starts"?
00:25:04 <nigel> .. I think it's something like when the READYSTATE is HAVE_NOTHING
00:25:09 <nigel> cpn: Makes sense to tie it to the state
00:25:16 <nigel> xhwang: We can work on those details.
00:25:26 <nigel> .. Does anyone have any bigger comments or objections to this direction?
00:25:54 <nigel> jya: From what I've seen working on Safari bugs, when there is a switch of content,
00:25:58 <cpn> s/state/media element state/
00:26:08 <nigel> .. whether playback starts with encrypted content or switches to it, especially with Netflix I've seen
00:26:18 <nigel> .. them send a new init segment that indicates the encryption state,
00:26:24 <nigel> .. so there's no notice period
00:26:41 <nigel> .. They might play 1 minute of clear then switch to encrypted.
00:26:53 <nigel> .. From the UA perspective all I see is a new init segment
00:26:59 <nigel> .. and no other information from the website
00:27:23 <nigel> xhwang: That's true. Subtle difference is that the clear lead allows them to send a signal that the content is encrypted
00:27:25 <cpn> q+ greg
00:27:31 <nigel> .. but all the frames are clear at first
00:27:38 <nigel> .. The media pipeline doesn't get any signal.
00:27:46 <nigel> .. Either way you're right that there's no other signal
00:28:01 <nigel> .. My point is that implementations like Chromium: pipeline cannot support the switch
00:28:12 <nigel> .. So the page needs a workaround like a new media element, but the switch is not smooth
00:28:32 <nigel> .. Intent of the proposal is to prep, without needing even to fetch a license.
00:28:40 <nigel> .. Cost is minimal, useful signal for setup.
00:29:15 <nigel> jya: I understand from an EME perspective but in MSE I don't think you can indicate the encryption
00:29:18 <nigel> xhwang: True
00:29:19 <nigel> q?
00:29:45 <nigel> greg: What signal does jya need, we are doing exactly what he says
00:29:54 <nigel> xhwang: For MSE change type we should have some indication for the switch
00:30:06 <nigel> greg: You're right we can indicate when we switch from clear to encrypted
00:30:16 <nigel> .. and we were in favour of this proposal to ensure this continues to work.
00:30:26 <nigel> .. We're open to suggestions but would prefer not to change media sources.
00:30:44 <nigel> .. That's one of the alternatives - we want it as seamless as possible but don't object to that.
00:31:01 <nigel> jya: MediaSource has been made so that you simply unqueue a new init segment that indicates what
00:31:17 <nigel> .. you are going to change to. It's commonly used even though there's no clear indication that you're going to switch.
00:31:24 <nigel> xhwang: Are you proposing an addition to this proposal?
00:31:36 <nigel> jya: No, good to have coordination between the two [MSE and EME]
00:31:47 <nigel> .. When you do the switch it's already too late. We need an early signal.
00:31:57 <nigel> .. I agree some coordination for additional information would be nice.
00:32:11 <nigel> .. Will the UA or application do anything differently with the additional signal?
00:32:18 <nigel> .. (asking myself!)
00:32:40 <nigel> jya: I'm not too sure. A lot of things I've seen recently start with MSE and clear content,
00:32:50 <nigel> .. and start a new media key session later in the stream and it just magically works.
00:33:10 <nigel> .. Not seen any way to signal that pattern. I believe it's supported by all UAs so maybe it is a non-issue.
00:33:19 <nigel> .. I don't personally deal with hardware that has such restrictions.
00:33:37 <nigel> .. But there's no way to query if starting with clear and then switching to encrypted, with the same MSE object, is supported.
00:33:42 <nigel> xhwang: Right, there's no such API
00:33:52 <nigel> .. We do work with hardware this issue a lot
00:34:01 <nigel> greg: Not all UAs handle this in practice
00:34:24 <nigel> jya: Oh I see. I've seen Firefox and Chrome do it, but maybe not on all hardware.
00:34:41 <nigel> xhwang: There are more issues with hardware pipelines related to this
00:34:54 <nigel> .. I feel the change we're saying is pretty straightforward for apps to do and the cost is minimal
00:35:21 <nigel> cpn: The normative change is the first part, requiring implementations to support switching if the Mediakeys is set.
00:35:58 <tidoust> q+
00:35:59 <youenn> youenn has joined #mediawg
00:36:02 <nigel> .. Concern about notes re quality of implementation issues and if some of it should be normative
00:36:19 <nigel> .. Do we have enough capability detection, maybe we don't?
00:36:31 <tidoust> ack greg
00:36:38 <tidoust> ack tidoust
00:36:57 <nigel> tidoust: I'm wondering if having some test cases would help make sure we have a shared understanding
00:37:08 <nigel> .. about what it means to switch and what "before playback starts" means.
00:37:21 <nigel> .. Making sure that this proposed statement can be tested.
00:37:30 <nigel> cpn: And whether the mechanism is to have a separate media source.
00:37:43 <nigel> .. MSE has a detachable source buffer change - does it have a bearing on this?
00:37:56 <nigel> greg: It's related and we're in favour of that for other reasons but this is another issue.
00:38:21 <nigel> .. jya's proposal allows a solution to a different problem, keeping buffers while switching sources
00:38:24 <Guido> Guido has joined #mediawg
00:38:37 <nigel> xhwang: What MSE does at the javascript level doesn't affect this issue.
00:38:47 <nigel> .. I agree we can give some examples.
00:39:04 <nigel> .. Example would be code with MSE start with clear and switch to encrypted
00:39:18 <nigel> .. and either set mediakeys up front or not. That would be the test.
00:39:56 <nigel> sushraja: Question if this implies if the DRM systems now all have to support clear.
00:40:14 <tidoust> rrsagent, draft minutes
00:40:15 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html tidoust
00:40:18 <nigel> .. We know that older OSes don't support "clearly" on MSE.
00:40:24 <tidoust> rrsagent, make log public
00:40:34 <cpn> s/clearly/clear-lead/
00:40:37 <Mark_Foltz> Mark_Foltz has joined #mediawg
00:40:41 <nigel> xhwang: I don't know whether this implies that or not.
00:40:44 <tidoust> present+ Sushanth Rajasankar
00:40:48 <Mark_Foltz> Present+
00:40:57 <nigel> .. That's a great question.
00:41:07 <nigel> .. I suppose somewhere in EME it says clear-lead should be supported.
00:41:19 <nigel> .. The original thing here was about switching from clear to encrypted.
00:41:41 <nigel> sushraja: I think we need capability detection then
00:42:02 <nigel> xhwang: The fact that one key system doesn't support one case should not dominate this discussion.
00:42:23 <nigel> .. We are trying to say the reasonable correct behaviour. We are separately trying to fix Playready behaviour so that won't
00:42:25 <tidoust> present+ Kensaku Komatsu
00:42:31 <nigel> .. continue as an issue anymore. The spec is long lasting.
00:43:03 <nigel> sushraja: We will fix it but not for old implementations so we will need capability detection for clear-lead
00:43:05 <nigel> xhwang: I see
00:43:16 <nigel> sushraja: Then systems that are supporting it would work
00:43:34 <nigel> xhwang: I think that's a different issue. I filed an issue about more feature detection - clearly this is an example.
00:43:44 <nigel> .. I had a different example.
00:43:53 <nigel> .. There are a lot of reasons for a common system to detect support.
00:43:59 <nigel> .. We can probably discuss this in that other issue.
00:44:19 <nigel> cpn: Is that the plan, to move ahead with this proposal and investigate the need for capability detection
00:44:24 <nigel> xhwang: That's my suggestion
00:44:31 <nigel> cpn: I see Greg nodding. Any objections?
00:44:33 <nigel> no objections
00:44:54 <nigel> cpn: That sounds like a plan. Can we open a PR for this and then fold in the detailed points?
00:45:06 <nigel> xhwang: Yes. I propose we continue that discussion especially about the Notes, and then open the pull request
00:45:34 <nigel> Subtopic: Key rotation support in EME
00:45:38 <tidoust> -> https://github.com/w3c/encrypted-media/issues/132 Support continuous key rotation per MPEG Common Encryption (ISO/IEC 23001-7)
00:46:25 <greg_f> greg_f has joined #mediawg
00:46:44 <nigel> xhwang: In the common expression spec we are saying embedded keys should be supported,
00:46:56 <nigel> .. but it wasn't present originally in 2011. It was added in 2012 as an amendment.
00:47:11 <nigel> .. The title includes "key rotation" even though the spec doesn't mention that at all.
00:47:28 <nigel> .. There are many ways to embed keys. Just one example:
00:47:52 <nigel> .. Root keys and embedded keys, the embedded keys are encrypted with the longer lived root key
00:48:06 <nigel> .. The embedded keys can be used to decrypt the actual stream.
00:48:15 <nigel> .. How does the root key get delivered to the CDN? There's no one way.
00:48:30 <nigel> .. Some people bake it into the client, others have a separate delivery channel.
00:48:46 <nigel> .. There's an EME way with a normal EME licence requiring a change to deliver the root key to the client.
00:49:14 <nigel> .. With that one, there's packaging, then the encrypted content keys are in the "moof" PSSH box.
00:49:55 <nigel> .. At playback we continue to get the embedded keys and use the CDM to decrypt the key and use it to decrypt the media.
00:50:06 <nigel> .. This is about performance and efficiency
00:50:31 <nigel> .. Especially for live streams millions of clients might cause a client storm getting the updated root key when it rotates.
00:50:39 <nigel> .. Problems with the current spec?
00:51:13 <cpn> ... [Reads the explainer]
00:51:18 <cpn> scribe+ cpn
00:51:21 <nigel> .. Big summary is that now the spec says the keys are only used [scribe missed]
00:51:48 <nigel> .. Spec is incompatible with the key rotation problem we describe.
00:51:57 <nigel> .. Not a new problem. There were a lot of discussions before,
00:52:16 <nigel> .. both technical and the spec editor really tried to make EME into a Rec and didn't have time to deal with this.
00:52:40 <nigel> .. I'm trying to state the requirements for Compat, Simplicity and Interop.
00:52:52 <nigel> .. Not sure about all systems, but I know some have a key rotation feature.
00:53:01 <nigel> .. Many TV industry people request this features.
00:53:04 <nigel> s/res/re
00:53:09 <nigel> .. 1st proposal:
00:53:25 <nigel> .. Relax generateRequest() to accept data without generating a key request
00:53:41 <nigel> .. [goes through sequence diagram]
00:54:32 <nigel> .. I did check with Joey the owner of Shaka player and former spec editor
00:54:45 <nigel> .. He said most players don't track the pair of generateRequest() and key message,
00:55:07 <nigel> .. so just having generateRequest() shouldn't break anything, so I believe this won't break existing implementations.
00:55:41 <nigel> .. 2nd thing more interesting. What if one session closes while another one is opening - keep the rootkey from the first or discard it?
00:55:57 <nigel> .. When a session is closed we say now that keys in other sessions must be unaffected
00:56:09 <nigel> .. But in our new model if one session is closed then others might be affected.
00:56:18 <nigel> .. Trying to introduce a parent-child model for sessions.
00:56:36 <nigel> .. [goes through 2nd proposal sequence diagram]
00:56:50 <nigel> .. Proposing a new attribute on media key session called "parent".
00:56:58 <nigel> .. s2 can point to s1 as a parent.
00:57:17 <nigel> .. Then if s1 closes then it closes s1 and s2.
00:57:22 <nigel> .. But it is not required.
00:57:48 <nigel> .. If for some system the key is baked in the client such that all the sessions are independent of each other
00:58:00 <nigel> .. then closing one session wouldn't close another, so it's versatile in that case.
00:58:27 <nigel> .. Single Session Mode: the implementation requires every session to be closed before another is opened.
00:58:46 <nigel> .. For compat, must close s1 before creating s2.
00:59:14 <nigel> .. In this case the key can only work with one session so there's something like "parent-merged"
00:59:29 <nigel> cpn: Time check - any thoughts of what we've seen so far?
00:59:41 <nigel> .. Have others had time to review this before the meeting?
01:00:03 <nigel> q?
01:00:38 <nigel> sushraja: Context question. In s2 session if the keys are changed and the UA handles the chain of keys internally, why do we need this?
01:00:44 <nigel> xhwang: Yes, some systems work like that.
01:00:57 <nigel> .. In current implementations, in Chromium, the PSSH is not passed in the media pipeline at all.
01:01:11 <nigel> .. Also if we do that then the demuxer needs to know which PSSH to use.
01:01:22 <nigel> .. They are specific. I don't know how the browser can tell which one is which.
01:01:40 <nigel> .. I think moof box handling rules are too specific for EME, there are other box types.
01:02:07 <cpn> sushraja: So you'd need to parse the media segment in JavaScript?
01:03:35 <cpn> xhwang: Two options: JS, when it sees initdata it will create a session. 1, Keep the new session, and manage the keys there. 2, if the CDM can support in one session it can close the session and manage them as one
01:03:36 <tidoust> [xhwang to stay closer to WebIDL and make it easier for external people to review, I would suggest to reformulate "Add a new attribute `readonly attribute MediaKeySession.parent`" as "Add a new attribute to `MediaKeySession` defined as `readonly attribute MediaKeySession parent` (interface this goes to clarified, attribute type defined, no ".")]
01:03:40 <tidoust> q+
01:03:54 <nigel> cpn: I think people need time to digest this and work through the detail. Can schedule for a future call.
01:04:02 <nigel> xhwang: Of course, I'm just raising awareness
01:04:09 <nigel> .. I think it's an important feature
01:04:22 <nigel> cpn: Very much so, I'm hearing that from other industry groups
01:04:32 <nigel> tidoust: Thank you for the explainer
01:04:33 <nigel> ack t
01:04:45 <nigel> .. the next step is to send this to TAG for review.
01:04:57 <nigel> .. TAG sent some comments about being careful with EME.
01:05:15 <nigel> .. The explainer highlights the key aspects that justify such a change, such as access to live streams as the main use case.
01:05:23 <cpn> present+ Randell_Jesup
01:05:27 <nigel> .. Maybe worth reformulating the explainer a bit to really highlight what end users will get.
01:05:34 <nigel> xhwang: I see, highlighting the use cases.
01:05:51 <nigel> tidoust: For example the part that says you avoid impacting servers is good for content providers but not for end users
01:06:04 <nigel> cpn: Emphasise the end user
01:06:37 <cpn> nigel: That might be a false distinction, as reducing load on servers improves the user experience
01:07:06 <cpn> tidoust: what I'm highlighting is the benefit of having live streams in the first place
01:07:16 <dom> dom has joined #mediawg
01:08:10 <cpn> Topic: Media Pipeline Performance
01:09:19 <Mark_Foltz> Slides: https://docs.google.com/presentation/d/e/2PACX-1vQehy6wDMrwNory7-2QfO1EtOY8q-nqf4gZ6G69uikGRDleoO4g3oADR809kRnmA2weHxD9w2HKXkZ5/pub?start=false&loop=false&delayms=60000
01:09:26 <cpn> markus: Web Worker quality of service, right now there's no notion of QoS
01:09:33 <dom> s/Slides:/Slideset:/
01:09:47 <nigel> markus: History - Intel presented a proposal in 2023
01:10:15 <jya> jya has joined #mediawg
01:10:19 <guidou> guidou has joined #mediawg
01:10:20 <dom> -> https://www.w3.org/2023/Talks/TPAC/breakouts/web-worker-qos/ Intel presentation on Web Workers Quality of Service
01:10:39 <cpn> markus: at TPAC. That proposal was that the worker construtor is given a dictionary of options
01:10:50 <dom> -> https://github.com/riju/web-worker-quality-of-service/blob/main/explainer.md Web Worker Quality of Service Explainer
01:10:52 <cpn> ... Work on this stopped, not sure why.
01:11:16 <dom> -> http://www.w3.org/2023/09/tpac-breakouts/47-minutes.pdf Minutes of "Worker QoS" breakout in TPAC 2023
01:11:20 <cpn> ... Now we have more and more issues with the AI worklet we want to run speech recogniition at the same time as the user running LLM queries
01:11:33 <cpn> ... At the same time, VC applications run background blur, de-noising, etc
01:11:36 <dom> [slide 2]
01:11:54 <cpn> ... So it's hard for the system to udnerstand what to prioritise, so you get jitter in vide, etc
01:12:00 <dom> [slide 3]
01:12:14 <cpn> ... QoS issue including audio glitches, dropped video frames, janky video, etc
01:12:22 <dom> [slide 4]
01:12:27 <cpn> ... The proposal was power-focused
01:12:29 <dom> [slide 5]
01:13:21 <cpn> ... I want to reboot this proposal. Remove the high/load/default classes, and have workload-descriptive hints, e.g., video-capture, audio-recording, or have a  long running batch computation ,and you don't want to interfere with showign video frames
01:13:43 <dom> [slide 6]
01:13:44 <cpn> .. If you use WebGPU or WebNN in a worker, the proposal is to let these usage hints influence priorities of those other APIs in the system
01:13:46 <tidoust> present+ Jordan Bayles
01:14:06 <cpn> ... Abuse control, you could have an audio application take priority.
01:14:25 <cpn> ... Audio application is woken by stuff related to playout, or a video application woken by camera frames
01:15:01 <dom> [slide 7]
01:15:27 <cpn> ... Alternatives, heuristically classify workloads. Why not just give them the priority they need, with no APIs
01:15:41 <dom> [slide 8]
01:15:55 <cpn> ... This is where we are, a forced opt-in to the system. Giving applications more control seems beneficial
01:16:09 <cpn> ... Want to also bring this to Web Performance, and Audio WG
01:16:34 <aestes> aestes has joined #mediawg
01:16:43 <aestes> q+
01:16:55 <cpn> Youenn: I read the 2023 breakout minutes, Paul Adenot said for audio needs to use audio threads in addition to worklets. It's the same scenario, want to ensure the worker and worklet schedule at the same pace
01:17:09 <dom> q?
01:17:11 <cpn> ... If something bad happens, both could be demoted to regular priority, which would be bad
01:17:15 <dom> ack aestes
01:17:27 <Igarashi> Igarashi has joined #mediawg
01:17:33 <Igarashi> present+
01:17:47 <cpn> Andy: About the descriptive priorities, UAs would implement based on platform understanding of user-initiated, background etc
01:17:59 <cpn> What if those vary across platforms, leading to differing behaviour?
01:18:09 <dom> q+
01:18:16 <cpn> ... Would the spec need to address, in terms of a ranking of the descriptions
01:18:20 <cpn> s/What/... What/
01:18:36 <cpn> Markus: Not sure how much you could spec. Good to be as relaxed as posssible in the spec
01:19:27 <cpn> Dom: Issue is interoperability. I wonder if describing in an ideal world what you want to prioritise for a given hint. UA's may not always honour all of it, then it leads some discretion to the UA
01:19:32 <cpn> ack d
01:20:01 <cpn> ... If we find the right criteria for what the hints are optimising for, and leave to the UA to decide, better for interop
01:21:02 <cpn> Youenn: [Example of a source to a native sink, piping]. Different implementations might have different prioritise for native sources and sinks, but all should have same QoS
01:21:07 <nigel> q+ to ask if user needs should be able to impact implementation and interpretation of the priorities
01:21:09 <cpn> Dom: That's about groupings?
01:21:29 <cpn> Youenn: Audio worklet you know is real time, except when everything falls apart
01:21:42 <cpn> Dom: Maybe what we should decsribe is the pipeline
01:21:55 <cpn> Youenn: Yes, the big use cases are audio processing and camera capture
01:22:19 <cpn> Markus: Various use cases: audio processing in WebRTC, where we have problems. Also present problems. Intel issue linked in the presentation
01:22:21 <cpn> q?
01:22:31 <cpn> ... On windows, easy to make the system misbehave
01:22:57 <cpn> Youenn: Suggest grouping priorities of parts of the pipeline, some in JS, some native
01:23:19 <cpn> Markus: Is that a proposal to not expose categories?
01:23:35 <cpn> Youenn: If a group connects to a native source or sink, the UA knows the priority
01:23:41 <cpn> ... Could be a way to reduce interop concerns
01:24:19 <cpn> Markus: So instead of exposing categories, do it automatically?
01:24:47 <cpn> Dom: No, have a way to tell the UA that a set of workers are connected to a sink, so the UA can optimise to make that operate smoothly
01:25:00 <cpn> ... So declaring the pipeline rather than each piece
01:25:18 <cpn> Markus: In that case, how would the UA identify the constituent pieces?
01:25:34 <cpn> Dom: Could give an API the sink and source
01:25:57 <cpn> Markus: Could put a QoS on getUserMedia, then anything connected inherits that
01:26:08 <cpn> ... There are opportunities for that system to misunderstand what's going on
01:26:14 <jophba> jophba has joined #mediawg
01:26:23 <dom> q+ Mark_Foltz
01:27:10 <dom> ack nigel
01:27:10 <Zakim> nigel, you wanted to ask if user needs should be able to impact implementation and interpretation of the priorities
01:27:12 <cpn> Nigel: On interop and consistent implementation, the user need might influence the priority, e.g., if they're using a screen reader, they wouldn't want that to influence the priority
01:27:29 <cpn> ... In that case, janky video might not be so bad. Just something to consider
01:27:48 <cpn> Markus: Yes, it could factor in usage of screen reader, for example
01:27:53 <dom> ack Mark_Foltz
01:28:22 <cpn> Mark: For realtime media applications, what's the ideal model from the application point of view, from each media input, camera or over the network
01:28:55 <cpn> ... Browser could learn the inherent data rate feature of each pipeline. Then, e.g., prioritise the audio pipeline over the video pipeline
01:29:29 <cpn> ... What's the application's ideal model? Give the app time to process each incoming or outgoing frame, and tell the time box that's availalble, such as a time deadline
01:29:45 <cpn> Markus: Specifying deadlines would be the best, then we don't need to talk about priorities any more
01:29:59 <cpn> ... Not many OS's can do something with that
01:30:20 <cpn> Mark: You'd add the deadines to the scheduler so it makes the prioirity decision
01:30:57 <cpn> Youenn: From UA perspective, hard to know the time of input or output for the work item. Could describe more, e.g., i'm an image processing tool doing blur
01:31:05 <cpn> ... If you know the timing of input and output
01:31:27 <cpn> Mark: Streams API gives you a way do define a pipeline, so could be a place to expose
01:31:31 <cpn> Nigel: How to test?
01:32:01 <cpn> Markus: Tests might be flaky, when you have real time tests
01:32:30 <cpn> ... It's a hard problem. A big challenge is the range of OS's that can  or can't support this in a good way
01:33:15 <cpn> Dom : Groups involved?
01:33:32 <cpn> Youen: Media WG, WebRTC, Web Performance
01:33:49 <cpn> ... Audio
01:34:07 <cpn> Dom: Suggest writing an explainer, and circulate across the groups
01:34:19 <cpn> ... Get something started
01:34:31 <cpn> Mark: WebGPU and WebNN would want input
01:34:59 <cpn> Markus: I can produce the explainer
01:36:33 <nigel> nigel has joined #mediawg
01:36:56 <nigel> nigel has joined #mediawg
01:53:11 <nigel> nigel has joined #mediawg
02:01:19 <nigel> nigel has joined #mediawg
02:02:24 <nigel> Topic: Suggestions for improving the W3C Process
02:02:36 <nigel> cpn: Welcome Igarashi-san from the advisory board
02:03:03 <nigel> igarashi: Thank you, I'm here to discuss potential Process issues.
02:03:18 <nigel> .. The Process CG discusses issues with the Process and gets feedback from the community.
02:03:34 <nigel> .. The AB has decided to participate in each WG's meeting during TPAC to get more feedback.
02:03:48 <nigel> .. The Process is very complicated so in a short session we may not have time for details,
02:03:57 <nigel> .. but I'd like to get any feedback about concerns with the Process.
02:04:31 <nigel> song: In the last few days XXX has had discussions so we are going to work together for the Media WG.
02:04:37 <SteveBeckerMicrosoft> SteveBeckerMicrosoft has joined #mediawg
02:04:43 <nigel> .. The Process document is the first document we need to check when drafting the Charter,
02:05:06 <nigel> .. and we are trying to improve the documentation so that's the goal from us.
02:05:27 <nigel> .. If there are any frustrations about the Process for the organisation you can answer the questionnaire.
02:05:43 <nigel> igarashi: Let's open it for discussion.
02:05:52 <igarashi> igarashi has joined #mediawg
02:06:00 <nigel> cpn: Straw poll: how many people are aware of it and how many have read it in detail?
02:06:12 <igarashi> present+
02:06:18 <nigel> 8 hands raised
02:06:29 <nigel> cpn: Maybe a third to a half of the group
02:06:32 <nigel> .. It's a big question.
02:06:41 <nigel> .. Are there things that cause concern?
02:06:46 <nigel> .. Is it understandable?
02:06:53 <nigel> .. Does it help us achieve our goals?
02:06:56 <Mark_Foltz> q+
02:07:00 <nigel> .. Not going to review it all
02:07:01 <nigel> ack M
02:07:08 <nigel> Mark_Foltz: I don't think I've ever read the entire document.
02:07:22 <tidoust> q+
02:07:23 <nigel> .. Whenever I have a question I ask someone like Francois who knows it and answers correctly.
02:07:40 <nigel> .. That works very well. Maybe better socialisation would mean I wouldn't have to ask so many questions.
02:07:43 <nigel> Youenn: +1
02:07:53 <nigel> cpn: I recognise that too.
02:08:00 <nigel> .. The Process has flexibility and different WGs can operate differently.
02:08:05 <kazho> kazho has joined #mediawg
02:08:07 <nigel> .. Without Francois we would really struggle.
02:08:28 <nigel> .. That points to... there's complexity, and how much is essential vs how much is accidental.
02:08:31 <jophba> jophba has joined #mediawg
02:08:45 <nigel> Mark_Foltz: In any development environment, when there's a complex process we usually build tools to manage it,
02:08:52 <nigel> .. which distils it into a series of manageable steps.
02:09:09 <nigel> .. I've noticed with things like Wide Review is that someone creates a GitHub issue with tick boxes,
02:09:20 <nigel> .. but maybe it needs to be supported by better tooling to automate it.
02:09:32 <nigel> .. For a long time in Blink when we did intent to implement everyone wrote an email,
02:09:46 <nigel> .. it got complicated so we developed a web tool to make it easier to do consistently.
02:09:52 <nigel> ack tid
02:10:03 <nigel> tidoust: Same thing, the AB is interested in what could change in the Process,
02:10:20 <nigel> .. I guess the two main impacts for WGs are the Chartering process that we do every 2 years.
02:10:30 <nigel> .. I refresh the Charter and come back to the group to get feedback.
02:10:32 <igarashi> q+
02:10:38 <nigel> .. I'm not sure how useful that is, maybe it could be simplified.
02:11:00 <nigel> .. The other impact is Wide Review, where I see some struggle in the MediaWG. Chris showed the list with everything in WD
02:11:20 <nigel> .. earlier. If we find it hard to get to CRS maybe the Process is making it more difficult than it needs to be.
02:11:33 <nigel> .. I see that some of the specs are widely implemented but are still WD, so some steps are missing.
02:11:45 <igarashi> q?
02:11:49 <nigel> .. I'm taking feedback on tooling, we've had a lot of discussion about wide review earlier this week as well.
02:12:18 <nigel> .. I'm dreaming of a world where WGs naturally track wide review tracking changes as they are made.
02:12:31 <nigel> .. So it's not too late to get review when things have been shipped.
02:12:44 <nigel> .. Some things need tooling but not Process changes, others need Process changes.
02:13:01 <nigel> cpn: In each of our specs we have e.g. a CR tracking issue, so we can work out what we want to resolve before entering CR.,
02:13:05 <nigel> s/.,/.
02:13:26 <nigel> .. Looking at "adequate implementation experience" - I need to check the Process each time to see if we are meeting the right criteria.
02:13:39 <dom> RRSAgent, draft minutes
02:13:40 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html dom
02:13:46 <nigel> .. It is all described here but I personally find it hard to keep track of.
02:13:51 <nigel> .. Then what does it mean in practice for each spec.
02:14:03 <nigel> .. Is shipping enough, or do we need developer feedback for Wide Review.
02:14:07 <igarashi> q?
02:14:13 <nigel> .. Turning these requirements into more specific criteria in our case.
02:14:14 <nigel> ack ig
02:14:27 <nigel> igarashi: Thank you very much, very interesting proposals about tooling.
02:14:56 <nigel> .. Chris, we may have other comments, some sort of polling about complexity?
02:15:24 <nigel> cpn: Good suggestion. Anything that [scribe missed] would be a good idea. I don't think we need to poll right now.
02:15:50 <nigel> Song: We will try to get feedback and it will be a priority for the AB to improve the Process document.
02:16:00 <nigel> cpn: I look forward to hearing more about where you take this next.
02:16:07 <nigel> .. Thank you for bringing it to us, it is helpful.
02:16:23 <nigel> Topic: WebCodecs Reference Frame Control
02:16:26 <dom> s|Slideset: https://docs.google.com/presentation/d/e/2PACX-1vQehy6wDMrwNory7-2QfO1EtOY8q-nqf4gZ6G69uikGRDleoO4g3oADR809kRnmA2weHxD9w2HKXkZ5/pub?start=false&loop=false&delayms=60000|Slideset: https://lists.w3.org/Archives/Public/www-archive/2025Nov/att-0004/Worker_QoS_reboot.pdf
02:16:29 <dom> RRSAgent, draft minutes
02:16:30 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html dom
02:16:58 <youenn> youenn has joined #mediawg
02:17:06 <cpn> scribe+ cpn
02:17:19 <guidou> guidou has joined #mediawg
02:17:26 <tidoust> -> https://github.com/w3c/webcodecs/issues/285 Reference frame control
02:18:20 <dom> Slideset: erik_slides
02:18:25 <dom> [slide 1]
02:18:30 <cpn> erik: I work on Google Meet. We want to use WebCodecs more, but it lacks some features
02:18:47 <cpn> ... We want to reference any structure using minimal tools in a codec agnostic way
02:19:01 <dom> [slide 2]
02:19:16 <cpn> ... This is the proposal from Eugene, from 2 years ago
02:19:29 <cpn> ... There a new 'manual' scalability mode
02:19:51 <dom> [slide 3]
02:19:52 <cpn> ... Call getAllFrameBuffers(), and for each frame you encode, you signal the references
02:20:24 <cpn> ... To ship something, we have some constraints. You get an array of buffers out. VP9 can have 7 buffers but you can only reference 3 at once
02:20:47 <cpn> ... Since the encoder doesn't know the layer you're encoding....
02:21:02 <cpn> ... We don't deal with spatial scalability. Hoping to address in future
02:21:09 <dom> [slide 4]
02:21:14 <cpn> ... Can do 80 of the things we'd like to with this API update
02:21:21 <cpn> s/80/80%/
02:21:25 <tidoust> present+ Randell Jesup
02:21:47 <cpn> ... Scale down the frame rate.Frame 0 is a leaf node, Frame 2 references the temporal layer
02:21:49 <dom> [slide 5]
02:22:00 <cpn> ... In code, it's straightforward
02:22:31 <cpn> ... I made some demos
02:22:39 <hiroki> hiroki has joined #mediawg
02:23:07 <cpn> ... Top left is my self preview, AV1 bitstream with 3 temporal layers. On right, at top we show entire bitstream, and middle we drop frames
02:23:28 <dom> [shows demo https://sprangerik.github.io/webcodecs-demos/webcodecs_manual_scalability.html ]
02:23:30 <cpn> ... Rate control demo
02:23:36 <dom> [shows https://sprangerik.github.io/webcodecs-demos/rate_control_demo.html ]
02:23:55 <cpn> ... Compares target and actual bitrate. Use the CBR rate controller in VP9 and AV1
02:24:22 <cpn> ... It's a realtime rate control, so you have to do some guessing. The bitrate fluctuates
02:24:38 <cpn> ... With external rate control, I set the quantiser value
02:24:54 <cpn> ... It's a good rate controller, reacting faster
02:25:27 <cpn> ... Instead of always referencing the last frame, I have two that I alternate between
02:26:09 <cpn> ... Re-do with a new QP, it's a semi-realtime encode mode. We use this for screensharing. When you do a slideshare, there's a huge spike in bitrate. That's what this is made for
02:26:18 <dom> [xlide 6]
02:26:21 <dom> s/xlide/slide
02:26:29 <dom> [slide 7]
02:26:41 <cpn> ... Another use case is long term references, where you have an active feedback signal
02:27:05 <cpn> ... When you send a frame, send a signal back. Then you have a guarantee that the frame will exist, even if there's loss in the network
02:27:14 <dom> [shows LTR demo https://sprangerik.github.io/webcodecs-demos/ltr_demo.html ]
02:28:11 <cpn> ... The spatial quality suffers, as it uses references that are older
02:28:37 <dom> [slide 8]
02:28:59 <cpn> ... Lots of use cases: TLDR is about it helping it be dynamic and react to network conditions, with knowledge of the state of the system
02:29:25 <cpn> ... Status: There's a Chrome implementation. I published an explainer yesterday
02:29:40 <cpn> ... Intent to experiment in Q4
02:30:00 <cpn> ... Implementation-wise, we're rolling out support for D3D12 hardware video encoders on Windows
02:30:20 <cpn> ... Works in same way for H264 and H265 and AV1
02:30:33 <cpn> ... Should work across Intel, nVidia, AMD chipsets. Some work to get it stable
02:30:33 <dom> [slide 9]
02:30:35 <dom> [slide 10]
02:30:42 <cpn> ... Next step? Interest from others in using this?
02:31:22 <cpn> Youenn: Long term, if you have a peer connection object ,should web apps just forget about peerconnection and use this web codecs fine tuning
02:31:47 <cpn> Erik: This unlocks more customisation and experimentation. Long term, we can move more logic in libraries instead of being baked into the UA
02:32:06 <cpn> ... For conferencing the transport option of delivering is a question. Not just the encoded frame, you need metadata with it
02:32:43 <cpn> Youenn: How does this relate to the complexity / priority idea?
02:33:00 <cpn> Erik: This is the minimum thing that gets us most of the way there. Should have speed control
02:33:13 <cpn> ... Need to know what speed settings the encoder supports
02:33:20 <cpn> ... Skipping that for now to have something shippable
02:33:45 <dom> -> https://www.w3.org/2024/09/TPAC/breakouts.html#b-49386363-7a65-4f4a-9580-bff867a1c6e9 TPAC 2024 breakout: Evolved Video Encoding with WebCodecs
02:33:52 <cpn> ... Not much of a fingerprinting issue?
02:34:07 <cpn> Chris: Who do you want feedback from, other apps than Google Meet, browser vendors?
02:34:18 <cpn> Alastor: I'll ask my colleague who works on this
02:34:51 <cpn> Youenn: Implementation-wise, with software encoders it's easy. Less sure about hardware codecs
02:35:25 <cpn> ... Two teams involved: hardware-level, and the API surface
02:35:35 <cpn> ... Feedback from WebCodecs users like Zoom?
02:35:51 <cpn> Erik: From Microsoft Teams, yes.
02:36:04 <nigel> q?
02:36:37 <cpn> Randell: It's interesting, want to look more for implications on hardware support. We'd have to see from a prioritisation perspective
02:36:47 <cpn> ... Demos look interesting.
02:37:07 <nigel> cpn: Good next step, all to take a closer look, investigate feasibility of implementation at hardware level.
02:37:13 <nigel> .. bring it back to a future meeting when you're ready,
02:37:17 <nigel> s/y,/y.
02:37:31 <nigel> Topic: WebCodecs and WebRTC encoded transform
02:37:59 <cpn> Youenn: We mostly converged... We have WebCodecs and encoded transform interfaces
02:38:09 <cpn> ... Doing similar things, expressing encoded video and audio data
02:38:20 <cpn> ... They're used differently, and expose different information
02:38:57 <cpn> ... One thing could be done is use the EncodedVideoChunk, could reduce the WebRTC Encoded Transform Spec
02:39:34 <cpn> ... Second topic, there's a proposal in WebRTC WG to add an encoded source to a peer connection
02:39:44 <cpn> ... A WebCodecs encoded piped into PeerConnection
02:40:11 <cpn> ... How to have a good match between EncodedAudio|VideoChunk and the peer connection pipeline
02:40:50 <cpn> ... The encoded source proposal has an API. Should we try to increase exposure of encoded audio frame or reduce XXX?
02:41:05 <cpn> ... Do WebCodecs folks have thoughts?
02:41:54 <tidoust> scribe+
02:42:20 <sprang> sprang has joined #mediawg
02:42:44 <dom> -> https://w3c.github.io/webcodecs/#encodedaudiochunk-interface EncodedAudioChunk interface
02:43:13 <dom> -> https://w3c.github.io/webrtc-encoded-transform/#RTCEncodedAudioFrame-interface EncodedAUdioFrame interface
02:43:26 <tidoust> Youenn: Interfaces are quite different currently.
02:43:49 <dom> s|erik_slides|https://docs.google.com/presentation/d/1gnhEvCFPUsmaiz-jpQyiGS7WDzjCKIhH59hu-DdnVcI/edit
02:43:56 <tidoust> ... Basically, you have a metadata object which was meant for use in transforms.
02:44:04 <dom> RRSAgent, draft minutes
02:44:05 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html dom
02:44:17 <tidoust> ... EncodedVideoChunk is quite different. It's immutable.
02:44:29 <tidoust> ... You can construct it very easily.
02:45:06 <tidoust> ... If you have a WebCodecs that is giving you such a chunk, you will need to create two destination and create and EncodedAudioFrame from that chunk.
02:45:45 <tidoust> ... WebRTC tries to reuse as much as possible from WebCodecs.
02:45:56 <tidoust> ... But we cannot change interfaces for old usage.
02:46:08 <GabrielBrito> GabrielBrito has joined #mediawg
02:46:26 <tidoust> Eugene: From a WebRTC point of view, wondering about adding dependency?
02:46:35 <tidoust> Youenn: It's already the case in practice.
02:46:39 <GabrielBrito> Sorry, my mic was open by accident
02:46:59 <tidoust> Eugene: If you're already using EncodedVideoChunk and VideoFrame, I'm all in favor of extending that.
02:47:07 <tidoust> Youenn: What's missing is the metadata.
02:47:14 <tidoust> Eugene: I see what you mean.
02:47:21 <guidou> guidou has joined #mediawg
02:47:22 <cpn> q?
02:47:23 <dom> q+ erik
02:47:33 <guidou> q+ guidou
02:48:21 <tidoust> Erik: My preference would be to add metadata as part of the WebRTC code rather than in WebCodecs
02:48:21 <dom> ack erik
02:48:26 <dom> ack guidou
02:48:37 <dom> s/code/call/
02:49:05 <tidoust> guidou: To give a little history, encoded and raw were practically developed at the same time, but encoded came out first.
02:49:28 <tidoust> ... Now that we encoded chunks and new use cases, it would make sense to use the WebCodecs one.
02:49:39 <tidoust> ... But we need to add the metadata to the raw version one.
02:50:25 <tidoust> ... The idea would be to add metadata here and support this for this use case, which lets you use EncodedAudioChunks directly without having to convert.
02:51:00 <tidoust> Youenn: About the funneled use case, in that case, the input is really an rtc encoded audio object.
02:51:07 <tidoust> ... You can transfer the data from what I can tell.
02:51:14 <tidoust> ... It should be fast as well.
02:51:16 <dom> s/funneled/fan-out
02:51:28 <tidoust> ... Going to encoded audio chunks should work.
02:52:01 <tidoust> ... I'm hearing that if there's consensus, then we should extend WebCodecs, and not do it in WebRTC.
02:52:43 <tidoust> ... About the second part, is it something that WebCodecs encoders should implement or is it easy to do?
02:52:49 <tidoust> Eugene: We had the discussion before.
02:53:11 <tidoust> ... We came to the conclusion that the encoders/decoders won't be doing the work of transferring the metadata from the inputs to the output.
02:53:28 <cpn> q?
02:53:50 <tidoust> ... Since we have don't have 1:1 correspondance between video frames and encoded chunks and so on, it's up to apps to do the mapping based on timestamps and so on.
02:54:01 <tidoust> cpn: No need here for metadata on AudioData then?
02:54:34 <tidoust> Youenn: Right.
02:55:02 <tidoust> Markus: Is is fully possible to associate this metadata with what we send to the encoder?
02:55:12 <tidoust> Guido: We would need to add that separately.
02:55:19 <tidoust> Markus: So you need to match on the outside.
02:55:33 <dom> s/separately./separately via timestamp matching
02:56:32 <tidoust> Erik: For the drop case, garbage collecting metadata for something that isn't known may be hard.
02:56:36 <tidoust> ... That's a separate discussion.
02:57:27 <tidoust> Eugene: If you feel that your API design would be more ergonomic with RTC versions of WebCodecs constructs, go for it. We're not going to make the encoders/decoders pass the metadata.
02:58:20 <tidoust> ... For VideoFrame, the main reason for metadata is that we have different sources, with different parameters and extra information.
02:58:50 <tidoust> ... For encoded audio/video chunks, if they come from RTC streams, I can also imagine that they have metadata attached to them and it makes sense to have the metadata there as well.
02:59:58 <tidoust> ... What I mean is that, originally, we added metadata on VideoFrame because they can come different sources (cameras, canvas, etc.), with different information attached to them. From a camera, you may have rectangles attached to detected faces for example.
03:00:13 <tidoust> ... We have metadata on VideoFrame because of that.
03:00:34 <tidoust> ... The same logic can be applied to encoded audio/video chunks if we get them from different streams.
03:00:53 <tidoust> ... E.g. demuxed streams, webrtc streams.
03:01:32 <tidoust> ... I see that as an argument that we can potentially add metadata to audio/video encoded chunks. Curious to see what Paul thinks about it.
03:01:51 <tidoust> cpn: Paul is not in the room right now.
03:01:58 <dom> q+ jesup
03:02:35 <tidoust> Youenn: What Eugene proposes is arbitrary metadata, whereas the proposal we have is more specific metadata.
03:02:58 <tidoust> ack jesup
03:03:14 <tidoust> jesup: Will discuss with Paul.
03:04:04 <tidoust> Youenn: I was looking for guidelines to put encoded sources in the right path. Next step is, I guess, to come up with a concrete proposal and let it be reviewed by both groups.
03:04:24 <tidoust> ... I suspect Guido will be driving this.
03:05:14 <tidoust> Guido: Yes, we can perhaps write an explainer for encoded sources. How to work with encoded chunks and putting emphasis on integration with encoders/decoders.
03:05:42 <tidoust> cpn: Sounds good to me.
03:06:01 <tidoust> Topic: iframe media pausing
03:06:16 <tidoust> -> https://pr-preview.s3.amazonaws.com/gabrielsanbrito/iframe-media-pausing/pull/1.html iframe media pausing
03:06:35 <SteveBeckerMicrosoft> slides for iframe media pausing: https://docs.google.com/presentation/d/1uzyCPbnLvQ-ME_CNETFubUrUZQLiPbFhwzzRDUT9pcQ/edit?usp=sharing
03:07:16 <tidoust> s/slides for iframe media pausing:/slideset:
03:07:21 <tidoust> [Slide 1]
03:07:49 <tidoust> GabrielBrito: Software engineer for Microsoft Edge. We have been working on this feature.
03:07:53 <tidoust> [slide 2]
03:09:07 <tidoust> GabrielBrito: Main motivation is when an iframe needs to be hidden away for some reason. Since the iframe can be arbitrarily complex, if the application chooses to hide it for some reason, it has no control over the media, and needs to destroy it to make sure it pauses the media. Re-creating an iframe is resource intensive.
03:09:31 <tidoust> ... If it does not destroy and the audio keeps playing, there's no way to control it from a UX perspective.
03:09:35 <tidoust> [slide 3]
03:10:17 <tidoust> GabrielBrito: We propose a new permission policy that can be used to prevent an iframe and its children from playing media when the iframe is hidden.
03:10:32 <tidoust> ... We have integration with media playback, Web Audio, and autoplay.
03:10:35 <tidoust> [slide 4]
03:11:08 <tidoust> GabrielBrito: We have a working prototype behind a flag in Chromium. Mzoilla now supports the feature and we're incubating the feature in WICG.
03:11:16 <tidoust> ... The spec is fairly simple.
03:11:21 <tidoust> [slide 5]
03:11:29 <tidoust> GabrielBrito: That's what we've been doing the last year.
03:11:45 <tidoust> ... Looking forward for questions, ideas.
03:11:49 <cpn> q?
03:12:06 <tidoust> SteveBeckerMicrosoft: Some developers have adopted the features and provided feedback.
03:12:07 <nigel> q+ to ask why generic media playback is intersecting with visibility specifically - what about audio only media?
03:12:48 <tidoust> ... Use cases include use of cross-origin iframes with complex apps being loaded.
03:13:12 <tidoust> Youenn: What about the case of a top-level document that never wants an iframe to play media?
03:13:17 <tidoust> ... Has it been discussed?
03:13:35 <tidoust> SteveBeckerMicrosoft: There's a possibility to consider other uses for the permission.
03:13:48 <Mark_Foltz> q+
03:13:56 <tidoust> ... If there are other customers for these scenarios, happy to iterate.
03:14:21 <tidoust> Eric: Did you say when the iframe becomes invisible, do you pause playback or do you mute playback?
03:14:31 <tidoust> GabrielBrito: The current spec says that it should be paused.
03:14:51 <tidoust> Eric: Have you found compat issues with scripts that don't expect playback to be paused externally?
03:15:02 <tidoust> GabrielBrito: Haven't heard problems so far.
03:15:32 <tidoust> Eric: There are some sites that have problems when playback is paused via a script that isn't initiated by their own.
03:15:33 <cpn> q+ randall
03:15:43 <tidoust> ... It's a big problem at some sites.
03:15:49 <tidoust> ack nigel
03:15:49 <Zakim> nigel, you wanted to ask why generic media playback is intersecting with visibility specifically - what about audio only media?
03:16:19 <tidoust> nigel: This is expressed as media element playback. But the condition is visibility. What if the element is only playing audio?
03:16:36 <tidoust> ... If it's not producing visible changes, should it be restricted to video playback?
03:17:01 <tidoust> GabrielBrito: The permission policy would apply to the iframe, not to the media element. It's when the iframe itself is hidden.
03:17:53 <tidoust> cpn: In the audio case, a music player may want to continue playing music. The top level document would choose not to use the feature.
03:17:57 <tidoust> ack Mark_Foltz
03:18:33 <tidoust> Mark_Foltz: I was wondering whether the explainer talks about the integration with the MediaSession API?
03:18:47 <tidoust> GabrielBrito: This is something we need to take into consideration.
03:19:09 <nigel> q+ to ask about the interaction with PiP
03:19:22 <tidoust> Andy: Directly pausing the media element should fire the event with a reason.
03:19:47 <tidoust> ... It might be more compatible with sites that don't expect the playback to be paused.
03:19:56 <cpn> q+ youenn
03:19:57 <tidoust> ack randall
03:20:42 <tidoust> randall: About these sites and audio only cases, because that is being applied at the iframe level, in most cases, they know that this is the case.
03:21:08 <tidoust> ... If an app encloses arbitrary content, they probably know that things can break. I don't see that as a blocker.
03:21:28 <tidoust> GabrielBrito: Yes, AudioContext comes to mind as well for the interrupted state.
03:21:51 <tidoust> Eric: I think this is a great feature, just raised the issue as something to watch for.
03:22:28 <tidoust> nigel: Question about picture-in-picture. If the iframe is hidden, does PiP still continue?
03:22:36 <tidoust> Youenn: We should consider that it is still visible, yes.
03:22:59 <cpn> q?
03:23:01 <cpn> ack n
03:23:01 <Zakim> nigel, you wanted to ask about the interaction with PiP
03:23:04 <tidoust> Randell: I think the spec should detail this and have opinion on it.
03:23:40 <cpn> Picture in Picture spec on interaction with page visibility: https://www.w3.org/TR/picture-in-picture/#page-visibility
03:23:45 <tidoust> Youenn: Also relation with the AudioSession spec. One of the purposes is to expose to the page "Hey, you're interrupted". This is already happening for a page on iOS, when another application starts playing audio.
03:24:06 <tidoust> ... AudioSession has an algorithm that says how to interrupt media playback in a document.
03:24:22 <tidoust> ... It would be good to relate the two as I think there could be hooks between the specs.
03:24:47 <tidoust> ... Similarly, for MediaSession, AudioSession may allow you to resume, I think... [checking]
03:24:52 <tidoust> ... Ah, no.
03:25:32 <cpn> q?
03:25:34 <cpn> ack y
03:25:38 <tidoust> ... I would recommend reading the spec, and let Alastor and I know if adjustments to the spec may be needed to ease usage.
03:25:56 <tidoust> cpn: You have a PR opened. Are you looking for feedback on that before iterating?
03:26:10 <tidoust> GabrielBrito: I was discussing this with Paul.
03:26:20 <tidoust> ... Looking for feedback from Mozilla team.
03:27:12 <tidoust> cpn: Sounds good. I think you have a good starting point. And then some of the issues that were raised today can perhaps be addressed separately in different issues.
03:27:40 <tidoust> GabrielBrito: I'll create issues for things we discussed today.
03:27:59 <tidoust> RRSAgent, draft minutes
03:28:00 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html tidoust
03:28:16 <jophba> jophba has joined #mediawg
03:38:46 <Simonth> Simonth has joined #mediawg
03:45:47 <nigel> nigel has joined #mediawg
03:46:18 <nigel> nigel has joined #mediawg
04:05:42 <nigel> nigel has joined #mediawg
04:18:49 <Mark_Foltz> Mark_Foltz has joined #mediawg
04:22:19 <nigel> nigel has joined #mediawg
04:41:34 <nigel> nigel has joined #mediawg
05:01:10 <cpn> cpn has joined #mediawg
05:01:22 <cpn> Topic: Media Capabilities
05:02:57 <SteveBeckerMicrosoft> SteveBeckerMicrosoft has joined #mediawg
05:05:52 <tidoust> slideset: mark_foltz
05:06:28 <cpn> Mark: I became spec editor a year ago. I'm working through existing PRs and issues
05:06:34 <adekker> adekker has joined #mediawg
05:06:41 <wschildbach> wschildbach has joined #mediawg
05:06:42 <cpn> ... Making good progress on clearing up the backlog of issues to resolve
05:06:44 <tidoust> [slide 2]
05:07:06 <cpn> ... The current status, the spec is a Working Draft on the Rec Track. We have work to do to go to CR
05:07:21 <cpn> ... There are working implementations in major engines
05:07:46 <cpn> ... We still have about 33 open issues in GitHub. Several are questions or feature ideas, or things that don't impact the spec so much
05:08:15 <cpn> ... Short term, we have 14 v1 issues, that we should resolve. Of those, about 8 require changes to normative text, so want to make sure we have consensus on those
05:08:22 <tidoust> [slide 3]
05:08:23 <cpn> ... Two open PRs from before I took over
05:08:46 <cpn> ... Interop-wise, it looks OK. We added tests on being stricter on validating codec strings and mime types
05:09:09 <cpn> ... For example, using a video mime type for audio and vice verse. Some interop work needed there
05:09:21 <cpn> ... There's an open issue for Chrome, on main issues failing WPTs
05:09:34 <cpn> ... Haven't found someone with time to work on it yet
05:09:44 <cpn> ... If other vendors want to work on them, that's encouraged
05:09:52 <tidoust> [slide 4]
05:09:59 <cpn> subtopic: Horizontal reviews
05:10:18 <cpn> Mark: Some were done a while ago. Do we need new reviews. Some have pending actions for us
05:10:47 <cpn> ... There's one on a11y, seems more of a question on whether you can use MC API to detect whether media has a text track, and infer user preferences
05:10:59 <cpn> ... I don't think it's relevant to the spec, but needs an answer
05:11:26 <cpn> ... Some substantial TAG feedback, all the issues have been done and closed. Nothing needing TAG input now, AFAIK
05:11:44 <cpn> ... Talking with Chris, there was a Privacy review that raised questions. We responded and added text
05:12:14 <cpn> ... So all the things they fed back on we've addressed. Might want to ask for another review
05:12:26 <cpn> ... Security review might still need doing
05:13:01 <cpn> Francois: Security reviews are restarting, so can do that now
05:13:38 <cpn> Mark: Have separate privacy and security sections? An issue was raised then they changed their mind
05:14:08 <tidoust> scribe+
05:14:15 <jophba> jophba has joined #mediawg
05:14:41 <tidoust> cpn: Privacy group always has concerns over exposing capabilities to applications.
05:15:22 <hiroki> hiroki has joined #mediawg
05:15:24 <tidoust> ... Due to fingerprinting considerations. It seems to me that any feature we add might impact that. I'm wondering whether the current spec is materially different from the one that was reviewed, warranting another round of review.
05:15:45 <Hubbe> q+
05:16:04 <tidoust> ... One question was: is capability detection the right approach to start with? Why doesn't the site offer a set of choices that the user agent could then choose from.
05:16:15 <tidoust> ... But I think we answered this question.
05:16:26 <tidoust> ... Partly because they would become observable anyway.
05:18:00 <cpn> ack H
05:18:03 <tidoust> Fredrik: Querying capabilities is very powerful. It's very much the right choice. You can always infer all of this information anyway by trying videos out and figuring out what works. Media Capabilities does not reveal new bits, but it does make it slightly easier to query these capabilities.
05:18:40 <tidoust> Eric: If it's easy to do by attempting to play video, then question could be: why do we need that in the first place?
05:18:50 <tidoust> ... It's very easier to use the API.
05:19:22 <cpn> q?
05:19:25 <tidoust> ... But it makes also it much easier for sites to get additional fingerprinting bits.
05:19:41 <tidoust> ... I second Chris in getting an additional privacy review.
05:19:46 <alwu> alwu has joined #mediawg
05:20:09 <tidoust> Mark_Foltz: Next step would be to gather the list of changes that were made since last review and determine whether that warrants another review.
05:20:32 <tidoust> ... I just want to point out that this is not a new conversation.
05:21:12 <tidoust> ... I will put it in the queue of things to do. Help would be welcome!
05:22:07 <tidoust> cpn: There's a whole question here around text tracks.
05:22:23 <tidoust> ... Around querying some capabilities of the media you're playing.
05:22:36 <tidoust> ... which is not what the spec does.
05:22:44 <tidoust> ... But it may be about captioning support.
05:23:01 <tidoust> Mark_Foltz: There's no algorithm in the spec to report on any text track capability.
05:23:26 <tidoust> ... If it says something, we'll want to add a note about not linking that to user settings, only to browser settings.
05:24:08 <jophba> jophba has joined #mediawg
05:24:18 <tidoust> cpn: I'm reading the issue as coming from the perspective of someone who thought the spec allowed querying the capabilities of the media being played.
05:24:46 <tidoust> hta: To me, it makes more sense to compile a delta of changes, and send that to horizontal review groups.
05:25:03 <tidoust> ... Reviewers looking at a delta might be faster.
05:25:15 <tidoust> Eric: I was going to say the same thing.
05:25:52 <tidoust> ... Opening up for an entire review for a simple question like that seems unnecessary. It would work to answer the question.
05:26:18 <tidoust> ... A full review would take a substantial amount of time.
05:26:47 <tidoust> cpn: OK, so let's compose an answer to the question that was put forward and see if that satisfies the reviewers.
05:27:13 <tidoust> cpn: Security has now been restarted. Simone now leads the activity at W3C. I assume that's now an expected part of the horizontal review process.
05:27:26 <tidoust> ... We should do a self-review and request review afterwards.
05:27:54 <tidoust> Mark_Foltz: If you can look at the self review from 2020, and update it accordingly, that would be great!
05:27:59 <tidoust> cpn: Happy to do that.
05:28:15 <tidoust> [slide 5]
05:28:35 <SteveBeckerMicrosoft> SteveBeckerMicrosoft has joined #mediawg
05:28:38 <cpn> subtopic: Media Capabilities and webrtc
05:28:56 <cpn> https://github.com/w3c/media-capabilities/issues/185
05:29:05 <tidoust> Mark_Foltz: Feature that allows site to get additional information when they query the API for webrtc use cases.
05:29:25 <tidoust> ... There was a PR that was put together some time ago to add the feature.
05:29:37 <tidoust> ... Since then, we did a refactoring of the spec.
05:29:51 <tidoust> ... I rebased the PR, no substantive changes introduced in that rebase.
05:29:56 <tidoust> ... Now ready to be reviewed.
05:30:16 <tidoust> ... I wanted to assess if it's still accurate. No implementations for now.
05:30:37 <tidoust> ... It looks like it could be done with a small amount of JS. Is it something that the API needs to do?
05:31:00 <tidoust> ... Is there something that the browser knows that the site does not?
05:31:07 <tidoust> Youenn: In my mind, it's more convenience.
05:31:14 <tidoust> ... There may be issues around clock rate.
05:31:41 <JordanBayles_Google> JordanBayles_Google has joined #mediawg
05:31:46 <tidoust> ... Normally, you should be able to implement everything yourself.
05:32:01 <tidoust> ... If you're using the WebRTC API to set preferences, then it should still work.
05:32:03 <wschildbach> q+
05:32:08 <tidoust> hta: I think you're wrong on that one.
05:32:31 <tidoust> ... The way that setCodecPreferences work is that it requires an exact match between a codec that the system knows about and the codec that is requested.
05:33:00 <tidoust> ... If you try to construct one through JS, you're likely to miss what the platform is capable of, as it will vary across platforms.
05:33:15 <tidoust> ... You want to set complete different parameters depending on the device.
05:33:41 <tidoust> ... We need to get back the exact information that you need to put in the RTC space.
05:34:21 <tidoust> ... Also, H.264 defaults to 0, and the only sensible value is 1.
05:34:33 <tidoust> ... The underlying spec is very old and now outdated.
05:35:08 <tidoust> Mark_Foltz: Two jobs for the user agent here: 1. pick the correct values depending on the implementation of the codec, and 2. produce correct values with defaults.
05:35:20 <tidoust> Youenn: Yes.
05:35:47 <tidoust> hta: The Media Capabilities spec should say that what is returned by the query is one member of the family of codecs that the device is capable of supporting.
05:36:17 <tidoust> Youenn: If you have two browsers on the same machine, you want them to report the same values.
05:37:01 <tidoust> hta: In the WebRTC codec world, we landed on specifying only VP8, OPUS, and GSMA were to be supported by everyone, with other codecs being up to implementations.
05:37:04 <guidou> guidou has joined #mediawg
05:37:16 <tidoust> Mark_Foltz: As an editor, I need to know where the steps to populate this are.
05:37:31 <tidoust> ... If there's a spec I can normatively reference, that's good, otherwise I need someone to provide the steps.
05:38:04 <tidoust> hta: In the WebRTC spec, there's a spec that says "populate this with a platform-defined...". You can point at that.
05:38:15 <tidoust> ack wschildbach
05:39:10 <tidoust> wschildbach: I wasn't familiar with Media Capabilities API being used to query codec capabilities for WebRTC usage. Now I wonder about codec support that does not work for WebRTC.
05:39:31 <tidoust> Mark_Foltz: On the right side, you get "supported: false" in that case.
05:39:43 <tidoust> wschildbach: So there's a flag to query only for WebRTC usage.
05:39:46 <tidoust> Mark_Foltz: Yes.
05:39:58 <tidoust> cpn: The flag also distinguishes file based media.
05:40:20 <tidoust> Mark_Foltz: I will make an attempt to use the existing steps to populate the things on the right.
05:41:07 <tidoust> Youenn: I'm hoping that with the hook on WebRTC, you get a list of codecs, and then filter the list, and then pick up the first one in the ordered list as I suspect there are cases when you still have multiple entries.
05:41:40 <tidoust> ... That is, I'm hoping it's a sorted list, and that the first one is the preferred one.
05:41:56 <tidoust> Mark_Foltz: It might be implementation independent.
05:42:47 <tidoust> Subtopic: Stereoscopic video support
05:43:13 <tidoust> Hubbe: From Meta. We want to query stereoscopic video support. Media Capabilties seems the right place to do that.
05:43:28 <cpn> q+
05:43:30 <tidoust> ... Most browsers will not support the stereo mode.
05:43:41 <tidoust> ... Small change, pretty straightforward.
05:43:59 <tidoust> cpn: Is it a decoding or rendering capability? That's a difference we make.
05:44:24 <tidoust> ... If it's more a rendering capability, then Media Capabilities may not be the right spec for that.
05:44:32 <Mark_Foltz> q+
05:44:35 <tidoust> Hubbe: There's something for audio capabilities.
05:44:37 <cpn> ack c
05:44:51 <tidoust> cpn: There is, but the general feeling is that this may not have been a good decision.
05:45:04 <tidoust> Hubbe: Question being where else should it go to answer the question correctly?
05:45:31 <aestes> aestes has joined #mediawg
05:45:49 <tidoust> Mark_Foltz: It feels similar to HDR where we landed with a two-part approach: one query for whether you can decode HDR, and one query for whether your display supports HDR.
05:46:17 <tidoust> ... I'm thinking that the second could be used to detect stereoscopic display.
05:46:43 <tidoust> Hubbe: We don't necessarily have this capability in CSS or so on. It's purely for videos.
05:47:21 <tidoust> ... Media Capabilities also has this efficiency parameter, that is often tied to rendering as the defect is often in the renderer.
05:47:42 <tidoust> Mark_Foltz: I don't object to using the API to query into the metadata support.
05:47:55 <tidoust> ... Details about the rendering aspects of it would need to be discussed somewhere else.
05:48:12 <tidoust> ... There's CSS. There's document.screen.
05:48:50 <tidoust> Hubbe: If you try an AV1 video, it won't work is software-decoder. With an H.264 video, it will because it's hardware decoder.
05:49:36 <tidoust> Mark_Foltz: I'm not disagreeing. Media Capabilities is the right place for that part. I'm saying that whether the screen can actually render is out of scope.
05:50:27 <tidoust> hta: left-right, top-bottom is probably not the right thing for some codecs that encode stereo in one stream.
05:50:58 <tidoust> Hubbe: There's a "multiview" value for that.
05:51:20 <tidoust> cpn: What would a pure decoding capability query look like?
05:51:41 <tidoust> ... I'm wondering about the distinction between decoding and rendering.
05:51:59 <tidoust> Mark_Foltz: What happens when you render one of these videos to a canvas?
05:52:30 <tidoust> Hubbe: It depends on what type of video. For top-bottom, you get two halves. For multiview, you usually cannot see that they are even there.
05:53:01 <tidoust> Mark_Foltz: In that scenario, you could be decoding the video but not putting the video anywhere on a display. You are rendering in some sense but not targeting any device.
05:53:10 <tidoust> ... Similar to HDR support.
05:53:32 <tidoust> Eric_Cabanier: The CSS part you mentioned wasn't handled in this group?
05:53:40 <tidoust> Mark_Foltz: No, that was handed over to CSS.
05:54:08 <tidoust> Hubbe: dynamic range and also video dynamic range.
05:55:07 <tidoust> Xiaohan: In Media Capabilities, we do have HDR metadata. I feel the line between decoding and rendering is not super clear. Some capabilities are very close to the screen.
05:56:45 <tidoust> cpn: Are there pieces of this that could go to the screen interface? I wonder whether they can be factored out and done through media queries or some other mechanism.
05:57:16 <tidoust> Hubbe: If we implement this in our browser today, can we do that in the meantime while we figure things out?
05:57:37 <tidoust> ... The videos look very ugly if the browser does not support stereoscopic.
05:58:08 <tidoust> Mark_Foltz: I see the value in the use case for media capabilities for querying whether the browser can at least understand the stereoscopic metadata.
05:58:37 <tidoust> ... Most of my feedback is around the API shape, whether enum values can be implemented through various specs that deal with stereo.
05:58:49 <tidoust> ... I think we need more discussion there as to whether these are the right values.
05:58:57 <tidoust> ... or maybe start with a boolean.
05:59:18 <tidoust> ... And then the developer needs to work with the implementation to understand whether it supports all of the values.
05:59:32 <tidoust> Hubbe: Totally fine to continue the discussion on GitHub about the actual value.
05:59:46 <tidoust> cpn: The overall objective seems good. It's just figuring out how.
05:59:59 <tidoust> Eric_Cabanier: Should we also talk to the CSS WG?
06:00:08 <tidoust> cpn: It may be too early to do that.
06:00:34 <tidoust> ... That's the model we followed for HDR rendering, but I would suggest we reach to that conclusion first before we bring them in.
06:01:45 <tidoust> Subtopic: Interaction between Media Capabilities with WebCodecs
06:01:51 <tidoust> -> https://github.com/w3c/media-capabilities/issues/#202 Interaction between Media Capabilities with WebCodecs
06:02:17 <tidoust> Mark_Foltz: Two related APIs to query for codec support. WebCodecs has isConfigSupported().
06:02:29 <tidoust> ... The APIs are related but query different things.
06:02:43 <tidoust> ... The APIs have some overlap.
06:02:56 <tidoust> ... WebCodecs have this registry.
06:03:03 <tidoust> ... Media Capabilities is more open ended.
06:03:16 <tidoust> ... But you may want them to work together.
06:03:30 <tidoust> ... E.g., encode with WebCodecs and playback with WebRTC or MSE.
06:03:48 <tidoust> ... It would be nice to be able to use the same query in both APIs.
06:04:01 <tidoust> ... I checked. They're pretty much aligned.
06:04:31 <tidoust> ... WebCodecs has an unsigned long to describe audio channels, Media Capabilities has a DOMString (with an open issue to describe the format).
06:04:47 <tidoust> ... Also spatial capability query is different.
06:05:25 <tidoust> ... So I think we should resolve #73. Then have some working examples of how the two APIs can work together so we can devise a follow-up plan.
06:05:33 <tidoust> ... That's my very short version.
06:06:42 <tidoust> hta: At IETF, there's ongoing work for defining a mulit-channel codec. There may be prior art to copy from.
06:07:23 <cpn> Topic: Audio Session setDefaultSinkId
06:07:30 <SteveBeckerMicrosoft> Set default audio output device slides: https://docs.google.com/presentation/d/1t3aK1CuqyFO4ytWHubLeUiXFesRSuTYYWXlgxSC_TEE/edit?usp=sharing
06:07:43 <tidoust> -> https://github.com/w3c/audio-session/issues/6 Should AudioSession be able to specifiy the output speaker and/or route options (a la sinkId)?
06:07:54 <youenn> youenn has joined #mediawg
06:07:55 <tidoust> s/Set default audio output device slides:/slideset:
06:08:04 <sprang> sprang has joined #mediawg
06:08:32 <tidoust> [slide 1]
06:08:33 <tidoust> SteveBeckerMicrosoft: From Edge team. We talked about pausing media in iframe. This proposal builds on that
06:08:35 <tidoust> [slide 2]
06:09:18 <tidoust> SteveBeckerMicrosoft: [conferencing system with a.com and z.com]. No cooperation between the two sites, no way to change audio input.
06:09:23 <tidoust> [slide 3]
06:09:45 <tidoust> SteveBeckerMicrosoft: Existing API. selectorAudioOutput, enumerateDevices to select the input and output.
06:10:05 <tidoust> ... But the problem is that the top level cannot call setSinkId of its top level iframe.
06:10:11 <tidoust> s/iframe/window
06:10:19 <tidoust> [slide 4]
06:10:25 <tidoust> [slide 5]
06:10:32 <tidoust> SteveBeckerMicrosoft: We published an explainer in 2024.
06:10:43 <tidoust> ... We launched an experiment in April 2025.
06:10:59 <tidoust> ... Some feedback already. We'd like to continue to gather feedback.
06:11:14 <tidoust> ... We'd like to spec the API if there's support, not exactly clear where.
06:11:24 <tidoust> Youenn: Why not in the AudioSession API directly?
06:11:34 <tidoust> SteveBeckerMicrosoft: I think this is broader.
06:12:11 <tidoust> Youenn: AudioSession of the top level document would trickle down to its children, except if the iframe itself would override setSinkId()
06:12:23 <tidoust> alwu: Is it restricted to the top level page?
06:12:32 <tidoust> SteveBeckerMicrosoft: I think we restricted to the top level page only.
06:13:05 <tidoust> Youenn: It would be exposed to the web but it would not be supported and we should not worry about an iframe expecting support.
06:13:13 <Mark_Foltz> q+
06:13:23 <tidoust> ... We will want to expose this in AudioSession objects anyway.
06:13:23 <guidou> guidou has joined #mediawg
06:13:27 <guidou> q+
06:13:54 <tidoust> ... Either it's on MediaDevice or in AudioSession.
06:14:19 <tidoust> Youenn: AudioSession is implemented in Safari, main use case is on mobiles, but also exposed in desktop.
06:14:51 <tidoust> alwu: Wondering about integration afterwards in AudioSession. Done before?
06:15:49 <tidoust> Youenn: At some point in the future, we might want to construct AudioSession objects, linked to media elements.
06:16:11 <tidoust> ... That's something we cannot get with MediaDevices. Small difference, I agree.
06:16:54 <tidoust> Mark_Foltz: When a top level frame sets setSinkId, does the iframe get an event?
06:17:05 <tidoust> ... If there's a UI, that may cause an issue.
06:17:22 <cpn> ack M
06:17:29 <tidoust> ... The user may be confused that the audio may be coming through the speaker if the UI says otherwise.
06:17:39 <tidoust> Youenn: The iframe can always override.
06:17:48 <tidoust> Mark_Foltz: But if they don't get an event.
06:17:59 <tidoust> SteveBeckerMicrosoft: That's a good use case. We can file an issue.
06:18:18 <tidoust> ... In order to do that, you also need a permission policy.
06:18:25 <tidoust> ack guidou
06:18:53 <tidoust> guidou: AudioContext and media element has a sinkId. Do you change that property?
06:19:31 <tidoust> s|slideset: mark_foltz|https://docs.google.com/presentation/d/e/2PACX-1vT4gctxWcgjcsWk9mIm8M3cR7lwdlBTQlBRkGnVk1_RGG0H-g2mlmZp_89jfSjxoOfYbL-X1PKpOpzV/pub?start=false&loop=false&delayms=60000
06:20:12 <tidoust> Youenn: How do you get the deviceId in the top level frame?
06:20:19 <tidoust> ... selectDeviceOutput()?
06:20:24 <tidoust> SteveBeckerMicrosoft: That would work.
06:21:21 <tidoust> Youenn: [mentions no exposure of default speakers to top level]
06:21:54 <tidoust> ... The top level frame who wants to set setSinkId would not have the IDs.
06:22:04 <tidoust> ... That may require some changes to the media capture spec.
06:22:18 <tidoust> ... Microphone and camera access would need to be granted.
06:23:05 <tidoust> [discussion on cross-origin rules and implications for sinkId]
06:23:22 <tidoust> ?: It may just be that people interested in this already have that permission anyway.
06:23:29 <tidoust> Youenn: Yes and we are trying to converge.
06:23:56 <tidoust> SteveBeckerMicrosoft: Where would we like to spec this remains the biggest question.
06:25:47 <cpn> Topic: Media WG Registries
06:25:59 <cpn> Francois: We have a number of registries, in WebCodecs, MSE, and EME
06:26:34 <cpn> ... The Registry Track is relatively new. I'd hoped that other WGs would have progressed their Registries, so we can learn from their experience
06:26:49 <cpn> ... But we're one of the first to publish registries.
06:27:04 <cpn> .... We have a Draft Registry status. We can move to Candidate Registry, then Registry
06:27:18 <cpn> ... Shouldn't be too hard, as we don't plan to change the registry definitions
06:27:43 <cpn> ... We would be able to change the registry entries at any time, according to the requirements in the registry itself
06:28:16 <cpn> ... As a group, if we're fine with what the Draft Registry says, in terms of who the custodian is and how we review and approve additions, the next step is Wide Review
06:28:33 <cpn> ... Not sure what the review groups will review.
06:28:49 <cpn> I'd suggest we do them all at once, to ease the reviewers' workload
06:29:03 <cpn> s/I'd/... I'd/
06:29:17 <Mark_Foltz> q+
06:29:23 <cpn> Chris: This is the equivalent of moving to CR for specs
06:29:43 <cpn> Paul: Sounds good. We've exercised the WebCodecs registries a few times, to add entries
06:30:00 <cpn> ... So happy, for the WebCodecs ones, happy as editor
06:30:36 <cpn> Present+ Paul_Adenot
06:30:44 <Mark_Foltz> q-
06:30:49 <cpn> Mark: Does it trigger a call for exclusions?
06:31:02 <cpn> Francois: There are no patent implications as far as I know
06:34:47 <tidoust> Topic: Media Source Extensions
06:35:32 <tidoust> Subtopic: seekable range for a MediaSource with finite duration
06:35:54 <tidoust> -> https://github.com/w3c/media-source/issues/369 Model seekable range for a MediaSource with finite duration
06:36:26 <tidoust> cpn: When you're using MSE the seekable attribute returns a duration from 0 to the duration if the duration is finite.
06:36:38 <tidoust> ... For live, it returns infinity.
06:37:03 <tidoust> ... You may have a finite duration stream where the 0 time may no longer be seekable.
06:37:20 <tidoust> ... Events where we overrun the 24h capacity of our CDNs for example at the BBC.
06:37:55 <tidoust> ... My colleague who raised this issue is that JS players tend to ignore the seekable attribute in the media element and implement their own seekable time range logic.
06:38:08 <tidoust> ... I'm wondering whether this is something we should look to address in the spec in some way.
06:38:23 <tidoust> jya: I looked at the Webkit implementation.
06:38:30 <tidoust> ... I saw comments there.
06:38:57 <tidoust> ... It seems that the seekable range when the duration is set is the intersection of the buffer ranges of the media elements, and the seekable range on the media source itself.
06:39:09 <tidoust> ... It is really that it's continuous from 0 to duration?
06:39:59 <tidoust> ... Am I incorrect? I have the feeling that the problem raised here is not actually one.
06:40:26 <tidoust> jya: endOfStream does not make the range continuous.
06:41:17 <tidoust> ... I am not sure that there's an actual problem here.
06:41:36 <tidoust> cpn: Is seekable supposed to indicate what the user agent buffered?
06:42:05 <tidoust> jya: It does not mean that 0 is always seekable if the media source is ended.
06:42:22 <tidoust> cpn: Maybe we need more details or a test case to show where this is causing problems.
06:42:54 <tidoust> jya: seekable take a TimeRanges, and it does not have to be continuous.
06:42:59 <tidoust> s/take/takes
06:43:17 <tidoust> ... What the bug describes may be a particular user agent problem. I don't think that's what is in the spec.
06:43:40 <tidoust> ... Can I take more time to look more into it?
06:43:49 <tidoust> cpn: I think that would help, yes.
06:44:39 <tidoust> ... Also see the way JS players implement this timing properties.
06:45:13 <tidoust> jya: It seems to me that if implementations follow the spec, 0 may not be seekable.
06:45:35 <tidoust> cpn: That's the mismatch, difference between what the server can seek to and what the user agent says.
06:47:17 <tidoust> ... Maybe there's a question there about what implementations do, and whether that's per spec or whether the spec has issues.
06:48:26 <tidoust> ... I can check to see if it's just a mismatch in the spec or if we're seeing actual problems in implementations.
06:48:30 <tidoust> Subtopic: Detachable MediaSource
06:48:35 <tidoust> -> https://github.com/w3c/media-source/issues/357 Proposal: Have a detachable MediaSource object
06:49:22 <tidoust> cpn: We've had a number of calls that point to the need in the case when players want to avoid having to maintain two different media elements each with their own buffer.
06:49:38 <tidoust> ... I'm wondering whether Webkit has an implementation of this.
06:49:50 <tidoust> jya: Yes, behind a flag.
06:50:03 <tidoust> ... I let that rot a little bit.
06:50:25 <tidoust> ... I also have a series of tests that I've written, to be integrated in WPT.
06:50:35 <tidoust> ... The spec is simple to define what to do.
06:50:51 <tidoust> ... Maybe the next step would be to write a spec proposal?
06:51:12 <tidoust> cpn: Some worry about objects having to contain a large amount of data around.
06:51:28 <tidoust> jya: It will only be kept if the script keeps a reference to the objects.
06:52:10 <tidoust> ... We still have an issue around collecting media stream. Bug that was opened by Mozilla like 10 years ago.
06:52:17 <tidoust> ... I don't believe this to be a problem in practice.
06:52:40 <tidoust> cpn: Moving this forward into a proposal sounds like a valuable next step.
06:53:23 <tidoust> jya: For the underlying test, I avoided the never expiring issue through detach which is a very clear lifecycle.
06:55:56 <tidoust> Topic: Media Capabilities - Dolby Vision HDR Metadata
06:56:24 <tidoust> -> https://github.com/w3c/media-capabilities/issues/136 DolbyVision HDR Metadata
06:57:01 <tidoust> Mark_Foltz: Sites may want to know the availability of playback with DolbyVision HDR metadata. There are multiple MIME types.
06:57:17 <tidoust> ... I read through the issues and it wasn't clear what spec changes were requested.
06:57:46 <tidoust> ... There were some comments about a "cross compatibility ID".
06:57:59 <tidoust> ... The other question is how we try to get multiple implementations.
06:58:41 <tidoust> ... Compatibility queries are generally delegated to the OS/CDM.
06:59:12 <tidoust> ... I don't see major issues with this, but would like to see examples of how to parse the MIME types.
06:59:45 <tidoust> wschildbach: There are certain profiles of DolbyVision that are backward compatible.
07:00:02 <tidoust> ... Meaning that you can decode the same stream in different ways.
07:00:13 <tidoust> ... That's not the case in every profile.
07:00:34 <tidoust> ... The ask is that it's possible to query the system to know whether it can decode.
07:00:56 <tidoust> Mark_Foltz: So single stream which can be decoded either as SDR or HDR.
07:01:19 <tidoust> ... Would the profile be different? Would it have to be part of the query?
07:01:41 <tidoust> ... Would I be able to make a second query to know whether I can decode as SDR or HDR?
07:01:46 <tidoust> wschildbach: Unfortunately, no.
07:02:05 <tidoust> ... If you take dvh1 as an example.
07:02:31 <tidoust> ... The answer might be "yes", but you don't know at this point whether it can decode into a stream that supports Dolbyvision.
07:02:36 <tidoust> ... It's a property of the decoder.
07:03:47 <tidoust> ... Having an enum for HDR Metadata type for DolbyVision would solve it.
07:04:16 <tidoust> cpn: If the solution is to add an enumeration value, the question becomes what is that string.
07:04:28 <tidoust> ... We don't have an example of a vendor specific value.
07:04:43 <tidoust> ... The question is the criteria that we have to add an enum value.
07:05:19 <tidoust> wschildbach: If you have an enum, somewhere you need to specify what this enum means, with a reference to some specification.
07:05:30 <tidoust> Eric: Is there a publicly available specification?
07:05:48 <tidoust> wschildbach: There are documents that describe profiles, but it depends on what your expectations are.
07:05:59 <tidoust> Eric: It sounds that the answer is no.
07:06:23 <tidoust> ... That was the issue when we originally talked about this.
07:06:45 <tidoust> wschildbach: Why is it an issue?
07:07:00 <tidoust> Mark_Foltz: That is an issue with proprietary codecs in general.
07:07:51 <tidoust> ... For Media Capabilities, we need to know how to parse a MIME type and parameters. The minimum that needs to be public is that.
07:08:12 <tidoust> wschildbach: The user agent needs to understand what they need to do to query the underlying system, right?
07:08:19 <tidoust> Mark_Foltz: Yes.
07:08:32 <tidoust> wschildbach: OK. Not public today, but could perhaps be made public in the future.
07:08:47 <tidoust> ... There's still going to be something that is specific to the implementations in practice.
07:09:09 <tidoust> ... If the implementation knows how to talk to codecs in Windows, that should work.
07:09:36 <tidoust> Mark_Foltz: As far as the API shape, I'm still looking at what's missing from the current API to enable what's missing at all.
07:10:00 <tidoust> ... HDR Metadata, and then a type fallback.
07:10:16 <tidoust> wschildbach: No, there is no additional thing that someone would need to query for the MIME type.
07:10:41 <tidoust> Mark_Foltz: Are all DolbyVision streams going to be represented by these MIME types.
07:10:54 <tidoust> wschildbach: These are the only ones right now in the scope of this discussion.
07:11:41 <tidoust> ... If a codec returns "can decode", it doesn't imply that it can decode DolbyVision. It may decode HDR10.
07:12:22 <tidoust> Mark_Foltz: So you need a query that can tell whether you can support DolbyVision without a fallback.
07:12:33 <tidoust> ... I think I understand the problem more.
07:12:44 <tidoust> ... I can take that back internally.
07:12:50 <tidoust> RRSAgent, draft minutes
07:12:52 <RRSAgent> I have made the request to generate https://www.w3.org/2025/11/14-mediawg-minutes.html tidoust
08:38:36 <Zakim> Zakim has left #mediawg