15:58:37 RRSAgent has joined #mediawg
15:58:41 logging to https://www.w3.org/2025/04/30-mediawg-irc
15:58:42 Zakim has joined #mediawg
16:02:10 RRSAgent, make logs public
16:02:30 present+ Chris_Needham, Eric_Carlson, Mark_Foltz, Francois_Daoust, Alastor_Wu, Youenn_Fablet, Tommy_Steimel
16:02:52 Meeting: Media WG meeting
16:02:55 Chair: Chris
16:03:04 Agenda: https://github.com/w3c/media-wg/blob/main/meetings/2025-04-30-Media_Working_Group_Teleconference-agenda.md
16:03:17 present+ Sunggook_Chue
16:05:06 Topic: Media Capabilities #231
16:05:12 xhwang has joined #mediawg
16:05:25 Mark: https://github.com/w3c/media-capabilities/pull/231
16:05:53 Mark: I created this PR in response to issue 152, where a site wants to query for decoding support with parameters that include color gamut and transfer function
16:06:12 ... The discussion was about updating the steps in the spec to take these into account when they don't match the MIME type
16:06:38 ... If the passed MIME type isn't compatible with the color gamut and transfer function parameters, we want to return unsupported
16:06:55 ... We rewrote the steps to check MIME type support in #222
16:07:00 ... There are some details to check on
16:07:43 ... I think these parameters only matter when decoding video. The way the steps are written, they're passed as extra input to Check MIME Support, but they're undefined unless you're checking for video support
16:08:00 ... That seemed the cleanest way to do it, otherwise you'd have to fork the steps
16:08:39 ... The second thing is: I now realise this doesn't cover HDR metadata. I don't know if this was intentionally omitted from the discussion, or an oversight. It could easily be added, with the other parameters
16:09:08 ... Finally, there's one test case that checks for the mismatch between color gamut and MIME type. It passes in Chrome and Edge but not in Safari or Firefox
16:09:22 ... If we land this I'd want to add tests for all three parameters
16:09:44 ... I think the PR is good to land, unless someone feels strongly they want HDR metadata included
16:09:48 ... I'd like review feedback
16:10:23 Jer: I can have a look at the PR, to see if I still have comments
16:10:57 ... WebKit already does some validation of the MIME type to see it matches other parameters, e.g., height and width, so this is in the same category
16:11:14 q+
16:11:22 Mark: That seems to be how the discussion concluded. I'd also want feedback on whether the steps make sense in spec language
16:11:32 ... I can look into the test failures too
16:11:38 Jer: I can look too
16:12:31 Francois: Some of the checks look at the color gamut for the MIME type. Is there a way to make that clearer, from an interop point of view? Does it mean you need to check the codec spec for the MIME type? Is there a risk of different interpretations?
16:13:19 Mark: Codecs inherently have support for one or more color gamuts. That hopefully is expressed through profile arguments in the MIME type. So you'd have to refer to individual codec specs to know what's valid
16:14:08 Jer: To expand on that: one case is that the codec doesn't support a color gamut or transfer function. Another is if parsing the MIME type is somehow in conflict with other parameters passed in
16:14:31 Mark: VP9 has some profile info to say what the color space is
16:15:35 Youenn: A question about MIME type validation. WebRTC is looking at this for MediaRecorder
16:16:25 ... Would it be a good approach to reference MIME type parsing from the Media Capabilities API?
16:16:45 Mark: You can query using the 'record' type
16:16:56 Chris: The algorithms could well be reusable, if not exported already
16:17:17 nigel has joined #mediawg
16:17:29 Youenn: So we could call the algorithm from MediaRecorder, and reject if not supported? Are there hooks in the spec for that?
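[Editor's note: a minimal sketch of the kind of query discussed above, a Media Capabilities decoding query that includes colorGamut and transferFunction. The configuration shape follows the Media Capabilities spec; the specific codec string and parameter values are illustrative assumptions, not taken from the PR.]

```javascript
// Build a MediaDecodingConfiguration that carries the colorGamut and
// transferFunction fields discussed in media-capabilities PR #231.
// Values here are illustrative: VP9 Profile 2 (10-bit), BT.2020, PQ.
function buildHdrDecodingConfig() {
  return {
    type: "media-source",
    video: {
      contentType: 'video/webm; codecs="vp09.02.10.10"',
      width: 3840,
      height: 2160,
      bitrate: 20000000,
      framerate: 60,
      colorGamut: "rec2020",
      transferFunction: "pq",
    },
  };
}

// In a browser (not runnable here), the query would be:
//   const info = await navigator.mediaCapabilities.decodingInfo(buildHdrDecodingConfig());
//   // Per the PR, a MIME type incompatible with colorGamut/transferFunction
//   // should yield info.supported === false.
```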
16:18:12 Mark: There's an xxx algorithm
16:18:21 Youenn: I'll file an issue and tag you, Mark
16:18:56 Present+ Nigel
16:19:25 Topic: Media Session #358
16:19:49 Tommy: When we added the enterpictureinpicture event, I suggested adding a flag to know why it was triggered
16:20:06 ... We didn't have use cases at the time, so we omitted it for launch
16:20:26 ... But we found sites do have a use, some way to distinguish between manually and automatically triggered PiP
16:20:57 ... There are web developers who want more info than that. Right now Chrome only automatically opens PiP on a tab switch, but we're thinking about other scenarios, such as minimising
16:21:07 ... So we might want an enum rather than a boolean
16:21:29 ... Youenn asked about a more declarative way
16:22:08 ... I want to propose two things. Some automatic PiP API you can turn on or off, and on a video element
16:22:23 ... Additionally, in the Media Session action details, have a reason enum to say why it's triggered
16:23:00 ... If a website wants auto-PiP on, they want to know the reason. Developers want both
16:23:34 Youenn: I think a declarative autoplay policy is something we could consider. Being able to decide in the action handler whether to auto-PiP or not isn't something we may be able to implement
16:23:52 ... Having a way to declare they want auto-PiP is fine. It would just be a preference, and the UA decides
16:24:23 ... Having a boolean in the details is useful for statistics, to understand what the user is doing, but not for deciding whether to auto-PiP or not?
16:24:44 Tommy: I think having both makes sense, both for stats, and for sites to decide on the fly so they don't set a boolean over and over
16:24:58 Youenn: And not auto-PiPping would stop playing the video?
16:25:28 ... If they want to decide on the fly, it means on iOS the auto-PiP would start, and the video playback would stop. Is that good for the user?
16:25:37 Tommy: Could they close?
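[Editor's note: a hedged sketch of Tommy's proposal above, an "enterpictureinpicture" action handler whose details carry a reason enum. The field name `reason` and the enum values ("user", "tab-switch", "minimize") are hypothetical placeholders from this discussion, not spec'd names.]

```javascript
// Interpret the hypothetical `reason` field of the proposed
// MediaSessionActionDetails for "enterpictureinpicture".
// Per the discussion, the field would be optional, so the default
// branch handles UAs (e.g. iOS) that don't report a reason.
function describeEnterPipReason(details) {
  switch (details && details.reason) {
    case "user":       return "user explicitly requested PiP";
    case "tab-switch": return "auto-PiP on tab switch";
    case "minimize":   return "auto-PiP on window minimize";
    default:           return "reason not reported";
  }
}

// In a browser (not runnable here), registration would look like:
//   navigator.mediaSession.setActionHandler("enterpictureinpicture", (details) => {
//     console.log(describeEnterPipReason(details));
//     video.requestPictureInPicture();
//   });
```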
16:25:51 Jer: There's a distinction between closing and returning to inline from PiP
16:26:07 Tommy: So in the iOS case their only option would be to turn it off in advance
16:26:34 ... Do you oppose adding the details? Is having declarative more important?
16:27:03 Youenn: We haven't discussed in detail internally. The declarative might be higher priority. We can discuss and comment on the issue
16:27:18 Tommy: I also need to update the issue based on discussion with web developers
16:27:46 Jer: Whether or not the details is added, I don't think it would be possible for us to implement it given the architecture of PiP on iOS, so it wouldn't be something we expose or use
16:28:00 ... So as long as the spec doesn't require those values to be used...
16:28:21 Tommy: If you open an auto-PiP, I don't think it's required to call the enter-pip action. So you can avoid it if you want
16:28:35 Youenn: I agree, the dictionary has attributes that aren't required
16:29:05 Tommy: With iOS, when would you call the enter-pip handler?
16:29:52 Jer: I don't think there's anywhere we expose an enter-pip. I don't want to preclude it in the future
16:30:18 Tommy: I'm fine with it being optional, and happy to have both declarative and the extra details for websites to make their UX user friendly
16:30:47 Jer: Declarative is higher priority because the iOS auto-PiP is already declarative. You have to say which things will go into auto-PiP
16:31:20 Tommy: For the declarative approach, should we make it part of Media Session, or something on Window or the video element maybe?
16:31:31 Jer: Good question, will need to think about it
16:32:43 Tommy: I can capture this in the issue
16:32:56 Chris: Suggest raising an issue against the PiP spec for the declarative part
16:33:20 Youenn: Agree. Could be a boolean, but having a way to select which element to auto-PiP
16:33:41 Jer: If you make it an attribute on the media element that reflects to a DOM property, it's the most declarative approach
16:34:28 ... PiP is a shared resource, one at a time. So you may have a situation where you reduce but don't eliminate ambiguity. So having it on Media Session, making it a choice which element is auto-PiPped, is more declarative
16:35:10 Tommy: If all we have is a way to say "this video element is the one", that's not enough. We have document PiP, so there may not be a video element
16:35:17 ... I'll file the issue
16:35:39 Topic: Audio Session #6
16:35:42 Chris: https://github.com/w3c/audio-session/issues/6
16:36:08 Chris: Should AudioSession be able to specify the output speaker and/or route options?
16:37:16 Sunggook: Last time we discussed making this part of Audio Session. But every frame could have its own audio session, so this proposal is making it global
16:37:44 ... Currently we have setSinkId, and it globally changes the output device for the top level and all sub-frames
16:37:56 ... It can be called from any iframe
16:38:39 ... Question at the time: Do we need a new permission to call setSinkId? setSinkId is per AudioContext or element
16:39:38 ... Second issue is iframes, children and siblings. Do we support the vertical tree only? Proposal: allow it to be called from the top frame only, or from same-origin frames
16:40:44 Chris: Is this the same as discussed last time? Top level + permission policy to allow calling from iframes
16:41:15 Sunggook: Use the existing speaker-selection permission? Call from top level and from same-origin iframes
16:41:46 Youenn: I'm not sure whether we want the whole tree or the vertical tree. For the top-frame only, we don't have to decide...
16:42:18 ... I think it's fine to set at top level, but we should discuss whether it's the whole tree or the vertical tree
16:42:28 ... I'm not sure I like any iframe being able to set the output for the whole page
16:42:45 ... We could allow the API call for the top frame only, and for sub-frames say not supported or not allowed
16:43:04 Sunggook: That would be clearer in this case
16:43:37 ... Assuming we only allow calls from the top frame, where do we put the API?
16:44:33 Youenn: No preference. For the vertical tree, I prefer to have the mechanism in Audio Session, but it's not clear there yet whether it's the vertical or the whole tree. It could be in Audio Session or in the Audio Output spec (in WebRTC)
16:44:49 Sunggook: Can there be multiple audio sessions in a single frame?
16:45:15 Youenn: Not currently possible. We could have an audio session constructor, to let you tie different audio producers to different audio sessions
16:45:32 ... We'd need web developer input to do that work
16:46:28 Youenn: Ok to discuss here, but we'll need to discuss with WebRTC
16:47:30 Nigel: Is this a declarative API? There could be calls at different times at different points in the hierarchy, so which takes precedence? Having a declarative model could help
16:47:49 Sunggook: It's a global API, so the latest call takes over
16:48:10 Nigel: Is that good for users? It can be confusing if the order in which you do things becomes important
16:48:58 Sunggook: That depends on the developer providing multiple ways to confuse the user, by calling at different times. This is like getUserMedia
16:49:20 ... Hence the discussion today about restricting to the top frame
16:50:17 Nigel: The use case in the explainer includes routing different audio to different devices. Seems difficult to set up; a declarative model might be clearer and easier to manipulate
16:51:13 Sunggook: In the explainer there's an example with the web page and a native player both playing audio. So there's a different default device for all pages
16:51:47 Nigel: A related example is an accessibility use case. You might have a group of people watching the same programme, where someone wants to hear it with audio description mixed in and others don't
16:52:24 Sunggook: I think that's already supported through setSinkId. The audio element or audio context can have its own sink id
16:53:02 Nigel: What if someone changes at the global level, to override a specific setting elsewhere?
16:54:10 Sunggook: If someone chooses a specific output, this global API wouldn't override it. They use the default output from the iframe. So there's an existing setSinkId, that any frame can use. The global API doesn't affect them
16:54:31 Nigel: I may need to do some more reading
16:54:47 Sunggook: Please file an issue, we can continue to discuss
16:55:50 Chris: Summary?
16:56:35 Youenn: There's some consensus that going with top-frame only in the short term is fine
16:57:04 ... Two questions to address: Which spec should these go in? And in future, should we define the API in terms of the vertical tree or the whole tree?
16:57:41 ... For the second issue, a use case to check is WebRTC solutions in websites that provide video calls. They're in an iframe. What do they want to do? Route only their own audio, or route the whole page audio?
16:58:28 ... It could depend on where the device picker is: is it in their hands, or are they only controlling the rendering and sending of audio?
16:59:15 Chris: And continue the discussion here before taking it to the WebRTC WG
17:00:36 Topic: Next meeting
17:01:01 Chris: Our next call is in 2 weeks, at the later time
17:01:03 [adjourned]
17:01:13 rrsagent, draft minutes
17:01:14 I have made the request to generate https://www.w3.org/2025/04/30-mediawg-minutes.html cpn
17:01:21 rrsagent, make log public
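[Editor's note: a hypothetical sketch of the top-frame restriction discussed under the Audio Session topic, where the proposed global output-routing API would be allowed from the top-level frame, or, per Sunggook's proposal, also from same-origin iframes. The function name, inputs, and policy are illustrative only; nothing here is spec'd.]

```javascript
// Decide whether a frame may call the proposed global output-device API.
// `frame` is a plain object standing in for browsing-context state:
//   { isTop: boolean, origin: string, topOrigin: string }
// Per the discussion, calls from cross-origin sub-frames would be
// rejected as "not allowed" rather than silently applied.
function mayCallGlobalSetSinkId(frame) {
  if (frame.isTop) return true;              // top-level frame: always allowed
  return frame.origin === frame.topOrigin;   // same-origin iframe: allowed per proposal
}
```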