15:58:37 RRSAgent has joined #mediawg
15:58:41 logging to https://www.w3.org/2025/04/30-mediawg-irc
15:58:42 Zakim has joined #mediawg
16:02:10 RRSAgent, make logs public
16:02:30 present+ Chris_Needham, Eric_Carlson, Mark_Foltz, Francois_Daoust, Alastor_Wu, Youenn_Fablet, Tommy_Steimel
16:02:52 Meeting: Media WG meeting
16:02:55 Chair: Chris
16:03:04 Agenda: https://github.com/w3c/media-wg/blob/main/meetings/2025-04-30-Media_Working_Group_Teleconference-agenda.md
16:03:17 present+ Sunggook_Chue
16:05:06 Topic: Media Capabilities #231
16:05:12 xhwang has joined #mediawg
16:05:25 Mark: https://github.com/w3c/media-capabilities/pull/231
16:05:53 Mark: I created this PR in response to issue 152, where a site wants to query for decoding support with parameters that include color gamut and transfer function
16:06:12 ... The discussion was about updating the steps in the spec to take these into account when they don't match the MIME type
16:06:38 ... If the passed MIME type isn't compatible with the color gamut and transfer function parameters, we want to return unsupported
16:06:55 ... We rewrote the steps to check MIME type support in #222
16:07:00 ... There are some details to check on
16:07:43 ... I think these parameters only matter when decoding video. The way the steps are written, they're passed as extra input to Check MIME Support, but they're undefined unless you're checking for video support
16:08:00 ... That seemed the cleanest way to do it, otherwise you'd have to fork the steps
16:08:39 ... The second thing is: I now realise this doesn't cover HDR metadata. I don't know if this was intentionally omitted from the discussion, or an oversight. It could easily be added, with the other parameters
16:09:08 ... Finally, there's one test case that checks for the mismatch between color gamut and MIME type. It passes in Chrome and Edge but not in Safari or Firefox
16:09:22 ... If we land this I'd want to add tests for all three parameters
16:09:44 ... I think the PR is good to land, unless someone feels strongly they want HDR metadata included
16:09:48 ... I'd like review feedback
16:10:23 Jer: I can have a look at the PR, to see if I still have comments
16:10:57 ... WebKit already does some validation of the MIME type to see it matches other parameters, e.g., height and width, so this is in the same category
16:11:14 q+
16:11:22 Mark: That seems to be how the discussion concluded. I'd also want feedback on whether the steps make sense in spec language
16:11:32 ... I can look into the test failures too
16:11:38 Jer: I can look too
16:12:31 Francois: Some of the checks look at the color gamut for the MIME type. Is there a way to make that clearer, from an interop point of view? Does it mean you need to check the codec spec for the MIME type? Is there a risk of different interpretations?
16:13:19 Mark: Codecs inherently have support for one or more color gamuts. That hopefully is expressed through profile arguments in the MIME type. So you'd have to refer to individual codec specs to know what's valid
16:14:08 Jer: To expand on that: one case is that the codec doesn't support a color gamut or transfer function. Another is if parsing the MIME type is somehow in conflict with other parameters passed in
16:14:31 Mark: VP9 has some profile info to say what the color space is
16:15:35 Youenn: A question about MIME type validation. WebRTC is looking at this for MediaRecorder
16:16:25 ... Would it be a good approach to reference MIME type parsing from the Media Capabilities API?
16:16:45 Mark: You can query using the 'record' type
16:16:56 Chris: The algorithms could well be reusable, if not exported already
16:17:17 nigel has joined #mediawg
16:17:29 Youenn: So we could call the algorithm from MediaRecorder, and reject if not supported? Are there hooks in the spec for that?
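[Editor's note: a minimal sketch of the kind of query discussed above, a Media Capabilities decoding query that includes colorGamut and transferFunction. The configuration shape follows the Media Capabilities spec; the specific codec string and parameter values are illustrative assumptions, not taken from the PR.]

```javascript
// Build a MediaDecodingConfiguration that carries the colorGamut and
// transferFunction fields discussed in media-capabilities PR #231.
// Values here are illustrative: VP9 Profile 2 (10-bit), BT.2020, PQ.
function buildHdrDecodingConfig() {
  return {
    type: "media-source",
    video: {
      contentType: 'video/webm; codecs="vp09.02.10.10"',
      width: 3840,
      height: 2160,
      bitrate: 20000000,
      framerate: 60,
      colorGamut: "rec2020",
      transferFunction: "pq",
    },
  };
}

// In a browser (not runnable here), the query would be:
//   const info = await navigator.mediaCapabilities.decodingInfo(buildHdrDecodingConfig());
//   // Per the PR, a MIME type incompatible with colorGamut/transferFunction
//   // should yield info.supported === false.
```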
16:18:12 Mark: There's an xxx algorithm
16:18:21 Youenn: I'll file an issue and tag you, Mark
16:18:56 Present+ Nigel
16:19:25 Topic: Media Session #358
16:19:49 Tommy: When we added the enterpictureinpicture event, I suggested adding a flag to know why it was triggered
16:20:06 ... We didn't have use cases at the time, so we omitted it for launch
16:20:26 ... But we found sites do have a use, some way to distinguish between manually and automatically triggered PiP
16:20:57 ... There are web developers who want more info than that. Right now Chrome only automatically opens PiP on a tab switch, but we're thinking about other scenarios, such as minimising
16:21:07 ... So we might want an enum rather than a boolean
16:21:29 ... Youenn asked about a more declarative way
16:22:08 ... I want to propose two things. Some automatic PiP API you can turn on or off, and on a video element
16:22:23 ... Additionally, in the Media Session action details, have a reason enum to say why it's triggered
16:23:00 ... If a website wants auto-PiP on, they want to know the reason. Developers want both
16:23:34 Youenn: I think a declarative autoplay policy is something we could consider. Being able to decide in the action handler whether to auto-PiP or not isn't something we may be able to implement
16:23:52 ... Having a way to declare they want auto-PiP is fine. It would just be a preference, and the UA decides
16:24:23 ... Having a boolean in the details is useful for statistics, to understand what the user is doing, but not for deciding whether to auto-PiP or not?
16:24:44 Tommy: I think having both makes sense, both for stats, and for sites to decide on the fly so they don't set a boolean over and over
16:24:58 Youenn: And not auto-PiPping would stop playing the video?
16:25:28 ... If they want to decide on the fly, it means on iOS the auto-PiP would start, and the video playback would stop. Is that good for the user?
16:25:37 Tommy: Could they close?
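[Editor's note: a hedged sketch of Tommy's proposal above, an "enterpictureinpicture" action handler whose details carry a reason enum. The field name `reason` and the enum values ("user", "tab-switch", "minimize") are hypothetical placeholders from this discussion, not spec'd names.]

```javascript
// Interpret the hypothetical `reason` field of the proposed
// MediaSessionActionDetails for "enterpictureinpicture".
// Per the discussion, the field would be optional, so the default
// branch handles UAs (e.g. iOS) that don't report a reason.
function describeEnterPipReason(details) {
  switch (details && details.reason) {
    case "user":       return "user explicitly requested PiP";
    case "tab-switch": return "auto-PiP on tab switch";
    case "minimize":   return "auto-PiP on window minimize";
    default:           return "reason not reported";
  }
}

// In a browser (not runnable here), registration would look like:
//   navigator.mediaSession.setActionHandler("enterpictureinpicture", (details) => {
//     console.log(describeEnterPipReason(details));
//     video.requestPictureInPicture();
//   });
```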
16:25:51 Jer: There's a distinction between closing and returning to inline from PiP
16:26:07 Tommy: So in the iOS case their only option would be to turn it off in advance
16:26:34 ... Do you oppose adding the details? Is having declarative more important?
16:27:03 Youenn: We haven't discussed in detail internally. The declarative might be higher priority. We can discuss and comment on the issue
16:27:18 Tommy: I also need to update the issue based on discussion with web developers
16:27:46 Jer: Whether or not the details is added, I don't think it would be possible for us to implement it given the architecture of PiP on iOS, so it wouldn't be something we expose or use
16:28:00 ... So as long as the spec doesn't require those values to be used...
16:28:21 Tommy: If you open an auto-PiP, I don't think it's required to call the enter-pip action. So you can avoid it if you want
16:28:35 Youenn: I agree, the dictionary has attributes that aren't required
16:29:05 Tommy: With iOS, when would you call the enter-pip handler?
16:29:52 Jer: I don't think there's anywhere we expose an enter-pip. I don't want to preclude it in the future
16:30:18 Tommy: I'm fine with it being optional, and happy to have both declarative and the extra details for websites to make their UX user friendly
16:30:47 Jer: Declarative is higher priority because the iOS auto-PiP is already declarative. You have to say which things will go into auto-PiP
16:31:20 Tommy: For the declarative approach, should we make it part of Media Session, or something on Window or the video element maybe?
16:31:31 Jer: Good question, will need to think about it
16:32:43 Tommy: I can capture this in the issue
16:32:56 Chris: Suggest raising an issue against the PiP spec for the declarative part
16:33:20 Youenn: Agree. Could be a boolean, but having a way to select which element to auto-PiP
16:33:41 Jer: If you make it an attribute on the media element that reflects to a DOM property, it's the most declarative approach
16:34:28 ... PiP is a shared resource, one at a time. So you may have a situation where you reduce but don't eliminate ambiguity. So having it on Media Session, making it a choice which element is auto-PiPped, is more declarative
16:35:10 Tommy: If all we have is a way to say "this video element is the one", that's not enough. We have document PiP, so there may not be a video element
16:35:17 ... I'll file the issue
16:35:39 Topic: Audio Session #6
16:35:42 Chris: https://github.com/w3c/audio-session/issues/6
16:36:08 Chris: Should AudioSession be able to specify the output speaker and/or route options?
16:37:16 Sunggook: Last time we discussed making this part of Audio Session. But every frame could have its own audio session, so this proposal is making it global
16:37:44 ... Currently we have setSinkId, and it globally changes the output device for the top level and all sub-frames
16:37:56 ... It can be called from any iframe
16:38:39 ... Question at the time: Do we need a new permission to call setSinkId? setSinkId is per AudioContext or element
16:39:38 ... Second issue is iframes, children and siblings. Do we support the vertical tree only? Proposal: allow it to be called from the top frame only, or from same-origin frames
16:40:44 Chris: Is this the same as discussed last time? Top level + permission policy to allow calling from iframes
16:41:15 Sunggook: Use the existing speaker-selection permission? Call from top level and from same-origin iframes
16:41:46 Youenn: I'm not sure whether we want the whole tree or the vertical tree. For the top-frame only, we don't have to decide...
16:42:18 ... I think it's fine to set at top level, but we should discuss whether it's the whole tree or the vertical tree
16:42:28 ... I'm not sure I like any iframe being able to set the output for the whole page
16:42:45 ... We could allow the API call for the top frame only, and for sub-frames say not supported or not allowed
16:43:04 Sunggook: That would be clearer in this case
16:43:37 ... Assuming we only allow calls from the top frame, where do we put the API?
16:44:33 Youenn: No preference. For the vertical tree, I prefer to have the mechanism in Audio Session, but it's not clear there yet whether it's the vertical or the whole tree. It could be in Audio Session or in the Audio Output spec (in WebRTC)
16:44:49 Sunggook: Can there be multiple audio sessions in a single frame?
16:45:15 Youenn: Not currently possible. We could have an audio session constructor, to let you tie different audio producers to different audio sessions
16:45:32 ... We'd need web developer input to do that work
16:46:28 Youenn: Ok to discuss here, but we'll need to discuss with WebRTC
16:47:30 Nigel: Is this a declarative API? There could be calls at different times at different points in the hierarchy, so which takes precedence? Having a declarative model could help
16:47:49 Sunggook: It's a global API, so the latest call takes over
16:48:10 Nigel: Is that good for users? It can be confusing if the order in which you do things becomes important
16:48:58 Sunggook: That depends on the developer providing multiple ways to confuse the user, by calling at different times. This is like getUserMedia
16:49:20 ... Hence the discussion today about restricting to the top frame
16:50:17 Nigel: The use case in the explainer includes routing different audio to different devices. Seems difficult to set up; a declarative model might be clearer and easier to manipulate
16:51:13 Sunggook: In the explainer there's an example with the web page and a native player both playing audio. So there's a different default device for all pages
16:51:47 Nigel: A related example is an accessibility use case. You might have a group of people watching the same programme, where someone wants to hear it with audio description mixed in and others don't
16:52:24 Sunggook: I think that's already supported through setSinkId. The audio element or audio context can have its own sink id
16:53:02 Nigel: What if someone changes at the global level, to override a specific setting elsewhere?
16:54:10 Sunggook: If someone chooses a specific output, this global API wouldn't override it. They use the default output from the iframe. So there's an existing setSinkId, that any frame can use. The global API doesn't affect them
16:54:31 Nigel: I may need to do some more reading
16:54:47 Sunggook: Please file an issue, we can continue to discuss
16:55:50 Chris: Summary?
16:56:35 Youenn: There's some consensus that going with top-frame only in the short term is fine
16:57:04 ... Two questions to address: Which spec should these go in? And in future, should we define the API in terms of the vertical tree or the whole tree?
16:57:41 ... For the second issue, a use case to check is WebRTC solutions in websites that provide video calls. They're in an iframe. What do they want to do? Route only their own audio, or route the whole page audio?
16:58:28 ... It could depend on where the device picker is: is it in their hands, or are they only controlling the rendering and sending of audio?
16:59:15 Chris: And continue the discussion here before taking it to the WebRTC WG
17:00:36 Topic: Next meeting
17:01:01 Chris: Our next call is in 2 weeks, at the later time
17:01:03 [adjourned]
17:01:13 rrsagent, draft minutes
17:01:14 I have made the request to generate https://www.w3.org/2025/04/30-mediawg-minutes.html cpn
17:01:21 rrsagent, make log public
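[Editor's note: a hypothetical sketch of the top-frame restriction discussed under the Audio Session topic, where the proposed global output-routing API would be allowed from the top-level frame, or, per Sunggook's proposal, also from same-origin iframes. The function name, inputs, and policy are illustrative only; nothing here is spec'd.]

```javascript
// Decide whether a frame may call the proposed global output-device API.
// `frame` is a plain object standing in for browsing-context state:
//   { isTop: boolean, origin: string, topOrigin: string }
// Per the discussion, calls from cross-origin sub-frames would be
// rejected as "not allowed" rather than silently applied.
function mayCallGlobalSetSinkId(frame) {
  if (frame.isTop) return true;              // top-level frame: always allowed
  return frame.origin === frame.topOrigin;   // same-origin iframe: allowed per proposal
}
```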