Media WG meeting – 16 May 2023

Meeting minutes

Media Session

Tommy: I'd like to propose a new Media Session Action, enterpictureinpicture
… There's value in having a MS action to delegate to the website, e.g., if there's more than one video, or if the website uses a canvas-backed video
… I don't think we need exitpictureinpicture, as the UA can close any PiP window without going through the website
… Chrome has UI the user can click. We're thinking of also having a way to automatically enter PiP when switching between tabs
… Two possibilities: have a boolean or enum to distinguish auto PiP from an explicit user request, or maybe two separate actions
… Any feedback on which might make more sense?
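For illustration, a minimal sketch of how a page might handle the proposed action with a boolean-or-enum style context value. Only the action name `enterpictureinpicture` comes from the proposal; the `reason` field, its values `"user"` and `"auto"`, and the helper names `registerEnterPip`, `pickVideo` and `allowAuto` are hypothetical. The interfaces are structural stand-ins for `navigator.mediaSession` and a video element so the sketch is self-contained.

```typescript
// Hypothetical context value: the minutes only discuss "a boolean or enum".
type EnterPipReason = "user" | "auto";

interface EnterPipDetails {
  action: "enterpictureinpicture"; // action name from the proposal
  reason?: EnterPipReason;         // hypothetical field, not agreed in the meeting
}

// Minimal stand-in for navigator.mediaSession.
interface MediaSessionLike {
  setActionHandler(
    action: string,
    handler: ((details: EnterPipDetails) => void) | null
  ): void;
}

// Minimal stand-in for an element that can enter PiP.
interface PipCapableVideo {
  requestPictureInPicture(): Promise<unknown>;
}

// Registers a handler that lets the page choose which video enters PiP
// (e.g. when there is more than one) and optionally opt out of auto PiP.
function registerEnterPip(
  session: MediaSessionLike,
  pickVideo: () => PipCapableVideo | undefined,
  allowAuto: boolean
): void {
  session.setActionHandler("enterpictureinpicture", (details) => {
    if (details.reason === "auto" && !allowAuto) return; // page blocks auto PiP only
    const video = pickVideo();
    if (video) {
      video.requestPictureInPicture().catch(() => {
        // Entering PiP can fail (e.g. not allowed); ignored in this sketch.
      });
    }
  });
}
```

In a browser, the call might look like `registerEnterPip(navigator.mediaSession, () => mainVideo, true)`, where `mainVideo` is whichever element the site considers primary.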

Jer: I wonder whether the information you're trying to convey is already available from the Visibility API
… In the case of auto PiP behaviour?

Tommy: But the page might also not be visible, so that may not be enough

Jer: So it's a boundary between visibility and not visibility, and a user action in the UA UI somehow?
… Would a site want to block auto pip but enable pip more generally?
… iPadOS has an auto pip feature where any full screen video goes to PiP
… I haven't seen a request to disable that, except for disabling PiP entirely
… Not an argument against putting context info in the action handler
… From my experience as an implementer, it's all or nothing, you either want it or not
… Do you want a boolean or an enum with reasons?

Tommy: I don't foresee other reasons to need an enum

Jer: They're both user actions, so how to distinguish in an enum? One way out is to not provide context, unless there's a case where someone wants to prevent PiP in one scenario and not the other
… Suggest considering what the enum values might be, a signal for why the pip started

Tommy: So you'd prefer that to a separate action?

Jer: I think so
… When auto-PiP starts, for Safari, there's no time to enter PiP, the user is swiping, entry to PiP has to be simultaneous with other animations
… Worry that won't complete in time. Could we implement without breaking the animation? Shouldn't block the media session action for enter PiP
… We might not be able to implement for the same cases as Chrome

Chris: When would Chrome do auto enter PiP?

Tommy: We have global controls for manual use, and we're talking about when switching tabs would be useful for users

Chris: So follow up is to look at the enumeration, and what to put in the media session action details

EME

Xiaohan: This is about mixed encrypted and unencrypted content
… There's been lots of discussion, I made a recommendation in the GH issue, so want feedback
… w3c/encrypted-media#251
… Right now the spec is unclear, may be a quality of implementation issue
… It's used with ad insertion where ads are unencrypted
… On Disney+, they show a logo then switch to encrypted, which can cause a failure
… The media pipeline doesn't know it will be encrypted later, so when we have a clear pipeline it might be harder to switch to an encrypted pipeline
… Can handle seamlessly in Chrome, but there are cases in hardware where it's harder to do the switch
… My recommendation is if the site knows it will be mixed content, it should set the media keys up front, so the UA knows to set up the correct pipeline
… Doesn't need license exchange, get the keys and set on the media element, then there's a strong hint what to do
… If media keys are set before playback, it should support mixed playback, otherwise it's a quality of implementation issue
… Then we can advise sites what to do. Improve compatibility / interop as well
… Any comments?
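For illustration, a minimal sketch of the recommended ordering: attach media keys to the media element before any playback starts, without doing a license exchange yet. The call names `requestMediaKeySystemAccess`, `createMediaKeys`, `setMediaKeys` and `play` are existing EME / media element methods; the `...Like` interfaces are structural stand-ins so the sketch is self-contained, and the helper name `prepareAndPlay` is hypothetical.

```typescript
// Structural stand-ins for the EME objects involved (assumptions for this sketch).
type MediaKeysLike = object;

interface KeySystemAccessLike {
  createMediaKeys(): Promise<MediaKeysLike>;
}

interface EmeNavigatorLike {
  requestMediaKeySystemAccess(
    keySystem: string,
    configs: object[]
  ): Promise<KeySystemAccessLike>;
}

interface EmeMediaElementLike {
  setMediaKeys(keys: MediaKeysLike | null): Promise<void>;
  play(): Promise<void>;
}

// Sets media keys up front, then starts playback. No session is created and
// no license is requested here; the keys object alone acts as the hint that
// encrypted content may follow the clear content.
async function prepareAndPlay(
  nav: EmeNavigatorLike,
  element: EmeMediaElementLike,
  keySystem: string,
  configs: object[] // caller-supplied key system configurations
): Promise<void> {
  const access = await nav.requestMediaKeySystemAccess(keySystem, configs);
  const mediaKeys = await access.createMediaKeys();
  await element.setMediaKeys(mediaKeys); // before any clear content plays
  await element.play();
}
```

Whether a UA must then support the clear-to-encrypted switch is the normative question under discussion in the issue; the sketch only shows the site-side ordering.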

Jer: To clarify, do you propose adding a note on recommended practice, or is there a normative spec change to always set the media keys before starting playback?

Xiaohan: Current recommendation is to change the spec text, and add a note on the QoI issue
… If keys are set up front, the implementation *must* support ... etc
… It's been an open issue for a few years

Jer: I see the same behaviour with Netflix, but they may be doing something different than Disney+

Mark: There are corner cases on some platforms where it doesn't work, so I support having this clearer in the spec

Jer: I remember this came up with changeType, changing between clear and protected content

Xiaohan: That's on the MSE side

Jer: On Apple platforms we support protected playback in HLS, elsewhere I assume it's done with MSE

Xiaohan: that's true
… There are two ways you can start with clear playback. You append the segment, where initial samples are clear. You signal that the whole thing is encrypted, and that's fine
… Other case is append all clear buffers, then later encrypted. But that can be too late as we already set up the pipeline

Jer: Seems reasonable that setting the keys you can change to encrypted later

Xiaohan: Seems there's agreement. I'll ping you on the bug to agree the exact text

Chris: Timescale for EME draft spec?

Xiaohan: Editors met to discuss scope for V2, no progress, so I'll ping them again to follow up, get a more firm commitment

Chris: Concern that we have implementations but not a spec, and AC reviewers may ask about our intentions

WebCodecs

Eugene: w3c/webcodecs#92
… This is so that when we read back a video frame, there's a way to convert the pixel format, e.g., YUV to RGB, at the same time
… Currently we can use getImageData on the canvas, but that's synchronous, requires a Window context, so inconvenient in workers
… Other way is manually in WASM, but needs a lot of code, and has to account for color spaces
… Since browsers can already render all video frames, they could convert at least to RGB, so not arbitrary conversions
… Proposal is to add an extra format field in copyTo, and when the UA can do the pixel format conversion, it'll return the size of the buffer
… If the UA doesn't support it, it will throw an exception. If it can determine the buffer size, it could convert during readback of the video frame
… Dale pointed out it may not be a good idea to signal if conversion supported by throwing an exception
… Another option is to introduce an extra method, canCopyTo, for the purpose of checking if a conversion is supported by the UA
… What do you think? What's the best way to signal if a conversion is OK?
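For illustration, a minimal sketch of the first option as described: the page asks for the buffer size with a target format, and treats an exception as "conversion not supported". `allocationSize` and `copyTo` are existing `VideoFrame` methods and `"RGBA"` is an existing pixel format name; passing `format` in the options and having `allocationSize` throw for unsupported conversions are assumptions taken from the proposal. `VideoFrameLike` and `readBackAsRgba` are hypothetical names used to keep the sketch self-contained.

```typescript
// Options as proposed: an extra `format` field (assumption, per the proposal).
interface CopyToOptionsSketch {
  format?: string;
}

// Structural stand-in for the parts of VideoFrame used here.
interface VideoFrameLike {
  allocationSize(options?: CopyToOptionsSketch): number;
  copyTo(destination: Uint8Array, options?: CopyToOptionsSketch): Promise<unknown>;
}

// Reads the frame back as RGBA, or returns null when the UA reports
// (by throwing from allocationSize) that it cannot do the conversion.
async function readBackAsRgba(frame: VideoFrameLike): Promise<Uint8Array | null> {
  const options: CopyToOptionsSketch = { format: "RGBA" };
  let size: number;
  try {
    size = frame.allocationSize(options); // synchronous support check
  } catch {
    return null; // caller falls back, e.g. to a canvas or WASM path
  }
  const buffer = new Uint8Array(size);
  await frame.copyTo(buffer, options); // conversion happens during readback
  return buffer;
}
```

The alternative raised in the discussion, a separate `canCopyTo` check, would replace the `try`/`catch` with an explicit query; this sketch does not assume that method exists.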

Jer: Converting YUV to RGB is fine if they have the same color space. But decoded frames might not be sRGB, so implies a colorspace conversion
… Are we doing both color conversion and pixel format conversion?

Eugene: Video is tagged with color space, so it's both

Jer: If you have an HDR image and you want to convert to P3 colorspace, do we need a separate enum? Or a second call?

Dale: For color space, we'll need both format and color space. For Chrome we're thinking of only converting to RGB

Jer: There's different white points and matrices...

Bernard: This shows why canCopyTo might be problematic, needs lots of options

Dale: Has to have some way to indicate if supported or not, either a boolean or an exception

Jer: Is copyTo sync?

Dale: It's async

Jer: So instead of throwing it could return a rejected promise

Eugene: Yes, but ideally we want to know in advance, we do prep to determine buffer size, so if we know that we know if we can do it at all

Jer: With the caveat that the color spaces might not be available

Eugene: True, but we can synchronously determine that. Alternatively, people will do it anyway, e.g., for doing ML models on video frames they want RGB, so they use canvas currently
… We want to remove those extra steps, async, and in a worker

Jer: The Window requirement for canvas, is that due to lack of OffscreenCanvas?

Eugene: With OffscreenCanvas we're good

Jer: But that only supports RGB, so you wouldn't be able to

Chris: Are you looking at the Color on the Web CG? They're figuring out some of this stuff

Jer: Yes

Eugene: copyTo will reject if you call it with something unsupported. If nobody objects, we could go with this, I'll follow up with Paul, he was against this approach I think
… I can prepare a pull request for the spec

Dan: Are there other cases than sRGB where the browser will know what colorspace we want rather than rely on the user?

Jer: Use case is painting to a P3 or HDR canvas and you don't want to reduce to sRGB. Not something the UA knows; the client in the page may know the intent
… If canvas will support more color spaces, you may need a convert to RGB

Dan: Odd coded sizes issue: w3c/webcodecs#666
… Using coded sizes that aren't a multiple of the canvas size. VP9 lets you specify it has an odd coded size, so this happens in practice a lot
… Proposal to define, and round up. Implies you can sample a half pixel outside the visible rect
… This is what ffmpeg and VLC will do
… Is that an OK definition to use throughout WebCodecs, or any concerns?
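For illustration, a minimal sketch of the round-up rule for a frame with odd coded dimensions. It assumes 4:2:0 chroma subsampling with 8-bit, tightly packed planes (e.g. I420); the minutes do not name a pixel format, so that choice and the helper names `chromaPlaneSize420` and `i420ByteLength` are assumptions.

```typescript
interface PlaneSize {
  width: number;
  height: number;
}

// Chroma plane size for 4:2:0 subsampling, rounding odd dimensions up.
// Rounding up means the last chroma sample covers half a pixel beyond the
// edge of the luma plane.
function chromaPlaneSize420(codedWidth: number, codedHeight: number): PlaneSize {
  return {
    width: Math.ceil(codedWidth / 2),
    height: Math.ceil(codedHeight / 2),
  };
}

// Total bytes for one Y plane plus two chroma planes
// (assumes 8-bit samples and no row padding).
function i420ByteLength(codedWidth: number, codedHeight: number): number {
  const chroma = chromaPlaneSize420(codedWidth, codedHeight);
  return codedWidth * codedHeight + 2 * chroma.width * chroma.height;
}
```

For a 5x3 frame this gives 3x2 chroma planes, where rounding down would give 2x1 and leave the last column and row of luma without chroma samples.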

Jer: Not hearing any concerns

Eugene: Content hint issue
… Some codecs benefit from knowing, some codecs take the info into account. isConfigSupported won't reject if there's a content hint it doesn't like
… We have a spec, content hints spec, motion, detail, text, etc
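For illustration, a minimal sketch of attaching a content hint to an encoder configuration. The hint values `motion`, `detail` and `text` are the ones mentioned from the content hints spec; the `contentHint` field on the config, the `EncoderConfigSketch` shape and the helper `withContentHint` are assumptions for this sketch. Since `isConfigSupported` is described as not rejecting a hint it doesn't like, the page filters unknown values itself.

```typescript
// Video hint values mentioned in the discussion (from the content hints spec).
const VIDEO_CONTENT_HINTS = ["motion", "detail", "text"] as const;
type VideoContentHint = (typeof VIDEO_CONTENT_HINTS)[number];

// Hypothetical config shape: only the fields needed for this sketch.
interface EncoderConfigSketch {
  codec: string;
  width: number;
  height: number;
  contentHint?: VideoContentHint; // assumed field, under discussion
}

// Returns a copy of the config with the hint attached when it is a known
// value; an unknown hint is not applied and the copy keeps the base fields.
function withContentHint(
  base: EncoderConfigSketch,
  hint: string
): EncoderConfigSketch {
  const known = VIDEO_CONTENT_HINTS.find((h) => h === hint);
  return known ? { ...base, contentHint: known } : { ...base };
}
```

The hint is advisory in this sketch: an encoder that cannot use it would simply ignore it.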

Bernard: Motion vs frame rate is more for rate control. In libwebrtc for h264 hints there's QP stuff we already support, but this is other things

Jer: Difficulty is coming up with list of hints across all decoders

Bernard: Detail, motion, text; text was added for AV1 screen content tools

<eugene> https://w3c.github.io/mst-content-hint/

Bernard: The only concern is make sure it doesn't conflict with other things like screen coding tools

Eugene: it's an implementation detail

Bernard: They'll change to minimum QP.

ChrisN: Follow up in a WG meeting?

Bernard: Useful to do sooner than next month

Rechartering and TPAC

ChrisN: AC review is open, please ask your AC rep to review
… And we've requested meeting time for Monday September 11. Hope to see you there

[adjourned]
