W3C

– DRAFT –
Media WG meeting

14 May 2024

Attendees

Present
Bernard_Aboba, Chris_Needham, Erik Språng, Eugene_Zemtsov, Joey_Parrish, Marcos_Caceres, Mark_Foltz
Regrets
-
Chair
Chris_Needham, Marcos_Caceres
Scribe
cpn, marcos

Meeting minutes

<Marcos> CN: we need to decide on meeting time for TPAC

<Marcos> CN: I'd like to discuss EME next steps, and WebCodecs topics for getting the spec to CR

TPAC 2024

CN: We put together a plan for TPAC. We plan to meet for a whole day. Tentatively it will be on the Thursday.

CN: However, we need to figure out how much joint meeting time we need, specifically with the RTC and Web Codecs. The Web Codec folks wanted to talk about integration points, but we should go to their meetings - possibly in coordination with Khronos. Based on that feedback, we might not meet with WebGPU folks at TPAC

CN: What do folks think about that?

BA: we did a 90 minute slot last time. That's the default?

CN: We would do a 90 minute session with WebRTC. That would be part of the same day.

CN: The same meeting day, that is.

CN: There's the discussion with the related groups, and also things that are more for exploration.

CN: Topic groupings would be Media WG issues in progress, collaboration with other WGs, and explorations

CN: We will put in the meeting time requests based on this

CN: Anything else we want to discuss around TPAC?

Eugene: Topics that are important for RTC scenarios: frame drop notifications, advanced SVC, video encoder buffer controls, inter-frame dependencies

CN: We will get into those details as we pull together the agenda

EME next steps

CN: We have a pull request to add minHdcpVersion (issue #535). The PR is good to land. Got positive feedback

CN: We need guidance from the W3C Team on publication of FPWD

CN: We have a number of other issues labelled as V2. These might not block us from FPWD, but we should triage them. We don't need to look at them now, unless anyone wants to discuss them now

CN: There are a couple that I raised recently. After I did some fixes to the spec, I noticed some bugs so filed them.

JP: I'll take a look. I'll sync up with the other implementers to build consensus.

CN: It would be great to check if any of these are important for FPWD

JP: I'll go through and close any that need to be closed. I'll try to do some updates and make some kind of hot list or milestone.

CN: There are a number of other issues that we need to think about, #529, #521, things like a "valid mime type" that are defined elsewhere. Annevk raised some issues.

Marcos: I can also help with this

CN: What would layering involve?

MC: EME interacts with HTML media behaviours, conceptually. Even "valid MIME type" comes from mimesniff, passing through the parser. So where does EME sit, and how does it interface with other parts of the platform?
… and the eventing model, what task queue is used

MC: Happy to go through it in detail

CN: Perhaps we could do it together

MC: It's worth doing, as the spec could end up smaller by reusing things from elsewhere
… There's a structure we can use of inputs and outputs to hook into the wider platform

CN: We are making good progress on EME. Let's get the spec out soon

VideoFrame Metadata Registry

EZ: Things coming from the WebRTC capture pipeline. It's informational, it doesn't survive encoding or decoding
… Useful for people working with VideoFrames
… It's empty, but we have suggestions for what to put there
… One suggestion was to put RTC timestamps there. This is the time of the sender when the frame was created
… Might be useful to measure lag or synchronize audio and video
… But it's only exposed when the video comes from a WebRTC channel
… WebRTC already does synchronisation
… WebCodecs encoders don't do anything with it. The info is already available in the callback
… So it looks useful, but we don't know what to do with it exactly
… Another thing proposed to be added is face detection. Certain webcams do face detection
… This information can be obtained from the capture pipeline at no additional cost
… Looks like it could be useful for encoders, but WebCodecs doesn't do anything with it
… But it can be useful to web developers to do video processing on the CPU or GPU
… It's only for certain cameras and certain circumstances. In most cases we don't have the data
… If you're video encoding, these will be mostly empty
… This is in the MediaCapture Extensions spec already
… We discussed this issue for the WebCodecs spec, but at the same time it's in MCE
… The info only comes from the capture pipeline, so it makes sense there, and it's not clear it should be part of WebCodecs at all
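
As a rough illustration of the lag measurement idea mentioned above, the sketch below computes the delay between a sender-side timestamp carried in frame metadata and the local receive time. The `FrameMetadataSketch` type and the `senderCaptureTimeMs` field name are hypothetical and not taken from any spec, and the calculation assumes the sender and receiver clocks are already comparable, which in practice needs separate clock-offset handling. It returns `undefined` when the timestamp is absent, which per the discussion is the common case.

```typescript
// Hypothetical shape; the field name is illustrative, not from any spec.
interface FrameMetadataSketch {
  senderCaptureTimeMs?: number;
}

// Returns the estimated lag in milliseconds, or undefined when the
// timestamp is missing (e.g. the frame did not come from a WebRTC channel).
// Assumes sender and receiver clocks are already aligned.
function estimateLagMs(
  metadata: FrameMetadataSketch,
  receiveTimeMs: number,
): number | undefined {
  if (metadata.senderCaptureTimeMs === undefined) {
    return undefined;
  }
  return receiveTimeMs - metadata.senderCaptureTimeMs;
}
```

Treating the timestamp as optional keeps callers honest about frames that carry no capture-pipeline metadata at all.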

MC: We should check this, there's already a Shape Detection API. Does this overlap? Hope we don't end up with two specs doing the same thing

EZ: There's a subtle distinction here. Shape Detection runs algorithms on the video frames to detect shapes from scratch
… But this comes from the camera

MC: But it should have the same structure?

BA: It doesn't have the same structure. It's being used for background blur etc

EZ: Next, there's another MCE extension to show where the background is. It's a background mask
… It's not in the metadata registry, but we should move it there from VideoFrame
… So there's a set of possible additions that sound useful
… But WebCodecs encoders and decoders don't have a use for the data, as it's not always available, very ad hoc
… Not clear we need the registry. It's a burden for spec maintenance
… Why couldn't it be done as part of other specs?
… So I'm suggesting to get rid of the registry and address these things in other specs, if we really need them
… The registry has been empty for two years

CN: We introduced the registry for different application types, to avoid collisions between them, etc. The motivation was things like face detection metadata. So I'm a bit surprised that we haven't put anything in the registry. We didn't implement the process we had for updating the registry. So we should at least have added the face detection metadata.

CN: The media working group doesn't take ownership of populating the entries, but we expect other WGs to send us pull requests. It seems unfortunate that no one has sent us PRs for, at least, face detection metadata.

BA: So there's some confusion about what the review should involve. But it doesn't affect WebCodecs

Bernard: The W3C is saying that because of WebIDL, the features can be detected automatically, so we might not need the registry. There is some confusion about why we would use the registry. Is it supposed to just copy the IDL into the registry or are we supposed to get other levels of scrutiny?

CN: Happy to revise the process to keep things simple

Eugene: I'll reach out to Youen and Intel folks, who asked about adding those registry entries. We don't have their points of view, but it would be good to get them.

CN: I would like to hear from Youen

w3c/webcodecs#559

w3c/webcodecs#607

BA: IDL tooling will detect if you have the same dictionary entry multiple times

MC: That depends how it's done
… You'd have a partial interface in each spec, and combine them and run the parser over them, which should detect
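
A minimal sketch of the kind of check described here: collect the members each spec adds to a dictionary, merge them, and flag any member name contributed by more than one spec. This is not how any actual WebIDL tooling is implemented; the `PartialDictionarySketch` type, the function name, and the spec and member names in the usage note are all hypothetical.

```typescript
// Hypothetical model of a partial dictionary contributed by one spec.
interface PartialDictionarySketch {
  spec: string;       // short name of the contributing spec (illustrative)
  dictionary: string; // name of the dictionary being extended
  members: string[];  // member names this spec adds
}

interface MemberCollision {
  dictionary: string;
  member: string;
  specs: string[]; // every spec that defines this member
}

// Merges all partials and reports members defined by more than one spec.
function findMemberCollisions(
  partials: PartialDictionarySketch[],
): MemberCollision[] {
  const byMember = new Map<string, MemberCollision>();
  for (const partial of partials) {
    for (const member of partial.members) {
      const key = partial.dictionary + "." + member;
      const entry = byMember.get(key);
      if (entry === undefined) {
        byMember.set(key, {
          dictionary: partial.dictionary,
          member,
          specs: [partial.spec],
        });
      } else {
        entry.specs.push(partial.spec);
      }
    }
  }
  const collisions: MemberCollision[] = [];
  byMember.forEach((entry) => {
    if (entry.specs.length > 1) {
      collisions.push(entry);
    }
  });
  return collisions;
}
```

For example, if hypothetical specs `spec-a` and `spec-b` both add a `faces` member to `VideoFrameMetadata`, the function reports one collision naming both specs. A check like this only catches name clashes; it says nothing about architectural overlap between differently named members.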

BA: The key thing is to define what we want to achieve
… What should the review consist of? Is there an architectural view?

CN: Worth having a review for architectural issues, overlaps or duplications

MC: Yeah, we did that with permissions. Same situation as here, we defined steps to introduce a permission and we'd check them in the registration process

BA: Are there guidelines to review against?

MC: Yes, there is a set of criteria: is it specified properly, monkey patching, implementation commitments. It assures oversight and gives quality
… We'd lose that if it's purely automated. Do we want that level of scrutiny that needs the coordination?

EZ: We'd have to go to WebRTC WG and MCE spec editors and ask them to revert the change from their spec and make a PR for the registry

CN: We could point the registry entry to MCE?

EZ: That sounds viable, doesn't put much burden on us

CN: PA's concern is about the monkey patching approach

MC: Ideally we shouldn't have extension specs at all

BA: This is extending a dictionary, not a monkey patch

MC: Even so
… Use of partials outside of the main spec can be a concern, even though it's valid WebIDL

CN: But this would be an argument for folding into WebCodecs

MC: Perhaps you and I can look at some examples and figure it out from here

BA: Example of extending a VideoFrame with a VideoFrame, can be an architecture issue

CN: Examples to add?

EZ: RTC timestamps, but background is more speculative. Face detection is in between, mostly fine if webcams really provide it. But concern is if there's a Shape Detection API with its own object
… But it has little to do with WebCodecs itself, so WC isn't a natural home

CN: So WebRTC WG should be the review group?

BA: That's asking them to review their own stuff

CN: We can provide an independent view in this WG

BA: Have review criteria and guidelines

MC: Yes, we could do that and refine
… Part of the W3C registry process is to define guidance

https://docs.google.com/presentation/d/1ltl1ZV1KYV02wmFymBGe3q613bM5e3VsZxo-Lu5yn90/edit

Slides above cover the topics we discussed today

CN: So in summary, it sounds like we're not ready to drop the registry yet. We see a need to review entries being added and can add criteria to the registry definition based on what we learn
… If Paul is OK with it, we could leave the face detection where it is and refer to it from the registry - moving the spec elsewhere wouldn't address the partial dictionary
… Or move it to its own spec if leaving it in MCE is a problem

[adjourned]

Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).
