Meeting minutes
Slideset: https://
cpn: [goes through reminders]
jean-yves: If I have audio related issues, may I raise them today?
cpn: We'll see how we manage the schedule, if time allows
cpn: [reviewing tips]
Agenda
Introduction
Bernard: Some background. Streaming and RTC converging in general.
… Game streaming, broadcast with fan-out, perhaps to be called low-latency.
… Point is to combine things at large scale.
… WebCodecs combined with WebRTC Data channel.
… We see this solved differently.
… Raises concerns about duplication of efforts.
… WebRTC Encoded Transform is often used as a poor man's WebCodecs.
… Some things are built into WebCodecs but not in WebRTC.
… Also, two distinct code paths in the browsers. That creates issues.
Bernard: Here are some examples of similar issues in both worlds.
… Example of QP-based rate control issue in Chromium. We have it in WebCodecs, not in WebRTC.
… [goes through other examples, including HDR support, encoding/decoding times]
… These encoder/decoder APIs need to run across a huge range of hardware, and platforms.
… That's difficult to test.
… Also differences in codec support, e.g., HEVC and AV1 with subtle differences.
… And then support for SVC and simulcast.
… Issues opened in WebCodecs.
Bernard: Another question has come up in WebRTC: whether goal is to support every desirable feature or to enable apps to build their own support?
… Examples: In WebRTC streaming, interest in HEVC which is not in WebRTC (work in progress in WebCodecs). In music contexts, AAC.
… Some of the use cases may be addressable with a combination of WebCodecs and WebRTC transport.
… Unified encoder API, which Erik proposes. Under the cover, not a JavaScript API, but it illustrates some of the issues we're seeing that might benefit from being addressed in a more uniform way.
Bernard: This is an example of an issue I discovered yesterday.
Bernard: Look at the frame RTT.
Bernard: The glass-to-glass latency is slightly larger.
… Somewhere in the system, we're adding 200ms of delay and it's not due to network. That's in the browser.
Bernard: Encoding latency is pretty low, that looks good.
Bernard: But the decoding latency is excessive
… That seems pretty weird.
… Example of something that does not happen in WebRTC but happens in WebCodecs, and that needs testing.
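A rough sketch (TypeScript; decodeChunk is just an illustrative wrapper, not part of any spec) of how an app can measure per-frame decode latency with WebCodecs, in the spirit of the numbers being discussed:

  // Record when each chunk is submitted and compare against when the
  // corresponding VideoFrame comes out of the decoder.
  const submitted = new Map<number, number>();   // chunk timestamp -> performance.now()
  const decoder = new VideoDecoder({
    output: (frame) => {
      const t0 = submitted.get(frame.timestamp);
      if (t0 !== undefined) {
        console.log("decode latency (ms):", performance.now() - t0);
        submitted.delete(frame.timestamp);
      }
      frame.close();
    },
    error: (e) => console.error(e),
  });
  decoder.configure({ codec: "av01.0.04M.08" });  // AV1, as in the demo being discussed

  function decodeChunk(chunk: EncodedVideoChunk) {
    submitted.set(chunk.timestamp, performance.now());
    decoder.decode(chunk);
  }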
Randell: Have you validated whether the bug is a decoder stack issue or due to the AV1 codec?
Bernard: It's not the API, something to do with the decode pipeline.
Jan-Ivar: In Firefox, we now support VideoDecoder, feel free to give it a try.
<padenot> (this only works in Fx Nightly right now fwiw, so don't use a release build)
QP-based rate control in WebCodecs
eugene: Recent change in WebCodecs to allow app to ask about bitrate mode and quantizer use.
… Some AV1 specific option, which is why it appears in that specific part.
eugene: I was able to create a demo.
… which shows how to achieve desired bitrates.
… Feel free to give it a try
eugene: My point is that, even with the most basic algorithm, I was able to achieve pretty good results for bitrate control.
… I think that makes it valuable.
… Also, very quick response, frame-level response to changing conditions.
… It lets us work around bugs in GPU drivers. We see in Chrome that, sometimes, their rate control algorithms contain bugs.
… It gives ability to set lower bounds on image quality: "never give images lower than something", as no one likes pixelated images.
… I encourage people to try.
hta: Very interesting. I tried your demo. You don't touch resolution at all, is that correct?
eugene: Yes.
hta: I was impressed by the result. For that codec, that seems like a very useful mechanism.
… May be room to harmonize between codecs.
Bernard: Very interesting exercise. That's an example of how you can write a PR in WebCodecs that would require a complex process in WebRTC. Not everyone might want this in WebRTC. Lots of use cases to validate.
Randell: The issue is not only that QP values vary from one codec to another, but also between implementations of a given codec.
cpn: That variability, should we test it?
Randell: Yes. I would imagine hardware implementations could vary in their response as well.
eugene: Correlation between the bitrate becoming smaller and the resolution is the same regardless of the implementation.
Erik: In Chrome on Windows, we use this type of externally controlled per-frame QP.
eugene: Yes, this allows us to work around bugs in rate control, as I mentioned earlier.
… I wanted to encourage other browser vendors to implement this as well.
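A minimal sketch of what Eugene describes, assuming the "quantizer" bitrateMode and the AV1 per-frame quantizer encode option in WebCodecs; sendChunk and the control constants are placeholders for the app's own logic:

  declare function sendChunk(chunk: EncodedVideoChunk): void; // app-defined transport hook

  const encoder = new VideoEncoder({
    output: (chunk) => sendChunk(chunk),
    error: (e) => console.error(e),
  });
  encoder.configure({
    codec: "av01.0.04M.08",
    width: 1280,
    height: 720,
    bitrateMode: "quantizer",                    // app takes over rate control
  });

  let quantizer = 30;                            // AV1 QP range is 0..63
  function encodeFrame(frame: VideoFrame, targetBytes: number, lastChunkBytes: number) {
    // Naive frame-level controller: nudge the QP depending on how far the
    // previous chunk was from the per-frame byte budget.
    quantizer = lastChunkBytes > targetBytes
      ? Math.min(63, quantizer + 1)
      : Math.max(10, quantizer - 1);             // lower bound acts as a quality floor
    encoder.encode(frame, { av1: { quantizer } });
    frame.close();
  }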
Hardware Encode/Decode Error Handling
Bernard: The related issues are listed here
Bernard: Little bit of background for issue 146. You can get encode/decode error outside of SDP negotiation.
… Slide lists examples of when that can happen.
… You can switch from hardware to software and vice versa.
… Also, we're seeing increasingly that some profiles are hardware-only.
<fluffy> I don't understand the case when we get a parsing error for encoder. Someone point me the right way ?
fippo: Some things that we can do.
… [goes through the list]
… More telemetry is always a good thing.
fippo: For WebRTC, how are we going to expose the decode errors?
… [goes through list of options in the slide]
… We need to be more precise about where we want to expose the event.
Bernard: Two main directions to go. This is proposal A.
… Reuse RTCError event.
… You can see in the dictionary and enum that we can list a number of reasons.
… Might be a good idea to add in the timestamp so that you know when something happens.
Bernard: Proposal B would be to create a custom event.
… Some sketching in the slide on how that might work.
… Just want to get some feedback on which one of these proposals makes sense to people.
fluffy: Supportive of this either way. When we talk about parsing error for encoder, I wonder what that can be.
Bernard: More a decoder thing indeed.
… Parse error.
fluffy: OK, regardless, much needed.
… No preference from me.
Henrik: I prefer proposal B because I think that the error is different enough from other errors.
… Rather than an unsigned short error number, we should rather have an enum.
Bernard: Yes, we can do that.
florent: [missed]
hta: I also prefer proposal B, to avoid coupling.
cpn: The naming here is all RTC specific. If we were to introduce that in WebCodecs, we might need a more general name.
Bernard: Instead of RTCRtpSender or RTCRtpReceiver, we might want to use Encoder/Decoder.
<dom__> [maybe inherit from ErrorEvent rather than Event? and thus moved the error specific info in the error attribute?]
Henrik: One event handler per decoding could be used.
Bernard: WebCodecs does have errors. EncodingError for errors about data and OperationError for resource issue.
eugene: Done spec-wise. In Chromium, nothing done.
… It would be nice to make this recommendation more explicit in the spec so that people know what to expect.
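A purely hypothetical sketch of the "Proposal B" direction, combining the feedback above (an enum of reasons rather than an unsigned short, a timestamp, and possibly an ErrorEvent-like shape per dom's comment); none of the names below exist in any spec:

  // All identifiers here are placeholders for illustration, not proposed IDL.
  type CodecErrorReason = "encode-error" | "decode-error" | "parse-error" | "hardware-unavailable";

  interface CodecErrorEventInit extends EventInit {
    reason: CodecErrorReason;                    // enum instead of a numeric error code
    timestamp: DOMHighResTimeStamp;              // when the failure was detected
  }

  // Usage, if such an event were exposed on RTCRtpReceiver:
  // receiver.addEventListener("codecerror", (e) => renegotiateOrFallback(e));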
New Video Encoder API
Erik: This is the view of how it works today in Chrome for WebRTC. Huge entangled ways of doing things.
… The most important thing is scalability.
… In the end, that is implemented in the WebCodecs wrapper.
Erik: Plan to do an overhaul of the internal WebRTC video encoder API.
… Anything related to RTP/Transport, we want that to be external.
… And we want everything to be asynchronous.
Erik: We think that's a good opportunity to align WebCodecs and WebRTC to avoid duplicate code.
Erik: What I would like to see is in this slide.
… One scalability controller in WebRTC.
… If you want to do that yourself with WebCodecs, you can.
Erik: Things we'd like to solve include codec selection, flexible reference structures, as much as possible to minimize codec-specifics and rate control.
Erik: The browser can be smart but cannot always make the best choice automatically.
… Maybe one choice is optimal for the sender, but suboptimal for the receiver.
Erik: To solve this, we want the app to be in full control. If you know the context, you know how to select and prioritize.
Erik: We had all of these scalability mode systems.
… and yet they are not enough
Erik: So many other things you could do.
… E.g., you might want to do B-frames, or whatever magic your scenario might need.
… Not feasible to support everything in the browser.
Erik: Again, solution is to let the app be in charge.
… [goes through slide]
Erik: With these hooks, you can implement all of the scalability modes yourself, and do more, in a codec-agnostic way.
… As a side effect, if you do this, you need minimal feedback from the decoder.
Erik: Not going to talk about rate control, Eugene covered it already.
Erik: Illustration of the concepts that were discussed. Take it as an abstraction for now.
Erik: Some mechanism to query bitrate control capabilities. CQP or CBR.
Erik: Total number of buffers you have available. Max number of references, max temporal and spatial layers (output frames per input frame).
Erik: Which input format is accepted.
… What pixel formats.
Erik: Same thing for the output.
Erik: A bunch of other discussions about what else we could have.
Erik: How do we actually select and create an encoder
Erik: Enumeration. Gives you capabilities, implementation name, codec name, codec specifics.
Erik: encoder settings that will apply to the lifetime of the encoder.
Erik: The main method is encode()
Erik: The input frame is just a frame. The content hint, the speed setting, and how frame drops should be handled.
Erik: Params you can give to the encoder
Erik: Apart from control, you have these layers parameters.
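The per-frame parameters above are only sketched on slides, so the snippet below is a hypothetical illustration of the idea (every name, encodeWithReferences included, is made up): with app-controlled reference buffers, an L1T2 temporal structure reduces to a small loop of per-frame instructions.

  // Hypothetical encode call, for illustration only.
  declare function encodeWithReferences(frame: VideoFrame, opts: {
    temporalLayerId: number; referTo: number[]; updateBuffer: number | null;
  }): void;

  // Base-layer frames (even indices) predict from and refresh buffer 0;
  // enhancement frames (odd indices) predict from buffer 0 but do not update
  // it, so they can be dropped without breaking the reference chain.
  async function encodeL1T2(nextCameraFrame: () => Promise<VideoFrame>) {
    for (let i = 0; ; i++) {
      const frame = await nextCameraFrame();
      const isBaseLayer = i % 2 === 0;
      encodeWithReferences(frame, {
        temporalLayerId: isBaseLayer ? 0 : 1,
        referTo: i === 0 ? [] : [0],               // key frame references nothing
        updateBuffer: isBaseLayer ? 0 : null,
      });
      frame.close();
    }
  }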
fluffy: Ignoring all of the details of the API, arbitrary reference buffers cannot be passed around in the underlying codecs.
… I think that you'll have a hard time guessing what pieces might work for a given type of hardware.
Erik: For hardware, it depends on drivers. In Chrome OS, we already do that behind the scenes.
fluffy: So, works for VP8?
Erik: Yes.
fluffy: HEVC?
Erik: We talked with a few vendors. Some can do it, some have API limits.
Erik: This is a complete example of how that would work.
… Can skip these frames, look at them offline!
Erik: The API is not as bad as it looks regarding fingerprinting. All of it can be derived somehow.
Bernard: One of my questions would be: what would be the effects on the JavaScript that we have?
Erik: My understanding is that we have some sort of software fallback that could clash with this.
… I don't really have an opinion on what the best route forward is.
jan-ivar: In WebRTC, we had a problem getting powerEfficient. I'm having a hard time seeing how we can expose so much stuff to JavaScript.
… Double-edged sword is that there's a lot of copy-and-paste on the Web. Good defaults are needed.
… Tying browser vendors to do the right thing.
… Puts a lot of pressure on the client to implement things correctly.
Erik: Agree. I think we could have something separate for WebCodecs that gives some help. Not sure I like that.
… Would one of us write that? Or would we hope that the community does?
Jan-Ivar: I worry about how people may approach these expert APIs.
Erik: Yes, I'm thinking about 3D cases where WebGL is not your go-to target but rather your game engine.
Elad: With fingerprinting, would giving some capabilities through permissions on microphones, cameras help?
… Regarding libraries, people are good at creating them.
jan-ivar: Asking for cameras, microphones could be seen as permission escalation.
… Better direction.
hta: Is this an API that you would expose in workers, main thread?
Erik: Not an expert in that.
hta: The current position in WebRTC encoded transform is that we're still quarreling about that.
eugene: Everywhere where WebCodecs is available seems like a good approach.
Francois: The API allows enumerating decoders, why do you need the exact list?
<Zakim> tidoust, you wanted to wonder about enumerateDevices
erik: If you just ask for a particular codec, it's hard to reason about what you can do with it.
paul: most of this is doable in WebCodecs, so prefer you reframe it in terms of WebCodecs
… do a gap analysis between web exposed capabilities and what's needed
… e.g., automatic fallback is not a thing. we have capabilities, a registry with per-codec settings
… we're duplicating a lot here, which we should avoid
+1 paul
erik: is this feasible? should we move towards doing that gap analysis?
paul: file issues. professional creator users and rtc users both have needs, lots of communities engaging
… reach a uniform API is good, but a lot of what you describe is doable
… avoid duplicating lots of work
hta: You're emphasising precise user control of features. We want to have these base features and leave the higher level modes, like SVC, rate control and simulcast, documented as implemented in terms of these primitives.
<padenot> +1 hta
hta: We should do that style of spec more. I want a WebCodecs core that is a primitive used by WebRTC and by a more user-friendly WebCodecs interface, with the core as clear and simple as possible.
Erik: My thoughts as well
Florent: On complexity of the API, shouldn't be a problem, there are a few expert APIs on the web, WebCrypto and WebGPU
… If this were introduced, libraries would make it easier to use
… similar happened with WebCrypto
Bernard: A cautionary note - it looks like we're on the verge of a major hardware change
… ML based codecs for audio. The nature of the hardware is likely to change in the coming years, how would these fit the framework?
… Per-macroblock QP for segmentation - this would require API changes to WebCodecs
… The API meets the demands of the last few years, but we need to look at future demands.
Erik: On inter-picture references, I haven't seen them breaking the mold drastically, something to watch for
Xiaohan: How many of these can be optional, with good defaults, so it remains a simple higher-level API?
Chris: Agree on the need for a gap analysis. The previous approach has been to add to WebCodecs incrementally, e.g., the per-frame QP. Do we want to move to exposing everything? We've heard concerns about enumeration and potential fingerprinting.
… Next step is to meet again when we have a gap analysis
Web Audio
jya: Currently, we use canPlayType with opus, etc.; it's not sufficient.
… We need something on top of MediaRecorder: do you support recording with multiple channels?
… the hope is if you can play it you can decode it, not always true
Xiaohan: Is that the MSE or WebRTC case?
jya: It's MSE, file playback also
… It's a Media Capabilities decoding query. The requirements for decoding aren't always the same as for playing.
Xiaohan: Not so familiar with WebAudio, but the next step could be to raise an issue in Media Capabilities API, and we can follow up
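For reference, a sketch of the kind of Media Capabilities query Xiaohan suggests exploring (assuming a "file"-type decoding query is the right fit for the Web Audio decoding case; the exact shape would be part of the follow-up issue):

  const info = await navigator.mediaCapabilities.decodingInfo({
    type: "file",
    audio: {
      contentType: 'audio/webm; codecs="opus"',
      channels: "2",            // AudioConfiguration.channels is a string in the current spec
      samplerate: 48000,
      bitrate: 128000,
    },
  });
  console.log(info.supported, info.smooth, info.powerEfficient);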
Media Capabilities issue 185
Chris: [recaps the issue]
hta: When you pass a mime type, question is whether it contains parameters or not
jan-ivar: It still references the WebRTC spec, would we want it to move to the MC API?
hta: That was on purpose, so you can pass it to setCodecPreferences.
… Should we make the capabilities convertible? It would be ideal if they were both the same, but they're both deployed.
chris: Discussion needs Youenn, so let's follow up in a future call.
Wrap up
chris: Nothing else to discuss, so let's close here. We'll follow up in future calls
[adjourned]