W3C

– DRAFT –
WebCodecs Serialization Format

13 September 2023

Attendees

Present
Eric_Carlson, Eugene_Zemtsov, Harald_Alvestrand, Paul_Adenot, Peter_Thatcher, Rijubrata_Bhaumik, Xiaohan_Wang, Youenn_Fablet
Regrets
-
Chair
Peter Thatcher
Scribe
cpn

Meeting minutes

Peter: In the IETF MoQ group, there's interest in sending media over QUIC
… How do you serialize the media into bytes so you can send them over QUIC streams or datagrams?
… Involves attaching metadata. Looked at different codecs, which have different modes
… WebCodecs registry is a place we can point to
… People will want to make MoQ endpoints using web clients, using WebCodecs and WebTransport
… Want to take the output from WebCodecs, serialize, then send to a decoder on the other side, which may be WebCodecs
… I did something similar in Second Screen WG, on Open Screen Protocol
… At that time we chose CBOR, a kind of binary JSON
… What is the interest from others? What would you want to do with the serialization, store it, send over QUIC?
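
Editor's sketch (not part of the discussion): a QUIC stream is an ordered byte stream without message boundaries, so serialized frames sent on one stream need some delimiter, while each datagram is its own unit. The TypeScript below assumes a 4-byte big-endian length prefix. The names `frameMessage` and `Deframer` and the prefix size are illustrative assumptions, not anything defined by MoQ, WebTransport, or WebCodecs.

```typescript
// Hypothetical helpers: a length-prefixed framing sketch for a byte stream.

/** Prefix a message with its length as a 4-byte big-endian unsigned integer. */
function frameMessage(message: Uint8Array): Uint8Array {
  const framed = new Uint8Array(4 + message.length);
  new DataView(framed.buffer).setUint32(0, message.length, false);
  framed.set(message, 4);
  return framed;
}

/** Accumulates arbitrary stream chunks and returns every complete message. */
class Deframer {
  private buffered: Uint8Array = new Uint8Array(0);

  push(chunk: Uint8Array): Uint8Array[] {
    // Join the leftover bytes from earlier calls with the new chunk.
    const joined = new Uint8Array(this.buffered.length + chunk.length);
    joined.set(this.buffered, 0);
    joined.set(chunk, this.buffered.length);

    const view = new DataView(joined.buffer);
    const messages: Uint8Array[] = [];
    let offset = 0;
    while (joined.length - offset >= 4) {
      const length = view.getUint32(offset, false);
      if (joined.length - offset - 4 < length) break; // message not complete yet
      messages.push(joined.slice(offset + 4, offset + 4 + length));
      offset += 4 + length;
    }
    this.buffered = joined.slice(offset); // keep the incomplete tail
    return messages;
  }
}
```

With datagrams the prefix would be unnecessary, since each datagram already carries exactly one message.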

Harald: IIUC this is about making sure we have both encoded data and metadata for a frame on the other side of the connection
… So it's like the RTP encapsulation format
… We don't have a 1:1 mapping of it
… Which parts of this should be different from RTP formats, and why?

Bernard: This would include things normally in RTP extensions as well
… Applies to the RTP packet, timestamp, things related to SVC in IETF are always packeted
… Would become patent encumbered in IETF

Peter: Different questions: where should this happen? How different should this be to work already done?
… IETF looking at how to do RTP over QUIC

??: I thought the output of the encoder was already serialized and you could plug it into the decoder. What additional metadata is needed?

Peter: For example, the output of the encoder includes EncodedVideoChunk, which includes the bytes, timestamp, keyframe flag, metadata such as how SVC is structured
… Information about size, color space, and which codec it is
… In header extensions, things like frame dependencies. In RTP payload header, there's metadata
… There's metadata tied to the unencoded media, e.g., audio level
… So if you look at RTP metadata, it's the same kind of metadata
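
Editor's sketch (not part of the discussion): the fields Peter lists can be gathered into one plain record before any wire format is chosen. `ChunkLike` and `MetadataLike` are structural stand-ins for the parts of `EncodedVideoChunk` and its output metadata used here, so the sketch also runs outside a browser. The `SerializableFrame` layout and its field names are hypothetical, and color space is left out for brevity.

```typescript
// Stand-ins for the WebCodecs shapes used below (subset only).
interface ChunkLike {
  type: "key" | "delta";
  timestamp: number;        // microseconds
  duration: number | null;  // microseconds
  byteLength: number;
  copyTo(destination: Uint8Array): void;
}

interface MetadataLike {
  decoderConfig?: { codec: string; codedWidth?: number; codedHeight?: number };
  svc?: { temporalLayerId?: number };
}

// Hypothetical record layout, for illustration only.
interface SerializableFrame {
  codec: string | null;
  keyFrame: boolean;
  timestampUs: number;
  durationUs: number | null;
  width: number | null;
  height: number | null;
  temporalLayerId: number | null;
  payload: Uint8Array;
}

/** Copy the encoded bytes and the API-visible metadata into one record. */
function toSerializableFrame(chunk: ChunkLike, metadata: MetadataLike = {}): SerializableFrame {
  const payload = new Uint8Array(chunk.byteLength);
  chunk.copyTo(payload);
  const config = metadata.decoderConfig; // optional on the metadata
  const svc = metadata.svc;
  return {
    codec: config ? config.codec : null,
    keyFrame: chunk.type === "key",
    timestampUs: chunk.timestamp,
    durationUs: chunk.duration,
    width: config && config.codedWidth !== undefined ? config.codedWidth : null,
    height: config && config.codedHeight !== undefined ? config.codedHeight : null,
    temporalLayerId: svc && svc.temporalLayerId !== undefined ? svc.temporalLayerId : null,
    payload,
  };
}
```

In a browser, the arguments would presumably be the chunk and metadata passed to a `VideoEncoder` output callback.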

??: Is the metadata already exposed to JS, just needs packaging, or are there internal pieces needed?

Peter: Want it done in a standardised way, but they don't want to use RTP
… There could be internal metadata, but a good start is to take everything in the API and serialize that

Youenn: Are you thinking of extensions?

Henrik: It's whether the problem can be solved in JavaScript, it sounds like I could convert to JSON and parse on the other side. Would that work?

Peter: You could, yes. Make your own non-standard format. The question is what interest is there in having a standard format?
… There's interest in the MoQ WG for that
… Is there interest here?
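
Editor's sketch (not part of the discussion): the do-it-yourself JSON approach Henrik describes could look roughly like this. JSON has no binary type, so the payload is hex-encoded here, which doubles its size. The `FrameRecord` shape and all function names are hypothetical, and the parser does no validation.

```typescript
// Hypothetical record; field names are illustrative only.
interface FrameRecord {
  codec: string;
  keyFrame: boolean;
  timestampUs: number;
  payload: Uint8Array;
}

/** Encode bytes as lowercase hex, two characters per byte. */
function toHex(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    out += (b < 16 ? "0" : "") + b.toString(16);
  }
  return out;
}

/** Decode a hex string produced by toHex (input is not validated). */
function fromHex(hex: string): Uint8Array {
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  return bytes;
}

/** Sender side: turn a frame record into JSON text. */
function frameToJson(frame: FrameRecord): string {
  return JSON.stringify({
    codec: frame.codec,
    keyFrame: frame.keyFrame,
    timestampUs: frame.timestampUs,
    payloadHex: toHex(frame.payload),
  });
}

/** Receiver side: rebuild the frame record from JSON text. */
function frameFromJson(text: string): FrameRecord {
  const obj = JSON.parse(text);
  return {
    codec: String(obj.codec),
    keyFrame: Boolean(obj.keyFrame),
    timestampUs: Number(obj.timestampUs),
    payload: fromHex(String(obj.payloadHex)),
  };
}
```

This works between two endpoints that share the same code, which is the point Peter makes: the open question is interoperability, not feasibility.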

Bernard: The standards interest comes from the caching. MoQ uses a cache, relays need metadata to do forwarding

Peter: Relays are designed to be agnostic to the media format, and some metadata is opaque to the relay

Bernard: Not sure that's true, there's preferential forwarding, may be read by caches

Peter: That would have to be defined by MoQ WG, as it's MoQ specific
… The serialization can be done in a separate doc, as it's opaque to relays, that's my understanding
… Are people interested in being able to serialize outside of MoQ, for other use cases?

Bernard: There have been issues raised about support for DRM

Xiaohan: How is that related to this question for a need to serialize WebCodecs?

Harald: We have a proposal to define a generic packetizing format for RTP so you can send any content you want over the wire

Peter: Content of SFrames is opaque

Harald: This came up in a side discussion about SDP munging for WebRTC encoded transform

Peter: One way to think about it is, if we were to start from scratch and solve the same thing, serialize the payload, and accumulate all the header extensions, what would we do?
… Similar question in the Second Screen group, we used CBOR and came up with types. It was straightforward, easier to use a CBOR parser
… Let's come to Bernard's question about where this should be worked on
… There are IPR issues for some of the metadata

Bernard: The concern would be if you take the WebCodecs spec you might start encumbering WebCodecs, that would be awful

Peter: What about trying to define the serialization format in W3C, would there be enough people interested in working on it to do it here?

Bernard: Media WG already has an ISO BMFF format for MSE

Harald: A note of caution on IPR. IETF runs on the principle that every participant should disclose their IPR, but doesn't force anyone to give their rights to anything
… At W3C all participants make a commitment when joining the WG
… If you don't join the WG, no promises made
… So for W3C, all contributors need to be in the WG and agree to remain members when the work item is adopted
… Getting all people who're asserting IPR to join might be the issue

Youenn: I'd tend to pick the venue where more people are interested

Peter: Gauging level of interest is the purpose of this

Xiaohan: What's the use case you're imagining?

Peter: It could be used for any purpose. For the MoQ WG, it's for real time and streaming use cases, e.g., live broadcast, video conference

Xiaohan: There are already solutions not using WebCodecs out there. If it's for streaming then CMAF can be the format, but too heavy for real time
… Why not adopt an existing format?

Peter: MoQ WG separated the two, so the format can be CMAF based
… Others who want to do things web oriented, you don't want to containerise to CMAF, more straightforward to do what JSON does, but more efficient, hence CBOR
… For realtime or very low latency streaming it would be a better DX to do something new, which is why there's interest in MoQ WG for a low overhead container

Xiaohan: Format of MediaStream in WebRTC?

Peter: That uses RTP, as Harald described. It could be done over QUIC, there's work in IETF, but like CMAF it carries baggage

Chris: Wouldn't MoQ WG need to define a format anyway, absent any work here?

Peter: Yes, they're working on the CMAF based format, but the question is about the low latency format

Bernard: The Media WG meeting heard about the IETF 117 WebCodecs presentation
https://datatracker.ietf.org/meeting/117/materials/slides-117-moq-webcodes-container-00

Peter: WebCodecs already has the metadata defined that we'd want to serialize
… What we define might be usable in different situations
… Continue in IETF or bring to W3C

Chris: Interest expressed at IETF to work on it?

Peter: Yes, three of us at least are interested, and others

Chris: We are developing a browser based video editor, have our own format, so we have the general need

Peter: Need for interop around that?

Chris: I'd have to follow up to get precise requirements

Bernard: A frequent request we get on WebCodecs is a standard API for container formats

Peter: I have a similar situation, where I have my own non-standard serialization, there might be potential for a standard solution

Peter: So if not this, what are people interested in?

Henrik: Using WebCodecs with the RTP transport in WebRTC, don't know if packetization needs to be standardised or not

Peter: So if you were to use WebCodecs with RTP transport, how would you use it?

Henrik: I see value in coupling app-specific metadata to video frames. Haven't done enough homework to know more

Youenn: MoQ is trying to do real time for QUIC, server to client. Why not use the MoQ approach, over WebTransport?

Harald: Ability to include non-standard or proprietary metadata. That might be an interesting feature to explore further
… How would you do that, for the interoperable web?

Peter: Youenn said it would be to use the same format over MoQ or other channel
… Then some interest in having P2P QUIC come back
… If you model it that way, if you add app-specific metadata, you would rather put it in the payload than header extensions, treating RTP transport as a generic transport rather than RTP specific
… To Harald's question about experimental metadata, it would be easy to put what you want in the format. CBOR is very flexible, key/value, you can have custom keys for custom values
… So we could do similar to header extensions with this
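
Editor's sketch (not part of the discussion): the key/value flexibility Peter describes can be illustrated with a minimal CBOR (RFC 8949) encoder that handles only non-negative integers, ASCII text strings, byte strings, and string-keyed maps. The function names and the example keys are hypothetical, and nothing here is a proposed wire format. A real application would more likely use an existing CBOR library.

```typescript
type CborValue = number | string | Uint8Array | { [key: string]: CborValue };

/** Initial byte(s) of a CBOR item: major type plus an unsigned argument. */
function cborHead(major: number, n: number): number[] {
  const m = major << 5;
  if (n < 24) return [m | n];
  if (n < 0x100) return [m | 24, n];
  if (n < 0x10000) return [m | 25, n >> 8, n & 0xff];
  const be32 = (v: number): number[] =>
    [(v >>> 24) & 0xff, (v >>> 16) & 0xff, (v >>> 8) & 0xff, v & 0xff];
  if (n < 0x100000000) return [m | 26].concat(be32(n));
  // 64-bit argument, split into two 32-bit halves (safe integers only).
  return [m | 27].concat(be32(Math.floor(n / 0x100000000)), be32(n % 0x100000000));
}

/** Sketch limitation: text must be ASCII so one char code equals one byte. */
function asciiBytes(text: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0x7f) throw new Error("sketch handles ASCII text only");
    out.push(code);
  }
  return out;
}

function encodeItem(value: CborValue): number[] {
  if (typeof value === "number") {
    if (value < 0 || Math.floor(value) !== value || value > 9007199254740991) {
      throw new Error("sketch handles non-negative safe integers only");
    }
    return cborHead(0, value); // major type 0: unsigned integer
  }
  if (typeof value === "string") {
    const bytes = asciiBytes(value);
    return cborHead(3, bytes.length).concat(bytes); // major type 3: text string
  }
  if (value instanceof Uint8Array) {
    const bytes: number[] = [];
    for (let i = 0; i < value.length; i++) bytes.push(value[i]);
    return cborHead(2, bytes.length).concat(bytes); // major type 2: byte string
  }
  const keys = Object.keys(value);
  let out = cborHead(5, keys.length); // major type 5: map
  for (const key of keys) {
    out = out.concat(encodeItem(key), encodeItem(value[key]));
  }
  return out;
}

/** Encode a value to CBOR bytes; map keys are written in insertion order. */
function encodeCbor(value: CborValue): Uint8Array {
  return new Uint8Array(encodeItem(value));
}
```

A custom entry sits next to the common ones without any schema change, for example `encodeCbor({ ts: 1000, key: 1, "x-app-label": "demo" })`, where `x-app-label` is a made-up application key.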

Bernard: Things useful for evaluating experiments, e.g., capture time, you wouldn't send on the wire but useful for experiments

Harald: We also use capture time for reasons

Peter: Since P2P and QUIC were brought up, is anyone interested in a P2P QUIC API?

Randall: P2P QUIC would be more useful for WebRTC like things
… Why was it dropped?

Peter: To focus on client/server initially, then come to it later
… Anything else to discuss?

Harald: I was hoping to learn new use cases. I'll monitor what MoQ is doing

Henrik: I was hoping to see if the serialization use case can be done, and it sounds like it's possible to do today

Peter: Yes, it is possible
… And some are already doing it, so question is about need for interop

Henrik: If the format and API are flexible enough it may be preferable to doing it yourself
… But risk if there's anything you can't do in the proposed format, people would do their own thing anyway

Xiaohan: There are lots of container formats, so it depends on use cases. There's a profile Peter mentioned. If it requires super low latency, what should be done?

Peter: If it's standardised and people can pick it up and use it, it might be a default people would reach for rather than start from scratch

Xiaohan: If we get to the point where the browser can do the muxing. Is the CMAF concern to do with the size of the generated stream, or complexity?

Peter: Size, if you're sending each as a separate message
… At least for depacketizing, you hand a CMAF chunk to MSE

Chris: Interest in WebCodecs for MSE. We're in effect polyfilling the video element with buffering and a canvas

Peter: You can hand over decoded VideoFrames to MSE
… Integration with MSE a good thing to think about

Youenn: Would be interesting to know what's missing in MSE for you to migrate to that, what features we could add

Peter: Thanks for coming. If you have renewed interest or use case, let us know

[adjourned]

Minutes manually created (not a transcript), formatted by scribe.perl version 221 (Fri Jul 21 14:01:30 2023 UTC).

Diagnostics

Maybe present: ??, Bernard, Chris, Harald, Henrik, Peter, Randall, Xiaohan, Youenn

All speakers: ??, Bernard, Chris, Harald, Henrik, Peter, Randall, Xiaohan, Youenn

Active on IRC: cpn, cpn_, Ian