W3C

– DRAFT –
MEIG monthly call

02 February 2021

Attendees

Present
Barbara_Hochgesang, Chris_Cunningham, Chris_Needham, Dave_Bevan, David_Chiu, Francois_Daoust, Geun-Hyung_Kim, Harneet_Sidhana, James_Cain, James_Pearce, Jim_Helman, John_Fletcher, Kaz_Ashimura, Kazuhiro_Hoya, Leonard_Rosenthol, Markus_Weber, Michael_Weaver, Paul_Randall, Peter_Brightwell, Phil_Tudor, Pierre-Anthony_Lemieux, Rob_Smith, Steve_Becker, Takio_Yamaoka, Tatsuya_Igarashi, Will_Law, Yash_Khandelwal, Yasser_Syed
Regrets
-
Chair
Chris, Igarashi, Pierre
Scribe
tidoust

Meeting minutes

Preliminary

cpn: Welcome to the M&E IG call. Topic raised 18 months ago at TPAC 2019 around the use of Web technologies for media production.
… Since then, we've had discussions on media editing API and WebCodecs.
… Through some contacts at the BBC, we've been put in touch with guys at Grass Valley, who have a long history of developing applications, increasingly on the Web.

Web based professional media applications

Slides on "Browser Hosted Video Editing"

James_Cain: Thank you for the invitation. I will share some slides. 3 of my colleagues are on the phone.
… Talk is about using the browser as an operating system for video editing
… Grass Valley has been developing video editors over the last decade for a bunch of areas.
… We would like to start a discussion on how these use cases may shape requirements in this area.
… History: User picking a lot of shots and integrating them into a timeline. First approach, stream the media to the browser using WebRTC.
… Then moved to Silverlight, before it was dropped.
… We tried to switch to MSE to create a video editing environment.
… 3 different problems with MSE:
… 1. Splicing. You have to decode faster than real-time when you want to splice, that is quite tricky to get right.
… 2. Frame accuracy: you don't know which frame is being rendered. We had a stream of JPEGs to complement the video; when we paused, we would display the JPEG instead.
… 3. Aligning audio and video: You have to put time in the payloads of the compressed media. If you trim, you may end up with different times, which makes things tricky.
… Then, we started to use Emscripten and WebAssembly, through which we could ship our own codecs. We were back to a somewhat native environment.
… We could handle GOPs correctly, control precise frame rendering, etc.
… No encryption with this solution (cannot use EME).
… The real problem with this approach is that it runs really HOT.
… A 4K stream is going to make the CPU sweat quite hard.
… So, we got into WebCodecs.
… The first obvious benefit is that it runs much cooler!
… Therefore, we can put higher quality media.
… Obviously, it's only research at this point.
… On to a quick demo!
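[The GOP handling and precise frame rendering mentioned above typically amount to: start decoding at the keyframe that opens the GOP, then discard frames until the target. A minimal illustrative sketch, not Grass Valley's actual code; the sample shape is assumed:]

```javascript
// Sketch: frame-accurate seeking within GOP-structured media.
// Given sample timestamps and keyframe flags, compute a decode plan
// for an arbitrary target frame: start at the nearest preceding
// keyframe and count how many decoded frames to drop before display.
function planSeek(samples, targetTimestamp) {
  // samples: [{ timestamp, isKey }], sorted by timestamp (assumed shape)
  const targetIndex = samples.findIndex((s) => s.timestamp === targetTimestamp);
  if (targetIndex < 0) throw new Error("timestamp not in track");

  // Walk back to the keyframe that opens the GOP containing the target.
  let keyIndex = targetIndex;
  while (keyIndex > 0 && !samples[keyIndex].isKey) keyIndex--;

  return {
    startIndex: keyIndex,                  // first sample to feed the decoder
    discardCount: targetIndex - keyIndex,  // decoded frames to drop
  };
}

// Example: ~30 fps track, GOP length 8, timestamps in microseconds.
const track = Array.from({ length: 16 }, (_, i) => ({
  timestamp: i * 33333,
  isKey: i % 8 === 0,
}));

const plan = planSeek(track, 11 * 33333);
console.log(plan); // { startIndex: 8, discardCount: 3 }
```

This is also why splicing needs faster-than-real-time decoding: all the discarded frames before the splice point must still be decoded.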

jamespearce: You see a browser with an editing timeline. One video track and a handful of audio tracks.
… The example shows a very simple timeline, with some transitions and tweaking of audio segments.
… Transitions are applied in WebGL, so we can do a lot of cool stuff using the shader language.
… Anything that the Web Audio API can do, we can use to apply our audio transitions as well.
… Pretty good performance with WebAssembly but we need to remain with pretty low quality video clips. Not the quality you can send to an end user.
… This is where WebCodecs can help.
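[A typical Web Audio transition of the kind described is an equal-power crossfade. A hedged sketch, assuming two clips each routed through a GainNode; the curve helper is hypothetical, and the browser-only wiring is shown in comments:]

```javascript
// Sketch: equal-power crossfade curves for an audio transition.
// Total power stays roughly constant: fadeOut^2 + fadeIn^2 = 1.
function equalPowerCurves(steps) {
  const fadeOut = new Float32Array(steps);
  const fadeIn = new Float32Array(steps);
  for (let i = 0; i < steps; i++) {
    const t = i / (steps - 1); // 0 → 1 across the transition
    fadeOut[i] = Math.cos((t * Math.PI) / 2);
    fadeIn[i] = Math.sin((t * Math.PI) / 2);
  }
  return { fadeOut, fadeIn };
}

const { fadeOut, fadeIn } = equalPowerCurves(101);

// In a browser, these curves would drive two GainNodes, e.g.:
//   outgoingGain.gain.setValueCurveAtTime(fadeOut, ctx.currentTime, duration);
//   incomingGain.gain.setValueCurveAtTime(fadeIn, ctx.currentTime, duration);

console.log(fadeOut[50].toFixed(3), fadeIn[50].toFixed(3)); // 0.707 0.707
```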

cpn: I think that you're getting to the point where we might want to go deeper, with WebCodecs motivations.
… The realisation that people were shipping codecs in WASM was one of the motivations for WebCodecs.

chcunningham: Co-editor of the WebCodecs spec with Paul Adenot. From Google.
… Just wanted to invite James to keep going and go deeper on technical details :)

RobSmith: I'm leading the WebVMT proposal. Have you considered editing of metadata tracks, or is it just audio/video tracks?

James_Cain: That is on our roadmap, and some feedback for WebCodecs folks to expose some of the side tracks as well.
… We tie tracks together with temporal offsets. You could have completely independent tracks.

jamespearce: We do also have the concept of markers, tags.

RobSmith: Thanks, I suggest to take the rest of this conversation offline.

Will: Akamai, CTA WAVE. Even with WebCodecs, do you run into issues at 60 fps where there is a buffer in the display itself, so that you're seeing the wrong frame on screen?

jamespearce: Not a problem with WebCodecs, but a limitation if you use hardware decoders with regards to how many decoded frames you can keep.
… That can be a potential issue.
… We need to keep more buffers than what a hardware decoder might have.

Will: Is that still the case after Chrome made the change on the buffering mode?

jamespearce: For the moment, not really relevant for us with WebCodecs.

Will: Are there other components of the UI that still require WebAssembly?

jamespearce: The MP4 parser is in WebAssembly.
… To extract the right bits to give to WebCodecs. That's a pretty trivial part though.
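[The glue between a demuxer and WebCodecs is indeed fairly trivial: map each parsed sample to the init dictionary that the `EncodedVideoChunk` constructor takes. A sketch under assumed names; the `sample` shape is hypothetical and would come from whatever MP4 parser (WASM or JS) is in use:]

```javascript
// Sketch: mapping a demuxed MP4 sample to an EncodedVideoChunk init
// dictionary. WebCodecs timestamps are in microseconds, so track
// timescale units are converted here.
function sampleToChunkInit(sample, timescale) {
  return {
    type: sample.isSync ? "key" : "delta",
    timestamp: Math.round((sample.cts * 1e6) / timescale),
    duration: Math.round((sample.duration * 1e6) / timescale),
    data: sample.bytes, // the raw encoded access unit
  };
}

const init = sampleToChunkInit(
  { isSync: true, cts: 3000, duration: 1001, bytes: new Uint8Array(0) },
  30000 // a typical 29.97 fps track timescale
);
console.log(init.type, init.timestamp, init.duration); // key 100000 33367

// In the browser: decoder.decode(new EncodedVideoChunk(init));
```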

James_Cain: Do you have things to do with setting up SPS/PPS?

jamespearce: Not really. Timing tends to be driven by the audio anyway. We know how many frames we need.

kaz: W3C Team. Also involved in the WoT WG. From that viewpoint, I was wondering about the business benefits of handling this in the Web browser.

James_Cain: I haven't really prepared user stories. I was just presenting the technical side.
… Our customers want something that can work anywhere without having to install a lot of software on their machine, or without e.g. a remote Adobe Premiere computer that they connect to.

jhelman: MovieLabs. Great presentation. How do you see this from a usability perspective compared to remote desktop use cases? Remote is a bit challenging with regards to latency, with companies trying to offer competitive SaaS there.
… How does it currently stack up compared to more traditional alternatives?

jamespearce: We are doing some fairly significant amount of caching. 1000, 2000 fragments over a second. You're always going to be limited by the network and the backend infrastructure.
… Latency is not a real issue. It can be from time to time, if everyone connects at the same time, but that's more an issue of scaling.
… You can't really avoid desktop editing.

James_Cain: We get fairly good performance with remote timeline.
… A different perspective is to complement these technologies with instances running on virtual machines.
… As things progress, the value of dedicated high-quality desktop video editors will be eroded.
… We're quite aggressive pre-caching things, to make sure that latency remains minimal. It's quite often that you're hitting a cache.
… Randomly jumping around is not a major need.

jamespearce: We're also talking about proxies here so it's not a lot of bandwidth either compared to streaming to many users.

Digging into WebCodecs

James_Cain: Some comments on WebCodecs. The first thing we've found is that, since we're trying to build something deterministic, we need APIs that are deterministic as well.
… We need to be able to tell up front whether the browser can decode a particular format, rather than having to try and cope when it fails.
… We also want to have really good access to lots of uncompressed frames. GB of cached frames, that's what we're considering.
… With hardware codecs, if there's limited buffering available, we may prefer a software codec with lots of memory available.
… How do we get an easier way to say "we need lots of copies of this"?
… The presentation time of a frame tells us nothing about buffering time.
… Quite a lot of codecs will transport things other than audio/video, it would be good to expose these.
… We use WebGL (later WebGPU) to enable shader language rendering of effects.
… Does WebCodecs do encoding in hardware, so that the browser can stream content? As opposed to using a software encoder.
… We know who our users are, they are authenticated. But there are use cases where we'll still need DRM-protected streams to work. Is it a reasonable assumption that the DRM offered by EME is complementary to shader rendering?
… Finally, some other use cases.
… Electronic news gathering: journalists in the field with a laptop and a phone. Could they encode media and stream it directly? Even the regular 1080p profile is not necessarily possible today.
… Could the application provide some encoding support securely?
… Also color support (HDR and Wide Color Gamut).
… Looking ahead, many broadcasters use the cloud for news and sports feeds. Production is moving to the cloud. If we've got production and distribution in the cloud, and amazing technologies in browsers, we're starting to be in a position where we can tailor the stream to each individual.
… Rendering at the edge in other words.
… If recipes can go to the edge (BBC calls that Object Based), that would enable new use cases.

chcunningham: First comment was around Capabilities API. Our intention is to provide such an API. There is a PR to add this to the spec.
… You pass the same configuration structure as to configure(), and we reply with yes or no.
… We also give back the dictionary to tell which fields were understood.

<chcunningham> https://github.com/WICG/web-codecs/pull/120
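[The capability query under discussion in that PR later shipped as `VideoDecoder.isConfigSupported()`, which resolves to `{ supported, config }`. A hedged sketch of how an application might probe candidate configs up front; the probe function is injected here so the helper can be exercised without a browser, and the helper name is hypothetical:]

```javascript
// Sketch: pick the first decoder config the browser supports,
// falling back (e.g. to a WASM decoder) if none is.
async function pickSupportedConfig(candidates, probe) {
  for (const config of candidates) {
    const result = await probe(config);
    if (result.supported) return result.config; // echoes recognized fields
  }
  return null; // caller falls back to software/WASM
}

// Fake probe standing in for VideoDecoder.isConfigSupported:
const fakeProbe = async (config) => ({
  supported: config.codec.startsWith("avc1"),
  config,
});

pickSupportedConfig(
  [{ codec: "hvc1.1.6.L123.B0" }, { codec: "avc1.64001f" }],
  fakeProbe
).then((config) => console.log(config)); // { codec: "avc1.64001f" }
```

In a browser one would pass `(c) => VideoDecoder.isConfigSupported(c)` as the probe.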

chcunningham: Next issue, about keeping frames around. I see the value, and we're aligned on the challenges of GPU-backed decoders.
… Solution is to copy frames to canvas but we recognize that this is not ideal. Idea is to have a copy API from GPU to CPU. No PR for now.
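[The canvas round-trip mentioned as the interim workaround would look roughly like the commented sketch below; the browser-only calls are comments, and the size arithmetic illustrates why "GBs of cached frames" adds up quickly. An illustrative sketch, not a spec-sanctioned path:]

```javascript
// Sketch: copying a decoded frame from GPU to CPU via a canvas.
function rgbaByteLength(width, height) {
  return width * height * 4; // getImageData yields RGBA, 1 byte/channel
}

// In a browser, for a decoded VideoFrame `frame`:
//   const canvas = new OffscreenCanvas(frame.codedWidth, frame.codedHeight);
//   const ctx = canvas.getContext("2d");
//   ctx.drawImage(frame, 0, 0);            // GPU → canvas
//   const pixels = ctx.getImageData(       // canvas → CPU copy
//     0, 0, frame.codedWidth, frame.codedHeight).data;
//   frame.close();                         // release the decoder's buffer

const perFrame = rgbaByteLength(1920, 1080); // ≈ 8.3 MB per 1080p RGBA frame
console.log(perFrame); // 8294400
```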
… SEI message support. I thought they were part of the muxed stream. Did I misunderstand?

Michael: It is in the H.264 stream.

chcunningham: We'll have to look at what libraries do in that regard to see what unifying solution we could offer.

James_Cain: I can put you in touch with people who know.

chcunningham: That would be good.
… Next, the intention is to offer encoding in the same way that it works for WebRTC.
… Via the allowed/required/denied mechanism, you can detect whether hardware decoding is supported.
… On DRM, there have been no discussions on this so far. We could build on top of EME somehow, if you think of the CDM being a decoder.
… There are some tricks. Some big questions to solve.
… It's not off the table, I understand the use case. We should open a GitHub issue to keep an eye on it.
… About bespoke codecs, I was a bit confused about that one.
… We're thinking that you could wrap things so that you could provide a WASM implementation as a fallback.

James_Cain: Some high-quality codecs are very specific to production, very expensive. You wouldn't want every browser to have them. Any way to extend support for codecs is what I'm looking at.

chcunningham: That requires more discussion. If you have experts or pointers to such codecs, that would help.
… I don't think it will be a priority for now though.

jamespearce: Being able to control the YUV to RGB conversion could be useful.

chcunningham: We have some open issues around that. It is our intention to support BT.2020 et al.

<RobSmith> On the SEI topic - I worked on a project to parse MISB metadata (SMPTE KLVs) from MPEG-2 TS last year where metadata was part of the muxed stream, i.e. in-band. WebVTT & WebVMT use an out-of-band approach, so I'm interested in both.

cpn: W3C has a dedicated CG for discussions on that topic because it spans a number of Web APIs. CSS for page rendering, Canvas, WebGL/WebGPU.

<cpn> https://github.com/w3c/ColorWeb-CG/

<RobSmith> OGC Testbed-16 FMV: https://portal.ogc.org/files/?artifact_id=91644#PartFMV

cpn: Anybody can join and participate.

leonardr: Exactly the same comment. Please come to the Color on the Web CG!

cpn: What do we do next in this group? I'm hearing a success story, with some details to be worked through.
… Is there a wider discussion that we need to have in this group about other things that may be needed to support editing use cases?
… And is there appetite in the group to actively participate in such discussions?
… I suggest we have a follow-up conversation, during the next call. First Tuesday of next month.
… Take a step back from the specifics of WebCodecs and look at the Web platform to identify potential gaps that may still exist.

James_Cain: Thanks for giving us the opportunity to present.

<kaz> [adjourned]

Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).