Improving Clipchamp's in-browser video editing pipeline with WebCodecs

Presenter: Soeren Balko (Clipchamp)
Duration: 10 minutes
Slides: PDF

Slide 1 of 12

Hello, and welcome to my talk. In this presentation I'm going to talk about how Clipchamp has adopted WebCodecs over roughly the past 12 months. Clipchamp is an in-browser video editor.

Slide 3 of 12

At Clipchamp, our mission is to empower anyone to tell stories worth sharing.

Slide 4 of 12

And what that looks like is we have built a fully in-browser video editor.

Slide 5 of 12

And we have firmly believed, from the early days when we founded the company, that only in-browser platforms offer the convenience of a cloud service combined with the speed of a desktop application, and so we've used that in-browser paradigm.

Slide 6 of 12

For one, and this is probably the most important reason, it caters for a great user experience. And the reason for that is that video files are large.

It also comes with the fantastic advantage from a commercial perspective that it has near zero runtime cost. The hardware has already been paid for.

And finally, there's also the aspect of privacy, which is obviously very important in this day and age: anything that you do not upload to a cloud service, but that stays on your device, obviously remains private.

Slide 7 of 12

The browser remains a challenging platform, as you all know, especially when you do heavy lifting like video encoding or decoding.

And that is obviously, first and foremost, resource allocation. Memory is still scarce. Performance is equally important: video encoding is expensive if you use a software encoding paradigm, as we have done for the longest time. Access to low-level hardware capabilities, again, is limited. Yes, these days you do have WebAssembly SIMD, and we've made good use of that. You have had WebGL for a long time. WebGPU has thankfully just launched. And obviously that's where WebCodecs fits in.

We now also have access to the circuitry that's devoted to hardware-accelerated video and audio decoding and encoding. And that's fantastic for us.

The two aspects that I encircled in purple here, performance and access to low-level hardware capabilities, are obviously the challenges that WebCodecs addresses.

Slide 8 of 12

Let me talk about our video processing pipeline, and specifically about probably the most challenging part, which is exporting a video project that you've been editing, where you have put together various pieces of footage. So we take that project, which combines all of that, or can combine all of that, and turn it into an MP4 file, which you can then distribute around the web.

That is quite an involved process. And again, we run it all in the browser and currently it's subdivided into three stages.

Here we have the decoder, which takes care of extracting a stream of raw frames from your source footage. Those are then passed into the middle stage, which we call the compositor. That is where all of those pieces of footage are combined and effects are layered on top. That one, again, produces a composed stream of raw frames, which is then passed into the encoder, which turns it into an MP4 file.
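To make the three stages concrete, here is a minimal sketch of a decoder, compositor and encoder chained as pull-based stages. It is not Clipchamp's code: the `Pull`, `RawFrame`, `decodeStage`, `compositeStage` and `encodeStage` names are hypothetical, and the stages only shuffle labels around instead of touching real pixels.

```typescript
// A pull-based stage: each call resolves to the next item, or null at end of stream.
type Pull<T> = () => Promise<T | null>;

// Hypothetical raw frame; a real pipeline would carry pixel data here.
interface RawFrame { timestampUs: number; layers: string[] }

// Stage 1 (decoder stand-in): emits `count` raw frames for one piece of footage.
function decodeStage(clip: string, count: number, fps: number): Pull<RawFrame> {
  let i = 0;
  return async () =>
    i < count ? { timestampUs: Math.round((i++ * 1e6) / fps), layers: [clip] } : null;
}

// Stage 2 (compositor stand-in): merges one frame from every source into one composed frame.
function compositeStage(sources: Pull<RawFrame>[]): Pull<RawFrame> {
  return async () => {
    const frames = await Promise.all(sources.map((next) => next()));
    if (frames.length === 0) return null;
    const layers: string[] = [];
    for (const f of frames) {
      if (f === null) return null; // stop as soon as any source runs out
      layers.push(...f.layers);
    }
    return { timestampUs: (frames[0] as RawFrame).timestampUs, layers };
  };
}

// Stage 3 (encoder stand-in): drains the composed stream into a list of "packets".
async function encodeStage(input: Pull<RawFrame>): Promise<string[]> {
  const packets: string[] = [];
  for (let frame = await input(); frame !== null; frame = await input()) {
    packets.push(`${frame.timestampUs}:${frame.layers.join("+")}`);
  }
  return packets;
}
```

Because each stage only pulls from the one before it, a stage never runs further ahead than its consumer asks for, which matters when every raw frame is several megabytes.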

We basically retrofitted WebCodecs into this architecture.

Slide 9 of 12

And the way we did that is we combined the WebAssembly build of FFmpeg that I was just showing you with the WebCodecs API. To do that, we created codec stubs inside FFmpeg, and what those codec stubs then do is call out from that WebAssembly build into JavaScript land.

And that is basically to go through the WebCodecs API lifecycle, or, it's probably better to say, the lifecycle of the VideoEncoder API inside the WebCodecs specification. Initially, you initialize and configure your VideoEncoder instance, obviously.

And then you go through the process of pushing in the raw frames and pulling out the encoded chunks, which are passed back to FFmpeg, which then puts them into the container format. And lastly, you close down and de-initialize your VideoEncoder instance.
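The lifecycle just described (configure, push frames in, collect chunks, close) can be sketched as follows. This is only an illustration: the interfaces below mirror the parts of the WebCodecs `VideoEncoder` shape that are used (`configure`, `encode`, `flush`, `close`, `state`), while `encodeAll`, `makeEncoder`, `onChunk` and the keyframe interval are assumptions made for the example. In a browser, `makeEncoder` could be `(init) => new VideoEncoder(init)`.

```typescript
// Minimal structural types mirroring the parts of VideoEncoder used here,
// so the sketch type-checks outside a browser as well.
interface EncoderInit<C> { output: (chunk: C) => void; error: (e: Error) => void }
interface EncoderConfig {
  codec: string; width: number; height: number; bitrate?: number; framerate?: number;
}
interface EncoderLike<F> {
  readonly state: string; // "unconfigured" | "configured" | "closed"
  configure(config: EncoderConfig): void;
  encode(frame: F, options?: { keyFrame?: boolean }): void;
  flush(): Promise<void>;
  close(): void;
}

async function encodeAll<F, C>(
  makeEncoder: (init: EncoderInit<C>) => EncoderLike<F>,
  config: EncoderConfig,
  frames: F[],
  onChunk: (chunk: C) => void, // hypothetical hook, e.g. hand the chunk to the muxer
): Promise<void> {
  const errors: Error[] = [];
  // 1. Initialize and configure the encoder.
  const encoder = makeEncoder({ output: onChunk, error: (e) => { errors.push(e); } });
  try {
    encoder.configure(config);
    // 2. Push raw frames in; encoded chunks come back through `onChunk`.
    //    The keyframe interval of 30 is an arbitrary illustrative choice.
    //    With real VideoFrame objects, the caller also closes each frame after encode().
    frames.forEach((frame, i) => encoder.encode(frame, { keyFrame: i % 30 === 0 }));
    // Wait until every queued frame has been emitted.
    await encoder.flush();
  } finally {
    // 3. Close down, unless an error has already closed the encoder.
    if (encoder.state !== "closed") encoder.close();
  }
  if (errors.length > 0) throw errors[0];
}
```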

Slide 10 of 12

Now that has worked really well for us, but it didn't come without some gotchas. And that's maybe the most interesting part of this talk.

And maybe it will also stimulate some discussion around what can be done about that, for others who have had the same challenges.

What we had to do is create an artificial, pre-flight dry run of the video encoder, using the same parameters, same resolution, same everything, just to generate one piece of extra data.

I think in WebCodecs API parlance it's called a description. If you know about H.264, it has these binary units in there, which are basically metadata, the SPS and PPS, and that has to be passed back to FFmpeg such that it can be put into the container format at the right place. I'll leave it at that, but it has given us a bit of a headache to get this right.
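A dry run of this kind might look roughly like the sketch below: encode a single throwaway frame with the real export settings and keep only the description that arrives in the output callback's metadata (`metadata.decoderConfig.description` in WebCodecs). The helper name `probeDescription`, the `makeEncoder` factory and the dummy frame are assumptions for illustration, not the approach Clipchamp actually shipped. Per the WebCodecs AVC registration, a description is only produced when the encoder is configured for the "avc" bitstream format rather than "annexb".

```typescript
// Structural stand-ins for the WebCodecs types used below, so this type-checks anywhere.
interface OutputMetadata {
  decoderConfig?: { codec: string; description?: ArrayBuffer | ArrayBufferView };
}
interface ProbeInit {
  output: (chunk: unknown, metadata?: OutputMetadata) => void;
  error: (e: Error) => void;
}
interface ProbeEncoder<F> {
  readonly state: string;
  configure(config: object): void;
  encode(frame: F, options?: { keyFrame?: boolean }): void;
  flush(): Promise<void>;
  close(): void;
}

// Hypothetical helper: encode one throwaway frame and return the description bytes.
async function probeDescription<F>(
  makeEncoder: (init: ProbeInit) => ProbeEncoder<F>,
  config: object, // should match the real export settings exactly
  dummyFrame: F,  // any frame of the right size, e.g. a black one
): Promise<Uint8Array> {
  const found: Uint8Array[] = [];
  const errors: Error[] = [];
  const encoder = makeEncoder({
    output: (_chunk, metadata) => {
      const d = metadata?.decoderConfig?.description;
      if (!d) return;
      // Copy the bytes so they stay valid after the encoder is closed.
      found.push(ArrayBuffer.isView(d)
        ? new Uint8Array(d.buffer, d.byteOffset, d.byteLength).slice()
        : new Uint8Array(d).slice());
    },
    error: (e) => { errors.push(e); },
  });
  try {
    encoder.configure(config);
    encoder.encode(dummyFrame, { keyFrame: true });
    await encoder.flush();
  } finally {
    if (encoder.state !== "closed") encoder.close();
  }
  if (errors.length > 0) throw errors[0];
  if (found.length === 0) throw new Error("encoder produced no description");
  return found[0];
}
```

The returned bytes can then be handed to the muxer before the first real frame is encoded.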

One concern, which has less to do with WebCodecs and more with the fact that FFmpeg as such is normally used as a command-line tool: that means it's typically a synchronous call stack, and that's inherently not a good fit for WebCodecs, which is, like almost any browser API, asynchronous. You have to break up the synchronous call stack of FFmpeg into multiple asynchronous calls.
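One way to picture the mismatch: the encoder hands over chunks through a callback at some later point, while a muxing loop wants to ask for "the next packet" and get an answer. A small queue that turns callbacks into awaitable pulls is a common bridging pattern. The sketch below is a generic illustration with a hypothetical `AsyncQueue` class; the talk does not say how FFmpeg's call stack was actually restructured.

```typescript
// Bridges a callback-style producer to a consumer that awaits one item at a time.
class AsyncQueue<T> {
  private items: T[] = [];
  private waiters: Array<(item: T) => void> = [];

  // Producer side: call this from the encoder's output callback.
  push(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter(item); // a consumer is already waiting: hand the item over directly
    } else {
      this.items.push(item); // otherwise buffer it until someone asks
    }
  }

  // Consumer side: resolves at once if an item is buffered,
  // otherwise when the next item is pushed.
  next(): Promise<T> {
    if (this.items.length > 0) {
      return Promise.resolve(this.items.shift() as T);
    }
    return new Promise<T>((resolve) => { this.waiters.push(resolve); });
  }
}
```

A consumer would then write `const chunk = await queue.next()` in place of a blocking call, which is exactly the point where the formerly synchronous call stack has to be split.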

Slide 11 of 12

And lastly, we have a bit of a wishlist. Obviously WebCodecs 1.0, or the current version of WebCodecs, I should say, is great. A big thank you goes out to all the people who have pushed for this standard and who have implemented it in browsers.

I can only imagine that it might've been a hard sell, because it's a very specialized capability. Nevertheless, it's fantastic that this has happened.

That being said, we're hoping that it doesn't remain this way, but that there will be further iterations on that standard. And there are a few things we are hoping for that would make our lives a little bit easier.

One thing we were struggling with, and where we again only have a crutch in place, is that there's no active back pressure on the VideoEncoder object, where the VideoEncoder object actively tells you to throttle the rate at which you pass in raw video frames because, basically, "my buffers are all full". So it would be good to see some sort of event coming out of the VideoEncoder object that actively tells you "stop until I tell you otherwise".
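Without such an event, one possible crutch is to poll the encoder's `encodeQueueSize` attribute, which WebCodecs does expose, and to hold back new frames while it is above some limit. The sketch below assumes that approach; the talk does not describe the workaround actually used, and `encodeWithThrottle`, `maxQueued` and `pause` are hypothetical names with arbitrary defaults.

```typescript
// The only parts of the encoder this sketch relies on.
interface QueueAwareEncoder<F> {
  readonly encodeQueueSize: number; // number of pending encode requests
  encode(frame: F): void;
}

async function encodeWithThrottle<F>(
  encoder: QueueAwareEncoder<F>,
  frames: F[],
  maxQueued: number = 4, // arbitrary illustrative limit
  // Default pause: yield to the event loop for a few milliseconds.
  pause: () => Promise<void> = () =>
    new Promise<void>((resolve) => { setTimeout(resolve, 5); }),
): Promise<void> {
  for (const frame of frames) {
    // Poll until the encoder has worked off enough of its backlog.
    while (encoder.encodeQueueSize >= maxQueued) {
      await pause();
    }
    encoder.encode(frame);
  }
}
```

Polling like this wastes wake-ups and reacts late, which is why an explicit signal from the encoder would be the cleaner design.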

Another concern is that right now, the way to configure video encoders is largely through the bitrate. That's okay for many typical workloads, think about streaming scenarios and so forth, but in our case, for a video editor, you are probably a bit more obsessed about quality, visual quality that is. It would be good to have some sort of semantic quality tuning knob, more like an abstract parameter that goes from zero to one, where one means perfect quality, zero means crap quality, and anything in between, which gives you a more objective means to control the quality than the bitrate.
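Until such a knob exists, an application can only approximate it by translating a zero-to-one quality value into a bitrate itself. The sketch below does that with a linear bits-per-pixel interpolation. The function name and both bounds are made-up placeholders for illustration, not recommended values, and a fixed bitrate still does not guarantee a given visual quality across different content.

```typescript
// Illustrative placeholder bounds, in bits per pixel; not recommended values.
const MIN_BITS_PER_PIXEL = 0.05;
const MAX_BITS_PER_PIXEL = 0.25;

// Maps a quality value in [0, 1] to a target bitrate in bits per second.
function bitrateForQuality(
  quality: number,
  width: number,
  height: number,
  fps: number,
): number {
  // Clamp out-of-range input to [0, 1].
  const q = Math.min(1, Math.max(0, quality));
  const bitsPerPixel =
    MIN_BITS_PER_PIXEL + (MAX_BITS_PER_PIXEL - MIN_BITS_PER_PIXEL) * q;
  return Math.round(width * height * fps * bitsPerPixel);
}
```

The result would go into the `bitrate` field of the encoder configuration.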

Obviously HDR matters more and more as more HDR displays come into use. On phones it's already commonplace, TVs are pretty much the same, PC monitors maybe not quite as much yet, but it would be good to see that being supported by WebCodecs as well.

Then there's HEVC decoding. I know there's an entire discussion around IP and patent liability and so forth, but it would be great if WebCodecs could tap into any codec that is installed on the device, even if it's selective. If on some devices HEVC is not supported, fine, then we have to offer a fallback in some sort of software or service-side encoding, but it would be good if at least the codecs that are installed were funneled through and exposed through WebCodecs.
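The "use it where it exists, fall back otherwise" logic can be sketched as a small capability probe. WebCodecs provides static `isConfigSupported()` methods on its codec classes that resolve to an object with a `supported` flag. The `pickCodec` helper and the `SupportProbe` type below are hypothetical, and the probe is passed in so that the sketch also runs outside a browser.

```typescript
// Shape of a support check, modelled on WebCodecs' isConfigSupported().
type SupportProbe = (config: { codec: string }) => Promise<{ supported?: boolean }>;

// Returns the first codec string the device reports as supported, or null if
// none is, in which case the caller would switch to its fallback path.
async function pickCodec(
  candidates: string[],
  probe: SupportProbe,
): Promise<string | null> {
  for (const codec of candidates) {
    try {
      const result = await probe({ codec });
      if (result.supported) return codec;
    } catch (e) {
      // A rejected probe (e.g. an unrecognized codec string) counts as unsupported.
    }
  }
  return null;
}

// In a browser this might be called as (codec strings are examples only):
//   pickCodec(["hvc1.1.6.L93.B0", "avc1.42001f"],
//             (config) => VideoDecoder.isConfigSupported(config));
```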

And unfortunately there are certain well-known companies that push for HEVC despite the difficult IP situation. So that's an audience we want to cater for.

And, as I mentioned, a synchronous flavor of the WebCodecs API inside workers would be useful as well.

I didn't include the demuxing and muxing part. I mean, we support these through FFmpeg. And while I wouldn't claim this is one of our challenges, for new adopters of the WebCodecs API it is probably one of the greatest concerns.

In general, WebCodecs as such doesn't offer anything by way of parsing a video file, which is the demuxing part, or producing an MP4 file, which is the muxing part.

I don't know what the solution to that might be. There's obviously a lot of complexity in this part, and FFmpeg already does all of that. So that's why I didn't include it. I'm not sure if it's a good idea to include this in WebCodecs, because it is a bit of a different concern, even though pretty much anyone using WebCodecs would probably have to deal with it one way or another.

Slide 12 of 12

And that concludes my talk. Thank you. And I'm free for questions.

Workshop sponsor

Adobe
