W3C

– DRAFT –
Media & Entertainment IG call

02 June 2020

Attendees

Present
Ali_C_Begen, Andreas_Tai, Barbara_Hochgesang, Chris_Needham, Estella_Oncins, Francois_Daoust, Fuqiao_Xue, Garret_Snger, Gary_Katsevman, Harneet_Sidhana, Huaqi_Shan, John_Riviello, John_Simmons, Kaz_Ashimura, Kazuhiro_Hoya, Larry_Zhao, Mounir_Lamouri, Nigel_Megitt_BBC, Paul_Adenot, Pierre, Pierre-Anthony_Lemieux, Rob_Smith, Steve_Becker, Takio_Yamaoka, Tatsuya_Igarashi, Will_Law, Yajun_Chen, Yash_Khandelwal
Regrets
-
Chair
Chris, Igarashi, Pierre
Scribe
tidoust

Meeting minutes

Agenda

cpn: Main topic is Client-side video editing proposal.
… Most of the call today will be about that. We can leave some time at the end for a general status update on topics of interest: media integration guidelines, bullet chatting.
… Also, TPAC plans if time allows.

Client-side video editing (MediaBlob)

cpn: Welcome folks from Microsoft who are behind the proposal

Client-side video editing explainer

TAG review on Client-side video editing

Yash: SDE at Microsoft. Steve Becker is also here.
… We'll go through current issues, the solution, a case study, and relationship with WebCodecs.
… JS libraries are not able to take advantage of optimizations.
… Using MediaRecorder is another path. To get 15 minutes of video, you need 15 minutes of recording, which is problematic.
… With cloud-based processing, the processing consumes bandwidth since video needs to be sent back and forth.
… There could be a queue delay too, as happens on YouTube for instance.
… We are proposing an API called MediaBlob.
… Simply pass a blob to the constructor, which will detect the media stream settings.
… "duration" gives the duration.
… We're also proposing editing operations separately, such as trim, concat (a list of MediaBlobs into one MediaBlob), split (one MediaBlob into two), and finalize.
… [showing code example]
… Creating a MediaBlob and an operation on it, which we can trim and concat, then finalize to run the actual editing operation.
… The whole thing takes 728ms, in the browser.
… The API is easy to use with MediaRecorder and the File API.
… No need to have specific knowledge about media concepts, codec licensing issues, etc.
… Flipgrid is a Microsoft-owned company.
… Flipgrid use of MediaBlob allowed them to multiply video editing capacities by 40.
… WebCodecs is a powerful API that provides support for live gaming, RTC and others. It provides transcoding. But no support for demux operations.
… For instance, with WebCodecs, you may need an external demuxer.
… And then WebCodecs and an external muxer in the end.
… With MediaBlob, for transcoding, trimming and so on, we don't need any external demuxer/muxer.
… MediaBlob exists in the Edge browser, and is available as an Origin Trial.
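
As an illustration of the deferred editing model described above (operations are queued, and work only happens on "finalize"), here is a minimal sketch. It is not the code shown on the call and not the proposed API: `ToyClip` and `ToyOperation` are hypothetical stand-ins that track durations only, so the flow can be run anywhere.

```typescript
// Toy stand-ins only: these classes are NOT the proposed MediaBlob API.
// They track clip durations so the "queue edits, then finalize" flow can be run anywhere.
class ToyClip {
  name: string;
  durationMs: number;
  constructor(name: string, durationMs: number) {
    this.name = name;
    this.durationMs = durationMs;
  }
}

type QueuedEdit =
  | { kind: "trim"; startMs: number; endMs: number }
  | { kind: "concat"; others: ToyClip[] };

class ToyOperation {
  private source: ToyClip;
  private queue: QueuedEdit[] = [];
  constructor(source: ToyClip) {
    this.source = source;
  }
  trim(startMs: number, endMs: number): this {
    this.queue.push({ kind: "trim", startMs, endMs });
    return this;
  }
  concat(others: ToyClip[]): this {
    this.queue.push({ kind: "concat", others });
    return this;
  }
  // Nothing is computed until finalize() replays the queued edits in order.
  finalize(): ToyClip {
    let duration = this.source.durationMs;
    for (const edit of this.queue) {
      if (edit.kind === "trim") {
        if (edit.startMs < 0 || edit.endMs > duration || edit.startMs >= edit.endMs) {
          throw new RangeError("trim range outside the clip");
        }
        duration = edit.endMs - edit.startMs;
      } else {
        duration += edit.others.reduce((sum, clip) => sum + clip.durationMs, 0);
      }
    }
    return new ToyClip(`${this.source.name} (edited)`, duration);
  }
}

const intro = new ToyClip("intro", 5_000);
const talk = new ToyClip("talk", 60_000);
const edited = new ToyOperation(talk).trim(10_000, 40_000).concat([intro]).finalize();
console.log(edited.durationMs); // 35000
```

In this toy model, queuing `trim` and `concat` costs nothing; any expensive step, such as transcoding in a real implementation, would be concentrated in `finalize`.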

<BarbH> Question - Impact on AV1 versus VP9?

cpn: In this group, we've been aware of this work for some time. The trigger for the invitation is you starting the TAG review last week.

<BarbH> Question - Other browsers beyond Chromium?

cpn: I'd like to invite questions, and then we could talk about professional use cases that the IG is currently exploring.

Will: On MediaBlobOperation, why is that abstraction necessary?

Yash: We wanted to align with things done in WebGL. Another reason was batching, but maybe not necessary.

<BarbH> Question - Have you reviewed in the Media WG and WebGPU CG?

Will: OK, I would argue that the coupling does not look necessary.
… Also, finalize can be very heavy. It may do transcoding, which may take time. Why not expose that more explicitly?

Yash: That's good feedback.

Will: What sort of containers do you support? I would hope CMAF-based.

Yash: Right now, HEIF MP4, WebM

Igarashi: About use cases, can this be applied to file uploading cases? How to create a MediaBlob in these cases?

Yash: I don't have an example right here, but can update the explainer to clarify this situation

Igarashi: This MediaBlobOperation may depend on the media type. Is some media capabilities detection needed to know whether an operation is supported?

Yash: That is good feedback. If MIME sniffing does not work, the creation operation will fail.

Igarashi: The "finalize" operation may take time. In some cases, it may be useful to reduce the bitrate. Would transrating be supported on top of transcoding?

Yash: We did think about it. We're still in the process of deciding whether it would be useful or not.
… We're looking for feedback on whether that's useful to expose this.

kaz: Possible additional APIs for "create" and "delete" for MediaBlob, and possibly "send"

Yash: Basically, since we're inheriting the Blob API, all of these should be easy for us.

Igarashi: I think it's possible to use trim and concatenate to cut some portion of media.

kaz: My question was more about the object itself, not chunks within the media stream. So Yash's clarification on inheriting the Blob API is OK by me.

Igarashi: split, concat enables any kind of video editing.

Yash: Why do you think we should have a "delete" operation for part of the media blob in addition to the proposed "trim" and "split"? trim and split seem to be the right answers there. Do you think that "delete" would be useful?

Igarashi: It depends on performance. If a "delete" operation is much more efficient, then it should be considered.

nigel: A quick reflection on the conversation. concat takes in a sequence of MediaBlob, rather than the result of MediaBlobOperations.
… It seems you're partly in the world of MediaBlob and partly in the world of MediaBlobOperation.

Yash: The editing only happens when "finalize" gets called.

nigel: So if I want to concat the result of a trim, then I need to call "finalize" on the individual MediaBlobs?
… If the MediaBlob you need to pass to concat is the result of a split, it seems strange that you can't prepare all operations in advance.

Yash: I see your point.

<BarbH> add me to question queue

nigel: long long time is a number of milliseconds. Have you thought about rounding issues that you might get with certain frame rates? It seems easy to have a frame rounding error.
… A 29.97 fps frame rate is often used.
… I'm just concerned that there may be rounding errors.

Yash: We haven't thought about that, but that's something we'll look into.
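
The rounding concern can be made concrete with a short sketch. It assumes a frame rate of 30000/1001 fps (about 29.97 fps) and whole-millisecond timestamps; the function names are illustrative and not part of any proposal. For comparison, the last function uses an exact rational timestamp (a numerator and denominator in seconds).

```typescript
// Assumption: 30000/1001 fps (about 29.97 fps), so frame n starts at n * 1001 / 30 milliseconds.
const FPS_NUM = 30000;
const FPS_DEN = 1001;

// Start of frame n, rounded to the nearest whole millisecond.
function frameStartRoundedMs(n: number): number {
  return Math.round((n * FPS_DEN * 1000) / FPS_NUM);
}

// Frame on screen at a whole-millisecond timestamp.
function frameAtMs(ms: number): number {
  return Math.floor((ms * FPS_NUM) / (FPS_DEN * 1000));
}

// Frame on screen at an exact rational timestamp of num/den seconds.
function frameAtRational(num: number, den: number): number {
  return Math.floor((num * FPS_NUM) / (den * FPS_DEN));
}

// Frames whose rounded start time maps back to a different frame.
function mismatchedFrames(count: number): number[] {
  const bad: number[] = [];
  for (let n = 0; n < count; n++) {
    if (frameAtMs(frameStartRoundedMs(n)) !== n) bad.push(n);
  }
  return bad;
}

console.log(mismatchedFrames(10)); // frames 1, 3, 4, 6 and 9 land on the previous frame
console.log(frameAtRational(3 * FPS_DEN, FPS_NUM)); // 3: the rational form round-trips exactly
```

Whenever the start time rounds down, the millisecond value falls just before the intended frame boundary, so a trim point expressed that way selects the previous frame.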

pal: Wanted to echo Nigel's two points. Instead of a sequence of operations on blobs, have you considered a playlist approach?
… The idea with the playlist model is that it's easier to pass around and to include multiple tracks instead of just one big track which is a multiplex of multiple ones.

<nigel> +1 to the playlist idea, also the playlist itself might be a usefully serialisable thing

pal: I would also really use rational numbers for time.

Yash: Thanks, we already have someone who filed an issue around that, we'll look into it.

nigel: The playlist itself is something that could be easily serialized/deserialized as well. Imagine that you're doing an editing session and want to apply it to other media files.

<nigel> Why does it not make sense just to create playlist? #13
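
To make the playlist idea more tangible, here is a minimal sketch of a serializable edit list. The shape (`Playlist`, `PlaylistItem`, a shared `timescale` in ticks per second) and the file names are hypothetical, chosen only for illustration; they are not taken from the explainer or from any existing playlist format.

```typescript
// Hypothetical playlist shape, for illustration only.
interface PlaylistItem {
  source: string;  // identifier of the media to read from
  inTick: number;  // first tick to play (inclusive)
  outTick: number; // tick at which to stop (exclusive)
}
interface Playlist {
  timescale: number;        // ticks per second, shared by every item
  tracks: PlaylistItem[][]; // each track is an ordered list of items
}

function trackDurationTicks(track: PlaylistItem[]): number {
  return track.reduce((sum, item) => sum + (item.outTick - item.inTick), 0);
}

// Apply the same edit decisions to different media by swapping the source.
function retarget(list: Playlist, from: string, to: string): Playlist {
  return {
    timescale: list.timescale,
    tracks: list.tracks.map(track =>
      track.map(item => (item.source === from ? { ...item, source: to } : item))),
  };
}

const session: Playlist = {
  timescale: 30000,
  tracks: [[
    { source: "take1.mp4", inTick: 0, outTick: 300300 },    // 300 frames of 1001 ticks each
    { source: "take2.mp4", inTick: 30030, outTick: 90090 }, // 60 frames
  ]],
};

const saved = JSON.stringify(session);        // serialize the edit decisions
const restored: Playlist = JSON.parse(saved); // load them back later
const reused = retarget(restored, "take1.mp4", "take1-regraded.mp4");
console.log(trackDurationTicks(reused.tracks[0]) / reused.timescale); // 12.012 (seconds)
```

Because the playlist holds only references and integer tick ranges, it can be stored, sent to a server, or replayed against other files without touching the media itself.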

cpn: Presumably, we can raise issues against the GitHub repo to provide feedback.

Johnsim: For media containers, HEIF was scribed (probably incorrectly, bad scribe!). Are image formats supported?

Yash: We're not supporting any image format for now.

Johnsim: So that was a mistake, ok.

BarbH: Besides Chromium, is this something that is also being socialized with other browser vendors, e.g. Apple?

Yash: We have reached out to Apple, not yet with Firefox.

BarbH: Have you gone to the Media WG? The reason I'm suggesting it is that Apple actively participates there, and could give feedback there.

cpn: For eventual standardization, after incubation, that would probably be the appropriate place to take this. I reached out to Apple for today. Looking at early proposals is what the IG does, so that's the right place to get feedback!

BarbH: Have you also reached out to the tools vendors that are doing video editing today? Adobe is very active in the W3C and they have video editing tools for instance.
… They're not the only ones.

Yash: We have not reached out to video editing tools, but this is something that we will start looking into. Engaging with other browser vendors as well.

pal: Part of what we're doing today is bringing awareness about this proposal to a wider media community.

cpn: Right and Media Production is one topic that the IG is currently exploring.
… Connections with Adobe would be good.

BarbH: I'll see what I can do.

takio: I filed issue #13 last night. I'd like to confirm that issue.

<xfq> https://github.com/WICG/video-editing/issues/13

takio: Quick question: Sometimes, operations need transcoding. Sometimes, they don't, right?

Yash: Right, for trim, we will do possibly partial transcoding to re-create an I-Frame for instance.

takio: The user cannot distinguish whether transcoding happens or not, this is perhaps problematic.
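
A rough sketch of why a trim may or may not need transcoding, depending on where keyframes fall. This is a generic illustration, not the behavior of any implementation discussed on the call. It assumes frame indices in presentation order, a sorted keyframe list, closed GOPs, and no frame reordering; `planTrim` and the keyframe positions are hypothetical.

```typescript
// Generic illustration only; assumes sorted keyframes, closed GOPs and no frame reordering.
interface TrimPlan {
  reencode: [number, number] | null; // [first, end) frames that must be decoded and re-encoded
  copy: [number, number] | null;     // [first, end) frames whose compressed data can be copied
}

function planTrim(keyframes: number[], start: number, end: number): TrimPlan {
  if (start >= end) throw new RangeError("empty trim range");
  // First keyframe at or after the requested start that still lies inside the range.
  const nextKey = keyframes.find(k => k >= start && k < end);
  if (nextKey === undefined) return { reencode: [start, end], copy: null };
  if (nextKey === start) return { reencode: null, copy: [start, end] };
  return { reencode: [start, nextKey], copy: [nextKey, end] };
}

const keyframes = [0, 60, 120, 180]; // hypothetical: one keyframe every 60 frames
console.log(planTrim(keyframes, 60, 150));  // starts on a keyframe: copy only
console.log(planTrim(keyframes, 75, 150));  // re-encode frames 75..119, copy 120..149
console.log(planTrim(keyframes, 130, 170)); // no keyframe inside the range: re-encode everything
```

Exposing such a plan, or at least whether `reencode` is non-empty, is one way an application could tell in advance whether an edit will be cheap or expensive.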

<Igarashi> asynchronous API using promises should be considered

Yash: Thank you, I think we got some feedback here on "finalize".
… When transcoding takes place, things may take time, so we'll add some abort handler.

cpn: Progress events would be nice too.
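
The suggestions above (a promise-based call, an abort mechanism, progress reporting) could fit together roughly as follows. `finalizeSketch` and its options are hypothetical, not a signature from the explainer; only `AbortController` and `AbortSignal` are existing platform APIs.

```typescript
// Hypothetical signature for illustration; not taken from the explainer.
interface FinalizeOptions {
  signal?: AbortSignal;                    // lets the caller cancel a long transcode
  onProgress?: (fraction: number) => void; // called after each unit of work
}

async function finalizeSketch(totalChunks: number, options: FinalizeOptions = {}): Promise<number> {
  for (let done = 0; done < totalChunks; done++) {
    if (options.signal?.aborted) throw new Error("finalize aborted");
    await Promise.resolve(); // stand-in for transcoding one chunk without blocking the caller
    options.onProgress?.((done + 1) / totalChunks);
  }
  return totalChunks; // a real API would resolve with the edited media instead
}

const controller = new AbortController();
const progressLog: number[] = [];
const pending = finalizeSketch(4, {
  signal: controller.signal,
  onProgress: fraction => {
    progressLog.push(fraction);
    if (fraction >= 0.5) controller.abort(); // e.g. the user pressed "cancel" halfway through
  },
});
pending.then(
  () => console.log("finished"),
  error => console.log((error as Error).message, progressLog),
);
```

Here the promise rejects once the abort is observed after two of the four chunks, and the progress callback has reported 0.25 and 0.5 by then.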

<nigel> the need to transcode depends on the input formats as well as the output formats: concat([type 1 blob, type 2 blob]) must transcode at least one. What format do you get at the output?

kaz: TV broadcasters might have some potential opinion from their viewpoint. They may be using hardware-based editors which might have powerful capabilities. They may have opinions on this software-based approach.

<Igarashi> proxy editing would be useful for professional editing

cpn: Right, typically with cloud-based editing, we'll send time points on which to run operations on the server, with the client seeing a "proxy" representation.
… I see the main use case being user generated content (YouTube et al.).

<kaz> kaz: yeah, potential gap between "professional use" and "customer use", etc.

<Igarashi> MediaBlob editing would be useful for proxy editing

pal: In terms of editing, there are surprisingly few differences between professional and "casual" editing: codecs perhaps, and some special effects.
… In terms of playlist formats, there are a number. One is OTIO, originally from Pixar but now open source.

<takio> I prefer a playlist. If we do so, we can choose client-side or server-side transcoding.

pal: All of these playlist models look different, but they're all the same at the end of the day.
… Main differences are support for transitions (for audio and video).
… Other than that, they're all the same. That's also why I'm encouraging exploring a playlist approach.

cpn: Clearly, there's a lot of interest in this API and also in WebCodecs. If people are interested, maybe we can ask some of the editors behind WebCodecs to have a closer look at the API to see where they are and how it maps to some of the use cases that we have.
… It seems we have some useful feedback to provide.

pal: What are your next steps?
… Where would you want the work to happen?

cpn: Are you looking for more input from media companies?

Yash: Immediate next step is figuring out the story with WebCodecs. Google's perspective seems that MediaBlob only provides demuxing/muxing, with the rest being covered by WebCodecs.

<takio> Even if editing into the same codec, it will need to be re-encoded if the GOP is too long...

cpn: Interesting, lots of open questions. Here, we're happy to host these kinds of conversations. My suggestion would be that we continue this discussion certainly among media companies. You're very welcome to be part of it.

pal: Yash, would you like us to try to set up a meeting with the right folks?

Yash: I think that would be good.

Harneet: I think we should start by evaluating the classes of scenarios that we're trying to enable.
… I don't think that the goal is to replicate everything that Adobe can do in editing tools, for instance.
… It's more to enable common basic needs.

<Igarashi> proxy editing should be added to the use cases of media blob

Harneet: As a next step, we need to evaluate the pros and cons of MediaBlob vs. WebCodecs.
… I think there is room for a simple API that provides basic video editing capabilities on the Web.
… Is there interest in this level of media editing? Or is interest more on high-end video editing capabilities?
… Is there interest from the industry in concat, trim, split?
… We don't envision MediaBlob being used in professional media editing, which is too complex for the API.

pal: I'd like to dispel that myth. I believe they can be supported with very minimal changes. The challenge in front of us is can we go a little beyond to cover much broader use cases?

<nigel> +1

Harneet: We'd love to hear about that.
… We'd like not to require deep media knowledge to be able to use the API.

pal: But that's a good requirement also for professional applications.

<Igarashi> proxy editing does not require professional codecs

Harneet: Do you have a discussion area? We'd love to get all the feedback through issues on the GitHub repo.
… If you can file issues on this repo, that would be great.

cpn: Very happy to do that.

[Side discussion on sharing slides]

kaz: We have many stakeholders from media industries, which can help clarify use cases, requirements and gaps. From time to time, we can revisit this discussion.

Media Integration guidelines

cpn: We're out of time, but note we started some work on guidelines for integration of media APIs on devices with limited hardware capabilities.

JohnRiv: I migrated issues from the original CTA WAVE discussions. I haven't integrated comments from Dan yet.

cpn: OK, so this is a call for participation to get involved in this work!

Bullet Chatting

Huaqi: The CG completed the use cases and gap analysis.
… We discussed two rendering approaches.
… The CSS approach seems to be preferred by vendors, so we're exploring that more thoroughly.

cpn: Thank you.

<JohnRiv> FYI, here is the Media Integration Guidelines repo: https://github.com/w3c/me-media-integration-guidelines

Minutes manually created (not a transcript), formatted by scribe.perl version 117 (Tue Apr 28 12:46:31 2020 UTC).