W3C

Media WG

08 November 2022

Attendees

Present
Chris Needham, Eric Carlson, Francois Daoust, Frank Liberato, Greg Freedman, Jean-Yves Avenard, Jer Noble, Joey Parrish, Karl Tomlinson, Mark Watson, Matt Wolenetz, Sushanth Rajasankar
Regrets
-
Chair
Chris Needham
Scribe
cpn, tidoust

Meeting minutes

Agenda bashing

cpn: Main topic is MSE Interoperability/implementation conformance issues, including perhaps reflections on changes that needed to be brought to MSE in worker
… Also topic on media capabilities towards the end of the call

MSE Interoperability/implementation conformance issues

Gapless audio support feature detection

<ghurlbot> Issue 37 Support sample accurate audio splicing using timestampOffset/appendWindowStart/appendWindowEnd (Melatonin64) feature request, needs author input, TPAC-2022-discussion

Matt_Wolenetz: When the coded frames of two audio fragments overlap in presentation time, I believe the spec describes a cross-fade. Chrome used to have that, but the widespread proliferation of incorrect timing and codec formats meant that we now truncate.
… We couldn't perceive any benefit with cross-fade.
… Meanwhile, the Chromium implementation allows frame-accurate splicing.
… In order to do that, you must not drop everything that was decoded from both audio frames.
… And perform trimming.
… Issue is whether we can standardize this feature: frame-accurate audio frame post-decoding splicing
… And also can we add feature detection for whether this is supported by the browser
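
[Illustrative sketch, not presented in the meeting: one way the post-decode trimming described above could look. `DecodedAudioFrame` and `trimToWindow` are hypothetical names, not part of MSE or any browser; the window arguments merely mirror the role of appendWindowStart/appendWindowEnd. Mono PCM and floating-point seconds are assumed for brevity.]

```typescript
// Hypothetical type: a decoded block of mono PCM with its presentation start.
interface DecodedAudioFrame {
  startTime: number;     // presentation time of the first sample, in seconds
  sampleRate: number;    // samples per second
  samples: Float32Array; // decoded PCM samples
}

// Keep only the samples whose presentation time falls inside
// [windowStart, windowEnd). Returns null when nothing survives.
// Note: real implementations would likely use integer timestamps
// to avoid floating-point rounding at the window edges.
function trimToWindow(
  frame: DecodedAudioFrame,
  windowStart: number,
  windowEnd: number
): DecodedAudioFrame | null {
  const { startTime, sampleRate, samples } = frame;
  // Convert window edges to sample indices relative to the frame start.
  const first = Math.max(0, Math.ceil((windowStart - startTime) * sampleRate));
  const last = Math.min(
    samples.length,
    Math.floor((windowEnd - startTime) * sampleRate)
  );
  if (first >= last) {
    return null; // the frame lies entirely outside the window
  }
  return {
    startTime: startTime + first / sampleRate,
    sampleRate,
    samples: samples.subarray(first, last),
  };
}
```

[The point of the sketch is that trimming happens on decoded samples, so the cut can land inside a codec frame rather than on a frame boundary.]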

Jean-Yves: In Firefox, the data will be trimmed once decoded. More accuracy than the 1048 sample.

Matt_Wolenetz: You need to feed more than just the frame so that at the splice point, this is stable.
… Yes, that is what I'm talking about.

Jean-Yves: Multiple encodings use different flags. You end up with conflicts, e.g. with ADTS.
… Is it something that will be within MSE, like a message to send, or contained within the binary format that we append.

Matt_Wolenetz: In terms of specs, yes. In terms of the binary format, the decoded sequence needs to be appended in a well-formed sequence by the app, so that the splicing point gets clearly set.
… The app needs to add enough of the B frames so that the splicing can happen smoothly.

Jean-Yves: The way MSE is designed is that it's always dealing with compressed frames rather than decoded frames.

Matt_Wolenetz: We check in Chromium append calls for positions where splicing might happen, especially for audio.
… And we keep track of them.
… We decode across the splice point faster than real-time so that we can adjust things. That enables the frame-accurate gapless audio scenario.
… It's been in Chromium for a long time.
… We haven't had the cross-fade feature in Chromium as a result.
… [more technical details not captured by scribe]
… Not all implementations are required to have faster-than-real-time decoding capabilities, which may make it a concern for some devices and platforms.

jernoble: Back to cross-fade, I don't think webkit does it either. Firefox?

Jean-Yves: I don't think there is cross-fading either.

Matt_Wolenetz: The spec says: if you don't do cross-fade, you need to drop. We don't do either, since we do frame-accurate splicing.

Jean-Yves: I have only seen frame-accurate splicing in demo apps, not real ones.

Matt_Wolenetz: There used to be a Google gapless music app.

jernoble: WebKit will mark windows and overlaps, with samples to throw away. That has led to some discontinuities.
… I wonder if there's something that we can do in the case where the append is smaller than the codec window.
… A microsecond is a gap. You'll hear a click when that happens.
… We have gotten feedback from the hardware team as some of the recent hardware is pretty sensitive about that.
… That's what the cross-fade was intended to address.
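
[Illustrative sketch, not presented in the meeting: a linear cross-fade over the overlapping region of two decoded audio blocks, which is the general idea behind smoothing a splice to avoid a click. `crossFade` is a hypothetical helper; the fade shape and the assumption of mono PCM at the same sample rate are simplifications, not what the spec or any browser mandates.]

```typescript
// Mix the overlapping samples of two decoded blocks.
// `outgoing` fades from full volume to silence while `incoming`
// fades from silence to full volume across the overlap.
function crossFade(outgoing: Float32Array, incoming: Float32Array): Float32Array {
  const n = Math.min(outgoing.length, incoming.length);
  const mixed = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    // Gain ramps linearly from 0 to 1 across the overlap.
    const gain = n > 1 ? i / (n - 1) : 1;
    mixed[i] = outgoing[i] * (1 - gain) + incoming[i] * gain;
  }
  return mixed;
}
```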

Matt_Wolenetz: You can have both frame-accurate and cross-fade.

karlt: Normative text says you need to drop the whole block and then in a non-normative note it says you can do something else.
… It seems that all of the implementations keep blocks, perhaps that should be what the normative part of the spec says.

Matt_Wolenetz: I agree.
… We don't know whether TV devices keep or drop the frames.

Jean-Yves: No implementation ever stops on gaps.

cpn: So we've got consistent behavior. Question about TV devices. Is that something that we should take up to TV people?
… That would provide input on the feature detection part of the feature.

Jean-Yves: Should we consider that feature detection as part of media capabilities instead of adding a mechanism in MSE that currently does not exist?
… Media Capabilities has a way to test if MSE supports a particular system.
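
[Illustrative sketch, not presented in the meeting: the shape of a Media Capabilities query scoped to MSE, for context on the suggestion above. `buildMseAudioQuery` is a hypothetical helper. The configuration fields shown exist in Media Capabilities today; no gapless or splicing field exists, and none is implied here.]

```typescript
// Shape of the audio part of a Media Capabilities decoding query.
interface MseAudioQuery {
  type: "media-source";
  audio: {
    contentType: string;
    channels?: string;
    bitrate?: number;
    samplerate?: number;
  };
}

// Build a decoding query that asks about MSE playback specifically.
function buildMseAudioQuery(contentType: string, samplerate: number): MseAudioQuery {
  return {
    type: "media-source",
    audio: { contentType, samplerate },
  };
}

// In a browser, the result could be passed to the real API, e.g.:
//   const info = await navigator.mediaCapabilities.decodingInfo(query);
//   // info.supported, info.smooth, info.powerEfficient
```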

Matt_Wolenetz: Question on software decoder. Media Capabilities covers more than MSE. If MSE can answer the question for itself, it seems preferable to do it in MSE.
… The whole collection of issues that we're discussing are priority 2. That's because they're hard to test, hard to find the right behavior, etc.

jernoble: If the use case is just "gapless labeling", that doesn't seem like a super compelling use case to me.

Mark_Watson: Gapless audio splicing is also useful in a number of other scenarios.
… What is happening today is not ideal, but it's better to leave it at what it is in order not to introduce differences of behavior.

jernoble: Seems to match what Will Law has been asking for some time.
… Making things observable without changing the behavior.

Mark_Watson: It depends on whether we're talking about exposing more information about what browsers are already doing today. That is useful, even though I somehow already know it, because it's better to know whether that behavior is guaranteed.
… Or whether we're talking about exposing information about a behavior that changed in browsers.

Matt_Wolenetz: My assumption was that Chromium was the only browser doing frame-accurate splicing, but that seems wrong. So feature detection was meant to detect new browsers supporting the feature, but that may not be needed anymore.
… For lots of splicing scenarios, it depends on the codec and within the codecs whether you're at the start or in the middle of a frame.
… It doesn't seem that it would be a breaking change if an implementation starts doing this.
… but it might be useful to tell the app that a browser supports this feature upfront.
… If there are multiple behaviors allowed by the spec, it's probably better to have a mechanism to detect the feature.

cpn: It sounds like next step is around testing.

Matt_Wolenetz: Yes, we seem to all be saying that we're doing some form of frame-accurate splicing.

Relax timing constraint of initial HAVE_METADATA and later HAVE_CURRENT_DATA

<ghurlbot> Issue 275 Consider relaxing timing of initial HAVE_METADATA transitioning (wolenetz) agenda, TPAC-2022-discussion

<ghurlbot> Issue 215 Spec is too rigid on requiring initial HAVE_CURRENT_DATA transition occur synchronously within coded frame processing (possibly ditto for HAVE_METADATA and init segment received processing) (wolenetz) interoperability, TPAC-2022-discussion

Matt_Wolenetz: In Chromium, we have a separate thread that holds decoding, buffering and so on.
… To not block the main thread for some actions, we don't block the update and delivery of event scheduling while we wait for transitioning statuses.
… In the segment parser loop, I think, you're supposed to wait until HAVE_METADATA. We don't do that in Chrome. I'd like to relax the constraints in order not to do undue blocking of the main thread.

jernoble: That seems reasonable to me.
… Especially for workers.

cpn: Last time we discussed, there was a question about gathering information on what different implementations actually do. Do we need that? It seems to me that if we're just relaxing the constraints, we can go ahead.

Matt_Wolenetz: Question is whether it will break applications if more implementations do what Chrome already does.
… You may get duration information faster than ready information. That can surprise apps, but then Chrome has been doing that forever.
… There have been some timing hiccups in the tests in WPT.
… It sounds like it may be something to propose in a PR.
… If you can see any kind of regression that this may create, please raise it.
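
[Illustrative sketch, not presented in the meeting: how an app could avoid depending on the exact timing of the HAVE_METADATA transition by checking readyState and otherwise waiting for the `loadedmetadata` event. `MediaElementLike` and `whenMetadataReady` are hypothetical names; the interface is a minimal stand-in for HTMLMediaElement so the sketch runs outside a browser.]

```typescript
// Same numeric value as HTMLMediaElement.HAVE_METADATA.
const HAVE_METADATA = 1;

// Minimal stand-in for the parts of HTMLMediaElement used here.
interface MediaElementLike {
  readyState: number;
  addEventListener(
    type: string,
    listener: () => void,
    options?: { once?: boolean }
  ): void;
}

// Run `callback` once metadata is available, whether the transition
// has already happened or happens later.
function whenMetadataReady(media: MediaElementLike, callback: () => void): void {
  if (media.readyState >= HAVE_METADATA) {
    callback(); // transition already happened
    return;
  }
  // Otherwise wait for the event fired when the element reaches HAVE_METADATA.
  media.addEventListener("loadedmetadata", callback, { once: true });
}
```

[An app written this way does not care whether the transition occurs synchronously within append processing or slightly later.]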

cpn: Checking in with some of the major player libraries might be useful.
… So next step on this is a PR.

API for interoperable gap playback/tolerance

<ghurlbot> Issue 160 Support playback through unbuffered ranges, and allow app to provide buffered gap tolerance (davemevans) feature request, TPAC-2022-discussion

Matt_Wolenetz: My time on MSE is limited right now.
… As part of the HLS native implementation in Chrome that we're building on top of MSE concepts, we need to implement an internal API for interoperable gap playback/tolerance.
… The problem right now is that we have some stalls happening based on various tolerance settings.
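
[Illustrative sketch, not presented in the meeting: an app-side workaround for small buffered gaps, in which the app seeks past a gap that is within a chosen tolerance. This is not the API proposed in issue 160. `gapJumpTarget` is a hypothetical helper, and the ranges are assumed to be sorted and non-overlapping, as the `buffered` attribute reports them.]

```typescript
type BufferedRange = [number, number]; // [start, end] in seconds

// If the playhead sits in a gap no wider than `tolerance` before the
// next buffered range, return the time to seek to; otherwise null.
function gapJumpTarget(
  buffered: BufferedRange[],
  currentTime: number,
  tolerance: number
): number | null {
  for (const [start, end] of buffered) {
    if (currentTime >= start && currentTime < end) {
      return null; // playhead is inside buffered data, nothing to do
    }
    if (start > currentTime) {
      // First range after the playhead: jump only if the gap is small.
      return start - currentTime <= tolerance ? start : null;
    }
  }
  return null; // no buffered data after the playhead
}
```

[For example, with ranges [0, 10] and [10.05, 20], a playhead stalled at 10 and a tolerance of 0.1 yields a jump target of 10.05.]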

jernoble: Some of the issues around gaps should be handled by modifying the output timeline, which is something we do for small gaps.

Matt_Wolenetz: Now there's also out of order audio codecs.
… This API came in discussions recently at FOMS, which I was unable to attend.

jernoble: Every time people attempt to implement HLS in MSE, we discover new features that MSE is missing, so interested in your exploration.

Reflections on MSE in Worker and WG review process

Matt_Wolenetz: The issue was encountered at a late phase, based on comments from Mozilla:
… the transition from MediaSource Handle to a property.
… I made the assumption that [missed] could be retained on the property. That was not true. We ended up with a regression.
… What would be good is more thorough reviews.
… The more eyes on such features, the better.
… No one can claim to be an expert on the whole web platform.
… Internally, we've been reflecting on binding generators that could warn us when there are problems.

cpn: I see there's a TAG issue that you raised about this

TAG design principle issue

Matt_Wolenetz: That's about it for MSE. I will be focusing on the HLS implementation in MSE.
… Getting pre-emptive media source in the meantime would be great to unblock MSE in iOS.

cpn: What would you like to do about remaining issues?

Matt_Wolenetz: It depends on how much incoming feedback and comments we get on these.

cpn: OK, we'll check with you before next call.

Media Capabilities editors

cpn: Mounir and Chris moved on to other things. So we need new editors.
… I think Vi from Microsoft is the only one still around.

Jean-Yves: I can help with some of that.

cpn: Thanks for the offer. I'll send the call around to see if anyone is willing to help you.

Minutes manually created (not a transcript), formatted by scribe.perl version 196 (Thu Oct 27 17:06:44 2022 UTC).