Meeting minutes
MSE Dynamic addition / removal of source buffers
<cpn> w3c/
<kaz> FYI, related discussion during the MEIG call on Jan 7
cpn: Daniel responded to some of the comments in the reply
cpn: The use case you've given is different DASH periods, where
… there's video with no audio, then back to video and audio, or similar
Daniel: Yes, that would be the scenario
cpn: And the workaround you're using is, if there's no audio, that you continue
… to fill the audio source buffer with dummy data, silence presumably,
… then you can maintain the continuous throughput so that playback doesn't stall due to
… an underflowing buffer?
Daniel: Yes that would be more elegant than what we have today,
… where we tear the whole thing down and set it up again for each period.
… I agree dummy data would be more elegant but I don't know if anyone does that in their players today.
… You would need dummy data or reuse part of the previous segment
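The silence-filling workaround discussed above could be sketched roughly as follows — a hypothetical illustration only; `makeSilentSegment` and the 2 s segment duration are assumptions, not anything from the discussion:

```javascript
// Pure helper: timestamps of the dummy segments needed to cover a
// period [gapStart, gapEnd) that has no real audio.
function silentSegmentTimes(gapStart, gapEnd, segmentDuration) {
  const times = [];
  for (let t = gapStart; t < gapEnd; t += segmentDuration) {
    times.push(t);
  }
  return times;
}

// In a player, each timestamp would become an appendBuffer call, where
// makeSilentSegment (hypothetical) produces a fragmented MP4 of encoded
// silence:
//
//   for (const t of silentSegmentTimes(periodStart, periodEnd, 2)) {
//     audioSourceBuffer.timestampOffset = t;
//     audioSourceBuffer.appendBuffer(makeSilentSegment(2));
//   }
```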
cpn: Two suggestions.
… 1. Use different players with different source buffer sets and switch the video player in the page
Daniel: There may be performance problems when hiding and showing players.
… Not sure about ad blocker behaviour either.
… If you're in full screen and do the switch, is it possible to continue in full screen?
… The timing of the switch might not be completely accurate.
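For reference, the two-player approach might look like the sketch below — a hedged illustration; the helper name and the pre-seek strategy are assumptions:

```javascript
// Sketch of the two-player idea: two <video> elements, each with its
// own MediaSource, toggled at a period boundary.
function swapPlayers(active, standby, switchTime) {
  // Pre-seek the standby player so decoding has started before it
  // becomes visible, narrowing the visible gap at the switch.
  standby.currentTime = switchTime;
  standby.play();
  standby.hidden = false;
  active.pause();
  active.hidden = true;
}
```

As Daniel notes above, the actual switch timing in a real page may still not be frame-accurate.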
cpn: Wonder if the switch timing is a problem when switching media sources?
Daniel: Yes that might also be a point. Would need to check how accurate it would be.
… Maybe there's a way to request removal of a buffer when empty.
… Also what about playback through unbuffered ranges.
alicia: In GStreamer we have a solution for this called gap events.
… Each track has a serialised stream of buffers, which are frames and events
… Events can be "change the coding parameters for the upcoming part of the video"
alicia: Or Gap Events where you are not going to get frames for x seconds but don't consider it an underrun
… Explicitly say that there's no data for a period. And they're format agnostic.
… Could try to do that in MSE if it works for the use cases.
… They're just like frames. Instead of AppendBuffer could call AppendGap, and specify a time
… period in the stream where there is intentionally no data.
… Would be the correct time and duration as specified.
Daniel: Sounds a bit like Chris's proposal?
cpn: Possibly [pulls up #160]
… We're interested in keeping as close to the live edge as possible and having a mode where that's
… more important than continuous live playback, so if there's a drop out skip ahead to the most recent
… playback position, to avoid buffering and delay, important for live sports for example.
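The skip-ahead behaviour #160 asks for can be approximated from script today — a minimal sketch, assuming the `waiting` event fires on the stall and that the app wants the latest seekable position:

```javascript
// On a stall, jump to the end of the latest seekable range instead of
// waiting for the buffer to refill. Sketch only; a real player would
// also guard against seek loops and apply a small back-off.
function jumpToLiveEdge(video) {
  const s = video.seekable;
  if (s.length > 0) {
    video.currentTime = s.end(s.length - 1);
  }
}

// Wiring in a page (illustrative):
//   video.addEventListener('waiting', () => jumpToLiveEdge(video));
```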
Daniel: Dave was contributing to dash.js at the time - 2016 was a long time ago!
cpn: Figuring out where this issue got to...
… In the WG this hasn't been progressed particularly. It was discussed in FOMS 2023 for example
Daniel: I think there was a hacky way to skip the gaps and it didn't work in all browsers.
cpn: We have to be a bit careful in this group - we can't design solutions because we don't have IPR coverage.
… We can identify issues with existing APIs and be very clear about the requirements we have.
… Solution design needs to happen in the MediaWG.
… We could put this onto the Media WG agenda, then we would have the browser implementers
… in the conversation as well.
Daniel: Would be good from my point of view too.
… Also want feedback from other player makers, Shaka Player, HLS.js etc
… Would like to know if these things I'm stumbling upon also affect others.
… Want feedback from implementers.
cpn: We know e.g. Joey Parrish from Shaka Player. And does Gary Katsevman still work on players?
… The other place is the video-dev Slack group, which can be a good place to get this kind of feedback.
… There's a good community of app and player developers there.
… Common use case where there's a stall and you want to get to latest playhead position.
… Though this suggests there's sparse data so you want to resume playback and then
… restart buffering later.
alicia: I do think the two use cases are separate though
… I've never been super-convinced about low latency MSE, it always felt like a job for WebRTC.
cpn: You would need the underlying media playout to be aware it was in that mode so it could
… take over without relying on the JavaScript doing something, perhaps.
… If we go back to #359, more concretely what are our next steps?
… I think its worth replying to explain the limitations of having multiple video elements.
… And also describe the latency and timing considerations and our goals, i.e. maintaining continuous
… throughput and playback, minimising gaps when there is a switchover.
… We think the switching of a media source would have the same timing issues as switching the media
… element itself.
… The full screen thing is interesting because if the video element itself is full screen then
… I can see how you would potentially have a visible impact when switching.
… But can there be a full screen container element that has video as a child, to make it smoother?
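The fullscreen-container idea can be illustrated as below — the `#player` markup and helper name are assumptions; the standard call is `Element.requestFullscreen()`, which returns a Promise and generally requires a user gesture:

```javascript
// Make a wrapper element fullscreen rather than the <video> itself,
// so swapping video children does not exit fullscreen.
function enterPlayerFullscreen(container) {
  // Prefer the standard API; fall back to the WebKit-prefixed variant.
  const request = container.requestFullscreen || container.webkitRequestFullscreen;
  if (!request) {
    return Promise.reject(new Error('Fullscreen API unavailable'));
  }
  return request.call(container);
}

// Usage (assumed markup):
//   enterPlayerFullscreen(document.getElementById('player'));
```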
alicia: Additional complication. What if the gap is at the beginning, so we have
… a stream that starts without video. How would we handle that? Would it be empty? Maybe.
… But I guess you could still use the initial resolution of the video - that's already a thing in MSE.
… When you sent the first init segment I think that sets the resolution - may be misremembering.
Daniel: I think there are always interesting edge cases.
… I just asked in the video-dev channel.
… I'm with you that the solution, and for #160, may be just to tell the buffer that it's okay to be without
… content for a while, but then you have to handle no video at the beginning.
alicia: Or you could signal "now we're in ads".
cpn: Would the web app want control over the holding image?
alicia: I was suggesting it could be the last image shown.
… Simplest solution, then can be customised by sending the last image you want to show.
cpn: I was wondering if configuration to select "last good image" or "black screen" would be needed.
alicia: I think it's better to have an explicit gap API than just a flag of ignore underruns.
… I think it would backfire if people aren't sure how they should handle underruns.
… For audio-free ads, it needs to be clear that we are not going to have audio for some time.
… For the live stream maybe the video element needs a way to set a policy of how to handle
… underruns.
cpn: Good point you might want different behaviour
alicia: Not only MSE, could be a problem in low latency even if not using MSE,
… so could be something for the video element itself.
Daniel: The Media over QUIC group was doing something similar,
… dropping video data and pausing video while audio continues, until video data comes.
… Showing last frame of video.
alicia: What if the JavaScript code to handle the underruns cannot run because some other intensive
… thing is happening. What would that do to timing?
cpn: The videos I've seen are all over web transport with canvas-based rendering.
Daniel: The use case of keeping the last frame would not be possible - they would face the same
… limitation with an MSE-based implementation that we are discussing here right now.
… If we could handle gaps by playing over them then we wouldn't need to remove and add source buffers.
… Or have an API that allows us to append dummy data to fill the buffer like virtual data or whatever you want to call it.
alicia: Could be as simple as a gap with a dictionary saying the start and end timestamp and timescale.
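A sketch of what that dictionary might look like — entirely hypothetical names (`appendGap` is not in any spec); timestamps are integer ticks at `timescale` ticks per second, as in ISO BMFF:

```javascript
// Convert a hypothetical gap dictionary into seconds, for illustration.
function gapSeconds(gap) {
  return {
    start: gap.start / gap.timescale,
    end: gap.end / gap.timescale,
  };
}

// Hypothetical usage, declaring an intentional 5 s hole from t=10 s:
//   sourceBuffer.appendGap({ start: 900000, end: 1350000, timescale: 90000 });
```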
<kaz> media-source issue 360 - Multiple attachable/detachable SourceBuffers
cpn: How does this differ from #360, multiple source buffers?
Daniel: With two source buffers you could pre-buffer the second buffer and then at the segment boundary
… signal the switch so the player can pick up the already filled source buffer.
alicia: You can already do this, right?
Daniel: You could do it with a virtual buffer, yes.
piers: A bit like the two player approach.
alicia: Assume that we are talking about milliseconds if not perfectly aligned?
piers: It's comparable to the two player approach, just taken down one level, having two
… source buffers instead of two players, and then switching between them.
Daniel: Currently you can only create one SourceBuffer per type, right?
Alicia: There's a VideoTrack API, and according to the spec you can switch tracks, in theory
… If that's the case, the problem is in the implementation
Daniel: https://
… In theory, for this use case we'd need 2 SourceBuffers per track, so 4 overall
Alicia: Not sure how well tested that is
Daniel: When I tried it, it said only one is allowed
Alicia: The Track API doesn't have good cross browser support, WebKit has some support, but experimental at the time in other browsers
Daniel: The spec does mention it as a quality of implementation issue
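The track-switching path under discussion might look like the sketch below — feature checks matter here, since many browsers throw on a second video `addSourceBuffer` call or don't expose `videoTracks` at all:

```javascript
// Select one track in a VideoTrackList-like list. Per the HTML spec,
// video track selection is exclusive; shown explicitly here.
function selectVideoTrack(videoTracks, index) {
  for (let i = 0; i < videoTracks.length; i++) {
    videoTracks[i].selected = (i === index);
  }
}

// In a page (illustrative, with feature checks):
//   if (video.videoTracks && video.videoTracks.length > 1) {
//     selectVideoTrack(video.videoTracks, 1);
//   }
```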
Nigel: Is the result of this conversation, that the requirement should be "must support at least two configurations" so you can switch between them, for both audio and video?
Nigel: Or put it differently, the requirement is for smooth transitions between configurations that may have different numbers of sources
Daniel: I think the latter
… Support one per type, but if you want to guarantee smooth transitions, e.g., for ad insertion, then it's beneficial to support two SourceBuffers per type
Alicia: It mentions the Tracks API in 3.3 activeSourceBuffers
<kaz> 3.3 activeSourceBuffers attribute
Alicia: so already covered by the spec
… Could say we have several tracks and want to switch between them. Seems already covered by activeSourceBuffers
… The spec allows playing multiple audio tracks but only one video track
Daniel: I agree it's implicitly there, but to select it you'd have to have multiple tracks
Alicia: Which you do when you have multiple SourceBuffers
Daniel: Could be good to point to this in the GitHub issue
… Do you support multiple SourceBuffers in WebKit?
Alicia: We do
Piers: It doesn't imply you can switch off the video though?
Alicia: Three features: 1) Signalling gaps in either a track or a SourceBuffer, 2) a way to tell the video element that you care more about being on time than not having underruns, and 3) Implementation of being able to switch between video tracks in an MSE context
Chris: Next steps: document requirements, limitations of multiple media elements, and timing issues with switching MediaSources
… Also gathering input from other video players
… Then bring it all to the working group for discussion
Multiple attachable/detachable SourceBuffers
Chris: I'm confused about the overlapping segments in #360
<kaz> media-source issue 360 - Multiple attachable/detachable SourceBuffers
Daniel: I think it means you play segment 1 to the end, then play parts of segment 2, when segment 1 ends you play segment 2.
Nigel: For ad insertion, you want to cut away from segment 1, then play segment 2 at the time of the ad, but the limitation is you can't replace segments or overwrite
Daniel: In this case you can replace the parts of segment 1 that you don't want to play
Alicia: Often there are timing deviations between segments, e.g., due to edit lists
Daniel: We should clarify the diagram. Multiple IDR frames, where everything is lost to the next IDR
Piers: And what kind of media encoding change you have. This could be trying to cope with a situation where the content can't be conditioned (adjusted in length so the segments all fit together).
DataCue
Rob: I'm looking at geo-tagged video, interested in timed metadata
… We proposed DataCue in WICG, which has stalled
… I've been doing some research to try to move it forward
… I wrote a polyfill that works in all browsers
… I looked into the history, DataCue was in HTML and then removed
… The issue is there isn't an exposed TextTrackCue constructor
… The original design had DataCue and VTTCue as derived classes, which have constructors
… If we could expose the TextTrackCue constructor, it would allow developers to create cues of their own, e.g., DataCue, or other varieties
… So can be a way to progress this, and allow experimentation with different cue types
… VTTCue as a specific cue for timed text and DataCue as a generic cue for timed data
… It would work as an exposed constructor. But the only constructor today is VTTCue
… And it doesn't have the vital thing for metadata, which is a type field, so you could have a pub-sub style interface for particular cue types
Nigel: I think it makes sense for VTTCue to be concrete, or for DataCue to have a constructor, and agree with the intent to have something unburdened with the additional stuff from TextTrackCue
Rob: My polyfill is based on VTTCue, but I have another for Safari where TextTrackCue has a constructor, based on a presentation from Eric and Tess from 2020 I think
… where they were trying to use TextTrackCue directly
… Exposing the constructor would enable them to do that as well, for the cue they wanted to design
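Rob's VTTCue-based approach might be sketched roughly as follows — an illustration only, not his actual polyfill; the `type`/`value` fields follow the old DataCue proposal, and the fallback base class exists only so the sketch is self-contained outside a browser:

```javascript
// Minimal DataCue-style cue built on VTTCue where available.
const CueBase = (typeof VTTCue !== 'undefined')
  ? VTTCue
  : class { // bare stand-in for environments without VTTCue
      constructor(startTime, endTime) {
        this.startTime = startTime;
        this.endTime = endTime;
      }
    };

class DataCuePolyfill extends CueBase {
  constructor(startTime, endTime, value, type) {
    super(startTime, endTime, '');
    this.value = value; // arbitrary structured metadata payload
    this.type = type;   // e.g. 'org.id3' — enables routing by cue type
  }
}
```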
Nigel: Makes sense, I'd ask what the default event handlers should be
… VTTCue has some default behaviour
Rob: Should I raise an issue?
Chris: Please, in the WICG/DataCue repo
… Also we can discuss offline, if helpful
[adjourned]
<kaz> kaz: We should clarify how to organize the discussion around these topics (and possibly some more issues around MSE). Let's talk about that during the ME Chairs call.