Meeting minutes
Agenda
Slideset:
https://
Chris: (summarizes the agenda)
… Sync on the Web CG, SVTA MSE issue review
… anything else for today?
(none)
Sync on the Web CG
Chris: There was a breakout session during TPAC hosted by Kensaku Komatsu on media synchronization,
… input events, MIDI, GamePad, Media over QUIC
… a CG has been created
Chris: I'm planning to participate in the CG
… I don't know about the group's plan for how they'll organise the work at the moment, but once they've settled, we can get more information.
MSE issues recap
Chris: Continue discussion about MSE issues discussed on Dec 10
Buffer capacity
Chris: (revisits the topic around buffer capacity)
… I checked for existing MSE issues that are related.
… There were several feature request proposals, and Google Chrome wrote a proposed MSE Introspection API draft
Issue 35 - Report buffer changes in update events
Issue 40 - Needs event to notify when sourceBuffer needs more data
Issue 172 - Consider adding API for app to know how much room is left in the SourceBuffer
Issue 259 - Consider adding support for apps to get metadata about what is currently buffered
MSE SourceBuffer Introspection Proposal
Chris: My recommendation would be to use issue 172 to continue the discussion. Commenting on the issues will flag them to the Media WG for attention.
… I'd recommend that we consider the buffer capacity requirements taking into account the Managed Media Source API as well as these previous discussions and proposals.
… For example, there's the Managed Media Source bufferedchange event
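As background for the capacity discussion, an app can already estimate how much media is buffered ahead of the playhead from `SourceBuffer.buffered`. A minimal sketch, assuming the TimeRanges have been flattened into `[start, end]` pairs (function and variable names are illustrative, not from any proposal):

```javascript
// Sketch: seconds of media buffered ahead of the playhead, given
// TimeRanges flattened into [start, end] pairs as an app would read
// them from SourceBuffer.buffered.
function bufferedAhead(ranges, currentTime) {
  for (const [start, end] of ranges) {
    if (currentTime >= start && currentTime <= end) {
      return end - currentTime;
    }
  }
  return 0; // playhead is in a gap between buffered ranges
}

// Playhead at 4s inside a 0-10s range leaves 6s buffered ahead.
console.log(bufferedAhead([[0, 10], [20, 30]], 4)); // 6
```

This only measures what is buffered, not remaining capacity, which is exactly the gap the introspection proposals aim to fill.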
… Another point raised in our last meeting is that some MSE implementations require web apps to append entire mdat box structures
… The MSE append algorithm doesn't require this, but some implementations are more restricted
… So I'm wondering what we can most usefully do here. It doesn't seem to be a spec issue. I'm not sure if we have tests within the existing WPT test suite,
… but that might be useful to have as a guide for implementations on constrained devices.
… Does this issue mostly apply to older devices, or do we still see modern implementations with this limitation?
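As a sketch of what an app targeting such restricted implementations might do, the following checks whether a byte chunk ends exactly on a top-level ISOBMFF box boundary before appending. Illustrative only; it ignores 64-bit `largesize` (size == 1) and size == 0 boxes:

```javascript
// Sketch: does this chunk end exactly on a top-level ISOBMFF box
// boundary? Each box starts with a 32-bit big-endian size followed by
// a 4-character type.
function endsOnBoxBoundary(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 0;
  while (offset + 8 <= bytes.byteLength) {
    const size = view.getUint32(offset); // big-endian box size
    if (size < 8) return false; // largesize/open boxes not handled in this sketch
    offset += size;
  }
  return offset === bytes.byteLength;
}

// A complete 8-byte box (size = 8, type 'free') passes; a truncated one fails.
const box = new Uint8Array([0, 0, 0, 8, 0x66, 0x72, 0x65, 0x65]);
console.log(endsOnBoxBoundary(box));                // true
console.log(endsOnBoxBoundary(box.subarray(0, 6))); // false
```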
Daniel: In dash.js we only pass full moov and mdat boxes to MSE. It is more about legacy devices.
… It works in WebKit, so we don't need to parse the media. We could add a flag to dash.js to let apps decide whether parsing should happen or not.
Chris: And there isn't a good way to feature detect, so applications would rely on trying to identify the device model
Daniel: We're thinking of adding some buffering to improve performance
Louay: We have two test runners, with WPT and the DPCTF streaming test suite.
… We can add test cases for this, it could be the best way to move forward.
… Test cases and content need preparing though, as does how to observe pass or failure.
Daniel: It might not need new content if we could request byte ranges not aligned with the box structures,
… we'd just need to know where chunks are closed or open.
Francois: From a spec perspective, it says implementations must support incomplete segments, so we should add to the main test suite, WPT?
Chris: That would make sense if we can
Louay: Both WPT and DPCTF have relevant test cases.
… The solution depends on which observation framework is most applicable in order to validate the result.
… DPCTF uses camera recording to capture the rendered output.
Box parsing
Chris: Last time we discussed various different boxes: EMSG, PRFT, ELST. Focusing for now on EMSG, we've discussed this a lot in the past.
Issue 189 - Add Support for Media-Encoded Events
Chris: The related MSE issue is 189 for media-encoded events
… MEIG wrote the Requirements for Media Timed Events Note,
… and this led to the API proposal for DataCue
Requirements for Media Timed Events Note
Draft DataCue API spec by WICG
Chris: I'm very happy to support people who would like to continue work on that
… During TPAC 2024 there was discussion of adding subtitles and captions in MSE
… Given EMSG is somewhat similar, relating to media parsing, timeline, and cues, if work happens on caption support maybe it could also consider EMSG
Chris: A generic box parsing API might not get support; it's BMFF-specific and browsers support other media formats, and there were also concerns discussed last time about nested boxes.
… So I recommend considering each use case, and maybe there are different proposals for each.
… For container parsing, there was a question around the performance of using JS or WASM.
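To make the JS-vs-WASM question concrete, here is a rough sketch of parsing a DASH `emsg` (version 0) box in plain JavaScript. It is unoptimised and illustrative; a real parser would also handle version 1 and 64-bit sizes, and the test vector below is constructed, not real content:

```javascript
// Sketch: parse a DASH 'emsg' version 0 box (ISO/IEC 23009-1 layout:
// scheme_id_uri and value as NUL-terminated strings, then timescale,
// presentation_time_delta, event_duration, id as 32-bit big-endian).
function parseEmsgV0(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let offset = 8; // skip size (4) + 'emsg' type (4)
  const version = view.getUint8(offset);
  offset += 4; // version (1) + flags (3)
  if (version !== 0) throw new Error('only version 0 handled in this sketch');
  const readCString = () => { // assumes a terminating NUL is present
    let end = offset;
    while (bytes[end] !== 0) end++;
    const s = new TextDecoder().decode(bytes.subarray(offset, end));
    offset = end + 1;
    return s;
  };
  const schemeIdUri = readCString();
  const value = readCString();
  return {
    schemeIdUri,
    value,
    timescale: view.getUint32(offset),
    presentationTimeDelta: view.getUint32(offset + 4),
    eventDuration: view.getUint32(offset + 8),
    id: view.getUint32(offset + 12),
  };
}

// Construct a minimal emsg box as a test vector (assumed values).
const enc = new TextEncoder();
const emsg = new Uint8Array([
  [0, 0, 0, 39],                  // box size: 39 bytes total
  [...enc.encode('emsg')],        // box type
  [0, 0, 0, 0],                   // version 0, flags 0
  [...enc.encode('urn:test'), 0], // scheme_id_uri (hypothetical)
  [...enc.encode('1'), 0],        // value
  [0, 0, 3, 0xE8],                // timescale = 1000
  [0, 0, 0, 0],                   // presentation_time_delta = 0
  [0, 0, 7, 0xD0],                // event_duration = 2000
  [0, 0, 0, 7],                   // id = 7
].flat());
console.log(parseEmsgV0(emsg));
```

Benchmarking something like this against a WASM equivalent would be one way to produce the performance data mentioned below.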
webcodecs Issue 24 - API for containers?
MEIG Issue 108 - Media Containers API
Chris: The general feeling in Media WG is that this isn't expensive for applications, so there isn't motivation to provide a browser API.
… Progressing that might need us to show performance data on the benefit of a native implementation
Chris: The other suggestion raised last time was a WebCodecs and Canvas based player
… The missing piece there would be content protection.
webcodecs Issue 41 - Support for content protection
Chris: So overall, we might want to look at use cases again, e.g., timed metadata, if they need more investigation
Daniel: MOOV timescale correction is about bad content authoring; the time in the manifest may not be correct.
… We compare the times in the manifest and the media segment, and if there's a difference, use the media segment value.
… It's just a requirement for people writing content
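The correction Daniel describes could be sketched as follows; the function name and tolerance are illustrative assumptions, not dash.js internals:

```javascript
// Sketch: prefer the presentation time signalled in the media segment
// over the manifest value when they disagree beyond a small tolerance.
function correctedStartTime(manifestTime, segmentTime, toleranceSeconds = 0.1) {
  return Math.abs(manifestTime - segmentTime) > toleranceSeconds
    ? segmentTime   // trust the media segment
    : manifestTime; // values agree; keep the manifest value
}

console.log(correctedStartTime(10.0, 10.02)); // 10 (within tolerance)
console.log(correctedStartTime(10.0, 12.5));  // 12.5 (segment wins)
```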
Chris: Does this apply just at the start of media playback?
Daniel: Yes
Rob: To follow up the generic metadata discussion,
… I found myself moving more towards generic metadata.
… It's more than parsing efficiency; it's being able to expose the type to filter the metadata.
… It can contain any content. What you're interested in is identifying the items of metadata of interest among others in the stream.
… The motivation for DataCue is to identify timed metadata vs timed text, whereas VTTCue is for timed text.
… WebVMT is a variant of WebVTT, for spatial data related to video.
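The type-based filtering Rob describes could look like the following sketch, with plain objects standing in for cue instances (as in the draft DataCue proposal) and hypothetical type identifiers:

```javascript
// Sketch: select only the timed metadata cues of a given type from a
// mixed stream. Cue shape (startTime, endTime, type, value) mirrors the
// draft DataCue proposal; the type strings here are made up.
function cuesOfType(cues, type) {
  return cues.filter((cue) => cue.type === type);
}

const cues = [
  { startTime: 0, endTime: 1, type: 'com.example.location', value: { lat: 51.5, lng: -0.1 } },
  { startTime: 0, endTime: 1, type: 'com.example.accel', value: { g: 2.3 } },
];
console.log(cuesOfType(cues, 'com.example.accel').length); // 1
```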
Chris: In your use case, do you have the data multiplexed in the video stream?
Rob: No. It can contain anything, location and timed location, speed and distance, but it's carried out of band.
… The scope now includes sensor data, e.g., dash-cams with accelerometers.
… Use cases for monitoring collisions, e.g., if there's a spike in the accelerometer data, you're interested to see the associated video.
Chris: This is an important capability. Do you have scope to work on this?
Rob: Yes, we have a testbed. The GIMI standard aims to harmonise existing capabilities.
… It's a replacement for NITF (National Imagery Transmission Format), and integrates ISOBMFF.
… HEIF is of interest; imagine an image pyramid with different levels of detail, which has application in online maps for zooming in.
… They want to harmonise and unify all this.
<RobSmith> OGC Testbed-20 GIMI aims: https://
Chris: Perhaps decoupling the DataCue API proposal from the in-band sourcing of EMSG could be a way to make progress.
… DataCue is useful without that. Let's follow up on that.
Codec information
Chris: This seems like a straightforward API proposal to be able to query the codec information of the media segment.
… It's related to the proposed MSE introspection API; there's a comment in issue 259
Issue 259 - Consider adding support for apps to get metadata about what is currently buffered
Chris: But maybe this could be progressed independently instead of being bundled with the question about buffer capacity?
… I'd recommend commenting in issue 259 to follow up.
Dynamic addition/removal of Source Buffers
Chris: I found one related issue, 160, but I'm not sure if that covers all the issues around this.
Chris: We discussed last time that some parts of the API that would enable these use cases aren't necessarily implemented. We need to look more closely into it.
Daniel: DASH has multi-period MPDs. Some periods might not have an audio track, so we need to transition between audio+video and video only.
… Or similarly if a period has audio but no video track.
… A workaround is to fill in dummy audio or video data. Or in dash.js we reinitialise MSE, which can introduce pauses or gaps.
… The use case is server-guided ad-insertion, and transitions from manifest to manifest, where there may be overlapping segments that need handling.
… If you can switch between two media sources, it could go in a similar direction.
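One way an app can handle the overlapping segments Daniel mentions is to compute the range to trim from the buffer (what would then be passed to `SourceBuffer.remove(start, end)`) before appending the next period. A minimal sketch with illustrative names:

```javascript
// Sketch: given the end of the currently buffered period and the start
// of the next one, return the overlapping range to remove, or null if
// there is no overlap.
function overlapToRemove(bufferedEnd, nextPeriodStart) {
  if (nextPeriodStart >= bufferedEnd) return null; // no overlap
  return { start: nextPeriodStart, end: bufferedEnd };
}

console.log(overlapToRemove(62.0, 60.0)); // { start: 60, end: 62 }
console.log(overlapToRemove(60.0, 60.0)); // null
```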
Chris: Media over QUIC vs DASH for low latency. DASH delivery is important
… continuing DASH playback
… I'm not sure if there is any written-down use case about multi-period
Daniel: I could try to create an issue
Chris: Please do, something in the MSE issue tracker, let's follow up
Multiple Source Buffers
Issue 357 - Proposal: Have a detachable MediaSource object
Chris: Similarly, I'm not sure the use case is fully covered by this proposal
Daniel: segments might overlap
Chris: (goes through Issue 357)
… detach a MediaSource from a media element temporarily
… Maybe a new issue is needed for this point,
… I didn't see another issue in the MSE repo around your use case.
Kaz: Next steps? Whether to create issues or comment on the existing issues is a key question
… We can talk about additional issues next time
Chris: Daniel and I can follow up on the MSE issues, and with Rob on DataCue
… Our next meeting will be held on Feb 4.
[adjourned]