Meeting minutes
Proposed changes to DataCue explainer
Chris: https://
… This proposes to separate into two parts. One is for DASH events, which is for browser-native handling of DASH emsg events
… The second is purely about DataCue API itself
… Separating allows us to pursue the two parts independently. Collaboration with DASH-IF suggested producing a polyfill implementation based on their event model
… There would be more to do to figure out the details in an MSE based implementation
… We don't yet know if any browsers are supportive of implementing native DASH event handling
… One of the arguments I've heard is that parsing the media segments to extract events is not costly in terms of application performance, so could be left to the media player library or web application rather than the browser
… We would perhaps need performance data to show there is a real need for browser native event handling
… The intent is so we can progress DataCue by itself, and the reason for that is that even with application-level parsing, there's a need to trigger events at the right time during media playback
… That's what DataCue provides. The proposal for native emsg handling needs to have other capabilities, such as subscribing to specific event types and setting dispatch mode
… Does this approach sound OK? Open to alternative suggestions
Rob: Sounds sensible to me. My understanding is DataCue is a mechanism to present the raw data, hopefully it contains a label to indicate the type of data, but there's no action taken
… DASH events require some action by the UA, so I see them as parallel activities. Simplest is to expose the data, then the second part talks about processing
Yuhao: Question about parsing the container to create DataCues. If I want to use emsg, I need to parse the video myself, not get it from the video element?
… If I use the appendBuffer API, does it mean the browser would get the fMP4 data and parse the emsg box and create the DataCues internally? So if I want to parse the container myself and prevent the browser from creating DataCues, how to do that?
Chris: Good question. I don't know if we've considered that yet. One way it could work is a subscribe/unsubscribe API. The idea is that if your app subscribes to certain events, you're asking the browser to handle those events
… When you subscribe you'd pass in an event type identifier
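A minimal sketch of what that subscription model might look like. `EmsgSubscriptions`, its methods, the dispatch-mode strings, and the scheme identifiers below are all invented names for illustration, not anything from the draft:

```typescript
// Hypothetical subscription registry, illustrating the idea that an app
// opts in to browser-side handling of specific emsg event types.
// None of these names come from a spec; they are assumptions for illustration.
type DispatchMode = "on-receive" | "on-start";

class EmsgSubscriptions {
  private subs = new Map<string, DispatchMode>();

  // Subscribing asks the UA to surface events whose scheme id matches.
  subscribe(schemeIdUri: string, mode: DispatchMode): void {
    this.subs.set(schemeIdUri, mode);
  }

  unsubscribe(schemeIdUri: string): void {
    this.subs.delete(schemeIdUri);
  }

  // The UA would consult this while parsing appended segments: unsubscribed
  // event types are left for the application to parse itself.
  isSubscribed(schemeIdUri: string): boolean {
    return this.subs.has(schemeIdUri);
  }
}

const subs = new EmsgSubscriptions();
subs.subscribe("urn:scte:scte35:2013:bin", "on-receive");
console.log(subs.isSubscribed("urn:scte:scte35:2013:bin")); // true
console.log(subs.isSubscribed("urn:example:other"));        // false
```

This would also answer Yuhao's question below: not subscribing to an event type would leave its parsing entirely to the application.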
Yuhao: So you'd subscribe, and use addTrack from the application side. So I can still get an addtrack event. If I invoke the TextTrack's addTrack, the listener will trigger
Chris: This is an early draft, I think those details would need to be figured out
Yuhao: How can I tell if the addtrack event is caused by a track being added manually, or by the browser automatically generating the track? Maybe not for now...
Chris: We should review based on the DASH-IF processing model too
… We need DataCue as otherwise you have to use VTTCue, which is for captions and doesn't allow you to store arbitrary data structures
… VTTCue can be used as a workaround today for application-level timed metadata events
… The DataCue API itself is proposed to be the same as the WebKit implementation, which we think has everything needed
… It has 'value' and 'type' fields
… Proposal is to deprecate the 'data' ArrayBuffer in favour of the new 'value' field
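As a hedged sketch, the VTTCue workaround and the proposed cue shape might compare as below. The `DataCueLike` interface is one reading of the description above, not spec IDL; the metadata payload and type string are invented, and the minimal `VTTCue` class is a stand-in so the snippet runs outside a browser, where the platform constructor would be used instead:

```typescript
// Minimal stand-in for the browser's VTTCue so this sketch is self-contained;
// in a real page the platform VTTCue constructor would be used.
class VTTCue {
  constructor(public startTime: number, public endTime: number, public text: string) {}
}

// Today's workaround: serialise arbitrary timed metadata as JSON into cue text.
const payload = { event: "ad-start", duration: 30 }; // invented payload
const vttCue = new VTTCue(12.0, 12.5, JSON.stringify(payload));
const recovered = JSON.parse(vttCue.text); // app parses it back out on cue enter
console.log(recovered.event); // "ad-start"

// Approximate shape of the proposed DataCue, after the WebKit implementation:
// 'value' holds an arbitrary structure, 'type' identifies it, and the legacy
// 'data' ArrayBuffer would be deprecated in favour of 'value'.
interface DataCueLike {
  startTime: number;
  endTime: number;
  value: unknown;   // arbitrary data structure, no JSON round-trip needed
  type: string;     // identifies how to interpret 'value'
  /** @deprecated superseded by 'value' */
  data?: ArrayBuffer;
}

const dataCue: DataCueLike = {
  startTime: 12.0,
  endTime: 12.5,
  value: payload,
  type: "urn:example:ad-metadata", // hypothetical type identifier
};
console.log(dataCue.type);
```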
Iraj: If the separation makes it easier for browser vendors to support, it's a good thing
… We also need to discuss MSE handling of events. But would DataCue need extension in future if browsers handle emsg events natively?
<RobSmith> +1 keep it simple
Iraj: Anything that lowers the bar is a good thing. Even if the first version of DataCue is simplified, and needs further extension, that's OK
Chris: The DASH events proposal should cover that, I think
Iraj: Can we map the event timing model? The only question maybe is whether you need a subscription model, or whether the application can request all events coming in
… In terms of other timing information, if the cue has start time and duration, may need to look at that
Chris: Timing requirements for emsg?
Iraj: The two dispatch modes, on-receive and on-start. For an MSE integration, both are needed. On receive mode is there for the application to process the event ahead of time, so it's ready when the data becomes active
… Is that valid in the DataCue model?
Chris: DataCue only gives on-start processing, so on-receive handling is done by the media player or web app
… Timing accuracy requirements for on-start processing?
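The two dispatch modes can be illustrated with a small simulation, assuming on-receive fires as soon as an event is parsed out of an appended segment and on-start fires when the playback position crosses the event's start time. All names here (`TimedEvent`, `onSegmentParsed`, `onTimeUpdate`) are invented for illustration:

```typescript
// Tiny simulation of the two DASH event dispatch modes; not a spec API.
interface TimedEvent {
  startTime: number; // seconds on the media timeline
  payload: string;
}

const log: string[] = [];

// on-receive: dispatch as soon as the event is parsed from an appended
// segment, possibly well before its start time, so the app can prepare.
function onSegmentParsed(ev: TimedEvent): void {
  log.push(`on-receive ${ev.payload}`);
}

// on-start: dispatch when playback reaches the event's start time.
function onTimeUpdate(ev: TimedEvent, currentTime: number): void {
  if (currentTime >= ev.startTime) {
    log.push(`on-start ${ev.payload}`);
  }
}

const ev: TimedEvent = { startTime: 20, payload: "ad-break" };
onSegmentParsed(ev);     // fires while the containing segment is appended
onTimeUpdate(ev, 5);     // too early: nothing happens
onTimeUpdate(ev, 20.02); // playback reached the event: fires
console.log(log); // [ "on-receive ad-break", "on-start ad-break" ]
```

Under the split Chris describes, DataCue would only cover the second path; the first would stay with the media player or the native DASH events proposal.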
Iraj: Every timed element has a start time and duration expressed in a timescale (in ticks). Both are integers. Those are used in metadata tracks. The cue times use double-precision floating point, so you can do fractions; should be OK
… You can achieve frame accurate events with the media timeline with those data types.
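Iraj's point about integer tick times fitting into double-precision cue times can be checked with a small example. The 90 kHz timescale and the tick values below are assumptions for illustration, though 90 kHz is a common media timescale:

```typescript
// emsg and track timing carry integer times in ticks of a timescale.
// Converting to the double-precision seconds used by cue times is exact
// for integers up to 2^53, and division is correctly rounded.
function ticksToSeconds(ticks: number, timescale: number): number {
  return ticks / timescale;
}

const timescale = 90000;   // 90 kHz, a common media timescale (assumption)
const startTicks = 180003; // hypothetical event start
console.log(ticksToSeconds(startTicks, timescale)); // ~2.0000333 s

// One frame at 30 fps is 3000 ticks at 90 kHz, so frame boundaries on the
// media timeline are representable without drift:
console.log(ticksToSeconds(3000, timescale) === 1 / 30); // true
```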
Chris: This is where it gets more difficult, JS execution is not closely tied to media playback
Iraj: Two questions: can the API definition support it, and can implementations support frame accuracy? Maybe some platforms such as DVB or HbbTV could define whether they need that
Chris: On the first question, the time precision should be enough. But on the second, the HTML spec may not allow frame accurate rendering
… If frame accuracy is needed, we should look at those use cases more closely and propose a different rendering model
… Do we need to redefine our goals?
Rob: I'm involved in a geo-alignment activity in Immersive Web, which considers not only where you are, but where you're looking. It needs frame accuracy. We need to show where you're looking in a video stream for AR applications
Chris: Should we join up with others in that discussion?
Rob: Yes, but it's early. Geo-alignment was being looked at in the Immersive Web group a few years ago. There's been a shift away from AR to VR, so there's a move to drop the geo-alignment work
… In OGC, there's a group doing geo-pose with 6DoF for AR, and new activity with W3C just starting
… I made a short video that syncs video with location for a street drone car. The video is up on YouTube
<RobSmith> OGC T17 Moving Features autonomous vehicle analysis: https://
Next meeting
Chris: June 20
[adjourned]