Meeting minutes
DataCue status update
Chris: Work has slowed recently, not clear what to do next
… Two main parts to the proposal: one DataCue API for timed metadata events, generic, store any object and trigger events during media playback
… Second part was for surfacing DASH emsg events through DataCue
… If people want to progress the emsg part then we need additional contributors
… The work started in MEIG, with a presentation from Giri with ATSC and 3GPP requirements
… Needs input to develop the technical proposal
… Can't and shouldn't do this myself
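The proposed API shape Chris describes can be sketched roughly as follows. DataCue is not yet standardized, so the constructor signature (startTime, endTime, value, type) follows the explainer and WebKit's existing implementation and may change; the payload object and type string are illustrative only.

```javascript
// Model of when a timed-metadata cue is active relative to the media clock:
// a cue is active when currentTime is within [startTime, endTime).
function cueActiveAt(cue, currentTime) {
  return currentTime >= cue.startTime && currentTime < cue.endTime;
}

// Browser-only sketch (hypothetical API shape, subject to change):
// store an arbitrary application object and get enter/exit events
// during playback via a metadata text track.
if (typeof DataCue !== 'undefined' && typeof document !== 'undefined') {
  const video = document.querySelector('video');
  const track = video.addTextTrack('metadata');
  const cue = new DataCue(10.0, 15.0, { adBreakId: 'mid-roll-1' }, 'com.example.ad');
  track.addCue(cue);
  cue.addEventListener('enter', () => console.log('cue active:', cue.value));
  cue.addEventListener('exit', () => console.log('cue ended'));
}
```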
Kaz: Can you share the existing resources?
Chris: The documents are linked from the GitHub page: https://
<kaz> issue 21
<kaz> Explainer
<kaz> Requirements
Kaz: We might want to look for an additional moderator?
Chris: Yes, also an editor who can help write documents, specifically the emsg integration with MSE
… But also for DataCue itself. Has value separate from emsg support
… Would like to progress that by itself
… Easier proposition, but needs editorial help
Kaz: So you can continue as moderator?
Chris: Yes
… We don't have many companies pushing for emsg support, so want to confirm the need for this
Kaz: Go back to the use cases and requirements and ask people their interest
Chris: Yes
<kaz> Use Cases
Francois: Also update the GitHub issues to say progress is blocked?
Chris: Could be a good thing to do next
<kaz> datacue issues
Chris: Will also talk with people at DASH-IF, they're working on interop issues around timed events
SEI events
Chris: Thank you to all who replied to the GitHub issues
… https://
<kaz> Issue 82 - Video SEI events
Chris: Explainer: https://
… and two open issues https://
… Let's discuss the open issues and decide how to update the explainer
… The goal is to understand the use cases for SEI events and turn that into a technical API proposal
… One thing I would like to understand is if this is a new API proposal, or if it aligns with DataCue
<kaz> leonardo's repo - issue 2 - Interaction with Encrypted Media Extensions
<kaz> leonardo's repo - issue 3 - Timing accuracy and decode/presentation ordering
Issue 2 (interaction with Encrypted Media Extensions)
Chris: On issue 2 (EME integration), this was mostly a clarification question: are the SEI events part of the encrypted bitstream?
… It seems so
… So would be good to describe in the explainer
Nigel: I agree, sounds like the only reasonable answer
<Zakim> nigel, you wanted to agree this must be the right answer
Chris: I recommend describing that in the explainer, to set the scope of the solution proposal
… So with EME it may not be possible to surface SEI events; worth clarifying in the explainer
Yuhao: Why can't we get the information? After decryption we have raw video frames
… After EME the video frame is handled by the rendering pipeline. So in this case maybe we can't get the information
… When using WebCodecs, how do we handle this? Does WebCodecs work with EME?
Chris: I don't think it does, so we don't have the same limitation
Yuhao: If we use WebCodecs, we need another way to do decryption, WASM or JS
Chris: With WebCodecs you would parse the video bitstream from the container in WASM or JS, then pass the video bitstream to WebCodecs
… If you're using encryption, that would have to be after parsing the container
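The flow Chris describes can be sketched as below: the application demuxes the container (in JS or WASM) and feeds encoded chunks to a WebCodecs VideoDecoder, so SEI NAL units can be inspected at the demux step. The NAL-type check assumes H.264 Annex B-style units, and the demuxer itself is out of scope here.

```javascript
// H.264: the low 5 bits of the first NAL-unit byte give the NAL type; type 6 is SEI.
function isSeiNalUnit(nalBytes) {
  return (nalBytes[0] & 0x1f) === 6;
}

// Browser-only sketch: configure a decoder for the demuxed bitstream.
// The app would call decoder.decode(new EncodedVideoChunk(...)) for each
// demuxed sample, after filtering or recording SEI units with isSeiNalUnit().
if (typeof VideoDecoder !== 'undefined') {
  const decoder = new VideoDecoder({
    output: (frame) => { /* render the decoded frame */ frame.close(); },
    error: (e) => console.error('decode error:', e),
  });
  decoder.configure({ codec: 'avc1.42E01E' }); // H.264 Baseline, illustrative
}
```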
Kaz: Let's clarify each use case, and what kind of framework and mechanism should be applied to each part for the expected service
… What kind of extension is expected here?
Chris: Yes, encryption is a point of detail really
Yuhao: Should we look at the media handling to see if we can get the video bitstream?
Chris: It's worth looking, yes. But unsure it's possible
Yuhao: I can spend some time on that, to see whether we can get the information after the EME pipeline and still get the SEI information
Chris: Let's update issue #2 with that information, then use this to write a short description into the explainer
Issue 3 (timing accuracy and ordering)
Chris: https://
Chris: I notice that we said we want to have access to SEI events in decode order rather than presentation order
… Looking at WebCodecs issue https://
… This is the related issue for WebCodecs
… We should bring our requirements to this GitHub issue
… The WebCodecs spec describes that VideoFrames are output in presentation order https://
… So if we want the SEI events in decode order, how does that affect the API we propose?
… If we want SEI events in presentation order, we can propose to attach them to VideoFrame
… But if we want them in decode order, we may need an event handler so the VideoDecoder can surface the events earlier
… The explainer doesn't describe WebCodecs currently
… https://
… So should we add it? Or should we focus on the video element?
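The two shapes Chris raises can be sketched hypothetically; neither API exists today. The helper below models why the two orders differ: with B-frames, decode order (dts) and presentation order (pts) are not the same sequence, so SEI delivered per VideoFrame arrives later than SEI surfaced by the decoder as it parses.

```javascript
// Model: chunks carry a decode timestamp (dts) and presentation timestamp (pts).
// WebCodecs outputs VideoFrames in presentation order, so frame-attached SEI
// would be presentation-ordered; a decoder event would fire in decode order.
function presentationOrder(chunks) {
  return [...chunks].sort((a, b) => a.pts - b.pts);
}

// Illustrative I/P/B pattern: the chunk decoded second (dts 1) is presented last.
const chunks = [
  { dts: 0, pts: 0 }, // I-frame
  { dts: 1, pts: 2 }, // P-frame, presented after the B-frame
  { dts: 2, pts: 1 }, // B-frame
];
console.log(presentationOrder(chunks).map((c) => c.dts)); // decode indices in presentation order
```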
Yuhao: I worked with WebCodecs recently. In some cases it's a better way to get the information from the media stream
… And it works well with WebRTC, so it should be considered
Nigel: Would it make sense to require WebCodecs to fire these events if it sees them?
… It would know how to fire them in the right order. If WebCodecs decodes the video it should provide this functionality
… You want anything that's decoding video to do it, somehow. But may be harder to define in the general case
Chris: Do we need to write something in the explainer, or do we simply add our input to https://
… The explainer could reference the GitHub issue
Francois: What I'd like to see is more a consideration of the synchronisation needs that the use cases have
… If we're talking about frame accuracy, are events a good approach?
… The explainer has AI-based subtitles for a live stream, which might not require frame accuracy
… If you use SEI metadata for volumetric video where you need strong synchronization between SEI metadata and video, then events aren't going to work
Chris: Issue 3 talks about timing requirements: https://
Francois: Events could be good enough for 100 ms
<xfq> +1 to tidoust
Chris: Do we have a use case that requires more timing accuracy?
Yuhao: In my use case, we don't need to render the SEI information on the exact frame that carries the SEI. It can tolerate 100 ms
… If we really want to synchronise the frame with SEI, we can use WebCodecs with Canvas, to really control the frame to render
… The SEI can be a property on the VideoFrame
… I can see use cases such as video editing, where the need for accuracy is high
… Seeking a video element isn't as accurate as needed. So use WebCodecs and Canvas to really control rendering
… But SEI event may not be for this case
… The proposal is intended to be easy to use, e.g. to calculate the end-to-end latency. So I don't need to match the SEI with the video frame
… It just needs to be as accurate as possible; the presentation vs decode order may also not be so important
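The frame-accurate path Yuhao describes can be sketched as below: decode with WebCodecs and paint each VideoFrame to a canvas, so SEI-derived data is rendered on exactly the frame it belongs to. Since VideoFrame has no SEI field today, the pairing is modeled as a side map keyed by frame timestamp, filled by the app's demuxer; all names here are illustrative.

```javascript
// Pair each decoded frame with SEI parsed from the bitstream at the same timestamp.
function seiForFrame(seiByTimestamp, frameTimestamp) {
  return seiByTimestamp.get(frameTimestamp) ?? null;
}

// Browser-only sketch, assuming the page has a <canvas> element.
if (typeof VideoDecoder !== 'undefined' && typeof document !== 'undefined') {
  const ctx = document.querySelector('canvas').getContext('2d');
  const seiByTimestamp = new Map(); // filled by the app's demuxer
  const decoder = new VideoDecoder({
    output: (frame) => {
      ctx.drawImage(frame, 0, 0); // full control of when the frame is shown
      const sei = seiForFrame(seiByTimestamp, frame.timestamp);
      if (sei) { /* overlay SEI-driven UI for exactly this frame */ }
      frame.close();
    },
    error: (e) => console.error('decode error:', e),
  });
}
```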
Francois: It could be useful to describe that the explainer is for the loose synchronization use cases, and WebCodecs is for highly accurate synchronization
Kaz: Agree to work on expectations for use cases and actual need from services, then look at potential application later
… One potential use case that might require precise synchronization could be interactive multimodal avatars
… Responding to speech in a realtime manner, which needs low latency
… Clarify if we need to handle such advanced use cases
Rob: The video editor is interesting, no real time element, but you want precision to associate with video frames
… Regarding loose vs tight sync, we're looking at this in WebVMT. Location is a loose sync use case, but orientation is a tight sync use case
… If you turn the camera around, the video frames change very quickly. AR use cases. Not sure what solutions are being used in AR
… A mobile phone streaming video, with a website that overlays information using AR
<Zakim> nigel, you wanted to ask about sync of message vs eventual correctness
Nigel: With applications that are doing something with metadata in realtime, human sensitivity to the timing may be higher when doing video editing
… In video editing you need to know the timing. When you pause you want the correct state for the video
… frame that you're paused on
<RobSmith> Accuracy vs latency
Nigel: Synchronisation of when messages are fired, so how quickly you can respond, then how quickly the state is updated
… For example if events are fired 120 ms behind live playback, then you pause. You want the events to come out correctly, so that the view you end up with is the correct one for that frame
Chris: Do we have that captured in a doc already? If not we should add to this explainer
Nigel: In a real world application for authoring captions and subtitles
Chris: Helpful to put application description into the explainer
RobSmith: I created a demo with video on two smartphones, then used WebVMT to sync them, then had to deal with latency in the browser
… Multiple cameras observing the same scene
Chris: For the explainer, describe the synchronization goals
… If that's loose sync, we should look at potential alignment with DataCue, which already gives loose sync
… Applications can use VTTCue today, so can be a way to prototype
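As Chris notes, VTTCue works in browsers today, so the loose-sync (~100 ms) case can be prototyped now by serializing the metadata into the cue text and deserializing it on cuechange. The payload format below is an assumption for illustration.

```javascript
// Serialize application metadata into/out of a cue's text payload.
function encodeMetadataCueText(obj) {
  return JSON.stringify(obj);
}
function decodeMetadataCueText(text) {
  return JSON.parse(text);
}

// Browser-only sketch: deliver timed metadata via a VTTCue on a metadata track.
if (typeof VTTCue !== 'undefined' && typeof document !== 'undefined') {
  const video = document.querySelector('video');
  const track = video.addTextTrack('metadata');
  track.mode = 'hidden'; // fire cue events without rendering the cues
  const cue = new VTTCue(10, 15, encodeMetadataCueText({ kind: 'sei', payload: '...' }));
  track.addCue(cue);
  track.addEventListener('cuechange', () => {
    for (const active of track.activeCues) {
      console.log('metadata:', decodeMetadataCueText(active.text));
    }
  });
}
```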
<RobSmith> WebVMT video sync demo: https://
Chris: I'll add a GitHub issue about DataCue, let's discuss there
… Use an existing mechanism if we can, or avoid proposing multiple solutions to similar problems
Kaz: Will we try to generate some concrete use case description, or continue discussing the explainer
Chris: I think the use cases could have some more detail
Takio: There are other mechanisms such as ID3; video.js can use that. Recommend clarifying the use cases, then doing a gap analysis
Chris: I agree
Next meeting
Chris: Next planned meeting is March 21. Should we meet earlier?
… Can talk about plan, once we have the explainer ready
<kaz> [adjourned]