14:55:51 RRSAgent has joined #me
14:55:51 logging to https://www.w3.org/2022/05/16-me-irc
14:55:59 Zakim has joined #me
14:56:00 xfq has joined #me
14:56:13 Meeting: Media Timed Events
14:56:23 Chair: Chris_Needham
14:57:03 Agenda: https://www.w3.org/events/meetings/8a8b3816-aa7e-4f6f-8e84-1875fb98d85c
15:04:08 present: Yuhao_Fu, Fuqiao_Xue, Rob_Smith, Kazuyuki_Ashimura
15:07:32 scribe+ cpn
15:07:46 present+ Iraj_Sodagar
15:08:16 Topic: Proposed changes to DataCue explainer
15:08:29 Chris: https://github.com/WICG/datacue/pull/32/files
15:09:44 ... This proposes to separate the explainer into two parts. One is for DASH events, which covers browser-native handling of DASH emsg events
15:10:15 ... The second is purely about the DataCue API itself
15:13:02 ... Separating allows us to pursue the two parts independently. Collaboration with DASH-IF suggested producing a polyfill implementation based on their event model
15:13:16 ... There would be more to do to figure out the details in an MSE-based implementation
15:13:37 ... We don't yet know if any browsers are supportive of implementing native DASH event handling
15:14:26 ... One of the arguments I've heard is that parsing the media segments to extract events is not costly in terms of application performance, so it could be left to the media player library or web application rather than the browser
15:15:01 ... We would perhaps need performance data to show there is a real need for browser-native event handling
15:18:35 ... The intent is to allow us to progress DataCue by itself; the reason is that even with application-level parsing, there's a need to trigger events at the right time during media playback
15:20:10 ... That's what DataCue provides. The proposal for native emsg handling needs other capabilities, such as subscribing to specific event types and setting the dispatch mode
15:20:46 ... Does this approach sound OK? Open to alternative suggestions
15:21:56 Rob: Sounds sensible to me. My understanding is DataCue is a mechanism to present the raw data; hopefully it contains a label to indicate the type of data, but there's no action taken
15:22:44 ... DASH events require some action by the UA, so I see them as parallel activities. Simplest to expose the data first, then the second part talks about processing
15:25:40 Yuhao: Question about parsing the container to create DataCues. If I want to use emsg, do I need to parse the video myself, rather than get it from the video element?
15:29:23 ... If I use the appendBuffer API, does it mean the browser would get the fMP4 data, parse the emsg box, and create the DataCues internally? So if I want to parse the container myself and prevent the browser from creating DataCues, how do I do that?
15:31:03 Chris: Good question. I don't know if we've considered that yet. A way it could work is to use a subscribe/unsubscribe API. The idea is that if your app subscribes to certain events, then you're asking the browser to handle those events
15:31:25 ... When you subscribe you'd pass in an event type identifier
15:32:38 Yuhao: So you'd subscribe, and use addTrack from the application side. So I can still get an addtrack event. If I invoke the TextTrack's addTrack, the listener will trigger
15:33:29 RobSmith has joined #me
15:33:47 Chris: This is an early draft, I think those details would need to be figured out
15:34:42 Yuhao: How can I tell if the addtrack event is caused by a track being added manually, or by the browser automatically generating the track? Maybe not for now...
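A minimal TypeScript sketch of the subscribe/unsubscribe idea discussed above. This is purely illustrative: no such API exists or has been specified, and the EmsgSubscriptionTarget interface, subscribe()/unsubscribe() names, and example scheme URN are all invented here; only the addtrack/TextTrack parts use existing APIs.

    // Hypothetical shape only: the subscribe/unsubscribe API is an idea under
    // discussion, not a specified or implemented interface. EmsgSubscriptionTarget
    // and subscribe()/unsubscribe() are invented names used purely to illustrate
    // "pass in an event type identifier when you subscribe".
    interface EmsgSubscriptionTarget {
      subscribe(schemeIdUri: string, value?: string): void;
      unsubscribe(schemeIdUri: string, value?: string): void;
    }

    const video = document.querySelector('video')!;

    // Pretend the media element exposed such a subscription surface.
    const target = video as HTMLVideoElement & EmsgSubscriptionTarget;

    // Subscribing would ask the browser to surface matching emsg boxes from
    // segments appended via MSE, instead of the app parsing the container itself.
    // The scheme id below is a made-up example, not a registered scheme.
    target.subscribe('urn:example:my-events', '1');

    // The app could then pick up the browser-generated metadata track via the
    // existing addtrack event. Note the open question from the minutes: there is
    // currently no way to tell a browser-generated track from one the app added.
    video.textTracks.addEventListener('addtrack', (e) => {
      const track = e.track as TextTrack | null;
      if (track && track.kind === 'metadata') {
        track.mode = 'hidden'; // receive cue events without any rendering
      }
    });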
15:35:44 Chris: We should review based on the DASH-IF processing model too
15:37:10 ... We need DataCue because otherwise you have to use VTTCue, which is for captions and doesn't allow you to store arbitrary data structures
15:37:46 ... VTTCue can be used as a workaround today for application-level timed metadata events
15:38:59 ... The DataCue API itself is proposed to be the same as the WebKit implementation, which we think has everything needed
15:39:33 ... It has 'value' and 'type' fields
15:40:36 ... The proposal is to deprecate the 'data' ArrayBuffer in favour of the new 'value' field
15:42:16 Iraj: If the separation makes it easier for browser vendors to support, it's a good thing
15:44:54 ... We also need to discuss MSE handling of events. But would DataCue need extension in future to support browser handling of emsg events?
15:45:28 +1 keep it simple
15:45:41 ... Anything that lowers the bar is a good thing. Even if the first version of DataCue is simplified and needs further extension, that's OK
15:46:35 Chris: The DASH events proposal should cover that, I think
15:47:17 Iraj: Can we map the event timing model? The only question maybe is whether you need a subscription model, or whether the application can request all events coming in
15:48:14 ... In terms of other timing information, if the cue has a start time and duration, we may need to look at that
15:48:22 Chris: Timing requirements for emsg?
15:49:31 Karen has joined #ME
15:49:38 Iraj: The two dispatch modes, on-receive and on-start. For an MSE integration, both are needed. On-receive mode is there for the application to process the event ahead of time, so it's ready when the data becomes active
15:49:46 ... Is that valid in the DataCue model?
15:51:22 Chris: DataCue only gives on-start processing, so on-receive handling is done by the media player or web app
15:52:03 ... Timing accuracy requirements for on-start processing?
15:53:39 Iraj: Every timed element has a start time and duration expressed in a timescale (in ticks). Both are integers. Those are used in metadata tracks. The cue uses double floating point for the time, so you can do fractions, which should be OK
15:54:14 ... You can achieve frame-accurate events on the media timeline with those data types
15:55:35 Chris: This is where it gets more difficult, JS execution is not closely tied to media playback
15:56:48 Iraj: Two questions: can the API definition support it, and can implementations support frame accuracy? Maybe some platforms such as DVB or HbbTV could define whether they need that
15:59:21 Chris: On the first question, the time precision should be enough. But on the second, the HTML spec may not allow frame-accurate rendering
15:59:39 ... If frame accuracy is needed, we should look at those use cases more closely and propose a different rendering model
16:00:23 ... Do we need to redefine our goals?
16:01:47 Rob: I'm involved in a geo-alignment activity in Immersive Web, which considers not only where you are, but where you're looking. It needs frame accuracy. You need to show where you're looking in a video stream for AR applications
16:02:41 Chris: Should we join up with others in that discussion?
16:03:59 Rob: Yes, but it's early. Geo-alignment was being looked at in the Immersive Web group a few years ago. There's been a shift away from AR to VR, so there's a move to drop the geo-alignment work
16:04:09 ... In OGC, there's a group doing geo-pose with 6DoF for AR, and a new activity with W3C just starting
16:07:57 ... I made a short video that syncs video with location for a street drone car. The video is up on YouTube
16:08:38 OGC T17 Moving Features autonomous vehicle analysis: https://youtu.be/-BjeAp_hgQc
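A short TypeScript sketch tying together the DataCue points above, under stated assumptions: the app or player library has already parsed an emsg box itself (per the earlier point that app-level parsing is cheap), and the EmsgEvent shape, cueFromEmsg, and scheduleEvent names are illustrative only. It shows the tick-to-seconds conversion Iraj described, the WebKit-style DataCue with 'value' and 'type' where available, the VTTCue workaround otherwise, and the cue 'enter' event as on-start dispatch.

    // Illustrative event shape, with field names following the ISO BMFF emsg
    // definition; in practice this would come from the player's segment parser.
    interface EmsgEvent {
      schemeIdUri: string;
      value: string;
      timescale: number;        // ticks per second
      presentationTime: number; // in ticks
      eventDuration: number;    // in ticks
      id: number;
      messageData: Uint8Array;
    }

    // DataCue with 'value' and 'type' is WebKit-only today, so look it up
    // dynamically rather than assuming it exists.
    const DataCueCtor = (window as any).DataCue as
      (new (start: number, end: number, value: unknown, type?: string) => TextTrackCue) | undefined;

    function cueFromEmsg(ev: EmsgEvent): TextTrackCue {
      // emsg times are integers in 'timescale' ticks; cue times are double
      // seconds, so fractional values are representable.
      const start = ev.presentationTime / ev.timescale;
      const end = start + ev.eventDuration / ev.timescale;

      if (DataCueCtor) {
        // Proposed shape: arbitrary structured 'value' plus a 'type' string.
        return new DataCueCtor(start, end, { id: ev.id, data: ev.messageData }, ev.schemeIdUri);
      }
      // The VTTCue workaround mentioned above: carry the payload as JSON text
      // (binary message_data would need e.g. base64 encoding here).
      return new VTTCue(start, end, JSON.stringify({ id: ev.id, scheme: ev.schemeIdUri, value: ev.value }));
    }

    // Attach cues to a hidden metadata track; the cue's 'enter' event gives
    // on-start dispatch. On-receive handling stays in the player or app.
    const videoEl = document.querySelector('video')!;
    const metadataTrack = videoEl.addTextTrack('metadata', 'timed events');
    metadataTrack.mode = 'hidden';

    function scheduleEvent(ev: EmsgEvent): void {
      const cue = cueFromEmsg(ev);
      cue.onenter = () => {
        // Fires when playback reaches cue.startTime (on-start semantics).
      };
      metadataTrack.addCue(cue);
    }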
16:08:54 Topic: Next meeting
16:09:01 Chris: June 20
16:09:08 [adjourned]
16:09:14 rrsagent, draft minutes
16:09:14 I have made the request to generate https://www.w3.org/2022/05/16-me-minutes.html cpn
16:09:18 rrsagent, make log public
16:35:54 Karen has joined #ME
18:35:03 Karen has joined #ME
19:38:07 Zakim has left #me