14:53:41 RRSAgent has joined #me
14:53:41 logging to https://www.w3.org/2021/05/24-me-irc
14:53:45 Zakim has joined #me
14:53:56 Meeting: Media Timed Events / DataCue
14:54:10 Agenda: https://www.w3.org/events/meetings/2793e495-c4d0-4e3e-8e29-e302ff479154
14:54:19 Chair: Chris
14:55:04 scribenick: cpn
14:58:59 nigel has joined #me
15:03:43 Present+ Nigel_Megitt
15:04:01 Present+ Chris_Needham, Rob_Smith
15:07:48 RobSmith has joined #me
15:08:37 Topic: TextTrackCue end time implementation
15:09:09 Rob: I have detailed instructions on how to contribute to Chromium. How much support is already implemented?
15:09:33 ... When I submitted the unbounded cues changes, I ran the WPT tests, and have figures for Firefox, Chrome and Safari
15:10:10 ... There's a significant difference in the number of tests that pass and fail between them. There's also a different number of tests for each of the browsers
15:11:07 Chris: All the browsers should have TextTrackCue and VTTCue
15:11:46 Rob: Do the tests do feature detection to enable or disable certain tests?
15:12:52 ... I've contributed tests for unbounded cues and the VTTCue constructor
15:13:04 rrsagent, make log public
15:13:06 ... Three tests: two modified, one added
15:13:09 rrsagent, draft minutes
15:13:09 I have made the request to generate https://www.w3.org/2021/05/24-me-minutes.html kaz
15:13:58 ... WPT was easy to set up locally
15:14:28 present+ Kaz_Ashimura, Iraj_Sodagar
15:14:41 Chris: I think building needs a lot of disk space and time
15:14:51 Rob: The steps seem to be well described
15:16:13 Topic: emsg equivalency rules
15:16:16 Unbounded cue Web Platform Test change details: https://github.com/web-platform-tests/wpt/pull/28394#issuecomment-814920479
15:16:19 Chris: https://github.com/WICG/datacue/issues/28
15:17:47 Nigel: From the last meeting, we discussed some specific questions to understand exactly how things are processed
15:17:57 ... Chris's summary didn't match my understanding though
15:18:46 Iraj: We use different terminology, so there may be a mismatch causing some confusion
15:19:19 ... I may lack some understanding of TextTrackCue, but let's confirm so we can understand better
15:19:42 ... [Reviews Nigel's question in #28]
15:20:11 ... When you say a cue, is that one instance of an event?
15:20:38 Nigel: That's right. There's an algorithm that runs all the time while media is playing, which processes the list of text track cues
15:21:05 ... Whenever the playhead moves past the begin time of a cue, an onenter handler is run, and whenever it moves past the cue end time, there's a similar onexit handler
15:21:28 ... Any state associated with a cue that applies during media playback has the opportunity to change
15:21:46 Iraj: When the playhead reaches the TextTrackCue start time, what happens?
15:22:21 Nigel: Within a short period of time after the begin time, an event handler is put on the JavaScript event queue, so there's a short delay before that gets executed
15:22:45 ... The same thing happens at the end of the cue, in a separate onexit handler
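A minimal sketch of the enter/exit model Nigel describes, assuming a page with a video element; the track kind, times and payload are illustrative, not part of the minuted discussion:

```javascript
// Sketch of the cue processing model described above: the UA queues the
// handlers on the event loop shortly after the playhead crosses the cue's
// start and end times, hence the short delay Nigel notes.
const video = document.querySelector('video');
const track = video.addTextTrack('metadata'); // mode defaults to 'hidden'

// A bounded cue, active from t=10s to t=20s on the media timeline.
const cue = new VTTCue(10, 20, '{"chapter": "intro"}');

cue.onenter = () => console.log('cue became active:', cue.text);
cue.onexit = () => console.log('cue became inactive');

track.addCue(cue);
```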
15:23:29 Iraj: Then a TextTrack contains multiple cues. There could be one or more active cues at any point in that timeline?
15:23:36 Nigel: That's right
15:24:51 Iraj: Should the new event at Tn+1 end the previous cue? Does it create a new cue with a start time?
15:25:26 ... So in your question, the cue at Tn+1 defines the end time of the previous cue?
15:25:42 ... This is how the two models differ from each other
15:26:02 ... The three values we have, when they're equivalent, what does it mean for the application?
15:26:29 ... Any cues with those three values are equivalent. They're not necessarily continuations. That's up to the application
15:26:42 ... If one is processed, you don't need to process the next one
15:26:55 ... Also the order is not important. If you process one, you don't need to process the next one
15:27:19 ... If you miss one (e.g., doing random access), it's as if you processed it
15:27:38 ... Do we need signaling of updating a cue - e.g., changing payload or end time?
15:28:20 ... There is a mismatch. There's no explicit signaling for this kind of continuation update.
15:28:40 ... The only thing the processing rules say is whether the cues are equivalent
15:29:16 ... If at time Tn a cue is received, it's put in the TextTrack. If at Tn+1 a cue with the same 3 values is received, even with a different payload, it replaces the previous one
15:29:59 ... If the UA doesn't receive the emsg at Tn, and instead receives it at Tn+1, it creates the message as defined at Tn+1, because the UA doesn't have the history of the Tn event
15:30:16 ... So your example doesn't work in the processing model, because the payload is different
15:30:59 ... We could have two different versions of the emsg, e.g., using v1, which can signal events prior to the arrival time. The event instance will be the same as at Tn
15:31:07 ... If it has an end time - there's no end time in either
15:32:00 ... The second one could have a new end time. It could put a new end time at Tn+1, but what's important is that the application should consider that the event message at Tn+1 may not be processed, because Tn is processed
15:32:37 ... If it needs updating, it should use a new id. In the payload it says it's an update of the previous message - so the application processes it as an update, not the UA
15:32:52 q+
15:33:04 Nigel: That helps, thank you. My confusion came from considering what happens if you rewind: should it recreate the state?
15:33:34 ... The alternative algorithm: when you see a repeated cue with the same 3 values, you just discard it
15:33:45 Iraj: Yes, that's covered in the document
15:34:06 ... It's a simple check. If it's not there, add the id to the table
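A minimal sketch of the discard check Iraj describes. The three values aren't named in this part of the discussion; the sketch assumes the DASH emsg fields scheme_id_uri, value and id, and dispatchToApplication is a hypothetical placeholder for surfacing the event (e.g. as a metadata cue):

```javascript
// Equivalency check: dispatch a received emsg only if no equivalent message
// (same three identifying values) has been seen already.
const dispatchTable = new Set();

function onEmsgReceived(emsg) {
  const key = `${emsg.schemeIdUri}|${emsg.value}|${emsg.id}`;
  if (dispatchTable.has(key)) {
    return; // equivalent event already processed; order and payload don't matter
  }
  dispatchTable.add(key); // "if it's not there, add the id to the table"
  dispatchToApplication(emsg); // hypothetical: e.g. create a metadata cue
}
```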
15:34:59 Rob: Thanks for the clear description. The scenario Iraj described sounds like unbounded cues and unbounded cue updates
15:35:39 ... Some confusion - are you actually updating an existing cue? At the app level, you may think you are, but at the UA level you're just creating a new cue. That cue can be linked to an existing cue instance
15:36:23 ... This is exactly what unbounded cues do
15:36:36 Iraj: It's up to the application to do that interpretation, not the UA
15:36:43 Rob: That's my understanding too
15:37:19 Nigel: I thought the UA did process the cues in this model - some confusion there
15:37:50 Rob: I'm breaking down the unbounded cue use case, using only bounded cues, and describing why it's equivalent
15:38:27 ... It's the conceptual model that means you don't know (until later). I'll finish writing that, but it sounds like we're talking about the same thing here
15:39:02 ... I'm not proposing updating bounded cues in any way. The only thing I propose adding to a cue is to change a previously-unspecified end time to be specified
15:39:13 ... Infinity just means we don't know the end time
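A minimal sketch of that idea, assuming a UA that implements the unbounded cue proposal (browsers without it may reject a non-finite end time); the times and payload are illustrative:

```javascript
// An unbounded cue: start time known, end time not yet known, so Infinity
// stands in for "unspecified end".
const track = document.querySelector('video').addTextTrack('metadata');
const liveCue = new VTTCue(30, Infinity, '{"event": "started"}');
track.addCue(liveCue);

// Later, once the end is known, the previously-unspecified end time is made
// specific; no other part of the cue is updated.
function endLiveEvent(endTimeSeconds) {
  liveCue.endTime = endTimeSeconds;
}
```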
15:39:27 Nigel: Two separate conversations: unbounded cues and emsgs
15:39:40 Rob: There may well be a common solution for both
15:40:55 Chris: So we can define a processing rule to say that if a cue with the same 3 values exists on the TextTrack, we can drop it
15:41:11 q+ to ask if we actually must drop duplicates
15:41:30 Iraj: We should develop the processing model for event messages, then see how well it maps to the TextTrackCue model
15:41:40 ack RobSmith
15:42:28 Rob: WebVMT has stateful transitions. Having developed that, I can see it maps to TextTracks. As a separate issue: if you want to update cues, you know the start time and payload, but don't know the end time
15:42:50 ... If at a later time you have a payload that coincidentally links to the previous one, you don't have to link them
15:43:06 ... Happy to help define the emsg processing model, it's a common problem for metadata
15:43:40 ... For example, a sensor sample (speed, air quality, etc), which you subsequently update with a replacement value, the next in the sequence for that value
15:44:03 ... We discussed at TPAC last year how to do interpolation between two consecutive samples
15:44:29 ... This may be more relevant to emsg - or consider it as a step change: the previous value holds until the next message is received
15:45:26 Chris: The use of interpolation is application specific
15:46:05 Nigel: Coming back to the idea that if you see an emsg event with the same 3 values, and you already have a cue for it, you are allowed to drop it
15:46:51 ... Do we need a stronger requirement, so you *must* drop it? Performance issue if there are lots of messages, each with an event handler, and they all get called
15:46:55 ack n
15:46:55 nigel, you wanted to ask if we actually must drop duplicates
15:47:15 Iraj: Chris asked about the duration of the equivalency rule. There are two levels of answer to this.
15:47:40 ... MPEG-DASH considers it an optimisation: the client may drop instances based on the 3 values, it doesn't have to
15:48:00 ... For an MSE spec, we can make it required behaviour, so the app developer knows all browsers will behave the same
15:49:07 ... There's a simple way of making it a 'must' requirement. Define dropping messages with the same 3 values. The duration of the equivalency buffer for event dispatch is the length of the MSE media buffer
15:49:40 ... The lifetime of ids could be required to be the same. If the app developer knows the id happened 5 minutes before, and the buffer length is 5 minutes, it knows the id won't be in the dispatch table
15:49:57 ... It could make the UA processing consistent.
15:50:17 Nigel: Those two things seem to be separable to me
15:50:54 Iraj: The reason they're tied together is that you can't have an infinite length table. Consider a 24/7 live stream: the browser joins the stream at some point in time.
15:51:24 ... It can't know what happened previously. It could pause, time passes, then join again, similar to skipping, and receive the same events again
15:52:05 ... All those behaviours depend on the length of the table it keeps. Is there a minimum we can require the UA to keep? It could be the minimum of the MSE media buffer
15:52:34 ... If the web app seeks to a time prior to the media buffer, it needs to request the segment again and change the append window size
15:52:51 ... It doesn't maintain any information on whether it's been done in the past, so it treats this as a new segment
15:54:48 Chris: MSE buffers and TextTracks are currently separate
15:55:02 Iraj: The reason for describing it this way is to add it to MSE v2
15:55:29 ... One way is to define the input to the TextTrack from MSE, so the UA processes the cues and puts them into the TextTrack
15:55:38 ... We assume the content is coming in segments in realtime
15:56:09 ... With TextTrack, you could have the entire document. So you don't need the MSE model here
15:56:34 ... Two cases: 1) MSE streaming, short window, segment based processing. 2) You have the entire presentation
15:56:49 q+
15:56:50 q+
15:57:52 Chris: File based playback with VTT files
16:00:06 Chris: App level vs browser-level responsibility?
16:00:47 Iraj: Is there a model right now for MSE playback mapped to TextTrack cues?
16:01:04 Nigel: Delivering TextTrack cues via MSE has been suggested, but not done yet
16:01:30 ... Two points here. If you have a long-running stream, you don't want unlimited memory use from cues being added
16:01:45 ... With DASH live streaming, there's a rewind window, so you don't want cues outside that period
16:02:16 ... Also user experience and acquisition time. If the cue events are chapter markers, it would be strange if you got different chapter markers depending on how long you'd been watching
16:02:28 ... You'd expect to get all the chapter markers, then be able to seek
16:03:03 ... If you have to frequently re-issue those, sending them over and over again, it works nicely with the idea of dropping stuff outside the buffer, as you don't need it
16:04:04 Iraj: An example of that kind of application: thumbnail navigation. In DASH we have a specific representation. You download one piece of media with an array of images, parsed by the application to build a timeline with thumbnails
16:04:25 ... It provides the ability to navigate without downloading the entire media, just the thumbnails
16:04:59 ... I believe it's similar to a side-car file: you download the whole file, but don't go through MSE to do that, because the content is too long and you just have a short window in time.
16:05:14 q-
16:06:00 Iraj: Download the whole thing at once, parse the timeline. It's used for seeking to points in the media timeline, navigation.
16:06:15 q-
16:06:44 Topic: Next steps
16:07:00 Iraj: I'll be available June 21st or 28th
16:08:08 Chris: Can we align the TextTrack based processing model?
16:09:05 Iraj: Create a draft for MSE, and see how it maps to the DataCue
16:13:57 I'd prefer 28th June if possible
16:16:55 Chris: I'll try to write something to consolidate what we've discussed
16:17:32 ... Thank you all, this discussion has helped to clarify things
16:17:40 [adjourned]
16:17:46 rrsagent, draft minutes
16:17:46 I have made the request to generate https://www.w3.org/2021/05/24-me-minutes.html cpn
19:09:03 nigel_ has joined #me
20:13:54 Zakim has left #me
21:08:03 nigel has joined #me
21:24:07 kaz has joined #me