14:52:19 RRSAgent has joined #me
14:52:19 logging to https://www.w3.org/2021/06/28-me-irc
14:52:23 Zakim has joined #me
14:57:45 present+ Chris_Needham, Kaz_Ashimura
14:58:21 Agenda: https://www.w3.org/events/meetings/2b88a9a9-b1bc-463e-973f-018e98cb1558/20210621T160000
15:00:08 present+ Kaz_Ashimura, Chris_Needham
15:00:26 nigel has joined #me
15:01:20 present+ Rob_Smith
15:04:42 present+ Yasser_Syed
15:08:00 scribenick: cpn
15:08:06 Topic: Unbounded Cues
15:08:26 Rob: Discussion is now on how to support unbounded cues represented through bounded cues
15:08:46 ... A cue with a start time and no end time would be ignored by engines that don't support unbounded cues. That's correct
15:09:18 ... If you update the cue to have a bounded end time, you're effectively repeating it as a bounded cue. The engine can understand that, but it may be late
15:10:13 ... As an example, a cue from 1 minute to forever, then update it at 2 minutes to end at 3 minutes. That's graceful degradation, the best we can do
15:10:49 ... Can we fill the gap between 1 minute and 2 minutes, so that the system displays the cue earlier? One suggestion is for the unbounded cue to be displayed forever, but then how do you end the cue?
15:11:31 ... Whatever you do there is wrong. Looking at it differently, we know how it degrades when displayed late, but what you want is a bounded cue that runs from 1 minute to 2 minutes
15:12:02 Present+ Nigel_Megitt
15:12:13 ... I think it's an insoluble problem. Either update the engine to understand unbounded cues, or construct your WebVTT file using only bounded cues, chopping it into sections
15:12:47 ... The bounded cue engine would understand that, but it requires the VTT file to be authored in that way. Requires knowledge of the receiver capability
15:13:23 ... Don't see how to have the unbounded cue display instantly; the only thing to do at that point is to display it forever, which may not be right
15:13:43 q+
15:13:56 present+ Iraj_Sodagar
15:14:16 Chris: Other than diverging WebVMT from WebVTT?
15:14:52 Rob: That puts a big overhead on it. If you can modify existing bounded cues, you need to be able to identify the cue to go back and modify it later
15:16:02 Nigel: In last week's TTWG call we discussed the model. There may be something relevant there: create an external construct, e.g., a chapter that begins and you don't know when it ends
15:16:53 ... Add some construct into an entry with start and end time. Could this be adapted here? It's just the start of an idea. I recommend looking at the minutes from that meeting, and considering the data model
15:17:23 ... Want to be clear about the semantic model for the cue times; there may be other entities you want to model that have a different lifecycle. This could be a way of splitting those out
15:18:09 RobSmith has joined #me
15:18:23 -> https://www.w3.org/2021/06/24-tt-minutes.html#t02 Minutes from last week's TTWG meeting
15:19:45 Chris: Let's organise a VTT-specific follow-up meeting, as we have that and DASH emsg events to look at
15:19:54 Topic: DASH emsg and MSE
15:23:25 ack ni
15:23:27 q+ to check if the cardinalities on the first diagram are really 1..* or could be 0..*
15:31:54 Chris: [talks through MSE and video media timeline]
15:32:15 Iraj: Why not deliver all components via MSE? Easier to manage the timeline buffer consistency between them
15:33:06 Nigel: Would make sense and be simpler overall. If you're watching a live stream, the memory for storing the text tracks could extend forever. If you did it with MSE, it would be clearer what the timeline is, and you could remove things no longer needed
15:33:29 ... Seeking back through text track cues would work in the same way. Seems strange now that you can't do this
15:34:53 Chris: For those browsers that support inband captions in MSE, I don't know what happens when you remove an MSE buffer range
15:35:21 Nigel: Nothing. There's a cue API you can use to remove captions from the text track
15:36:47 Iraj: Synchronization and data management. The text track exists forever, and MSE has a limited buffer size. What about the timeline alignment between the text track and the video?
15:37:02 Nigel: Synchronization is less of an issue than how you provide the data
15:45:13 scribe: nigel
15:45:35 Chris: [talks through algorithm to define cue handling for emsg DataCues]
15:45:59 .. Have I described the equivalence correctly for step v?
15:46:22 Iraj: Is there a separate mechanism for populating the text track cues?
15:46:26 Chris: Two ways:
15:46:35 .. It could be MSE extracting media events from the media.
15:46:45 .. Or it could be the website populating the same track with MPD events.
15:46:57 Iraj: Might be a problem, having those two possible mechanisms.
15:47:13 .. The reason is that the equivalency rules talk about event message instances received through the same mechanism.
15:47:26 .. So if there are inband events, then the equivalency rules apply to those event instances.
15:47:33 .. They don't go across different delivery mechanisms.
15:47:43 .. When you use text track cues to maintain the list of already despatched events,
15:47:56 .. since that text track may get populated by another mechanism,
15:48:03 .. there are inconsistencies possible.
15:48:16 Chris: When I said "yes" I actually meant you could choose to do it, but you're not required to.
15:48:45 .. In other words, an application could populate separate text tracks for inband vs MPD events.
15:49:00 .. Then if we specify the equivalency rules as operating within a track, that could work.
15:49:15 Iraj: Yes, that would mean there is no confusion in the equivalency rules.
15:49:30 .. One more possible issue, regarding the lifetime of text track cues, and there being no purging.
15:49:43 .. When I wrote that document, there was an internal buffer for maintaining already despatched events.
15:49:57 .. The question was raised: what is the lifetime of that table, how far back should it go?
15:50:02 .. I said it was left to implementations.
15:50:12 .. But then I thought the simpler model is the same as the length of the MSE buffer.
15:50:41 .. In this case it seems to me that you are saying the lifetime of an event is forever.
15:50:50 Chris: Yes ...
15:50:57 Iraj: I need to check if that will cause problems.
15:51:09 Chris: At the moment, removing from the source buffer is the application's responsibility.
15:51:18 .. If the application says it wants to remove a time range that is in the past,
15:51:34 .. the application could also inspect the text tracks to remove any events that lie within the same time window.
15:51:48 .. We could leave it all to the application, and then the fact that the text track lasts forever
15:52:00 .. maybe in practice does not matter, because the application is going to remove cues
15:52:07 .. so that the timeline matches the MSE buffers.
15:52:25 Iraj: Yes. What's important is the consistency between the behaviour of different applications.
15:52:48 .. So that when a content author provides data, they know that every UA's behaviour will be the same,
15:52:53 .. even in random access seeking.
15:53:05 .. If a browser keeps all events for all time, then all browsers should do it.
15:53:22 .. Or if the retained lifetime is the MSE buffer's, then all browsers should do that,
15:53:29 .. so that it is predictable in any given scenario.
15:53:47 Chris: Yes. Do we think that having a model where removing a range of audio and video from
15:54:14 .. MSE also removes the corresponding Text Track Cues in the same time range would be a good route?
15:54:20 Iraj: It depends on the implementation requirement.
15:54:46 .. If the MSE implementation is required also to maintain the buffer of Text Track Cues as internal book-keeping,
15:55:02 .. that becomes very simple, because the equivalency between the buffers is simple. The MSE keeps one single range.
15:55:11 .. If we build this model where, in terms of supporting the events,
15:55:27 .. the MSE needs to go through the Text Track Cues and maintain the despatched events in that Text Track,
15:55:45 .. then we need to define consistent buffer management rules between MSE and Text Tracks (outside MSE).
15:55:51 Chris: Yes.
15:56:08 Iraj: I was a bit uncomfortable because of this hybrid model, where there has to be some correlation with MSE somehow.
15:56:18 .. Does that mean that every deployment of browsers has to maintain that model?
15:56:28 .. Or is it possible to instantiate MSE and not Text Track Cues?
15:56:54 .. I believe we need to have a single model: either always handle events in Text Track Cues, or purely within MSE and not use Text Tracks.
15:57:26 Chris: The way I'm thinking about this is to ask what should implement the requirements: fully in the browser, or partly
15:57:31 .. in the browser and partly in the player?
15:57:54 .. The browser could be specified to apply the processing rules to the indefinite buffer, and then that would still be consistent,
15:58:08 .. but not be complete in terms of what you're looking for, because the player would be required
15:58:19 .. to remove the cues when you remove the media segments from the MSE.
15:58:33 .. Another option is more along the lines of closer integration of the text tracks with MSE.
15:58:42 q+
15:59:24 Nigel: Could there be a programmatic way to link MSE to a text track and then have some required behaviour?
15:59:37 i/Nigel: Could/scribe: cpn
15:59:59 Iraj: I think that makes more sense than leaving it to the application. Then you can assume uniform behaviour between applications
16:00:21 ack n
16:00:21 nigel, you wanted to check if the cardinalities on the first diagram are really 1..* or could be 0..* and to
16:01:22 Chris: Could check with implementers on plans for VideoTrack, AudioTrack, TextTrack in MSE
16:04:20 Iraj: With live content and segmented delivery of captions, there's no side-car for TextTrackCues unless you add significant delay. How does live streaming work with subtitles?
16:04:37 Chris: The player requests the captions and uses the TextTrack API to add the cues
16:05:48 Iraj: If MSE v2 supports event parsing, would also supporting captions make sense?
16:06:01 Chris: I would think so, yes
16:06:27 Topic: Next meeting
16:07:54 Chris: Meet in 3 weeks, July 19th?
16:09:01 Iraj: Need to discuss with MSE people
16:09:48 Chris: I'll email the MSE editors
16:11:04 ... Also arrange a follow-up on WebVTT. I'll follow up with Gary and Rob to schedule a time
16:11:42 rrsagent, make log public
16:11:47 rrsagent, draft minutes
16:11:47 I have made the request to generate https://www.w3.org/2021/06/28-me-minutes.html kaz
18:07:39 Zakim has left #me
20:46:25 kaz has joined #me
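[Scribe's note: the graceful-degradation scheme Rob describes under Unbounded Cues — an unbounded cue that a legacy engine ignores, later re-issued as a bounded cue once its end time is known — can be sketched as below. This is a minimal illustration using plain objects rather than real VTTCue instances; the Cue shape, visibleCues helper, and supportsUnbounded flag are illustrative, not part of WebVTT or WebVMT.]

```typescript
// Sketch of graceful degradation for unbounded cues (hypothetical model).
interface Cue {
  id: string;
  startTime: number; // seconds
  endTime: number;   // Infinity marks an unbounded cue
  text: string;
}

// An engine without unbounded-cue support simply drops unbounded cues,
// so the cue only appears once a bounded update arrives (i.e. late).
function visibleCues(cues: Cue[], supportsUnbounded: boolean): Cue[] {
  return cues.filter(c => supportsUnbounded || Number.isFinite(c.endTime));
}

// Author emits the cue at 1 minute with no known end time...
const unbounded: Cue = { id: "c1", startTime: 60, endTime: Infinity, text: "Chapter 2" };
// ...then at 2 minutes learns it ends at 3 minutes and re-issues it bounded.
const bounded: Cue = { ...unbounded, endTime: 180 };

console.log(visibleCues([unbounded], false).length); // 0: legacy engine ignores it
console.log(visibleCues([bounded], false).length);   // 1: displayed, but a minute late
```

[This matches the discussion: the legacy engine never shows the cue during the 1–2 minute gap, which is the insoluble part Rob identifies.]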
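[Scribe's note: the buffer-consistency option Chris raises under DASH emsg and MSE — when the player removes a media time range from the SourceBuffer, it also removes the text track cues in that range — can be sketched as below. The Cue shape and removeCuesInRange helper are hypothetical; no such linkage is specified in MSE today, which is the gap being discussed.]

```typescript
// Sketch of application-side purging that mirrors an MSE buffer removal
// (hypothetical helper, not a specified API).
interface EventCue { startTime: number; endTime: number; payload: string; }

function removeCuesInRange(cues: EventCue[], start: number, end: number): EventCue[] {
  // Keep any cue that is not wholly contained in the removed window.
  return cues.filter(c => !(c.startTime >= start && c.endTime <= end));
}

const track: EventCue[] = [
  { startTime: 0,  endTime: 5,  payload: "emsg A" },
  { startTime: 10, endTime: 12, payload: "emsg B" },
  { startTime: 30, endTime: 31, payload: "emsg C" },
];

// Player evicts [0, 20) from MSE (e.g. via sourceBuffer.remove(0, 20))
// and mirrors that on the text track so the two timelines stay consistent.
const remaining = removeCuesInRange(track, 0, 20);
console.log(remaining.map(c => c.payload)); // [ 'emsg C' ]
```

[Whether this bookkeeping lives in the player, as here, or inside the MSE implementation is exactly the open question between Chris and Iraj.]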