14:55:45 RRSAgent has joined #me
14:55:45 logging to https://www.w3.org/2021/04/19-me-irc
14:55:51 Zakim has joined #me
15:01:22 kazho has joined #me
15:02:10 Andy has joined #me
15:02:19 Meeting: Media Timed Events / DataCue
15:02:24 Chair: Chris
15:02:28 scribenick: cpn
15:02:31 RobSmith has joined #me
15:03:00 present: Andy_Rosen, Rob_Smith, Chris_Lorenzo, Kaz_Ashimura, Kazuhiro_Hoya
15:03:21 present+ Iraj_Sodagar
15:03:30 present+ Nigel_Megitt
15:03:34 dsinger has left #me
15:04:00 present+ Chris_Needham
15:04:07 Agenda: https://www.w3.org/events/meetings/2b88a9a9-b1bc-463e-973f-018e98cb1558/20210419T160000
15:04:08 rrsagent, make log public
15:04:14 rrsagent, draft minutes
15:04:14 I have made the request to generate https://www.w3.org/2021/04/19-me-minutes.html kaz
15:05:14 Topic: TextTrackCue end time
15:05:55 Rob: It was proposed at TPAC Lyon, steadily progressing. Things have accelerated thanks to Gary, who pointed out that the WPT needed looking at
15:06:14 ... I've written tests and proposed a change to WebVTT as well, as it inherits from the HTML TextTrackCue
15:06:27 ... There are three pull requests ready to go
15:06:39 ... Any final reviews, and then we're done
15:07:24 Chris: Any indications of implementer support?
15:07:49 Rob: WebKit is interested. Eric has a use case where they want to use this, mentioned in the WebVTT pull request
15:08:20 ... Discussion on whether there should be WebVTT syntax changes. The proposed change was minimal
15:08:41 ... Need to validate, as NaN or -Infinity is not supported
15:09:05 ... The VTTCue constructor is needed for WPT, because TextTrackCue doesn't have a constructor
15:09:36 ... There's no support in WebVTT for unbounded cues; in WebVMT you can omit the end time
15:10:23 ... Simple example, in discussion with WebVTT: a 0-0 game score. We don't know when it will change, and we don't know when the end of the game is
15:11:03 ... so 0-0 could be an unbounded cue. Should it be handled in WebVTT syntax? What's the best way to do that?
15:11:17 ... Driven by use cases, propose syntax changes
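[A minimal sketch of the unbounded cue idea being discussed, assuming the proposed change that allows a cue's endTime to be positive Infinity - not valid under the current spec. The video element, track kind, and score text are illustrative:]

```
// Sketch only: assumes the proposed change allowing endTime = Infinity,
// which current implementations reject. Names here are illustrative.
const video = document.querySelector('video') as HTMLVideoElement;
const track = video.addTextTrack('metadata');

// The score is 0-0; we don't know when it will change or when the game
// ends, so the cue is created unbounded.
track.addCue(new VTTCue(0, Infinity, '0-0'));

// When the score finally changes, bound the open cue and start a new one.
function updateScore(now: number, text: string): void {
  const cues = track.cues ? Array.from(track.cues) : [];
  const open = cues.find((c) => !Number.isFinite(c.endTime));
  if (open) open.endTime = now; // bound the previously unbounded cue
  track.addCue(new VTTCue(now, Infinity, text));
}
```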
15:11:50 q+
15:12:09 Chris: Where to discuss, here or in TTWG?
15:13:01 Nigel: Not clear we have a validated set of use cases yet. Either as this WICG group or MEIG, I think any validated use cases we can use to test proposed solutions would be helpful
15:13:45 ... Rob has put in detailed comments. Before changing the spec, which has some complexity, e.g., identifying cues to update, across multiple documents
15:14:02 q+
15:14:25 ... Finding a simple solution that meets those requirements, so requirements as input to the process; real world use cases would be helpful
15:15:31 ... So MEIG could make it easier for people to provide input
15:15:39 Chris: Agree, we can do that under MEIG
15:15:41 ack n
15:16:03 Unbounded cue use cases and WebVTT syntax issue: https://github.com/w3c/webvtt/issues/496
15:16:17 Kaz: I'd like to remind you that MEIG used to produce use case documents, could use the use case template updated by the WoT IG
15:16:48 -> https://github.com/w3c/wot-usecases/blob/main/USE-CASES/use-case-template.md use case template (MD)
15:16:54 Chris: Stakeholders?
15:17:17 -> https://github.com/w3c/wot-usecases/blob/main/USE-CASES/use-case-template.html use case template (HTML)
15:17:43 Rob: Eric at Apple has an existing use case, I have WebVMT. Gary mentioned FOMS, cue update matching that Nigel mentioned - by start time and content
15:18:09 ... Need to relax the time ordering; can only update the end time, to bound an unbounded cue. May be out of order if matching by start time
15:18:45 ... Issue #496 discusses this. For WebVMT, I've thought about the syntax, which could work well in WebVTT
15:19:08 ... Keen to support in WebVMT as a recording format. Two proposals for cue matching: cue content, and cue identifiers
15:19:11 q+
15:19:15 ack k
15:19:39 ... If you use start+content to match cues, you can do so across different WebVTT files. But it's repetitive, as you need to repeat the content and add the end time
15:20:03 ... Using cue identifiers ties in with the WebKit use case, where you have a cue to be updated periodically, but you don't know when
15:20:10 q?
15:20:32 ... There's a sequence of cues, content modified at each step, involves no repetition. WebVMT does this for the interpolation scheme
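[To make the two cue matching proposals concrete, an illustrative sketch; the CueUpdate record and applyUpdate function are hypothetical, not from issue #496 or any spec:]

```
// Illustrative only: two ways a later update could locate an earlier
// unbounded cue and bound it. Record shape and names are hypothetical.
interface CueUpdate {
  id?: string;        // proposal 1: match by cue identifier
  startTime?: number; // proposal 2: match by start time + content
  text?: string;
  endTime: number;    // the newly known end time
}

function applyUpdate(track: TextTrack, u: CueUpdate): boolean {
  const cues = track.cues ? Array.from(track.cues) : [];
  const target = u.id !== undefined
    // Identifier matching: no content repetition, but IDs must be
    // resolvable across documents/fragments.
    ? cues.find((c) => c.id === u.id)
    // Start+content matching: works across separate WebVTT files, but
    // the update has to repeat the cue content.
    : cues.find((c) => c.startTime === u.startTime &&
                       (c as VTTCue).text === u.text);
  if (!target || Number.isFinite(target.endTime)) return false;
  target.endTime = u.endTime; // bound the unbounded cue
  return true;
}
```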
15:21:23 Nigel: It's really important to understand what problem we're solving before designing solutions
15:21:45 ... The solution design belongs in TTWG; the requirements input we need to get right, as input to that
15:22:02 ack n
15:22:17 Rob: I disagree, I have the use cases for WebVMT
15:23:09 Nigel: There are edge cases not fully explored, we need to understand other use cases, e.g., specifically for WebVTT
15:23:17 ... Don't want to paint ourselves into a corner
15:23:34 Rob: There are 4 use cases in the issue, please give feedback
15:24:08 Nigel: Needs a validation exercise. People may not be aware that they need to give input
15:24:54 Rob: Let's publicise it, there's some interest already
15:26:50 q+
15:27:10 Chris: What's the scope? WebVTT-based formats, WebVMT, in-band emsg?
15:28:15 Rob: Cue format is shared between WebVTT and WebVMT, some additional bits in WebVTT; VTTRegion isn't needed for metadata. The way the cue works is identical, so the solution could be shared between WebVTT and WebVMT
15:28:38 q?
15:28:42 ack n
15:29:04 Nigel: Specifically, we need to hear from people using WebVTT in a fragmented MP4 context, who may want to use unbounded cues
15:29:31 ... That explores the edge case of IDs not needing to be unique across multiple documents, and not depending on cue start time and cue text being unique
15:29:43 HEY Iraj - if the end time is very large, then would a packager such as Rufael's need to repeat the payload inside LL fragments?
15:29:49 ... Would be useful to try to engage those
15:30:49 Andy: Hope Iraj can clarify. I remember a call that myself, Iraj, and Rufael had - when subtitles are long and going into a packager making segments, you end up repeating the IMSC payload in each chunk
15:31:15 ... If we start putting large end times in WebVTT payloads, are there issues with low latency?
15:32:09 Iraj: I suppose that if the target is low latency delivery, durations for either IMSC or WebVTT don't add latency for decoding, so you get the target latency. It would be part of the encoding characteristics
15:32:24 Andy: So we don't end up with the 0-0 score being repeated in every chunk
15:32:39 Iraj: The main concern for low latency is other components than subtitles
15:33:20 Nigel: You ask a good question, Andy. What happens if you miss the cue? If you also didn't see the beginning, how do you know it's supposed to be there?
15:33:35 Iraj: I thought that part of the WebVTT design relies on repeating the cues
15:33:51 Nigel: That's true for the fragmented TTML and IMSC design. There's a clear model for how it works
15:34:05 ... The WebVTT model is different, not sure of the details
15:34:19 Iraj: Is that WebVTT in fMP4 or as a side file?
15:34:46 Nigel: I think both should be considered. The one where that needs a good answer is fragmented MP4
15:34:54 ... Non-MP4 delivery needs to work also
15:35:16 Iraj: Isn't it possible to repeat WebVTT cues in fragmented MP4?
15:35:27 ... I thought that was one of the key features of the design
15:36:10 ... I was involved recently in the event message track discussion, which was based on WebVTT. You repeat messages in case the client missed earlier messages
15:36:45 Rob: Talking about fragments, are they fragments of a single WebVTT file, or are they a sequence of consecutive WebVTT files?
15:37:28 Iraj: In terms of fragmentation, assume you have a WebVTT file. When you package it in ISO BMFF fragments, you break down the file into different cues. You put each WebVTT cue in an envelope, as samples
15:37:46 ... Each sample can have multiple WebVTT cues, and samples can be empty
15:38:14 ... Each WebVTT cue is the same as in the WebVTT file, which defines start and end and text and positional information
15:38:45 Rob: Are the VTT cues treated as being in a single VTT file, or as separate VTT files because they're fragmented?
15:39:21 Iraj: I assume when you get streams of VTT they come from the same file. I'd need to check. Not sure if you can multiplex cues from different files in the same fragment
15:39:45 ... I thought the design is to break a file into fragments to package into ISOBMFF; that's the scope of the spec
15:39:59 Rob: Do you have an example you could add to the issue?
15:40:19 Iraj: I'd suggest asking David Singer, as one of the editors of that spec
15:42:03 Chris: The DASH-IF events group is looking at repeating emsg boxes
15:42:17 Iraj: That's right, not just across fragments but also across periods
15:44:42 Chris: So we could draft a use case doc based on the info in issue #496. Any volunteers?
15:45:16 Rob: Happy to help.
15:46:17 Nigel: Gary also could give input
15:47:15 ... Suggest contacting Gary and the Media WG chairs, ask for people to be involved
15:47:19 Gary's comment on use cases: https://github.com/w3c/webvtt/pull/493#issuecomment-808411871
15:47:35 Chris: Also WAVE?
15:47:56 Nigel: Their interest may be more in IMSC, they may not have an issue
15:48:02 Eric's WebKit use case comment: https://github.com/w3c/webvtt/pull/493#issuecomment-808429391
15:49:56 Chris: Support from Chromium to implement the TextTrackCue change?
15:50:25 Rob: We talked about doing it myself; also Firefox, as they're open source
15:51:17 ... We could ask someone at Google about their interest in implementing
15:52:44 ... I opened issues in the various browser bug trackers.
15:53:13 Chris: So the next step is to reply to those issues saying we have spec PRs and WPTs ready
15:53:18 Rob: I can do that
15:54:24 Topic: DASH emsg in MSE
15:54:26 Chris: https://github.com/WICG/datacue/issues/26
15:57:22 Iraj: Depends on whether the emsg is v0 or v1. In v0, the event start time is signalled by an offset from the earliest presentation time of the segment carrying the event
15:58:05 ... With v0, you don't need any additional information. The timestampOffset is subtracted from the earliest presentation time of the segment, to give the location in the buffer
15:58:35 ... The offset to the start time of the event, where it sits in the buffer: you just add the start time offset of the event. It's not the same buffer
15:59:53 ... For v1, the reference time is the presentation start time, which MSE doesn't know. Similar to timestampOffset, which is provided by the application, the UA can also provide a new offset, an event timestampOffset; that value can be subtracted to give the location of the start of the message
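[A rough sketch of the offset arithmetic described above; the field names, helper functions, and the event timestampOffset parameter are illustrative assumptions - the actual processing rules are still to be written:]

```
// Illustrative sketch of the emsg timing discussed above; field and
// parameter names are assumptions, not spec text.
interface EmsgV0 { version: 0; timescale: number; presentationTimeDelta: number; }
interface EmsgV1 { version: 1; timescale: number; presentationTime: number; }

// v0: the event start is a delta from the earliest presentation time of
// the segment carrying it, which is already on the media timeline (it
// reflects SourceBuffer.timestampOffset), so no extra input is needed.
function emsgStartV0(e: EmsgV0, segmentEarliestTime: number): number {
  return segmentEarliestTime + e.presentationTimeDelta / e.timescale;
}

// v1: the event time is relative to the presentation start, which MSE
// doesn't know; an application- or UA-supplied offset analogous to
// timestampOffset is assumed here to map it onto the buffered timeline.
function emsgStartV1(e: EmsgV1, eventTimestampOffset: number): number {
  return e.presentationTime / e.timescale + eventTimestampOffset;
}
```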
16:00:31 Chris: We need to write the processing rules
16:01:30 Iraj: Just came from an MPEG meeting. I'm writing a contribution with graphs that show how the timing works compared to MSE, showing how all the offsets can be used
16:02:11 ... It'll go to MPEG to be discussed
16:05:13 Chris: That's exactly what we need, to answer questions from the MSE spec editors about emsg integration
16:05:21 [adjourned]
16:05:24 rrsagent, draft minutes
16:05:24 I have made the request to generate https://www.w3.org/2021/04/19-me-minutes.html cpn
17:28:31 kaz has joined #me
17:37:46 Zakim has left #me