16:17:35 RRSAgent has joined #me
16:17:35 logging to https://www.w3.org/2021/03/22-me-irc
16:17:47 Meeting: WICG DataCue
16:17:53 Chair: Chris
16:18:00 scribenick: cpn
16:18:53 Present: Kazuhiro_Hoya, Iraj_Sodagar, Zack_Cava, Chris_Lorenzo, Alicia_Boya, Andy_Rosen
16:19:36 Topic: MSE ISO BMFF byte stream format spec
16:20:10 Chris: Two main parts: init segments and media segments
16:22:07 ... I notice the init segment section may need to change for CMAF. We don't need to cover that in this meeting
16:23:08 ... In the media segments section, I have added proposed text for emsg boxes
16:23:34 https://docs.google.com/document/d/1J3QtUa0udRycz1u-B3QtVDmTQ8F-sotcPVVYnqBpfFw/edit
16:24:25 Chris: Last time we talked about the append error algorithm. This doesn't cover what to do when boxes are not well formed. Do we need to add this?
16:24:35 Alicia: If it's non-obvious, we should mention it explicitly
16:25:34 Chris: The steps here talk more about the media samples. I don't see a description of well-formedness for other boxes
16:26:29 Iraj: If DASH.js could benefit from more specific instructions on validation, we can suggest this to DASH-IF
16:27:13 Chris: Cyril is looking at the byte stream format spec, to align it with what browsers actually implement. We can ask for his input
16:27:37 Action: Chris to ask Cyril about ISO BMFF box parsing and the MSE append error algorithm
16:27:55 Topic: Mapping emsg events to the media timeline
16:28:42 Iraj: The DASH events timing model is described here: https://dashif.org/docs/EventTimedMetadataProcessing-v1.0.2.pdf (section 3)
16:29:55 ... The PeriodStart and SegmentBase values are known to the web app, which parses the manifest. MSE has no knowledge of these
16:30:28 ... MSE doesn't have the manifest, so the web app needs to provide the values. The emsg@ values are in the event message box
16:30:48 ... LAT is the latest arrival time, the presentation time of the segment containing the emsg box
16:31:04 ...
There needs to be an initialisation step where the application provides MSE with the values needed to calculate the event time
16:31:19 Chris: MSE has two append modes: sequence and segments
16:31:55 Iraj: Sequence mode always appends to the end of the append window buffer
16:32:32 ... Segments mode puts the segments on the media timeline based on the earliest presentation time of the segment. This mode is used more frequently
16:32:47 ... Media segments also have presentationTimeOffset
16:33:15 ... MSE is not aware of presentationTimeOffset; it has timestampOffset. When you set the timestampOffset and provide a segment, it calculates the segment presentation time based on the earliest time. So in DASH.js, each time you add a segment, ensure the timestampOffset is correct
16:33:25 ... The same model can be applied to event message processing
16:34:37 ... We need some equivalent of timestampOffset for events, so that when MSE extracts and parses an emsg box, it can put it on the timeline
16:35:26 ... The application can use Equation 1 in the DASH events timing model document to calculate what the timestamp offset will be
16:36:26 Chris: Is that the same timestampOffset as in the MSE spec (on SourceBuffer)?
16:36:43 Iraj: Need to think about it, it could be an event message specific timestamp offset
16:37:37 Action: Iraj to investigate if timestampOffset is applicable for emsg box timing calculation
16:38:33 Topic: Range removal algorithm
16:39:06 Chris: MSE SourceBuffer has a remove(startTime, endTime) method
16:41:38 ... What should this do with events? If an event starts and ends within the time range being removed, it would seem obvious to remove that event
16:41:58 ... But what if the event duration overlaps the start or the end of the time range being removed?
16:42:44 Andy: For timed metadata tracks, Rufael suggested that for long message strings (IMSC across segment boundaries), we repeat them. Does this apply to embedded metadata?
... We have the same thing appearing at multiple times?
16:42:52 Iraj: Does MSE need to know about event duration? Or is event duration a parameter relevant only to the application?
16:43:11 Alicia: Yes, it would need to know, as the cue has start and end times
16:43:55 Iraj: The duration wouldn't be a property relevant to the app. The reason is that MSE has to dispatch the event to the app
16:44:29 ... There are two modes: on-receive and on-start. In on-receive mode, when the segment is parsed and an emsg is seen, MSE dispatches it to the app
16:44:45 ... Equivalency rule aside, at that point MSE isn't concerned with the event any more
16:45:08 ... Same for on-start mode. When an event is received to be dispatched on-start, MSE needs to calculate the start time of the event and put it in the append window so it's dispatched at the event start time
16:45:53 Zack: But that's at a higher level than MSE; here we're managing the buffer of events
16:46:29 ... As events are made known to MSE, they're surfaced as cues. As media playback proceeds, the cues are set to active or inactive by the media player
16:47:03 ... MSE doesn't need app specific context of what the event stream is, but it does need to manage events on the timeline
16:47:36 ... How does it manage the persistency model?
16:48:05 ... To Chris's question, I believe that when the totality of the event region is no longer held in the buffer, it can be removed
16:48:18 ... If an event spans multiple segments, you'd want to keep it. If it were an MPD event, it's independent of the underlying segmentation. Similar should apply to emsg events
16:48:43 Iraj: This is what's not clear to me. If a media segment carries an emsg box and the app adds it to the MSE buffer in the append window,
16:48:55 ... and the append window gets shorter, does MSE keep the event? When you lose the segment, do you also lose the events carried in the segment?
16:49:33 Zack: From what MSE models, which is the media byte stream, it doesn't need to know the segmentation.
... After the data is parsed out, events are anchored to the media engine's timeline
16:50:10 ... If you logically remove the segment time, remove the event as long as the message was completely contained. If it extended past, it would remain
16:50:50 ... [Missed something related to the event equivalency logic]
16:51:06 Iraj: Media samples are active for durations. You can change or purge the append window. Move the append window based on sample time
16:51:14 ... Event message boxes can point to something in the future. When you purge the media samples related to the event, do you keep the event?
16:51:32 ... If you keep the event, you can have consistent rules in terms of event repeats
16:52:22 ... Problem: if the segment is put back in the MSE buffer because the samples don't exist, and the app extends the append window to cover the range for seeking operations, and you then receive another emsg box, how do you know if it's a repeat of the emsg boxes in the buffer? Conceptually, the new segments may have new media samples, or even different event messages
16:52:43 Zack: In this case you're appending another emsg - scheme, start time, duration. These would be equivalent to an existing one in memory
16:52:55 ... Another interesting case: what if the emsg is totally in the future, with no overlap with the media timeline?
16:53:18 Iraj: Need to look in more detail at this and at purging of events
16:53:54 Alicia: The emsg boxes are optional. What would be the interpretation of sending an emsg box followed by a fragment that defines an event from 0-20 seconds, where the first fragment covers 0-10 seconds? If you later append another fragment, would it preserve the event?
16:54:09 Iraj: In this case it replaces them
16:54:43 ... MSE has a specific time range that media samples exist in. We can define the same type of buffer for events. The question is how that buffer is managed
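[Scribe note: one way to sketch the buffer-management behaviour discussed above — treating repeated emsg boxes as equivalent, and removing an event on remove(start, end) only when its range is fully contained in the removed range — is shown below. This is an illustration only, not the MSE processing model; the EventBuffer class and its method names are hypothetical, and the (schemeIdUri, value, id) equivalence key is an assumption based on how DASH identifies repeated events, not something agreed in this meeting.]

```javascript
// Hypothetical sketch of an event buffer for parsed emsg events.
// Not part of MSE; names and rules are assumptions for illustration.
class EventBuffer {
  constructor() {
    this.events = new Map(); // equivalence key -> event
  }

  // One possible equivalence rule: events with the same scheme,
  // value and id are repeats of the same event.
  static key(ev) {
    return `${ev.schemeIdUri}|${ev.value}|${ev.id}`;
  }

  // Appending a repeat of an already-buffered event is a no-op,
  // mirroring the "is it a repeat?" question discussed above.
  append(ev) {
    const k = EventBuffer.key(ev);
    if (this.events.has(k)) return false; // duplicate, ignore
    this.events.set(k, ev);
    return true;
  }

  // Mirror of SourceBuffer.remove(start, end): drop an event only
  // if its [startTime, startTime + duration] range is fully
  // contained in the removed range; overlapping events are kept.
  remove(start, end) {
    for (const [k, ev] of this.events) {
      if (ev.startTime >= start && ev.startTime + ev.duration <= end) {
        this.events.delete(k);
      }
    }
  }
}
```

Under this sketch, an event spanning a removed range survives the removal, which matches Zack's suggestion that an event is purged only when the totality of its region is no longer buffered.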
16:55:14 Action: Chris to capture these questions as GitHub issues for follow-up
16:55:33 Topic: MSE vs app-level emsg parsing
16:55:42 Alicia: What's the point of having in-band parsing, rather than just an API to add cues from the app? Why have browser emsg parsing?
16:56:11 ... Applications are already parsing media segments
16:56:34 Iraj: We want to avoid all top-level parsing. Can we develop a model where MSE handles everything?
16:56:51 Zack: It needs to be solved in an interoperable way, regardless of where it gets implemented
16:57:33 ... My reason for pushing it to MSE is that we need to get to a world where the app doesn't touch the bytestream. With ultra-low latency playback, any touching of the media gives higher latency
16:58:34 ... Removing any touching of the media by JavaScript (see also: the appendStream proposal) will really help performance of device implementations. It's not necessarily a win on desktop browsers, but it is for home media devices
17:00:26 Iraj: Interop will benefit all MSE based implementations
17:00:53 ... We need to solve the MSE parsing to see if it's feasible. If it's not, leave it to the application
17:01:26 Andy: What are the consequences of this? How do we convey these decisions to DASH.js? Does it go through the Events TF? DASH.js will need to be updated
17:02:06 Iraj: If this gets implemented in MSE, then DASH players can update. There's a similar issue with DASH.js handling of changeType()
17:02:20 Topic: Next meeting
17:02:40 Iraj: In 2 weeks?
17:03:08 Chris: Easter holiday, so we could meet the following Monday
17:03:18 ... So April 12th at 8am Pacific
17:03:23 [adjourned]
17:03:32 rrsagent, draft minutes
17:03:32 I have made the request to generate https://www.w3.org/2021/03/22-me-minutes.html cpn
17:03:38 rrsagent, make log public
17:04:21 present+ Chris_Needham
17:04:23 rrsagent, draft minutes
17:04:23 I have made the request to generate https://www.w3.org/2021/03/22-me-minutes.html cpn