W3C

– DRAFT –
WICG DataCue

22 March 2021

Attendees

Present
Alicia_Boya, Andy_Rosen, Chris_Lorenzo, Chris_Needham, Iraj_Sodagar, Kazuhiro_Hoya, Zack_Cava
Regrets
-
Chair
Chris
Scribe
cpn

Meeting minutes

MSE ISO BMFF byte stream format spec

Chris: Two main parts: init segments and media segments
… I notice the init segment section may need to change for CMAF. Don't need to cover that in this meeting
… In the media segments section, I have added proposed text for emsg boxes

https://docs.google.com/document/d/1J3QtUa0udRycz1u-B3QtVDmTQ8F-sotcPVVYnqBpfFw/edit

Chris: Last time we talked about the append error algorithm. This doesn't cover what to do when boxes are not well formed. Do we need to add this?

Alicia: If it's non-obvious, we should mention explicitly

Chris: The steps here talk more about the media samples. I don't see a description of well-formedness for other boxes
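
As an illustration of the kind of well-formedness check being discussed, here is a minimal sketch that walks the top-level ISO BMFF boxes of a complete segment and rejects inconsistent sizes. The function name `walkTopLevelBoxes` and the `BoxInfo` type are hypothetical and not part of MSE, and which conditions should actually trigger the append error algorithm is exactly the open question, so the checks below are assumptions.

```typescript
// Hypothetical helper, not part of MSE: lists top-level ISO BMFF boxes and
// throws on the first structural inconsistency. Assumes `data` holds a
// complete segment (a streaming parser would wait for more bytes instead).
interface BoxInfo {
  type: string;   // four-character code, e.g. "emsg", "moof", "mdat"
  offset: number; // byte offset of the box within the buffer
  size: number;   // total box size in bytes, including the header
}

function walkTopLevelBoxes(data: Uint8Array): BoxInfo[] {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  const boxes: BoxInfo[] = [];
  let offset = 0;
  while (offset < data.byteLength) {
    const remaining = data.byteLength - offset;
    if (remaining < 8) {
      throw new Error(`truncated box header at offset ${offset}`);
    }
    let size = view.getUint32(offset); // big-endian 32-bit size
    const type = String.fromCharCode(
      view.getUint8(offset + 4), view.getUint8(offset + 5),
      view.getUint8(offset + 6), view.getUint8(offset + 7));
    let headerSize = 8;
    if (size === 1) {
      // size == 1 means a 64-bit "largesize" follows the type field.
      if (remaining < 16) {
        throw new Error(`truncated largesize in '${type}' at offset ${offset}`);
      }
      const high = view.getUint32(offset + 8);
      const low = view.getUint32(offset + 12);
      if (high > 0x1fffff) {
        throw new Error(`box '${type}' is too large to represent`);
      }
      size = high * 2 ** 32 + low;
      headerSize = 16;
    } else if (size === 0) {
      // size == 0 means the box extends to the end of the data.
      size = remaining;
    }
    if (size < headerSize) {
      throw new Error(`box '${type}' declares a size smaller than its header`);
    }
    if (size > remaining) {
      throw new Error(`box '${type}' extends past the end of the segment`);
    }
    boxes.push({ type, offset, size });
    offset += size;
  }
  return boxes;
}
```

A caller could run this before looking inside any `emsg` box, so that a box with an inconsistent size is caught before its payload is interpreted.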

Iraj: If DASH.js could benefit from more specific instructions on validation, we can suggest to DASH-IF.

Chris: Cyril is looking at the byte stream format spec, to align it with what browsers actually implement. We can ask for his input

Action: Chris to ask Cyril about ISO BMFF box parsing and the MSE append error algorithm

Mapping emsg events to the media timeline

Iraj: DASH events timing model is described here: https://dashif.org/docs/EventTimedMetadataProcessing-v1.0.2.pdf (section 3)
… The PeriodStart, SegmentBase values are known to the web app (which parses the manifest). MSE has no knowledge of these
… MSE doesn't have the manifest, so the web app needs to provide the values. The emsg@ values are in the event message box
… LAT is the latest arrival time, presentation time of the segment containing the emsg box
… There needs to be an initialisation setup to set these values for the application, that provides MSE the values needed to calculate the event time

Chris: MSE has two append modes: sequence and segments

Iraj: sequence mode always appends to the end of the append window buffer
… segments mode puts the segments on the media timeline based on the earliest presentation time of segment. This mode is used more frequently
… media segments also have presentationTimeOffset
… MSE is not aware of presentationTimeOffset. It has timestampOffset. When you set the timestampOffset and provide a segment, it calculates the segment presentation time based on earliest time. So in DASH.js, each time you add the segment, ensure the timestampOffset is correct
… The same model can be applied to event message processing.
… We need some equivalent of timestampOffset for events, so that when MSE extracts and parses an emsg box, it can put it on the timeline
… The application can use Equation 1 in the DASH events timing model document to calculate what the timestamp offset will be
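
To make the timing discussion above concrete, here is a rough sketch of how an application might derive a timestampOffset from manifest values, and how an event start time could then be placed on the media timeline. All type and function names are hypothetical. The arithmetic is an assumption about the general shape of the calculation, not a transcription of Equation 1, so it should be checked against the DASH-IF document before being relied on.

```typescript
// Illustrative names only; none of these are defined by MSE.
interface PeriodTiming {
  periodStart: number;            // seconds, from the manifest
  presentationTimeOffset: number; // in timescale units, from the manifest
  timescale: number;              // units per second
}

// Offset that maps segment-internal timestamps onto the media timeline.
// This is the value an app would typically assign to
// SourceBuffer.timestampOffset before appending a segment of this period.
function timestampOffsetFor(p: PeriodTiming): number {
  return p.periodStart - p.presentationTimeOffset / p.timescale;
}

// emsg version 0 carries a delta relative to the containing segment;
// version 1 carries an absolute presentation time.
type EmsgTiming =
  | { version: 0; timescale: number; presentationTimeDelta: number }
  | { version: 1; timescale: number; presentationTime: number };

// segmentStart: presentation time of the segment containing the emsg box,
//               already mapped to the media timeline (the LAT above).
// eventOffset:  assumed event-specific equivalent of timestampOffset.
function eventStartTime(
  emsg: EmsgTiming,
  segmentStart: number,
  eventOffset: number,
): number {
  if (emsg.version === 0) {
    return segmentStart + emsg.presentationTimeDelta / emsg.timescale;
  }
  return eventOffset + emsg.presentationTime / emsg.timescale;
}
```

Whether `eventOffset` is the same value as the SourceBuffer's timestampOffset or a separate event-specific offset is the open question raised in the discussion, which is why the sketch passes it as its own parameter.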

Chris: Is that the same timestampOffset as in the MSE spec (on SourceBuffer)?

Iraj: Need to think about it, it could be an event message specific timestamp offset

Action: Iraj to investigate if timestampOffset is applicable for emsg box timing calculation

Range removal algorithm

Chris: MSE SourceBuffer has a remove(startTime, endTime) method
… What should this do with events? If an event starts and ends within the time range being removed, it would seem obvious to remove that event
… But what if the event duration overlaps the start or the end time being removed?

Andy: For timed metadata tracks, Rufael suggested that for long message strings (IMSC across segment boundaries), repeat them. Does this apply to embedded metadata? Would we have the same thing appearing at multiple times?

Iraj: Does MSE need to know about event duration? Or is event duration a parameter relevant to the application?

Alicia: Yes, it would need to know, as the cue has start and end times

Iraj: The duration wouldn't be a property relevant to the app. The reason is that MSE has to dispatch the event to the app
… Two modes: on-receive, on-start. In on-receive mode, when the segment is parsed and an emsg is seen, MSE dispatches it to the app
… Equivalency rule aside, at that point MSE isn't concerned about the event any more
… Same for on-start mode. When an event is received to be dispatched on-start, MSE needs to calculate start time of event and put it in the append window so it's dispatched at the event start time
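
The two dispatch modes described above can be sketched as a small dispatcher. This is a minimal illustration, not a proposed API: `EventDispatcher`, `TimedEvent`, `onParsed` and `onTimeUpdate` are hypothetical names, and the sketch ignores event equivalence and removal.

```typescript
type DispatchMode = "on-receive" | "on-start";

interface TimedEvent {
  id: number;
  startTime: number; // seconds on the media timeline
  duration: number;  // seconds
}

// Hypothetical dispatcher: delivers events either as soon as they are
// parsed (on-receive) or when playback reaches their start (on-start).
class EventDispatcher {
  private mode: DispatchMode;
  private deliver: (e: TimedEvent) => void;
  private pending: TimedEvent[] = [];

  constructor(mode: DispatchMode, deliver: (e: TimedEvent) => void) {
    this.mode = mode;
    this.deliver = deliver;
  }

  // Called when an emsg box has been parsed out of an appended segment.
  onParsed(event: TimedEvent): void {
    if (this.mode === "on-receive") {
      this.deliver(event);
      return;
    }
    this.pending.push(event);
  }

  // Called as playback advances; delivers queued events whose start time
  // has been reached, in start-time order.
  onTimeUpdate(currentTime: number): void {
    const due = this.pending.filter((e) => e.startTime <= currentTime);
    this.pending = this.pending.filter((e) => e.startTime > currentTime);
    due.sort((a, b) => a.startTime - b.startTime);
    for (const e of due) {
      this.deliver(e);
    }
  }
}
```

In on-start mode the dispatcher has to hold events until their start time, which is where the buffer management questions in the following discussion come from.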

Zack: But that's at a higher level than MSE, here we're managing the buffer of events
… As events are made known to MSE, they're being surfaced as cues. As media playback proceeds, the cues are set to active or inactive by the media player
… MSE doesn't need app specific context of what the event stream is, but it does need to manage events on the timeline
… How does it manage the persistency model?
… To Chris's question, I believe when the totality of the event region is no longer held in the buffer it can be removed
… If an event spans multiple segments, you'd want to keep it. If it were an MPD event, it's independent of underlying segmentation. Similar should apply to emsg events

Iraj: This is what's not clear to me. If a media segment carries an emsg box and the app adds to the MSE buffer in the append window,
… and the append window gets shorter, does MSE keep the event? When you lose the segment, you also lose the events carried in the segment?

Zack: From what MSE models, which is the media byte stream, it doesn't need to know the segmentation. After the data is parsed out, events are anchored to the media engine's timeline
… If you logically remove the segment time, remove the event as long as the message was completely contained. If it extended past, it would remain
… [Missed something related to the event equivalency logic]
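
The removal rule Zack describes (drop an event only when it lies completely inside the removed range, keep it if it extends past either end) could be sketched as follows. `BufferedEvent` and `removeEventsInRange` are hypothetical names, and this is one possible reading of the rule rather than agreed behaviour.

```typescript
interface BufferedEvent {
  id: number;
  startTime: number; // seconds on the media timeline
  endTime: number;   // startTime + duration, in seconds
}

// Returns the events that survive removal of [removeStart, removeEnd].
// An event is dropped only if it is completely contained in the range;
// events that overlap either edge of the range are kept.
function removeEventsInRange(
  events: BufferedEvent[],
  removeStart: number,
  removeEnd: number,
): BufferedEvent[] {
  return events.filter(
    (e) => !(e.startTime >= removeStart && e.endTime <= removeEnd),
  );
}
```

For example, removing the range 1 to 10 seconds would drop an event spanning 2 to 4 seconds, but keep one spanning 4 to 12 seconds and one spanning 0 to 3 seconds.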

Iraj: Media samples are active for durations. You can change or purge the append window. Move the append window based on sample time
… Event message boxes can point to something in the future. When you purge the media samples related to the event, do you keep the event?
… If you keep the event, you can have consistent rules in terms of event repeats
… Problem: suppose the app extends the append window to cover a seek, and the segment is put back in the MSE buffer because its samples no longer exist there. If you then receive another emsg box, how do you know whether it's a repeat of the emsg boxes in the buffer? Conceptually, the new segments may have new media samples, or even different event messages

Zack: In this case you're appending another emsg - scheme, start time, duration. These would be equivalent to an existing one in memory
… Another interesting case: what if the emsg is totally in future, no overlap with media timeline?
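
The repeat-detection question above could be handled by keying buffered events on whatever fields define equivalence. The sketch below leaves that choice to the caller, since the discussion did not settle it: one key function uses scheme, value and id, the other uses the scheme, start time and duration that Zack mentions. All names are hypothetical.

```typescript
interface ParsedEmsg {
  schemeIdUri: string;
  value: string;
  id: number;
  startTime: number; // seconds on the media timeline
  duration: number;  // seconds
}

type KeyFn = (e: ParsedEmsg) => string;

// Two candidate equivalence keys; which one applies is an open question.
const keyBySchemeValueId: KeyFn = (e) =>
  JSON.stringify([e.schemeIdUri, e.value, e.id]);
const keyBySchemeStartDuration: KeyFn = (e) =>
  JSON.stringify([e.schemeIdUri, e.startTime, e.duration]);

// Hypothetical event buffer that ignores repeats of an event it already holds.
class EventBuffer {
  private byKey = new Map<string, ParsedEmsg>();
  private keyFn: KeyFn;

  constructor(keyFn: KeyFn) {
    this.keyFn = keyFn;
  }

  // Returns true if the event was new, false if it was treated as a repeat.
  add(e: ParsedEmsg): boolean {
    const key = this.keyFn(e);
    if (this.byKey.has(key)) {
      return false;
    }
    this.byKey.set(key, e);
    return true;
  }

  get size(): number {
    return this.byKey.size;
  }
}
```

With either key, re-appending a segment that carries the same emsg box leaves the buffer unchanged, while a box that differs in any keyed field is stored as a new event.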

Iraj: Need to look in more detail at this and the purging of events

Alicia: The emsg boxes are optional. What could be the interpretation of sending an emsg box followed by a fragment, where the emsg defines an event at 0-20 seconds? The first fragment covers 0-10 seconds. If you later append another fragment, would it preserve the event?

Iraj: In this case it replaces them
… MSE has a specific time range in which media samples exist. We can define the same type of buffer for events. The question is how that buffer is managed

Action: Chris to capture these questions as GitHub issues for follow-up

MSE vs app-level emsg parsing

Alicia: What's the point of having in-band parsing, rather than just an API to add cues from the app? Why have the browser parse emsg?
… Applications are already parsing media segments

Iraj: We want to avoid all top-level parsing. Can we develop a model where MSE handles everything?

Zack: It needs to be solved in an interoperable way, regardless of where it gets implemented
… My reason for pushing it to MSE is that we need to get to a world where the app doesn't touch the bytestream. With ultra-low latency playback, any touching of the media gives higher latency
… Removing any touching of the media by JavaScript (see also: appendStream proposal), will really help performance of device implementations. It's not necessarily a win on desktop browsers, but it is for home media devices

Iraj: Interop will benefit all MSE-based implementations
… We need to solve the MSE parsing to see if it's feasible. If it's not, leave to the application

Andy: What are the consequences of this? How to convey these decisions to DASH.js? Does it go through Events TF? DASH.js will need to be updated

Iraj: If this gets implemented in MSE, then DASH players can update. There's a similar issue with DASH.js handling of changeType()

Next meeting

Iraj: In 2 weeks?

Chris: Easter holiday, so could meet the following Monday
… So April 12th at 8am Pacific

[adjourned]

Summary of action items

  1. Chris to ask Cyril about ISO BMFF box parsing and the MSE append error algorithm
  2. Iraj to investigate if timestampOffset is applicable for emsg box timing calculation
  3. Chris to capture these questions as GitHub issues for follow-up
Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).
