W3C

- DRAFT -

Media and Entertainment IG - Media Timed Events TF

18 Jun 2018

Agenda

Attendees

Present
Kaz_Ashimura, Bob_Lund, Chris_Needham, Giri_Mandyam, John_Luther, Masaru_Takechi, Ali_Begen, Jean-Yves_Avenard, Thomas_Stockhammer, Nigel_Megitt, Mark_Vickers, Steve_Morris
Regrets
Chair
Giri
Scribe
cpn, kaz

Contents


<cpn> scribenick: cpn

Introduction

Giri: This is the second TF meeting, driving towards creating concrete recommendations by TPAC this year.
... Thanks to Chris, we've created a GitHub repo, https://github.com/w3c/me-media-timed-events

Open issues

Giri: There are a few open issues.

<kaz> open issues

Giri: One was based on a discussion with John and Bob, whether the ISO BMFF byte stream format spec should be updated.

<kaz> issue 1 - ISO BMFF Byte Stream Format

Giri: The conclusion was that this may not be the spec to do that. That led to another open issue, which is how to actually process an emsg box using existing mechanisms: DataCue is defined but not really adopted en masse.

<kaz> issue 2 - DataCue

Giri: In the end, Silvia provided a good summary regarding trade-offs between potential options
... Mark commented on the issue that we don't create specs in the IG, as we don't have an IPR policy in force.
... The recommendations can go into the WICG; we'll come back to that later.

<kaz> issue 3 - DAInty

Giri: There's an issue I opened, regarding the DASH-IF, the client APIs for interactivity (not public).
... I've discussed internally, including with Thomas Stockhammer, suggestion to use the liaison between DASH-IF and W3C.
... I can work with Kaz and the IG chairs to get that info.

<kaz> scribenick: kaz

Chris: This would be really useful input to the TF.

<cpn> scribenick: cpn

Thomas: If there's a benefit to opening up this work to share, we could do that.
... As discussed in DASH-IF recently, dash.js has an event processing model that
... includes emsg. We'll maintain and document this model.
... There's a second approach where the application has more control over the media timeline.

Giri: We should capture that as another issue here.
... It relates to the issue raised by Jon Piesing, the technical director of HbbTV,
... recommending adding reference to emsg handling in HbbTV.

<kaz> issue 4 - reference to HbbTV

Giri: From an ATSC perspective, we considered the HbbTV model and rejected it.
... The concern was that all event stream handling needed to be mapped to dedicated tracks.
... The emsg data cue mapping is specifically called out in the HbbTV spec now.

<kaz> scribenick: kaz

Chris: The issue I raised (#2) about how to expose the event, and whether DataCue is appropriate,
... or whether we prefer a more emsg-specific API. I'm interested in receiving implementer feedback on this.

Giri: We have MS here, but not the other browser vendors.

<cpn> scribenick: cpn

Jean-Yves: I'm from Mozilla. I was invited to the call by Chris.
... I had a look at the spec. There are two issues from an implementation point of view.
... Browsers don't have native DASH playback, so events have to be delivered in-band.
... The other is the handling of the out-of-band message case.

Giri: That's the case in ATSC. Out-of-band events aren't handled.

Jean-Yves: For in-band events, timing is based on the sidx box. But as far as ISO BMFF is defined for MSE,
... these are to be explicitly ignored, so there's a spec conflict here.

Giri: Is sidx processing also needed?

Jean-Yves: You can't have one without the other, otherwise files are assumed to start at time zero.
... I'm fairly certain that anything out of band would never be implemented unless you
... have a way to pass it to the implementation.

Bob: I have implemented exposing emsg across UAs with dash.js, leveraging the API that Thomas mentioned.
... It was able to handle emsg boxes in ISO BMFF, also out of band events in MPDs.
... It leveraged the existing TextTrack interface supported by browsers.
... It's all doable, we did it using polyfills, so it demonstrates that everything the group wants can be done.
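Bob's polyfill begins by handling emsg boxes in ISO BMFF segments. As an illustration of that first step, here is a minimal sketch of parsing a version 0 emsg box; the field layout follows the DASH spec (ISO/IEC 23009-1), but the code and names are illustrative, not taken from the dash.js implementation:

```javascript
// Minimal sketch: parse a version 0 'emsg' box from a DASH media
// segment. This is an illustration of the field layout, not a full
// ISO BMFF parser (it assumes the buffer starts at the box).

function parseEmsgV0(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const size = view.getUint32(0);
  const type = String.fromCharCode(bytes[4], bytes[5], bytes[6], bytes[7]);
  if (type !== 'emsg') throw new Error('not an emsg box');
  if (bytes[8] !== 0) throw new Error('only version 0 handled here');
  let offset = 12; // past size, type, version + flags

  // scheme_id_uri and value are null-terminated UTF-8 strings
  function readCString() {
    let end = offset;
    while (bytes[end] !== 0) end++;
    const s = String.fromCharCode(...bytes.subarray(offset, end));
    offset = end + 1;
    return s;
  }

  const schemeIdUri = readCString();
  const value = readCString();
  const timescale = view.getUint32(offset);
  const presentationTimeDelta = view.getUint32(offset + 4);
  const eventDuration = view.getUint32(offset + 8);
  const id = view.getUint32(offset + 12);
  const messageData = bytes.subarray(offset + 16, size);
  return { schemeIdUri, value, timescale, presentationTimeDelta,
           eventDuration, id, messageData };
}
```

The returned plain object is what a polyfill could then surface to the application, for example via a text track cue.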

Giri: Can you share a reference?

Bob: I can do that, yes.

Giri: We have some solid things to add to the document, e.g., in-band and out-of-band processing, also implementations.

<gmandyam> https://github.com/w3c/me-media-timed-events/issues/2

<kaz> scribenick: kaz

Chris: Coming back to issue #2, to summarise. I looked at the DataCue API and
... considered whether it is what we need for emsg.
... The information Jon provided about the HbbTV implementation shows how the mapping can work.
... The TextTrack API exposes the scheme and value combination,
... which allows the existing DataCue API to be used.
... Silvia made a useful suggestion, that VTTCue could be used not only for emsg but other kinds of events more generally.
... The UA would effectively construct JSON objects,
... which can contain any necessary information, not just an emsg structure.
... In the end, she summarized the 4 options identified: TextTrackCue, VTTCue, DataCue, and defining a new EMSGEventCue.
... The TextTrackCue text attribute is not part of HTML.
... VTTCue is supported by browsers.
... DataCue is only implemented in custom browser engines, e.g., HbbTV
... Or we could choose to define a new API for emsg, which would require a new spec.
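To make the cue-mapping options concrete, here is a sketch of turning parsed emsg fields into a DataCue-like timing structure on the media timeline. The plain-object shape is an assumption for illustration, not the actual DataCue IDL; the scheme/value pairing follows the HbbTV mapping mentioned above:

```javascript
// Sketch: map parsed emsg fields onto cue timing. All field names on
// the returned object are illustrative, not a real browser interface.

function emsgToCueInit(emsg, segmentStartTime) {
  // presentation_time_delta and event_duration are expressed in
  // 'timescale' ticks relative to the segment's start time.
  const startTime = segmentStartTime +
    emsg.presentationTimeDelta / emsg.timescale;
  // In DASH, an event_duration of 0xFFFFFFFF signals an unknown
  // duration; map that to an open-ended cue.
  const endTime = emsg.eventDuration === 0xFFFFFFFF
    ? Number.POSITIVE_INFINITY
    : startTime + emsg.eventDuration / emsg.timescale;
  return {
    startTime,
    endTime,
    type: emsg.schemeIdUri, // scheme/value pair identifies the event stream
    value: emsg.value,
    data: emsg.messageData,
  };
}
```

For example, an emsg with a 90000 timescale and a presentation time delta of 45000 in a segment starting at 10 s yields a cue starting at 10.5 s.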

Giri: Should we lay out the potential options, to generate a recommendation for the WICG?

Chris: What I'd like to do is extract the requirements from our existing document,
... so that we have something written down to show them for the emsg-specific part.
... If the browser implementers agree to work on it, we would get a GitHub repo to start the spec work.

Mark: We need to have clear requirements written down.
... Alternatives have been discussed, but we don't have requirements written down yet.
... We've jumped into the discussion on alternatives, but is it only emsg that we're interested in?
... Are there other kinds of timed events? What are the underlying mechanisms?
... And use cases, how will the data be used?
... What do the APIs need to do, and how are the events to be used by applications?
... Describing the possible alternatives is fine, but we do need the use cases and requirements.

Chris: I support that, there is already a section on use cases in the document, but it could use more input from the TF.

<cpn> https://w3c.github.io/me-media-timed-events/#use-cases

Chris: The first one I added is synchronized triggering of image URLs against an audio stream.
... The use case comes from DVB-DASH, targeted to embedded devices.
... The timing of the image display may not be too critical, within 1 second is OK.
... The other use case I have comes from the DASH spec itself,
... where an event is used to tell the application to refresh the MPD manifest file.
... These may not be the most valuable use cases to describe.
... I would like to see others' contributed use cases as well.
... Does anybody have other use cases?

Thomas: We're discussing advertising use cases in DASH-IF as well,
... but these are quite often not terminated in the endpoint.

<cpn> scribenick: cpn

Thomas: This is not only for browsers, it's for proxies, to manipulate the manifest,
... i.e., for in-network manipulation of data.
... In DASH-IF, requirements have come from 3GPP, synchronised interactivity displayed alongside media.
... ATSC developed its own API for this.

<kaz> scribenick: kaz

Chris: I believe somebody posted the 3GPP document during one of our previous calls.
... It's a good source on synchronized presentation of interactive elements.

<Zakim> nigel, you wanted to ask how generic emsg is vs other kinds of data cues?

<cpn> scribenick: cpn

Nigel: Mark is right to focus us on requirements.
... Having specific use cases is useful, as this group wants to see implementations.
... My view is that HbbTV is a UA like any other, and DataCue is useful there, so it should be considered.
... My question is: there's a lot of focus on emsg, are there other kinds of data, or why is emsg special?

Giri: From an ATSC perspective, there was a desire to have certain types of media timed data transition through different transports,
... if the data is immutable and passed from one to another.
... This makes in-band events desirable as opposed to out of band.

<kaz> scribenick: kaz

Chris: Is there an example of another kind of event than emsg?

Giri: I would go back to what Mark suggested, determine what the requirements would be.

<cpn> scribenick: cpn

Giri: We should document what we mean by in-band and out-of-band; this would be helpful input for the WICG.
... As Jean-Yves said before, if the application can handle the out-of-band events, maybe they shouldn't be precisely media timed.

<kaz> scribenick: kaz

Chris: This reminds me of another topic we've been discussing in the IG,
... about the timing accuracy of timed events, the "time marches on" algorithm,
... and how this relates to the timeupdate event, but also media timed event triggering.

<cpn> https://github.com/w3c/media-and-entertainment/issues/4
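One way to see the timing concern raised in that issue: applications that poll currentTime on timeupdate, which the HTML spec allows to fire as rarely as every 250 ms, can miss short-lived cues entirely. A hypothetical helper to illustrate, not a spec algorithm:

```javascript
// Sketch of the timing concern: cues that both start and end between
// two successive polls of currentTime are never observed as active,
// which is why cue enter/exit events matter for media timed events.

function missedCues(cues, previousTime, currentTime) {
  return cues.filter(c =>
    c.startTime > previousTime && c.endTime < currentTime);
}
```

A 100 ms cue falling entirely inside a 250 ms polling gap would show up in this list, i.e., a polling application would never see it.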

Chris: When we submit something to WICG, I'd suggest that it also contains timing considerations.
... The timing requirements may differ for particular event types,
... and those should be captured as our requirements.
... Nigel, you've been active in that thread. Did you come to some conclusion?
... You've looked at what the specs say, maybe the specs allow some implementation freedom?

Nigel: It seems that subtitles and captions create good requirements for synchronization.
... We can probably write down the requirements.

<cpn> scribenick: cpn

Nigel: People want frame accuracy for subtitle presentation.
... Typical frame rates were chosen for what people can see without noticing flicker.
... From there, a higher frame rate is better for motion artifacts,
... but it's not clear whether the required subtitle accuracy changes.
... I don't know if there are tests to show that.
... Also, from a bit-rate point of view, one technique is to reduce the video frame rate,
... but the audio and subtitles continue as normal.
... The EBU-TT-D appendix describes this. If you quantize to a low frame rate, e.g., 6.25 fps, this has bad effects,
... so you should prefer to synchronize to the audio.
... You don't want subtitles to overlap frame boundaries. These requirements can be documented.
... For other types of event/content, requirements may differ,
... e.g., ad insertion at 100 fps may need higher accuracy for data event triggering.
... There's some differences in terminology, so the requirements doc would be good to capture those,
... e.g., media time.
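Nigel's quantisation point can be made concrete with a small sketch: snapping a cue time to frame boundaries at 6.25 fps gives a worst-case error of 80 ms, versus 20 ms at 25 fps. The numbers and helper names are illustrative only, following the EBU-TT-D discussion above:

```javascript
// Sketch: snapping a subtitle time to the nearest frame boundary, and
// the worst-case timing error that snapping introduces at a given
// frame rate. Illustrates why quantizing to 6.25 fps "has bad effects"
// compared with a normal 25 fps rate.

function quantiseToFrame(timeSeconds, fps) {
  return Math.round(timeSeconds * fps) / fps;
}

function worstCaseError(fps) {
  // Rounding to the nearest frame can be off by up to half a frame.
  return 0.5 / fps;
}
```

For example, a cue at 10.03 s snaps to 10.08 s at 6.25 fps (50 ms off) but to 10.04 s at 25 fps (10 ms off), which is why synchronizing to the audio timeline is preferred.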

Kaz: Do we want to add a terminology section in the doc?

Chris: I think we do.

Thomas: Timed text tracks and emsg are similar concepts. emsg was added as it can be multiplexed with the media track.
... There's also media resources in ISO BMFF.

<kaz> scribenick: kaz

Chris: This TF should look into that.
... I asked the W3C TAG for review, their response asked about the Web Packaging work which is already being done within W3C.

<scribe> scribenick: cpn

Thomas: There's the whole issue of using ISO BMFF tracks for interactivity, and of using regular ISO BMFF tracks for more general media.
... Getting people together for a workshop would be helpful.

Giri: We should pick that conversation up again.

<kaz> scribenick: kaz

Chris: I suggest we continue to work on the document to capture use cases, requirements, terminology.
... I will file GitHub issues for each of the things we mentioned today.

Next call

Chris: 3rd Monday of the month, so July 16th.
... Thank you everyone.

[adjourned]

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.147 (CVS log)
$Date: 2018/06/19 22:47:35 $