Media Timed Events TF

19 Nov 2018



Present: Chris_Needham, Kaz_Ashimura, Steve_Morris, Giri_Mandyam, Rob_Smith
Scribe: kaz, cpn


<kaz> Chair: Giri

<kaz> scribenick: kaz

UC document

Media Timed Events UC document

Giri: One of the things Chris raised as a next step is publication of this document.
... I believe we said it would come out as an IG Note. There's a couple of outstanding pull requests,
... but after that we have the option of taking a snapshot in time.
... What are the formal procedures involved in publishing this as a note?

<cpn> scribenick: cpn

Kaz: The procedure itself is not very complicated. During this TF call, we can make our own decision, and then we can confirm that decision during a main IG call as well,
... and then talk with PLH as the project manager, to get approval, and then we can publish the document.

<kaz> scribenick: kaz

Giri: Is there any objection to that? My suggestion would be to take care of outstanding pull requests during the next couple of weeks prior to the next IG call,
... then raise this as a topic during that call to say we want to publish. Does that sound OK?

Chris: I think what we have now is good as a first public draft. I'm not sure it's ready to be finalised at this stage.
... I'm certainly happy to publish, but with a view to making some more updates. I don't necessarily feel we've completed this yet.

Giri: I tend to view this as a snapshot in time. Kaz, what's the document life cycle for IG notes, as compared to standards track documents? Can we publish a snapshot and keep revising?

<cpn> scribenick: cpn

Kaz: As we're an Interest Group, this document will become an IG note,
... and we can publish whenever we want as an updated group note.

<kaz> scribenick: kaz

Giri: OK

Steve: That sounds reasonable to me, I think that's the best approach.
... If we get something out then it gives people visibility of what we've been doing.

Giri: I think it's also a good idea too, because as we go forward with the collaboration with WICG, we may find we need to revise the document.

Open issues

GitHub issues

Giri: We always need to keep revising use cases, I think. Timing requirements was another thing.
... Chris and I have both sent a couple of pull requests, we need another set of eyes on this.

Chris: Let's go through these, to see what we think?

issue 22

Chris: Looking at issue 22, the timing requirements.
... We've had some discussions on previous calls about captioning and the need for frame accurate rendering.
... One of the things I think we discussed at TPAC was to identify if different use cases have different timing requirements.

Giri: I had an action item from TPAC to document the SCTE-35 requirements.

PR 25

Giri: There, you can set an insertion cue as little as 2 seconds prior to the availability of the splice event,
... which to me seems to put a tight timing requirement on the user agent for propagating a splice event over to the application.
... It's a use case that doesn't leave much time for client side processing.

Chris: What's the context for this, is it in MSE playback?

Giri: It could be. By splicing, I'm primarily referring to ad insertion.
... As far as SCTE is concerned, it's any downstream point where the splice can take place,
... even all the way down to the user agent, and MSE is the W3C solution to splicing, currently.
... So this is what the cable guys are looking at as far as splice requirements are concerned.
... What are we going to do as far as user agent processing is concerned?
... They don't really give normative requirements that can be translated into user agent requirements.
... Two seconds seems like something the user agent could meet, but that's for the entire processing of the ad insertion cue,
... including the transmission delay and processing by the user agent to extract the event data and propagate it to the application,
... and delay in the application to handle the event data. So the timeline isn't very clear to me.

Chris: We've talked about the "time marches on" algorithm and the 250 millisecond limit that that specifies.
... Would that be sufficient?

Giri: My guess is it's probably not. If you consider a 2 second budget, the transmission latency could be several hundred milliseconds, depending on the quality of the transmission.
... Then there's a 250ms upper bound on processing by the user agent. Plus, if the application has to make any networking calls, that can add several hundred milliseconds.
... So that could be up to 1 second of delay. You would rather have the bulk of that 2 second budget allocated to transmission delays rather than client processing.
... That's my initial take. I could break it down into a time budget for one way transmission.
... I think the BBC might be in a better position to answer this, as you're actually processing these cues on the back-end, so you might be able to offer some insight.
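The time budget Giri walks through can be sketched as simple arithmetic. All of the millisecond figures below are illustrative assumptions for discussion, not normative values from SCTE-35 or HTML; only the 2 second cue lead time and the 250 ms "time marches on" bound come from the discussion above.

```javascript
// Illustrative breakdown of a 2-second SCTE-35 splice cue budget.
// Every default here is an assumption, not a normative figure.
function clientProcessingBudget({
  totalMs = 2000,        // cue can arrive as little as 2s before the splice
  transmissionMs = 400,  // one-way transport latency (assumed)
  timeMarchesOnMs = 250, // HTML "time marches on" upper bound
  appNetworkMs = 300,    // app-side fetch, e.g. to an ad server (assumed)
} = {}) {
  return totalMs - transmissionMs - timeMarchesOnMs - appNetworkMs;
}

console.log(clientProcessingBudget()); // remaining headroom in ms
```

Under these assumed figures roughly half the budget is already consumed before the application does any real work, which is the concern raised about allocating most of the budget to transmission rather than client processing.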

Chris: I can see what I can find out regarding end to end delay.

Giri: This is an example of additional timing requirements. Are there any others we should consider?

Chris: Somebody contacted me offline, they're following the work we're doing.
... So we may have some new use cases coming in, I hope they'll reply to the GitHub issue.
... The caption rendering is another one where we have stringent timing requirements.

Giri: We have the BBC subtitle guidelines. Are there other sources for caption requirements that could be frame synchronous?


Chris: There was someone who replied to the M&E IG issue 4.

<cpn> https://github.com/w3c/media-and-entertainment/issues/4#issuecomment-396762643

Chris: This comment mentions caption rendering, he wants the captions to appear and disappear coinciding with scene changes,
... and he's using the currentTime from the video element.
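The approach described in that comment, polling `currentTime` to drive caption visibility, can be sketched as follows. The cue shape and function names are hypothetical, for illustration only:

```javascript
// Minimal sketch of driving caption display from media currentTime,
// as the commenter describes. Cue objects are a hypothetical shape.
function activeCaptions(cues, currentTime) {
  return cues.filter(c => c.start <= currentTime && currentTime < c.end);
}

// In a page this would typically run on the 'timeupdate' event, whose
// firing rate is only loosely specified by HTML, which is exactly the
// frame-accuracy concern raised in the issue:
//
// video.addEventListener('timeupdate', () => {
//   render(activeCaptions(cues, video.currentTime));
// });
```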

Giri: I see, this seems to be a caption that is sync'd with the scene rather than a captioning requirement.
... I think we can reflect that in the document.
... It seems like a content authoring use case, I can see if there are other mentions of this.
... From a BBC perspective, do you see this requirement from a content authoring perspective?

Chris: I would defer to Nigel on issues of caption rendering,
... but something we are interested in doing, more from a research area,
... is client side rendering of supplemental content alongside video, e.g.,
... triggering overlays or graphics rendered with the video content.
... And we would want to achieve a higher level of timing precision to do those kinds of things.
... Those are the kinds of use cases where right now you would have to take video frames and render into a canvas
... rather than using events to trigger DOM updates.
... So, for us, having a much more integrated control over the video rendering with the ability to mix additional content into that would be of interest.
... I'm less certain what the solutions would be there.

Giri: We have a good case here, if this is critical for content authors.
... My experience with canvas is that you cannot keep the same FPS as the source content, I haven't been able to get that to work.
... I don't think content authors would want to sacrifice the quality of the video playback to satisfy this use case.
... It sounds like the comment is valid, I don't believe that current browsers can satisfy it.
... I think we need some additional sources to say this use case is important for content authors.

Chris: What we have now is really pointing towards a newly defined DataCue API,
... and the ability to surface in-band events of particular types, which may be different depending on the media format.
... The timing guarantees around that may be somewhat tightened up based on the existing definitions around "time marches on".
... But I think there may be a bigger thing to be explored, which is around this more integrated video and graphics pipeline.
... This is a scope issue for our task force, are we looking at all of this together in one piece of work,
... or are we OK with focusing on a TextTrack like mechanism for in-band and out-of-band events,
... with some consideration of timing of event extraction and event propagation to the application?
... The integrated rendering case is a much bigger topic than we've looked at so far,
... and may be something we would want to take back to the IG to get guidance on.
... I'm responding to what we've seen in the issue 4, frame accuracy seeking.
... It seems to have generated some interest from people, some are outside the IG,
... people with use cases they're trying to achieve that may require more precise control over timing and rendering.
... My suggestion would be to go back to the IG and follow up there.

Giri: We'll put this on the agenda for the next IG meeting.
... There is an aspect to this which is frame accurate handling of events, which will be difficult at 60 FPS.

Chris: Going back to your PR 25, is there anything we want to say in addition?
... It currently says that the propagation should be considerably less than 2 seconds. Should we try to break that down?

Giri: Yes, I can try to characterize it from a client side perspective. You might be able to say from a content originator perspective.
... Particularly for live content, this becomes a tricky problem.
... we can take the worst case, where you see the ad insertion cue when it's first ingested, before it's sent over some transport, cable, over the air, or internet.
... From there, we can take out the expected transport delays and see what the budget is for client side processing.

Chris: I will see what I can find out.

Giri: I worry some of the browser vendors may come back and say that they can't make it work.
... The SCTE also gives other insertion points that account for different kinds of implementation.
... These are just examples, we may have some negotiation in WICG with browser vendors.

Chris: I have a couple of pull requests open. The first one is about restructuring the use cases.

PR 23

Chris: This is open to discussion, if you think it's useful. I put each use case as its own section,
... as opposed to the categories we're using: synchronised events, synchronised rendering of web resources, and embedded web content inside media.
... Sometimes what you have is an event carried in the media, which triggers fetching a resource, which is then rendered. There's a combination of actions that happens.
... I wanted to describe each use case as a whole. I'd like feedback on whether this a good way to go.
... We didn't say much about rendering of embedded content at the moment, e.g., in 3.3.
... It's not clear that these use cases are about embedding inside the media container.
... If we structure it this way, I'd like to describe the use cases for embedded media here.
... The distinction between these different use cases was less clear, so I felt reorganising may help.
... With those three cases (social media, banner ads, accessibility assets), do we expect the resources to be retrieved over the Web, or carried inside the media container?

Giri: From what I understand of the MPEG work, even if it's carried in the container, it could still be requested over the internet.
... It may just be a trigger to the application to fetch a resource at a particular time.
... The advantage of putting it in the media container is that it allows direct rendering by a media player without an application, if the track is authored in the right way.
... This is something we've found at ATSC, like with HbbTV as well, when the user tunes to a channel there's an associated application that can handle the interactivity.
... We can also remove that from the document for now, as the standardisation effort is in progress.

Chris: Looking at ATSC as an example, you have the two media players. Does our document need to target the native player, or are we talking about the interaction between an application level player and the user agent?

Giri: MPEG has two models. One where the media player handles all interactivity as part of the media container.
... The other is more of a metadata cue model, where if an application is present, the media player propagates the events to the application, as we're envisioning with DataCue.

Chris: My overall feeling with section 3 is that the use cases could use some clarification.
... I'll revisit that and see if I can do it differently.

Giri: That's fine, can you try to include it before the next IG meeting, ready for the snapshot?

Chris: Yes, I agree.
... The next pull request is to add details of the Webkit DataCue that Eric posted in WICG.
... It's an extension of the HTML5 DataCue, more flexible. I just added it to the gap analysis section,
... to capture what's presently implemented.
... The SCTE-35 and WebKit pull requests I feel we can merge.
... I'd like to ask your feedback on section 6, Recommendations.
... What I've tried to do is summarise the requirements that an API should support.
... It may be a bit too emsg specific.
... [Describes requirement for subscribing to events by id and value, as in HbbTV]
... Should there be an opt-in mechanism from the application side?
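One possible shape for the opt-in subscription Chris describes, keyed by event scheme id and value as in HbbTV's stream event listeners, could look like the sketch below. The class and method names are hypothetical, not the WebKit DataCue interface or any agreed WICG design:

```javascript
// Hypothetical sketch of app-side opt-in to in-band events, keyed by
// (schemeIdUri, value) as for DASH emsg / HbbTV stream events.
// All names and shapes here are illustrative assumptions.
class EventSubscriptions {
  constructor() { this.handlers = new Map(); }
  key(schemeIdUri, value) { return `${schemeIdUri}|${value}`; }
  subscribe(schemeIdUri, value, handler) {
    const k = this.key(schemeIdUri, value);
    if (!this.handlers.has(k)) this.handlers.set(k, []);
    this.handlers.get(k).push(handler);
  }
  // Called by the player when an in-band event is parsed; events the
  // application has not subscribed to are simply dropped.
  dispatch(event) {
    const k = this.key(event.schemeIdUri, event.value);
    for (const h of this.handlers.get(k) || []) h(event);
  }
}
```

The point of the opt-in is that the user agent only needs to surface event streams the application has asked for, rather than parsing and propagating everything.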

Giri: We want to have specific recommendations to WICG. Since this is a living document, we can add new recommendations.
... I'm not so concerned it's emsg centric. It's recommendations to WICG to help structure the work accordingly.

Chris: If it is too specific, I may want to change that, the first few items in that list, subscribing and unsubscribing.
... The other recommendations are for in-band and out-of-band support, the DAInty mode 1 and mode 2 triggering.
... Did we cover all the topics from the agenda?


Giri: Recommendations to WICG and next steps. We had a meeting at TPAC with WICG.

Chris: Yes. There was agreement at that meeting that we'd met the necessary bar for starting an incubation.
... We have at least Apple interested in working on it. I contacted the WICG co-chair, who was happy for us to go ahead.
... We just need a WICG repo, I'll follow that up.
... What I'd like to do is figure out who among us is going to work on the spec development on the WICG side.
... I can invite people from the browser companies.
... Once we have a repo, we can create an initial specification template, then start work on the details.

Rob: In the breakout session I chaired, we agreed that WICG is the place we should be doing this.
... If we have a repo that will encourage people to participate, and we can get them to join up and contribute.

Chris: I agree. We've more or less reached the limit of what we can do as an IG,
... aside from publishing the document. We need to get this work happening on the WICG side as soon as we can really.

Rob: I have some contacts at Mozilla and Chrome I can contact.

Chris: My plan is to follow up with the WICG co-chairs, as only they can create the repo for us.
... Once we have that, we can contact people from all the browser companies.

Rob: It helps that Apple are already involved.

Chris: I've had some offline discussions with MS as well,
... as it's a general media industry need.

Giri: Let's go ahead and merge the two PRs discussed today.

Chris: OK, I will merge those two PRs, and think about the use cases section.

Rob: I can see what you're trying to do. I wonder if you should summarise the main points at the start, then follow with the example use cases.

Chris: The existing section headings could become bullet points,
... then a section per use case,
... describing the detail.

Rob: I did a similar thing in WebVMT, describing the use cases in detail, then a note at the end listing the benefits as bullets.
... It picks out the detail, which may otherwise be lost when reading.

Chris: I'll take a look, it could be a good model to follow.

<RobSmith> https://w3c.github.io/sdw/proposals/geotagging/webvmt/

Chris: I have a few actions, and I hope to have news on WICG soon.


Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.152 (CVS log)
$Date: 2018/11/20 17:00:05 $