<scribe> scribenick: kaz
https://lists.w3.org/Archives/Public/public-web-and-tv/2019Jun/0026.html
Chris: There's a lot to share today
and get your input on.
... We have some input from DASH-IF to review.
... I would like to present an initial DataCue API proposal I've
been working on.
... Review the DataCue explainer and start to capture a list of
open questions for the API.
... Finally, next steps.
<Zakim> nigel, you wanted to ask if looking at design proposals is in scope of this IG?
Nigel: Can we discuss design proposals in an IG call?
Chris: I should have said at the
start, as mentioned in the agenda email, that we're running this
call under the WICG terms.
... We're pretty much settled on the requirements, so I would
like to move forward to a design proposal.
Nigel: So anybody who would like to make a contribution should join the WICG.
Chris: Yes
... AOB?
Rob: One item, I'll be presenting at OGC
https://www.w3.org/2019/05/20-me-minutes.html#ActionSummary
Chris: A few things from the last
call
... I added detail on events supported in HLS to the IG Note.
There's an open PR.
Chris: The next item was around
terminology, some disambiguation
... This is done, there's a PR open, some discussion on the
changes.
Chris: Nigel, there was one question
from you regarding payload type signalling.
... I think the document captures that, unless I've missed
something? Please let me know.
... I will merge in the next few days, if no more feedback.
Nigel: I'll check
Chris: Next, Kaz to set up GH repo so
that Chris can add changes.
... And Nigel to draft text to add to section 3.4 use cases around
high rate captions.
Nigel: I've not done that yet.
Chris: And Rob to raise an issue in the DataCue repo for unknown / to-be-determined cue end times.
Rob: I raised 2 issues, for cue end
time at end of media, and for updating cue times.
... Updating cue times may be a non-goal, if it's already covered
by the existing API.
https://github.com/WICG/datacue/issues/8
https://github.com/WICG/datacue/issues/9
Chris: I've not done much work on the explainer document recently, focused on producing an initial API proposal
Chris: (Explains diagram in Figure
1)
... Some of the events are DASH-specific, so go to the DASH Client
Control module, rather than the Application.
... The architecture here is somewhat generic. In HTML we have some of
these components, e.g., the DASH Access API could be thought of in
part as MSE.
... It's a different architecture for web-based players, where handling
the MPD is done at the application level (or by the UA for
HbbTV).
... There are two dispatch modes, called on-receive and
on-start. In the IG's media timed events document, we call them
type 1 dispatch and type 2 dispatch, based on our early
conversations with DASH-IF.
... (Figure 2) Here, the media timeline would be what's exposed by
an HTML media element.
... (Figure 3) There's some detail on the structure of emsg events:
timestamps, schema ID, message payload as a byte array.
... And then discussion around triggering timing.
... This document is quite useful, I like the diagrams that capture
the different timing models visually.
... (Figure 6). MPD events timing model with on-receive and
on-start triggering.
... The document doesn't present IDL for an API.
... (Figure 12) There's a mapping of how these values are obtained
for MPD and emsg.
... This is useful input, we have two groups working in parallel,
some communication to make sure we're reasonably aligned.
... I think what I'll present next is compatible with this.
... Any feedback or questions?
Nigel: One question about
synchronization. Does this document describe the exact
mechanism?
... On the web, we have the event loop, so not sure how this fits
with the model we have in HTML with TextTrackCue.
Chris: It's a good question, but I
haven't read this in enough detail to know.
Nigel: OK, let's come back to that.
https://docs.google.com/presentation/d/1FEfVVQ1Rf9eABEZ5lPhdnuDum3eUsFrD4MTJR0ZeVS0/edit
Chris: My proposal is that we use the
existing TextTrack mechanism as much as possible, rather than
defining a new set of APIs that are timed metadata specific.
... TextTrack gives us a lot of the capabilities we need, and we can
extend it to add the parts that are missing.
... One of the things missing is the on-receive model; we have
on-start.
... I have a proposal that simplifies adding event handlers to
cues.
... I read the TAG's API design guidelines about feature
detection.
... I also based the API proposal on the WebKit DataCue and aspects
of HbbTV's emsg support.
... I'm also proposing no change to the track element, i.e., not
proposing that you could author a track element that references
some kind of timed metadata track, e.g., a WebVMT document, and
expect the UA to do something with it natively; I'd expect the
application to handle that.
... That would give us flexibility in terms of the data formats
we use, because we can handle them at the application level,
... rather than expecting the UA to handle every different kind of
timed metadata format, or attempting to standardise on a particular
format.
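The application-handled approach Chris describes could be sketched roughly as follows; the parseTimedMetadata function, its toy "start end payload" line format, and the track object are illustrative stand-ins, not a real parser or the DOM API.

```javascript
// Sketch only: the application, not the UA, parses a timed metadata
// document and surfaces it through a metadata text track. The toy
// line format and these stand-in objects are assumptions for
// illustration.
function parseTimedMetadata(text) {
  return text.trim().split('\n').map((line) => {
    const [start, end, ...payload] = line.split(' ');
    return { start: Number(start), end: Number(end), payload: payload.join(' ') };
  });
}

// Stand-in for video.addTextTrack('metadata'):
function addMetadataTrack() {
  const cues = [];
  return { kind: 'metadata', cues, addCue: (cue) => cues.push(cue) };
}

const track = addMetadataTrack();
for (const entry of parseTimedMetadata('0 5 hello\n5 10 world')) {
  track.addCue({ startTime: entry.start, endTime: entry.end, value: entry.payload });
}
console.log(track.cues.length); // 2
```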
Rob: I would agree with that. The existing API includes a way of allowing metadata, rather than prescribing it, so I think that's sufficient.
Chris: Thanks. I looked at the HbbTV
API, which has a native DASH player.
... It says that there'll be a TextTrack for each event type
signalled in the MPD.
... For emsg, it will provide a TextTrack for each event type in
the media representation currently being played.
... On any change to the set of event streams, e.g., on switching
representations or on MPD update, there are addtrack / removetrack
events.
... My proposal is different to that: because our requirements
analysis said we want events to be surfaced only on application
request, I'm proposing an opt-in model.
... It would be an application responsibility to know what kind of
events are present.
... There isn't a discovery mechanism to know what kinds of events
are present in the content.
... I think that's consistent with the requirements we have, but
if there is a need for discovery, we'd have to go back to our
requirements to capture that.
Nigel: How would the application know? Would this be a track element inside a video element, with an identifier that tells the UA to expect a certain kind of event?
Chris: Yes, exactly. The application
would have to know enough about the content to know what kinds of
events would be present.
... (Shows a big diagram starting with "HTMLMediaElement" on the
upper/left side).
... I mapped out the TextTrack API. Changes to support DataCue and
in-band events are shown in bold text.
... First question was: How would we let an application register
its interest in receiving some kind of in-band event?
... I looked at the existing addTextTrack method, which takes a
kind, and optional label and language parameters.
... That in itself doesn't allow us to identify an in-band event.
Looking at the TextTrack interface, it has an
inBandMetadataTrackDispatchType, which seems to be the existing
identifier.
Nigel: Is it not the id?
Chris: That's the unique identifier
for the text track, it doesn't tell you anything about the content
of the text track.
... My first idea was overloading addTextTrack and adding a
TextTrackConfig dictionary, where you can set the
inBandMetadataTrackDispatchType.
... That seemed OK, and compatible with the existing API: the UA
can check the received parameter type to see how to handle it.
... Then I remembered the TAG guidance on feature detection. The
only way you could detect this is to call it and see if it works or
throws an error.
... A different way is to add a new method, e.g.,
addInBandMetadataTrack.
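As a sketch of why a distinct method is easier to feature-detect than an overloaded addTextTrack: the addInBandMetadataTrack name is the proposal under discussion, not a shipped API, and a plain object stands in for a media element here.

```javascript
// Simulated media element; addInBandMetadataTrack is the proposed
// method, not a shipped API.
const mediaElement = {
  addTextTrack(kind, label, language) {
    return { kind, label, language };
  },
  addInBandMetadataTrack(dispatchType) {
    return { kind: 'metadata', inBandMetadataTrackDispatchType: dispatchType };
  },
};

// A new method supports a simple presence check, per the TAG guidance
// on feature detection; an overloaded addTextTrack would have to be
// called to discover whether the UA understands the extra argument.
function supportsInBandMetadata(el) {
  return typeof el.addInBandMetadataTrack === 'function';
}

let track = null;
if (supportsInBandMetadata(mediaElement)) {
  track = mediaElement.addInBandMetadataTrack('https://example.com/scheme 1');
}
console.log(track.inBandMetadataTrackDispatchType);
```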
... TextTrackList doesn't change. In TextTrack, I've added
oncuereceived, which is intended to support the on-receive dispatch
mode that DASH-IF are interested in.
... This would be called when a timed metadata cue is extracted
from the media content, as opposed to its time on the media
timeline.
... I added oncueenter and oncueexit events for the on-start
dispatch mode.
Nigel: Isn't duplicating the enter and exit handlers at another level in the object hierarchy against the current design?
Chris: One reason for this is so that
you don't have to inspect each individual cue to attach
handlers.
... If you're creating cues in the web application, you can do
that, but there's no opportunity to do that for cues generated by
the UA.
... The only way you can act on those cues is by using oncuechange,
which leads to the sync problem, because you have to inspect the
activeCues list, which may not reflect what's currently
active.
Nigel: Why not have an oncueadded or
even an ontexttrackchanged event handler, so you can iterate
through the TextTrackCues that have been added or removed and add
the handlers to them?
... The problem you started with is different to the solution
you've ended up with.
Chris: That's interesting, you could actually solve this with oncuereceived, because that gives you the cue when its parsed from the content, and you wouldn't need these extra oncueenter / oncueexit handlers.
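The pattern Chris suggests could look roughly like this; TextTrack and TextTrackCue behavior is simulated, and oncuereceived is a proposed name, not a shipped API.

```javascript
// Sketch: a single oncuereceived handler attaches per-cue enter/exit
// handlers as each cue is parsed out of the media container, so no
// track-level oncueenter/oncueexit is needed.
class FakeCue {
  constructor(startTime, endTime) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.onenter = null; // fired when playback reaches startTime
    this.onexit = null;  // fired when playback passes endTime
  }
}

const track = {
  oncuereceived: null,
  // The UA would call this when a cue is extracted from the content,
  // independent of its position on the media timeline (on-receive).
  receive(cue) {
    if (this.oncuereceived) this.oncuereceived({ cue });
  },
};

const log = [];
track.oncuereceived = ({ cue }) => {
  log.push('received');
  cue.onenter = () => log.push('enter');
  cue.onexit = () => log.push('exit');
};

const cue = new FakeCue(5, 10);
track.receive(cue); // on-receive dispatch, at parse time
cue.onenter();      // later, on-start dispatch at t=5
cue.onexit();       // at t=10
console.log(log.join(',')); // received,enter,exit
```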
Nigel: That feels cleaner.
Rob: My understanding of oncuechange
is that it has two sub-types, onenter and onexit, and is onreceive
not the one that happens before all of them?
... Would that not be more consistent with the current design, and
also give what you want?
... Need to double check, but I think that's what oncuechange
gives.
Chris: (Figure 10 in the DASH-IF doc). oncuereceived corresponds to the Arrival Time (AT) here, and the cue's onenter corresponds to the Presentation/Start Time (ST).
Rob: So this diagram shows it clearly
that onreceive happens before onstart and onend.
... Are those not part of oncuechange?
Chris: I'd need to go back to the HTML spec to study oncuechange.
https://html.spec.whatwg.org/multipage/media.html#event-media-cuechange
Chris: How do you know which cue has entered or exited? You need to use the activeCues list.
Rob: The cue should be passed with the event.
Nigel: The trouble is having multiple cues that could have been made active or not active. I wouldn't expect oncuechange to be called when a cue is received, because being received doesn't make it active or not active.
Chris: And doing that may give a compatibility issue with existing usage of oncuechange.
Rob: My concern was having
duplication of onenter and onexit, in addition to
oncuechange.
... I'm wondering if we need to discuss further.
Chris: I'll update this based on the
feedback; oncuereceived may be all that's needed.
... On TextTrackCue, we change endTime to allow it to be set to
Infinity, which I think requires unrestricted double, although that
allows you to set other strange values (e.g., NaN).
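A sketch of that endTime change, using a simulated cue rather than the real TextTrackCue; the setter illustrates the NaN value that unrestricted double would otherwise admit.

```javascript
// Simulated cue whose endTime behaves like an unrestricted double:
// Infinity means "end time not yet known", with a guard against NaN.
class MetadataCue {
  constructor(startTime, endTime) {
    this.startTime = startTime;
    this.endTime = endTime;
  }
  set endTime(value) {
    if (Number.isNaN(value)) {
      throw new TypeError('endTime must not be NaN');
    }
    this._endTime = value;
  }
  get endTime() {
    return this._endTime;
  }
}

const cue = new MetadataCue(12.5, Infinity); // end unknown at creation
console.log(cue.endTime === Infinity); // true
cue.endTime = 30; // later, once the real end time is known
```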
... And finally, on the DataCue interface as specified in HTML5,
there's an ArrayBuffer with the data payload.
... We're proposing to use the WebKit approach, which is to have a
value and a type, and the value could be an ArrayBuffer or an
object or a string or whatever,
... either depending on what you've given it, for application
created events, or, for UA created events, it would be up to the UA
to decide the appropriate representation.
... And we'd want to specify how that's done for emsg events, for
example.
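Roughly, the WebKit-style shape being described looks like this; the class and the emsg mapping shown are assumptions for illustration, not the shipped WebKit interface.

```javascript
// Sketch of a WebKit-style DataCue: a value whose representation
// depends on the source, plus a type string saying how to interpret it.
class DataCueSketch {
  constructor(startTime, endTime, value, type) {
    this.startTime = startTime;
    this.endTime = endTime;
    this.value = value; // ArrayBuffer, object, string, ...
    this.type = type;   // e.g. an emsg scheme_id_uri
  }
}

// Application-created cue with a structured payload:
const appCue = new DataCueSketch(0, 5, { adBreakId: 'abc' }, 'com.example.ad-break');

// A UA-created cue for an emsg box might keep the raw message bytes:
const uaCue = new DataCueSketch(
  10, 12, new Uint8Array([0x01, 0x02]).buffer, 'urn:mpeg:dash:event:2012');

console.log(typeof appCue.value, uaCue.value.byteLength);
```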
... That's a summary of the proposal so far, a few changes to the
existing interfaces rather than a new distinct set of APIs.
... Our requirements map well to the existing structure.
... I then looked in detail at the inBandMetadataTrackDispatchType
in HTML.
... The spec describes how that field is populated for different
kinds of media.
... I'm not so familiar with the details of these formats, but this
is a summary from HTML.
... Not all browsers have support for DataCue or the
inBandMetadataTrackDispatchType field.
... Interestingly, Chrome doesn't expose
inBandMetadataTrackDispatchType.
... I looked at HbbTV, the TextTrack property / emsg binding. You
concatenate the scheme id and value, to identify the event
stream.
... We would need to consider if we can use this, or something
similar.
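The HbbTV-style binding Chris mentions might be sketched as below; the space separator is an assumption for illustration, not a confirmed detail of the HbbTV spec.

```javascript
// Identify a DASH event stream by concatenating the emsg scheme_id_uri
// and value (the separator here is an assumed convention).
function eventStreamId(schemeIdUri, value) {
  return value ? `${schemeIdUri} ${value}` : schemeIdUri;
}

// An application opting in would match tracks whose dispatch type
// equals this identifier:
console.log(eventStreamId('urn:mpeg:dash:event:2012', '1'));
```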
Chris: What I'm proposing to do next
is update the DataCue explainer with the latest text from the IG
note,
... and add example code that illustrates all these features I've
just described,
... and then schedule a call with implementers.
... I had the opportunity to talk with Apple during the Fraunhofer
Media Web Symposium recently,
... they suggested coming up with a strawman proposal and
scheduling a call, so that's what I've shared today.
... I want to make sure we have support from everybody, it will be
interesting to see what in-band events different implementers want
to support.
... Any feedback?
Rob: Thanks, Chris. That looks good, thank you very much!
Chris: Thank YOU!
Rob: I like your approach with
minimum changes, the existing stuff has been thought through.
... Should we follow up about the events?
Chris: Yes, we don't have to wait for the next scheduled call, let's do it separately.
Rob: OK
Chris: I want to schedule the call with implementers as soon as we can, so don't want to wait for the next scheduled TF call on July 15th.
Rob: There is an OGC technical
committee meeting coming up next week.
... I will present WebVMT and some of this DataCue WICG activity,
to show there's more work going on in this area and generate
interest in video and location from the Open Geospatial side.
... It's my first OGC meeting, I've proposed a breakout
session.
... This would cover a wider area, to get some activity going
either there or elsewhere in OGC, bring in different groups.
... They have something similar to WICG, known as the
testbed.
... They're very application focused, from a geospatial point of
view. Interested to get feedback.
... I have created a GitHub issue
<RobSmith> https://github.com/w3c/sdw/issues/1130
Chris: Excellent, thank you, I wish
you well with that.
... I'm away myself next week, so I don't know exactly when we'll
have the call with implementers, but I'll likely follow up week
after next on that.
[adjourned]