<cyril> scribe: Cyril
<scribe> meeting: DataCue and "Time marches on" in HTML
chris: the topics to discuss are the DataCue API and the "Time marches on" algorithm
chris: if anybody is not familiar
    so far
    ... our goal is we want to introduce native user agent
    support
    ... for DASH events
    ... as part of support for MPEG CMAF content
    ... alongside that, if the UA has native support for DASH
    playback
    ... we would like to support out of band MPD events
    ... HbbTV is an example of a player that has native DASH
    support
    ... I'd like to discuss implementer interest for other metadata
    cue formats
    ... for example Safari has support for ID3
    ... we also want API support for application-generated timed
    metadata cues
    ... when generated by a player
    ... the existing approach is to use the VTT cue
    ... either inline or as a reference in the VTTCue object
    ... having a more convenient DataCue API that lets us store data in
    the preferred format would be better
    ... our goals are: sync arbitrary data with video
    ... e.g. dashcams sensor data
    ... there is a lot of interest from the Open Geo
    Consortium
    ... the applicability is broader than for the M&E IG
    ... [shows current support for DataCue API]
    ... basically no support so far in Chrome or Firefox
    ... some support in Safari with an extended API
    ... [showing the data structure for emsg]
    ... there are 2 versions that differ wrt the timing
    ... there would need to be a mapping between the emsg timing and the
    cue timing
    ... in v0 the timing is relative
    ... the data is a byte array, so you need a schema to identify
    the data
    ... DASH-IF is working in parallel around specifying a delivery
    and processing model for DASH events
    ... they are considering more types of players, not web
    platform only
    ... one of the requirements they have identified
    ... is that in order for an application to get prepared for
    presenting a cue, e.g. a video overlay
    ... that may require fetching other resources
    ... they signal the event to the application ahead of
    time
    ... to be able to render at the appropriate time
    ... so we have 2 events: onreceive and onstart
    ... I have a number of questions:
    ... it relates to the early discussion around this, should in
    band events be exposed as a byte array
    ... or should they be exposed as objects
    ... the second approach makes it easier for app devs
    ... this may be desirable for cues that are commonly
    used
    ... for example within DASH players
    ... the emsg can be used for application specific events
    ... and we don't need support from browsers for those
    ... there is a question of how we identify inband tracks
    ... there are various fields
    ... all of them seem to enable identifying the kind of
    metadata
    ... it is not clear to me reading the spec and comparing
    implementation
    ... what the level of support is
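
The mapping between emsg timing and cue timing mentioned above could look roughly like the sketch below. It assumes the emsg box has already been parsed into a plain object; the `EmsgTiming` interface, the `emsgToCueTimes` function and the `segmentStartSeconds` parameter are hypothetical names, not part of any proposed API. It treats the version 0 time as a delta from the start of the containing segment and the version 1 time as an absolute value on the media timeline, and it ignores any period or presentation time offsets.

```typescript
// Hypothetical shape for an already-parsed emsg box (illustration only).
interface EmsgTiming {
  version: 0 | 1;
  timescale: number;              // ticks per second
  presentationTimeDelta?: number; // version 0: ticks after the segment start
  presentationTime?: number;      // version 1: ticks on the media timeline
  eventDuration: number;          // ticks; 0xFFFFFFFF signals an unknown duration
}

interface CueTimes {
  startTime: number; // seconds
  endTime: number;   // seconds, Infinity when the duration is unknown
}

const UNKNOWN_DURATION = 0xffffffff;

// Convert emsg timing fields to cue start/end times in seconds.
// segmentStartSeconds is only used for version 0 (relative timing).
function emsgToCueTimes(emsg: EmsgTiming, segmentStartSeconds: number): CueTimes {
  if (emsg.timescale <= 0) {
    throw new RangeError("timescale must be positive");
  }
  const startTime =
    emsg.version === 0
      ? segmentStartSeconds + (emsg.presentationTimeDelta ?? 0) / emsg.timescale
      : (emsg.presentationTime ?? 0) / emsg.timescale;
  const endTime =
    emsg.eventDuration === UNKNOWN_DURATION
      ? Number.POSITIVE_INFINITY
      : startTime + emsg.eventDuration / emsg.timescale;
  return { startTime, endTime };
}
```

For example, a version 0 event with a delta of 500 ticks at a timescale of 1000, in a segment starting at 10 seconds, would map to a cue starting at 10.5 seconds.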
chris_c: on your first
    bullet
    ... is it a reasonable behavior to fallback to the opaque array
    buffer when you don't understand the type?
chris: I'd like to understand
    what common subset can be supported?
    ... but the fallback could be a good approach if we have an API
    for that
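
One way to picture the fallback discussed here is a cue that carries a type identifier, an optional structured value and optional raw bytes. The sketch below is only an illustration under that assumption; `MetadataCueLike` and `describeCue` are hypothetical names and do not describe any browser's actual API.

```typescript
// Hypothetical cue shape: `type` identifies the metadata format, `value` is a
// parsed object when the UA understands the format, `data` holds raw bytes.
interface MetadataCueLike {
  type: string;
  value?: unknown;
  data?: ArrayBuffer;
}

type Handled =
  | { kind: "structured"; type: string; value: unknown }
  | { kind: "opaque"; type: string; byteLength: number }
  | { kind: "empty"; type: string };

// Prefer the structured value; fall back to the opaque bytes otherwise.
function describeCue(cue: MetadataCueLike): Handled {
  if (cue.value !== undefined) {
    return { kind: "structured", type: cue.type, value: cue.value };
  }
  if (cue.data !== undefined) {
    return { kind: "opaque", type: cue.type, byteLength: cue.data.byteLength };
  }
  return { kind: "empty", type: cue.type };
}
```

With this shape an application still has to handle the opaque branch for formats the UA does not parse, which is the cost raised in the discussion that follows.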
francois: you'll end up with 2
    representations for the same data
    ... so in the end the devs have to handle the opaque case
    ... so in this case we shouldn't bother about the structured
    object
eric: I disagree strongly
    ... there are metadata formats that are very difficult for JS
    to parse
    ... correctly
    ... so that's why I added the implementation to WebKit
    ... because we had lots of requests to support datacues
    ... and just supporting arrays is not doing web authors a
    service
francois: it is better to have a system that works across browsers
eric: if we decide that
    structured data is important
    ... we need to agree on a set of types
    ... that we want to support
    ... there will always be custom metadata
    ... people can put anything in a container format
    ... and they want to have access to them
    ... it does not make sense to have support for limited set
chris: I can imagine a world
    where an impl wants to provide access to ID3 and another
    not
    ... it's the responsibility of the dev to know that
eric: I agree that we should not end up in this situation
mounir: is there benefit in trying to avoid that?
eric: I think so
    ... we don't have to end up there
    ... if we can come up with a way to describe the cue
    ... and require that a browser that uses that identifier provides
    structured data
mounir: there could be security issues and different parsing if the browsers do it themselves
richard: if the parsing is within the browser and uses WebAssembly, how could there be a security issue?
mounir: if you use webassembly
    that's ok
    ... we try to avoid doing parsing in C++
chris_c: I'm trying to understand
    what the fallback would look like
    ... maybe the ID3 would not be contentious
chris: having an API structure that lets the application introspect the cue
gkatsev: ID3 in HLS, Safari parses it, but in other browsers you have to do it yourself
nigel: is the data in the array buffer a registered type?
ericc: no the data has no indication
cyril: no magic number?
ericc: no
chris: the emsg also indicates the scheme id
ericc: with the current data cue
    api, the array buffer would have that whole thing from start to
    end
    ... and you'd have to sniff the bits to figure out if it's an
    emsg or id3
    ... and it's going to have to parse it to determine if it's an
    emsg or not
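
A minimal sketch of what such sniffing might look like in script, assuming the array buffer holds the complete payload from its first byte. It relies on two format facts: an ID3v2 tag begins with the ASCII characters "ID3", and an ISO BMFF box begins with a 4-byte size followed by a 4-byte type such as "emsg". The function names are hypothetical, and a real application would still need a full parser after this check.

```typescript
type SniffResult = "id3" | "emsg" | "unknown";

// Return true when the bytes at `offset` match the ASCII string `text`.
function hasAsciiAt(bytes: Uint8Array, offset: number, text: string): boolean {
  if (bytes.length < offset + text.length) {
    return false;
  }
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) {
      return false;
    }
  }
  return true;
}

// Guess the payload format from its leading bytes only.
function sniffMetadata(buffer: ArrayBuffer): SniffResult {
  const bytes = new Uint8Array(buffer);
  if (hasAsciiAt(bytes, 0, "ID3")) {
    return "id3"; // ID3v2 header identifier
  }
  if (hasAsciiAt(bytes, 4, "emsg")) {
    return "emsg"; // box type follows the 4-byte box size
  }
  return "unknown";
}
```

Anything that is neither format comes back as "unknown", which is where an identifier on the cue itself would help.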
nigel: imagine that we expose
    this data through MSE
    ... the bytestream would be identifiable
ericc: the UA, the thing that parses
    the raw media container, does have a signal about what kind of
    metadata it is
    ... if the data cue had a scheme and identifier for the type of
    metadata and an array buffer
    ... then in theory it could know how to parse it
    ... the reason I decided that was not practical for us
    ... is that there are metadata values that are extremely
    complex to parse
    ... like HLS has a pList
    ... writing a parser for a binary pList in JS
    ... is not easy
    ... practically speaking, WebKit does not have access to the raw
    pList
    ... the low level does the parsing
    ... and we get it as a native object
    ... a representation of the data
    ... which WebKit converts into a JS object attached to the
    datacue
greg: most of the conversation is
    about inband
    ... I can see DataCue being useful for out of band use cases
ericc: that is a part of
    this
    ... from script you can make a new data cue with
    start/end
    ... and attach anything
chris: the explainer is
    incomplete and in a very early stage
    ... it does not explain everything
ericc: any solution we come up with has to support cues from script
<nigel> scribe: nigel
cyril: Comment on
    synchronisation
    ... The payload of the metadata may trigger behaviour with
    unbounded complexity
    ... so that's why you probably need to process it in advance
    and to know in advance the practical bound.
    ... To me this is similar to how video content is
    processed.
    ... We don't have two timestamps, one for receiving, the other
    for presenting.
    ... The implementation has to know when to preprocess
    things.
    ... So I'm not convinced that having two events is a good
    approach.
ericc: I agree and am strongly opposed to having two.
<cyril> eric: I am strongly opposed to having 2 timestamps
<scribe> scribe: cyril
UNKNOWN_SPEAKER: in addition you
    cannot predict how long it is going to take in the app to do
    the processing
    ... if what you are suggesting is that a cue should be delivered
    as soon as it is available
    ... that's going to vary widely
    ... depending on where the parsing happens
francois: perhaps it's useful to look at why
<inserted> scribe: nigel
cyril: I agree, 3 categories of
    event:
    ... 1. Overlay, maybe after js processing.
    ... 2. Network impact, like making requests or sending
    messages
    ... 3. Modifying the DOM
    ... The 3rd category - you should be able to pre-render in
    advance and keep your frames until they're ready
    ... The other two I'm not sure about yet.
<scribe> scribe: cyril
chris: I'd like to move on to the next part, synchronization
chris: web apps use the
    oncuechange
    ... triggered by the time marches on
    ... and the spec says there is an upper limit
    ... but in practice some implementations do follow the upper
    limit
    ... this means that it is possible for an application to miss a
    short duration cue entirely
    ... the cuechange event is fired, the app inspects the active
    cues list
    ... and acts
    ... it's quite possible that in between the cues triggered there
    are cues that the app doesn't see
    ... there is a bug report raised by Jon Piesing, HbbTV
    ... and the recommendation is not to create short cues
    ... but it's worse than that
    ... you have to take execution time into account
    ... use of oncuechange is problematic for handling cues
    ... the good news is that if you want to avoid missing
    cues
    ... you can attach event handlers to onenter and onexit
nigel: but if it was missed,
    enter/exit are triggered at the same time
    ... and if there are visual changes they will be missed
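
The effect described here can be illustrated with a simplified model, which is not the spec algorithm itself. It assumes playback starts at 0 with no seeking, and that the cue processing runs only at the listed times. A cue is "seen" if it is active during some run, "missed" if it both started and ended between two consecutive runs (so its enter and exit would be reported together), and "pending" if playback has not reached it. All names are hypothetical.

```typescript
interface Cue {
  id: string;
  startTime: number; // seconds
  endTime: number;   // seconds, exclusive
}

type CueFate = "seen" | "missed" | "pending";

// Classify one cue given the times at which cue processing runs.
function classifyCue(cue: Cue, runTimes: number[]): CueFate {
  let last = 0; // assumed playback start
  for (const now of runTimes) {
    if (cue.startTime <= now && now < cue.endTime) {
      return "seen"; // would appear in the active cues list on this run
    }
    if (cue.startTime >= last && cue.endTime <= now) {
      return "missed"; // began and ended entirely between two runs
    }
    last = now;
  }
  return "pending";
}

// Classify every cue, keyed by cue id.
function classifyAll(cues: Cue[], runTimes: number[]): Record<string, CueFate> {
  const result: Record<string, CueFate> = {};
  for (const cue of cues) {
    result[cue.id] = classifyCue(cue, runTimes);
  }
  return result;
}
```

With runs every 250ms, a cue spanning 0.30s to 0.40s falls between the runs at 0.25s and 0.50s, so an application that only inspects the active cues list on cuechange never observes it.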
foolip: the time marches on steps
    are not defined to run every 250ms
    ... it's meant to be continuous
    ... only the events are triggered every 250ms
chris: that's not my reading of the spec
foolip: the problem is that implementations are not following the spec because that's easier to do
ericc: if you run a test to look
    at the variance
    ... you'll see 10-20ms
    ... because we don't use the time marches on
    ... but look at the
    cues
    ... this is a quality of implementation issue
nigel: this is a spec
    question
    ... [reading the spec]
foolip: it's just for the
    timeupdate event
    ... not for the cue events
    ... [explaining how it worked in Presto]
nigel: chrome does it this way
foolip: not because the spec is wrong
nigel: but the spec allows it
chris: we need a follow-up to understand that
foolip: maybe open a bug in chromium
chris_n: the spec does not mandate 250ms
ericc: that is so that we are not firing
    timeupdate events too often, to not overload the system
    ... we could, but that would cause other issues
pierre: the spec guarantees that
    every single cue will be fired
    ... regardless of the algorithm ?
gkatsev: no some have been missed
scott: the text says some cues can be skipped
ericc: cues can be dropped
pal: if I have a cue that has a duration of d, is there a requirement that the difference between onenter and onexit is close to d?
ericc: no
pal: you could get them simultaneously
ericc: but if there is onenter/onexit it should be fired
foolip: that's a good idea
chris: another related issue
    is
    ... we want a more accurate firing of these events
    ... driven by the need to align captions with shots or scene
    changes in the video
    ... and we came up with a number of 20ms
    ... that gives a chance to the application
nigel: you want to replace the number 250 with 20?
chris: no
richard: the shorter the time limit gets, the power you're going to use goes up exponentially
foolip: the reason the scheduling is poor is not for battery saving
ericc: it was because it was
    simpler to write
    ... it's not possible to guarantee any kind of specific
    latency
    ... because the browser is under the same constraints as
    anything else
nigel: that depends on the frame rate
ericc: cues are not tied to frames
foolip: but frames have time stamps
ericc: in my system the frames are rendered by a different subsystem
foolip: there is a quality of implementation issue
ericc: no matter what wording we
    put in the spec
    ... it won't help you
    ... you have to file bugs to get what you need
chris_n: I wouldn't close an issue because it is ok with the spec
chris: I see some inconsistencies
    between implementations
    ... when the application moves cues around in the
    timeline
    ... if you change time of the cues
    ... and if you seek the media
    ... and seek over some cues
ericc: have you filed bugs?
chris: not yet
chris_n: a spec update is not necessary but it may be useful to avoid others making the same mistake
ericc: we should not wait for
    TPAC to file bugs
    ... if we want to have the issue fixed quickly
nigel: it's hard to file a bug with the given spec
ericc: if you file a bug with an
    example and it is not good enough even if it matches the spec
    we should fix it
    ... we could get the spec improved
foolip: all specs are wrong every other paragraph!
richard: sometimes I've asked to fix an implementation but been told that the impl is within the spec
foolip: it happens that implementers consider the spec as untouchable but you should escalate
chris: [showing a waveform
    library demo]
    ... I'm using VTTCues
    ... adjusting the times on cues
    ... it's not the only use case
    ... [showing a table of what events get fired in practice]
ericc: you should file a bug
chris: the next stage is the
    meeting on Friday, joint Media WG and Timed Text
    ... we should figure out how to use that time productively
<nigel> Blink bug
chris: it seems filing bugs is the recommendation
Present: romain, gkatsev, pal, chcunningham, ericc, jkamata, stepsteg, Nigel
Scribes: Cyril, nigel