W3C

– DRAFT –
WICG DataCue / TextTrackCue meeting

20 March 2025

Attendees

Present
Chris_Needham, Nigel_Megitt_BBC, Rob_Smith
Regrets
-
Chair
Chris
Scribe
cpn

Meeting minutes

Recap of last meeting

Chris: Rob walked us through a proposed inheritance tree for TextTrackCue, VTTCue, DataCue
… All made sense to me
… We ended by discussing whether to inhibit TextTrackCue construction

TextTrackCue constructor

Nigel: I have a proposal, for what may be a bigger change

Rob: I saw that, feels independent of this.

Chris: Can you explain, Nigel?

<nigel> Comment with alternative inheritance diagram

Nigel: [describes the alternative design]
… If all you want is the timing behaviour, you can use TrackCue

Rob: Makes sense. It includes some of the elements from my proposal, with different naming. The issue I have is that the root of the problem is that TextTrack itself is misnamed

Nigel: True, TrackCue isn't consistent with those other uses

Rob: My thought was to acknowledge and accept the existing naming

Chris: I tend to agree with that, Rob

Rob: Could call them MetadataTrack

Nigel: The key thing is that they're cues associated with the timeline of a media track
… This is why I think of TrackCue

Rob: Let's discuss cue differentiation, and prohibiting TextTrackCue construction

Nigel: Is there a use case for preventing construction of TextTrackCue (using current naming)?

Rob: I found some more detail in the HTML spec. The key is to differentiate between different types. DataCue is designed to do that, it has a type and value
… If that existed in HTML spec, we'd be done
… This proposal is deliberately less of a change - even though DataCue itself is very simple
… For cue differentiation, an abstract base class requires an extension that can be used for cue type

Nigel: Where does the need to identify the cue type come from?
… Requiring that you must declare a class isn't something we should impose on everyone

Rob: But you otherwise don't know the type

Nigel: But some use cases don't need a type...

Rob: An example, the existing design is an abstract base class, so it's not a change.
… I considered a use case with no payload, where TextTrackCue itself is instantiated. Is there a use case for that? A video camera monitoring vehicle arrivals in a car park, using cue-enter and cue-exit
… You need to identify individual vehicles
… Requires no payload. With just startTime, endTime, id, we can monitor the vehicles
… There's a family of such use cases. Another is people entering/exiting a venue
… Combine them, and now you have two sorts of cue in the same track

Nigel: Can do it by having different enter and exit handlers.

Rob: True, if you control the cue creation
… Cue lifecycle is in two parts: creation and activation
… In creation, we need to potentially load a cue class, then create a cue instance with times and id. Then add event listeners, optionally. Then add the cue to a track
… In the activation phase, a cue event is triggered by the browser, handled by the web app's event handler
… In an out-of-band use case, the creation steps and the handlers are in the app code. In that case, you don't need to discriminate the types of cue
… If you've created the Track in the app code, it's private and will only have the cues you created
… If the Track is created by the browser from a <track> element, it can be seen by other apps, so they can put their own cues on it
… There's an onchange event associated with the Track, which sees any kind of cue
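The two lifecycle phases Rob describes can be sketched roughly as follows. This is only an illustration: LifecycleCue, LifecycleTrack and advanceTo are hypothetical stand-ins written for this note, not the real TextTrackCue or TextTrack interfaces, and the playback time is supplied by hand rather than by a media element.

```typescript
// Hypothetical stand-ins for the browser's cue and track objects.
// They are NOT the real TextTrackCue / TextTrack interfaces.
type LifecycleHandler = (cue: LifecycleCue) => void;

class LifecycleCue {
  onenter: LifecycleHandler | null = null;
  onexit: LifecycleHandler | null = null;
  constructor(
    readonly id: string,
    readonly startTime: number,
    readonly endTime: number,
  ) {}
}

class LifecycleTrack {
  private cues: LifecycleCue[] = [];
  private active: LifecycleCue[] = [];

  addCue(cue: LifecycleCue): void {
    this.cues.push(cue);
  }

  // Activation phase: a browser would drive this from media playback;
  // in this sketch the caller reports the current time manually.
  advanceTo(time: number): void {
    for (const cue of this.cues) {
      const isActive = this.active.indexOf(cue) !== -1;
      const shouldBeActive = time >= cue.startTime && time < cue.endTime;
      if (shouldBeActive && !isActive) {
        this.active.push(cue);
        cue.onenter?.(cue);
      } else if (!shouldBeActive && isActive) {
        this.active.splice(this.active.indexOf(cue), 1);
        cue.onexit?.(cue);
      }
    }
  }
}

// Creation phase: create a cue with times and id, add listeners, add it to a track.
const lifecycleLog: string[] = [];
const vehicleCue = new LifecycleCue("vehicle-1", 5, 20);
vehicleCue.onenter = (cue) => lifecycleLog.push(`enter ${cue.id}`);
vehicleCue.onexit = (cue) => lifecycleLog.push(`exit ${cue.id}`);

const privateTrack = new LifecycleTrack();
privateTrack.addCue(vehicleCue);

// Activation phase: the cue becomes active at t=10 and inactive again at t=25.
privateTrack.advanceTo(0);
privateTrack.advanceTo(10);
privateTrack.advanceTo(25);
console.log(lifecycleLog);
```

Because the app code here both creates the track and attaches the handlers, nothing in the sketch needs to inspect the type of a cue, which matches the out-of-band case described above.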

Nigel: In that case where you have a <track> element, the browser parses the data and creates the cues on the track
… The ones you make, you control the event handlers. The ones you don't create have their own handlers

Rob: This is the onchange event on the Track

Nigel: So you want to write an onchange handler that can handle the cues you created and the ones you didn't
… Strategy might be to add an attribute, or to add a subclass. Both are valid things to do. So why force everyone to subclass?

Rob: You need to know the type is unique
… If you add an attribute instead, how do you know that the other cues don't also have that attribute?

Nigel: Could be useful to have that, so others can integrate with my code

Rob: Risk of not being able to discriminate
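The two strategies under discussion (subclassing versus adding an attribute) can be compared with a small sketch. All names here (BaseCueSketch, VehicleCueSketch, VisitorCueSketch, cueType, describeCue) are hypothetical and chosen for illustration; the base class is a stand-in, not the real TextTrackCue. The handler accepts any object because, on a shared track, it cannot assume who created a cue.

```typescript
// Hypothetical base cue with no payload; a stand-in, not the real TextTrackCue.
class BaseCueSketch {
  constructor(
    readonly id: string,
    readonly startTime: number,
    readonly endTime: number,
  ) {}
}

// Strategy 1: one subclass per cue type, discriminated with instanceof.
class VehicleCueSketch extends BaseCueSketch {}
class VisitorCueSketch extends BaseCueSketch {}

// Strategy 2: a plain cue carrying an ad hoc attribute (name is an assumption).
interface TaggedCueSketch extends BaseCueSketch {
  cueType?: string;
}

function describeCue(cue: object): string {
  if (cue instanceof VehicleCueSketch) return "vehicle (by class)";
  if (cue instanceof VisitorCueSketch) return "visitor (by class)";
  const tag = (cue as TaggedCueSketch).cueType;
  return tag !== undefined ? `${tag} (by attribute)` : "unknown";
}

// Cues created by this app, using subclasses.
const byClassVehicle = describeCue(new VehicleCueSketch("v1", 0, 10));
const byClassVisitor = describeCue(new VisitorCueSketch("p1", 2, 8));

// Attribute-tagged cues: one from this app, one from other code that
// happened to pick the same attribute name and value.
const mineTagged: TaggedCueSketch = new BaseCueSketch("a", 0, 1);
mineTagged.cueType = "vehicle";
const theirsTagged: TaggedCueSketch = new BaseCueSketch("b", 0, 1);
theirsTagged.cueType = "vehicle";

console.log(byClassVehicle, byClassVisitor);
console.log(describeCue(mineTagged) === describeCue(theirsTagged)); // the handler cannot tell them apart
console.log(describeCue(new BaseCueSketch("c", 0, 1)));
```

The sketch shows both positions: either strategy lets a handler discriminate cues it knows about, while an attribute alone gives no guarantee of uniqueness unless the parties sharing the track agree on its name and values.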

Chris: [explains how today's API design supports the use cases]

Nigel: You're saying if something meets the runtime interface, it'll be accepted. And the other part is that it's up to the app developer to ensure that their code integrates well with any third party libraries they're using
… You also said you can have multiple TextTracks, and each handles its own kinds of cues

Chris: How does <track> map to TextTrack instances?

Rob: HTMLTrackElement has a TextTrack in it. And you can call addTextTrack on a media element to create a track

Nigel: You can give it a different 'kind' attribute, but the only source document type it'll parse is WebVTT. The browser would create a TextTrackCues with VTTCue in it

Chris: You can have multiple <track> elements with WebVTT files

Nigel: For multiple languages, MDN has a good example

Chris: How to add enter/exit events to individual cues when using <track>?

Nigel: You can iterate through the cues, but not sure when you'd do that before the media plays

Nigel: The Sourcing In-band Tracks document may talk about that
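Iterating through a parsed track's cues to attach enter/exit handlers might look like the sketch below. IterableCueSketch and attachHandlers are hypothetical names, and the cue shape is a simplified stand-in for the DOM type. In a browser the list would come from a TextTrack's cues attribute, which can be null (for example when the track is disabled), so the sketch allows for that. It does not address the open question of when to run this relative to playback.

```typescript
// Simplified structural shape of a cue for this sketch (not the DOM type).
interface IterableCueSketch {
  id: string;
  onenter: (() => void) | null;
  onexit: (() => void) | null;
}

// Attach the same enter/exit handlers to every cue in an array-like list.
// Returns the number of cues in the list, or 0 if no list is available.
function attachHandlers(
  cues: ArrayLike<IterableCueSketch> | null,
  onEnter: (id: string) => void,
  onExit: (id: string) => void,
): number {
  if (cues === null) return 0;
  for (let i = 0; i < cues.length; i++) {
    const cue = cues[i];
    if (!cue) continue;
    cue.onenter = () => onEnter(cue.id);
    cue.onexit = () => onExit(cue.id);
  }
  return cues.length;
}

// Stand-in for cues that a browser would have parsed from a WebVTT file.
const parsedCues: IterableCueSketch[] = [
  { id: "cue-1", onenter: null, onexit: null },
  { id: "cue-2", onenter: null, onexit: null },
];

const seenEvents: string[] = [];
const attachedCount = attachHandlers(
  parsedCues,
  (id) => seenEvents.push(`enter ${id}`),
  (id) => seenEvents.push(`exit ${id}`),
);

// Simulate the browser firing one enter and one exit event.
parsedCues[0]?.onenter?.();
parsedCues[1]?.onexit?.();
console.log(attachedCount, seenEvents);
```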

Rob: In-band events is another case. The creation steps are handled by the browser, but unsure where the activation steps happen - browser could listen to its own events

Nigel: VTTCue, a video with 608 subtitles, the browser parser constructs VTTCues. Then it's standard VTT handling
… But what if you have other in-band events?

Chris: WebKit does this, creates DataCues, but I assume they go on their own TextTrack

Chris: For VTTCue, the browser will always do the rendering. You can add enter/exit handlers to do additional things, but that doesn't replace the default behaviour

Nigel: TextTrack has inBandMetadataDispatchType property. Not widely implemented

Nigel: You may want an event when a cue is added to or removed from a TextTrack
… This seems missing, but that's separate to today's discussion

Rob: HTML talks about timed data

<RobSmith> https://html.spec.whatwg.org/multipage/media.html#guidelines-for-exposing-cues-in-various-formats-as-text-track-cues

Rob: This talks about unspecified / unknown formats that need to be mapped to a TextTrackCue. It sets defaults for id and pauseOnExit flag
… cue identifiers are optional, but should be there if you need them

Nigel: The terms "in band" and "out of band" are very context dependent, whether in MP4 files, or the blanking in the video, or DASH manifest

<nigel> Sourcing in-band Media Resource Tracks from Media Containers into HTML

Chris: This is a part of HTML spec that is murky, unclear what's interoperably implemented, and ideally would be tidied up

Chris: Are we concluding we don't need to prevent TextTrackCue construction?

Nigel: There are all the levers you need, to be able to discriminate types of cues.

Rob: The problem I see is the type of cue is undefined. DataCue adds a type label in a specified place

Rob: There's another issue that unbounded TextTrackCue isn't implemented
… I also notice the TextTrackCue onenter / onexit events WPT tests are failing

Chris: So I suggest we update the proposal to allow TextTrackCue construction, then we can discuss with Eric

Minutes manually created (not a transcript), formatted by scribe.perl version 244 (Thu Feb 27 01:23:09 2025 UTC).

Diagnostics

Succeeded: s/XX/[explains how today's API design allows supports the use cases]/

Maybe present: Chris, Nigel, Rob

All speakers: Chris, Nigel, Rob

Active on IRC: cpn, nigel, RobSmith