This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Moving region definitions out of the WebVTT header into the "body" could allow us to redefine existing regions (as is sometimes necessary for CEA708) and define new regions dynamically during the course of the file (as is sometimes necessary with live streaming).
But of course that's one reason it's defined like it is...exactly so that there are no 'surprises' in the middle of the stream. We need to find a balance here.
What "surprises" are you concerned about? It's common for CEA708 captions that a block of captions moves to a different location. We can't do that without the ability to change a region definition. Their timing would need to be attributed to the next cue.
I rather strongly support changing the syntax to be NOTE-like, but that's not enough for changing regions dynamically -- that would require attaching timing information as well. Is that really common enough that it needs to have a declarative solution?
Actually a much better reason not to do this is random access. At the moment, you can chunk up or randomly access a VTT file using its header only as the initialization. Once you introduce a state setting that persists, this is no longer possible, and you have to scan all the way from the beginning to know the current state at any time. Say, for example, we do HTTP-chunked streaming, and the content owner promises the (fairly simple) "no cue has a time span that crosses a chunk boundary". I can now random access given the header info, and then the chunk that spans the time I want to start at. Introduce mutable regions and we either have to ban them in this case, or the file author has to replicate them at the front of every chunk, or other hacky things...
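To make the random-access concern concrete, here is a minimal sketch of what a player would have to do if region definitions could change mid-file. The data model is an assumption made for illustration only: `RegionUpdate`, `regions_at`, and the settings strings are hypothetical and are not part of any WebVTT API.

```python
from typing import Dict, List, Tuple

# Hypothetical model: one in-body region change as
# (time_in_seconds, region_id, settings_string).
RegionUpdate = Tuple[float, str, str]


def regions_at(header: Dict[str, str],
               updates: List[RegionUpdate],
               t: float) -> Dict[str, str]:
    """Return the region definitions in effect at time t.

    With header-only regions, `updates` is empty and the header alone is
    the answer. With mutable regions, every update up to t has to be
    replayed, which means scanning the file from the beginning.
    """
    state = dict(header)
    for when, region_id, settings in sorted(updates):
        if when > t:
            break
        state[region_id] = settings  # redefine or add a region
    return state


header = {"r1": "width:40% lines:3"}
updates = [
    (10.0, "r1", "width:50% lines:2"),  # r1 is redefined at 10 s
    (30.0, "r2", "width:30% lines:1"),  # r2 appears at 30 s
]
state_at_20 = regions_at(header, updates, 20.0)
```

Seeking to 20 s with only the header in hand would give the stale definition of `r1`; the correct one is only known after replaying the update at 10 s.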
(In reply to David Singer from comment #4)
> Actually a much better reason not to do this is random access. At the
> moment, you can chunk up or randomly access a VTT file using its header only
> as the initialization. Once you introduce a state setting that persists,
> this is no longer possible, and you have to scan all the way from the
> beginning to know the current state at any time.
>
> Say, for example, we do HTTP-chunked streaming, and the content owner
> promises the (fairly simple) "no cue has a time span that crosses a chunk
> boundary". I can now random access given the header info, and then the
> chunk that spans the time I want to start at. Introduce mutable regions and
> we either have to ban them in this case, or the file author has to replicate
> them at the front of every chunk, or other hacky things...

We discussed that at FOMS and it seemed that this case can be addressed by adding the currently active region definitions to every chunk. Since the chunks are normally created by tools, that should not cause any extra trouble. Right now, the regions have to be made available anyway, and I was told that HLS was going to copy the region definitions into every chunk. So there's not really any extra overhead for this.
If the tool is willing to replicate the header info into every chunk, and is willing to make sure that the chunk boundaries don't intersect a cue duration, then each chunk can as easily be a separate VTT file; and it's then clear what's happening (that the chunks stand alone, since that is a characteristic of VTT). If I can add the 'header info' I can also add the WebVTT line.
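The standalone-file approach described above can be sketched as follows. This is only an illustration under stated assumptions: `make_chunks` and `fmt` are hypothetical helper names, the region block uses the block-style `REGION` syntax of the current WebVTT draft, and cues are modelled as simple `(start, end, text)` tuples.

```python
from typing import Dict, List, Tuple

Cue = Tuple[float, float, str]  # (start_seconds, end_seconds, payload)


def fmt(t: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    ms = round(t * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def make_chunks(region_blocks: List[str],
                cues: List[Cue],
                chunk_len: float) -> List[str]:
    """Split cues into standalone WebVTT files, one per time window.

    Every chunk repeats the "WEBVTT" line and all region blocks, so it can
    be parsed without any other chunk. Empty windows produce no chunk.
    """
    buckets: Dict[int, List[str]] = {}
    for start, end, text in cues:
        index = int(start // chunk_len)
        # Enforce the promise that no cue crosses a chunk boundary.
        if end > (index + 1) * chunk_len:
            raise ValueError(f"cue at {fmt(start)} crosses a chunk boundary")
        buckets.setdefault(index, []).append(
            f"{fmt(start)} --> {fmt(end)}\n{text}")
    chunks = []
    for index in sorted(buckets):
        parts = ["WEBVTT"] + region_blocks + buckets[index]
        chunks.append("\n\n".join(parts) + "\n")
    return chunks


region = "REGION\nid:r1\nwidth:40%\nlines:3"
cues = [(1.0, 4.0, "Hello"), (11.0, 14.5, "World")]
chunks = make_chunks([region], cues, 10.0)
```

Each string in `chunks` is a complete VTT file in its own right, which is the property that lets a client seek using one chunk alone.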
(In reply to David Singer from comment #6)
> If the tool is willing to replicate the header info into every chunk, and is
> willing to make sure that the chunk boundaries don't intersect a cue
> duration, then each chunk can as easily be a separate VTT file; and it's
> then clear what's happening (that the chunks stand alone, since that is a
> characteristic of VTT).
>
> If I can add the 'header info' I can also add the WebVTT line.

IIUC that's already how HLS is dealing with WebVTT. For example, see these:
* the M3U8 file http://cdnbakmi.kaltura.com/api_v3/index.php/service/caption_captionasset/action/serveWebVTT/captionAssetId/0_ucxlurda/a.m3u8
* the referenced 1.vtt file http://cdnbakmi.kaltura.com/api_v3/index.php/service/caption_captionasset/action/serveWebVTT/captionAssetId/0_ucxlurda/segmentIndex/1.vtt
* the referenced 2.vtt file http://cdnbakmi.kaltura.com/api_v3/index.php/service/caption_captionasset/action/serveWebVTT/captionAssetId/0_ucxlurda/segmentIndex/2.vtt
https://github.com/w3c/webvtt/issues/231

As far as I can tell from the discussion, having region definitions mixed with cues doesn't help streaming compared to using separate VTT files when region definitions change in a stream, so that aspect is WORKSFORME, but we still want to change the syntax.