15632 – <track> Dealing with out of order cues in WebVTT

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	TextTracks CG
Classification:	Unclassified
Component:	WebVTT
Version:	unspecified
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ian 'Hixie' Hickson
QA Contact:	This bug has no owner yet - up for the taking

Reported:	2012-01-20 03:11 UTC by Silvia Pfeiffer
Modified:	2012-04-25 21:44 UTC
CC List:	4 users

Description Silvia Pfeiffer 2012-01-20 03:11:31 UTC

The WebVTT syntax specification says about the start time of a cue:
"The time represented by this WebVTT timestamp must be greater than or equal to the start time offsets of all previous cues in the file."

This is an authoring requirement. 

However, when we go to the parsing section of the spec, there is no step in the parser that makes sure that cues that are out of time order are ignored. This means that an implemented parser will pick up such cues and enter them into the list of cues to be used at the time that they are relevant.

Instead, I propose that we should enter a check into the parser and drop late cues onto the floor. I believe this is the correct thing to do, because the cue is out of order.
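A minimal sketch of what such a check could look like, assuming cues have already been parsed into simple records. The `Cue` class and `drop_out_of_order` function are hypothetical names used for illustration only; they are not taken from the spec or from any implementation.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # start time offset in seconds
    end: float    # end time offset in seconds
    text: str


def drop_out_of_order(cues):
    """Keep only cues whose start time is >= the start times of all earlier cues.

    Hypothetical helper: mirrors the authoring requirement quoted above,
    applied as a parser-side filter.
    """
    kept = []
    latest_start = float("-inf")
    for cue in cues:
        if cue.start < latest_start:
            # Out of order relative to an earlier cue: drop it.
            continue
        latest_start = cue.start
        kept.append(cue)
    return kept


cues = [
    Cue(1.0, 3.0, "first"),
    Cue(5.0, 7.0, "second"),
    Cue(2.0, 4.0, "late"),        # starts before "second", so it is dropped
    Cue(5.0, 6.0, "same start"),  # equal start time is allowed
]
kept = drop_out_of_order(cues)
```

Because a dropped cue always starts earlier than the running maximum, comparing against the latest kept start time is equivalent to comparing against all previous cues in the file.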

I am particularly concerned about this for consistency with two situations: encapsulation into media files and live streaming text.

In the encapsulation case, if a cue is encapsulated into a media file and it comes too late, then the demuxer and decoder will only come across this cue at a time where it's too late to present it and therefore will have to drop it on the floor. Similar reasoning applies to the live streaming case.

I therefore suggest including such a requirement before step 38 of the parser at http://dev.w3.org/html5/webvtt/#parsing .

Comment 1 Simon Pieters 2012-01-20 09:19:42 UTC

I disagree. SRT parsers don't ignore out of order cues. SRT content has out of order cues (I don't have numbers readily available but I could check for it). It seems plausible that WebVTT files will have out of order cues, too.

That it doesn't work for encapsulation and streaming is a good reason to make it non-conforming, but it's not a reason to drop cues for static files.

Comment 2 Silvia Pfeiffer 2012-01-22 22:47:12 UTC

(In reply to comment #1)
> I disagree. SRT parsers don't ignore out of order cues.

Some do and some don't. SRT isn't specified well enough to say what the right approach is.


> SRT content has out of
> order cues (I don't have numbers readily available but I could check for it).
> It seems plausible that WebVTT files will have out of order cues, too.

Since it's an authoring requirement in the WebVTT spec, authors should expect their cues to be dropped on the floor if they come out of order. Why have that requirement in the first place when we ignore it later on during actual processing?

Also, what about the processing of chapters? We introduced a means to create chapter trees by nesting of cues, see http://www.whatwg.org/specs/web-apps/current-work/multipage/the-video-element.html#text-tracks-describing-chapters . If we allow cues to be out of order, that makes it hard to incrementally create the chapter tree.


 
> That it doesn't work for encapsulation and streaming is a good reason to make
> it non-conforming, but it's not a reason to drop cues for static files.

How about consistency? A file that is embedded in a video would present different cues than the same file referenced statically. That's a situation that we should really avoid.

As an author I'd much rather find out that I've misplaced a cue while authoring the static file than notice that my cues go missing when the file is used elsewhere.

Comment 3 Philip Jägenstedt 2012-02-07 14:45:06 UTC

I agree with Simon here: dropping cues in the parser is more work for a result that is worse. Validity and parsing don't need to be in perfect sync, and I can't really see any downsides to handling this case silently. A parser could warn in the error console, though; I'll have a look at doing that in Opera.

Comment 4 Philip Jägenstedt 2012-02-07 16:36:23 UTC

Off topic: I guess I should stop putting <track> in the title now that we have the "TextTracks CG" Product?

Comment 5 Silvia Pfeiffer 2012-02-07 23:25:00 UTC

(In reply to comment #4)
> Off topic: I guess I should stop putting <track> in the title now that we have
> the "TextTracks CG" Product?

Well, where it applies to both the WebVTT file format and the TextTrack API in HTML, we likely need both.

Comment 6 Philip Jägenstedt 2012-02-10 20:01:51 UTC

I've added a console error to our implementation (no public build, yet) to warn about out-of-order cues now. It would be trivial to also drop cues in this situation, but I really don't think that's nice when handling it comes for free (since scripts can insert cues out of order).
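As a rough sketch of the warn-but-keep approach described here (not Opera's actual code): the `load_cues` function and the `(start, end, text)` tuple layout below are assumptions made for illustration. Out-of-order cues produce a warning message but are still added, and the resulting list is ordered by start time only, which simplifies whatever ordering a real cue list would use.

```python
def load_cues(parsed):
    """Return (ordered_cues, warnings) for cues given in file order.

    Hypothetical helper: each cue is a (start, end, text) tuple. Cues that
    start earlier than a previous cue trigger a warning but are kept.
    """
    warnings = []
    latest_start = float("-inf")
    for start, end, text in parsed:
        if start < latest_start:
            warnings.append(f"out-of-order cue at {start}s: {text!r}")
        latest_start = max(latest_start, start)
    # sorted() is stable, so cues with equal start times keep file order.
    ordered = sorted(parsed, key=lambda cue: cue[0])
    return ordered, warnings


parsed = [
    (1.0, 3.0, "first"),
    (5.0, 7.0, "second"),
    (2.0, 4.0, "late"),  # out of order, but still presented at 2.0s
]
ordered, warnings = load_cues(parsed)
```

Since a cue list has to cope with script-inserted cues in arbitrary order anyway, the sorting step costs nothing extra; only the warning is specific to parsing.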

Comment 7 Ian 'Hixie' Hickson 2012-04-25 21:44:28 UTC

I don't think the reasons in comment 0 are compelling. It's non-conforming because there are situations where it can cause problems (e.g. streaming, editing), but I don't see why we'd want to start dropping the cues. It would just hurt users.