This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12076 - <video> Recast WebVTT parser so that it first does line breaking then handles each line, instead of being character-oriented
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P5 trivial
Target Milestone: Unsorted
Assignee: contributor
QA Contact: contributor
Depends on:
Reported: 2011-02-15 09:27 UTC by contributor
Modified: 2012-07-18 18:40 UTC
CC List: 7 users

See Also:


Description contributor 2011-02-15 09:27:31 UTC

Wishlist: line-based parser

Comment 1 Philip Jägenstedt 2011-02-15 09:44:56 UTC
I'm calling this a wishlist item because it is editorial. Still, here's my reasoning:

When I made a JavaScript implementation of the earlier WebSRT parser, I found the steps quite hard to follow because the handling of CRLF is sprinkled throughout, and I even found a spec bug related to it (already fixed). Of course the spec should define precisely, down to every single byte, what should happen, but I'm hoping a line-based parser could achieve that as well.

If it's not obvious, by a line-based parser I mean one whose first step splits the input into lines, which a second step then processes. This wouldn't harm streaming, because AFAICT no cues will be output from the parser before CRLF or EOF is encountered anyway.

I dare say this makes it more likely that implementations of WebVTT in high-level languages like JavaScript and Python will actually follow the spec, since operating on lines is much easier to understand for a format like WebVTT. If you go and look for random SRT parsers, I think you'll find that most work like this. (The ones I've written do, anyway.)

The spec is already mostly line-based; I'm just suggesting that the line-splitting be separated out from the rest to improve readability. Do as you will.
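
For illustration only, the two-stage idea described in this comment might be sketched in Python as follows. This is not the spec's algorithm: the function names `split_lines` and `group_blocks` are hypothetical, and the second stage merely groups lines into blank-line-separated blocks rather than parsing cues. The first stage treats CRLF, LF and CR as line terminators and works on streamed chunks, so the second stage never has to think about line endings.

```python
from typing import Iterable, Iterator, List


def split_lines(chunks: Iterable[str]) -> Iterator[str]:
    """Stage 1 (hypothetical): yield lines from streamed text chunks.

    CRLF, LF and CR each end a line. A CRLF pair counts as one
    terminator even when it is split across two chunks.
    """
    line: List[str] = []
    prev_cr = False  # True if the previous character was CR
    for chunk in chunks:
        for ch in chunk:
            if ch == "\n":
                if not prev_cr:
                    yield "".join(line)
                    line = []
                prev_cr = False  # an LF right after CR completes a CRLF
            elif ch == "\r":
                yield "".join(line)
                line = []
                prev_cr = True
            else:
                line.append(ch)
                prev_cr = False
    if line:  # final line without a terminator (EOF)
        yield "".join(line)


def group_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Stage 2 (simplified stand-in): group lines into blocks
    separated by one or more blank lines."""
    block: List[str] = []
    for line in lines:
        if line == "":
            if block:
                yield block
                block = []
        else:
            block.append(line)
    if block:
        yield block


# Example input arriving in arbitrary chunks with mixed line endings.
chunks = ["WEBVTT\r", "\n\r\n00:00.000 --> 00:01.000\nHello\n", "\rBye"]
for block in group_blocks(split_lines(chunks)):
    print(block)
```

Because both stages are generators, a block is emitted as soon as the blank line that ends it has been read, which matches the point above that line splitting need not get in the way of streaming.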
Comment 2 Philip Jägenstedt 2011-02-15 09:45:40 UTC
Oh yeah, it'd be simple to add line-based comments to such a parser, too.
Comment 3 Ian 'Hixie' Hickson 2011-05-05 08:21:24 UTC
This would be a lot of risky work for minimal gain, IMHO.
Comment 4 Philip Jägenstedt 2011-05-05 09:22:04 UTC
Risky in what sense? There's no existing content or implementations to break.
Comment 5 Ian 'Hixie' Hickson 2011-06-02 00:38:00 UTC
Risky in the sense that I'm almost certain to screw it up and spend hours spread over many days trying to fix it.
Comment 6 Ian 'Hixie' Hickson 2011-06-02 23:47:15 UTC
Let me keep this on my radar for a bit longer, in case I come across a stronger rationale for doing this. Currently though I'm leaning towards not changing this. It would be a lot of effort for minimal gain, and the opportunity cost would thus be high.
Comment 7 Philip Jägenstedt 2011-09-11 16:03:21 UTC
Comments from the Open Video Conference, with implementors of Opera, Firefox, Chrome and Safari discussing WebVTT:

At this point of implementation we don't care about this any longer, closing this bug.