12076 – <video> Recast WebVTT parser so that it first does line breaking then handles each line, instead of being character-oriented

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 12076 - <video> Recast WebVTT parser so that it first does line breaking then handles each line, instead of being character-oriented

Status:	RESOLVED WONTFIX

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	HTML
Version:	unspecified
Hardware:	Other other

Importance:	P5 trivial
Target Milestone:	Unsorted
Assignee:	contributor
QA Contact:	contributor

URL:	http://www.whatwg.org/specs/web-apps/...

Reported:	2011-02-15 09:27 UTC by contributor
Modified:	2012-07-18 18:40 UTC
CC List:	7 users

Description contributor 2011-02-15 09:27:31 UTC

Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html
Section: http://www.whatwg.org/specs/web-apps/current-work/#webvtt-parser

Comment:
Wishlist: line-based parser

Posted from: 83.218.67.122

Comment 1 Philip Jägenstedt 2011-02-15 09:44:56 UTC

I'm calling this a wishlist item because it is editorial. Still, here's my reasoning:

When I made a JavaScript implementation of the earlier WebSRT parser, I found the steps quite hard to follow because the handling of CRLF is sprinkled all over, and I even found a spec bug related to it (already fixed). Of course the spec should state precisely, down to every single byte, what should happen, but I'm hoping that a line-based parser could be specified just as precisely.

If it's not obvious, by a line-based parser I mean one that first splits the input into lines and then hands those lines to a second step. This wouldn't harm streaming, because AFAICT the parser outputs no cues before CRLF or EOF is encountered anyway.

I dare say this makes it more likely that implementations of WebVTT in high-level languages like JavaScript and Python will actually follow the spec, since operating on lines is much easier to understand for a format like WebVTT. If you go and look for random SRT parsers, I think you'll find that most work like this. (The ones I've written do anyway.)

The spec is already mostly line-based; I'm just suggesting that the line-splitting be separated out from the rest to improve readability. Do as you will.
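
For illustration only, the two-step structure described in this comment might look roughly like the Python sketch below. It is not the spec's algorithm. The names `split_lines` and `group_blocks` are hypothetical, and the sketch assumes that CRLF, a lone CR, and a lone LF each end a line and that blank lines separate blocks.

```python
from typing import Iterable, Iterator, List


def split_lines(text: str) -> Iterator[str]:
    """Step 1: yield lines, treating CRLF, lone CR, and lone LF as terminators.

    Assumption: these three sequences are the only line terminators.
    """
    start = 0
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\r" or ch == "\n":
            yield text[start:i]
            # A CR immediately followed by LF counts as a single terminator.
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1
            start = i + 1
        i += 1
    # Emit trailing text that is not followed by a line terminator.
    if start < n:
        yield text[start:]


def group_blocks(lines: Iterable[str]) -> List[List[str]]:
    """Step 2: collect consecutive non-empty lines into blocks.

    This step never sees CR or LF; all line-break handling lives in step 1.
    """
    blocks: List[List[str]] = []
    current: List[str] = []
    for line in lines:
        if line:
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


sample = "WEBVTT\r\n\r\n00:00.000 --> 00:01.000\rHello\n\nlast"
blocks = group_blocks(split_lines(sample))
```

Because `split_lines` is a generator, the second step can consume lines as they become available, which is in keeping with the streaming point made above.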

Comment 2 Philip Jägenstedt 2011-02-15 09:45:40 UTC

Oh yeah, it'd be simple to add line-based comments to such a parser, too.
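
As a rough sketch of why that would be simple: once the input is already a sequence of lines, skipping comments is a single filter. This thread does not define any comment syntax, so the `//` prefix and the name `strip_comment_lines` below are purely hypothetical.

```python
from typing import Iterable, List

# Hypothetical marker: the thread does not say what a comment line looks like.
COMMENT_PREFIX = "//"


def strip_comment_lines(lines: Iterable[str], prefix: str = COMMENT_PREFIX) -> List[str]:
    """Drop every line that starts with the (assumed) comment prefix."""
    return [line for line in lines if not line.startswith(prefix)]


# The sample uses LF only, so a plain split stands in for real line breaking.
lines = "WEBVTT\n// a remark\n\n00:00.000 --> 00:01.000\nHello".split("\n")
kept = strip_comment_lines(lines)
```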

Comment 3 Ian 'Hixie' Hickson 2011-05-05 08:21:24 UTC

This would be a lot of risky work for minimal gain, IMHO.

Comment 4 Philip Jägenstedt 2011-05-05 09:22:04 UTC

Risky in what sense? There's no existing content or implementations to break.

Comment 5 Ian 'Hixie' Hickson 2011-06-02 00:38:00 UTC

Risky in the sense that I'm almost certain to screw it up and spend hours spread over many days trying to fix it.

Comment 6 Ian 'Hixie' Hickson 2011-06-02 23:47:15 UTC

Let me keep this on my radar for a bit longer, in case I come across a stronger rationale for doing this. Currently though I'm leaning towards not changing this. It would be a lot of effort for minimal gain, and the opportunity cost would thus be high.

Comment 7 Philip Jägenstedt 2011-09-11 16:03:21 UTC

Comments from the Open Video Conference, with implementors of Opera, Firefox, Chrome and Safari discussing WebVTT:

At this point in implementation we no longer care about this; closing this bug.