This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12398 - Tokenizer stuck in ScriptDataDoubleEscapedState
Summary: Tokenizer stuck in ScriptDataDoubleEscapedState
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: All All
Importance: P1 critical
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
Depends on:
Reported: 2011-03-30 00:52 UTC by Tony Gentilcore
Modified: 2011-08-04 05:34 UTC
CC List: 9 users

See Also:

Testcase (333 bytes, text/html)
2011-03-30 00:52 UTC, Tony Gentilcore

Description Tony Gentilcore 2011-03-30 00:52:49 UTC
Created attachment 972

The attached test case gets stuck in ScriptDataDoubleEscapedState which seems incorrect.

CR10-12: FAIL

Filed in WebKit as
Comment 1 Adam Barth 2011-03-30 01:00:20 UTC
Note: This bug affects Bugzilla, so be careful copy/pasting the string around in this page or you might hose the bug.
Comment 2 Tony Gentilcore 2011-03-30 01:24:37 UTC
Actually, the title of this bug is misleading: the "</script>" gets us out of ScriptDataDoubleEscapedState and back into ScriptDataEscapedState. I'm still looking into it, but maybe the bug is that we shouldn't go into ScriptDataDoubleEscapedState in the first place.
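The transitions described in this comment can be illustrated with a rough sketch. It is a deliberately simplified model of the script data states, not the spec's tokenizer: the function name `script_states`, the state labels, and the sample inputs are all assumptions made for illustration, the sample inputs are not the attached test case, and edge cases such as "<!-->" are ignored.

```python
import re

# Characters that may follow "<script" or "</script" for the tag name to count.
_DELIM = r"[\t\n\f />]"
_OPEN = re.compile(r"<script" + _DELIM, re.IGNORECASE)
_CLOSE = re.compile(r"</script" + _DELIM, re.IGNORECASE)


def script_states(text):
    """Return the simplified states visited while scanning script content."""
    state = "data"
    visited = [state]
    i = 0
    while i < len(text):
        rest = text[i:]
        if state == "data" and rest.startswith("<!--"):
            state, step = "escaped", 4
        elif state in ("data", "escaped") and _CLOSE.match(rest):
            # A close tag here really ends the script element.
            visited.append("closed")
            break
        elif state == "escaped" and _OPEN.match(rest):
            state, step = "double_escaped", len("<script")
        elif state == "double_escaped" and _CLOSE.match(rest):
            # The close tag only drops back to the escaped state.
            state, step = "escaped", len("</script")
        elif state != "data" and rest.startswith("-->"):
            state, step = "data", 3
        else:
            i += 1
            continue
        visited.append(state)
        i += step
    return visited


# Hypothetical input: one "</script>" is not enough to close the element.
print(script_states("<!--<script>a</script>"))
# A second "</script>" is needed to actually leave the script.
print(script_states("<!--<script>a</script>b</script>"))
```

In this model the first "</script>" after "<!--<script>" only returns the tokenizer to the escaped state, which matches the behavior described above.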
Comment 3 Henri Sivonen 2011-03-30 06:17:07 UTC
It's known that there exist some sites on the Web where the spec gets stuck inside a script. That's the price of never reparsing.

Is this breaking a top site that you have been unable to evangelize?

With Firefox, the script states have been a wild success beyond my expectations. Exactly one case of site breakage has reached me on b.m.o and that site was successfully evangelized.

With the data presented so far, I'd WONTFIX this.

To the extent this affects Bugzilla itself, it's a bug in Bugzilla, although I thought the bug was already fixed in upstream Bugzilla. In general, any Web app that includes untrusted strings as string literals in inline scripts MUST escape < as \u003C to be safe. (That's the simplest way to deal with <!--, <script> and </script> all in one go.)
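A minimal sketch of the escaping suggested above, assuming a server-side template written in Python. The helper name `js_string_literal` is hypothetical, and using `json.dumps` to build the literal is an assumption of this sketch, not something the thread prescribes.

```python
import json


def js_string_literal(untrusted: str) -> str:
    """Return a JavaScript string literal that is safe inside an inline script.

    json.dumps produces a quoted literal with quotes, backslashes and
    control characters escaped; "<" is then rewritten as \\u003C so that
    "<!--", "<script>" and "</script>" can never appear in the output.
    """
    literal = json.dumps(untrusted)
    return literal.replace("<", "\\u003C")


# Hypothetical untrusted value containing all three troublesome sequences.
value = "<!-- <script> </script>"
print("<script>var s = " + js_string_literal(value) + ";</script>")
```

Because `\u003C` is an ordinary escape inside a JavaScript string literal, the script still sees the original characters at run time, while the HTML tokenizer never sees a "<" inside the literal.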
Comment 4 Henri Sivonen 2011-03-30 06:19:38 UTC
Also note that you'll never get all possible cases "right" without including JavaScript and VBScript parsers as part of the HTML tokenizer, so it's kinda useless to be able to point to one more case that "fails".
Comment 5 Adam Barth 2011-03-30 06:43:26 UTC
Tony probably knows more of the details, but my understanding is that this issue manifests itself in one of Google's web sites. We can certainly ask those folks if they'd be willing to work around the issue.
Comment 6 Henri Sivonen 2011-03-30 09:41:30 UTC
(In reply to comment #5)
> Tony probably knows more of the details, but my understanding is that this
> issue manifests itself in one of Google's web sites. We can certainly ask those
> folks if they'd be willing to work around the issue.

In that case, I think this should be WONTFIX+evang.
Comment 7 Simon Pieters 2011-03-30 10:03:42 UTC
I agree that this should be WONTFIX. The current spec is as web compatible as it gets without doing reparsing.
Comment 8 Tony Gentilcore 2011-03-30 16:21:31 UTC
Thanks for the explanations. I've mentioned your escaping suggestion to the webmaster. Does anyone have a Bugzilla contact who could be told the same?
Comment 9 Henri Sivonen 2011-03-31 05:44:38 UTC
I believe the Bugzilla bug was

Is there still another bug of that nature in Bugzilla itself?
Comment 10 Ian 'Hixie' Hickson 2011-06-10 22:24:59 UTC
Note that a Web page that runs into this will be invalid, so authors can avoid this problem by using the validator to check their pages.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:

Status: Rejected
Change Description: no spec change
Rationale: per comments above
Comment 11 Michael[tm] Smith 2011-08-04 05:34:11 UTC
mass-move component to LC1