This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 22233 - [HTML]: I can't find the rules which specify real-world parsing of <body><script>&amp;
Status: RESOLVED INVALID
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: HTML WG Bugzilla archive list
Reported: 2013-06-02 12:57 UTC by Alan Jenkins
Modified: 2013-06-07 13:29 UTC
CC List: 5 users

Description Alan Jenkins 2013-06-02 12:57:23 UTC
AFAICS the tokenizer is only switched to "script data state" from the "in head" insertion mode.

However, real-world browsers also switch to the "script data state" for a <script> inside <body>, e.g. Firefox 21.0 with this test page:

<!doctype html>
<body><!-- behaviour is identical if <body> is removed -->
<script>alert('&amp;')</script>

The alert shows "&amp;".  But AFAICS the spec implies this (non-conforming) page should show "&", which would at least violate the principle of least surprise.
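
The difference between the two outcomes can be reproduced outside a browser. The following is a minimal sketch using Python's standard `html.parser`, which is not an HTML5-conforming parser but, like browsers, leaves character references inside <script> undecoded while decoding them in ordinary text. The class name `ScriptTextCollector`, the helper `collect`, and the extra `<p>&amp;</p>` added for contrast are illustrative assumptions, not part of the original test page.

```python
from html.parser import HTMLParser


class ScriptTextCollector(HTMLParser):
    """Collect text, split by whether it appears inside a <script> element."""

    def __init__(self):
        # convert_charrefs=True decodes character references in normal text,
        # but html.parser leaves <script>/<style> content untouched.
        super().__init__(convert_charrefs=True)
        self.in_script = False
        self.script_text = []
        self.other_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        target = self.script_text if self.in_script else self.other_text
        target.append(data)


def collect(markup):
    """Return (text inside <script>, other text with whitespace trimmed)."""
    parser = ScriptTextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.script_text), "".join(parser.other_text).strip()


# Hypothetical variant of the test page, with a <p> added for comparison.
page = "<!doctype html>\n<body>\n<script>alert('&amp;')</script>\n<p>&amp;</p>"
script_text, other_text = collect(page)
print(script_text)  # alert('&amp;')
print(other_text)   # &
```

The script body keeps the literal "&amp;", matching what Firefox displays, whereas the same reference in the paragraph is decoded to "&".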

My understanding was that this was the real-world behaviour on all major browsers.  If the spec is at variance with it, then no major browser is conforming, which is an obstacle to standardization.

Am I right about the behaviour specified by HTML5, and about major browsers other than Firefox?  If so, does the spec need to be changed?

This question came up while looking at how <svg><script> works in HTML syntax:  http://security.stackexchange.com/questions/36701/why-does-this-xss-vector-work-in-svg-but-not-in-html

I recently came across this particular tag soup in ci-Bonfire.  Example page http://eposure.com/
Comment 1 Henri Sivonen 2013-06-07 13:29:34 UTC
"In body" says:
> A start tag token whose tag name is one of: "base", "basefont", "bgsound", "link", "meta", "noframes", "script", "style", "title"
>
>    Process the token using the rules for the "in head" insertion mode.
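
The delegation quoted above can be sketched as a toy model. This is only an illustration of the dispatch, not the spec's tree construction algorithm: the function names and the `TokenizerState` enum are hypothetical, and the states chosen for "title", "style" and "noframes" are an assumption based on the "in head" rules as I understand them (RCDATA for "title", RAWTEXT for "style" and "noframes"); only the "script" case is discussed in this bug.

```python
from enum import Enum


class TokenizerState(Enum):
    DATA = "data state"
    RCDATA = "RCDATA state"
    RAWTEXT = "RAWTEXT state"
    SCRIPT_DATA = "script data state"


# Start tags that "in body" processes using the "in head" rules (from the quote).
DELEGATED_TO_IN_HEAD = {
    "base", "basefont", "bgsound", "link", "meta",
    "noframes", "script", "style", "title",
}


def in_head_start_tag(tag_name, state):
    """Toy "in head" rules: return the tokenizer state after the start tag."""
    if tag_name == "script":
        return TokenizerState.SCRIPT_DATA
    if tag_name in ("style", "noframes"):
        return TokenizerState.RAWTEXT  # assumption: raw text elements
    if tag_name == "title":
        return TokenizerState.RCDATA   # assumption: RCDATA element
    return state                       # e.g. "meta", "link": state unchanged


def in_body_start_tag(tag_name, state):
    """Toy "in body" rules: delegate the listed tags to "in head"."""
    if tag_name in DELEGATED_TO_IN_HEAD:
        return in_head_start_tag(tag_name, state)
    return state


# A <script> seen while "in body" still ends up in the script data state.
print(in_body_start_tag("script", TokenizerState.DATA).value)  # script data state
```

Because "in body" reuses the "in head" handling for "script", the tokenizer switch happens in both places, so the spec already yields "&amp;" for the reporter's test page.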