This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 14676 - For UTF-16, the oder of the steps in "change the encoding" doesn't seem right.
Summary: For UTF-16, the oder of the steps in "change the encoding" doesn't seem right.
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
 
Reported: 2011-11-02 10:03 UTC by contributor
Modified: 2012-07-18 18:46 UTC
CC List: 3 users


Description contributor 2011-11-02 10:03:30 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html
Multipage: http://www.whatwg.org/C#changing-the-encoding-while-parsing
Complete: http://www.whatwg.org/c#changing-the-encoding-while-parsing

Comment:
For UTF-16, the oder of the steps in "change the encoding" doesn't seem right.

Posted from: 114.43.127.97 by kennyluck@csail.mit.edu
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:7.0.1) Gecko/20100101 Firefox/7.0.1

Comment 1 KangHao Lu 2011-11-02 10:19:22 UTC
s/oder/order/

Consider a simple test case such as <script>alert(document.characterSet||document.charset)</script><meta http-equiv="content-type" content="charset=utf-16">. Under the pre-scanning algorithm, the first attempt should use utf-8. But in step 1 of "change the encoding", utf-16 is not yet treated as equivalent to utf-8, so a reload is possible, depending on how you interpret the "may" in step 4.

Gecko doesn't reload in this case. IE first reports the default encoding (which contradicts the pre-scanning algorithm, but that's another issue) and then "unicode" (while decoding the content as "utf-8").

Anyway, is allowing reloading in my example intentional? If not, I propose we move step 3 before step 1.
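
The ordering problem can be sketched roughly as follows. This is only an illustrative model of the steps named in this comment, not the spec text: the function names and the "keep"/"may-reload" return values are hypothetical, step 2 is omitted because it is not discussed here, and step 4 is reduced to "the user agent may reload".

```python
# Hypothetical model of the step ordering discussed in this bug.
# Only steps 1, 3 and 4 as described in the comment are modelled.

def normalize(label: str) -> str:
    """Compare encoding labels case-insensitively (a simplification)."""
    return label.strip().lower()


def is_utf16(label: str) -> bool:
    return normalize(label) in {"utf-16", "utf-16le", "utf-16be"}


def change_encoding_old_order(current: str, new: str) -> str:
    # Step 1: if the new encoding matches the current one, nothing changes.
    if normalize(new) == normalize(current):
        return "keep"
    # Step 3: a declared UTF-16 encoding is replaced by UTF-8.
    if is_utf16(new):
        new = "utf-8"
    # Step 4 onward: the user agent may switch decoders or reload,
    # even though `new` may by now equal `current`.
    return "may-reload"


def change_encoding_proposed_order(current: str, new: str) -> str:
    # Step 3 moved first: map UTF-16 to UTF-8 before comparing.
    if is_utf16(new):
        new = "utf-8"
    # Former step 1: the comparison now sees the mapped encoding.
    if normalize(new) == normalize(current):
        return "keep"
    return "may-reload"


if __name__ == "__main__":
    # The test case above: parsed as utf-8, then <meta> declares utf-16.
    print(change_encoding_old_order("utf-8", "utf-16"))
    print(change_encoding_proposed_order("utf-8", "utf-16"))
```

Under these assumptions, the old order reaches the reload branch for the test case, while the proposed order stops at the equivalence check.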

Comment 2 contributor 2011-11-02 20:40:37 UTC
Checked in as WHATWG revision r6814.
Check-in comment: When a page interpreted as UTF-8 has a <meta charset> saying UTF-16, the spec used to say to reload even though the encoding didn't change.
http://html5.org/tools/web-apps-tracker?from=6813&to=6814