This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 10257 - Meta charset sniffing needs to resolve aliases before checking UTF-16ness
Summary: Meta charset sniffing needs to resolve aliases before checking UTF-16ness
Status: VERIFIED INVALID
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: https://bugzilla.mozilla.org/show_bug...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-29 07:18 UTC by Henri Sivonen
Modified: 2010-12-01 15:30 UTC

Description Henri Sivonen 2010-07-29 07:18:52 UTC
Step 12 says "If charset is a UTF-16 encoding, change the value of charset to UTF-8."

Shouldn't it resolve charset to its canonical name first and then, if that name is UTF-16, UTF-16BE or UTF-16LE (what about UTF-32?), change charset to UTF-8?
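
To make the question concrete, here is a minimal sketch of the two orderings. The alias table and both function names are hypothetical and exist only for illustration; in particular, treating iso-10646 as an alias of UTF-16 is an assumption, not something the spec text quoted above states.

```python
# Hypothetical alias table. Real implementations ship far larger tables, and
# whether "iso-10646" maps to UTF-16 at all is an assumption made here.
ALIASES = {
    "iso-10646": "utf-16",
    "utf-16": "utf-16",
    "utf-16be": "utf-16be",
    "utf-16le": "utf-16le",
    "utf-8": "utf-8",
    "windows-1252": "windows-1252",
}
UTF16_NAMES = {"utf-16", "utf-16be", "utf-16le"}


def sniff_literal(charset):
    """Check UTF-16ness on the declared label, without resolving aliases."""
    label = charset.strip().lower()
    return "utf-8" if label in UTF16_NAMES else label


def sniff_resolved(charset):
    """Resolve the label to a canonical name first, then check UTF-16ness."""
    label = charset.strip().lower()
    canonical = ALIASES.get(label, label)
    return "utf-8" if canonical in UTF16_NAMES else canonical


for declared in ("UTF-16LE", "iso-10646"):
    print(declared, sniff_literal(declared), sniff_resolved(declared))
```

Under these assumptions the two orderings agree for a literal UTF-16LE label and differ only for the alias: the literal check leaves iso-10646 untouched, while the resolved check turns it into UTF-8.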
Comment 1 Henri Sivonen 2010-07-29 07:31:29 UTC
Actually, Safari allows iso-10646 to sniff to the default instead of UTF-8. So maybe this bug report is wrong.
Comment 2 Henri Sivonen 2010-07-29 11:36:19 UTC
Thanks to Philip, I've examined four Web pages that declare iso-10646 in meta. (Thankfully, they are rare.) Three were ASCII and one was Windows-1252. So from this data, it seems we should *not* resolve aliases before the step that changes UTF-16 to UTF-8.
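
A small sketch of why that data points away from alias resolution, assuming resolution would force such pages to UTF-8: ASCII bytes decode identically either way, but Windows-1252 bytes above 0x7F are generally not valid UTF-8. The sample byte strings and the helper name are made up for illustration and are not taken from the pages examined.

```python
# Made-up sample content, standing in for the kinds of pages described above.
ascii_page = b"<title>plain</title>"
cp1252_page = b"<title>caf\xe9</title>"  # 0xE9 is "é" in Windows-1252


def decodes_as_utf8(data):
    """Return True if the bytes form valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(decodes_as_utf8(ascii_page))   # ASCII is a subset of UTF-8
print(decodes_as_utf8(cp1252_page))  # a lone 0xE9 is not valid UTF-8
print(cp1252_page.decode("cp1252"))  # decodes cleanly as Windows-1252
```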
Comment 3 Henri Sivonen 2010-07-29 12:05:16 UTC
(In reply to comment #1)
> Actually, Safari allows iso-10646 to sniff to the default instead of UTF-8. So
> maybe this bug report is wrong.

Chances are I'm misreading Safari's encoding menu.