[Bug 21088] New: Spec repeats potential Gecko bugs about encoding defaults as the truth

https://www.w3.org/Bugs/Public/show_bug.cgi?id=21088

            Bug ID: 21088
           Summary: Spec repeats potential Gecko bugs about encoding
                    defaults as the truth
    Classification: Unclassified
           Product: HTML WG
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: CR HTML5 spec
          Assignee: robin@w3.org
          Reporter: hsivonen@iki.fi
        QA Contact: public-html-bugzilla@w3.org
                CC: mike@w3.org
        Depends on: 21087

See
http://www.w3.org/html/wg/drafts/html/CR/syntax.html#determining-the-character-encoding

+++ This bug was initially created as a clone of Bug #21087 +++

The spec includes a table of locales and the default encoding for each locale.
The data for that table was taken from the Gecko 1.9.1 source code. It appears
that the data has not been properly compared with the behavior of IE, which
might have a more significant market share in some of the locales involved. In
particular, it looks suspicious that the Simplified Chinese default is GB18030
rather than GBK, and every entry that suggests UTF-8 as the encoding looks
suspicious. For example, chances are that users of the Welsh UI will be exposed
to the same legacy content as users of the UK English UI. Also, Windows has a
legacy code page specifically for Vietnamese, so it seems incredible that
legacy content encountered by users of the Vietnamese locale would more often
be in UTF-8 than in that code page.

In order to avoid spreading bugs, please remove all the entries that haven't
been cross-checked to agree with the defaults of a version of Internet Explorer
that predates the inclusion of the table in the spec. If such cross-checking
cannot be performed in a timely manner, please at least remove, for the time
being, all the entries that claim that the default should be UTF-8 or GB18030.
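
To make the minimal request concrete, here is a rough sketch of the filtering
it amounts to. The locale tags ("zh-CN", "cy", "vi", "en-GB") and the
"windows-1252" value for UK English are assumptions made for illustration; the
other three values are only what this report says the spec table contains. It
is not the spec's actual table, and the function names are hypothetical.

```python
# Hypothetical excerpt of a locale -> default-encoding table.
# Only entries discussed in this report are listed; the keys are assumed
# locale tags, and the en-GB value is an assumed legacy default.
SPEC_DEFAULTS = {
    "zh-CN": "gb18030",      # report: looks suspicious, GBK expected
    "cy": "utf-8",           # report: Welsh users see UK English content
    "vi": "utf-8",           # report: Windows has a Vietnamese code page
    "en-GB": "windows-1252", # assumption, included to show a surviving entry
}

# Defaults the report asks to drop until they are cross-checked against IE.
SUSPECT_ENCODINGS = {"utf-8", "gb18030"}


def drop_suspect_entries(table):
    """Return a copy of the table without entries whose default is suspect."""
    return {
        locale: encoding
        for locale, encoding in table.items()
        if encoding.lower() not in SUSPECT_ENCODINGS
    }


def default_for(locale, table, fallback=None):
    """Look up a locale's default; unlisted locales get the caller's fallback."""
    return table.get(locale, fallback)


if __name__ == "__main__":
    cleaned = drop_suspect_entries(SPEC_DEFAULTS)
    for locale in SPEC_DEFAULTS:
        print(locale, default_for(locale, cleaned))
```

After filtering, a locale whose entry was removed simply has no
locale-specific default in the table, so what it falls back to is left to the
caller rather than asserted here.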

Received on Friday, 22 February 2013 13:18:44 UTC