This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
http://gsnedders.html5.org/web-encoding-names/results.html shows what document.characterSet returns in current versions of browsers. Notably, Firefox and Chrome both return uppercased names for many of these. (IE returns them all lowercase except "GB18030"; ZombieOpera returns them all lowercase.)

Googling these encoding names makes it clear that almost everyone refers to "UTF-8", "ISO-8859-n", etc. (uppercased). As there is currently no interop here, and the proposed behaviour matches Firefox/Chrome, it would seem better to just give the encodings the names that are in common usage.

As such, I propose to change the names to the following (thereby changing case only):

- UTF-8
- IBM866
- ISO-8859-n
- ISO-8859-8-I
- KOI8-R
- KOI8-U
- HZ-GB-2312
- Big5
- EUC-JP
- ISO-2022-JP
- Shift_JIS
- EUC-KR
- UTF-16BE
- UTF-16LE
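To make the scope of the proposal concrete, here is a minimal sketch of the case-only mapping it describes. The function name `proposed_name` and the table `PROPOSED_NAMES` are hypothetical, introduced only for illustration, and treating the "n" in ISO-8859-n as any run of digits is an assumption rather than something stated in this report. Names that are not on the list above (for example windows-1252 and gb18030) are returned unchanged.

```python
import re

# Proposed names from the list above, keyed by their all-lowercase form.
# (Hypothetical helper table; not part of any specification.)
PROPOSED_NAMES = {
    name.lower(): name
    for name in (
        "UTF-8", "IBM866", "ISO-8859-8-I", "KOI8-R", "KOI8-U",
        "HZ-GB-2312", "Big5", "EUC-JP", "ISO-2022-JP", "Shift_JIS",
        "EUC-KR", "UTF-16BE", "UTF-16LE",
    )
}

# "ISO-8859-n" stands for a family of names; matching n as any run of
# digits is an assumption made for this sketch.
ISO_8859_N = re.compile(r"iso-8859-\d+")


def proposed_name(current_name: str) -> str:
    """Return the proposed spelling for a currently lowercase encoding name."""
    key = current_name.lower()
    if key in PROPOSED_NAMES:
        return PROPOSED_NAMES[key]
    if ISO_8859_N.fullmatch(key):
        return key.upper()
    # Not covered by the proposal: leave the name as it is.
    return current_name


if __name__ == "__main__":
    for name in ("utf-8", "shift_jis", "iso-8859-15", "windows-1252", "gb18030"):
        print(name, "->", proposed_name(name))
```

Note that only the casing changes: lowercasing any returned name gives back the original lowercase name.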
I place more value on the fact that you can now predict what characterSet returns. With your proposed change you would need to know that windows-1252 is not spelled Windows-1252, and that gb18030 is an exception.