Bugzilla – Bug 23971
Define an encoding for formerly latin1
Last modified: 2014-11-04 15:04:13 UTC
The web does not have latin1 (the "latin1" label maps to windows-1252), but we do need a true latin1 for HTTP and similar internal uses. We should probably also expose it to the API.
"identity" makes some sense, but only in one direction. "bikeshed" also makes sense.
We should not expose this encoding to HTTP charset= or <meta> overrides. This is only for internal matters (such as XMLHttpRequest) and developers using the API.
Would be good to see some pointers to HTTP code that actually uses de jure ISO-8859-1 and not windows-1252 decoding. Do you have pointers?
That's a good point. In XMLHttpRequest this is used for methods and headers. While most of that is restricted to 0x00 - 0x7F, header values can be pretty much any octet.
http://dump.testsuite.org/xhr/header-with-bytes.php is an example of that. The header value contains the octet 0x80, which comes out as U+0080 (and not as €, which is what windows-1252 decoding would give).
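To make the difference concrete, here is a small sketch in Python (used only for illustration, not taken from any user agent). It assumes Python's `latin-1` codec stands in for de jure ISO-8859-1 and `cp1252` for windows-1252; the variable names are made up.

```python
# One octet, 0x80, as in the header test above.
raw = bytes([0x80])

# De jure ISO-8859-1: every byte maps to the code point with the same value.
as_latin1 = raw.decode("latin-1")

# windows-1252: 0x80 is assigned to the euro sign instead.
as_windows_1252 = raw.decode("cp1252")

print(f"U+{ord(as_latin1):04X}")        # U+0080
print(f"U+{ord(as_windows_1252):04X}")  # U+20AC
```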
"unicodelatin1" might be an acceptable name. Unicode refers to this block as "Latin-1 Supplement" so that does not seem too bad.
My understanding is that user agents have dedicated routines for the original "latin1" type of conversion in the HTTP layer and potentially elsewhere. If we do indeed want to expose something similar, we should probably add static methods on String and ArrayBuffer (or some such) for that conversion.
The Encoding API can then remain for actual encodings only.
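A minimal sketch of what such dedicated conversion routines might do, written in Python for illustration only. The names `bytes_to_string` and `string_to_bytes` are hypothetical and are not proposed API names; the real statics, if added, would live on String and ArrayBuffer or similar.

```python
def bytes_to_string(data: bytes) -> str:
    """Map each byte to the code point with the same value. Never fails."""
    return "".join(chr(b) for b in data)


def string_to_bytes(text: str) -> bytes:
    """Map each code point back to one byte; reject anything above U+00FF."""
    for ch in text:
        if ord(ch) > 0xFF:
            raise ValueError(f"U+{ord(ch):04X} does not fit in a single byte")
    return bytes(ord(ch) for ch in text)


# Every possible byte value survives a round trip unchanged.
all_bytes = bytes(range(256))
round_tripped = string_to_bytes(bytes_to_string(all_bytes))
```

Decoding is total, while the reverse direction has to reject any code point above U+00FF. That asymmetry is why a name like "identity" only makes sense in one direction.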