Bug 23971 - Define an encoding for formerly latin1
Product: WHATWG
Classification: Unclassified
Component: Encoding
Hardware: PC All
Importance: P2 normal
Target Milestone: Unsorted
Assigned To: Anne
Reported: 2013-12-03 14:15 UTC by Anne
Modified: 2014-11-04 15:04 UTC
Description Anne 2013-12-03 14:15:58 UTC
The web does not have a true latin1 encoding (the "latin1" label maps to windows-1252), but we do need one for HTTP and similar byte-oriented uses. We should probably also expose it through the API.

As a name, "identity" makes some sense, but only in one direction: every byte maps to a code point, but not every code point maps back to a byte. "bikeshed" also makes sense.

We should not expose this encoding to HTTP charset= or <meta> overrides. This is only for internal matters (such as XMLHttpRequest) and developers using the API.
Comment 1 Henri Sivonen 2013-12-03 14:45:08 UTC
It would be good to see pointers to HTTP code that actually uses de jure ISO-8859-1 decoding rather than windows-1252. Do you have any?
Comment 2 Anne 2013-12-03 14:52:54 UTC
That's a good point. In XMLHttpRequest this is used for methods and headers. While most of that is restricted to 0x00-0x7F, header values can contain pretty much any octet.
Comment 3 Anne 2013-12-12 16:23:39 UTC
http://dump.testsuite.org/xhr/header-with-bytes.php is an example of that. The header has the byte 0x80 as its value, which comes out as U+0080 (and not as €, which is what windows-1252 decoding would give).
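
For illustration, the byte-to-code-point behaviour described in this comment can be sketched roughly as below. This is only a sketch under assumptions: the helper name decodeOctetsAsLatin1 and the sample header bytes are made up for this example, and it is not taken from any user agent's HTTP code.

```typescript
// Hypothetical helper (the name is illustrative, not from any spec):
// map each octet 0x00-0xFF to the code point with the same number.
function decodeOctetsAsLatin1(bytes: Uint8Array): string {
  let out = "";
  bytes.forEach((octet) => {
    out += String.fromCharCode(octet);
  });
  return out;
}

// Made-up header value: "a", the byte 0x80, "b".
const headerValue = new Uint8Array([0x61, 0x80, 0x62]);
const decoded = decodeOctetsAsLatin1(headerValue);

// decoded.charCodeAt(1) is 0x80, i.e. U+0080.
// A windows-1252 decoder would map the same byte to U+20AC (€) instead.
```

The only design point here is that no lookup table is involved: the octet value is reused directly as the code point, which is what distinguishes this from windows-1252 in the 0x80-0x9F range.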
Comment 4 Anne 2014-04-28 15:24:52 UTC
"unicodelatin1" might be an acceptable name. Unicode refers to this block as "Latin-1 Supplement" so that does not seem too bad.
Comment 5 Anne 2014-11-04 15:04:13 UTC
My understanding is that user agents have dedicated routines for the original "latin1" type of conversion in the HTTP layer and potentially elsewhere. If we do want something similar, we should probably add statics on String and ArrayBuffer, or something along those lines, for that conversion.

The Encoding API can then remain for actual encodings only.
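
As a rough sketch of what such conversion helpers might look like, assuming plain functions rather than real statics: the names latin1ToString and stringToLatin1 are hypothetical and do not exist on String or ArrayBuffer. The sketch also shows why the mapping only works unconditionally in one direction.

```typescript
// Hypothetical stand-ins for the suggested statics; names are illustrative only.

// Bytes to string: always succeeds, each octet becomes the same-numbered code point.
function latin1ToString(bytes: Uint8Array): string {
  let out = "";
  bytes.forEach((octet) => {
    out += String.fromCharCode(octet);
  });
  return out;
}

// String to bytes: only defined for code units up to 0xFF, so it can fail.
function stringToLatin1(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    if (unit > 0xff) {
      throw new RangeError("code unit above U+00FF at index " + i);
    }
    out[i] = unit;
  }
  return out;
}

// Round trip for a string that stays within U+0000-U+00FF.
const roundTripped = latin1ToString(stringToLatin1("caf\u00E9"));
```

Throwing on out-of-range input is one possible choice for this sketch; a real API could equally decide to replace or truncate such code units.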