This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
IE and Firefox use asymmetric mapping tables for some charsets. Mainly, ISO charsets are decoded with the corresponding Windows charsets, while encoding stays strict. IMO it's desirable to employ this approach to keep "willful violations" of the IANA registry to a minimum. iso-8859-9, latin5, l5, csISOLatin5, and iso-ir-148 are not aliases of windows-1254.
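A minimal sketch of the asymmetric approach described above, using Python's built-in codecs. The function names are hypothetical and this is not taken from IE or Firefox; it only illustrates "decode with the Windows table, encode with the ISO table" for iso-8859-9 / windows-1254.

```python
# Hypothetical helpers illustrating an asymmetric mapping for iso-8859-9.
# Assumption: Python's "cp1254" and "iso8859_9" codecs stand in for the
# windows-1254 and ISO-8859-9 mapping tables.

def decode_latin5(data: bytes) -> str:
    # Lenient direction: treat iso-8859-9 input as windows-1254, so bytes in
    # 0x80-0x9F become printable characters where windows-1254 defines them.
    # Bytes that windows-1254 leaves undefined become U+FFFD.
    return data.decode("cp1254", errors="replace")


def encode_latin5(text: str) -> bytes:
    # Strict direction: only characters that really exist in ISO-8859-9 are
    # allowed; anything else raises UnicodeEncodeError.
    return text.encode("iso8859_9")
```

With these helpers, the byte 0x80 decodes to the euro sign, but the euro sign cannot be encoded back, because ISO-8859-9 has no such character.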
See also bug 15340. At least ISO encodings need to be separated from Windows encodings so that conformance checkers can report parse errors.
Since these are legacy encodings, is it really worth caring that much about the IANA registry? It seems better to simplify code and lower the barrier to entry for new players.
I don't think the barrier is that high, because browsers can ignore parse errors (that is, it's sufficient to just replace the mapping tables). But conformance checkers cannot.
Right, about conformance checkers. I think they should flag everything that is not UTF-8. I don't really think it's worthwhile for them to flag that your usage of iso-8859-1 is actually windows-1252. Henri, Ian, opinions?
(In reply to comment #4)
> Right, about conformance checkers. I think they should flag everything that is
> not UTF-8. I don't really think it's worthwhile for them to flag that your
> usage of iso-8859-1 is actually windows-1252.

If you mean requiring conformance checkers to emit warning messages for any document that's not UTF-8, I'm not sure Richard would be too keen on that.
I think if a document is labeled as ISO-8859-1 but has characters that are going to be interpreted differently than ISO-8859-1 says they should be, the validator should give an error message. This is what the HTML spec currently requires for HTML docs.
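One way a checker might detect this case, as a rough sketch: ISO-8859-1 and windows-1252 only disagree on bytes 0x80-0x9F (C1 controls in ISO-8859-1, mostly printable characters in windows-1252), so it is enough to look for those bytes. The function name is hypothetical and this is not the wording of any spec requirement.

```python
from typing import List


def find_c1_bytes(data: bytes) -> List[int]:
    # Return the offsets of bytes in 0x80-0x9F, the only range where
    # ISO-8859-1 and windows-1252 assign different characters.
    return [i for i, b in enumerate(data) if 0x80 <= b <= 0x9F]


# Example: 0x93 and 0x94 are curly quotes in windows-1252 but C1 control
# characters in ISO-8859-1, so a document labeled ISO-8859-1 that contains
# them would be flagged at offsets 0 and 3.
sample = b"\x93hi\x94"
offsets = find_c1_bytes(sample)
```

A document with no bytes in that range reads the same under both encodings, so nothing needs to be reported for it.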
1. Per the Encoding Standard there is no difference between iso-8859-1 and windows-1252. I think that is fine, unless there is some compatibility problem with that.
2. I think we should make non-utf-8 usage non-conforming because there are too many traps with URLs, form submission, and other formats that only work well with utf-8.

Per that I'm going to mark this WONTFIX.
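For illustration, the label handling in point 1 can be sketched as a lookup table in which ISO labels resolve to the Windows encodings. This is a small hypothetical excerpt, not the full label list from the Encoding Standard, and `get_encoding` is an assumed name.

```python
from typing import Optional

# Partial, illustrative label table: ISO labels resolve to Windows encodings.
LABELS = {
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "l1": "windows-1252",
    "windows-1252": "windows-1252",
    "iso-8859-9": "windows-1254",
    "latin5": "windows-1254",
    "l5": "windows-1254",
    "csisolatin5": "windows-1254",
    "iso-ir-148": "windows-1254",
    "windows-1254": "windows-1254",
}


def get_encoding(label: str) -> Optional[str]:
    # Strip ASCII whitespace and lowercase before the lookup; unknown labels
    # yield None.
    return LABELS.get(label.strip("\t\n\x0c\r ").lower())
```

Under this model, `get_encoding("ISO-8859-1")` and `get_encoding("windows-1252")` return the same encoding name, which is why no separate ISO decoder is needed.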