This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.
'encoding' is currently defined as follows: "An encoding defines a mapping from a code point to one or more bytes (and vice versa)." This does not work e.g. for iso-2022-jp, because that can only be explained as a mapping from a sequence of one or more code points to a sequence of one or more bytes (and vice versa).
Why is that? I can see how this is the case for big5 though. How about "An encoding defines a mapping from a code point sequence to a byte sequence (and vice versa)."?
(In reply to comment #1)
> Why is that? I can see how this is the case for big5 though. How about "An
> encoding defines a mapping from a code point sequence to a byte sequence
> (and vice versa)."?

The fix looks okay! But I don't understand why you think this is needed for Big5, but not for iso-2022-jp. If I have the byte sequence 0x24 0x24 in iso-2022-jp, this can be either "$$" or "い" (Hiragana I). So you need context to know which it is; you can't just convert one code point at a time. For Big5, on the other hand, you can convert one code point at a time, assuming you get "packets" of bytes that each represent a code point.
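The statefulness described above can be demonstrated with Python's built-in iso2022_jp codec (a small illustration; any conformant decoder behaves the same way): the identical byte pair 0x24 0x24 decodes differently depending on whether an escape sequence has switched the decoder into JIS X 0208 mode.

```python
# The same two bytes 0x24 0x24 decode to different code points
# depending on decoder state, so decoding cannot be defined one
# byte "packet" at a time without context.
ascii_mode = b"\x24\x24".decode("iso2022_jp")
# ESC $ B switches to JIS X 0208; ESC ( B switches back to ASCII.
jis_mode = b"\x1b$B\x24\x24\x1b(B".decode("iso2022_jp")

print(ascii_mode)  # $$
print(jis_mode)    # い (U+3044, HIRAGANA LETTER I)
```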
https://github.com/whatwg/encoding/commit/088780df57d0b4d567aad175d2be3d46980b7561 Sure, you need multiple bytes for one code point. The definition already covered that... big5, however, can sometimes emit two code points for a single byte sequence.
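For the big5 side, the Encoding Standard's decoder special-cases four index pointers that emit two code points each (a base letter plus a combining mark). A minimal sketch in Python, assuming the spec's pointer formula; the four-entry table is just the relevant excerpt, not the full big5 index:

```python
# Per the Encoding Standard's big5 decoder, these four pointers map
# to a sequence of TWO code points each.
TWO_CODE_POINT_ENTRIES = {
    1133: "\u00CA\u0304",  # Ê followed by combining macron
    1135: "\u00CA\u030C",  # Ê followed by combining caron
    1164: "\u00EA\u0304",  # ê followed by combining macron
    1166: "\u00EA\u030C",  # ê followed by combining caron
}

def big5_pointer(lead: int, trail: int) -> int:
    """Pointer for a two-byte big5 sequence, per the spec's formula."""
    offset = 0x40 if trail < 0x7F else 0x62
    return (lead - 0x81) * 157 + (trail - offset)

# Bytes 0x88 0x62 yield pointer 1133, which decodes to the two code
# points U+00CA U+0304 — so a per-code-point definition cannot cover it.
print(big5_pointer(0x88, 0x62))  # 1133
```

This is why the definition has to speak of code point *sequences* in both directions, not just byte sequences.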