This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 26885 - Drop JIS X 212 from ISO-2022-JP
Summary: Drop JIS X 212 from ISO-2022-JP
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: Encoding
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+encodingspec
Reported: 2014-09-22 17:35 UTC by Jungshik Shin
Modified: 2014-11-04 14:22 UTC

Description Jungshik Shin 2014-09-22 17:35:26 UTC
EUC-JP in the encoding spec does not support JIS X 212 even though the formal definition of EUC-JP does include it. 

OTOH, ISO-2022-JP in the encoding spec does support JIS X 212 to my surprise. 

Is there a reason to support JIS X 212 in ISO-2022-JP? It is not part of the original ISO-2022-JP (RFC 1468: https://www.ietf.org/rfc/rfc1468.txt). It was added later in RFC 1554, but I don't think it is widely used.

Blink has never supported ISO-2022-JP-2 with JIS X 212. It only supports the original ISO-2022-JP. Supporting JIS X 212 adds about 50 kB to our build and I'd rather avoid it.
Comment 1 Anne 2014-09-22 18:05:32 UTC
Per bug 19939 comment 5 IE probably does not support this either.
Comment 2 Masatoshi Kimura 2014-09-23 00:46:26 UTC
(In reply to Jungshik Shin from comment #0)
> EUC-JP in the encoding spec does not support JIS X 212 even though the
> formal definition of EUC-JP does include it. 

Actually the EUC-JP decoder in the encoding spec DOES support JIS X 212.
https://encoding.spec.whatwg.org/#euc-jp-decoder
> If lead and byte are both in the range 0xA1 to 0xFE, set code point to the index code point for (lead − 0xA1) × 94 + byte − 0xA1 in index jis0208 if the euc-jp jis0212 flag is unset and in index jis0212 otherwise.
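
The pointer arithmetic in the quoted decoder step can be sketched in Python as follows. This is only an illustration: `euc_jp_pointer` is a hypothetical helper name, and the lookup in index jis0208 or index jis0212 is left out.

```python
from typing import Optional


def euc_jp_pointer(lead: int, byte: int) -> Optional[int]:
    """Return the index pointer for a two-byte EUC-JP sequence, or None.

    Hypothetical helper: it mirrors only the arithmetic of the quoted
    spec step, not the full decoder state machine.
    """
    if 0xA1 <= lead <= 0xFE and 0xA1 <= byte <= 0xFE:
        return (lead - 0xA1) * 94 + (byte - 0xA1)
    return None


# The same pointer is looked up in index jis0212 when the euc-jp jis0212
# flag is set, and in index jis0208 otherwise.
print(euc_jp_pointer(0xB0, 0xA1))  # prints 1410
```

The two indexes share one pointer formula, so the only difference between the JIS X 208 and JIS X 212 paths in this step is which table is consulted.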

Did Chrome drop support for decoding JIS X 212 in the EUC-JP decoder?
Comment 3 Jungshik Shin 2014-09-24 05:58:27 UTC
Oops. While recently aligning Blink's Shift-JIS with the encoding spec, I forgot that EUC-JP does support JIS X 212 when converting to Unicode (decoding). Blink's EUC-JP has been aligned with the encoding spec and does support JIS X 212 when decoding.
Comment 4 Anne 2014-09-24 09:26:16 UTC
It sounds like this is INVALID.
Comment 5 Jungshik Shin 2014-09-24 17:24:15 UTC
Well, given the way ICU's ISO-2022-JP-2 implementation is written, the JIS X 212 table is not shared with the EUC-JP converter. Unless I change ICU to share the JIS X 212 table, Blink has to live with a build size increase of tens of kB.

In addition, I would have to make it unidirectional (decoding only).

These are doable, but it is not clear that there is any user benefit other than compliance with the current spec.

With "ISO-2022-JP" label, Blink (Chrome, Opera) and Safari do not support JIS X 212. They never did (except that I don't know whether the old Opera ever did). 

With the "ISO-2022-JP-1" (or "ISO-2022-JP-2") label, Blink and Safari do support JIS X 212 in both directions. Needless to say, virtually nobody uses 'ISO-2022-JP-[12]' in charset declarations, so it does not count.

MSIE does not support JIS X 212 either in ISO-2022-JP. 

That leaves only Firefox supporting JIS X 212 in ISO-2022-JP at the moment.
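
The difference between the labels can be illustrated outside of any browser with Python's standard codecs. This is a sketch about CPython's `iso2022_jp`, `iso2022_jp_1`, and `euc_jp` codecs only; it says nothing about Blink, Gecko, or the encoding spec. It assumes that U+4E02 is a character present in JIS X 212 but not in JIS X 208, and the helper names are hypothetical.

```python
from typing import Optional

# Assumption: U+4E02 is in JIS X 0212 but not in JIS X 0208.
SAMPLE = "\u4e02"


def try_encode(text: str, codec: str) -> Optional[bytes]:
    """Encode text with the given codec, or return None if unencodable."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


def try_decode(data: bytes, codec: str) -> Optional[str]:
    """Decode data with the given codec, or return None if it is rejected."""
    try:
        return data.decode(codec)
    except UnicodeDecodeError:
        return None


# Compare the plain ISO-2022-JP codec with the -1 variant and EUC-JP.
for codec in ("iso2022_jp", "iso2022_jp_1", "euc_jp"):
    print(codec, try_encode(SAMPLE, codec))

# Bytes produced with the JIS X 0212 designation (ESC $ ( D) are then
# fed back to both ISO-2022-JP variants.
encoded = try_encode(SAMPLE, "iso2022_jp_1")
if encoded is not None:
    print(try_decode(encoded, "iso2022_jp"), try_decode(encoded, "iso2022_jp_1"))
```

A decoder that knows only the RFC 1468 escape sequences has no table to switch to when it sees the JIS X 212 designation, which is the behaviour described above for the "ISO-2022-JP" label.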
Comment 6 Anne 2014-09-25 07:55:06 UTC
Jungshik, thanks for explaining further. I take it Blink will stop supporting "ISO-2022-JP-1" and "ISO-2022-JP-2"?

Masatoshi, Henri, Simon, would we be okay with dropping 212 support from ISO-2022-JP in Gecko?
Comment 7 Masatoshi Kimura 2014-09-25 14:23:48 UTC
Gecko's ISO-2022-JP decoder has JIS X 0212 support mainly because of Thunderbird.
Comment 8 Henri Sivonen 2014-09-25 16:19:45 UTC
(In reply to Jungshik Shin from comment #5)
> MSIE does not support JIS X 212 either in ISO-2022-JP. 
> 
> That leaves only Firefox supporting JIS X 212 in ISO-2022-JP at the moment.

In that case, I think we should remove the ISO-2022-JP-2 aspects of the ISO-2022-JP implementation in Firefox (mozilla-central) and leave it to Thunderbird to support ISO-2022-JP-2 if deemed necessary (should be discoverable via telemetry). (The ISO-2022-JP-2 aspects involve other weird stuff like a nested GB18030 decoder and a nested EUC-KR decoder.)
Comment 9 Simon Montagu 2014-09-28 09:10:40 UTC
The nested decoders in Gecko's ISO-2022-JP-2 implementation, weird as they may be, are there to avoid the same duplication of mapping tables that Jungshik wants to avoid in Blink.

Ideally I agree that we want to remove ISO-2022-JP-2 support (not just JIS X 212) from Firefox and from Encoding, though retaining support in Thunderbird, if desired, will be more complicated than other encodings that we've removed.
Comment 10 Anne 2014-09-28 10:48:16 UTC
As far as I know, removing iso-2022-jp entirely is a nonstarter. It does seem, however, that Firefox's implementation is sufficiently special that it will not be copied by others, and that the specification should therefore not follow it.