This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
Section: http://whatwg.org/specs/web-apps/current-work/#character-encodings-0 Comment: EUC-JP and ISO-2022-JP also need replacement encodings: CP51932 (or eucJP-ms) and CP50221. Posted from: 210.138.109.139
Waiting for Anne to do this.
This bug predates the HTML Working Group Decision Policy. If you are satisfied with the resolution of this bug, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html This bug is now being moved to VERIFIED. Please respond within two weeks. If this bug is not closed, reopened or escalated within two weeks, it may be marked as NoReply and will no longer be considered a pending comment.
Created attachment 832: EUC-JP on WinIE
Created attachment 833: EUC-JP on MacFx3.6
Created attachment 834: EUC-JP on Safari4
Created attachment 835: EUC-JP on MacChrome5
Created attachment 836: EUC-JP on WinOpera10
First, I will describe EUC-JP.

See the attached images whose names begin with "EUC-JP". They show http://coq.no/X/charset5/test-EUC-JP.php?EUC-JP in:
* Internet Explorer 6 on Windows XP
* Firefox 3.6 on Mac OS X 10.5
* Safari 4.0.5 on Mac OS X 10.5
* Google Chrome 5 on Mac OS X 10.5
* Opera 10.0 on Windows Vista

All of them can show:
(0) ASCII (yen sign/back solidus is beyond the scope of this ticket)
(1) JIS X 0208 before 1990
(2) Half-width katakana
* NEC selected IBM extended characters (the 1st and 2nd characters of those labeled `IBM')

IE, Firefox, Chrome and Opera can show:
* NEC special characters (labeled `KanjiTalk 6/7, NEC' and `NEC')

Firefox, Safari, Chrome and Opera can show:
* JIS X 0212 characters derived from IBM extended characters (3rd-6th of `IBM')

Firefox, Chrome and Opera can show:
(3) JIS X 0212-1990

Safari and Chrome can show:
* IBM extended character (the last one of `IBM')

No browser can show:
(1) JIS X 0208 after 1990
* DEC Kanji and KanjiTalk

IANA defines EUC-JP as follows, but real implementations behave as described above.

Name: Extended_UNIX_Code_Packed_Format_for_Japanese
MIBenum: 18
Source: Standardized by OSF, UNIX International, and UNIX Systems Laboratories Pacific. Uses ISO 2022 rules to select
  code set 0: US-ASCII (a single 7-bit byte set)
  code set 1: JIS X0208-1990 (a double 8-bit byte set) restricted to A0-FF in both bytes
  code set 2: Half Width Katakana (a single 7-bit byte set) requiring SS2 as the character prefix
  code set 3: JIS X0212-1990 (a double 7-bit byte set) restricted to A0-FF in both bytes requiring SS3 as the character prefix
Alias: csEUCPkdFmtJapanese
Alias: EUC-JP (preferred MIME name)

CP51932 is:
(0) ASCII (yen sign/back solidus is beyond the scope of this ticket)
(1) JIS X 0208-1983
    NEC special characters
    NEC selected IBM extended characters
(2) Half-width katakana
http://nkf.sourceforge.jp/ucm/cp51932.ucm

All of the browsers except Safari can show this character set.
Safari cannot show the NEC special characters, but Chrome, which uses the same engine as Safari (WebKit), can, so I think this is a Safari bug.
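The code-set structure in the IANA definition quoted above can be illustrated with a short Python sketch. It only classifies bytes by code set, using the standard single-shift values SS2 = 0x8E and SS3 = 0x8F; it does not map anything to Unicode. The function names `is_high` and `split_code_sets` are hypothetical, and the byte ranges (0xA1-0xFE for two-byte sets, 0xA1-0xDF for half-width katakana) are the ranges actually used by 94-character sets, not the looser A0-FF wording in the registry text.

```python
SS2 = 0x8E  # single shift 2: prefix for code set 2 (half-width katakana)
SS3 = 0x8F  # single shift 3: prefix for code set 3 (JIS X 0212)


def is_high(byte):
    """True for bytes in 0xA1-0xFE, the range used by two-byte code sets."""
    return 0xA1 <= byte <= 0xFE


def split_code_sets(data):
    """Split an EUC-JP byte string into (code_set, raw_bytes) pairs.

    code_set is 0, 1, 2 or 3, or None for a byte that does not fit
    the four-code-set structure. No character tables are consulted.
    """
    out = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            # Code set 0: one ASCII byte.
            out.append((0, data[i:i + 1]))
            i += 1
        elif b == SS2 and i + 1 < n and 0xA1 <= data[i + 1] <= 0xDF:
            # Code set 2: SS2 followed by one katakana byte.
            out.append((2, data[i:i + 2]))
            i += 2
        elif (b == SS3 and i + 2 < n
              and is_high(data[i + 1]) and is_high(data[i + 2])):
            # Code set 3: SS3 followed by two high bytes.
            out.append((3, data[i:i + 3]))
            i += 3
        elif is_high(b) and i + 1 < n and is_high(data[i + 1]):
            # Code set 1: two high bytes.
            out.append((1, data[i:i + 2]))
            i += 2
        else:
            # Structurally invalid byte; reported individually.
            out.append((None, data[i:i + 1]))
            i += 1
    return out
```

A two-byte sequence with lead byte 0xAD (row 13, where the NEC special characters live in CP51932) is classified as code set 1 by this sketch. That matches the point of the comment above: the browsers differ in which table entries they accept, not in the byte structure.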
Forgive me, for I am not well-versed in these encodings. What should I put in the spec in the "Character encoding overrides" table?
(In reply to comment #9)
> Forgive me, for I am not well-versed in these encodings.
>
> What should I put in the spec in the "Character encoding overrides" table?

I think what you mean is that saying "'EUC-JP' is actually Windows Codepage 51932" would not be kind to readers of HTML5. That is reasonable, so I'm trying to register CP51932: http://mail.apps.ietf.org/ietf/charsets/msg01877.html
Thank you for starting the registration process. Much appreciated. I'll update the spec once the registry is updated.
Marking this REMIND for now for tracking purposes; please feel free to reopen whenever the encoding is registered. I'll check this periodically.
CP51932 has been registered now. http://www.iana.org/assignments/character-sets http://www.iana.org/assignments/charset-reg/CP51932 You can use it as a replacement encoding for EUC-JP.
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given below
Rationale: Concurred with reporter's comments.

I've added the EUC-JP to CP51932 mapping.

Should there also be a mapping for ISO-2022-JP? This was mentioned in the first comment but wasn't mentioned afterwards.

Also, do you know what I should use as the EUC-JP reference?
Checked in as WHATWG revision r5560. Check-in comment: Canonical mapping for EUC-JP for compat reasons. http://html5.org/tools/web-apps-tracker?from=5559&to=5560
(In reply to comment #14)
> EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
> satisfied with this response, please change the state of this bug to CLOSED. If
> you have additional information and would like the editor to reconsider, please
> reopen this bug. If you would like to escalate the issue to the full HTML
> Working Group, please add the TrackerRequest keyword to this bug, and suggest
> title and text for the tracker issue; or you may create a tracker issue
> yourself, if you are able to do so. For more details, see this document:
> http://dev.w3.org/html5/decision-policy/decision-policy.html
>
> Status: Accepted
> Change Description: see diff given below
> Rationale: Concurred with reporter's comments.
>
> I've added the EUC-JP to CP51932 mapping.

Thank you!

> Should there also be a mapping for ISO-2022-JP? This was mentioned in the first
> comment but wasn't mentioned afterwards.

I posted a registration request for CP50220 on 2010-09-17. Once it is registered, it should be used as the replacement for ISO-2022-JP.

> Also, do you know what I should use as the EUC-JP reference?

EUC-JP, which includes US-ASCII, JIS X 0201 Katakana, JIS X 0208, and JIS X 0212, is defined in "UI-OSF 日本語環境実装規約 Version 1.1".
http://home.m05.itscom.net/numa/uocjle-a4.pdf (voluntarily uploaded)
http://home.m05.itscom.net/numa/uocjleE.pdf (voluntarily uploaded)

It is referenced by at least Japanese-Locale-Policy and the Solaris 10 Japanese manual.
http://www.linux.or.jp/JF/JFdocs/Japanese-Locale-Policy.txt
http://docs.sun.com/app/docs/doc/819-0364/ja.locale-10002?a=view
FWIW, Gecko indeed uses Microsoft-style data tables instead of the de jure tables for all three Japanese encodings. (Except on OS/2 where IBM-style tables are used instead.)
For the reference, I use a single document title, a list of names of editors, if any, and the name of the standards organisation that published the document, if any. Could you let me know what I should use for EUC-JP based on your comments above? Ideally using just ASCII and English; I'm afraid my understanding of Japanese is rather limited. :-(
(In reply to comment #18)
> For the reference, I use a single document title, a list of names of editors,
> if any, and the name of the standards organisation that published the document,
> if any. Could you let me know what I should use of EUC-JP based on your
> comments above? Ideally using just ASCII and English, I'm afraid my
> understanding of Japanese is rather limited. :-(

It should be "Definition and Notes of Japanese EUC". It was written by UI-OSF-USLP (the Open Software Foundation, Inc., UNIX International, Inc., and UNIX System Laboratories Pacific, Ltd.); see C.1.1. It is included in Annex C of http://home.m05.itscom.net/numa/uocjleE.pdf

P.S. I'm OK with "Y. Naruse" in http://html5.org/tools/web-apps-tracker?from=5559&to=5560
> It should be "Definition and Notes of Japanese EUC".
> It is written by UI-OSF-USLP.
> (the Open Software Foundation, Inc., UNIX International, Inc, and UNIX System
> Laboratories Pacific, Ltd.) see C.1.1
> It is included in Annex C of http://home.m05.itscom.net/numa/uocjleE.pdf

Awesome, thanks. I've updated the spec (diff below).

> P.S. I'm ok about "Y. Naruse" in
> http://html5.org/tools/web-apps-tracker?from=5559&to=5560

Thanks, that makes my life easier. :-)

I'll mark this bug REMIND again while we wait for IANA to register CP50220. Please don't hesitate to reopen the bug once it's registered so that I can update the spec accordingly.

Thank you so much for your patience and help with this bug. It is much appreciated.
Checked in as WHATWG revision r5607. Check-in comment: EUC-JP reference. http://html5.org/tools/web-apps-tracker?from=5606&to=5607
(In reply to comment #20)
> I'll mark this bug REMIND again while we wait for IANA to register CP50220.
> Please don't hesitate to reopen the bug once it's registered so that I can
> update the spec accordingly.

CP50220 has recently been registered as MIBenum 2260. http://www.iana.org/assignments/character-sets
Awesome, thanks.
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given below
Rationale: I've added the ISO-2022-JP mapping as requested. Please check the diff below and the spec as it now stands, and let me know if there's anything further that needs doing (reopen the bug if so). Thanks again for your help, much appreciated!
Checked in as WHATWG revision r6646. Check-in comment: Define compatibility mapping for ISO-2022-JP. http://html5.org/tools/web-apps-tracker?from=6645&to=6646
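For readers unfamiliar with how an override table of this kind is applied, here is a minimal Python sketch. It assumes only the two mappings agreed in this bug (EUC-JP to CP51932 and ISO-2022-JP to CP50220); the real "Character encoding overrides" table in the spec may contain other entries, and the names `ENCODING_OVERRIDES` and `resolve_encoding` are hypothetical.

```python
# Hypothetical override table, limited to the two mappings from this bug.
ENCODING_OVERRIDES = {
    "euc-jp": "CP51932",
    "iso-2022-jp": "CP50220",
}


def resolve_encoding(label):
    """Return the replacement encoding name for a declared label.

    Labels are compared case-insensitively after trimming whitespace.
    Labels with no override are returned trimmed but otherwise unchanged.
    """
    trimmed = label.strip()
    return ENCODING_OVERRIDES.get(trimmed.lower(), trimmed)
```

The returned names are labels only; the sketch does not assume that a decoder for CP51932 or CP50220 is available in any particular runtime.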
I'm ok, thanks!