This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 21146 - Separate big5-hkscs from big5
Summary: Separate big5-hkscs from big5
Status: RESOLVED INVALID
Alias: None
Product: WHATWG
Classification: Unclassified
Component: Encoding
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+encodingspec
Reported: 2013-02-27 15:32 UTC by Masatoshi Kimura
Modified: 2014-09-25 22:07 UTC

Description Masatoshi Kimura 2013-02-27 15:32:18 UTC
Please see the Gecko bug: https://bugzilla.mozilla.org/show_bug.cgi?id=845743
Comment 1 Anne 2013-02-27 16:52:09 UTC
So we are going to add this back because of an intranet application? This is a tad hypocritical as we have previously refused to standardize behavior from Internet Explorer where they have said that intranet applications depended upon it.

I guess that as long as Chrome supports the encoding too, though, there is not much leverage here either way.
Comment 2 Henri Sivonen 2013-03-01 15:03:30 UTC
Before changing the spec, we should check whether big5 in Gecko implemented the requirements of the Encoding Standard for the decoder.

And yes, it's pretty sad to do this for one intranet app (though breaking someone's CRM would be uncool, too).
Comment 3 Masatoshi Kimura 2013-03-02 03:00:08 UTC
At least Gecko's implementation passed <https://code.google.com/p/stringencoding/source/browse/test-big5.js>.
Comment 4 Simon Pieters 2013-03-28 09:55:22 UTC
The test case in the Gecko bug (https://bugzilla.mozilla.org/attachment.cgi?id=718907) has this byte sequence for the interesting character:

0x91 0x6f

and expects this character (what I get in Firefox Nightly and Opera with big5-hkscs label):

U+9C02

I cloned https://code.google.com/p/stringencoding/ and changed the test-big5.js file as follows:

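test(
  function () {
    // Byte sequence from the Gecko test case and the character it should decode to.
    var bytes = [0x91, 0x6f];
    var string = "\u9c02";
    // Decode with the unified "big5" label; TextDecoder must be invoked with `new`.
    assert_equals(new TextDecoder("big5").decode(new Uint8Array(bytes)), string, "decoded");
  },
  "big5"
);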

Then I ran the tests.html file and got the following result for the above test:

Pass	big5

This means that, assuming the stringencoding project implements the Encoding Standard correctly, the spec would pass the test case in the Gecko bug.
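
For readers who want to retrace the check above without the stringencoding checkout, here is a minimal sketch. It assumes a runtime with a built-in TextDecoder that supports Big5 (for example a browser, or a Node.js build with full ICU). The helper names big5Pointer, labelsShareEncoding and decodeWith are hypothetical and not from the thread. big5Pointer follows the pointer formula of the Encoding Standard's Big5 decoder; whether pointer 2559 maps to U+9C02 depends on the index in use, which this sketch does not embed.

```javascript
// Pointer computation as described for the Big5 decoder in the Encoding Standard.
// Returns null when the byte pair falls outside the two-byte ranges.
function big5Pointer(lead, trail) {
  const leadOk = lead >= 0x81 && lead <= 0xfe;
  const trailOk =
    (trail >= 0x40 && trail <= 0x7e) || (trail >= 0xa1 && trail <= 0xfe);
  if (!leadOk || !trailOk) return null;
  // Trail bytes come in two runs; the offset closes the gap between them.
  const offset = trail < 0x7f ? 0x40 : 0x62;
  return (lead - 0x81) * 157 + (trail - offset);
}

// True when two labels resolve to the same encoding name in this runtime.
function labelsShareEncoding(a, b) {
  return new TextDecoder(a).encoding === new TextDecoder(b).encoding;
}

// Decode an array of byte values using the given encoding label.
function decodeWith(label, bytes) {
  return new TextDecoder(label).decode(new Uint8Array(bytes));
}
```

big5Pointer(0x91, 0x6f) evaluates to 2559. In a runtime that follows the Encoding Standard, labelsShareEncoding("big5", "big5-hkscs") should be true, and decodeWith("big5", [0x91, 0x6f]) should equal decodeWith("big5-hkscs", [0x91, 0x6f]); the comment above reports "\u9c02" for that byte pair.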
Comment 5 Simon Pieters 2013-03-28 09:59:58 UTC
Also note that the Web compat analysis that led to the current spec recommended a unified label:

[[
Not treating big5 and big5-hkscs as aliases is clearly breaking  
pages, so I would recommend a single mapping for both.

Of the existing mappings, opera-hk seems like the overall winner. As a  
starting point for the spec, I suggest taking the intersection of  
opera-hk, firefox-hk and chrome-hk.
]]
http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Apr/0082.html
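
As an illustration of what "taking the intersection" of several mappings could mean, here is a minimal sketch. It assumes each browser's table is available as a Map from a byte-sequence key to a code point; that representation, the function name intersectMappings, and the sample entries in the usage note are hypothetical and not taken from the cited analysis.

```javascript
// Keep only the entries that every table maps to the same code point.
function intersectMappings(tables) {
  const [first, ...rest] = tables;
  const result = new Map();
  if (!first) return result;
  for (const [bytes, codePoint] of first) {
    // An entry survives only if all remaining tables agree on it.
    if (rest.every((table) => table.get(bytes) === codePoint)) {
      result.set(bytes, codePoint);
    }
  }
  return result;
}
```

With made-up placeholder tables such as new Map([["916F", 0x9c02], ["AAAA", 0x1111]]) and new Map([["916F", 0x9c02], ["AAAA", 0x2222]]), the result keeps only the "916F" entry, because the tables disagree on "AAAA". Entries dropped this way would still need a separate decision about which mapping to prefer.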
Comment 6 Anne 2013-09-04 09:35:49 UTC
Masatoshi, it's not really clear to me what to do here. It seems the experiment in Gecko was done poorly and we should try again there. Do you agree?
Comment 7 Masatoshi Kimura 2013-09-04 12:47:47 UTC
I haven't yet verified that the Encoding Standard's big5 is able to replace Big5-HKSCS.
Anyway, Gecko will have to obtain agreement from the MozTW community.
Comment 8 Philip Jägenstedt 2013-09-04 13:40:08 UTC
(In reply to comment #0)
> Please see the Gecko bug: https://bugzilla.mozilla.org/show_bug.cgi?id=845743

Can someone explain the problem here? As far as I can tell, someone from Yahoo has data that needs to be interpreted as Big5-HKSCS and says that "if FF don't support HKSCS, we have no choice but to go to google chrome", so how is treating Big5 and Big5-HKSCS differently a fix for this?
Comment 9 Anne 2013-09-05 14:44:09 UTC
I'm marking this INVALID as the implementation was done all wrong. https://bugzilla.mozilla.org/show_bug.cgi?id=912470 is the bug on implementing this better.
Comment 10 Jungshik Shin 2014-09-24 21:19:55 UTC
Anne, could you tell me how you derived the current Big5 in the spec? Is it a merge of Big5-HKSCS (2008) [1] with Windows-950 [2]?


[1] http://www.ogcio.gov.hk/en/business/tech_promotion/ccli/terms/doc/New2003cmp_2008.txt

[2] http://msdn.microsoft.com/en-us/goglobal/cc305155
Comment 11 Anne 2014-09-24 21:27:42 UTC
Philip can probably do that better. He did most of the work.
Comment 12 Philip Jägenstedt 2014-09-25 22:07:26 UTC
https://bugzilla.mozilla.org/show_bug.cgi?id=912470#c48 summarizes how this came to be.

In order to separate them, we need another solution for all of the Big5-HKSCS content labeled as Big5, which was common in Hong Kong at the time I did the research.