<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>16697</bug_id>
          
          <creation_ts>2012-04-11 07:39:31 +0000</creation_ts>
          <short_desc>Indexes: additional gbk mappings</short_desc>
          <delta_ts>2013-12-16 16:09:53 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>Encoding</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows 3.1</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>DUPLICATE</resolution>
          <dup_id>16862</dup_id>
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Anne">annevk</reporter>
          <assigned_to name="Anne">annevk</assigned_to>
          <cc>mike</cc>
    
    <cc>philipj</cc>
    
    <cc>pub-w3</cc>
          
          <qa_contact>sideshowbarker+encodingspec</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>66607</commentid>
    <comment_count>0</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2012-04-11 07:39:31 +0000</bug_when>
    <thetext>Comparing your GBK index to my tables reveals a few differences:
 
&gt; 6 57 coq only FE10
&gt; 6 58 coq only FE12
&gt; 6 59 coq only FE11
&gt; 6 60 coq only FE13
&gt; 6 61 coq only FE14
&gt; 6 62 coq only FE15
&gt; 6 63 coq only FE16
&gt; 6 76 coq only FE17
&gt; 6 77 coq only FE18
&gt; 6 83 coq only FE19
 
These are vertical variants of punctuation marks in GBK/1 (the GBK additions to GB2312, Row 6) that are missing from the index.  They were apparently absent from the original GBK standard.
 
&gt; 8 28 coq only 1E3F
 
Another GBK/1 addition (ḿ).
 
&gt; 203 96 annevk only 3000
 
An additional codepoint for the ideographic space, which is missing from my tables.  This looks a bit random, but (at least some) browsers do this, so I guess it is needed.  More information would be nice.
 
&gt; 294 18 coq only 20087
&gt; 294 19 coq only 20089
&gt; 294 20 coq only 200CC
&gt; 294 26 coq only 9FB4
&gt; 294 34 coq only 9FB5
&gt; 294 39 coq only 9FB6
&gt; 294 40 coq only 9FB7
&gt; 294 45 coq only 215D7
&gt; 294 46 coq only 9FB8
&gt; 294 55 coq only 2298F
&gt; 294 63 coq only 9FB9
&gt; 294 80 coq only 9FBA
&gt; 294 81 coq only 241FE
&gt; 294 96 coq only 9FBB
 
This is the entire Unihan G9 repertoire, 14 of the 101 non-Unicode 1.0 hanzi included at the end of GBK/4.
 
Apart from the ideographic space, the codepoints listed above all mapped to PUA or FFFD in browsers when I last checked (cf. &lt;http://coq.no/character-tables/chinese-simplified/en&gt; under GBK), but they render as expected in IE and should probably be added to the index.
 
Øistein</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>67050</commentid>
    <comment_count>1</comment_count>
    <who name="">pub-w3</who>
    <bug_when>2012-04-25 19:41:26 +0000</bug_when>
    <thetext>More useful list of the missing characters including GBK/GB18030 encoding, old PUA mapping and new non-PUA mapping:

A6 D9  U+E78D  U+FE10  ︐
A6 DA  U+E78E  U+FE12  ︒
A6 DB  U+E78F  U+FE11  ︑
A6 DC  U+E790  U+FE13  ︓
A6 DD  U+E791  U+FE14  ︔
A6 DE  U+E792  U+FE15  ︕
A6 DF  U+E793  U+FE16  ︖
A6 EC  U+E794  U+FE17  ︗
A6 ED  U+E795  U+FE18  ︘
A6 F3  U+E796  U+FE19  ︙

A8 BC  U+E7C7  U+1E3F  ḿ
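
For anyone cross-checking the byte pairs above against the "row cell" pairs in comment 0, here is a minimal sketch. It assumes the rows 6 and 8 entries use GB2312-style numbering (row = lead byte minus A0, cell = trail byte minus A0), which is inferred from the two lists lining up and is not stated anywhere in this bug. The pointer formula is the one the Encoding Standard's gb18030 decoder uses for two-byte sequences. The function names are hypothetical.

```python
def gbk_row_cell(lead, trail):
    """GB2312-style row/cell; only meaningful when both bytes are in A1..FE."""
    return lead - 0xA0, trail - 0xA0


def gbk_pointer(lead, trail):
    """Index pointer for a two-byte sequence, per the Encoding Standard formula."""
    # Trail bytes 40..7E use offset 0x40; trail bytes 80..FE use offset 0x41.
    offset = 0x40 if trail in range(0x40, 0x7F) else 0x41
    return (lead - 0x81) * 190 + (trail - offset)


# (lead, trail, proposed code point) copied from the list in this comment
samples = [(0xA6, 0xD9, 0xFE10), (0xA6, 0xF3, 0xFE19), (0xA8, 0xBC, 0x1E3F)]
for lead, trail, cp in samples:
    row, cell = gbk_row_cell(lead, trail)
    pointer = gbk_pointer(lead, trail)
    print(f"{lead:02X} {trail:02X}  row {row} cell {cell}  pointer {pointer}  U+{cp:04X}")
```

With this reading, A6 D9 comes out as row 6 cell 57 and A8 BC as row 8 cell 28, matching the first and eleventh entries in comment 0.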

FE 51  U+E816  U+20087  </thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>67052</commentid>
    <comment_count>2</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2012-04-25 19:47:15 +0000</bug_when>
    <thetext>What about the code points listed after U+20087 in comment 0?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>67053</commentid>
    <comment_count>3</comment_count>
    <who name="">pub-w3</who>
    <bug_when>2012-04-25 19:49:31 +0000</bug_when>
    <thetext>FE 51  U+E816  U+20087  </thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>67054</commentid>
    <comment_count>4</comment_count>
    <who name="">pub-w3</who>
    <bug_when>2012-04-25 19:51:26 +0000</bug_when>
    <thetext>W3C cannot handle astral characters, apparently.  :-(

FE 51  U+E816  U+20087  [astral]
FE 52  U+E817  U+20089  [astral]
FE 53  U+E818  U+200CC  [astral]
FE 59  U+E81E  U+9FB4  龴
FE 61  U+E826  U+9FB5  龵
FE 66  U+E82B  U+9FB6  龶
FE 67  U+E82C  U+9FB7  龷
FE 6C  U+E831  U+215D7  [astral]
FE 6D  U+E832  U+9FB8  龸
FE 76  U+E83B  U+2298F  [astral]
FE 7E  U+E843  U+9FB9  龹
FE 90  U+E854  U+9FBA  龺
FE 91  U+E855  U+241FE  [astral]
FE A0  U+E864  U+9FBB  龻</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97596</commentid>
    <comment_count>5</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2013-12-13 15:22:25 +0000</bug_when>
    <thetext>What about U+E7C9 from bug 21145? But yeah, I need to fix this mess.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97668</commentid>
    <comment_count>6</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2013-12-16 16:09:53 +0000</bug_when>
    <thetext>

*** This bug has been marked as a duplicate of bug 16862 ***</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>