Re: [i18N-ISSUE-255] CJK Compatibility ideographs

On Tue, Apr 29, 2014 at 9:49 PM, Asmus Freytag <asmusf@ix.netcom.com> wrote:

>  On 4/29/2014 5:48 PM, hyunyoung kim wrote:
>
> Hello
>
>  You are right. Korean text frequently uses both CJK Compatibility
> Ideographs (U+F900 and up) and CJK ideographs (the block at U+4E00 and others).
>
>  So I added "CJK ideographs" in 3 places. Please check the following:
>
>
> Since when are CJK ideographs "punctuation" ? Or am I misreading something?
>

I'm also puzzled by this.

BTW, I believe it'd be better to start from the Unicode character
properties, categories, UAX 14 and UAX 29, and to tailor UAX 14/29 only when
necessary to meet Korean requirements.
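
As a rough illustration of what starting from the Unicode properties could
look like, here is a minimal sketch using Python's standard unicodedata
module, which exposes General_Category. The sample characters and the helper
name is_punctuation are my own choices for illustration, not part of the
draft. The standard module does not expose UAX 14 line-break classes or
UAX 29 break properties, so this only covers the general category.

```python
import unicodedata

# Sample code points picked from the ranges listed in the quoted draft
# (illustrative choices, not an exhaustive survey of each block).
SAMPLES = {
    "CJK ideograph": "\u4e00",
    "CJK compatibility ideograph": "\uf900",
    "Hangul syllable": "\uac00",
    "ideographic comma": "\u3001",
    "vertical presentation form": "\ufe30",
}


def is_punctuation(ch: str) -> bool:
    """Return True when the General_Category of ch starts with 'P'."""
    return unicodedata.category(ch).startswith("P")


def describe(ch: str) -> str:
    """Format the code point, its general category, and the punctuation flag."""
    gc = unicodedata.category(ch)
    return f"U+{ord(ch):04X} gc={gc} punctuation={is_punctuation(ch)}"


for label, ch in SAMPLES.items():
    print(f"{label}: {describe(ch)}")
```

A check like this makes it easy to see which entries in a proposed
"punctuation" list are actually letters (category Lo) as far as Unicode is
concerned.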

Jungshik




>
> A./
>
>   *2.1.2 Hangul Punctuation Mark Code Ranges based on Unicode*
>
> The following punctuation marks are used in a Hangul environment. (Refer to
> Appendix A for the code table.)
>
>    - Basic Latin (U+0020~U+007F): Latin alphabet and numerals
>    - General Punctuation (U+2010~)
>    - Superscripts and Subscripts (U+2070~)
>    - Currency Symbols (U+20A0~)
>    - Letterlike Symbols (U+2100~)
>    - Number Forms (U+2150~)
>    - Arrows (U+2190~)
>    - Mathematical Operators (U+2200~)
>    - Enclosed Alphanumerics (U+2460~)
>    - Box Drawing (U+2500~)
>    - Block Elements (U+2580~)
>    - Geometric Shapes (U+25A0~)
>    - Miscellaneous Symbols (U+2600~)
>    - Dingbats (U+2700~)
>    - CJK Symbols and Punctuation (U+3000~)
>    - Enclosed CJK Letters and Months (U+3200~)
>    - CJK Ideographs (U+4E00~)
>    - CJK Compatibility Ideographs (U+F900~)
>    - CJK Compatibility Symbols and Punctuation for Vertical Writing
>    (U+FE30~U+FE48)
>
>
>   *3.1.2 Examples for Grouping by Typographic Characteristic of
> Characters and Symbols*
>
> In a Hangul environment, characters and symbols are classified into 32
> classes by typographic characteristics.
>
> *cl20. Hanja (CJK Ideographic Characters) *
>
> (U+F900~)
>
> *cl21. Proportional Width Latin Alphabet *
>
> (U+0041~U+005A, U+0061~U+007A)
>
> *cl22. Full-Width Unit Symbols *
>
> ㎥; mainly used for vertical writing
>
> *cl23. Latin Alphabet Space *
>
> (U+0020)
>
> *cl24. Latin Alphabet Characters *
>
> (U+002C~… ~U+261E)
>
> *cl25. Proportional-Width Numerals *
>
> (U+0030~U+0039)
>
> *cl26. Fixed Width Half-width Numerals *
>
> (U+0030~U+0039)
>
> *cl27. Fixed Width Full-width Numerals *
>
> (U+0020~U+007F)
>
> *cl28. Fixed Width Full-width Numerals *
>
> (U+FF21~U+FF5A)
>
> *cl29. CJK Ideographs*
>
> (U+4E00~U+9FFF)
>
>
>  *3.2.1 Character Types in Hangul Writing*
>
>
>    1. Hangul Compatibility Jamo (U+3130~)
>    2. Enclosed CJK Characters and Numerals (U+3200~)
>    3. CJK Ideographs (U+4E00~)
>    4. Hangul Jamo Extended-A (U+A960~)
>    5. Hangul Precomposed Syllables (U+AC00~U+D7A3)
>    6. Hangul Jamo Extended-B (U+D7B0~)
>    7. CJK Compatibility Ideographs (U+F900~)
>
>
>  Regards
> HyunYoung Kim
>
>
>

Received on Wednesday, 30 April 2014 05:49:34 UTC