{"id":1078,"date":"2009-10-23T17:52:22","date_gmt":"2009-10-23T17:52:22","guid":{"rendered":"http:\/\/www.w3.org\/blog\/Internationaltmp\/2009\/10\/23\/unicode_collation_algorithm_version_5_2_\/"},"modified":"2011-09-28T07:48:46","modified_gmt":"2011-09-28T07:48:46","slug":"unicode_collation_algorithm_version_5_2_","status":"publish","type":"post","link":"https:\/\/www.w3.org\/blog\/International\/2009\/10\/23\/unicode_collation_algorithm_version_5_2_\/","title":{"rendered":"Unicode Collation Algorithm Version 5.2 Released"},"content":{"rendered":"<p>Version 5.2 of the <a href=\"http:\/\/www.unicode.org\/reports\/tr10\/\">Unicode Collation Algorithm<\/a> has been released. This version resynchronizes the Unicode Collation Algorithm with all of the updates for the Unicode Standard, Version 5.2.<\/p>\n<p>The rest of this post is taken from the Unicode Consortium&#8217;s release notification and details changes and issues for implementations.<\/p>\n<ul>\n<li>The text of UTS #10 has been updated. Among other changes, the revised text for UTS #10 makes it clear that the BASE for implicit generation of weights for Han characters does not include unassigned code points.<\/li>\n<li>There are small changes in Gujarati, Telugu, Malayalam (including weighting for chillus), Tamil, and Sinhala. While these changes move in the direction of expected behavior, good results will only come from tailoring for particular languages, such as with CLDR.<\/li>\n<li>There have been significant changes to the ordering of many combining marks. Many combining marks that are not in customary use in modern languages now have the same secondary weight, and will only be distinguished on a fourth level, by code point ordering. This can be seen by looking at the Unicode Collation Charts (http:\/\/unicode.org\/charts\/collation\/). 
In 5.2, many characters now have a white background, indicating that they sort exactly the same as the previous character, unless a fourth (code point) level is used.<\/li>\n<li>Implementations of UCA should take note that the increased number of characters may cause overflows if the implementing code makes certain assumptions or optimizations. This can result either from the new character additions (which increase the number of distinct weights in the table) or from changes in the way the weights, particularly secondary weights, are assigned in the table. The latter change may result in unexpected numbers of characters having the same weight.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Version 5.2 of the Unicode Collation Algorithm has been released. This version resynchronizes the Unicode Collation Algorithm with all of the updates for the Unicode Standard, Version 5.2. The rest of this post is taken from the Unicode Consortium&#8217;s release &hellip; <a href=\"https:\/\/www.w3.org\/blog\/International\/2009\/10\/23\/unicode_collation_algorithm_version_5_2_\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":79,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,13,5,19],"tags":[78],"class_list":["post-1078","post","type-post","status-publish","format-standard","hentry","category-highlight","category-miscellaneous","category-w3cwebdesign","category-w3cwebuseragents","tag-unicode"],"_links":{"self":[{"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/posts\/1078","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/users\/79"}],"replies":[{"embeddable":true,"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/comments?post=1078"}],"version-history":[{"count":1,"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/posts\/1078\/revisions"}],"predecessor-version":[{"id":1503,"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/posts\/1078\/revisions\/1503"}],"wp:attachment":[{"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/media?parent=1078"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/categories?post=1078"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.w3.org\/blog\/International\/wp-json\/wp\/v2\/tags?post=1078"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}