W3C Internationalization (I18n) Activity: Making the World Wide Web truly world wide!

Tag(s): unicode

Posts

March 17, 2010

The Unicode Consortium Releases CLDR, Version 1.8

The Unicode Consortium announced today the release of the new version of the Unicode Common Locale Data Repository (Unicode CLDR 1.8), providing key building blocks for software to support the world’s languages.

CLDR 1.8 contains data for 186 languages and 159 territories: 501 locales in all. Version 1.8 of the repository contains over 22% more locale data than the previous release, with over 42,000 new or modified data items from over 300 different contributors.

For this release, the Unicode Consortium partnered with ANLoc, the African Network for Localization, a project sponsored by Canada’s International Development Research Centre (IDRC), to help extend modern computing on the African continent. ANLoc’s vision is to empower Africans to participate in the digital age by enabling their languages in computers. A sub-project of ANLoc, called Afrigen, focuses on creating African locales.

For more information about Unicode CLDR 1.8, see the CLDR 1.8 Release Note.

December 11, 2009

Unicode Releases Common Locale Data Repository, Version 1.7.2

The Unicode Consortium has just announced this new release of CLDR (Common Locale Data Repository), the largest and most extensive standard repository of locale data. This data is used for software internationalization and localization: adapting software to the conventions of different languages for such common software tasks as formatting of dates, times, time zones, numbers, and currency values; sorting text; choosing languages or countries by name; transliterating different alphabets; and many others. See more information about the Unicode CLDR project (including charts).
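As a rough illustration of how software might consume locale data for one of the tasks listed above (number formatting), here is a minimal Python sketch. The `NUMBER_SYMBOLS` table and the `format_number` helper are hypothetical stand-ins written for this example; they are not the CLDR data format or any CLDR API, and real CLDR data covers far more than two separators.

```python
# Hypothetical, hand-written stand-in for locale data. Real CLDR data is
# distributed as XML and covers far more than these two fields.
NUMBER_SYMBOLS = {
    "en": {"decimal": ".", "group": ","},
    "de": {"decimal": ",", "group": "."},
}


def format_number(value, locale, digits=2):
    """Format a number using the separators recorded for the given locale."""
    symbols = NUMBER_SYMBOLS[locale]
    # Format with Python's fixed ASCII conventions first ("1,234,567.89"),
    # then swap in the locale's own grouping and decimal symbols.
    integer_part, _, fraction = f"{value:,.{digits}f}".partition(".")
    integer_part = integer_part.replace(",", symbols["group"])
    if not fraction:
        return integer_part
    return integer_part + symbols["decimal"] + fraction


print(format_number(1234567.891, "en"))  # 1,234,567.89
print(format_number(1234567.891, "de"))  # 1.234.567,89
```

The point of the sketch is only that the formatting code stays the same while the per-locale data changes; a shared repository lets applications draw that data from one place instead of hard-coding it.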

October 23, 2009

Unicode Collation Algorithm Version 5.2 Released

Version 5.2 of the Unicode Collation Algorithm has been released. This version resynchronizes the Unicode Collation Algorithm with all of the updates for the Unicode Standard, Version 5.2.

The rest of this post is taken from the Unicode Consortium’s release notification and details changes and issues for implementations.

  • The text of UTS #10 has been updated. Among other changes, the revised text for UTS #10 makes it clear that the BASE for implicit generation of weights for Han characters does not include unassigned code points.
  • There are small changes in Gujarati, Telugu, Malayalam (including weighting for chillus), Tamil, and Sinhala. While these changes move in the direction of expected behavior, good results will only come from tailoring for particular languages, such as with CLDR.
  • There have been significant changes to the ordering of many combining marks. Many combining marks that are not in customary use in modern languages now have the same secondary weight, and will only be distinguished on a fourth level, by code point ordering. This can be seen by looking at the Unicode Collation Charts (http://unicode.org/charts/collation/). In 5.2, many characters now have a white background, indicating that they sort exactly the same as the previous character unless a fourth (code point) level is used.
  • Implementations of UCA should take note that the increased number of characters may cause overflows if the implementing code makes certain assumptions or optimizations. Overflows can result either from the new character additions (which increase the number of distinct weights in the table) or from changes in the way the weights, particularly secondary weight values, are assigned in the table. The latter change may result in unexpected numbers of characters having the same weight.
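
To make the point about combining marks concrete, the following Python sketch builds multi-level sort keys from a toy weight table. The weights in `TOY_WEIGHTS` are invented for illustration and are not taken from the real UCA table; the `sort_key` helper and its `use_code_point_level` flag are likewise hypothetical names. It shows two marks that share a secondary weight comparing as equal on three levels and being separated only when a fourth, code point level is appended.

```python
# Toy collation table: character -> (primary, secondary, tertiary) weights.
# These weights are invented for illustration; they are NOT real UCA data.
TOY_WEIGHTS = {
    "a": (0x0100, 0x0020, 0x0002),
    "b": (0x0101, 0x0020, 0x0002),
    "\u0301": (0x0000, 0x0032, 0x0002),  # COMBINING ACUTE ACCENT
    # Two less common marks given the same secondary weight in this toy table:
    "\u0346": (0x0000, 0x0040, 0x0002),  # COMBINING BRIDGE ABOVE
    "\u034A": (0x0000, 0x0040, 0x0002),  # COMBINING NOT TILDE ABOVE
}


def sort_key(text, use_code_point_level=False):
    """Build a sort key with one tuple of weights per comparison level."""
    elements = [TOY_WEIGHTS[ch] for ch in text]
    levels = []
    for level in range(3):
        # Zero weights are ignorable at that level and are left out.
        levels.append(tuple(e[level] for e in elements if e[level] != 0))
    if use_code_point_level:
        # Optional fourth level: break remaining ties by code point order.
        levels.append(tuple(ord(ch) for ch in text))
    return tuple(levels)


bridge = "a\u0346"
not_tilde = "a\u034A"

# Equal on the first three levels, distinguished only on the fourth.
print(sort_key(bridge) == sort_key(not_tilde))  # True
print(sort_key(bridge, True) < sort_key(not_tilde, True))  # True
```

Without the fourth level, the relative order of such strings depends on whatever the sorting routine does with equal keys, which is why implementations that need a deterministic order may want the code point level.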
October 7, 2009

Unicode 5.2.0 Released

On 1st October, Unicode 5.2 was released! The data files, code charts, and Unicode Standard Annexes for this version are final and are posted on the Unicode site.

For Unicode 5.2, the core specification is no longer just a delta document applied to the book; instead, the entire core specification, with all textual changes integrated, will be available on the Unicode site. As of this announcement, the first five chapters are available; the other chapters will follow soon.

For full details about what is new or changed in this release, see the version documentation for Unicode 5.2.


Questions or comments? ishida@w3.org