Home page
The W3C Internationalization (I18n) Activity works with W3C working groups and liaises with other organizations to make it possible to use Web technologies with different languages, scripts, and cultures. From this page you can find articles and other resources about Web internationalization, and information about the groups that make up the Activity.
Recent highlights
8 December 09
New article: Choosing a Language Tag
25 November 09
Internet Governance Forum Poster
23 October 09
Unicode Collation Algorithm Version 5.2 Released
9 October 09
Article for wide review: Choosing a language tag
9 October 09
Updated article: Language tags in HTML and XML
7 October 09
Unicode 5.2.0 Released
New translations into Romanian
Thanks to Sorin Velescu, the following articles have been translated into Romanian.
Schimbarea codificarii paginii (X)HTML in UTF-8 (Changing (X)HTML page encoding to UTF-8)
CSS3 si textul international (CSS3 and International Text)
[search keys: qa-changing-encoding article-css3-text]
New translations into Spanish
Thanks to the Spanish Translation Team, Spanish Translation US, the following articles have been translated into Spanish.
Verificación de encabezados HTTP (Checking HTTP Headers)
Configuración de información charset en .htaccess (Setting charset information in .htaccess)
Verificación de la codificación de caracteres mediante el verificador (Checking the character encoding using the validator)
[search keys: qa-headers-charset qa-htaccess-charset qa-validator-charset-check]
Unicode Releases Common Locale Data Repository, Version 1.7.2
New article: Choosing a Language Tag
FAQ-based article: Which language tag is right for me? How do I choose language and other subtags?
Following the publication of RFC 5646 earlier this year (replacing RFC 4646 as part of BCP 47), the IANA Subtag Registry now contains almost 8,000 subtags, and the set of subtag types has grown with the introduction of extended language subtags. This article tries to simplify the choice of an appropriate language tag for your needs by outlining the necessary decisions step by step.
By Richard Ishida, W3C. [search key: qa-choosing-language-tags]
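As a rough illustration of the subtag types mentioned above, the following Python sketch splits a tag into language, extended language, script, region, and variant subtags. It is a simplified reading of the RFC 5646 syntax, not the article's own procedure: the names LANGTAG and split_language_tag are hypothetical, extensions, private-use subtags, and grandfathered tags are ignored, and nothing is checked against the IANA Subtag Registry.

```python
import re

# Simplified pattern for the common tag shape in RFC 5646:
#   language[-extlang][-script][-region][-variant...]
# Assumption: extensions, private-use subtags, and grandfathered tags
# are out of scope, and subtags are not validated against the registry.
LANGTAG = re.compile(
    r"""
    ^(?P<language>[a-z]{2,3})
    (?:-(?P<extlang>[a-z]{3}))?
    (?:-(?P<script>[a-z]{4}))?
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$
    """,
    re.IGNORECASE | re.VERBOSE,
)


def split_language_tag(tag):
    """Return a dict of subtags in conventional casing, or None if no match."""
    match = LANGTAG.match(tag)
    if match is None:
        return None
    extlang = match.group("extlang")
    script = match.group("script")
    region = match.group("region")
    # The variants group always participates; it is "" when no variant is present.
    variants = [v.lower() for v in match.group("variants").split("-") if v]
    return {
        "language": match.group("language").lower(),
        "extlang": extlang.lower() if extlang else None,
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        "variants": variants,
    }


if __name__ == "__main__":
    for example in ("zh-hans-cn", "sl-rozaj", "es-419", "english"):
        print(example, split_language_tag(example))
```

A tag that fits the pattern is only well-formed in this simplified sense; whether each subtag is actually registered still has to be checked in the registry.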
Charlint updated
Internet Governance Forum Poster
The fourth annual IGF Meeting was held in Sharm El Sheikh, Egypt on 15-18 November 2009. The W3C Internationalization Activity had a poster [PDF] at the event.
Talk slides: Standards-based Translations with W3C ITS and OASIS XLIFF
On November 5th, Christian Lieske and Felix Sasaki gave a talk entitled Standards-based Translations with W3C ITS and OASIS XLIFF at TCWorld, Wiesbaden, Germany.
The slides are in PDF. The presentation describes ITS and XLIFF, two standards that are important for proper internationalization and localization of XML. Topics include a discussion of the general benefits of standards-based internationalization and localization, an introduction to both standards and how they help to achieve such benefits, and an explanation of the relation between the two. A highlight was the introduction of a tool for round-tripping from an XML document with ITS information to XLIFF, and the integration of translated material from XLIFF back into the original XML. [search keys: talk-2009 talk-sasaki talk-lieske]
Updated article: Styling using language attributes
The major change was the addition of detailed information about the use of CSS selectors with xml:lang, but there were many other edits (see the list below). Translators should consider retranslating the whole tutorial. [search keys: qa-css-lang]
Unicode Collation Algorithm Version 5.2 Released
Version 5.2 of the Unicode Collation Algorithm has been released. This version resynchronizes the Unicode Collation Algorithm with all of the updates for the Unicode Standard, Version 5.2.
The rest of this post is taken from the Unicode Consortium's release notification and details changes and issues for implementations.
- The text of UTS #10 has been updated. Among other changes, the revised text for UTS #10 makes it clear that the BASE for implicit generation of weights for Han characters does not include unassigned code points.
- There are small changes in Gujarati, Telugu, Malayalam (including weighting for chillus), Tamil, and Sinhala. While these changes move in the direction of expected behavior, good results will only come from tailoring for particular languages, such as with CLDR.
- There have been significant changes to the ordering of many combining marks. Many combining marks that are not in customary use in modern languages now have the same secondary weight, and will only be distinguished on a fourth level, by code point ordering. This can be seen by looking at the Unicode Collation Charts (http://unicode.org/charts/collation/). In 5.2, many characters now have a white background, indicating that they sort exactly the same as the previous character unless a fourth (code point) level is used.
- Implementations of UCA should take note that the increased number of characters may cause overflows if the implementing code makes certain assumptions or optimizations. This can result either from the new character additions (which increase the number of distinct weights in the table) or because of changes in the way the weights, particularly for secondary weight values, are assigned in the table. The latter change may result in unexpected numbers of characters having the same weight.
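The effect of shared secondary weights and a fourth, code point level can be sketched in Python as follows. This is a toy model, not an implementation of UTS #10: the WEIGHTS table and the sort_key function are hypothetical, the weight values are invented rather than taken from the real collation table, and normalization, contractions, and variable weighting are all left out.

```python
# Invented (primary, secondary, tertiary) weights for illustration only.
# Assumption: two rarely used combining marks share one secondary weight,
# mirroring the kind of change described in the list above.
WEIGHTS = {
    "a": (10, 1, 1),
    "b": (11, 1, 1),
    "\u1dc4": (0, 5, 1),  # combining mark, primary weight ignorable (0)
    "\u1dc5": (0, 5, 1),  # same invented secondary weight as U+1DC4
}


def sort_key(text, use_code_point_level=False):
    """Build a level-by-level sort key: all primaries, then secondaries, then tertiaries."""
    elements = [WEIGHTS[ch] for ch in text]
    levels = []
    for level in range(3):
        # Zero weights are ignorable at that level and are skipped.
        levels.append(tuple(e[level] for e in elements if e[level] != 0))
    if use_code_point_level:
        # Optional fourth level: break remaining ties by code point order.
        levels.append(tuple(ord(ch) for ch in text))
    return tuple(levels)


if __name__ == "__main__":
    first, second = "a\u1dc4", "a\u1dc5"
    print(sort_key(first) == sort_key(second))
    print(sort_key(first, True) < sort_key(second, True))
```

With three levels the two strings get identical keys, so their relative order is left to the sort; adding the code point level makes the order deterministic.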
Article for wide review: Choosing a language tag
Comments are being sought on this article prior to final release. Please send any comments to www-international@w3.org. We expect to publish a final version in one to two weeks. [search keys: qa-choosing-language-tags]
Questions or comments? ishida@w3.org
Copyright © 1997-2009 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.