The W3C Internationalization (I18n) Activity works with W3C working groups and liaises with other organizations to ensure Web technologies work for everyone, regardless of their language, script, or culture.
From this page you can find articles and other resources about Web internationalization, and information about the groups that make up the Activity.
Read also about opportunities to participate and fund work via the new Sponsorship Program.
What the W3C Internationalization Activity does
Selected quick links
W3C India Office to host International Conference
The W3C India Office is organizing an International Conference “World Wide Web: Technology, Standards and Internationalization – 2010” in New Delhi on May 6-7, 2010.
The conference will focus on promoting and proliferating W3C Standards in India to enable seamless Web access in Indian languages. One of the major topics covered in the conference will be Internationalization, especially in light of the complexity of implementing Indian languages.
Core Technology Tracks in the Conference include:
- W3C and Web Technologies
- Internationalization Aspects in W3C
- Web Access through mobile and hand-held devices
- CSS and Styling issues
- Web Architecture and Semantic Web
- Human Machine Interface for the Web
- Web Content Accessibility in Indian Languages
- W3C and E-Governance
The Conference will also attempt to evolve a Roadmap for proliferation and specific requirements for Indian Languages in W3C and associated standards.
See the W3C India Website.
The Unicode Consortium Releases CLDR, Version 1.8
The Unicode Consortium announced today the release of the new version of the Unicode Common Locale Data Repository (Unicode CLDR 1.8), providing key building blocks for software to support the world’s languages.
CLDR 1.8 contains data for 186 languages and 159 territories: 501 locales in all. Version 1.8 of the repository contains over 22% more locale data than the previous release, with over 42,000 new or modified data items from over 300 different contributors.
For this release, the Unicode Consortium partnered with ANLoc, the African Network for Localization, a project sponsored by Canada’s International Development Research Centre (IDRC), to help extend modern computing on the African continent. ANLoc’s vision is to empower Africans to participate in the digital age by enabling their languages in computers. A sub-project of ANLoc, called Afrigen, focuses on creating African locales.
For more information about Unicode CLDR 1.8, see the CLDR 1.8 Release Note.
New translations into Spanish
Thanks to the Spanish Translation Team, Spanish Translation US, the following articles have been translated into Spanish.
Códigos de idioma de dos o tres letras (Two-letter or three-letter language codes)
Cómo trabajar con mensajes compuestos (Working with Composite Messages)
Reutilización de cadenas en contenido de script (Re-using Strings in Scripted Content)
New First Public Working Draft: Additional Requirements for Bidi in HTML
This document arose out of the frustrations of people who have to work with bidirectional text on the Web in everyday practical situations. For example, it covers issues related to re-use of fragments of text in various new locations by web apps or scripts, or situations where users need to type in or send bidirectional form data. It proposes additions to the HTML5 specification for such situations, which are not covered by the current HTML specification. Many of the ideas in the document, however, are also relevant to markup formats in general, and there are some implications for CSS and XSL-FO (which we hope to address more directly in a subsequent document).
Please send comments on this document to public-i18n-bidi@w3.org. (Join the list and view the archive.) We hope to publish a new version of the document, incorporating feedback, in about a month (depending on feedback received). We will then ask the HTML WG to review that version.
Editor: Aharon Lanin, Google.
W3C Workshop on Conversational Applications
A call has gone out for a Workshop on Conversational Applications: Use Cases and Requirements for New Models of Human Language to Support Mobile Conversational Systems, to be held 18-19 June 2010 and hosted by Openstream, NJ, US.
Scope of the Workshop: submissions must describe (1) requirements and use cases for improving W3C standards for conversational interaction and (2) how the use cases justify one or more of these topics:
- Formal notations for representing grammar in: Syntax, Morphology, Phonology, Prosodics
- Engine standards for improvement in processing: Syntax, Morphology, Phonology, Lexicography
- Lexicography standards for: parts-of-speech, grammatical features and polysemy
- Formal semantic representation of human language including: verbal tense, aspect, valency, plurality, pronouns, adverbs, etc.
- Efficient data structures for binary representation and passing of: parse trees, alternate lexical/morphologic analysis, alternate phonologic analysis
- Other suggested areas or improvements for standards-based conversational systems development
Experts in the following technology areas would be welcome:
- computational linguistics
- speech prosody
- syntax
- internationalization
- mobile applications
- MMI/voice technology
For more information see http://www.w3.org/2010/02/convapps/cfp.html
New translations into Romanian
Thanks to Sorin Velescu, the following article has been translated into Romanian.
Sfaturi utile de internationalizare pentru Web (Internationalization Quick Tips for the Web)
Talk slides: FlaReNet Forum
On 11th February Richard Ishida gave a talk entitled Language Tagging using the New RFC 5646 at the FlaReNet Forum 2010 in Barcelona, Spain.
The talk proposed that interoperability will be served best by widening the adoption of the language tags specified by BCP 47, and to that end reviewed the various types of subtag described by the RFC 5646 syntax, and looked at some of the choices that need to be made when selecting subtags.
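To give a rough idea of the subtag types the talk refers to, here is a minimal sketch that splits a language tag into its main parts. The function name `split_language_tag` is hypothetical, and the logic is a simplification: it classifies subtags only by length and position, stops at the first extension or private-use singleton, ignores grandfathered tags, and does not check anything against the IANA subtag registry.

```python
def split_language_tag(tag):
    """Roughly classify the subtags of a BCP 47 language tag.

    Simplified sketch: no registry lookup, no grandfathered tags,
    and parsing stops at the first singleton (extension or private use).
    """
    subtags = tag.split("-")
    parts = {
        "language": subtags[0].lower(),
        "extlang": [],
        "script": None,
        "region": None,
        "variants": [],
    }
    for subtag in subtags[1:]:
        # Nothing after the language, extlang, script or region slot has been seen yet.
        at_start = (parts["script"] is None and parts["region"] is None
                    and not parts["variants"])
        before_region = parts["region"] is None and not parts["variants"]
        if len(subtag) == 1:
            break  # extension or private-use singleton: not handled here
        if len(subtag) == 3 and subtag.isalpha() and at_start:
            parts["extlang"].append(subtag.lower())
        elif len(subtag) == 4 and subtag.isalpha() and at_start:
            parts["script"] = subtag.title()
        elif before_region and (
            (len(subtag) == 2 and subtag.isalpha())
            or (len(subtag) == 3 and subtag.isdigit())
        ):
            parts["region"] = subtag.upper()
        else:
            parts["variants"].append(subtag.lower())
    return parts


if __name__ == "__main__":
    for example in ("zh-Hant-TW", "es-419", "sl-rozaj", "de-CH-1996"):
        print(example, split_language_tag(example))
```

Real applications should validate subtags against the registry rather than rely on shape alone; the sketch only illustrates that the position and length of a subtag indicate its type.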
New translation into French
Thanks to the French Translation Team, Trusted Translations Inc., the following article has been translated into French.
Création de pages SVG Tiny en arabe, hébreu et autres scripts lus de droite à gauche (Creating SVG Tiny Pages in Arabic, Hebrew and other Right-to-Left Scripts)
New translations into Spanish
Thanks to the Spanish Translation Team, Spanish Translation US, the following articles have been translated into Spanish.
Selección de una etiqueta de idioma (Choosing a Language Tag)
Problemas de visualización provocados por BOM en UTF-8 (Display problems caused by the UTF-8 BOM)
Caracteres y glifos faltantes (Missing characters and glyphs)
Article for review: Character encodings in HTML and CSS
Comments are being sought on this article prior to final release. Please send any comments to www-international@w3.org (subscribe). We expect to publish a final version in one to two weeks.
This is an update, in a temporary location, of the tutorial Character sets & encodings in XHTML, HTML and CSS. (Please be careful about bookmarking the location, since it is only temporary.)
A lot of new material was added, e.g. on the UTF-8 BOM and normalization, and the material was rearranged significantly. The rearrangement was to downplay slightly the XHTML 1.0 issues, given that that is now only relevant to IE6, but also to help readers more quickly find the information they need for the format they are dealing with.
The explicit distinction between XHTML 1.0 and XHTML 1.1 with regard to MIME types was removed, since the XHTML2 WG is hopefully very close to issuing a PER that enables XHTML 1.1 to be served as text/html.
The update adds information about HTML5.
Where a section corresponds to an article that has been updated, those updates were also migrated to this document.
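For readers unfamiliar with the UTF-8 BOM mentioned above, the following is a minimal sketch of how a script might detect and remove it from the start of a file's bytes. The helper name `strip_utf8_bom` is hypothetical and is not taken from the article; it only illustrates that the BOM is the three-byte sequence EF BB BF at the very beginning of the data.

```python
import codecs


def strip_utf8_bom(data):
    """Return (data_without_bom, had_bom) for a bytes object."""
    if data.startswith(codecs.BOM_UTF8):  # b"\xef\xbb\xbf"
        return data[len(codecs.BOM_UTF8):], True
    return data, False


if __name__ == "__main__":
    body, had_bom = strip_utf8_bom(b"\xef\xbb\xbf<!DOCTYPE html>")
    print(had_bom, body)
```

When decoding text, Python's built-in `utf-8-sig` codec gives the same result: it skips a leading BOM if one is present.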