Internationalization (i18n) Activity

Making the World Wide Web worldwide!


Category: New resource


New version of the Internationalization Checker released

The ‘i18n checker’ is a free service from W3C that reports on internationalization-related aspects of your HTML page and advises how to improve your markup, where needed, to support the multilingual Web.

This latest release uses a new user interface and redesigned source code. It also adds a number of new tests, a file upload facility, and support for HTML5.

This is still a ‘pre-final’ release and development continues. There are already plans to add further tests, to translate the user interface, to add support for XHTML5 and polyglot documents, and to integrate with the W3C Unicorn checker, among other features. At this stage we are particularly interested in receiving user feedback.

Try the checker and let us know if you find any bugs or have any suggestions.


Updated Working Group Note: Working with Time Zones

An updated version of Working with Time Zones has just been published as a Working Group Note.

Date and time values can be complex and the relationship between computer and human timekeeping systems can lead to problems. The working group has updated this version to contain more comprehensive guidelines and best practices for working with time and time zones in applications and document formats. Use cases are provided to help choose an approach that ensures that geographically distributed applications work well. This document also aims to provide a basic understanding and vocabulary for talking about time and time handling in software.
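As a rough illustration of the kind of problem the Note addresses, the sketch below converts a local wall-clock time to UTC for storage and then displays the same instant at a different offset. It is only a sketch under stated assumptions: the two fixed offsets and the names `PLUS_NINE`, `MINUS_FIVE` and `to_utc` are made up for this example, and real time zones have daylight-saving rules that need a time zone database (for example Python's `zoneinfo` module).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets for illustration only.
# Real time zones change offset over the year (daylight saving time).
PLUS_NINE = timezone(timedelta(hours=9), "UTC+09:00")
MINUS_FIVE = timezone(timedelta(hours=-5), "UTC-05:00")


def to_utc(local: datetime) -> datetime:
    """Convert an offset-aware datetime to UTC for storage or comparison."""
    if local.tzinfo is None:
        # A naive value carries no offset, so the instant is ambiguous.
        raise ValueError("naive datetime: the UTC offset is unknown")
    return local.astimezone(timezone.utc)


# A meeting entered as 09:00 local time at UTC+09:00.
meeting_local = datetime(2011, 7, 5, 9, 0, tzinfo=PLUS_NINE)

# Store the instant in UTC, then render it for a reader at UTC-05:00.
stored = to_utc(meeting_local)
shown_elsewhere = stored.astimezone(MINUS_FIVE)

print(stored.isoformat())           # 2011-07-05T00:00:00+00:00
print(shown_elsewhere.isoformat())  # 2011-07-04T19:00:00-05:00
```

The two printed values name the same instant; only the wall-clock representation differs, which is why the calendar date changes for the second reader.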

Editor: Addison Phillips, Lab126.


6 new articles about character encodings and HTML/CSS

Some articles are brand new and others were originally part of a tutorial, but have been updated and amplified to bring HTML5 to the fore and incorporate feedback from various readers. The articles are:

  1. Character encodings: Essential concepts
  2. Choosing & applying a character encoding
  3. Declaring character encodings in HTML
  4. The byte-order mark (BOM) in HTML
  5. Normalization in HTML and CSS
  6. Characters or markup?

Together these articles, with several other existing articles that were updated at the same time, provide practical advice to content authors on how to handle character encodings in HTML and CSS.
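Two of the topics in the list above, the byte-order mark and normalization, can be illustrated with a short sketch. The helper names `decode_utf8_strip_bom` and `same_text` are hypothetical and are not taken from the articles; the choice of NFC is an assumption made for the example.

```python
import codecs
import unicodedata


def decode_utf8_strip_bom(data: bytes) -> str:
    """Decode UTF-8 bytes, dropping a leading byte-order mark if present."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A UTF-8 file that starts with a BOM and contains a precomposed e-acute.
raw = codecs.BOM_UTF8 + "caf\u00e9".encode("utf-8")
text = decode_utf8_strip_bom(raw)

# The same word written with 'e' followed by a combining acute accent.
decomposed = "cafe\u0301"

print(text == decomposed)           # False: different code point sequences
print(same_text(text, decomposed))  # True: identical after normalization
```

Python's built-in `utf-8-sig` codec also removes a leading BOM while decoding; the explicit check is shown here only to make the step visible.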

New article: Using b and i elements

Answers the question: Should I use b and i elements?

The HTML5 specification redefines the b and i elements to have some semantic function, rather than being purely presentational. However, the simple fact that the tag names are ‘b’ for bold and ‘i’ for italic means that people are likely to continue using them as a quick presentational fix.

This article explains why that can be problematic for localization (and indeed for restyling of pages in a single language), and echoes the advice in the specification intended to address those issues.

By Richard Ishida, W3C.

Prototype Internationalization Checker available

This checker performs various tests on a web page to determine its level of internationalization-friendliness. It also lists key internationalization settings related to character encoding, language declarations, text direction and class/id names. This information includes HTTP headers, which can be particularly useful for troubleshooting problems.
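To give a feel for the kind of in-document settings described above, here is a minimal sketch that reads the language, direction and character encoding declarations from an HTML string. It is not the checker's implementation: the class `I18nSettings` and the function `inspect_page` are hypothetical names, and the sketch ignores HTTP headers and class/id names entirely.

```python
from html.parser import HTMLParser


class I18nSettings(HTMLParser):
    """Collect lang, dir and charset declarations from an HTML document."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.direction = None
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html":
            self.lang = attrs.get("lang")
            self.direction = attrs.get("dir")
        elif tag == "meta" and self.charset is None:
            if "charset" in attrs:
                # HTML5 form: <meta charset="utf-8">
                self.charset = attrs["charset"]
            elif (attrs.get("http-equiv") or "").lower() == "content-type":
                # Older form: <meta http-equiv="Content-Type" content="...">
                content = (attrs.get("content") or "").lower()
                _, _, value = content.partition("charset=")
                self.charset = value.strip() or None


def inspect_page(html_text: str) -> dict:
    """Return the declarations found in html_text (None when absent)."""
    parser = I18nSettings()
    parser.feed(html_text)
    parser.close()
    return {"lang": parser.lang, "dir": parser.direction,
            "charset": parser.charset}


sample = ('<!DOCTYPE html><html lang="ar" dir="rtl"><head>'
          '<meta charset="utf-8"><title>t</title></head><body></body></html>')
print(inspect_page(sample))
```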

The checker is still only a prototype, so there are guaranteed to be bugs and missing features. It will slowly improve over the coming months, but it has been made available for use now since it is likely to be helpful to many people already.

Use the checker

New article: Choosing a Language Tag

Read the article

FAQ-based article: Which language tag is right for me? How do I choose language and other subtags?

Following the publication of RFC 5646 earlier this year (replacing RFC 4646 as part of BCP 47), the IANA Subtag Registry now contains almost 8,000 subtags, and the set of subtag types has grown with the introduction of extended language subtags. This article tries to simplify the choice of an appropriate language tag for your needs by outlining the necessary decisions in a step-wise fashion.
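The subtag types mentioned above appear in a fixed order inside a tag, which the following sketch makes visible by splitting a tag into its parts. It is a simplified illustration, not a validator: the function name `split_language_tag` is hypothetical, nothing is checked against the IANA registry, and extensions, private-use subtags and grandfathered tags are deliberately rejected.

```python
import re


def split_language_tag(tag: str) -> dict:
    """Split a hyphen-separated language tag into typed subtags.

    Checks only the shape (length and character class) of each subtag.
    """
    subtags = tag.split("-")
    result = {"language": None, "extlang": [], "script": None,
              "region": None, "variants": []}

    # Primary language: two or three letters.
    if not re.fullmatch(r"[A-Za-z]{2,3}", subtags[0]):
        raise ValueError(f"unsupported primary language subtag: {subtags[0]!r}")
    result["language"] = subtags[0].lower()
    rest = subtags[1:]

    # Extended language subtags: up to three, each three letters.
    while rest and len(result["extlang"]) < 3 and re.fullmatch(r"[A-Za-z]{3}", rest[0]):
        result["extlang"].append(rest.pop(0).lower())

    # Script: four letters, conventionally title case.
    if rest and re.fullmatch(r"[A-Za-z]{4}", rest[0]):
        result["script"] = rest.pop(0).title()

    # Region: two letters or three digits, conventionally upper case.
    if rest and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", rest[0]):
        result["region"] = rest.pop(0).upper()

    # Variants: 5 to 8 alphanumerics, or a digit plus three alphanumerics.
    while rest and re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", rest[0]):
        result["variants"].append(rest.pop(0).lower())

    if rest:
        raise ValueError(f"subtags not handled by this sketch: {rest}")
    return result


print(split_language_tag("zh-Hant-TW"))
print(split_language_tag("zh-yue-HK"))
print(split_language_tag("de-1996"))
```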

By Richard Ishida, W3C.

New Working Group Note: Requirements for String Identity Matching and String Indexing

On 15th September, the Internationalization Core Working Group published Requirements for String Identity Matching and String Indexing as a Working Group Note.

This document is being published as a Working Group Note in order to capture and preserve historical information. It contains requirements elaborated in 1998 for aspects of the character model for W3C specifications. It was developed and extensively reviewed by the Internationalization Working Group, but never progressed beyond Working Draft status. For this publication, the wording of the 1998 version remains unchanged (except for correction of a small number of typographic errors), but the links to references have been updated.

The document describes requirements for some important aspects of the character model for W3C specifications. The two aspects discussed are string identity matching and string indexing.
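Why string indexing needs explicit requirements can be seen by counting the same string in different units. The sketch below is an illustration only; the function name `index_units` is hypothetical and the choice of units compared is an assumption of this example, not a summary of the Note.

```python
def index_units(text: str) -> dict:
    """Count the length of text in three commonly used indexing units."""
    return {
        # Python strings are indexed by Unicode code point.
        "code_points": len(text),
        # Each UTF-16 code unit is two bytes; supplementary characters take two units.
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # UTF-8 uses one to four bytes per code point.
        "utf8_bytes": len(text.encode("utf-8")),
    }


# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF, a supplementary character.
sample = "a\U0001D11E"
print(index_units(sample))
# {'code_points': 2, 'utf16_code_units': 3, 'utf8_bytes': 5}
```

An index of 2 therefore points past the end of the string, into the middle of the clef, or into the middle of its byte sequence, depending on which unit the specification has in mind.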

Editor: Martin Dürst.

Categories: Articles, New resource

New Working Group Note: Authoring HTML: Handling Right-to-left Scripts

The Internationalization Core Working Group has published Authoring HTML: Handling Right-to-left Scripts as a Working Group Note.

This document describes techniques for the use of HTML markup and CSS style sheets when creating content in languages that use right-to-left scripts, such as Arabic, Hebrew, Persian, Thaana, Urdu, etc. It builds on (but also goes beyond) markup needed to supplement the Unicode bidirectional algorithm, and also touches on how to prepare content that will later be localized into right-to-left scripts.
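When preparing content that may later hold right-to-left text, it can help to know the base direction of a string before deciding how to mark it up. The sketch below guesses the direction from the first strongly directional character. This heuristic and the function name `first_strong_direction` are assumptions of this example, not techniques taken from the Note.

```python
import unicodedata


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Guess base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            # Right-to-left letter (e.g. Hebrew) or Arabic letter.
            return "rtl"
        if bidi_class == "L":
            # Left-to-right letter (e.g. Latin).
            return "ltr"
    # Only digits, punctuation or spaces: fall back to the default.
    return default


print(first_strong_direction("\u05e9\u05dc\u05d5\u05dd"))  # Hebrew word -> rtl
print(first_strong_direction("hello"))                     # ltr
print(first_strong_direction("123"))                       # no strong char -> ltr
```

The result could be used, for instance, to choose the value of a `dir` attribute when generating markup for user-supplied text.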

Editor: Richard Ishida.

New language tag specification, RFC 5646, published

The IETF has published RFC 5646, an update of Tags for Identifying Languages. This specification obsoletes former RFCs 4646, 3066 and 1766.

RFC 5646 makes it possible to use over 7,000 three-letter ISO 639-3 language codes, in addition to the two-letter codes that have been in use for some time. It also introduces 220 ‘extended language’ subtags, mainly for backwards compatibility.

It continues to be best to refer to this specification as BCP 47. This is a non-changing name and web address that points to the latest relevant RFCs.

The Internationalization Working Group at the W3C is working on an article to help users choose language tags, given the various types of subtag that are now available, and the sheer number of subtags.

You can look up language and other subtags in the IANA Language Subtag Registry.

(Richard Ishida has provided an unofficial tool for searching the registry that also provides advice for choosing subtags, and allows you to partially validate a hyphen-separated language tag.)

New article: Using Unicode controls for bidi text

Read the article

FAQ-based article: If I’m unable to use markup to correctly order bidirectional text, what can I do?
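As a rough sketch of what using Unicode controls looks like when markup is not available (for example in a plain-text field), the code below wraps a string in embedding controls and appends a directional mark. The helper names are hypothetical, and which control suits a given situation is a judgment the article itself addresses; this only shows the mechanics.

```python
# Unicode bidirectional control characters (invisible formatting characters).
LRE = "\u202A"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202B"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING (closes an embedding)
LRM = "\u200E"  # LEFT-TO-RIGHT MARK
RLM = "\u200F"  # RIGHT-TO-LEFT MARK


def embed_rtl(text: str) -> str:
    """Wrap text in a right-to-left embedding, closed with PDF."""
    return RLE + text + PDF


def with_trailing_rlm(text: str) -> str:
    """Append RLM so trailing punctuation is treated as right-to-left."""
    return text + RLM


# Hebrew word followed by an exclamation mark, for use inside LTR plain text.
phrase = "\u05e9\u05dc\u05d5\u05dd!"
wrapped = embed_rtl(phrase)
marked = with_trailing_rlm(phrase)

# The controls add code points but no visible glyphs.
print(len(phrase), len(wrapped), len(marked))  # 5 7 6
```

Every embedding opened with RLE or LRE needs a matching PDF, much as an opening tag needs a closing tag in markup.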

By Richard Ishida, W3C.

Copyright © 2014 W3C ® (MIT, ERCIM, Keio, Beihang) Usage policies apply.