Monthly Archives: September 2009
The Internationalization Core Working Group has published Authoring HTML: Handling Right-to-left Scripts as a Working Group Note.
This document describes techniques for the use of HTML markup and CSS style sheets when creating content in languages that use right-to-left scripts, such as Arabic, Hebrew, Persian, Thaana, Urdu, etc. It builds on (but also goes beyond) markup needed to supplement the Unicode bidirectional algorithm, and also touches on how to prepare content that will later be localized into right-to-left scripts.
Editor: Richard Ishida.
The IETF has published RFC 5646, an update of Tags for Identifying Languages. This specification obsoletes RFC 4646, which in turn replaced RFCs 3066 and 1766.
RFC 5646 makes it possible to use over 7,000 three-letter ISO 639-3 language codes, in addition to the two-letter codes that have been in use for some time. It also introduces 220 ‘extended language’ subtags, mainly for backwards compatibility.
It continues to be best to refer to this specification as BCP 47. This is a stable name and web address that always points to the latest relevant RFCs.
The Internationalization Working Group at the W3C is working on an article to help users choose language tags, given the various types of subtag that are now available, and the sheer number of subtags.
You can look up language and other subtags in the IANA Language Subtag Registry.
(Richard Ishida has provided an unofficial tool for searching the registry that also provides advice for choosing subtags, and allows you to partially validate a hyphen-separated language tag.)
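To give a feel for what partially validating a hyphen-separated language tag can involve, here is a minimal Python sketch. It is not the code behind the tool mentioned above, and the names `parse_language_tag` and `_LANGTAG` are made up for this illustration. It only checks the shape of the most common subtags (language, one extended language, script, region, variants). It deliberately ignores extensions, private-use subtags, grandfathered tags and the longer reserved language subtags, and it does not look anything up in the IANA registry, so a tag that passes is merely well-formed, not necessarily valid.

```python
import re

# Shape-only pattern for a subset of the BCP 47 'langtag' syntax.
# Assumption: extensions, private use and grandfathered tags are out of scope.
_LANGTAG = re.compile(
    r"""
    (?P<language>[a-z]{2,3})                  # 2- or 3-letter primary language
    (?:-(?P<extlang>[a-z]{3}))?               # optional extended language
    (?:-(?P<script>[a-z]{4}))?                # optional 4-letter script
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?       # optional 2-letter or 3-digit region
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)  # zero or more variants
    """,
    re.VERBOSE | re.IGNORECASE,
)


def parse_language_tag(tag):
    """Return a dict of subtags if the tag has a plausible shape, else None."""
    match = _LANGTAG.fullmatch(tag)
    if match is None:
        return None
    parts = match.groupdict()
    # Turn "-rozaj-1994" into ["rozaj", "1994"]; an empty string becomes [].
    parts["variants"] = [v for v in parts["variants"].split("-") if v]
    return parts


if __name__ == "__main__":
    for example in ("en", "ar-EG", "zh-yue-Hant-HK", "sl-rozaj-1994", "en-"):
        print(example, "->", parse_language_tag(example))
```

A real validator would go one step further and check each subtag against the IANA Language Subtag Registry, which is the only way to tell a well-formed tag from a valid one.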