On 18 June the MultilingualWeb-LT Working Group holds a showcase event in Dublin about the upcoming Internationalization Tag Set (ITS) 2.0 specification. Group participants demonstrate implementations for authoring ITS 2.0 data categories, for using them in localization workflows, and for improving machine translation or other language technology processes with ITS 2.0. Participation is free, but registration is required.
The draft implements all changes since the previous publication of 11 April 2013. There are no remaining open issues. The Working Group is planning to finalize ITS 2.0 now: this is your last chance to provide feedback! The Last Call period ends on 11 June.
ITS 2.0 provides metadata to foster the adoption of the multilingual Web.
The Internationalization Working Group has published a First Public Working Draft of Requirements for Hangul Text Layout and Typography and is looking for feedback.
This document describes requirements for general Korean language/Hangul text layout and typography realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a project to develop the international standard for Korean text layout.
A Korean version of the document is also available (한국어 텍스트 레이아웃 및 타이포그래피를 위한 요구사항), but the English version is the authoritative version.
Unicode Bidirectional Algorithm basics is a repackaging of the initial part of “What you need to know about the bidi algorithm and inline markup” as a standalone article. It provides a gentle introduction to the behaviour of the Unicode Bidirectional Algorithm, and helps you understand why bidirectional text in Arabic, Hebrew, Thaana, Urdu, etc. behaves the way it does.
Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts
This tutorial has been completely rewritten to bring it in line with the current tutorial format. Rather than containing duplicate content, it now introduces the novice to key concepts and points to useful further reading in an organized fashion.
Text direction and structural markup in HTML
This article has been created from material formerly in the tutorial “Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts” and augmented with information about new HTML5 markup constructs that are beginning to see adoption. It should be regarded as a new article, focusing on applying bidi markup to document- and block-level content, including forms.
What you need to know about the bidi algorithm and inline markup
This is an update of an existing article, but it has been almost completely rewritten. The most significant changes are the new parts describing how to apply the new HTML5 constructs which are beginning to see adoption. Additional changes will be needed as HTML5 bidi markup is finalised over the coming months. The article also proposes a simpler way to approach markup of bidi text, particularly useful for those with less experience, that relies less on a deep understanding of the issues involved.
Visual vs. logical ordering of text
This is a new article created from material that has been removed from the previously mentioned articles. It was moved into a separate article because visual ordering is much less important these days, and to avoid duplication. Only a few changes have been made to the content itself.
eBooks & i18n: Richer Internationalization for eBooks on 4 June 2013 in Tokyo, Japan, will investigate international functionality that needs to be added to the Open Web Platform. The Open Web Platform includes core W3C technologies such as HTML, CSS, SVG, XML, XSLT, XSL-FO, PNG, RDF, and many more, that are used extensively in eBooks and eBook production.
The goal is to make the various eBook reading platforms suitable for electronic books that use the printing and typesetting traditions of different cultures. If you are interested in participating, please submit a position paper by 30 April 2013. See the Call for Participation for details.
An Indic Layout Task Force has just been announced, as part of the W3C Internationalization Activity. Similar to the very successful Japanese Layout Task Force, the Indic group will provide input to the W3C Open Web Platform related to Indic Languages and Layout.
This task force will gather and integrate feedback from the participating members about the needs and technical feasibility of Indic requirements, and will report the results of its activities as a group back to the Internationalization Core Working Group, as well as to other relevant groups and to the W3C membership and community.
The chair of the Task Force is Swaran Lata, the contact person at the Indian Office of W3C is Somnath Chandra, and the Staff Contact is Richard Ishida. See the home page for more information.
To participate in, or follow, the work of the Task Force, please subscribe to its mailing list. Doing so also makes you a member of the Internationalization Interest Group.
The MultilingualWeb-LT Working Group published a First Public Working Draft of Metadata for the Multilingual Web – Usage Scenarios and Implementations. This document introduces a variety of usage scenarios and applications for the Internationalization Tag Set (ITS) 2.0, ranging from simple machine translation or human translation quality checks to training for machine translation systems or automatic text analysis. Many of the underlying implementations will be showcased at the upcoming W3C MultilingualWeb Workshop on 12-13 March in Rome.
Until now, it has been very difficult for web application designers to do something as simple as sort names correctly according to the user’s language. The new standard ECMA-402 changes this. It provides:
- string comparison for sorting (such as for Swedish, where “ö” is a separate letter that sorts after “z”),
- number and currency formatting (such as “1.234,56 €” for a German language euro presentation, or the following choices for a Serbian language USD presentation: 12.345,12 US$, 12.345,12 USD or 12.345,12 америчких долара),
- date and time formatting capabilities (such as 2012年12月12日 for a Japanese language date, or for a French date: mercredi 12 décembre 2012).
ECMA-402, ECMAScript Internationalization API Specification, is available free of charge from the Ecma International website. See also An introduction to the standard.
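As a rough illustration of the three capabilities listed above, the following TypeScript sketch uses the Intl.Collator, Intl.NumberFormat and Intl.DateTimeFormat constructors defined by ECMA-402. The helper names (sortForLocale, formatCurrency, formatLongDate) are hypothetical and chosen only for this example, and the exact output strings depend on the locale data shipped with the runtime, so treat this as a sketch rather than a reference.

```typescript
// Hypothetical helpers wrapping the ECMA-402 Intl constructors.
// Results depend on the locale data available in the runtime.

// Sort a copy of the list using the collation rules of the given locale.
function sortForLocale(words: string[], locale: string): string[] {
  const collator = new Intl.Collator(locale);
  return words.slice().sort(collator.compare);
}

// Format an amount as a currency value for the given locale.
function formatCurrency(amount: number, locale: string, currency: string): string {
  return new Intl.NumberFormat(locale, { style: "currency", currency }).format(amount);
}

// Format a date in long form (weekday, day, month name, year).
// The time zone is pinned to UTC so the result does not depend on the host.
function formatLongDate(date: Date, locale: string): string {
  return new Intl.DateTimeFormat(locale, {
    weekday: "long",
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  }).format(date);
}

const letters = ["ö", "z", "a"];
console.log(sortForLocale(letters, "sv")); // Swedish rules: "ö" sorts after "z"
console.log(sortForLocale(letters, "de")); // German rules: "ö" sorts near "o"

console.log(formatCurrency(1234.56, "de-DE", "EUR"));

const day = new Date(Date.UTC(2012, 11, 12)); // 12 December 2012
console.log(formatLongDate(day, "ja-JP"));
console.log(formatLongDate(day, "fr-FR"));
```

Formatted numbers often contain non-breaking spaces rather than plain spaces, so comparing the output against a literal string usually requires normalizing whitespace first.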
A new FAQ page devoted to the topic of private-use characters, noncharacters, and sentinels has been posted on the Unicode web site. This FAQ aims to clear up confusion about whether noncharacters are permitted in Unicode text, and how they differ from ordinary private-use characters. The recently published Corrigendum #9: Clarification About Noncharacters makes it clear that noncharacters are permitted even in interchange, and the new FAQ page addresses some of the fine points about their usage and about differences from other types of Unicode code points. The brief mentions of noncharacters in other FAQ pages have also been updated accordingly.
Are you unclear about what Unicode “noncharacters” even are? The new FAQ page also answers basic questions about noncharacters and private-use characters, and provides a bit of history about how they came to be part of the Unicode Standard.
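To make the distinction concrete, here is a minimal TypeScript sketch that classifies a code point as a noncharacter or a private-use character. It relies on the ranges fixed by the Unicode Standard: the 66 noncharacters are U+FDD0..U+FDEF plus the last two code points of each of the 17 planes, and the private-use areas are U+E000..U+F8FF together with planes 15 and 16 minus their final two code points. The function names are hypothetical and not part of any standard API.

```typescript
// Hypothetical helpers; the ranges come from the Unicode Standard.

// True for the 66 noncharacter code points.
function isNoncharacter(cp: number): boolean {
  if (!Number.isInteger(cp) || cp < 0 || cp > 0x10ffff) return false;
  // Contiguous block of 32 noncharacters in the BMP.
  if (cp >= 0xfdd0 && cp <= 0xfdef) return true;
  // Last two code points of every plane: U+nFFFE and U+nFFFF.
  return (cp & 0xfffe) === 0xfffe;
}

// True for code points in the three private-use areas.
function isPrivateUse(cp: number): boolean {
  return (
    (cp >= 0xe000 && cp <= 0xf8ff) ||       // Private Use Area (BMP)
    (cp >= 0xf0000 && cp <= 0xffffd) ||     // Supplementary Private Use Area-A
    (cp >= 0x100000 && cp <= 0x10fffd)      // Supplementary Private Use Area-B
  );
}

// Count the noncharacters by scanning the whole code space.
let noncharacterCount = 0;
for (let cp = 0; cp <= 0x10ffff; cp++) {
  if (isNoncharacter(cp)) noncharacterCount++;
}
console.log(noncharacterCount); // 66
```

The two sets never overlap: U+FFFFE and U+FFFFF, for instance, fall inside plane 15 but are noncharacters, not private-use characters.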