Internationalization (i18n) Activity

Making the World Wide Web worldwide!

Category: w3cWebUserAgents


Updated Working Draft: Indic Layout Requirements

Indic Layout Requirements describes the basic requirements for Indic script layout and text support on the Web and in Digital Publications. These requirements provide information for Web technologies such as CSS, HTML, and SVG about how to support users of Indic scripts. The current document focuses on Devanagari, but there are plans to widen the scope to encompass additional Indian scripts as time goes on.

Changes in the new version relate to initial letter styling in Devanagari text. Editorial changes were also made to bring the document in line with recent changes to the Internationalization Activity publishing process.


Updated Working Draft: Character Model for the World Wide Web: String Matching and Searching

Character Model for the World Wide Web: String Matching and Searching builds upon Character Model for the World Wide Web 1.0: Fundamentals to provide authors of specifications, software developers, and content developers a common reference on string identity matching on the World Wide Web and thereby increase interoperability.

This new version introduces numerous editorial changes, replaces some temporary terminology with better terms, and integrates the case folding text from the string matching algorithm into the case folding section. The document template was also adapted to match the new Internationalization publication process. See details of changes.
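As an illustration of the kind of string identity matching the document discusses, the following is a minimal sketch, not the algorithm defined in the Working Draft. It assumes that matching means Unicode canonical normalization (NFC) combined with full case folding, and the function names `fold_for_matching` and `strings_match` are hypothetical.

```python
import unicodedata


def fold_for_matching(text: str) -> str:
    """Return a comparison key: NFC-normalize, case fold, then NFC again.

    Assumption: this normalize/fold/normalize order is one common approach,
    not necessarily the procedure the Working Draft specifies.
    """
    folded = unicodedata.normalize("NFC", text).casefold()
    # Case folding can produce sequences that are no longer normalized,
    # so normalize once more before comparing.
    return unicodedata.normalize("NFC", folded)


def strings_match(a: str, b: str) -> bool:
    """Compare two strings by their folded, normalized keys."""
    return fold_for_matching(a) == fold_for_matching(b)


precomposed = "R\u00e9sum\u00e9"      # "é" as the single code point U+00E9
decomposed = "re\u0301sume\u0301"     # "e" followed by U+0301 COMBINING ACUTE ACCENT

print(precomposed == decomposed)               # False: the code points differ
print(strings_match(precomposed, decomposed))  # True
print(strings_match("Straße", "STRASSE"))      # True: full case folding maps ß to ss
```

A plain `==` comparison treats the two spellings of "résumé" as different strings even though they are canonically equivalent and differ only in case, which is the kind of interoperability problem a common reference on matching is meant to reduce.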

Additional Requirements for Bidi in HTML & CSS published as Working Group Note

Additional Requirements for Bidi in HTML & CSS was used to work through and communicate recommendations made to the HTML and CSS Working Groups for some of the most repetitive pain points prior to HTML5 and CSS3 for people working with bidirectional text in scripts such as Arabic, Hebrew, Thaana, etc.

It is being published now as a Working Group Note for the historical record in order to capture some of the thinking that lay behind the evolution of the specifications and to help people in the future working on bidi issues to understand the history of the decisions taken. Notes have been added to give a brief summary of what was actually implemented in the HTML or CSS specifications.


Linguistic Linked Data in selected domains: 5th and 6th LIDER Roadmapping Workshops to be held in July 2015

In the last one and a half years, the LIDER project has organized several roadmapping events to gather a broad community around the topic of linguistic linked data. In July this year, LIDER will engage with two selected communities. On July 6, the 5th LIDER roadmapping workshop will be held at Sapienza University of Rome. The topic will be cross-media linked data, and the event will feature several high-level speakers from the multimedia area. On July 13, LIDER will organize the 6th roadmapping workshop in Munich. The event will be hosted by Siemens and will focus on content analytics and linked data in healthcare and medicine.

Participation in both workshops is limited. If you are interested in the Rome event, please contact Tiziano Flati; for Munich, please contact Philipp Cimiano.

Announcing The Unicode® Standard, Version 8.0

Version 8.0 of the Unicode Standard is now available. It includes 41 new emoji characters (including five modifiers for diversity), 5,771 new ideographs for Chinese, Japanese, and Korean, the new Georgian lari currency symbol, and 86 lowercase Cherokee syllables. It also adds letters to existing scripts to support Arwi (the Tamil language written in the Arabic script), the Ik language in Uganda, Kulango in Côte d’Ivoire, and other languages of Africa. In total, this version adds 7,716 new characters and six new scripts. For full details on Version 8.0, see Unicode 8.0.

The first version of Unicode Technical Report #51, Unicode Emoji, is being released at the same time. That document describes the new emoji characters. It provides design guidelines and data for improving emoji interoperability across platforms, gives background information about emoji symbols, and describes how they are selected for inclusion in the Unicode Standard. The data is used to support emoji characters in implementations, specifying which symbols are commonly displayed as emoji, how the new skin-tone modifiers work, and how composite emoji can be formed with joiners. The Unicode website now supplies charts of emoji characters, showing vendor variations and providing other useful information.
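The skin-tone modifiers and joiners mentioned above can be illustrated at the code point level. The following is a minimal sketch under stated assumptions: the helper names `with_modifier`, `join_emoji`, and `code_points` are hypothetical, the specific emoji are chosen only as examples, and whether a given sequence renders as a single glyph depends on the platform and font.

```python
THUMBS_UP = "\U0001F44D"         # U+1F44D THUMBS UP SIGN
MEDIUM_SKIN_TONE = "\U0001F3FD"  # U+1F3FD, one of the five emoji modifiers
MAN = "\U0001F468"               # U+1F468 MAN
WOMAN = "\U0001F469"             # U+1F469 WOMAN
GIRL = "\U0001F467"              # U+1F467 GIRL
ZWJ = "\u200D"                   # U+200D ZERO WIDTH JOINER


def with_modifier(base: str, modifier: str) -> str:
    """A modifier simply follows the base emoji in the text."""
    return base + modifier


def join_emoji(*parts: str) -> str:
    """Form a composite sequence by placing ZWJ between the parts."""
    return ZWJ.join(parts)


def code_points(sequence: str) -> list[str]:
    """List the code points of a sequence in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in sequence]


print(code_points(with_modifier(THUMBS_UP, MEDIUM_SKIN_TONE)))
# ['U+1F44D', 'U+1F3FD']
print(code_points(join_emoji(MAN, WOMAN, GIRL)))
# ['U+1F468', 'U+200D', 'U+1F469', 'U+200D', 'U+1F467']
```

Because one visible emoji may span several code points, code that counts, truncates, or reverses text by code point can split such a sequence apart.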

Some of the changes in Version 8.0 and associated Unicode technical standards may require modifications in implementations. For more information, see Unicode 8.0 Migration and the migration sections of UTS #10, UTS #39, and UTS #46.


Provide input to the planning of the Big Data Value Chain: Contribute to BDVA Summit 18-19 June, Madrid

In the context of the Big Data Value Association Madrid Summit, 17-19th June, there are two sessions of specific relevance to standards and also to multilingualism: on 18th June a session on standardization, and on 19th June a session on Multilingual Data Value Chains. If you want to participate actively in these sessions or provide further feedback, please contact Felix Sasaki on standardization and Asun Gomez-Perez on Multilingual Data Value Chains. Presentations will be short in order to encourage wide participation.

If you cannot be in Madrid, please still provide your input; see the session links above for further instructions. The BDVA Summit will be crucial in shaping upcoming funding opportunities related to Big Data. Don’t miss the chance to describe your views on opportunities, challenges and potential solutions for the Big Data Value Chain!


Updated Working Draft: Language Tags and Locale Identifiers for the World Wide Web

Language Tags and Locale Identifiers for the World Wide Web describes the best practices for identifying or selecting the language of content as well as the locale preferences used to process or display data values and other information on the Web. It describes how document formats, specifications, and implementations should handle language tags, as well as extensions to language tags that describe the cultural or linguistic preferences referred to in internationalization as a “locale”.
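To make the idea of selecting content by language tag concrete, here is a minimal sketch of the "lookup" style of matching described in RFC 4647, in which a requested tag is progressively truncated until it matches an available one. It is an illustration under stated assumptions, not code from the Working Draft: the function name `lookup` and its parameters are hypothetical, and wildcard ranges ("*") and lists of several prioritized ranges are not handled.

```python
from typing import Iterable, Optional


def lookup(language_range: str,
           available: Iterable[str],
           default: Optional[str] = None) -> Optional[str]:
    """Return the best available tag for one requested language range.

    The range is compared case-insensitively and shortened from the right,
    one subtag at a time, until a match is found or nothing is left.
    """
    by_lower = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A single-character subtag (such as "x") only introduces what
        # follows it, so it is dropped along with the removed subtag.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


available_tags = ["en", "zh", "zh-Hant", "de"]
print(lookup("zh-Hant-TW", available_tags))        # zh-Hant
print(lookup("de-CH-1996", available_tags))        # de
print(lookup("ja", available_tags, default="en"))  # en
```

Truncating from the right means a request for Traditional Chinese as used in Taiwan falls back to Traditional Chinese in general before falling back to plain "zh", so the more specific preference is honored as far as the available content allows.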

Changes in this update include the following: All references to RFC3066bis were updated to BCP 47 or to RFC 5646 or RFC 4647 as appropriate. References to HTML were changed to point to HTML5. The text formerly contained in Web Services Internationalization Usage Scenarios defining internationalization, locale, and other important terms was imported and rewritten. The other sections of this document were modified and reorganized. The Web services materials were moved to an appendix.


Updated Working Draft: Requirements for Hangul Text Layout and Typography

The updated Working Draft of Requirements for Hangul Text Layout and Typography brings the English version of the draft into line with a number of changes prompted by feedback that were added to the editor’s copy. Notes pointing to as yet unresolved comments were also added to the document. It also points to the new location of the editor’s draft, on GitHub, and suggests the use of GitHub issues for future comments.

The document describes requirements for general Korean language/Hangul text layout and typography realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a project to develop the international standard for Korean text layout.


Multilingual Linked Data for a Digital Single Market – Dedicated LD4LT call, 2 April 2015, 3 p.m. CEST

The LIDER project is fostering the creation of a community around Linguistic Linked Data (LLD): linked data used to represent metadata about linguistic resources and the resources themselves, e.g. lexica, thesauri, corpora, multilingual semantic networks etc. In a dedicated LD4LT community group call on 2 April, 3 p.m. CEST, we will discuss how LLD can contribute to the creation of the digital single market. For more details, see the slides that will be presented during the call.

The call is open to the public; no LD4LT group participation is required. Dial-in information is available. No knowledge about linguistic linked data is required.

Program published for W3C MultilingualWeb Workshop in Riga, 29 April

See the program. The keynote speaker will be Page Williams, Director of Global Readiness, Trustworthy Computing, Microsoft. She will be followed by a strong lineup in sessions entitled Developers and Creators, Localizers, Machines, and Users, including speakers from Microsoft, the European Parliament, the UN FAO, Intel, Verisign, and many more. The workshop is made possible with the generous support of the LIDER project.

Participation in the event is free. Please register via the Riga Summit for the Multilingual Digital Single Market site.

The MultilingualWeb workshops, funded by the European Commission and coordinated by the W3C, look at best practices and standards related to all aspects of creating, localizing and deploying the multilingual Web. The workshops are successful because they attract a wide range of participants, from fields such as localization, language technology, browser development, content authoring and tool development, etc., to create a holistic view of the interoperability needs of the multilingual Web.

We look forward to seeing you in Riga!

Copyright © 2014 W3C ® (MIT, ERCIM, Keio, Beihang) Usage policies apply.