The W3C Internationalization (I18n) Activity works with W3C working groups and liaises with other organizations to make it possible to use Web technologies with different languages, scripts, and cultures. From this page you can find articles and other resources about Web internationalization, and information about the groups that make up the Activity.
This article was updated to reflect the latest version of text quoted from the HTML5 spec. In addition, editorial changes were made to improve the readability of the article and bring it in line with more recent templates.
Translators are requested to update the German, Spanish, Hungarian, Portuguese, Russian and Ukrainian translations appropriately.
Language Tags and Locale Identifiers for the World Wide Web describes the best practices for identifying or selecting the language of content as well as the locale preferences used to process or display data values and other information on the Web. It describes how document formats, specifications, and implementations should handle language tags, as well as extensions to language tags that describe the cultural or linguistic preferences referred to in internationalization as a “locale”.
Changes in this update include the following: All references to RFC3066bis were updated to BCP 47 or to RFC 5646 or RFC 4647 as appropriate. References to HTML were changed to point to HTML5. The text formerly contained in Web Services Internationalization Usage Scenarios defining internationalization, locale, and other important terms was imported and rewritten. The other sections of this document were modified and reorganized. The Web services materials were moved to an appendix.
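As an illustration of the kind of language tag handling the document covers, the following is a minimal sketch of the basic filtering scheme from RFC 4647: a language range matches a tag if, compared case-insensitively, it equals the tag or equals a prefix of the tag that is immediately followed by a hyphen. The function name `basic_filter` and the sample tag list are hypothetical and chosen only for this example; the sketch does not validate that its inputs are well-formed BCP 47 tags.

```python
from typing import List


def basic_filter(language_range: str, tags: List[str]) -> List[str]:
    """Return the tags matched by language_range under RFC 4647 basic filtering.

    Hypothetical helper for illustration; it does not check tag well-formedness.
    """
    wanted = language_range.lower()
    if wanted == "*":
        # The wildcard range matches every tag.
        return list(tags)
    matched = []
    for tag in tags:
        candidate = tag.lower()
        # Match on equality, or on a prefix that ends exactly at a "-" boundary.
        if candidate == wanted or candidate.startswith(wanted + "-"):
            matched.append(tag)
    return matched


# Sample (made-up) list of available content languages.
available = ["de", "de-CH", "de-Latn-DE", "den", "en-GB"]
print(basic_filter("de", available))  # prints ['de', 'de-CH', 'de-Latn-DE']
```

Note that "den" is not matched by the range "de": the prefix must end at a subtag boundary, which is why the comparison appends a hyphen rather than testing a bare string prefix.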
The updated Working Draft of Requirements for Hangul Text Layout and Typography brings the English version of the draft into line with a number of changes prompted by feedback that were added to the editor’s copy. Notes pointing to as yet unresolved comments were also added to the document. It also points to the new location of the editor’s draft, on GitHub, and suggests the use of GitHub issues for future comments.
The document describes requirements for general Korean language/Hangul text layout and typography realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a project to develop the international standard for Korean text layout.
This article was updated to emphasize that UTF-8 should be the default character encoding on the Web. In addition, editorial changes were made to improve the readability of the article and bring it in line with more recent templates.
Translators are requested to update their translations appropriately.
Multilingual Linked Data for a Digital Single Market – Dedicated LD4LT call, 2 April 2015, 3 p.m. CEST
The LIDER project is fostering the creation of a community around Linguistic Linked Data (LLD): linked data used to represent metadata about linguistic resources and the resources themselves, e.g. lexica, thesauri, corpora and multilingual semantic networks. In a dedicated LD4LT community group call on 2 April, 3 p.m. CEST, we will discuss how LLD can contribute to the creation of the digital single market. For more details, see the slides that will be presented during the call.
See the program. The keynote speaker will be Page Williams, Director of Global Readiness, Trustworthy Computing, Microsoft. She is followed by a strong lineup in sessions entitled Developers and Creators, Localizers, Machines, and Users, including speakers from Microsoft, the European Parliament, the UN FAO, Intel, Verisign, and many more. The workshop is made possible with the generous support of the LIDER project.
Participation in the event is free. Please register via the Riga Summit for the Multilingual Digital Single Market site.
The MultilingualWeb workshops, funded by the European Commission and coordinated by the W3C, look at best practices and standards related to all aspects of creating, localizing and deploying the multilingual Web. The workshops are successful because they attract a wide range of participants, from fields such as localization, language technology, browser development, content authoring and tool development, to create a holistic view of the interoperability needs of the multilingual Web.
We look forward to seeing you in Riga!
The LIDER project is developing LingHub, a repository of metadata about language resources and linguistic data. During a dedicated conference call on 19 March, 3 p.m. CET, LingHub will be discussed in the LD4LT community group to gather feedback from the public at large. The call is open to the public; no LD4LT group participation is required. Dial-in information is available. The call will be relevant for anybody interested specifically in language resources, or in public metadata repositories and the re-use of public sector information in general.
The Unicode® Consortium announced the start of the beta review for Unicode 8.0.0, which is scheduled for release in June, 2015. All beta feedback must be submitted by April 27, 2015.
Unicode 8.0.0 comprises several changes which require careful migration in implementations, including the conversion of Cherokee to a bicameral script, a different encoding model for New Tai Lue, and additional character repertoire. Implementers need to change code and check assumptions regarding case mappings, New Tai Lue syllables, Han character ranges, and confusables. Character additions in Unicode 8.0.0 include emoji symbol modifiers for implementing skin tone diversity, other emoji symbols, a large collection of CJK unified ideographs, a new currency sign for the Georgian lari, and six new scripts. For more information on emoji in Unicode 8.0.0, see the associated draft Unicode Emoji report.
Please review the documentation, adjust code, test the data files, and report errors and other issues to the Unicode Consortium by April 27, 2015. Feedback instructions are on the beta page.
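As one way to check assumptions about case mappings, the sketch below inspects the Unicode data bundled with a Python runtime and lists which Cherokee letters have a lowercase mapping. It assumes the well-known Cherokee letter range U+13A0 to U+13F5; the helper names `case_pairs` and `unicode_at_least` are hypothetical and exist only for this example. The result depends on the Unicode version of the runtime, so the script reports that version rather than assuming it.

```python
import unicodedata


def case_pairs(first: int, last: int):
    """Yield (char, lowercase) for code points in [first, last] whose
    lowercase form differs from the character itself in this runtime."""
    for cp in range(first, last + 1):
        ch = chr(cp)
        lower = ch.lower()
        if lower != ch:
            yield ch, lower


def unicode_at_least(major: int) -> bool:
    """Return True if the runtime's Unicode data is at or above the given major version."""
    return int(unicodedata.unidata_version.split(".")[0]) >= major


print("Unicode data version:", unicodedata.unidata_version)

# Cherokee letters (assumed range U+13A0..U+13F5); before Unicode 8.0 these were caseless.
pairs = list(case_pairs(0x13A0, 0x13F5))
print(len(pairs), "Cherokee letters have a lowercase mapping in this runtime")

if unicode_at_least(8):
    # With bicameral Cherokee, the traditional letters are uppercase
    # and the newly encoded small letters are their lowercase forms.
    print("U+13A0 lowercases to U+%04X" % ord("\u13a0".lower()))
```

Code that previously treated Cherokee text as unaffected by case conversion or case-insensitive comparison is the kind of assumption such a check is meant to surface.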
Contribute to the foundations of linguistic linked data processing: dedicated LD4LT call on the LIDER reference architecture
The LIDER project is developing a reference architecture for working with Linguistic Linked Data (LLD). LLD is linked data used to represent metadata about linguistic resources and the resources themselves, e.g. lexica, thesauri, corpora and multilingual semantic networks. The reference architecture defines various aspects of LLD processing, such as LLD publishing, linking, services and discovery. As part of this activity, the LD4LT community group is organizing a conference call on 5 March, 3 p.m. CET, to gather feedback from the public at large.
The call is open to the public; no LD4LT group participation is required. Dial-in information is available. No knowledge about LLD is required. We are especially interested in feedback from potential users of LLD in application areas related to content analytics.