These are the items that typically make their way to the I18N home page. They include information about general events and about newly published resources of all types. They typically do not include descriptions of changes to existing documents, items for review only, etc.
Items are in chronological order, with the newest at the top. See the sidebar for other specialized lists and related RSS feeds.
The W3C Internationalization Activity home page was converted to a blog format in April of this year. The blog supersedes these news filter pages, although similar categories will be used to group blog posts. The old pages will remain available as a historical record. The new blog approach also makes it possible to easily host short articles with a comment facility, such as requests for public feedback.
If you are subscribed to this RSS feed, you should now subscribe to this new feed.
The article looks at design and development practices that can cause major problems for translation. Designers must be very careful about how they split up and reuse text on-screen because the linguistic differences between languages can lead to real headaches for localizers and may in some cases make a reasonable translation impossible to achieve. [search key: composite-messages]
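A minimal sketch of the kind of problem described, assuming a hypothetical two-language message catalog (the strings, the `CATALOG` dictionary and the function names are illustrative and do not come from the article). It contrasts a message glued together from fragments with one stored as a single translatable template:

```python
# Hypothetical catalog: each entry is one complete sentence with named
# placeholders, so a translator can move the parts around as the language needs.
CATALOG = {
    "en": "{count} files copied to {folder}",
    "de": "{count} Dateien wurden nach {folder} kopiert",
}

def composite_message(count: int, folder: str) -> str:
    # Fragile: the English word order is fixed in code, so a translator
    # only ever sees the fragment " files copied to ".
    return str(count) + " files copied to " + folder

def template_message(lang: str, count: int, folder: str) -> str:
    # More robust: the whole sentence is translated as a unit.
    return CATALOG[lang].format(count=count, folder=folder)

print(composite_message(3, "Backup"))
print(template_message("de", 3, "Backup"))
```

In the German entry the verb moves to the end of the sentence, which the concatenated version cannot express. The sketch deliberately ignores plural forms, which would need further handling.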
The article looks at a particular design and development practice that can cause major problems for translation of content. Many programmers and designers decide that if a particular string is used in many places, they will reuse a single copy of that string rather than maintain many identical strings. String reuse is not necessarily a bad thing. The trick is to know what constitutes a good candidate for reuse and what does not. If you get it wrong, you can create an insuperable obstacle to good localization. [search key: text-reuse]
The W3C GEO Working Group has developed a set of Quick Tips to help newcomers to Web internationalization. They summarize important concepts related to international Web design in a similar way to the popular WAI Quick Tips. These tips are not complete guidelines; they are simply a few key concepts to bear in mind. The page also links to supporting material, where available, at the W3C's Internationalization Activity subsite.
The document is linked from the new Getting Started page that also explains various ways to find information on the W3C Internationalization subsite, and points to some key definitions. [search key: quicktips]
The Internationalization Tag Set Working Group has published an updated Working Draft of the Internationalization Tag Set (ITS). Organized by data categories, this set of elements and attributes supports the internationalization and localization of schemas and documents. Implementations are provided for DTDs, XML Schema and Relax NG, and for existing vocabularies like XHTML, DocBook and OpenDocument. [search key: itsrec]
The Math IG has just published this Note, which analyzes potential problems with the use of MathML for the presentation of mathematics in the notations customarily used with Arabic and related languages. The goal is to clarify avoidable implementation details that hinder such presentation, as well as to uncover genuine limitations in the specification. These limitations in the MathML specification may require extensions in future versions of the specification.
Note: The XHTML+MathML version displays the examples better, if your user agent supports it.
The W3C GEO Working Group has published the first in a series of articles aimed at those who are new to internationalization. These pages will introduce you to key internationalization topics and tasks, and direct you towards articles or resources on the W3C Internationalization subsite that will take you to the next level of understanding.
This document introduces topics in the general area of character sets, encoding, escapes, etc.
The document is linked from a new 'Getting Started' page that also explains various ways to find information on the W3C Internationalization subsite, and points to some key definitions. [search key: gettingstarted/characters]
Molly Holzschlag, I18n GEO Working Group member, recently published an on-line article entitled Putting the World into "World Wide Web", part of a series entitled 24 ways to impress your friends. She makes the point that the art and science of creating sites for global audiences requires a lot more preparation and planning than one might think at first glance.
What do the terms 'internationalization' and 'localization' mean, and how are they related? [search key: qa-i18n]
When should I use xml:lang and when should I define my own element or attribute for passing language values in an XML document schema (DTD)? [search key: qa-when-xmllang]
This document defines data categories and their implementation as a set of elements and attributes called the Internationalization Tag Set (ITS). ITS is used with new and existing schemas to support the internationalization and localization of schemas and documents. Implementations of ITS are provided for three schema languages: XML DTDs, XML Schema and RELAX NG. In addition, implementations are provided as fixed modularizations of various existing vocabularies (e.g. XHTML, DocBook, Open Document). The definition of the data categories is still in an early draft stage. [search key: itsrec]
When creating schemas (XML Schema, DTD, etc.), it is important to include constructs that meet the needs of content authors dealing with international audiences, and address the needs of the localization community. This document provides a list of key requirements to achieve such a goal. It will be used to provide a framework and direction for a detailed solution proposal (or set of proposals) to be developed later. [search key: itsreq]
The W3C Director and Internationalization Core Working Group members would like to express our most sincere thanks to Addison Phillips for his contributions to W3C and for chairing the Working Group for the past two years.
We are fortunate that François Yergeau (Invited Expert) has accepted the role of Chair of the I18n Core Working Group. François has been contributing to the W3C since 1996.
Richard Ishida presented at the W3C Indian Office opening in New Delhi, India. The Office is hosted by the Centre for Development of Advanced Computing (C-DAC Noida). Internationalization was high on the agenda of topics discussed during and around the opening ceremony. Only a small proportion of the nearly 1 billion Indians can work in English; for the rest there are 22 constitutional languages and 11 scripts. It is expected that the new office will facilitate the representation of Indian requirements in the development of W3C technologies.
W3C held a Workshop on Internationalizing the Speech Synthesis Markup Language (SSML) on 2-3 November hosted by IBM at the IBM China Research Lab in Beijing, China. Attendees discussed ways to improve rendering of non-English natural languages using the SSML W3C Recommendation which generates synthetic speech and controls pronunciation, volume, pitch and rate. Discussion focussed on topics such as word and tone disambiguation in East Asian languages, and dealing with de-accented text. The Voice Browser WG is discussing the recommendations made, and may hold another workshop to capture requirements for other languages.
The Internationalization Core Working Group has published an updated Working Draft of Character Model for the World Wide Web 1.0: Normalization to improve text manipulation on the Web. Based on the character model Fundamentals W3C Recommendation, the draft provides authors of specifications, software developers, and content developers with a common reference for text normalization and string identity matching. [search key: charmod-norm]
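To illustrate why normalization matters for string identity matching, here is a small sketch using Unicode Normalization Form C from Python's standard `unicodedata` module. The choice of NFC and the helper name `same_text` are assumptions made for this example; the Working Draft's own rules are not reproduced here.

```python
import unicodedata

def same_text(a: str, b: str) -> bool:
    # Compare two strings after bringing both to the same normalization form.
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00e9"   # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

print(precomposed == decomposed)           # code point sequences differ
print(same_text(precomposed, decomposed))  # equal once normalized
```

Both strings display as the same character, but a naive comparison treats them as different unless they are normalized first.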
Based on discussions with the XQuery and XSL Working Groups, the Internationalization Core Working Group has released Working with Time Zones as a Working Group Note. The document discusses problems encountered when working with the date, time, and dateTime values from XML Schema when time zone offsets are included or omitted. It offers guidelines for working with field-based dates and times, for working with date and time values that require a time zone, and for comparing times. [search key: timezone]
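The difficulty of comparing values with and without time zone offsets can be sketched with Python's standard `datetime` module. The sample values and the helper `safe_equal` are hypothetical and only illustrate the general problem; they are not the guidelines given in the Note.

```python
from datetime import datetime, timedelta, timezone

# Two values that carry an offset and denote the same instant.
with_offset_a = datetime(2005, 10, 13, 9, 0, tzinfo=timezone(timedelta(hours=-5)))
with_offset_b = datetime(2005, 10, 13, 14, 0, tzinfo=timezone.utc)

# A value with no offset: the instant it denotes is unknown.
no_offset = datetime(2005, 10, 13, 9, 0)

def safe_equal(x: datetime, y: datetime):
    """Return True or False when both values have an offset or both lack one,
    and None when only one of them has an offset."""
    if (x.tzinfo is None) != (y.tzinfo is None):
        return None  # not comparable without guessing a time zone
    return x == y

print(safe_equal(with_offset_a, with_offset_b))
print(safe_equal(with_offset_a, no_offset))
```

Returning `None` for the mixed case makes the missing information explicit instead of silently assuming a zone.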
The Internationalization Core Working Group has released the First Public Working Draft of Web Services Internationalization (WS-I18N). The draft enhances SOAP messaging for locale and international preference negotiation and defines a locale policy. Without using Accept-Language and user identity, implementations can handle the requester's locale, locale policy and language preference. [search key: ws-i18n]
How do I change the encoding of my (X)HTML pages to UTF-8? [search key: qa-changing-encoding]
The 28th Internationalization & Unicode Conference will be held 7-9 September in Orlando, Florida, USA. Team members Richard Ishida and Felix Sasaki will present several papers at the conference, the premier technical event worldwide for software and Web internationalization. Read about Unicode and the W3C Internationalization Activity.
The Internationalization Tag Set (ITS) Working Group has released the First Public Working Draft of Internationalization and Localization Markup Requirements. Addressing the main challenges and issues of internationalizing and localizing XML documents, the draft outlines requirements for vocabulary, guidelines and mechanisms to meet the needs of content authors, developers and the localization community. [search key: itsreq]
What are character entities and NCRs, and when should I use them? [search key: qa-escapes]
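As a rough illustration of the two escape forms the question refers to, the sketch below uses only Python's standard library. The helper names `to_ncrs` and `from_escapes` are made up for this example, and the sketch does not address when escapes should be used.

```python
import html

def to_ncrs(text: str) -> str:
    # Replace every non-ASCII character with a decimal numeric character
    # reference (NCR) such as &#233;.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

def from_escapes(markup: str) -> str:
    # Resolve both named character entities and NCRs to the characters
    # they stand for.
    return html.unescape(markup)

print(to_ncrs("café"))
print(from_escapes("caf&eacute;"))
print(from_escapes("caf&#xE9;"))
```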
What are the best practices for using pull-down menus based on the select element to direct visitors to localized content? [search key: qa-navigation-select]
Work has been done at the W3C to enable support for ruby text in XHTML 1.1. This is especially useful for Japanese and other East Asian content. It allows small annotations to be rendered above and below base text, such as is needed to support Japanese furigana. This tutorial covers:
The tutorial (originally developed for the WWW2005 Conference in Chiba, Japan) has just been republished in non-draft status, after incorporation of changes based on review comments. [search key: ruby]
Richard Ishida delivered a keynote talk entitled Internationalizing the Web for Africa at the Pan-African Localization Workshop in Casablanca, 13-15 June. The workshop, organized by the IDRC, brought together participants from 12 African countries and experts from other continents to discuss how to better localize ICT into indigenous languages and scripts so as to promote rapid and fair development in Africa. There were also visits from the Moroccan Minister for General and Economic Affairs and the Canadian Ambassador to Morocco, who both expressed support for its aims.
The workshop sets a foundation for future networking and information sharing via the development of a collaborative, Web-based site which will provide useful information to and support the initiatives of a pan-African community of localizers.
Felix Sasaki joins the W3C Team as of 1 April to work within the Internationalization Activity. Felix will be based in Keio University, in Japan, from the end of April.
Felix studied Japanese and Linguistics in Berlin, Nagoya (Japan) and Tokyo. Since 1999 he has worked in the Department of Computational Linguistics and Text-technology at the University of Bielefeld (Germany), where he finished his PhD in 2004. The PhD deals with the integration of heterogeneous linguistic resources using XML-based (e.g. linguistic corpora) and RDF-based (e.g. lexica, conceptual models) representations.
Felix replaces Martin Dürst, who has left the W3C to take up a post at Aoyama Gakuin University in Japan. We wish Martin success for the future, and thank him for his dedication and hard work in leading the internationalization effort for many years. Richard Ishida now becomes Internationalization Activity Lead.
A techniques index has been added to the W3C Internationalization site. This provides an overview of all current techniques documents. It also provides quick access to a summary of techniques and useful links on a task by task basis.
In addition, it is now possible to search within the Internationalization sub-site by typing your search text into the brown box at the top right of pages on the site.
Specifying the language of content is useful for a wide range of applications, from linguistically sensitive searching to applying language-specific display properties. In some cases the potential applications for language information are still waiting for implementations to catch up, whereas in others, such as detection of language by voice browsers, it is a necessity today. Marking up language information is something that can and should be done today. Without it, it is not possible to take advantage of any of these applications.
This document is one of a series of documents providing HTML authors with techniques for developing internationalized HTML using XHTML 1.0 or HTML 4.01, supported by CSS1, CSS2 and some aspects of CSS3. It focuses specifically on advice about specifying the language of content. It is produced by the Internationalization GEO (Guidelines, Education & Outreach) Working Group of the W3C Internationalization Activity. [search key: html-tech-lang]
Language tags are used to indicate the language of text in HTML and XML documents, and are also used in HTTP headers, SMIL and SVG switch statements, CSS pseudo-elements, etc. This article describes how to choose values for language tags.
The article augments an existing article with information that previously existed in a tutorial. The article title was also changed from "Language tagging in HTML and XML". [search key: language-tags]
The World Wide Web Consortium today released Character Model for the World Wide Web 1.0: Fundamentals as a W3C Recommendation. The document allows Web applications to transmit and process the characters of the world's languages. Building on the Universal Character Set defined by Unicode and ISO/IEC 10646, it gives authors of specifications, software developers, and content developers a common reference for text manipulation. Read the press release. [search key: charmod]
W3C announces support for the publication of RFC 3987 Internationalized Resource Identifiers (IRIs) as an IETF Proposed Standard, together with STD 66, RFC 3986, Uniform Resource Identifier (URI): Generic Syntax (Press release).
IRIs expand the set of characters in URIs from a subset of US-ASCII to the Universal Character Set (Unicode/ISO 10646). They allow content developers and users to identify resources such as Web pages in their own languages. The IRI specification was in part developed by the Internationalization Working Group. The IRI specification will also provide a definitive reference for many W3C specifications - such as XML, RDF, XHTML and SVG.
See also the article An Introduction to Multilingual Web Addresses.
Recent developments enable you to add non-ASCII characters to Web addresses. This article provides a high level introduction to how this works. It is aimed at content authors and general users who want to understand the basics without too many gory technical details. [search key: idn-and-iri]
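A small sketch of the two conversions commonly involved when a Web address contains non-ASCII characters, using only Python's standard library. The function names and sample addresses are hypothetical, and Python's built-in `idna` codec implements the older IDNA 2003 rules, so treat this as an approximation rather than a full implementation.

```python
from urllib.parse import quote

def host_to_ascii(host: str) -> str:
    # Domain name: each non-ASCII label becomes an ASCII "xn--" label.
    return host.encode("idna").decode("ascii")

def path_to_uri(path: str) -> str:
    # Path: non-ASCII characters are encoded as UTF-8 bytes and then
    # percent-escaped; "/" is left untouched.
    return quote(path, safe="/")

print(host_to_ascii("bücher.example"))
print(path_to_uri("/wiki/é"))
```

The domain name and the path are handled by different mechanisms, which is why the sketch keeps them in separate functions.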
W3C is pleased to announce the relaunch of the Internationalization Activity. The Internationalization Tag Set (ITS) Working Group, chaired by Yves Savourel (Enlaso), is chartered with new work to develop elements and attributes to support document internationalization and localization. The Internationalization Core Working Group, chaired by Addison Phillips (webMethods), and the Internationalization Guidelines, Education & Outreach (GEO) Working Group, chaired by Richard Ishida (W3C), were formerly task forces. All three Working Groups and the Internationalization Interest Group, chaired by Martin Dürst (W3C), are chartered through October 2006.
The Working Groups are now looking for additional participants.
W3C has opened a position in Internationalization at Keio University in Japan, because Martin Dürst is leaving the W3C Team at the end of March. For details, see the description and instructions.
(For other open positions at W3C, please see the Job Opportunities list.)
A new Character Model document dealing with Resource Identifiers was published as a Candidate Recommendation today. The content of this document was previously part of the Character Model Fundamentals document.
It is an architectural specification providing authors of specifications, software developers, and content developers with a common reference for the use of resource identifiers building on the Universal Character Set, defined jointly by the Unicode Standard and ISO/IEC 10646.
Editors: Martin J. Dürst, François Yergeau, Richard Ishida, Misha Wolf, Tex Texin [search key: charmod-resid]
The Character Model Fundamentals document moved to Proposed Recommendation status today.
This is an architectural specification providing authors of specifications, software developers, and content developers with a common reference for interoperable text manipulation on the World Wide Web, building on the Universal Character Set, defined jointly by the Unicode Standard and ISO/IEC 10646. Topics addressed include use of the terms 'character', 'encoding' and 'string', a reference processing model, choice and identification of character encodings, character escaping, and string indexing.
Editors: Martin J. Dürst, François Yergeau, Richard Ishida, Misha Wolf, Tex Texin [search key: charmod]
Developed to help achieve worldwide usability for Web services, these requirements address the way internationalization options are exposed in Web services definitions, descriptions, messages, and discovery mechanisms.
Editor: Addison Phillips [search key: ws-i18n-req]
This page lists updates to resources and publications on the W3C Internationalization site, as well as news items. Items are in chronological order, with the newest at the top. There are also a number of additional lists generated from this one according to categories assigned to news items. They currently include:
Each of the logs provided comes with a link to an RSS 2.0 feed, so that you can be notified of new items. For example, non-native English speakers or translators may wish to subscribe to the translations RSS feed, to know when new translations are produced.
If you would like to see additional categories, please contact Richard Ishida at ishida @ w3.org.
How do you define localization, internationalization and globalization? How are these concepts related? [search key: qa-i18n]
The GEO Task Force of the Internationalization WG has published this new Working Draft to solicit comments prior to publication as a WG Note. Please read and send any comments to firstname.lastname@example.org.
The document provides HTML authors with techniques for developing internationalized HTML using XHTML 1.0, HTML 4.01, or XHTML 1.1, supported by CSS1, CSS2 and some aspects of CSS3. This document focuses specifically on advice about specifying the language of content. [search key: html-tech-lang]
This test seeks to establish whether and how a user agent supports the use of the link element to allow the user to navigate to versions of the current document written in alternative languages.
See also the preliminary results and conclusions. [search key: sec-link]
The Internationalization Working Group has made available draft charters for the next charter period, beginning later this year. Note that these are still in development. There are two charters: one for the Internationalization Core WG, which will continue the work of the current Core and Web Services Task Forces; and one for a new Internationalization GEO (Guidelines, Education & Outreach) WG, which will continue and expand the work of the current GEO Task Force. Comments on these proposed charters should be sent to email@example.com.
Why should I use the language attribute in web pages? [search key: qa-lang-why]
Should I declare the language of my XHTML document using a language attribute, the Content-Language HTTP header, or a meta element? [search key: qa-http-and-lang]
How do I use .htaccess directives on an Apache server to serve files with a specific encoding? [search key: qa-htaccess-charset]
These two tests examine the recognition of escapes in XHTML. [search key: sec-escapes]
This test seeks to establish whether a user agent supports use of the hreflang attribute plus CSS to display information about the language of a link target. It does not test whether the hreflang value is used when viewing the target document.
See also the preliminary results and conclusions. [search key: sec-hreflang-style]
The Web Services Task Force of the Internationalization Working Group has released an updated Working Draft of Web Services Internationalization Usage Scenarios with additional guidance for implementers of Web service technologies. The document examines how language, culture and related issues interact with Web services architecture and technology. Comments are welcome on this draft. [search key: ws-i18n-scenarios]
The GEO Task Force of the Internationalization Working Group has published three First Working Drafts under the general title of Authoring Techniques for HTML/XHTML Internationalization. They are Characters and Encodings 1.0, Specifying the language of content 1.0 and Handling Bidirectional Text 1.0. These new documents have been separated out from what was previously a single document and updated. They provide HTML authors with techniques for developing internationalized HTML using XHTML 1.0, XHTML 1.1, or HTML 4.01, supported by CSS1, CSS2 and some aspects of CSS3.
The Internationalization Working Group has published a new Internet Draft of Internationalized Resource Identifiers (IRIs). A two-week last call on the firstname.lastname@example.org mailing list was announced, which ends May 23, 2004. IRIs are similar to URIs, but with the restriction to US-ASCII removed, and with a mapping to URIs.
Richard Ishida, team contact for the W3C Internationalization Working Group, presented as part of a panel discussion to a meeting of the UK Usability Professionals Association, 22 April, in London. Title of his talk: Don’t Blame the Localizers! [PDF 715Kb]. The talk presented some examples of how designers and developers shoulder much responsibility in enabling internationalization of products.
Do I need to worry because display capabilities (screen sizes, number of colors, etc.) of computers vary in other countries? [search key: qa-display-capabilities]
How do I use the MultiViews approach on an Apache Web server to automatically serve resources in the language requested by an HTTP request? [search key: qa-apache-lang-neg]
The Internationalization Activity welcomes the participation of individuals and organizations around the world to help improve the appropriateness of the Web for multiple cultures, scripts and languages.
How to participate:
Join a Working Group:
subscribe as the subject.
More information about the Internationalization Activity.