W3C Internationalization (I18n) Activity: Making the World Wide Web truly world wide!


If you own a blog with a focus on internationalization, and want to be added or removed from this aggregator, please get in touch with Richard Ishida at ishida@w3.org.

All times are UTC.

Powered by: Planet

Planet Web I18n

The Planet Web I18n aggregates posts from various blogs that talk about Web internationalization (i18n). While it is hosted by the W3C Internationalization Activity, the content of the individual entries represents only the opinion of their respective authors and does not reflect the position of the Internationalization Activity.

August 11, 2015

Global By Design

Is .com worth changing your name?

Venture capitalist Paul Graham believes so. He notes: 100% of the top 20 YC companies by valuation have the .com of their name. 94% of the top 50 do. But only 66% of companies in the current batch have the .com of their name. Which suggests there are lessons ahead for most of the rest, one  … Read more

by John Yunker at 11 August 2015 02:55 PM

August 10, 2015

Global By Design

Global gateway icon: Globes gain traction

Regular readers of this blog know well that I advocate for a global gateway icon on websites and apps — a visual element that lets users know, regardless of their language, where to find the global gateway menu. I recommend using a globe icon because it displays well in small sizes, can be made geographically neutral  … Read more

by John Yunker at 10 August 2015 10:02 PM

August 09, 2015

ishida>>blog » i18n

UniView 8.0.0a: CJK features added

Picture of the page in action.
>> Use UniView

This update allows you to link to information about Han characters and Hangul syllables, and fixes some bugs related to the display of Han character blocks.

Information about Han characters displayed in the lower right area will have a link View data in Unihan database. As expected, this opens a new window at the page of the Unihan database corresponding to this character.

Han and Hangul characters also have a link View in PDF code charts (pageXX). On Firefox and Chrome, this will open the PDF file for that block at the page that lists this character. (For Safari and Edge you will need to scroll to the page indicated.) The PDF is useful if there is no picture or font glyph for that character, but also allows you to see the variant forms of the character.

For some Han blocks, the number of characters per page in the PDF file varies slightly. In this case you will see the text approx; you may have to look at a page adjacent to the one you are taken to for these characters.

Note that some of the PDF files are quite large. If the file size exceeds 3 MB, a warning is included.
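As a rough illustration of why the page link can only be approximate for some blocks, the sketch below estimates a chart page from a code point and appends it as a `#page=` fragment, which is the PDF open parameter that browsers such as Firefox and Chrome honor. This is not UniView's actual code: the function name, the fixed `chars_per_page` value, the `first_chart_page` offset, and the example URL are all assumptions made for the example.

```python
def chart_page_url(code_point, block_start, chars_per_page, first_chart_page, pdf_url):
    """Estimate the PDF page that lists a character and build a link to it.

    All parameters are hypothetical inputs for this sketch:
    - block_start: first code point of the Unicode block
    - chars_per_page: assumed (constant) number of characters per chart page
    - first_chart_page: PDF page number on which the chart for the block begins
    """
    if code_point < block_start:
        raise ValueError("code point lies before the start of the block")
    offset = code_point - block_start
    # Integer division gives the number of whole chart pages to skip.
    page = first_chart_page + offset // chars_per_page
    return f"{pdf_url}#page={page}"


if __name__ == "__main__":
    # Placeholder URL and made-up layout values, for illustration only.
    print(chart_page_url(0x4E00 + 250, 0x4E00, 100, 3, "https://example.org/chart.pdf"))
```

If the real number of characters per page varies within a block, the integer division drifts by a page or so, which is the situation the "approx" text is warning about.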

by r12a at 09 August 2015 08:56 AM

July 30, 2015

W3C I18n Activity highlights

W3C Workshop Report on the MultilingualWeb workshop in Riga

A report summarizing the MultilingualWeb workshop in Riga is now available from the MultilingualWeb site. It contains a summary of each session with links to presentation slides and minutes taken during the workshop in Riga. The workshop was a huge success. With the parallel Connecting Europe Facility (CEF) event, it had more than 200 registered participants. See a summary of highlights, and a dedicated report about outreach activities of the supporting EU-funded LIDER project. The Workshop was locally organized by Tilde, sponsored by the LIDER project and by Verisign. Learn more about the Internationalization Activity.

by Richard Ishida at 30 July 2015 01:46 PM

July 23, 2015

W3C I18n Activity highlights

First Public Working Draft: Requirements for Chinese Text Layout 中文排版需求

The Internationalization Working Group has published a First Public Working Draft of Requirements for Chinese Text Layout (中文排版需求), on behalf of the Chinese Layout Task Force, part of the Internationalization Interest Group.

The document describes requirements for Chinese script layout and text support on the Web and in digital publications. These requirements inform developers of Web technologies such as CSS, HTML, and SVG, and inform browser and tool implementers, about how to support the needs of users in Chinese-speaking communities.

This is still a very early draft and the group is looking for comments and contributions to support the ongoing development of the document.

by Richard Ishida at 23 July 2015 03:53 PM

Updated Working Draft: Requirements for Hangul Text Layout and Typography

Changes in this publication of Requirements for Hangul Text Layout and Typography (한국어 텍스트 레이아웃 및 타이포그래피를 위한 요구사항) are editorial in nature, but significant. The separate English and Korean versions of the document were merged into one page. (You can use buttons at the top right of the page to view the document in one language or the other, if you prefer.)

Merging the languages significantly helps development and maintenance of the document, helps guide users to the language version they prefer, and offers additional opportunities for bilingual readers.

In addition, the links to issues in the document were changed to point to the GitHub issues list, rather than the former Tracker list.

There were no substantive changes to the English (authoritative) version, but the Korean version was brought into line with earlier changes to the English text.

by Richard Ishida at 23 July 2015 03:46 PM

July 22, 2015

W3C I18n Activity highlights

Updated Working Draft: Indic Layout Requirements

Indic Layout Requirements describes the basic requirements for Indic script layout and text support on the Web and in Digital Publications. These requirements provide information for Web technologies such as CSS, HTML, and SVG about how to support users of Indic scripts. The current document focuses on Devanagari, but there are plans to widen the scope to encompass additional Indian scripts as time goes on.

Changes in the new version relate to initial letter styling in Devanagari text. Editorial changes were also made to bring the document in line with recent changes to the Internationalization Activity publishing process.

by Richard Ishida at 22 July 2015 10:17 AM

Updated Working Draft: Character Model for the World Wide Web: String Matching and Searching

Character Model for the World Wide Web: String Matching and Searching builds upon Character Model for the World Wide Web 1.0: Fundamentals to provide authors of specifications, software developers, and content developers a common reference on string identity matching on the World Wide Web and thereby increase interoperability.

This new version introduces numerous editorial changes, replaces some temporary terminology with better terms, and integrates the case folding text from the string matching algorithm into the case folding section. The document template was also adapted to match the new Internationalization publication process. See details of changes.
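To make the idea of string identity matching with case folding concrete, here is a minimal sketch using only Python's standard library. It follows the general recipe for canonical caseless matching described in the Unicode Standard (normalize, case fold, normalize again); it is not taken from the Working Draft, and the function names are assumptions for the example.

```python
import unicodedata


def fold_for_matching(text):
    """Return a form of text suitable for canonical caseless comparison."""
    decomposed = unicodedata.normalize("NFD", text)
    # casefold() applies Unicode full case folding (for example, "ß" becomes "ss").
    folded = decomposed.casefold()
    # Normalize again because case folding can produce unnormalized sequences.
    return unicodedata.normalize("NFD", folded)


def caseless_match(a, b):
    """Compare two strings, ignoring case and canonical-equivalence differences."""
    return fold_for_matching(a) == fold_for_matching(b)


if __name__ == "__main__":
    print(caseless_match("Straße", "STRASSE"))
    print(caseless_match("\u00e9", "e\u0301"))
```

Whether folding and normalization are appropriate at all depends on the specification or format doing the matching, which is the kind of decision the document is meant to inform.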

by Richard Ishida at 22 July 2015 10:12 AM

Additional Requirements for Bidi in HTML & CSS published as Working Group Note

Additional Requirements for Bidi in HTML & CSS was used to work through and communicate recommendations made to the HTML and CSS Working Groups for some of the most repetitive pain points prior to HTML5 and CSS3 for people working with bidirectional text in scripts such as Arabic, Hebrew, Thaana, etc.

It is being published now as a Working Group Note for the historical record in order to capture some of the thinking that lay behind the evolution of the specifications and to help people in the future working on bidi issues to understand the history of the decisions taken. Notes have been added to give a brief summary of what was actually implemented in the HTML or CSS specifications.

by Richard Ishida at 22 July 2015 08:52 AM

July 17, 2015

Global By Design

In search of a better translation icon

A few years ago I wrote about the translation icon and its many variations at that point in time. I thought now would be a good time to revisit this icon. Let’s start with Google Translate. This icon has not changed in substance over the years but it has been streamlined a great deal. Here is  … Read more

by John Yunker at 17 July 2015 01:54 AM

July 14, 2015

Global By Design

Global gateway fail: Qualcomm

Qualcomm supports a mobile-friendly website but doesn’t (yet) support a global-friendly website, particularly if you’re a visitor who does not speak English. The home page, shown here, includes no evidence of a global gateway. It does exist, but you have to click the ellipsis button at the bottom of the left column. Then you are met with this  … Read more

by John Yunker at 14 July 2015 01:13 AM

July 07, 2015

Global By Design

On the importance of date display localization

The proper display of dates for each locale has become relatively trivial with libraries such as Globalize, and yet I still encounter websites that don’t get it right. Case in point, I recently visited a tech website looking for a firmware upgrade and I found a list of three downloads: I had to scan the list to  … Read more

by John Yunker at 07 July 2015 07:10 PM

June 18, 2015

W3C I18n Activity highlights

Announcing The Unicode® Standard, Version 8.0

Version 8.0 of the Unicode Standard is now available. It includes 41 new emoji characters (including five modifiers for diversity), 5,771 new ideographs for Chinese, Japanese, and Korean, the new Georgian lari currency symbol, and 86 lowercase Cherokee syllables. It also adds letters to existing scripts to support Arwi (the Tamil language written in the Arabic script), the Ik language in Uganda, Kulango in the Côte d’Ivoire, and other languages of Africa. In total, this version adds 7,716 new characters and six new scripts. For full details on Version 8.0, see Unicode 8.0.

The first version of Unicode Technical Report #51, Unicode Emoji is being released at the same time. That document describes the new emoji characters. It provides design guidelines and data for improving emoji interoperability across platforms, gives background information about emoji symbols, and describes how they are selected for inclusion in the Unicode Standard. The data is used to support emoji characters in implementations, specifying which symbols are commonly displayed as emoji, how the new skin-tone modifiers work, and how composite emoji can be formed with joiners. The Unicode website now supplies charts of emoji characters, showing vendor variations and providing other useful information.
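The two mechanisms mentioned above, skin-tone modifiers and joiners, both work by sequencing code points. The sketch below shows the shape of such sequences in Python. The helper names are assumptions for the example, and whether a given sequence renders as a single glyph depends on the platform's fonts.

```python
# Emoji modifiers for skin tone occupy U+1F3FB through U+1F3FF.
FIRST_MODIFIER = 0x1F3FB
LAST_MODIFIER = 0x1F3FF
ZERO_WIDTH_JOINER = "\u200D"


def with_skin_tone(base, modifier):
    """Append a skin-tone modifier directly after a base emoji character."""
    if not FIRST_MODIFIER <= ord(modifier) <= LAST_MODIFIER:
        raise ValueError("not an emoji skin-tone modifier")
    return base + modifier


def join_emoji(*parts):
    """Form a composite emoji by placing U+200D between the parts."""
    return ZERO_WIDTH_JOINER.join(parts)


if __name__ == "__main__":
    waving_hand = "\U0001F44B"
    medium_tone = "\U0001F3FD"  # EMOJI MODIFIER FITZPATRICK TYPE-4
    print(with_skin_tone(waving_hand, medium_tone))
    # MAN + WOMAN + GIRL joined into one family sequence.
    print(join_emoji("\U0001F468", "\U0001F469", "\U0001F467"))
```

Because the result is still several code points, code that counts or truncates "characters" needs to treat the whole sequence as one unit.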

Some of the changes in Version 8.0 and associated Unicode technical standards may require modifications in implementations. For more information, see Unicode 8.0 Migration and the migration sections of UTS #10, UTS #39, and UTS #46.

by Richard Ishida at 18 June 2015 09:51 AM

June 17, 2015

ishida>>blog » i18n

UniView 8.0.0 available

Picture of the page in action.

>> Use UniView

Unicode 8.0.0 is released today. This new version of UniView adds the new characters encoded in Unicode 8.0.0 (including 6 new scripts). The scripts listed in the block selection menu were also reordered to match changes to the Unicode charts page.

The URL for UniView is now http://r12a.github.io/uniview/. Please change your bookmarks.

The GitHub site now holds images for all 28,000+ Unicode codepoints other than Han ideographs and Hangul syllables (in two sizes).

I also fixed the Show Age filter, and brought it up to date.

by r12a at 17 June 2015 06:28 PM

June 15, 2015

Global By Design

When will more global websites support Arabic?

I read a brief report on digital Arabic content produced by the Wamda Research Lab, in partnership with Google and Taghreedat. A few data points jumped out at me, such as: By 2017 over half of the Arab world will have access to the Internet, an increase from the 32% that were online in 2012. Estimates suggest  … Read more

by John Yunker at 15 June 2015 07:15 PM

June 10, 2015

Global By Design

You Say .Sucks, I Say .Global: The flood of new domain names isn’t pretty but will create a truly global Internet

I sympathize with Internet old timers (such as myself) who look back wistfully at the good ol’ days, when the only decision you had to make when registering a new domain name was choosing between .com or .net. Today, there are more than 500 of these top-level domains from which to choose (with 400 more  … Read more

by John Yunker at 10 June 2015 05:13 PM

Wikimedia Foundation


The Content Translation Tool makes it easier to create new Wikipedia articles from other languages. It is now available as a beta-feature in 148 Wikipedias. The tool now features an updated selector with image thumbnails for search results. Screenshot by Runa Bhattacharjee, freely licensed under CC0 1.0

Since our last blog post, much has happened in the world of Content Translation, a tool that makes it easier to translate Wikipedia articles into different languages. The Wikimedia Language Engineering team deployed it as a beta-feature in January 2015 in the Catalan, Spanish and Portuguese Wikipedias; today, nearly 150 Wikipedias have access to the tool, and more than 5,000 articles have been created by more than 1,500 editors.

While on the one hand there are large Wikipedias, like English or German, where thousands of volunteers have written articles about millions of topics, there are over 100 smaller Wikipedias where a handful of volunteers are struggling to add more content. Translating from an existing article in another language is a common method adopted in such Wikipedias to create more content. Content Translation attempts to solve this rather daunting problem by simplifying the process, allowing editors to quickly create the first draft of the article and focus on improving the content. It includes an editing interface and translation tools that make it easy to adapt wiki-specific syntax, links, references, and categories. Machine translation support via Apertium is also available for a limited set of languages, which can considerably speed up the process; if it is currently missing for yours, we invite you to take our ongoing survey to test and provide feedback.

Even without machine translation, the tool has been used to translate from any of the available languages (i.e. from all Wikipedias) with features that allow automatic adaptation of links, images, references and categories. For instance, nearly 500 new articles have been created in the French Wikipedia using Content Translation and without machine translation.

New languages and feature improvements

At present, Content Translation is available on 148 Wikipedias as a beta-feature, including several large Wikipedias like French, Dutch and Polish. Because it is a beta-feature, only logged-in users can use it, after enabling it in their preferences.

The tool presents a simple workflow: select the source language and article to translate, select the target language, translate the contents of the article, and publish it as a new page in the corresponding Wikipedia. For category adaptation, the corresponding category needs to exist in the target Wikipedia. Translators can also save their translations and work on them later.

Articles published using Content Translation in April and May 2015. Image by Pau Giner, freely licensed under CC BY-SA 4.0.

New translators who published articles using Content Translation in April and May 2015. Image by Pau Giner, freely licensed under CC BY-SA 4.0.

In April and May, we focused on features that make it easy for users to start translating with Content Translation by quickly gaining access to it. We introduced a campaign that prompted users who were creating a new article to try Content Translation instead. Users could enable the feature directly from the campaign message screen and begin translating the page from another language. A call-out message was also added to the Contributions menu, providing quick access to different kinds of contributions (including translations). As an outcome of these measures, we now see a sharp increase in the number of new articles being created every week by an increasing number of new users (see images). We expect to get better insight into the usage numbers in the coming month.

Feature improvement highlights:

  • To prepare for deployment on wikis with Right-to-Left content, several bugs have been fixed.
  • Users will also see an improvement in the selector dialog, where results from articles searched are now displayed with a thumbnail and small description (see image).
  • The ULS input method has been integrated in the Content Translation editing interface.
  • New articles created using Content Translation are now automatically linked through Wikidata.

Deployment update and what’s coming next

In the coming month, we aim to continue adding Content Translation as a beta-feature to more Wikipedias so that more users can test the tool. This not only exposes special cases that we need to be aware of (like local gadgets or Wikipedia-specific scripts) but also provides us with feature suggestions.

Upcoming features:

  • Improved link handling with provision for complex use-cases.
  • Redesigned statistics page with additional data.
  • Preliminary features for an integrated notification system using Echo, to better connect with our users

The Language team will be hosting two Content Translation workshops at Wikimania this year. You can sign up on the Wikimania website (here and here); it is open for all participants. You can read more about Content Translation on the project page and also in the new User Guide (translations are very welcome!).

Read more about Content Translation developments and other updates from the Language Engineering team in our monthly report. We would also like to invite everyone for our online office hour session on June 10 at 14:30 UTC.

Runa Bhattacharjee, Language Engineering, Editing, Wikimedia Foundation

by Runa Bhattacharjee at 10 June 2015 04:17 AM

June 08, 2015

Global By Design

Global gateway fail: DeWalt

DeWalt greets visitors with a pull-down global gateway shown here: This type of landing page is not ideal in this day and age. Using geolocation (see Geolocation for Global Success), DeWalt could take the user directly to the localized website and display an overlay asking the user to confirm or change locale setting. But this is  … Read more

by John Yunker at 08 June 2015 08:55 PM

June 02, 2015

Global By Design

Country codes are maturing — but not retiring

Country codes are still adding registrations but the more developed markets are seeing a decline in growth rate. Shown below is a visual from the registry behind the .FR domain: Of course, AFNIC is keen to point out that the growth rate for country codes is still much higher than “legacy” domains like .com and  … Read more

by John Yunker at 02 June 2015 05:45 PM

May 30, 2015

Wikimedia Foundation


The Translatewiki.net project enables communities to localize open source software. It was recently used for a “Translation Rally” that engaged volunteers around the world to translate over 44,000 messages in nine days. Photo by Christian Mehlführer, CC BY-SA 3.0.

How can we engage volunteers to contribute in important yet monotonous tasks? Over the past year, Wikimedia Sverige (Sweden) has been experimenting with ways to strengthen its community on translatewiki.net — a little-known project that nevertheless benefits hundreds of millions of people each month.

Translatewiki is a platform for translating the texts that appear in open source software, including the MediaWiki software used on Wikipedia. These translations make it possible for you to get all the buttons and system messages on Wikipedia in your preferred language; it is preparatory work to make it as easy as possible for writers and readers to use Wikipedia and other open source software.

Translating technical messages is therefore a very important task, but it is often rather isolated and independent work. Wikimedia Sverige aimed to change that by making it fun to work jointly on an effort that differs from the regular activity. It therefore invited Translatewiki’s volunteer translators to a “Translation Rally” for nine days in mid-May, with a sum of 500 euros to be divided between all participants reaching more than 500 translations of some of the most important messages. This concept was originally developed by Wikimedia Nederland (Netherlands).

We initially aimed to complete the messages in MediaWiki’s core software, the central messages used on the Wikimedia projects. When finished, the participants could continue with 11 other selected projects. There are almost 65,000 messages to translate into each language, of which MediaWiki constitutes approximately 24,500. Some are only one word long (e.g. “Save”), while others may be several sentences long. As the translations are completed by volunteers, some languages are almost completely translated, but others are almost entirely untranslated. Many even lack translations of the core messages. Participants were given the opportunity to either keep the money for themselves or donate it to Translatewiki’s continued operation. The majority of the translations were made into non-European languages, but European languages also benefited; for example, hundreds of messages were translated into Swedish.

Number of edits during the Translation Rally in May 2015. Graph by Translatewiki.net, freely licensed under CC BY-SA 3.0.

Prior to the Translation Rally, an email was sent to all registered users asking them to join; this was an important step, as it brought in older users whose activity had dropped off over the years. During the rally’s nine days, the website’s activity was around four times higher than normal. 201 users contributed at least one new translation, and a massive 44,844 messages were added.

Sites using MediaWiki software are now easier to use in the 116 languages that were improved; it is clear that much higher activity was achieved thanks to the Rally. However, most of the volunteers did not reach the minimum of 500 translations and couldn’t claim a slice of the 500 euros; 23 of them had valid claims and will split the prize. The winner with the most qualifying translations is yet to be named.

A remaining question is whether this type of activity has a positive or negative effect on the community in the long term. The benefit for community engagement is that people are invited and engaged in something new and exciting; you can create a noticeable buzz. However, there are potential risks when adding money or prizes into the mix. Will that reduce interest in participating in the long run, when there are no more prizes? Can conflicts increase because of this? Will participants be sloppier with their translations (this seems to have happened this time)? What can we then do to mitigate these risks? These types of predictions are notoriously hard to make without proper research, as different methods might have different problems and gains. We would greatly welcome more studies in this area.

The only thing we can say with some certainty right now is that for a limited cost, there has been a massive short-term positive effect, especially for languages spoken in poorer countries. From the graphs we developed, you can see that activity has regressed to its previous norm now that the rally has ended.

The Translation Rally was organized and sponsored by Wikimedia Sverige, with generous support from Internetfonden, and the rally itself was run by Siebrand Mazeland at the Wikimedia Foundation.

John Andersson, Wikimedia Sverige

by John Andersson at 30 May 2015 02:11 AM

May 28, 2015

Internet Globalization News

Why Country Sites of International Companies are so Bad

The worst sites are usually not the truly local sites designed by local businesses or government agencies. Instead, the offenders often come from multinational corporations (small and large) that create country sites with horrible usability - and usually without a true understanding of the local market and users. via www.nngroup.com How can multinational companies solve this problem and get better country sites? By reversing the causes of the bad design: Don't let your local office throw away money to advertising agencies that don't understand Internet marketing. Instead, consider local sites as part of a global Internet strategy. Specifically: Document the design rationale for your website and your product line strategy, and ensure that local teams understand why the web team at headquarters does things in particular ways. Train local staff in web usability, Internet marketing, and other topics that will empower them to say no to inane design ideas from...

by blogalize.me at 28 May 2015 09:19 PM

May 27, 2015

Global By Design

No Ordinary Disruption: It’s time to reset intuitions

I was given a review copy of No Ordinary Disruption: The Four Global Forces Breaking All the Trends, which I read over the weekend. The authors are Richard Dobbs, James Manyika, and Jonathan Woetzel — all directors at the McKinsey Global Institute. Readers of this blog are not going to be surprised by some of the disruptions  … Read more

by John Yunker at 27 May 2015 03:26 PM

May 26, 2015

Global By Design

Hotels.com, Hotels.ng and the value of country codes

I read today about the Nigerian startup Hotels.ng and my first thought was why Hotels.com didn’t already own the Nigerian country code. After all, Hotels.com owns country codes for France and Italy and Japan, among others. But it was apparently late in registering country codes for Germany and the Netherlands, as well as Nigeria (Africa’s most populous country). Now,  … Read more

by John Yunker at 26 May 2015 03:24 PM

May 19, 2015

Global By Design

Global gateway fail: App Annie

I want to focus on App Annie because it appears the company is planning to significantly expand its global reach and therefore needs a gateway suited to the task. Currently, App Annie supports five languages. But you might not know this because the gateway is buried in the footer, as shown here: To App Annie’s  … Read more

by John Yunker at 19 May 2015 04:12 PM

May 15, 2015

Global By Design

The humans behind machine translation

Google Translate is the world’s most popular machine translation tool. And, despite predictions by many experts in the translation industry, the quality of Google Translate has improved nicely over the past decade. Not so good that professional translators are in any danger of losing work, but good enough that many of these translators will use Google  … Read more

by John Yunker at 15 May 2015 08:37 PM

May 12, 2015

Global By Design

Who needs BRIC when you have the Blue Banana?

Perhaps it’s human nature (or perhaps just savvy marketing) to think up new and unique ways of organizing our world. BRIC (Brazil, Russia, India, China) is one popular grouping. And did you know about MINT (Mexico, Indonesia, Nigeria, Turkey)? Or MIST (Mexico, Indonesia, South Korea, Turkey)? And if you think those groupings sound odd, consider CIVETS (Colombia, Indonesia, Vietnam,  … Read more

by John Yunker at 12 May 2015 05:35 PM

May 07, 2015

Global By Design

Greece gets an Internationalized Domain Name (IDN)

Greece has received string approval for an IDN, shown below in red: This brings to 35 the number of countries with approved IDNs, and an impressive range of scripts, shown here: I’ve just updated the IDN map of the world. If you’d like to order a copy, or a custom variation, contact me.

by John Yunker at 07 May 2015 07:10 PM

May 03, 2015

Global By Design

Translators Without Borders and the Wikipedia 100-language project

Translators Without Borders is an amazing organization of volunteer translators using their skills to make the world a better place. One project worth noting is an ambitious effort to translate valuable Wikipedia articles into 100 languages: The 100 x 100 Wikipedia Project envisions the translation of the 100 most widely read Wikipedia articles on health issues  … Read more

by John Yunker at 03 May 2015 03:13 PM

April 27, 2015

Global By Design

Will FedEx plus TNT equal an improved global website?

As FedEx closes its acquisition of TNT Express, I see an opportunity for FedEx to improve its global website. In the recent Web Globalization Report Card, among delivery services companies, FedEx finished dead last. TNT supports 37 languages, compared with the relatively paltry 27 languages that FedEx supports. Hopefully FedEx will embrace a new language baseline  … Read more

by John Yunker at 27 April 2015 05:28 PM

April 24, 2015

W3C I18n Activity highlights

Updated Working Draft: Language Tags and Locale Identifiers for the World Wide Web

Language Tags and Locale Identifiers for the World Wide Web describes the best practices for identifying or selecting the language of content as well as the locale preferences used to process or display data values and other information on the Web. It describes how document formats, specifications, and implementations should handle language tags, as well as extensions to language tags that describe the cultural or linguistic preferences referred to in internationalization as a “locale”.

Changes in this update include the following: All references to RFC3066bis were updated to BCP 47 or to RFC 5646 or RFC 4647 as appropriate. References to HTML were changed to point to HTML5. The text formerly contained in Web Services Internationalization Usage Scenarios, defining internationalization, locale, and other important terms, was imported and rewritten. The other sections of this document were modified and reorganized. The Web services materials were moved to an appendix.
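For readers unfamiliar with how language tags are used to select content, the sketch below illustrates the "lookup" style of matching defined in RFC 4647: the requested language range is progressively truncated from the end until it equals one of the available tags. It is a simplified illustration, not text from the Working Draft; the function name and the sample tags are assumptions, and wildcard ranges ("*") and prioritized lists of ranges are not handled.

```python
def lookup(language_range, available_tags, default=None):
    """Pick the best available language tag for one requested language range."""
    # Language tags compare case-insensitively; remember the original spelling.
    available = {tag.lower(): tag for tag in available_tags}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in available:
            return available[candidate]
        # Truncate: drop the final subtag.
        subtags.pop()
        # A single-character subtag (such as "x") cannot end a tag, so drop it too.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


if __name__ == "__main__":
    print(lookup("zh-Hant-TW", ["en", "zh", "zh-Hant"]))
    print(lookup("ja", ["en", "fr"], default="en"))
```

Lookup returns at most one tag per request, which suits choosing a single localized resource; RFC 4647 also defines filtering, which returns every tag that matches a range.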

by Richard Ishida at 24 April 2015 09:28 AM

Contact: Richard Ishida (ishida@w3.org).