W3C Architecture Domain

W3C Internationalization (I18n) Activity: Making the World Wide Web truly world wide!

If you own a blog with a focus on internationalization, and want to be added or removed from this aggregator, please get in touch with Richard Ishida at ishida@w3.org.

All times are UTC.

Planet I18n

Planet I18n aggregates posts from various blogs that talk about Web internationalization (i18n). While it is hosted by the W3C Internationalization Activity, the content of the individual entries represents only the opinion of their respective authors and does not reflect the position of the Internationalization Activity.

May 16, 2008

Global By Design

Google Translate is growing up

What began as just another “gisting” application — like Babel Fish — is gradually becoming an impressive translation tool. And I’m not referring to the quality of translation, though that is improving as well.

I’m referring to the breadth of languages and breadth of features that Google Translate supports.

Today, Google announced that Google Translate added support for ten more languages, bringing the total to 23. The ten new languages are Bulgarian, Croatian, Czech, Danish, Finnish, Hindi, Norwegian, Polish, Romanian and Swedish.

And that’s not all!

Google Translate also now provides a detect language tool that will tell what language a batch of text is in. This type of tool can come in awfully handy for people like me who navigate across so many languages on a daily basis. It’s an easy feature for Google to support because the translation engine needs to know what the source language is before translating it. But I also tested language detect on a few languages not yet supported for translation, such as Slovakian, and the engine correctly identified them.

A week ago, I integrated Google Translate into the home page of Byte Level:

Google Translate on Byte Level Research

When it comes to translation, I’m not a good example of “putting my money where my mouth is.” Byte Level Research, with the exception of the Tower of Babel site, has been available only in English for years.

While I have no illusions that this widget will make up for a lack of professionally translated text, I am curious to see if people use it and to what extent. What I need to know is if Google Analytics can track Google Translate widget usage so I can know which languages are most popular. If anyone knows how to set this up, please contact me.

And, if nothing else, it’s an interesting experiment — and it buys me time before having to shell out real money for professional translation, which I will ultimately need to do.

by John Yunker at 16 May 2008 02:58 AM

May 15, 2008

Hacklog: Blogamundo

“Machine Translation” is a Misnomer

An ickle rant ensues.

The public doesn’t understand how machine translation works. And generally speaking, the public doesn’t understand that machine translation couldn’t exist without human translators in the loop.

In other words, it’s not really “machines” that are “doing” the translating, it’s people. The machines are simply programmed to imitate the translations the people do.

I think this is symptomatic of a widespread disease in the computer world: an obsession with cutting people out of the loop.

It’s the same tunnel vision that motivated the much (and rightly) criticized footer that graced Google News at its launch:

This page was generated entirely by computer algorithms without human editors.

No humans were harmed or even used in the creation of this page.

O’Reilly Network — Google Needs People

Admittedly, this quote has long since been removed, and it was probably only meant as a joke in the first place anyway. But this problematic attitude still underlies a lot of reactions to Machine Translation. In general, people don’t realize that it wouldn’t be possible without human translators.

Yes, the title is pretty much a joke. I do visit reality once in a while.

by Patrick Hall at 15 May 2008 11:48 PM

Global By Design

Poland joins the million domain club

Poland (.pl) announced last week that it joined the million domain club by registering its one-millionth domain under its ccTLD.

The reason for the sudden surge in registrations is Poland’s easing of the registration process by adding partner registries. There are now 95 registries for .pl, 37 of them based outside of Poland.

Here’s a chart of all countries with more than one million registrations. I also included the EU in there. I did not include Tuvalu (.tv).

Million domain ccTLD list

I expect China to surpass Germany in the next two weeks.

For the ultimate country code reference, see the Country Codes of the World poster.

by John Yunker at 15 May 2008 04:24 PM

May 13, 2008

Hacklog: Blogamundo

Aligning translations with text compression?

Dear interwebs series of tubes people:

Random thought:

If you have two translations, and you perform some sort of compression on the both of them, could interesting relationships between the content of the two translations be uncovered? For instance, it seems like you might be able to get rid of non-content words, which might make it conceivable to align the texts at a phrase level.

I’ve dug a bit, but only found a paper by Conley and Klein, “Using Alignment for Text Compression.” But a quick glance at that (haven’t read it yet) suggests that they’re interested in improving compression for compression’s sake, which isn’t what I have in mind.
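One common way to ask whether a compressor "sees" shared material between two texts is the normalized compression distance: compress each text alone, compress them concatenated, and compare sizes. The sketch below is only that measurement, using Ruby's standard Zlib library. It is not phrase alignment, and it is not the Conley and Klein method. The helper names `compressed_size` and `ncd` are made up for illustration.

```ruby
require "zlib"

# Sketch only: size in bytes of the deflate-compressed text.
def compressed_size(text)
  Zlib::Deflate.deflate(text, Zlib::BEST_COMPRESSION).bytesize
end

# Normalized compression distance: near 0 when the two texts share most of
# their material, near 1 when compressing them together saves almost nothing.
def ncd(a, b)
  ca = compressed_size(a)
  cb = compressed_size(b)
  cab = compressed_size(a + b)
  (cab - [ca, cb].min).to_f / [ca, cb].max
end

first  = "the eastern zone of the city was shaken by a small earthquake"
second = "a small earthquake shook the eastern zone of the city"
puts "distance: #{ncd(first, second)}"
```

Two translations in different languages share few literal byte sequences, so a byte-level compressor like this one would probably need some preprocessing (for example, a shared vocabulary) before the score says much about content.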

Thanks for your thoughts and observations, interwebs.

by Patrick Hall at 13 May 2008 08:31 PM

May 11, 2008

Global By Design

The art & science of global navigation: June 3rd

My second Lionbridge Webinar is scheduled for June 3rd at 1 pm EST and you can register for it here.

The topic is global navigation — why it’s so important and how to improve upon it. I wrote an ebook about this topic two years ago. Since then, geolocation and language negotiation have become more commonly used and an increasing number of companies have launched splash global gateways — like Intel, which launched its first splash gateway just last week.

If I have time, I also plan to talk about IDNs — internationalized domain names — and why companies will need to register them (and may in fact be required to register them).

See you on June 3rd!

by John Yunker at 11 May 2008 08:14 PM

May 08, 2008

Global By Design

Web globalization webinar follow-up

The Lionbridge webinar yesterday has been archived for those of you who couldn’t make it. You can register to listen to the call at the Lionbridge site.

And mark your calendars for June 3rd, when I will host a second webinar, also sponsored by Lionbridge, to discuss the many aspects of global navigation — from splash global gateways, to country codes, to geolocation. I’ll include lots of real-world examples.

by John Yunker at 08 May 2008 06:49 PM

Even the Vatican uses a splash global gateway

I took a moment to visit the Vatican recently at www.vatican.va.

And here is what I saw — a splash global gateway:

Vatican splash global gateway

The splash gateway offers six languages: English, German, Spanish, Italian, Portuguese, and French.

And I enjoy the fact that the Vatican City has its own country code top-level domain: .va. In case you’re wondering, our Country Codes of the World poster includes the .va domain.

by John Yunker at 08 May 2008 06:44 PM

May 05, 2008

Hacklog: Blogamundo

Unicode headed toward World Domination™

The Google Blog has a chart showing that there is a very clear trend toward Unicode adoption.

Apparently their numbers refer to UTF-8 alone (as opposed to UTF-16/UCS-2 or (haha) UTF-32/UCS-4), which again is good news. (Though one wonders if there is any uptake of UTF-16 on the web… I hope not.)
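For a rough sense of why the encoding form matters for mostly-ASCII web pages, here is a small sketch (assuming Ruby 1.9 or later; the sample string is arbitrary) that encodes one string three ways and compares byte counts. The big-endian variants are used so that no byte order mark is added.

```ruby
# -*- coding: utf-8 -*-
# Sketch only: byte sizes of the same text in three Unicode encoding forms.
text = "Hello, 世界"

sizes = {}
["UTF-8", "UTF-16BE", "UTF-32BE"].each do |name|
  sizes[name] = text.encode(name).bytesize
end

sizes.each { |name, bytes| puts "#{name}: #{bytes} bytes" }
```

The more ASCII markup a page contains, the more UTF-8 saves over the other two forms.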

The data is “Google internal”… peer-reviewed, it ain’t.

Thanks to Won for the pointer!

by Patrick Hall at 05 May 2008 07:09 PM

May 02, 2008

Global By Design

Google vs. Baidu: A User Experience Analysis

There are tons of articles about Google vs. Baidu, but few of these articles take an in-depth look at how Google compares to Baidu from a Chinese user’s perspective.

In this article, I do just that, and I render a verdict as to which Web site is better.

Search

The best way to compare search engine quality is to compare searches.

I recently input three Chinese keywords for my experiment:

  • 许霆 (Xu Ting: A Chinese citizen who was recently involved in a controversial criminal case)
  • 次级房贷 (Subprime mortgage)
  • 看羹吃饭 (Kan-Geng-Chi-fan: A phrase used and recognized by a relatively small number of Chinese, meaning that you have to think carefully before taking action)

These keywords represent three different categories of information people search for online. Xu Ting is a hot keyword in China at the moment but it has received little international media coverage. Subprime mortgage, on the other hand, is a foreign concept and the term has been transliterated into Chinese characters from the English equivalent. Kan-Geng-Chi-fan is used within a specific dialect that is not used by the majority of Chinese citizens.

Okay, here are the results as of April 18, 2008:

“Xu Ting”

It would seem that Baidu knows much more about Xu Ting than Google, although I did not verify that every result referred to this particular individual.

Interestingly, in the first results page of both google.com and google.cn, one of the search results directed users to Baidu Post — Baidu’s popular user forum.

Overall, I would rate both sites equally because the top 20 results from each search engine were highly relevant and I could easily find the information I wanted there. Verdict: A tie.

“Subprime mortgage”

This time google.cn appears to do much better than Baidu. But if we look closely at the top 20 search results, we’ll find there are 7 results at google.com and 5 results at google.cn that direct us to Web sites that use traditional Chinese characters, which are used in Taiwan, Hong Kong and by the overseas Chinese community.

It can be rather challenging for the mainland Chinese to read traditional Chinese, though they can understand most of the message. Nonetheless, this mix of simplified and traditional characters is not the most user-friendly approach. Verdict: Baidu wins.

“Kan Geng Chi Fan”

At first glance, Google produced overwhelmingly more information than Baidu. However, if we examine the details, Google did not perform so well. Neither google.com nor google.cn produced an accurate search result within its first 10 pages, while all 207 search results from Baidu were accurate. Verdict: Baidu wins again.

Based on these three searches, Google comes across as a bit complicated and “foreign” to Chinese users. Baidu is the superior Chinese search engine.

Products

Both Google and Baidu are trying to leverage their network effects to promote other products. Google has many excellent products, but not every product has performed well in China. For example, Google Maps is widely used by American users. Unfortunately, Google Maps in China is unable to provide the same features due to unavailability of mapping data in China. Google’s satellite map currently only covers the major Chinese cities. Should Google acquire better maps, it would have a clear advantage over Baidu, which doesn’t offer the same degree of functionality and usability in its map tool.

Although music copyright is a controversial issue within China, the market reality is that millions of Chinese Internet users download free music online. Baidu understands this reality and its music search product — which presents a list of links for free music downloads when people search by song, singer, or label — is extremely popular. Google is unable to compete with Baidu in this regard due to its adherence to US copyright laws.

Another example is Baidu Post, an online forum allowing Internet users to create new topics based on search keywords and provide commentary. When people search online by keyword, they can also follow these keywords to Baidu Post, where they may find additional information, or at least find out what others think of the selected keywords.

Online forums are a very important medium in China for distributing information online. I think an important reason for this is that the Chinese, as well as many businesses, want to remain anonymous. While this may change in the years ahead as the next generation embraces social networking sites, for the time being, online forums are dominant. Baidu also offers a blog platform (Hi Baidu); although Google has localized Blogger into Chinese, very few Chinese people currently use it.

Local culture and consumer behavior are critical factors in determining whether a product will succeed in an overseas market or not. So far, Google products have not been as appealing as Baidu to Chinese users.

The Brand Name

The name Baidu (百度) comes from a beautiful ancient Chinese poem:

Thousands of times, I looked for my girl;

Suddenly, at some point, I stopped and looked back,

I found she was just over there among a bunch of lanterns.

This poem, written by Qiji Xin, who lived in the Song Dynasty nearly 1000 years ago, is still very popular in China and is also taught in high schools. Baidu in Chinese means thousands of times. In Chinese culture, this poem communicates one’s desire to achieve his/her dreams. Obviously, this meshes well with the services offered by Baidu, a company that claims it better understands Chinese users and Chinese culture.

Google started to use its Chinese name Guge (谷歌) in 2006. Guge (goo-ge) is transliterated from Google and it literally means “the song of grain” in Chinese. A survey conducted in 2006 showed that 84.6% of Chinese surveyed did not like this name. I think the most important reason is that Chinese people want to feel international and modern. This is also one reason you may see many Chinese companies using English words in their marketing materials, as it creates an international effect. The “song of grain” presents an image of the agricultural society that the Chinese people are striving to break away from.

Google has exerted a good deal of effort in localizing its name for China but it has not yet been accepted by the Chinese people. It may take some time. Some companies have chosen to simply use their English names in China, avoiding localization altogether, such as IBM.

To sum up, Baidu definitely has an edge over Google in China. But it is early yet and Google has been doing things such as redesigning its Chinese home page, which may resonate with users. The key takeaway here is that every new market is a new challenge; just because you are number one at home does not mean you will be number one in every country you enter. Should Baidu enter the US market some day, it will face many of the same challenges that Google is now facing in China.

by Jason Yu at 02 May 2008 01:23 AM

Hacklog: Blogamundo

Of the Media, Scientists, Word Lengths, and Colossal Squids

I’m a big fan of Pharyngula, but… he was kinna wrong about a nitpicky little detail. And this particular nitpicky little detail was about language, so, the truth must out!

In his post As big as dinner plates?, Dr. Myers compares two articles about the recent dissection of a Colossal squid. (Can we please pause to acknowledge that that thing is HUGE? Kthx.)

Read the USA Today article on the colossal squid eye, which boils down to basically, “Oooh, they’re big!”. Then compare it to the blog entry on the colossal squid eye, written by a scientist. The latter is much more informative, and contains more specific details, and isn’t afraid to challenge the reader with words longer than a single syllable.

Emphasis added, to the bit that’s linguistic. Now obviously, that’s an exaggeration; the AP article (not USA Today, in fact) doesn’t really consist of one-syllable words (though one can say a lot with one-syllable words…)

But the idea is clear enough: USA Today uses shorter words than scientists, because journalists dumb down science.

Right?

Heck, I dunno. So, I answered the question the way I usually do: I wrote a program.

Surprise!

Average Word Lengths
4.45 Blog
4.64 AP
        
Longest words
Blog:
photoreceptors
tremendously
considerably
architeuthis
neighbouring
cephalopods
cephalopods
disappeared
        
AP:
mesonychoteuthis
communications
international
invertebrates
redistributed
formaldehyde
centimeters
spectacular

Average word length is about the same. Which says nothing whatsoever about the quality of information in the articles; it does however say that they use words that are about the same size.

One can argue about the character of the long words in each text (you could use frequency counts of them, too), but still, the AP article uses the word “Mesonychoteuthis.” That’s a long way from a one-syllable word.

Here’s the code.
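The linked program is not reproduced in this feed. A minimal sketch of this kind of count, not the author's code, might look like the following. It assumes a "word" is a run of ASCII letters and that each article is already available as a plain-text string. `word_stats` is a hypothetical helper name.

```ruby
# Sketch only: average word length and the longest words in a text.
# Assumption: a "word" is a run of ASCII letters; everything else separates words.
def word_stats(text, top = 8)
  words = text.downcase.scan(/[a-z]+/)
  return { :average => 0.0, :longest => [] } if words.empty?

  total_letters = words.inject(0) { |sum, w| sum + w.length }
  average = (total_letters.to_f / words.length).round(2)

  # Sort longest first; ties are broken alphabetically. Repeats are kept,
  # so a frequent long word can appear more than once.
  longest = words.sort_by { |w| [-w.length, w] }.first(top)

  { :average => average, :longest => longest }
end

sample = "The colossal squid Mesonychoteuthis has eyes as big as dinner plates"
stats = word_stats(sample, 3)
puts "Average word length: #{stats[:average]}"
puts "Longest words: #{stats[:longest].join(', ')}"
```

Because repeats are kept, a word can show up twice in the "longest" list, as "cephalopods" does in the output above.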

by Patrick Hall at 02 May 2008 01:09 AM

April 30, 2008

Global By Design

Reminder: Web Globalization Webinar in one week

Just a quick note to let you know that my Lionbridge Webinar will be held a week from today. Here’s the link to register. The call will also be recorded in case you can’t make it.

by John Yunker at 30 April 2008 07:11 PM

April 26, 2008

Hacklog: Blogamundo

Worth a look: An Introduction to Opentype

I’m a Unicode geek, but I feel like I don’t really know enough about what goes on in operating systems after the encoding and decoding issues are worked out. That is, when and where does all of that glyph selection and font shaping and other black magic actually happen? How much of it is dependent on the operating system? Applications? Fonts?

Head a splodes.

But whatever, live and learn. I did run across a very nicely done introduction to what appears to be the cutting edge in computer typography, OpenType:

Adam Twardoch’s PDF slides on OpenType: Typographic perfection with OpenType?

It gives a good idea of the sort of subtleties that OpenType can handle. And it’s very pretty. Font nerds, rejoice…

by Patrick Hall at 26 April 2008 07:00 PM

April 24, 2008

Hacklog: Blogamundo

Upper West Side, Zona Sul, and other tricky subdivisions

Random translation observation:

I was translating some Brazilian Portuguese into English, and the source article was about an earthquake near São Paulo. (See? Brazil does have natural disasters!)

A particular phrase got me thinking: a zona leste de São Paulo meaning something like “the Eastern Zone of São Paulo.” That’s a pretty tricky thing to translate―you don’t really talk about the “Eastern Zone” of a city in English.

In the States, you might talk about the “Upper East Side” of New York, or the “South Side” of Chicago, or “Northwest” (sometimes just “NW”) in DC. My Brazilo-Londonian homey Carlos tells me that zones in London have numbers, so you talk about “Zone 5,” etc.

And then there are those arrondissements in Paris, which are numbered like an escargot.

So does “The East Side of São Paulo” work as a translation for “a zona leste de São Paulo”? Sounds okay to me, actually.

I’d be curious to know about the ways that other cities are subdivided.

by Patrick Hall at 24 April 2008 06:37 AM

April 23, 2008

iheni » internationalisation

Twitter in Japanese

A Japanese version of Twitter was launched today making this the first alternative language version for the site. I find it interesting that Japanese is the first language that the site has been localised in but as Twitter reported in their blog they were noticing a high volume of users and Twitters originating from Japan which [...]

by iheni at 23 April 2008 08:22 AM

April 18, 2008

Global By Design

eBay also has international revenues to thank

eBay recently announced Q1 results and, like Google, shows a significant rise in international revenues vs. domestic revenues. Domestic revenues actually decreased by 1%.

Here’s a visual I cranked out that illustrates eBay’s transformation over the years:

eBay's global revenues

Like Google, eBay has foreign exchange rates to thank for the strong quarter, that is, a pathetically weak US dollar. Then again, eBay wouldn’t have been able to benefit from a weak dollar if it wasn’t already a global player to begin with. It’s worth emphasizing that those companies that were well diversified globally (as in Web globalization) before the dollar’s slide are navigating this looming recession quite nicely.

But I think the more interesting point is that eBay has continued to grow revenues internationally despite its numerous missteps in China. This should be a lesson to other companies that are thinking of throwing all their eggs into China. China has actually been a drag on eBay, while Western Europe has been a blessing.

By the way, eBay’s newest market launch was in Thailand.

by John Yunker at 18 April 2008 04:35 PM

April 17, 2008

Global By Design

Google international revenues surpass domestic

Way back in 2006, I predicted that Google’s international revenues would surpass US revenues by the end of 2007.

I was a few months off.

Today, Google announced Q1 results. According to Google, “revenues from outside of the United States totaled $2.65 billion, representing 51% of total revenues in the first quarter of 2008, compared to 47% in the first quarter of 2007 and 48% in the fourth quarter of 2007.”

Granted, foreign exchange rates played a role in this as well. But the point is that Google has benefited greatly from smart and aggressive Web globalization. Localizing Google Adwords into 40+ languages appears to have paid off nicely.

Google was ranked #1 in the 2008 Web Globalization Report Card.

by John Yunker at 17 April 2008 09:03 PM

April 16, 2008

W3C I18n Activity highlights

Updated Working Draft: Web Services Internationalization (WS-I18N)

The Internationalization Core Working Group has published a Working Draft of Web Services Internationalization (WS-I18N). This document describes enhancements to SOAP messaging to provide internationalized and localized operations using locale and international preferences. These mechanisms can be used to accommodate a wide variety of development models for international usage.

Editors: Addison Phillips, Mary Trumble (until September 2005), Felix Sasaki [search keys: tr-ws-i18n]

by Felix at 16 April 2008 01:23 PM

First Public Working Draft: Requirements of Japanese Text Layout

Participants from four W3C groups (the CSS, Internationalization Core, SVG and XSL Working Groups), working together as the Japanese Layout Task Force, have published Requirements of Japanese Text Layout. This document describes requirements for general Japanese layout realized with technologies like CSS, SVG and XSL-FO. The document is mainly based on a standard for Japanese layout, JIS X 4051. However, it also addresses areas which are not covered by JIS X 4051. A Japanese version is also available.

Editors: Toshi Kobayashi, Yasuhiro Anan. [search keys: tr-jlreq]

by Richard Ishida at 16 April 2008 08:30 AM

April 12, 2008

Global By Design

Google is Microsoft

This comes via my brother via Blogoscoped — Google has made interesting use of the Iceland country code to create a URL that packs a narrative:

google is microsoft

Check it out here: www.google.is/microsoft.

by John Yunker at 12 April 2008 02:59 PM

April 11, 2008

W3C I18n Activity highlights

New article: Migrating to Unicode

Article: This article provides guidelines for the migration of software and data to Unicode. It covers planning the migration, and design and implementation of Unicode-enabled software. A basic understanding of Unicode and the principles of character encoding is assumed.

By Addison Phillips, Yahoo. [search key: article-unicode-migration]

by Richard Ishida at 11 April 2008 12:58 AM

April 10, 2008

Global By Design

Upcoming Web globalization Webinar

I’m going to be presenting a series of Webinars on Web globalization.

The Webinars are sponsored by Lionbridge.

Mark your calendar for May 7th at 1 pm (EST), when I will present the first Webinar — The Best Global Web Sites (and why) — which focuses on key findings from The 2008 Web Globalization Report Card.

The Webinars are free and open to executives who manage global Web sites, or have plans to do so.

To register, visit the Lionbridge site here.

by John Yunker at 10 April 2008 03:21 PM

April 09, 2008

Global By Design

Localization in China

I am pleased to have been invited by John Yunker to contribute thoughts on the localization industry in China. I welcome your comments and suggestions for future articles. Here’s my first posting -

Four years ago, I was working for a localization company in Shanghai. One day, I received a phone call from a woman who said: “I read your advertisement about localization services. We’ve just moved to Shanghai and I was wondering if you could help find a baby-sitter for us.” This may sound like a strange request, but it was not that unusual back then.

Fortunately, times have changed, and quickly. China has become one of the most important regional markets in the world for multinational corporations:

  • 470 of the Fortune 500 companies have invested in China;
  • 750+ multinational companies, including Microsoft, Intel, GE, and Motorola have established R&D centers in China;
  • In 2006, 144 multinational companies chose Shanghai as their Asia-Pacific regional headquarters, while 36 chose Beijing. These numbers are certain to grow.

And then there are the 210 million Internet users in China, according to CNNIC, making the country an alluring market for any Web-based service or application.

However, Chinese Web users have proven to be very selective when choosing news, ecommerce, and networking products. More often than not, they are choosing home-grown products. For example:

  • Despite Google’s best efforts thus far, Baidu is still the number one search engine in China.
  • Sina, Sohu, and Netease remain the three biggest news portals in this market, and not Yahoo! China.
  • QQ is an IM tool developed by Tencent, a local company. It now has 160 million registered users and 50 million active users, greatly outnumbering the users of Yahoo Messenger, MSN, and Google Talk.
  • Although MySpace has been successful in the States, it seems that Chinese people are more interested in local social networking sites, such as Mop and Tianya.

These few examples demonstrate the significant challenges that companies face when localizing for China. There are cultural, financial, and linguistic obstacles to overcome, many of which I plan to address in more detail in future articles.

by Jason Yu at 09 April 2008 10:42 PM

GoDaddy launching the summer of .ME

GoDaddy is building up to a major promotion of the new .ME domain.

According to its Web site, the domain should be available for registration over the summer, with an initial Sunrise Period for trademark holders.

Based on some of the feedback I’m getting here, this could be a very popular domain. The squatters will certainly be first in line.

UPDATE: I just got word from Montenegro with dates for the registration process. They are as follows:

Sunrise Period
May 6-20, 2008
Open to all trademark owners and will have an auction process similar to dotAsia

Landrush Period
June 6-26, 2008
This period will also contain an auction for multiple applications

Open Registrations
July 17, 2008
Open to anyone on a first-come first-served basis.

by John Yunker at 09 April 2008 04:25 PM

April 07, 2008

ishida>>blog » i18n

UniView 5.1 available

>> See what it can do !

>> Use it !

Picture of the page in action.

Those of you who have used UniView over the last couple of days will have seen that it now supports Unicode 5.1. All Unicode 5.1 character information is available; however, you will only be able to see the new characters if you have fonts that cover them. The decodeunicode graphics for the new characters are not yet available.

Last night I also fixed a long-running bug that had meant that additional information available in my character database was not accessible in Internet Explorer (due to AJAX issues). (See the related post if you are interested in the code).

There are no other changes at this time (though those two are pretty significant).

Please report any bugs to me, and don’t forget to refresh any UniView files in your cache before using the new version.

by r12a at 07 April 2008 07:38 AM

April 06, 2008

Global By Design

Vodafone CEO on global leadership

An excerpt from an interview with Arun Sarin, CEO of UK-based Vodafone Group — the world’s largest cellular operator — conducted by Chief Executive magazine:

How much of your business is international?

Less than 5% of our operating profits comes from the UK. We’ve had to fundamentally redesign this company as a global company. … We are a highly consumer-centric company. In Germany, we feel German. In Italy, we feel Italian. In Spain, we feel Spanish. In India, we feel Indian.

You operate in 30 countries. How does your leadership team reflect that global culture?

My top 10 executives represent five nationalities. … At GE, all of Jeff Immelt’s direct reports are American; at Siemens, all but one are German. … We are a very international company and therefore we need an international group of executives.

Can you give an example?

Americans want to land on a decision more quickly than Europeans, who usually want more debate before signing on. … The challenge of running an international team is understanding the balance and complexion of the team and how that impacts the way members bond, work, take decisions, and follow through.

Here is the complete interview.

by John Yunker at 06 April 2008 04:56 PM

April 03, 2008

ishida>>blog » i18n

Managing links to translated versions

Picture of typical links section.

W3C Internationalization articles have links in the top right corner to translated versions of the page. When a new translation is provided, these links need updating on each translated version of the article in question. This has been a pain to do.

I just published details of a new approach to managing these changes which means that I no longer have to touch the files themselves, and can produce the changes with a single, very small edit.

I’m not claiming that this is the ideal solution (though so far it seems pretty helpful, and way better than the previous approach) - just documenting it for those who are interested.

by r12a at 03 April 2008 09:28 AM

March 31, 2008

Hacklog: Blogamundo

Unicode support in Ruby1.9! Yippee!

$ cat unicode.rb
# -*- coding: utf-8 -*-

s = "ABCあいう"
puts "s: #{s}"
puts "s[0]: #{s[0]}"
puts "s[3,1]: #{s[3,1]}"

puts "s.length: #{s.length}"
puts "s.reverse: #{s.reverse}"
puts "s.encoding: #{s.encoding}"

$ ruby1.8 unicode.rb
s: ABCあいう
s[0]: 65
s[3,1]:
s.length: 12
s.reverse: ��㄁め�CBA
unicode.rb:10: undefined method `encoding' for "ABC\343\201\202\343\201\204\343\201\206":String (NoMethodError)

$ ruby1.9 unicode.rb
s: ABCあいう
s[0]: A
s[3,1]: あ
s.length: 6
s.reverse: ういあCBA
s.encoding: UTF-8
$ # yay!

Big ups to Matz.
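The 1.8 numbers above are byte counts: each of あ, い and う takes three bytes in UTF-8, so 3 + 3×3 = 12. A minimal sketch, assuming Ruby 1.9 or later, that shows both views of the same string (the variable names are illustrative only):

```ruby
# -*- coding: utf-8 -*-
# Sketch only: compares the character view and the byte view of one string.
s = "ABCあいう"

char_count = s.length      # counts characters on Ruby 1.9+
byte_count = s.bytesize    # counts bytes, which is what 1.8's length reported

puts "characters: #{char_count}"
puts "bytes:      #{byte_count}"

# Reversing raw bytes splits multi-byte sequences, which is why the 1.8
# output above is garbled; the result is no longer valid UTF-8.
reversed_bytes = s.bytes.to_a.reverse.pack("C*").force_encoding("UTF-8")
puts "byte-reversed string valid? #{reversed_bytes.valid_encoding?}"
```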

by Patrick Hall at 31 March 2008 09:42 PM

March 28, 2008

Global By Design

What took you so long? Craigslist begins translating its site

Craig Newmark announced that Craigslist has begun adding support for additional languages. There are five languages so far: Spanish, German, Italian, Portuguese (Brazil), and French (Canada or France or both; he doesn’t specify).

In 2004, Craigslist began offering sites for cities such as Paris and Tokyo. I thought it odd at the time that Craigslist wasn’t translating anything, but the argument could be made that the Web sites were directed at expats more so than locals.

Then in 2005, eBay launched Kijiji in an effort to out-Craigslist Craigslist. I noted then that Kijiji was doing one thing right — actually localizing its sites for users around the world, with varying degrees of success. Kijiji also marked eBay’s re-entrance into Japan after closing eBay Japan a few years prior.

So here we are in 2008 and Craigslist is doing some translating. I’m certainly happy to see it though I get the feeling that Craigslist has more important issues to contend with these days. I used to visit the site fairly regularly but now find it so loaded with scam artists that I don’t find it to be a very credible source for much at all.

Regarding translation, it’s worth checking out Craigslist Berlin to get a vivid demonstration of text expansion in action. The width of the page expands dramatically as English is translated into German, as shown here:

Craigslist Berlin

Fortunately, Craigslist uses a text-based design so the page simply expands to accommodate the text. Had the text been embedded within visuals, the localization could have been a bit trickier.

It will be interesting to see how many languages Craigslist adds over the next year. And now that Craigslist is multilingual, I will be evaluating it in next year’s Web Globalization Report Card.

by John Yunker at 28 March 2008 07:49 PM

March 27, 2008

Global By Design

Country code wallpaper for your iPhone

I realize I’m a bit obsessed with country codes. It’s a sickness, I know.

After I created the Country Codes of the World map, I began looking at other platforms for the design. And since I own an iPhone, I couldn’t resist creating a custom wallpaper for it.

Here are two ccTLD wallpapers for the iPhone.

iPhone ccTLD wallpaper, version 1

If you’d like to use one, simply save the image to your desktop and then import it to your iPhone via iPhoto or your PC images folder.

iPhone ccTLD wallpaper

I’m using the black background currently.

I’m also working on a wallpaper for laptops and desktops. I’ll keep you posted…

by John Yunker at 27 March 2008 05:15 PM

March 26, 2008

W3C I18n Activity highlights

Internationalization Tag Set Interest Group Launched

The Internationalization Tag Set (ITS) Interest Group, chaired by Yves Savourel (ENLASO Corporation), was launched today.

The ITS IG is a forum to foster a community of users of the Internationalization Tag Set (ITS), and aims to promote its adoption, and gather information on its further development. ITS defines data categories that may be used with schemas to support the internationalization and localization of XML-based documents.

Participation in the new ITS IG is open to W3C Members and the public.

by Richard Ishida at 26 March 2008 06:29 PM

