W3C   W3C Internationalization (I18n) Activity: Making the World Wide Web truly world wide!

If you own a blog with a focus on internationalization, and want to be added or removed from this aggregator, please get in touch with Richard Ishida at ishida@w3.org.

All times are UTC.

Powered by: Planet

Planet Web I18n

The Planet Web I18n aggregates posts from various blogs that talk about Web internationalization (i18n). While it is hosted by the W3C Internationalization Activity, the content of the individual entries represents only the opinion of their respective authors and does not reflect the position of the Internationalization Activity.

May 27, 2015

Global By Design

No Ordinary Disruption: It’s time to reset intuitions

I was given a review copy of No Ordinary Disruption: The Four Global Forces Breaking All the Trends, which I read over the weekend. The authors are Richard Dobbs, James Manyika, and Jonathan Woetzel — all directors at the McKinsey Global Institute. Readers of this blog are not going to be surprised by some of the disruptions  … Read more

by John Yunker at 27 May 2015 03:26 PM

May 26, 2015

Global By Design

Hotels.com, Hotels.ng and the value of country codes

I read today about the Nigerian startup Hotels.ng and my first thought was why Hotels.com didn’t already own the Nigerian country code. After all, Hotels.com owns country codes for France and Italy and Japan, among others. But it was apparently late in registering country codes for Germany and the Netherlands — as well as Nigeria (Africa’s most populous country). Now,  … Read more

by John Yunker at 26 May 2015 03:24 PM

May 19, 2015

Global By Design

Global gateway fail: App Annie

I want to focus on App Annie because it appears the company is planning to significantly expand its global reach — and therefore needs a gateway suited to the task. Currently, App Annie supports five languages. But you might not know this because the gateway is buried in the footer, as shown here: To App Annie’s  … Read more

by John Yunker at 19 May 2015 04:12 PM

May 15, 2015

Global By Design

The humans behind machine translation

Google Translate is the world’s most popular machine translation tool. And, despite predictions by many experts in the translation industry, the quality of Google Translate has improved nicely over the past decade. Not so good that professional translators are in any danger of losing work, but good enough that many of these translators will use Google  … Read more

by John Yunker at 15 May 2015 08:37 PM

May 12, 2015

Global By Design

Who needs BRIC when you have the Blue Banana?

Perhaps it’s human nature (or perhaps just savvy marketing) to think up new and unique ways of organizing our world. BRIC (Brazil, Russia, India, China) is one popular grouping. And did you know about MINT (Mexico, Indonesia, Nigeria, Turkey)? Or MIST (Mexico, Indonesia, South Korea, Turkey)? And if you think those groupings sound odd, consider CIVETS (Colombia, Indonesia, Vietnam,  … Read more

by John Yunker at 12 May 2015 05:35 PM

May 07, 2015

Global By Design

Greece gets an Internationalized Domain Name (IDN)

Greece has received string approval for an IDN, shown below in red: This brings to 35 the number of countries with approved IDNs — and an impressive range of scripts, shown here: I’ve just updated the IDN map of the world. If you’d like to order a copy — or a custom variation — contact me.

by John Yunker at 07 May 2015 07:10 PM

May 03, 2015

Global By Design

Translators Without Borders and the Wikipedia 100-language project

Translators Without Borders is an amazing organization of volunteer translators using their skills to make the world a better place. One project worth noting is an ambitious effort to translate valuable Wikipedia articles into 100 languages: The 100 x 100 Wikipedia Project envisions the translation of the 100 most widely read Wikipedia articles on health issues  … Read more

by John Yunker at 03 May 2015 03:13 PM

April 27, 2015

Global By Design

Will FedEx plus TNT equal an improved global website?

As FedEx closes its acquisition of TNT Express, I see an opportunity for FedEx to improve its global website. In the recent Web Globalization Report Card, among delivery services companies, FedEx finished dead last. TNT supports 37 languages, compared with the relatively paltry 27 languages that FedEx supports. Hopefully FedEx will embrace a new language baseline  … Read more

by John Yunker at 27 April 2015 05:28 PM

April 24, 2015

W3C I18n Activity highlights

Updated Working Draft: Language Tags and Locale Identifiers for the World Wide Web

Language Tags and Locale Identifiers for the World Wide Web describes the best practices for identifying or selecting the language of content as well as the locale preferences used to process or display data values and other information on the Web. It describes how document formats, specifications, and implementations should handle language tags, as well as extensions to language tags that describe the cultural or linguistic preferences referred to in internationalization as a “locale”.

Changes in this update include the following: All references to RFC3066bis were updated to BCP 47 or to RFC 5646 or RFC 4647 as appropriate. References to HTML were changed to point to HTML5. Imported and rewrote the text formerly contained in Web Services Internationalization Usage Scenarios defining internationalization, locale, and other important terms. Modified and reorganized the other sections of this document. Moved the Web services materials to an appendix.
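To make the idea of selecting content by language tag more concrete, here is a minimal Python sketch of the "lookup" matching scheme described in RFC 4647, one of the documents the draft references. The function name `lookup` and the sample tags are illustrative assumptions and are not taken from the Working Draft. The sketch does not validate tags against BCP 47.

```python
from typing import Iterable, Optional, Sequence


def lookup(priority_list: Sequence[str],
           available: Iterable[str],
           default: Optional[str] = None) -> Optional[str]:
    """Return the best available tag for a user's language priority list.

    Follows the RFC 4647 "lookup" idea: each requested range is
    progressively truncated from the right until it equals an available
    tag. Comparison is case-insensitive.
    """
    by_lower = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        if language_range == "*":
            # The wildcard range carries no information for lookup.
            continue
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # A single-character subtag (such as "x") cannot end a tag,
            # so it is removed together with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


if __name__ == "__main__":
    print(lookup(["fr-CH", "en"], ["en", "fr", "de"]))
    print(lookup(["zh-Hant-CN-x-private1"], ["zh", "zh-Hant"]))
    print(lookup(["ja"], ["en", "fr"], default="en"))
```

Here a request for fr-CH falls back to fr rather than skipping straight to en, which is usually what a reader of Swiss French would prefer.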

by Richard Ishida at 24 April 2015 09:28 AM

April 23, 2015

Global By Design

Do your web developers know about Globalize?

Today, the jQuery Foundation has announced availability of Globalize 1.0: Globalize provides developers with always up-to-date global number formatting and parsing, date and time formatting and parsing, currency formatting, and message formatting. Based on the Unicode Consortium standards and specifications, Globalize uses the Common Locale Data Repository (CLDR), the most extensive and widely-used standard repository of  … Read more

by John Yunker at 23 April 2015 11:55 PM

April 08, 2015

Wikimedia Foundation

The Content Translation tool makes it easier to create new Wikipedia articles from other languages. You can now start translations from your Contributions link, where you can find articles missing in your language. Screenshot by Runa Bhattacharjee, freely licensed under CC0 1.0


Since it was first introduced three months ago, the Content Translation tool has been used to write more than 850 new articles on 22 Wikipedias. This tool was developed by Wikimedia Foundation’s Language Engineering team to help multilingual users quickly create new Wikipedia articles by translating them from other languages. It includes an editing interface and translation tools that make it easy to adapt wiki-specific syntax, links, references, and categories. For a few languages, machine translation support via Apertium is also available.

Content Translation (aka CX) was first announced on January 20, 2015, as a beta feature on 8 Wikipedias: Catalan, Danish, Esperanto, Indonesian, Malay, Norwegian (Bokmål), Portuguese, and Spanish. Since then, Content Translation has been added gradually to more Wikipedias – mostly at the request of their communities. As a result, the tool is now available as a beta feature on 22 Wikipedias. Logged-in users can enable the tool as a preference on those sites, where they can translate articles from any of the available source languages (including English) into these 22 languages.

Here is what we have learned by observing how Content Translation was used by over 260 editors in the last three months.

Translators

Number of users who enabled this beta feature over time on Catalan Wikipedia. Graph by Runa Bhattacharjee, CC0 1.0

To date, nearly 1,000 users have manually enabled the Content Translation tool — and more than 260 have used it to translate a new article. Most translators are from the Catalan and Spanish Wikipedias, where the tool was first released as a beta feature.

Articles

Articles published using Content Translation. Graph by Runa Bhattacharjee, CC0 1.0

Articles created with the Content Translation tool cover a wide range of topics, such as fashion designers, Fields Medal scholars, lunar seas and Asturian beaches. Translations can be in two states: published or in-progress. Published articles appear on Wikipedia like any other new article and are improved collaboratively; these articles also include a tag that indicates that they were created using Content Translation. In-progress translations are unpublished and appear on the individual dashboard of the translator who is working on them. Translations are saved automatically and users can continue working on them anytime. In cases where multiple users attempt to translate or publish the same article in the same language, they receive a warning. To avoid any accidental overwrites, the other translators can publish their translations under their user page — and make separate improvements on the main article. More than 875 new articles have been created since Content Translation was made available — 500 of which were created on the Catalan Wikipedia alone.

Challenges

When we first planned to release Content Translation, we decided to monitor how well the tool was being adopted — and whether it was indeed useful to complement the workflow used by editors to create a new article. The development team also agreed to respond quickly to all queries or bugs. Complex bugs and other feature fixes were planned into the development cycles. But finding the right solution for the publishing target proved to be a major challenge, from user experience to analytics. Originally, we did not support publishing into the main namespace of any Wikipedia: users had to publish their translations under their user pages first and then move them to the main namespace. However, this caused delays, confusion and sometimes conflicts when the articles were eventually moved for publication. In some cases, we also noticed that articles had not been counted correctly after publication. To avoid these issues, that original configuration was changed for all supported sites. A new translation is now published like any other new article, and if an article already exists or gets created while the translation is in progress, the user is shown a warning.

New features

Considering the largely favorable response from our first users, we have now started to release the tool to more Wikipedias. New requests are promptly handled and scheduled, after language-specific checks to make sure that proposed changes will work for all sites. However, usage patterns have varied across the 22 Wikipedias. While some of the causes are outside of our control (like the total number of active editors), we plan to make several enhancements to make Content Translation easily discoverable by more users, at different points of the editing and reading workflows. For instance, when users are about to create a new article from scratch, a message gives them the option to start with a translation instead. Users can also see suggestions in the interlanguage link section for languages that they can translate an article into. And last but not least, the Contributions section now provides a link to start a new translation and find articles missing in your language (see image at the top of this post).

In coming months, we will continue to introduce new features and make Content Translation more reliable for our users. See the complete list of Wikipedias where Content Translation is currently available as a beta feature. We hope you will try it out as well, to create more content.

Runa Bhattacharjee, Language Engineering, Wikimedia Foundation

by Wikimedia Blog at 08 April 2015 05:10 PM

April 07, 2015

Global By Design

Starbucks: The best global retail website

For the 2015 Web Globalization Report Card, we studied 10 retail websites: Best Buy, Costco, GameStop, Gap, H&M, IKEA, McDonald’s, Staples, Starbucks, Toys R Us, UNIQLO, Walmart, and Zara. Out of those 10 websites, Starbucks emerged as number one. Here is a screen shot from the German site: McDonald’s leads the category in languages supported, with 39 (in  … Read more

by John Yunker at 07 April 2015 05:30 PM

April 06, 2015

Wikimedia Foundation

Group photo

The Content Translation tool has made it a lot easier for Catalan Wikimedians to convert articles to and from different languages. Photo by Flamenc, freely licensed under CC BY-SA 3.0

Catalan Wikimedians are a very enthusiastic wiki community. In relation to the whole movement, we are mid-sized but one of the most active in terms of editors per million speakers.

Surprisingly Catalan, our mother language, was banished for more than 40 years. Thankfully, editors like to use wikis for digital language activism. With Wikipedia (Viquipèdia, in Catalan) we founded a digital space where we can freely spread our language without real life restrictions (governments, markets).

Almost 99% of Catalan speakers are bilingual and also speak Spanish. This means that content translation from Spanish Wikipedia happens frequently on our project. Some translate by hand, others use commercial platforms like Google Translate or freely licensed translation engines like Apertium. Some users even create their own translation bots, like the AmicalBot or EVA, which our community loves and uses often.

A few months ago, we heard news of Wikimedia’s upcoming ContentTranslation tool, and we were really happy to find that the very first language tests were planned between Spanish and Catalan. Our community responded to this news with great enthusiasm and we have been testing the tool for months now. The development team has kindly listened to our comments and demands, while implementing many of our shared recommendations.

At a personal level, I found the tool really helpful. It is easy to use and understand, and it greatly facilitates our work. I can now translate a 20-line article in less than 5 minutes, saving lots of time. Before, the worst part of translating articles was spending extra time translating reference templates and some of the wikicode. We understand the tool is not perfect yet, but nothing is perfect in a wiki environment: it is continuously being improved.

One of our community’s biggest challenges is updating different language wikis. We have good content about Catalan culture in the Catalan language, but we are not that good at exporting this content to other wikis. I personally hope that this tool can help us with both tasks.

I recommend that you try the ContentTranslation tool with an open mind and spend some time with it. Translate a few articles and if you find any bugs, please report them. When we say Wikipedia is a global project, we mean that it is multilingual, and this tool really helps us reach our shared vision of a world in which every single human being can freely share in the sum of all knowledge.

Alex Hinojo, Amical Wikimedia community member

by Wikimedia Blog at 06 April 2015 10:08 PM

Global By Design

Adobe points to external localized tutorials

Adobe provides French, German and Japanese tutorials for Photoshop Elements. But what about other languages? Until the funding comes along for additional translation, Adobe directs users to tutorials created in Spanish, Polish, Dutch and Russian. Simple and smart. I don’t know why more software companies don’t do this. PS: Adobe ranked #9 overall in this year’s Web  … Read more

by John Yunker at 06 April 2015 05:20 PM

March 25, 2015

W3C I18n Activity highlights

Program published for W3C MultilingualWeb Workshop in Riga, 29 April

See the program. The keynote speaker will be Paige Williams, Director of Global Readiness, Trustworthy Computing, Microsoft. She is followed by a strong lineup in sessions entitled Developers and Creators, Localizers, Machines, and Users, including speakers from Microsoft, the European Parliament, the UN FAO, Intel, Verisign, and many more. The workshop is made possible with the generous support of the LIDER project.

Participation in the event is free. Please register via the Riga Summit for the Multilingual Digital Single Market site.

The MultilingualWeb workshops, funded by the European Commission and coordinated by the W3C, look at best practices and standards related to all aspects of creating, localizing and deploying the multilingual Web. The workshops are successful because they attract a wide range of participants, from fields such as localization, language technology, browser development, content authoring and tool development, etc., to create a holistic view of the interoperability needs of the multilingual Web.

We look forward to seeing you in Riga!

by Richard Ishida at 25 March 2015 08:28 AM

March 24, 2015

Global By Design

BMW & Chevrolet: The Best Global Automotive Websites

For the 2015 Web Globalization Report Card, we studied 14 automotive manufacturers and one supplier (Michelin): Audi, BMW, Chevrolet, Ford, Goodyear, Honda, Hyundai, Land Rover, Lexus, Mercedes, Michelin, Mini, Nissan, Toyota, and Volkswagen. Out of those 15 websites, BMW and Chevrolet emerged in a numerical tie for number one. BMW and Chevrolet both support an impressive 41 languages, in addition to  … Read more

by John Yunker at 24 March 2015 04:25 PM

March 16, 2015

Global By Design

Why you should be using geolocation for global navigation

In the 2015 Web Globalization Report Card, slightly more than half of the websites studied use geolocation specifically to improve global navigation. This is up significantly from just a few years ago. Geolocation is the process of identifying the IP address of a user’s computer or smartphone and responding with localized content or websites. Companies that  … Read more

by John Yunker at 16 March 2015 11:12 AM

March 14, 2015

Global By Design

Armenia gets an IDN: հայ

This is not exactly breaking news, but Armenia now has an IDN: հայ Here it is in my fast-evolving IDN map: This means that 34 countries now have delegated IDNs.  

by John Yunker at 14 March 2015 08:59 PM

March 11, 2015

W3C I18n Activity highlights

Unicode 8.0 Beta Review

The Unicode® Consortium announced the start of the beta review for Unicode 8.0.0, which is scheduled for release in June, 2015. All beta feedback must be submitted by April 27, 2015.

Unicode 8.0.0 comprises several changes which require careful migration in implementations, including the conversion of Cherokee to a bicameral script, a different encoding model for New Tai Lue, and additional character repertoire. Implementers need to change code and check assumptions regarding case mappings, New Tai Lue syllables, Han character ranges, and confusables. Character additions in Unicode 8.0.0 include emoji symbol modifiers for implementing skin tone diversity, other emoji symbols, a large collection of CJK unified ideographs, a new currency sign for the Georgian lari, and six new scripts. For more information on emoji in Unicode 8.0.0, see the associated draft Unicode Emoji report.
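As a rough illustration of the kind of assumption check the announcement asks implementers to make, the following Python sketch reports which Unicode version the running interpreter's character database implements and whether Cherokee letters have case mappings there. The helper names are assumptions made for this example. The code points used are U+13A0 CHEROKEE LETTER A and U+AB70 CHEROKEE SMALL LETTER A, the latter being one of the characters added in Unicode 8.0.0.

```python
import unicodedata
from typing import Tuple

CHEROKEE_LETTER_A = "\u13A0"        # capital form since Unicode 8.0.0
CHEROKEE_SMALL_LETTER_A = "\uAB70"  # added in Unicode 8.0.0


def unicode_data_version() -> Tuple[int, ...]:
    """Unicode version of this interpreter's character database."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def cherokee_is_bicameral() -> bool:
    """True if the runtime maps Cherokee letters between two cases."""
    return (CHEROKEE_LETTER_A.lower() == CHEROKEE_SMALL_LETTER_A
            and CHEROKEE_SMALL_LETTER_A.upper() == CHEROKEE_LETTER_A)


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("Cherokee has case mappings:", cherokee_is_bicameral())
```

Code that lowercases text for comparison or storage may behave differently for Cherokee depending on the answer, which is one reason the beta notes call for careful migration.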

Please review the documentation, adjust code, test the data files, and report errors and other issues to the Unicode Consortium by April 27, 2015. Feedback instructions are on the beta page.

See more information about testing the 8.0.0 beta. See the current draft summary of Unicode 8.0.0.

by Richard Ishida at 11 March 2015 12:29 PM

March 04, 2015

Global By Design

Google to the Internet: Go mobile or watch your sales rank fall

Four years ago, for the Web Globalization Report Card, I began noting (and rewarding) those websites that supported mobile devices. Even then one could easily see the virtual grounds shifting in favor of mobile devices. But at the time, only about 20% of the websites studied supported mobile devices. In this year’s Report Card, the majority of websites are  … Read more

by John Yunker at 04 March 2015 03:14 PM

February 26, 2015

W3C I18n Activity highlights

Speaker deadline for Riga MultilingualWeb Workshop is Sunday, 8 March

We would like to remind you that the deadline for speaker proposals for the 8th MultilingualWeb Workshop (April 29, 2015, Riga, Latvia) is on Sunday, March 8, at 23:59 UTC.

Featuring a keynote by Paige Williams (Director of Global Readiness, Trustworthy Computing at Microsoft) and sessions for various audiences (Web developers, content creators, localisers, users, and multilingual language processing), this workshop will focus on the advances and challenges faced in making the Web truly multilingual. It provides an outstanding and influential forum for thought leaders to share their ideas and gain critical feedback.

While the organizers have already received many excellent submissions, there is still time to make a proposal, and we encourage interested parties to do so by the deadline. With roughly 150 attendees anticipated for the Workshop from a wide variety of profiles, we are certain to have a large and diverse audience that can provide constructive and useful feedback, with stimulating discussion about all of the presentations.

The workshop is made possible by the generous support of the LIDER project and will be part of the Riga Summit 2015 on the Multilingual Digital Single Market. We are organizing the workshop as part of the Riga Summit to strengthen the wider European community. Depending on the number of submissions to the MultilingualWeb workshop, we may suggest moving some presentations to other days of the summit. For these reasons we highly recommend that you attend the whole Riga Summit! See the line-up of speakers already confirmed for the various events during the summit.

For more information and to register a presentation proposal, please visit the Riga Workshop Call for Participation. For registration as a regular participant of the MultilingualWeb workshop or other events at the Riga Summit, please register at the Riga Summit 2015 site.

by Richard Ishida at 26 February 2015 11:30 AM

February 20, 2015

Global By Design

Web localization in the Year of the Sheep

I enjoy watching how Western companies localize their websites and products to capitalize on Chinese New Year — the Year of the Sheep (or Goat). Like this gift card from Starbucks China: And this hero image on the Microsoft China home page: And Nike has put together a color-appropriate assortment of products: Happy New Year!

by John Yunker at 20 February 2015 03:26 AM

February 18, 2015

Global By Design

LinkedIn adds Arabic

Nice to see that LinkedIn has added support for Arabic: This raises LinkedIn’s language total to 24 languages, including English. As a point of comparison, Facebook supports more than 70 languages.    

by John Yunker at 18 February 2015 05:20 PM

February 10, 2015

Global By Design

The top 25 global websites from the 2015 Web Globalization Report Card

I’m pleased to announce the publication of The 2015 Web Globalization Report Card. Here are the top-scoring websites from the report: For regular readers of this blog, you’ll notice that Google is once again ranked number one. The fact is, no other company on this list invests in web and software globalization like Google. While  … Read more

by John Yunker at 10 February 2015 08:08 PM

February 05, 2015

W3C I18n Activity highlights

Paige Williams (Microsoft) to keynote at 8th Multilingual Web Workshop (April 29, 2015, Riga)

We are pleased to announce that Paige Williams, Director of Global Readiness, Trustworthy Computing at Microsoft, will deliver the keynote at the 8th Multilingual Web Workshop, “Data, content and services for the Multilingual Web,” in Riga, Latvia (29 April 2015).

Paige spent 10 years managing internationalization of Microsoft.com, before joining the Trustworthy Computing organization in 2005. In TwC, Paige oversees compliance of company policy for geographic, country-region and cultural requirements, establishing a new center of excellence for market and world readiness, globalization/localizability, and language programs, tools, resources and external community forums to reach markets across the world with the right local experience.

The Multilingual Web Workshop series brings together participants interested in the best practices, new technologies, and standards needed to help content creators, localizers, language tools developers, and others address the new opportunities and challenges of the multilingual Web. It will provide for networking across communities and building connections.

Registration for the Workshop is free, and early registration is recommended since space at the Workshop is limited.

The workshop will be part of the Riga Summit 2015 on the Multilingual Digital Single Market. We are organizing the workshop as part of the Riga Summit to strengthen the wider European community. Depending on the number of submissions to the MultilingualWeb workshop, we may also suggest moving presentations to other days of the summit. For these reasons we highly recommend that you attend the whole Riga Summit!

There is still opportunity for individuals to submit proposals to speak at the workshop. Ideal proposals will highlight emerging challenges or novel solutions for reaching out to a global, multilingual audience. The deadline for speaker proposals is March 8, but early submission is strongly encouraged. See the Call for Participation for more details.

This workshop is made possible by the generous support of the LIDER project.

by Richard Ishida at 05 February 2015 11:30 AM

February 03, 2015

W3C I18n Activity highlights

Counter Styles: two documents published

The Cascading Style Sheets (CSS) Working Group has published a Candidate Recommendation of CSS Counter Styles Level 3. It adds new built-in counter styles to those defined in CSS 2.1, but, more importantly, it also allows authors to define custom styles for list markers, numbered headings and other types of generated content.

At the same time, the Internationalization Working Group has updated their Working Draft of Predefined Counter Styles, which provides custom rules for over a hundred counter styles in use around the world. It serves both as a ready-to-use set of styles to copy into your own style sheets, and also as a set of worked examples.
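To give a feel for what defining a custom counter style involves, here is a small Python sketch of how an "alphabetic" counter system turns a counter value into a marker string from a list of symbols. It is an illustration of the general algorithm only, not CSS and not code from either document. The function name and the sample symbol lists are assumptions for this example.

```python
from typing import Sequence


def alphabetic_marker(value: int, symbols: Sequence[str]) -> str:
    """Build a list marker in the manner of an 'alphabetic' counter system.

    The symbols behave like the letters of an alphabet: with a..z the
    sequence runs a, b, ..., z, aa, ab, ... and there is no zero digit.
    """
    if value < 1:
        # Assumption for this sketch: reject out-of-range values. In CSS a
        # fallback counter style is used instead of raising an error.
        raise ValueError("alphabetic systems only represent positive values")
    if len(symbols) < 2:
        raise ValueError("an alphabetic system needs at least two symbols")
    base = len(symbols)
    parts = []
    while value != 0:
        value -= 1                       # shift to a zero-based digit
        parts.append(symbols[value % base])
        value //= base
    return "".join(reversed(parts))


if __name__ == "__main__":
    lower_latin = [chr(code) for code in range(ord("a"), ord("z") + 1)]
    for n in (1, 26, 27, 52, 53):
        print(n, alphabetic_marker(n, lower_latin))
```

Swapping in a different symbol list, for example the letters of another script, changes the markers without touching the algorithm, which is the appeal of author-defined counter styles.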

by Richard Ishida at 03 February 2015 06:18 PM

January 29, 2015

ishida>>blog » i18n

Bopomofo on the Web

Three bopomofo letters with tone mark.

Light tone mark in annotation.

A key issue in handling bopomofo (zhùyīn fúhào) is the placement of tone marks. When bopomofo text runs vertically (either on its own, or as a phonetic annotation), some smarts are needed to display tone marks in the right place. This may also be required (though with different rules) for bopomofo when used horizontally for phonetic annotations (i.e., above a base character), but not in all such cases. However, when bopomofo is written horizontally in any other situation (i.e., when not written above a base character), the tone mark typically follows the last bopomofo letter in the syllable, with no special handling.

From time to time questions are raised on W3C mailing lists about how to implement phonetic annotations in bopomofo. Participants in these discussions need a good understanding of the various complexities of bopomofo rendering.

To help with that, I just uploaded a new Web page Bopomofo on the Web. The aim is to provide background information, and carry useful ideas from one discussion to the next. I also add some personal thoughts on implementation alternatives, given current data.

I intend to update the page from time to time, as new information becomes available.

by r12a at 29 January 2015 12:07 PM

January 20, 2015

Wikimedia Foundation

File:Content Translation Screencast (English).webm


Video: How to translate a Wikipedia article in 3 minutes with Content Translation. This video can also be viewed on YouTube (4:10). Screencast by Pau Giner, licensed under CC BY-SA 4.0

Wikimedia Foundation’s Language Engineering team is happy to announce the first version of Content Translation on Wikipedia for 8 languages: Catalan, Danish, Esperanto, Indonesian, Malay, Norwegian (Bokmål), Portuguese and Spanish. Content Translation, available as a beta feature, provides a quick way to create new articles by translating from an existing article into another language. It is also well suited for new editors looking to familiarize themselves with the editing workflow. Our aim is to build a tool that leverages the power of our multicultural global community to further Wikimedia’s mission of creating a world where every single human being can share in the sum of all knowledge.

Design

During early 2014, when the design ideas for Content Translation were being conceptualized, we came across an interesting study by Scott A. Hale of the University of Oxford, on the influences and editing patterns of multilingual editors on Wikipedia. Combined with feedback from editors we interacted with, the data presented in the study guided our initial choices, both in terms of features and languages. We were fortunate to have met the researcher in person at Wikimania 2014, so we could learn more about his findings and references.

The tool was designed for multilingual editors as our main target users. Several important patterns emerged from a month-long user study, including:

  • Multilingual editors are relatively more active in Wikipedias of smaller size. Often the editors from smaller sized Wikipedias would also edit on a relatively large sized Wikipedia like English or German;
  • Multilingual editors often edited the same articles in their primary and non-primary languages.

These and other factors listed in the study impact the transfer of content between different language versions of Wikipedia; they increase content parity between versions — and decrease ‘self-focus’ bias in individual editions.

Languages

When selecting languages for the tool’s introduction, we were guided by several factors, including signs of relatively high multilingualism amongst the primary editors. The availability of high quality machine-translated content was an additional consideration, to fully explore the usability of the core editing workflow designed for the tool. Based on these considerations, Catalan Wikipedia, a very actively edited project of medium size, was a logical choice. Subsequent language selections were made by studying possible overlap trends between language users — and the probability of editors benefiting from those overlaps when creating new articles. Availability of machine translation to speed up the process and community requests were also important considerations.

How it works

The article Abel Martín in the Spanish Wikipedia doesn’t have a version in Portuguese, so a red link to Portuguese is shown.
Content Translation red interlanguage link screenshot by Amire80, licensed under CC BY-SA 4.0

Content Translation combines a rich text translation interface with tools targeted for editing — and machine translation support for most language pairs. It integrates different tools to automate repetitive steps during translation: it provides an initial automatic translation while keeping the original text format, links, references, and categories. To do so, the tool relies on the inter-language connections from Wikidata, html-to-wikitext conversion from Parsoid, and machine translation support from Apertium. This saves time for editors and allows them to focus on creating quality content.

Although basic text formatting is supported, the purpose of the tool is to create an initial version of the content that each community can keep improving with their usual editing tools. Content Translation is not intended to keep the information in sync across multiple language versions, but to provide a quick way to reuse the effort already made by the community when creating an article from scratch in a different language.

The tool can be accessed in different ways. There is a persistent access point at your contributions page, but access to the tool is also provided in situations where you may want to translate the content you are currently reading. For instance, a red link in the interlanguage link area serves as one such entry point (see image).

Next steps

Next steps for the tool’s future development include adding support for more – eventually all – languages, managing lists of articles to translate, and adding features for more streamlined translation.

In coming weeks, we will closely monitor feedback from users and interact with them to guide our future development. Please read the release announcement for more details about the features and instructions on using the tool. Thank you!

Amir Aharoni, Pau Giner, Runa Bhattacharjee, Language Engineering, Wikimedia Foundation

by Wikimedia Blog at 20 January 2015 06:56 PM

January 18, 2015

ishida>>blog » i18n

Bengali picker & character & script notes updated

Version 16 of the Bengali character picker is now available.

Other than a small rearrangement of the selection table, and the significant standard features that version 16 brings, this version adds the following:

  • three new buttons for automatic transcription between Latin and Bengali. You can use these buttons to transcribe to and from Latin transcriptions using ISO 15919 or Radice approaches.
  • hinting to help identify similar characters.
  • the ability to select the base character for the display of combining characters in the selection table.

For more information about the picker, see the notes at the bottom of the picker page.

In addition, I made a number of additions and changes to Bengali script notes (an overview of the Bengali script), and Bengali character notes (an annotated list of characters in the Bengali script).

About pickers: Pickers allow you to quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Pickers are likely to be most useful if you don’t know a script well enough to use the native keyboard. The arrangement of characters also makes it much more usable than a regular character map utility. See the list of available pickers.

by r12a at 18 January 2015 08:10 AM

January 15, 2015

Global By Design

Global Gateway Fail: Yandex

Yandex is Russia’s leading search engine and, following in Google’s footsteps, is eager to take over much of Russia’s Internet, which naturally includes the web browser. Yandex is also in the process of expanding its reach beyond Russia. But when I visited the web browser download web page I couldn’t help but notice a few problems with the global  … Read more

by John Yunker at 15 January 2015 01:25 AM

