I18nsite contentneg

Collaborative editing page

Follow the conventions for editing this page.

Status: Initial draft, i.e. please focus on technical content rather than wordsmithing at this stage.

See the [I18n Core home page].

Author: John Smythe

Content Negotiation on the W3C I18n Site

This article describes the approach to content language negotiation currently in use on the W3C Internationalization Activity sub-site. It is not intended to advocate this as a good approach; it simply documents the current practice.

Background

As an international organization, the W3C officially operates in English. We are more than happy for volunteers to provide translations of our documents, but we don't have the means to produce everything in many languages.

The Internationalization Activity sub-site contains translations for a number of articles and tutorials, generously donated by the public. The coverage depends on what people volunteer for, so the number of translations available for any given page can vary widely. For many pages there are still no translations.

We update articles from time to time in response to changing events or feedback, and translations do not always get updated. We do try to indicate when a translation is out of date. In addition, whereas the articles themselves go through several stages of review, the quality of translations is not checked. For these reasons, some people prefer to always read the (canonical) English version of a page.

It is for these people that we provide ['sticky' content negotiation] using a cookie. By indicating which language they prefer, they can choose the language in which they read articles on the site, regardless of their browser settings.

If you click on certain links, a cookie will be set on your client which simply indicates the language you chose. Then, as you continue to browse the site you will be presented with articles in that language whenever they are available.

If no such translation is available (or you have disabled cookies), the server will look at the language preferences set in your browser and serve you a translation corresponding to those settings, if one is available.

Otherwise, you will get the English page.

How it works

From the user's perspective

Every page always contains links to all available language versions of that page.

If you want to explicitly choose the language in which you view pages on the site, click on one of the following links. These are the only links that set the cookie:

1. The links to translations on the [home page], the [brief explanatory page], or the page that [lists all articles].
2. Language names in the top right of a page for which translations are available.

Most of the time this will not affect the way you use the site, but the stickiness provided by the cookie is useful if you want to read articles on the site in a different language to that indicated by the preferences in your browser.

Example: Suppose that your browser indicates that you prefer Polish. Using normal content negotiation, you will always be served a Polish page if one is available. Suppose now that you prefer to read pages in, say, English on this site. There are links on any Polish page to the English version, but with standard content negotiation each time you move to a new page your browser will request the Polish version, and you will need to click on the link to the translation again. The cookie avoids that by always serving you the English version once you have chosen English. You can, of course, decide to see the pages in Polish (or any other language) by clicking on a link to the Polish page at any time.

If a version of a page is not available in the language specified by your cookie, standard content language negotiation is performed on the server and you will be provided with a translation that corresponds to the language settings of your browser.

If a translation cannot be found that matches your browser settings, you will be served the English page as a default.
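
The order of preference described above (cookie first, then browser settings, then English) can be sketched in a few lines of Python. This is only an illustrative model, not the code running on the server: the function names are hypothetical, and the Accept-Language handling is a simplification of what Apache actually does.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without a q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(available, cookie_lang=None, accept_language="", default="en"):
    """Pick a language: cookie first, then browser preferences, then the default."""
    if cookie_lang and cookie_lang in available:
        return cookie_lang
    for tag in parse_accept_language(accept_language):
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # simplification: "pl-PL" falls back to "pl"
        if primary in available:
            return primary
    return default
```

For the Polish example above, a call such as choose_language({"en", "pl"}, cookie_lang="en", accept_language="pl") selects English, while the same call without the cookie selects Polish.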

You can also view a page in another language (where available) by specifying the language in the URI (e.g. filename.fr for a French page). This will not affect your cookie settings.

Setting up the server for content negotiation

The W3C uses an Apache server; at the time of writing the version is xxx. The i18n sub-site has a .htaccess file with the following instructions:

	MultiviewsMatch Handlers
	AddHandler type-map .var

This sets up standard type-map content negotiation. Each article on the sub-site has a language extension (e.g. qa-i18n.de.php is the German variant) and an accompanying .var file. For example, the file qa-i18n.var might contain the following:

	URI: qa-i18n

	URI: qa-i18n.de.php
	Content-type: text/html
	Content-language: de

	URI: qa-i18n.en.php
	Content-type: text/html
	Content-language: en

	URI: qa-i18n.es.php
	Content-type: text/html
	Content-language: es

	URI: qa-i18n.pt.php
	Content-type: text/html
	Content-language: pt

	URI: qa-i18n.pl.php
	Content-type: text/html
	Content-language: pl

	URI: qa-i18n.ru.php
	Content-type: text/html
	Content-language: ru

	URI: qa-i18n.en.php
	Content-type: text/html

This .var file lists the locations of the various language versions available for a given page, and associates each URI with a language tag that is matched against the Accept-Language header sent by your browser in the HTTP request. The server checks the HTTP request information and serves the corresponding file. The final two lines catch the default case, where the HTTP header requests a language that is not available. In the example, the English file is served as the default.
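
To make the structure of the type map concrete, here is a minimal Python sketch that reads such a file into a language-to-URI table and picks out the default entry. It is an illustration only: the function names are hypothetical, the sample is a shortened copy of the file above, and Apache's own parser handles more fields and cases than this does.

```python
SAMPLE = """\
URI: qa-i18n

URI: qa-i18n.de.php
Content-type: text/html
Content-language: de

URI: qa-i18n.en.php
Content-type: text/html
Content-language: en

URI: qa-i18n.en.php
Content-type: text/html
"""


def parse_type_map(text):
    """Split a type map into records (dicts keyed by lower-cased header name)."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:  # a blank line ends the current record
            if current:
                records.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        records.append(current)
    return records


def variants_by_language(records):
    """Map each Content-language to its URI; a variant without one is the default."""
    by_language, default = {}, None
    for record in records:
        if "content-type" not in record:
            continue  # skip records such as the leading "URI: qa-i18n" entry
        language = record.get("content-language")
        if language is None:
            default = record["uri"]
        else:
            by_language[language] = record["uri"]
    return by_language, default
```

Calling variants_by_language(parse_type_map(SAMPLE)) yields a table with entries for de and en, plus qa-i18n.en.php as the default variant.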

For more information about how this works, see the [Apache documentation].

Serving files based on cookie data

In order to catch the cookie settings, additional instructions are added to the .htaccess file, which look like this:

	SetEnvIf Cookie "w3ci18nlang=ar" prefer-language=ar
	SetEnvIf Cookie "w3ci18nlang=de" prefer-language=de
	SetEnvIf Cookie "w3ci18nlang=el" prefer-language=el
	SetEnvIf Cookie "w3ci18nlang=en" prefer-language=en
	...

And so on, for all language versions available on the sub-site. This construct only works for versions of Apache including XXX and above. Essentially, it looks up the value of any cookie called w3ci18nlang in the HTTP request header, and if it is set to a particular language tag, it tells the server to send a page in that language if one is available, rather than follow the normal content negotiation route.
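
The effect of these SetEnvIf lines can be modelled roughly as follows. This Python sketch is an assumption-laden illustration, not a description of Apache internals: the function name is hypothetical, and it simply treats each directive as a pattern searched for anywhere in the Cookie header, with a later match replacing an earlier one.

```python
import re


def prefer_language_from_cookie(cookie_header, languages):
    """Return the prefer-language value the SetEnvIf lines would set, or None."""
    preferred = None
    for language in languages:
        # Like SetEnvIf, search for the pattern anywhere in the header value.
        if re.search("w3ci18nlang=" + re.escape(language), cookie_header):
            preferred = language  # a later matching directive overrides an earlier one
    return preferred
```

With the languages listed above, a Cookie header of "session=abc; w3ci18nlang=de" gives de, and a header with no w3ci18nlang cookie gives None, in which case normal content negotiation applies.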

Setting the cookie

The only remaining piece is the code that sets the cookie. Articles with translations on the i18n sub-site are served using PHP. This allows the boilerplate text to be handled separately, reducing the work of a translator and increasing consistency.

When a PHP page is served, it looks for a query in the URI, and if that query is found, it sets the cookie. The query parameter is changelang and the expected value is a language tag.

For example, http://www.w3.org/International/questions/qa-i18n.ru.php will take you to the Russian page without setting a cookie; http://www.w3.org/International/questions/qa-i18n.ru.php?changelang=ru will take you to the Russian page and set the cookie to Russian.

The code that actually sets the cookie is as follows:

if (isset($_GET['changelang'])) {
	// Remember the chosen language for 30 days, for every path on the site.
	setcookie("w3ci18nlang", $_GET['changelang'], time()+60*60*24*30, "/");
}
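
For readers more familiar with Python, the same step might look like the sketch below. It is not the site's code. Two things in it are assumptions that the PHP above does not have: the check against a list of known languages (KNOWN_LANGUAGES is a hypothetical list), and the use of Max-Age instead of an expiry timestamp. The function name is also hypothetical.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qs, urlsplit

THIRTY_DAYS = 60 * 60 * 24 * 30  # same lifetime as the PHP snippet, in seconds
KNOWN_LANGUAGES = {"ar", "de", "el", "en", "ru"}  # hypothetical list of site languages


def language_cookie_header(url):
    """Return a Set-Cookie header value for ?changelang=..., or None."""
    values = parse_qs(urlsplit(url).query).get("changelang")
    if not values or values[0] not in KNOWN_LANGUAGES:
        return None  # no query, or an unknown tag: leave the cookie alone
    cookie = SimpleCookie()
    cookie["w3ci18nlang"] = values[0]
    cookie["w3ci18nlang"]["path"] = "/"  # valid for the whole site
    cookie["w3ci18nlang"]["max-age"] = THIRTY_DAYS
    return cookie["w3ci18nlang"].OutputString()
```

Checking the value against a known list is a design choice made for this sketch: it keeps arbitrary query values out of the cookie, at the cost of having to maintain the list.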