Intended audience: webmasters, server administrators, and anyone who wants to better understand language-based content negotiation.
When is it appropriate, or not, to use language negotiation?
Language negotiation is a function of the HTTP protocol which lets a server choose among several language versions of a page, based on the URL and on preference information sent by the browser (specifically in the Accept-Language header). This is distinct from page selection based on the IP address of the browser or from a manual selection by the user on a language-selection page. If there is no match between the preferences expressed by the browser and the languages available on the server, either a language-selection page is shown or a default language is served.
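The matching step described above can be sketched roughly as follows. This is a minimal illustration rather than a complete implementation of HTTP language matching: the function names are hypothetical, "nl" stands in for Flemish as an assumption, and falling back from a regional tag such as "de-AT" to its primary subtag "de" is a simplification.

```python
def parse_accept_language(header):
    """Return [(language_tag, q)] sorted by descending q; tags are lowercased."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        tag = tag.strip().lower()
        params = params.strip()
        q = 1.0  # a tag without an explicit q-value has weight 1
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # malformed weight: treat the tag as not acceptable
        if tag and q > 0:
            prefs.append((tag, q))
    # sorted() is stable, so tags with equal weight keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(header, available, default=None):
    """Pick the best available language, or `default` when nothing matches."""
    by_lower = {lang.lower(): lang for lang in available}
    for tag, _q in parse_accept_language(header or ""):
        if tag == "*":
            continue  # wildcard is left to the default in this sketch
        # Try the exact tag first, then its primary subtag ("de-at" -> "de").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in by_lower:
                return by_lower[candidate]
    return default


# A browser that prefers Italian but accepts German, on a nl/fr/de site:
print(negotiate("it, de;q=0.7", ["nl", "fr", "de"], default="nl"))  # de
# No overlap at all: the default language is served.
print(negotiate("it", ["nl", "fr", "de"], default="nl"))            # nl
```

Passing default=None lets the caller detect the "no match" case and show a language-selection page instead of a default language.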
In many cases, the initial user agent setting for language preferences is okay. For example, if you have a Japanese version of a browser, the browser typically assumes that you prefer pages in Japanese, and sends this information to the server. Mainstream browsers allow you to modify these language preferences. For more information see Setting language preferences in a browser.
This article addresses the question of when it is appropriate (or not) to set up language negotiation on the server.
The short answer is: always.
A slightly longer answer is: almost always, but not alone.
Language negotiation is evidently a useful thing, but before implementing it across the board, it is important to understand its limits. To illustrate these, we will use the example of a site, www.example.be, that offers its content in Flemish, French and German, implements language negotiation and defaults to Flemish for all pages. Our visitor, Sylvia, will be Italian-speaking, but able to deal with German. Several situations may arise:
Hopefully the picture is now clear: language negotiation does not always give the intended result.
Additionally, language negotiation is not even relevant when pages are not equivalent, i.e. do not have essentially the same content in different languages. The article Monolingual vs. multilingual Web sites sheds some light on this; see in particular the "Multilingual, same content" and "Multilingual, changed content" sub-sections. Note however that some measure of cultural adaptation (e.g. changing the currency) does not necessarily make pages non-equivalent; the non-equivalence limitation to language negotiation really exists when a site is adapted so that pages in different languages do not correspond to one another any more.
Despite its limitations, language negotiation is a useful function and it is desirable to implement it in multilingual sites. But the shortcomings must be addressed. In short, it is important to provide means for visitors to override the automatic choice of language when it is wrong. This means putting some interface elements in the page (we'll call them language controls here) that link to the other available languages. These controls must of course be clearly visible and understandable by a visitor who has no familiarity with the language of the currently displayed page.
One question that arises is: should language negotiation and manual language controls be implemented for all pages, or only for the home page? The best answer is "all pages", except those sets of pages that are not sufficiently equivalent. Language negotiation is good because, if Jaap emails a link inside www.example.be to Pierre, Pierre will be happy to get the French version, even though Jaap read the Flemish one. Language controls must also be provided, whether negotiation is implemented or not. If negotiation is absent, Pierre will need controls to get the French version from Jaap's link; even if it is there, Sylvia will need to switch to German manually in situations 2 and 3 above.
By the way, some sites choose to return a special-purpose language selection page when there is no match between the visitor's preferences and the available languages (www.example.be could do that instead of returning Flemish). This has the advantages of making the situation clear and of not giving one language precedence over the others, which may be a politically sensitive issue. Unfortunately, some sites always return this special page (for the home page) instead of implementing language negotiation. This forces everyone to always go through that page while offering no apparent advantage, which is poor human-factors design.
Suppose Sylvia visits www.example.be and gets Flemish (situation 2 or 3). She then clicks on the German control and reads on, no real trouble. But she then clicks on a link to get to an interesting page within the site. Oops, Flemish again! Fortunately, the German control is still there, but after a couple of such detours she's getting understandably frustrated. Can't www.example.be just remember that she can't read Flemish? What is needed here is some stickiness of the explicit language selection.
There are a couple of ways that www.example.be can provide this stickiness. Which one to choose will depend on the underlying technology available on the server and on the level of effort that can be expended.
If the site implements a session mechanism (for instance using cookies), then it is a fairly simple matter to arrange for language to be part of the session state. Once Sylvia clicks on the German control, this gets stored (either in the cookie itself or on the server, to be matched by a session number in the cookie) and from then on she gets German when navigating the site. The cookie can even be made persistent (although this makes privacy issues more acute), so that Sylvia gets German pages automatically the next time she returns to www.example.be. Sites that provide a login mechanism can also store language preferences as part of each user's profile, and serve pages accordingly. Language negotiation is then used only for users who have not yet logged in.
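One possible shape for the cookie-based approach is sketched below, using only Python's standard library. The cookie name "lang", the language list and the one-year lifetime are assumptions made for illustration; a real site would plug this into whatever session mechanism it already has.

```python
from http.cookies import SimpleCookie

AVAILABLE = ("nl", "fr", "de")  # assumption: "nl" stands in for Flemish
DEFAULT = "nl"
COOKIE_NAME = "lang"            # hypothetical cookie name


def language_from_cookie(cookie_header):
    """Return the language stored by an earlier explicit choice, or None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(COOKIE_NAME)
    if morsel is not None and morsel.value in AVAILABLE:
        return morsel.value
    return None  # no cookie, or a value the site does not offer


def choose_language(cookie_header, negotiated):
    """Explicit choice wins, then the negotiated language, then the default."""
    return language_from_cookie(cookie_header) or negotiated or DEFAULT


def remember_language_header(lang, persistent=False):
    """Build a Set-Cookie value to send when a language control is clicked."""
    if lang not in AVAILABLE:
        raise ValueError(f"unsupported language: {lang!r}")
    cookie = SimpleCookie()
    cookie[COOKIE_NAME] = lang
    cookie[COOKIE_NAME]["path"] = "/"
    if persistent:
        # Persistent cookies raise the privacy concerns noted above.
        cookie[COOKIE_NAME]["max-age"] = 60 * 60 * 24 * 365
    return cookie[COOKIE_NAME].OutputString()


# Sylvia clicked the German control earlier; negotiation alone would give "nl".
print(choose_language("lang=de", "nl"))  # de
```

The point of the ordering in choose_language is that an explicit selection always overrides the automatic one, while visitors who never touched the language controls still get the negotiated result.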
Another way to diminish frustration is to make all internal links within the site language-specific. In the German home page, links to deeper pages would be of the form href="company/about.de.html" (instead of a form that would be language-generic). Navigation is then constrained to remain in German, until overridden by the special language controls. This has a couple of downsides, however. One is that all internal links become translatable material, increasing the cost of translation as well as the potential for errors. Another is that if Jaap sends a link to Pierre, the URL (picked up from the browser's address bar) will be language-specific; Pierre will then get the Flemish page. Neither of these downsides is horrible, however, so using language-specific links is probably the way to go if stickiness cannot be provided through a session state or profile mechanism.
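As a rough sketch of how language-specific links might be produced mechanically rather than edited by hand in each translation, the helpers below insert or strip a language code before the file extension. The function names, the language list and the generic form "company/about.html" are assumptions made for illustration only.

```python
import posixpath

LANGUAGES = ("nl", "fr", "de")  # assumption: "nl" stands in for Flemish


def localized_href(generic_href, lang):
    """Insert the language code before the extension of a generic link."""
    root, ext = posixpath.splitext(generic_href)
    return f"{root}.{lang}{ext}"


def generic_href(localized, languages=LANGUAGES):
    """Strip a known language code, e.g. before sharing a link by email."""
    root, ext = posixpath.splitext(localized)
    inner_root, lang_ext = posixpath.splitext(root)
    if lang_ext[1:] in languages:
        return inner_root + ext
    return localized  # no recognized language code: leave the link unchanged


print(localized_href("company/about.html", "de"))  # company/about.de.html
print(generic_href("company/about.de.html"))       # company/about.html
```

Generating the links from one generic form would reduce the translation cost mentioned above, and a "share this page" control could use the generic form so that the recipient's own preferences are negotiated.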
The HTTP Accept-Language header is not the only source of language information available. All browsers also send a User-Agent header identifying the browser's make, version number and in some cases language version. This can be used to guess at the user's preferred language if the Accept-Language header is missing, but it is less reliable and more limited (one language only) than Accept-Language. Use with extreme care.
Language negotiation is only one aspect of HTTP content negotiation. Other aspects that can be automatically negotiated are the media type (i.e. the format: HTML, PDF or plain text for instance), the character encoding and the content encoding (compression, for instance). Language negotiation is the most useful and most used.
Content first published 2004-02-26. Last substantive update 2004-02-26 15:10 GMT. This version 2010-09-03 18:54 GMT
For the history of document changes, search for qa-when-lang-neg in the i18n blog.
Copyright © 2004-2010 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved.