Intended audience: script developers (PHP, JSP, etc.), Web project managers, and anyone who wants to use locale information.
Is it a good idea to use the HTTP Accept-Language header to determine the locale of the user?
For a number of perfectly valid reasons some web applications would like to associate a locale with each user that visits the site. This locale would enable them to provide information in the local format. Some of this information is common to traditional software locales, such as:
In other cases, other information may be derived from the locale information plus additional knowledge, such as:
Since none of these are included in the HTTP protocol, many web developers have used the Accept-Language header to make inferences about the user's locale.
The Accept-Language header is information about the user's language preferences that is passed via HTTP when a document is requested.
Mainstream browsers allow these language preferences to be modified by the user. The value itself is defined by BCP 47, typically as a two- or three-letter language code (e.g. fr for French), followed by optional subcodes representing such things as country (e.g. fr-CA represents French as spoken in Canada).
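To make the shape of the header concrete, here is a minimal sketch of how a script might split an Accept-Language value into language tags. It assumes the standard HTTP syntax of comma-separated tags with optional ";q=" quality weights, which this article does not describe; the function name parse_accept_language is hypothetical, and a production parser would need stricter validation.

```python
def parse_accept_language(header):
    """Return (language_tag, quality) pairs, most preferred first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        tag = tag.strip()
        quality = 1.0  # assumption: a missing q parameter means full preference
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value.strip())
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        if tag:
            entries.append((quality, position, tag))
    # Highest quality first; ties keep the order sent by the browser.
    entries.sort(key=lambda e: (-e[0], e[1]))
    return [(tag, quality) for quality, _, tag in entries]


print(parse_accept_language("fr-CA, fr;q=0.8, en;q=0.5"))
# [('fr-CA', 1.0), ('fr', 0.8), ('en', 0.5)]
```

Note that the result is only a ranked list of language tags. Nothing in it states where the user lives or which regional formats they expect.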
The question is about whether this information is appropriate for determining the locale of the user.
The HTTP Accept-Language header was originally intended only to specify the user's language. However, since many applications need to know the locale of the user, common practice has used Accept-Language to determine this information. It is not a good idea to use the HTTP Accept-Language header alone to determine the locale of the user. If you use Accept-Language exclusively, you may lock the user into a set of choices that are not to their liking.
For a first contact, using the Accept-Language value to infer regional settings may be a good starting point, but be sure to allow users to change the language as needed and to specify their cultural settings more exactly if necessary. Store the results in a database or a cookie for later visits.
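The order of precedence described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: choose_language_tag, stored_choice and site_default are hypothetical names, the stored choice stands for whatever was saved in a cookie or database, and for brevity the sketch takes the first tag listed in the header rather than ranking by quality weight.

```python
def choose_language_tag(stored_choice, accept_language, site_default="en"):
    """Return (tag, source): an explicit user choice always beats the header."""
    if stored_choice:
        return stored_choice, "user"
    # Simplification: use the first tag listed and ignore any ";q=" weight.
    first = accept_language.split(",")[0].split(";")[0].strip()
    if first and first != "*":
        return first, "guess"
    return site_default, "default"


print(choose_language_tag(None, "de, en;q=0.7"))  # ('de', 'guess')
print(choose_language_tag("fr-CA", "de"))         # ('fr-CA', 'user')
print(choose_language_tag(None, ""))              # ('en', 'default')
```

Returning the source alongside the tag lets the page decide whether to show a prompt: a "guess" is a candidate for asking the user to confirm, while a "user" value should be left alone.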
Some of the potential issues include the following:
The header may or may not include a region. You might get de-DE, de-CH or de-AT to indicate German as spoken in Germany, Switzerland or Austria, respectively. On the other hand, you might only get de, indicating a preference for German. If you were planning to use the region to decide what currency to use, you are now in a bind. Your particular circumstances might allow you to make assumptions such as "Germany has 83 million people, Switzerland has 7 million but only 63% speak German, Austria has 8 million, so this user probably uses the Euro. If we're wrong we'll only offend 4.6% of our German-speaking customers, or just over 4 million people." Feel free to make such an assumption, if you can accept the risk. It's a lot simpler to ask the user for more information. Also, the math gets more difficult for Spanish or English, for instance.
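The arithmetic behind the 4.6% figure can be checked with a few lines. The population numbers are the rounded ones quoted above, and the calculation keeps the same simplification that everyone in Germany and Austria is counted as a German speaker.

```python
# Rounded figures quoted in the text above.
germany = 83_000_000
austria = 8_000_000
switzerland = 7_000_000

# 63% of the Swiss population speaks German (integer math avoids float noise).
swiss_german = switzerland * 63 // 100

total = germany + austria + swiss_german
share = swiss_german / total

print(f"{swiss_german:,} of {total:,} German speakers ({share:.1%})")
# 4,410,000 of 95,410,000 German speakers (4.6%)
```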
Using the Accept-Language header is also a good starting point for determining the language of the user, rather than the locale, but even then you must know its limitations and give the user some way to override the assumptions you make. Many of the potential issues listed above apply here too.
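As a rough illustration of language selection with a user override, the sketch below tries each preferred tag against the languages a site offers and falls back by dropping trailing subtags (fr-CA, then fr). This truncation fallback is a simplified approach loosely modeled on BCP 47 lookup matching, not a full implementation of it; pick_language and its parameters are hypothetical names.

```python
def pick_language(preferred_tags, supported, override=None, default="en"):
    """Return the best supported language; an explicit override is tried first."""
    supported_by_lower = {tag.lower(): tag for tag in supported}
    candidates = ([override] if override else []) + list(preferred_tags)
    for tag in candidates:
        subtags = tag.lower().split("-")
        # Try the full tag, then progressively shorter prefixes.
        while subtags:
            match = supported_by_lower.get("-".join(subtags))
            if match:
                return match
            subtags.pop()
    return default


site_languages = ["en", "fr", "de"]
print(pick_language(["fr-CA", "en"], site_languages))                 # fr
print(pick_language(["fr-CA", "en"], site_languages, override="de"))  # de
print(pick_language(["ja"], site_languages))                          # en
```

The override parameter is the important part: whatever the header suggests, a language the user has chosen explicitly is considered before any inferred preference.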
Even if you aren't intentionally making assumptions about locale or region, you should be aware that your programming environment may be making such assumptions on your behalf. Many Web servers, server-side scripting languages, and operating environments, by default, parse Accept-Language and infer their native locale objects from it. For example, .NET uses Accept-Language to determine the default CultureInfo, Java Servlet provides a getLocale() and getLocales() pair of methods that parse Accept-Language, and so forth. These objects are very useful in obtaining resources and other "language preference" material. They are less useful, as pointed out above, in determining many of the fine-grained attributes of users or in designing the international behavior of a site. A language preference of es-MX doesn't necessarily mean that a postal address form should be formatted or validated for Mexican addresses. The user might still live in the USA (or elsewhere).
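One way to avoid tying address or currency handling to the language tag is to store each setting separately and leave the non-language ones empty until the user supplies them. The sketch below is only an illustration of that design; UserSettings, settings_for_first_visit and missing_details are hypothetical names, not part of any framework mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserSettings:
    language: str                          # initial guess, e.g. from Accept-Language
    address_country: Optional[str] = None  # set only when the user tells us
    currency: Optional[str] = None         # set only when the user tells us


def settings_for_first_visit(language_tag):
    """Record the language guess without inferring country or currency."""
    return UserSettings(language=language_tag)


def missing_details(settings):
    """List the settings the site still has to ask the user about."""
    return [name for name in ("address_country", "currency")
            if getattr(settings, name) is None]


settings = settings_for_first_visit("es-MX")
print(missing_details(settings))   # ['address_country', 'currency']

settings.address_country = "US"    # the user lives in the USA
settings.currency = "USD"
print(settings.language, missing_details(settings))  # es-MX []
```

Here the es-MX language preference survives unchanged while the address form is driven by the country the user actually selected.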
Content first published 2003-09-17 12:15. Last substantive update 2003-09-17 12:15 GMT. This version 2011-06-03 18:07 GMT
For the history of document changes, search for qa-accept-lang-locales in the i18n blog.