Accept-Language used for locale setting

Question

Is it a good idea to use the HTTP Accept-Language header to determine the locale of the user?

Background

For a number of perfectly valid reasons some web applications would like to associate a locale with each user that visits the site. This locale would enable them to provide information in the local format. Some of this information is common to traditional software locales, such as:

In other cases, other information may be derived from the locale information plus additional knowledge, such as:

Since none of these are included in the HTTP protocol, many web developers have used the Accept-Language header to make inferences about the user's locale.

The Accept-Language header is information about the user's language preferences that is passed via HTTP when a document is requested. Mainstream browsers allow the user to modify these language preferences. The value itself is defined by BCP 47, typically as a two- or three-letter language code (e.g. fr for French), followed by optional subcodes representing such things as country (e.g. fr-CA represents French as spoken in Canada).
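To make the shape of the header concrete, here is a minimal sketch of how a server might read it. It relies on one detail not described above: HTTP lets each language tag carry an optional quality weight (q=0.8, for example), and a tag without a weight counts as q=1. The function name parse_accept_language is hypothetical, and the parsing is deliberately simplified (no wildcard handling, no tag validation).

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Return (language_tag, quality) pairs, most preferred first."""
    entries = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        # Split "fr;q=0.8" into the tag and its optional parameter.
        tag, _, params = item.partition(";")
        tag = tag.strip()
        params = params.strip()
        quality = 1.0  # a missing q parameter means full preference
        if params.lower().startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                continue  # skip entries with a malformed weight
        if tag and quality > 0:  # q=0 means "not acceptable"
            entries.append((tag, quality))
    # sorted() is stable, so tags with equal weight keep header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)
```

For the header "fr-CA, fr;q=0.8, en;q=0.5" this returns fr-CA first, then fr, then en. Note that the result is still only a list of language preferences; nothing in it states where the user lives.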

The question is about whether this information is appropriate for determining the locale of the user.

Answer

The HTTP Accept-Language header was originally only intended to specify the user's language. However, since many applications need to know the locale of the user, common practice has used Accept-Language to determine this information. It is not a good idea to use the HTTP Accept-Language header alone to determine the locale of the user. If you use Accept-Language exclusively, you may lock users into a set of choices that they do not want.

For a first contact, using the Accept-Language value to infer regional settings may be a good starting point, but be sure to allow users to change the language as needed and to specify their cultural settings more exactly if necessary. Store the results in a database or a cookie for later visits.
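The order of precedence described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: SUPPORTED and initial_language are hypothetical names, the stored value stands in for whatever was saved in a cookie or database, and the header is read in listed order with quality weights ignored for brevity.

```python
from typing import Optional

# Hypothetical set of languages the site offers.
SUPPORTED = ("en", "fr", "es")


def initial_language(stored: Optional[str],
                     accept_language: Optional[str],
                     default: str = "en") -> str:
    """Pick a language: saved choice first, header guess second."""
    # 1. An explicit choice saved on an earlier visit always wins.
    if stored in SUPPORTED:
        return stored
    # 2. First contact: use the header only as a starting guess.
    #    Simplification: tags are tried in listed order, q-values ignored.
    for item in (accept_language or "").split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]  # "fr-ca" -> "fr"
        if primary in SUPPORTED:
            return primary
    # 3. Nothing usable: fall back to the site default.
    return default
```

With no stored value, a header of "fr-CA,en;q=0.5" yields "fr". Once the user has picked "es" and that choice has been stored, the same header no longer matters.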

Some of the potential issues include the following:

By the way

Using the Accept-Language header is also a good starting point for determining the language of the user, rather than the locale, but even then you must know its limitations and give the user some way to override the assumptions you make. Many of the potential issues listed above apply here too.

Even if you aren't intentionally making assumptions about locale or region, you should be aware that your programming environment may be making such assumptions on your behalf. Many Web servers, server-side scripting languages, and operating environments, by default, parse Accept-Language and infer their native locale objects from it. For example, .NET can use Accept-Language to determine the default CultureInfo, the Java Servlet API provides the getLocale() and getLocales() methods that parse Accept-Language, and so forth. These objects are very useful in obtaining resources and other "language preference" material. They are less useful, as pointed out above, in determining many of the fine-grained attributes of users or in designing the international behavior of a site. A language preference of es-MX doesn't necessarily mean that a postal address form should be formatted or validated for Mexican addresses. The user might still live in the USA (or elsewhere).
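One way to avoid the es-MX trap is to keep the language preference and the user's country as separate settings, and never fill the second from the first. The sketch below is hypothetical throughout (UserSettings, settings_from_header_tag, and address_country are invented for illustration) and shows only the separation, not a full settings model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserSettings:
    language: str                  # drives which translations are shown
    country: Optional[str] = None  # drives address forms; user-supplied only


def settings_from_header_tag(tag: str) -> UserSettings:
    """Build first-contact settings from one Accept-Language tag."""
    # Deliberately do NOT copy the region subtag ("MX") into country.
    return UserSettings(language=tag)


def address_country(settings: UserSettings) -> Optional[str]:
    """Country to use for address formatting, or None if not yet known."""
    return settings.country
```

A visitor who sends es-MX gets Spanish content, but address_country stays None until the user states a country, at which point the form can be formatted for, say, the USA.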