Apache MultiViews language negotiation setup

Question

How do I use the MultiViews approach on an Apache Web server to automatically serve resources in the language requested by an HTTP request?

When a user agent requests a document from a server, information about the user's language preferences is typically passed to the server via the HTTP Accept-Language header. If the server stores versions of a page in more than one language, this HTTP information can be used to retrieve the page in the user's preferred language, if it is available. If there is only one version of a page on the server, that version will be retrieved.

The mechanism of choosing the relevant page to return to the user, based on the Accept-Language information in the HTTP request, is referred to as language negotiation.

In many cases, the initial user agent setting is okay. For example, if you have a Japanese version of a browser, the browser typically assumes that you prefer pages in Japanese, and sends this information to the server. Mainstream browsers allow you to modify these language preferences. For more information see Setting language preferences in a browser.

There are two different approaches to resource negotiation in Apache. The first involves the use of a type map file (i.e., a .var file) which names the files containing the variants explicitly; the second involves the use of a 'MultiViews' search, where the server does an implicit filename pattern match and chooses from among the results. The MultiViews option can also be set on a per directory basis, if the server administrator has allowed it.

This article addresses the question of how to set up the documents on an Apache server, using the MultiViews approach, so that language negotiation works.

Answer

Note, first, that language negotiation may or may not be the best approach for serving your multilingual content to your readers. In some cases, for example, localized sites may be best maintained by keeping the translated versions of a page in language-specific directories, or by mixing the two approaches. Which approach is appropriate, and when, will be the topic of a future article.

Setting up language negotiation involves

  1. developing a convention for naming the different language versions of your file,
  2. planning a fallback strategy to deal with requested languages that you don't support, and
  3. setting the appropriate server-side directives to make it all work.

There is more than one way to set up language negotiation on Apache servers, and the right approach will depend on the higher level settings and the privileges enabled or disabled by the server administrator. You may need to contact your server administrator to check which approach is available and what privileges you have.

Given the number of ways in which server setup can vary, it is difficult to provide a simple, definitive description of how to set up language negotiation. What follows is a description of a typical approach. We will assume that MultiViews is enabled and that the user can change certain directives in .htaccess files (small text files in the directory structure). Note that MultiViews has to be named explicitly in an Options directive to be enabled; it is not implied by Options All. AllowOverride also has to be properly set by the server administrator for the .htaccess file approach to work. You will need to check with your server administrator whether this approach is workable for your own circumstances.
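
As a rough sketch of that starting point, the following fragments show how MultiViews and the relevant overrides might be enabled. The directory path /var/www/html is a hypothetical document root chosen for illustration, and the override classes your administrator actually grants may differ.

```apache
# httpd.conf (maintained by the server administrator).
# "/var/www/html" is a hypothetical document root, not taken from this article.
<Directory "/var/www/html">
    # Options lets .htaccess files use the Options directive;
    # FileInfo lets them use AddLanguage and LanguagePriority.
    AllowOverride Options FileInfo
</Directory>

# .htaccess in the directory that holds the language variants.
# MultiViews must be named explicitly to be enabled.
Options +MultiViews
```

If the Options line is rejected or has no effect, the most likely cause is that the override has not been granted for that directory.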

We will use the following example: a document called example.html is available in 3 languages, English, French and German, and the default is English. Although the example limits itself to .html files, language negotiation can be applied to other types of file.

File naming

Each language version is indicated by a special extension, which can appear before or after the .html extension. In practice, there are some considerations to bear in mind with regard to the placement of this extension.

If you put the language extension last, the .html extension can be included or excluded when requesting the file. This strategy may, however, make it more difficult to read or edit the files if they are not on an Apache server (eg. read from another server, a CD, or a hard drive). This is because most editors and browsers look only at the last extension to determine the type of file and how to handle it. For our example the English, French and German files would be named, respectively, as follows:

example.html.en
example.html.fr
example.html.de

If you put the .html extension last you make it easier to read or edit the files if they are not on an Apache server, but to access the resource on the Apache server the name must always be typed in a browser address bar or identified in a hyperlink, etc., without the extensions (eg. <a href="example">...</a>). For our example, the filenames would be:

example.en.html
example.fr.html
example.de.html

The language labels shown can actually be any strings, as long as you define their meaning on the server (see below). The server is likely to already recognize a number of 2-letter language extensions from global settings in its httpd.conf file. We recommend that you use ISO language and country codes in the way defined by BCP 47, since this provides for greater consistency and easy recognition of language labels.

You should be careful with a few extensions. For example, the ISO code for Polish, pl, would give the extension .pl, which is typically used to indicate Perl scripts. You may therefore want to use pl-PL for Polish.
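
For instance, assuming files named with a pl-PL extension (such as a hypothetical example.html.pl-PL), a mapping along the following lines is one way to avoid a clash with .pl. This is a sketch only; check which extensions your server already defines before adding your own.

```apache
# Map the extension .pl-PL to Polish so that .pl stays free for Perl scripts.
AddLanguage pl .pl-PL
```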

Note that users can refer to a specific file by typing in the full file name, eg. example.fr.html will retrieve the French version, regardless of the user's language settings.

Server directives

You would normally use the AddLanguage directive to specify which file extension corresponds to which language requested in the incoming HTTP request.

For example, the following directive associates the extension .fr with French, so that files carrying that extension are served in response to requests for French content:

AddLanguage fr .fr

There are a number of places you can specify this. It may already be specified globally by an entry in the server's httpd.conf file, or a server administrator may add it there. Alternatively, a user uploading content might specify it in a file in the directory hierarchy. Such a file would typically be called .htaccess.
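
For the running example, a minimal .htaccess sketch might contain one mapping per language. The extensions .en and .de are assumed here by analogy with .fr, and any of these lines can be left out if the server's global configuration already defines them.

```apache
# .htaccess: associate each language extension with its content language.
AddLanguage en .en
AddLanguage fr .fr
AddLanguage de .de
```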

Default files

It is important to specify a default file, since a user who doesn't have any of English, French or German in their list of preferred languages (say, for example, a Spanish user), or whose user agent doesn't support content negotiation, would otherwise receive an HTTP 406 result (NOT ACCEPTABLE) rather than a file.

The best way to specify a default file will vary, depending on whether your language extension precedes or follows the .html extension, and on what version of Apache you are using. In the examples below we will assume that the default is English (often the best choice for a default, given how widely English is understood).

Specifying default files in Apache 2.0.30 and above

On versions 2.0.30 and above of the Apache server you can specify a default file quite cleanly using the ForceLanguagePriority and LanguagePriority directives (see the Apache documentation for detailed descriptions of how these directives work).

Given our example above, we could set the default to be English using the following two lines:

LanguagePriority en fr de
ForceLanguagePriority Fallback

Now if a Spanish user were to request a Spanish document in the context of our example they would be served the English document instead, ie. the first item in the LanguagePriority list.
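
The ForceLanguagePriority directive also accepts a Prefer keyword, which is not used in the example above. As a sketch of that variant: with Prefer, the server additionally uses the LanguagePriority list to break ties between equally acceptable variants, rather than returning an HTTP 300 result (MULTIPLE CHOICES).

```apache
# .htaccess: order of preference used when the request does not decide.
LanguagePriority en fr de

# Prefer:   pick the first listed language when several variants tie.
# Fallback: pick the first listed language when no variant is acceptable,
#           instead of returning HTTP 406.
ForceLanguagePriority Prefer Fallback
```

Whether you need Prefer depends on how often your users send language preferences with equal weights; Fallback alone is enough to avoid the 406 result described above.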

Specifying default files in Apache 1.3.4 to 2.0.29

If your server version is earlier than 2.0.30 you will have to do a lot more work to specify the default file, since the ForceLanguagePriority directive is not available. Also, the approach will depend on whether the language extension comes before or after the .html extension.

First let's look at the case where your language extension comes before the .html extension (ie. the resource must always be typed in a browser address bar or identified in a hyperlink, etc., without the .html extension). To set the default to English you can create a copy of the English file in the directory with the following name:

example.html

If your language extension comes after the .html (ie. the .html extension can be included or excluded when requesting the file) you will need to name your copy of the English file:

example.html.html

The default file name ends in .html.html because if the default file were named example.html and the user requested the file as example.html no content negotiation would ever take place (because an exact match can be found).

By the way

If there is only one file with a given name in a directory and it has no language extension, that file will be served regardless of the client's language preference.

This technique can be applied to other types of file besides HTML. We just used an HTML example here because it is a common requirement.