This document recommends how to mark the primary language(s) in an HTML document. It can be considered a clarification of the HTML 4.0 Specification [HTML40]; in particular, it does not contradict that specification. The objective is to establish a best practice in this field, where at present there is some confusion.
The lang attribute specifies the natural language. This document is mostly concerned with how to specify the primary language(s) (there could be more than one) and the base language (there is only one) in HTML documents.
Some documents are bilingual, and few are trilingual or n-lingual. Bilingual documents are usually short, i.e., a few paragraphs. N-lingual documents are usually very short: a few sentences.
The main reason for the existence of n-lingual documents is political; i.e., in certain situations it is not politically correct to assume a base language. A common practice is to have one small document that is a menu of languages, as on the Europa server of the European Commission [EUR].
Another approach to choosing the language is to set the client (e.g., the browser) to the preferred language(s). The client transmits the language(s) in the Accept-Language field of HTTP, and the server responds with an appropriate document. For example, the Spanish version will be presented if the language preferences (in the browser) are Spanish and French and the document is available (on the server) in French, German and Spanish.
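The selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behaviour of any particular server: the function names `parse_accept_language` and `choose_language` are hypothetical, the `*` wildcard is ignored, and a regional tag such as `es-MX` is assumed to fall back to its primary language `es`.

```python
from typing import List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return language tags ordered by descending q-value.

    Tags without an explicit q-value count as q=1.0; ties keep header order.
    """
    entries = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        if quality > 0:
            entries.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(entries)]


def choose_language(accept_language: str, available: List[str]) -> Optional[str]:
    """Pick the first preferred language for which a version is available."""
    available_lower = {code.lower() for code in available}
    for tag in parse_accept_language(accept_language):
        if tag in available_lower:
            return tag
        primary = tag.split("-")[0]  # assumed fallback: "es-mx" -> "es"
        if primary in available_lower:
            return primary
    return None  # no match; a real server would apply its own default


# Preferences: Spanish, then French. Available: French, German, Spanish.
print(choose_language("es, fr;q=0.8", ["fr", "de", "es"]))  # es
```

With the preferences and available versions from the example in the text, the sketch selects the Spanish version.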
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Language" Content="fr">
<TITLE>Mon doc</TITLE>
</HEAD>
<BODY>
Je suis un Berlinois.
</BODY>
</HTML>
The value of the Content attribute of the META element is the same as the value of the Content-Language header in HTTP: a comma-separated list of language codes.
<META HTTP-EQUIV="Content-Language" Content="fr,en">
These language codes are the same as those used in the lang attribute of some HTML elements.
The language codes are defined in [RFC1766]. See also 8.1.1 Language codes of the HTML 4.0 Specification [HTML40] and [RFC2068].
The order of the languages in the Content-Language is significant. The first language in the list is the base language of the document; i.e., any text not re-specified with the lang attribute is in the base language.
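As a rough illustration of the ordering rule, the following Python sketch splits a Content-Language value into its codes, takes the first one as the base language, and lets an element's own lang value take precedence. The helper names (`parse_content_language`, `base_language`, `effective_language`) are hypothetical and exist only for this example; validation of the codes against [RFC1766] is omitted.

```python
from typing import List, Optional


def parse_content_language(value: str) -> List[str]:
    """Split a comma-separated Content-Language value into language codes."""
    return [code.strip().lower() for code in value.split(",") if code.strip()]


def base_language(value: str) -> Optional[str]:
    """The first code in the list is the base language of the document."""
    codes = parse_content_language(value)
    return codes[0] if codes else None


def effective_language(content_language: str,
                       element_lang: Optional[str] = None) -> Optional[str]:
    """Language of a piece of text: its own lang value if one is given,
    otherwise the base language of the document."""
    if element_lang:
        return element_lang.strip().lower()
    return base_language(content_language)


print(parse_content_language("fr,en"))                 # ['fr', 'en']
print(effective_language("fr,en"))                     # fr
print(effective_language("fr,en", element_lang="en"))  # en
```

Under these assumptions, text in a document marked "fr,en" is treated as French unless the enclosing element says otherwise.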
The META element should not list more than one language for documents that contain only minor fragments in other languages. The rules for classifying a document as monolingual, bilingual or n-lingual are the same as for printed books.
The reasons for recommending META as opposed to the HTML element with the lang attribute are:
The lang attribute in the HTML element overrides the language specified in the META element.
The inheritance rules are in 8.1.2 Language information and text direction of the HTML 4.0 Specification [HTML40].
In particular, thanks to