Intended audience: XHTML/HTML coders (using editors or scripting), script developers (PHP, JSP, etc.), CSS coders, and anyone who wants to use language information to apply CSS styles to markup.
What is the most appropriate way to associate CSS styles with text in a particular language in a multilingual XHTML/HTML document?
Presentation styles are commonly used to control changes in fonts, font sizes and line heights when language changes occur in the document. This can be particularly useful when dealing with Simplified versus Traditional Chinese, where users tend to prefer different fonts, even though they may be using many of the same characters. It can also be useful to better harmonise the look of mixed, script-specific fonts, such as when mixing Arabic and Latin fonts.
This page looks at available options for doing this most effectively.
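There are four ways to apply different styles to different languages in a multilingual document using CSS2. In order of preference, they are: the :lang() pseudo-class selector, the [lang|="..."] selector that matches the beginning of an attribute value, the [lang="..."] selector that matches an attribute value exactly, and a generic class or id selector.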
At the time of writing, Firefox 1.0, Mozilla 1.7.2, Netscape 8.0 and Opera 8.0.2* support the :lang(), [lang|="..."] and [lang="..."] selectors. Unfortunately, interoperable use of these three selectors is hampered by the fact that Internet Explorer doesn't support them. This means that a large proportion of a general audience will not see the intended results of your styling if you rely on them. To reach that audience you need to use a generic class or id selector on every tag you want to style. For more information about support for these selectors see the test results.
The remainder of this page explains, with examples, how the use of these selectors differs.
A significant difference between :lang and the other methods is that it recognises the language of element content even
if the language is declared outside the element in question.
Suppose, for example, that in a future English document containing Japanese text you wanted to style emphasised Japanese text using special Asian CSS3 properties, rather than italicisation (which doesn't always work well with the complex characters of Japanese). You might have the following rules in your stylesheet:
em { font-style: italic; }
em:lang(ja) { font-style: normal; font-emphasize: dot before; }
Now assume that you have the following content, that the user agent supports :lang, and that the html tag
states that this is an English document.
<p>This is <em>English</em>, but <span lang="ja"
xml:lang="ja">これは<em>日本語</em>です。</span></p>
You would expect to see the emphasised English word italicised, but the emphasised Japanese word in regular text with a small dot above each character.
The important point to be made in this section is that this would not be possible using the [lang|="..."] or
[lang="..."] selectors. For those to work you would have to declare the language explicitly on each Japanese em tag.
This is a significant difference between the usefulness of these different selectors.
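As a rough sketch of the workaround this implies, an attribute selector only takes effect once each em element carries its own language declaration. The markup shown in the comment is a hypothetical variation on the example above. The text-emphasis properties are an assumption: they are the names used in the later CSS Text Decoration module, swapped in here for the draft font-emphasize property.

```css
/* Hypothetical markup: the language is repeated on the inner em so that
   the attribute selector can see it.
   <span lang="ja" xml:lang="ja">これは<em lang="ja" xml:lang="ja">日本語</em>です。</span> */
em { font-style: italic; }

/* Matches only em elements that carry their own lang attribute. */
em[lang|="ja"] {
  font-style: normal;
  text-emphasis: dot;                 /* assumption: stands in for font-emphasize */
  text-emphasis-position: over right; /* dots above each character */
}
```

With :lang(ja) the extra attributes on em would be unnecessary, because the language declared on the span is taken into account.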
:lang() pseudo-class selector
The XHTML fragment:
<p>It is polite to welcome people in their own language:</p>
<ul>
<li xml:lang="zh-Hans" lang="zh-Hans">欢迎</li>
<li xml:lang="zh-Hant" lang="zh-Hant">歡迎</li>
<li xml:lang="el" lang="el">Καλοσωρίσατε</li>
<li xml:lang="ar" lang="ar">اهلا وسهلا</li>
<li xml:lang="ru" lang="ru">Добро пожаловать</li>
<li xml:lang="din" lang="din">Kudual</li>
</ul>
could have the following styling:
body {font-family: "Times New Roman", serif;}
:lang(ar) {font-family: "Traditional Arabic", serif; font-size: 1.2em;}
:lang(zh-Hant) {font-family: PMingLiU,MingLiU, serif;}
:lang(zh-Hans) {font-family: SimSun-18030,SimHei, serif;}
:lang(din) {font-family: "Doulos SIL", serif;}
Note that the Greek and Russian use the styling set for the body element.
This would be the ideal way to style language fragments, because it is the only selector that can apply styling to the content of an element when the language of that content is declared earlier in a page.
The rule for :lang(zh) would match elements with a language value of zh. It would also match more specific
language specifications such as zh-Hant, zh-Hans and zh-HK.
The selector :lang(zh-Hant) will only match elements that have a language value of zh-Hant or have inherited
that language value. If the CSS rule specified :lang(zh-HK), the rule would not match our sample paragraph.
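One way to use this prefix matching, sketched here under the assumption that you want a generic fallback for any Chinese content plus script-specific overrides, is to place the broader rule first. The choice of a plain serif fallback is illustrative only.

```css
/* Generic fallback: matches zh and any longer tag, such as zh-HK. */
:lang(zh) { font-family: serif; }

/* Same specificity as the rule above, so these win only because they come later. */
:lang(zh-Hant) { font-family: PMingLiU, MingLiU, serif; }
:lang(zh-Hans) { font-family: SimSun-18030, SimHei, serif; }
```

Content tagged zh-HK would pick up only the first rule, while zh-Hant and zh-Hans content would match two rules and take the font list from the later one.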
[lang|="..."] selector that matches the beginning of a value of an attribute
For the earlier example of markup, the stylesheet could be written as:
body {font-family: "Times New Roman", serif;}
*[lang|="ar"] {font-family: "Traditional Arabic", serif; font-size: 1.2em;}
*[lang|="zh-Hant"] {font-family: PMingLiU,MingLiU, serif;}
*[lang|="zh-Hans"] {font-family: SimSun-18030,SimHei, serif;}
*[lang|="din"] {font-family: "Doulos SIL", serif;}
The selectors for Chinese use specific values, and will only match those values or longer values that begin with them (such as zh-Hant-TW), while the other language attribute
selectors are more generic. For instance [lang|="en"] will successfully match en-NZ.
This selector, and the next one, will work as long as you declare the language of the element you want to style in the element itself, unlike :lang().
There is a significant difference between [lang="en"] and [lang|="en"]. The first language selector will
only match elements with a language attribute equal to en, while the second selector will match any element with a language attribute
equal to en or starting with en followed by a hyphen. Therefore the second selector would also match en-US, en-HK or en-CA. In terms of the
language values matched, :lang(en) is equivalent to [lang|="en"].
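The contrast between the two attribute selectors can be sketched as follows. The declarations are placeholders chosen only to make the two rules distinguishable.

```css
/* Exact match: applies to lang="en" only. */
[lang="en"] { color: navy; }

/* Hyphen-separated prefix match: applies to lang="en", lang="en-US" and
   lang="en-CA", but not to a value such as "eng". */
[lang|="en"] { quotes: "\201C" "\201D" "\2018" "\2019"; }
```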
[lang="..."] selector that matches the value of an attribute
The third method of specifying rules is to use an attribute selector that exactly matches the attribute value.
For the earlier example of markup, the stylesheet could be written as:
body {font-family: "Times New Roman", serif;}
*[lang="ar"] {font-family: "Traditional Arabic", serif; font-size: 1.2em;}
*[lang="zh-Hant"] {font-family: PMingLiU,MingLiU, serif;}
*[lang="zh-Hans"] {font-family: SimSun-18030,SimHei, serif;}
*[lang="din"] {font-family: "Doulos SIL", serif;}
Note that with this approach a selector for en will not match en-AU. The match has to be exact.
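If exact matching is all you have to work with, one possible approach is to group a selector for every variant you expect to meet. The list of variants and the font below are illustrative assumptions, not a recommendation.

```css
/* Each variant has to be named explicitly, because none is implied by "en". */
[lang="en"],
[lang="en-AU"],
[lang="en-US"] { font-family: Georgia, serif; }
```

Any variant missing from the group, such as en-NZ here, would simply go unstyled.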
Currently, the best supported method is to use an ordinary CSS class or id selector.
This works with most browsers that support CSS. The disadvantage is that adding the class attributes takes up time and bandwidth.
For the example above, this would require us to change the XHTML code by adding class attributes as follows:
<p>It is polite to welcome people in their own language:</p>
<ul>
<li class="zhs" xml:lang="zh-Hans" lang="zh-Hans">欢迎</li>
<li class="zht" xml:lang="zh-Hant" lang="zh-Hant">歡迎</li>
<li xml:lang="el" lang="el">Καλοσωρίσατε</li>
<li class="ar" xml:lang="ar" lang="ar">اهلا وسهلا</li>
<li xml:lang="ru" lang="ru">Добро пожаловать</li>
<li class="din" xml:lang="din" lang="din">Kudual</li>
</ul>
We could then have the following styling:
body {font-family: "Times New Roman", serif; }
.ar {font-family: "Traditional Arabic", serif; font-size: 1.2em;}
.zht {font-family: PMingLiU, MingLiU, serif;}
.zhs {font-family: SimSun-18030, SimHei, serif;}
.din {font-family: "Doulos SIL", serif;}
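If you want to keep class-based rules as a fallback while also providing :lang() rules, a cautious sketch is to write them as separate rules rather than as one grouped selector. A conforming user agent that does not recognise one selector in a group is expected to ignore the whole rule, so grouping could cancel the fallback.

```css
/* Fallback for user agents without :lang() support; relies on the class attribute. */
.ar { font-family: "Traditional Arabic", serif; }

/* Separate rule for user agents that do support :lang(). */
:lang(ar) { font-family: "Traditional Arabic", serif; }
```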
I have used the language codes "zh-Hant" and "zh-Hans". These language codes do not represent specific languages. "zh-Hant" indicates Chinese written in Traditional Chinese script. Similarly, "zh-Hans" represents Chinese written in Simplified Chinese script. This could refer to Mandarin or many other Chinese languages.
Until recently the codes "zh-TW" and "zh-CN" were used to indicate Traditional and Simplified versions of Chinese writing, respectively. In reality, "zh-TW" should indicate Chinese spoken in Taiwan, although there is more than one Chinese language spoken in Taiwan. Similarly "zh-CN" represents Chinese spoken in China (PRC). This could refer to Mandarin or any other Chinese language.
Some of the modern web browsers will use the presence of the language tags "zh-CN" and "zh-TW" to select the default fonts to display the text when the web page designer does not indicate a font (see the test results for more information).
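Where existing content still uses the older region-based tags, a stylesheet could cover both conventions, as in the sketch below. It assumes that content tagged zh-TW is written in Traditional script and content tagged zh-CN in Simplified script, which, as noted above, is not strictly what those tags mean.

```css
/* Traditional Chinese: script-based tag plus the older region-based tag. */
:lang(zh-Hant), :lang(zh-TW) { font-family: PMingLiU, MingLiU, serif; }

/* Simplified Chinese: script-based tag plus the older region-based tag. */
:lang(zh-Hans), :lang(zh-CN) { font-family: SimSun-18030, SimHei, serif; }
```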
If you need to use language tags to differentiate between Chinese languages, the IANA language subtag registry has more precise language codes for a range of Chinese languages.
Content first published 2003-08-07. Last substantive update 2005-08-03 14:27 GMT. This version 2007-10-31 15:41 GMT
For the history of document changes, search for qa-css-lang in the i18n blog.
Copyright © 2003-2006 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.