Authoring Techniques for XHTML & HTML Internationalization: Outline View

This document provides techniques, in outline form, for content authors working with XHTML, HTML and CSS. Click on the text of a technique to link to a detailed explanation. See the notes on document use for additional information.

Language

Declaring the text processing language

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm.

Always declare the default text processing language of the page, using the html tag, unless there is more than one primary language. [All user agents: no issues]
Use the lang and/or xml:lang attributes around text to indicate any changes in language. [All user agents: no issues]
Do not use Content-Language to declare the default text processing language, and do not use language attributes to declare the primary language metadata. [All user agents: no issues]
Do not declare the language of a document in the body tag. [All user agents: no issues]
For HTML, use the lang attribute only; for XHTML 1.0 served as text/html, use both the lang and xml:lang attributes; and for XHTML served as XML, use the xml:lang attribute only. [All user agents: no issues]
If the text in attribute values and element content is in different languages, consider using a Russian doll approach. [All user agents: no issues]
For documents with multiple primary languages, decide whether you want to declare a single text processing language in the html tag, or leave it undefined. [All user agents: no issues]
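
As an illustration of the techniques in this section, the following sketch shows an XHTML 1.0 page served as text/html. It declares English as the default text processing language on the html tag, marks an inline change to French, and uses nesting (the "Russian doll" approach) where an attribute value and the element content are in different languages. The page text and the link target es/index.html are hypothetical.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- Default text processing language declared on the html tag, not on body -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Language declaration example</title>
</head>
<body>
  <!-- Inline change of language -->
  <p>The French word for "cat" is <span lang="fr" xml:lang="fr">chat</span>.</p>
  <!-- Russian doll: the title value is English, the link text is Spanish -->
  <p><span title="Spanish"><a href="es/index.html" lang="es" xml:lang="es">Español</a></span></p>
</body>
</html>
```

The outer span carries the English title, so the inner a element can be labelled as Spanish without that label also applying to the title text.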

Specifying primary language metadata

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm.

Consider using a Content-Language declaration in the HTTP header or a meta tag to declare metadata about the primary language of a document. [All user agents: no issues]
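
A minimal sketch of such a declaration, assuming a hypothetical document whose intended audience reads English and French. The meta element goes in the head of the page; the HTTP header shown in the comment is the server-side equivalent.

```html
<!-- Equivalent HTTP response header, set by the server:
     Content-Language: en, fr -->
<meta http-equiv="Content-Language" content="en, fr" />
```

This is metadata about the document as a whole. It does not replace the lang or xml:lang attributes, which declare the text processing language.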

Choosing language values

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm.

Follow the guidelines in RFC 3066 or its successors for language attribute values. [All user agents: no issues]
Use the two-letter ISO 639 codes for the language code where there are both 2- and 3-letter codes. [All user agents: no issues]
Consider using the codes zh-Hans and zh-Hant to refer to Simplified and Traditional Chinese, respectively. [IEw: issues still to date; FFox: issues with base version, but not latest; Moz: issues still to date; Op: no issues; NN: issues still to date; Sa: ?; IEm: issues still to date]
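
The following fragment sketches some language attribute values of the kind these techniques describe. The sample text is illustrative only.

```html
<!-- Two-letter code "fr" rather than the three-letter "fra" -->
<p lang="fr">Bienvenue</p>
<!-- Language code plus a country subtag -->
<p lang="en-GB">colour</p>
<!-- Simplified and Traditional Chinese -->
<p lang="zh-Hans">简体中文</p>
<p lang="zh-Hant">繁體中文</p>
```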

Identifying in-document language changes

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm.

Use the lang and/or xml:lang attributes around text to indicate any changes in language. [All user agents: no issues]
For HTML, use the lang attribute only; for XHTML 1.0 served as text/html, use both the lang and xml:lang attributes; and for XHTML served as XML, use the xml:lang attribute only. [All user agents: no issues]
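
The three cases can be sketched side by side. Each paragraph below marks the same German phrase; only the attributes differ according to how the document is written and served.

```html
<!-- HTML: lang only -->
<p>She said <span lang="de">Guten Tag</span>.</p>

<!-- XHTML 1.0 served as text/html: both lang and xml:lang -->
<p>She said <span lang="de" xml:lang="de">Guten Tag</span>.</p>

<!-- XHTML served as XML (for example application/xhtml+xml): xml:lang only -->
<p>She said <span xml:lang="de">Guten Tag</span>.</p>
```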

Indicating the language of a link destination

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm.

When pointing to a resource in another language, consider the pros and cons of indicating the language of the target document. [IEw, FFox, Moz, Op, NN: no issues; Sa: ?; IEm: no issues]
When pointing to a resource in another language, consider the pros and cons of using CSS to indicate the language, based on the value of the hreflang attribute of the a element. [IEw: issues still to date; FFox, Moz, Op, NN: no issues; Sa: ?; IEm: issues still to date]
Do not use flag icons to indicate languages. [All user agents: no issues]
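
One way to generate a language marker from hreflang is sketched below, using a CSS attribute selector with generated content. The link target fr/index.html and the bracketed marker format are assumptions made for the example. User agents that do not support attribute selectors or generated content will simply show the link without the marker.

```html
<style type="text/css">
  /* Append the hreflang value in brackets after any link that carries one */
  a[hreflang]:after { content: " [" attr(hreflang) "]"; }
</style>

<p>See also the <a href="fr/index.html" hreflang="fr">French version</a>.</p>
```

The marker is text derived from the language code rather than a flag icon.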

Character sets, character encodings and entities

Choosing a page encoding

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Choose UTF-8 or another Unicode encoding for all content.
If you don't use a Unicode encoding, select an encoding that best supports the languages and characters to be included in the page text. [Editorial note: What does this mean? Does it mean the encoding that maximizes the opportunity to represent characters directly and minimizes the need to represent them by markup means such as character escapes? Does it include the idea that you should choose the most commonly used encoding for a region?]
Check that user agents (all agents that must render the page) adequately support the page encoding that you have selected. If not, you might need to use a more widely supported encoding to achieve an adequate degree of user agent support. [Editorial note: Couldn't this be rolled into the previous technique?]
Use character sets and encodings that will be accessible and common to your users.

Specifying a page encoding

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Always declare the encoding of your documents.
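
A minimal sketch of an in-document encoding declaration, assuming the page is actually saved as UTF-8. The comments show the corresponding HTTP header and, for XHTML served as XML, the XML declaration.

```html
<!-- HTTP header, if you control the server:
     Content-Type: text/html; charset=utf-8 -->
<!-- For XHTML served as XML, the first line of the file can instead carry:
     ?xml version="1.0" encoding="utf-8"? (written as an XML declaration) -->
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  <title>Encoding declaration example</title>
</head>
```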

Representing characters using escapes

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Use escapes for characters only in exceptional circumstances. Create pages using an encoding that supports all the characters you need.
Ensure that numbers in numeric character references always reference a Unicode code point.
When using escapes, use the hexadecimal form.
Use numeric character references rather than entities if your document is to be processed by unknown XML tools or converted to XML.
If you use escapes to represent characters in a style attribute, consider using CSS escapes rather than NCRs or entities.
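
The different ways of representing one character, é (U+00E9), can be compared in a short sketch. The font name in the last line is hypothetical and is there only to show a CSS escape inside a style attribute.

```html
<!-- Preferred: type the character directly in a page encoded as UTF-8 -->
<p>café</p>
<!-- Hexadecimal numeric character reference for U+00E9 -->
<p>caf&#xE9;</p>
<!-- Decimal form of the same reference -->
<p>caf&#233;</p>
<!-- Named entity; XML tools that do not read the DTD may not recognize it -->
<p>caf&eacute;</p>
<!-- CSS escape inside a style attribute (hypothetical font name) -->
<p style="font-family: 'Caf\E9', serif">Sample text</p>
```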

Bidirectional text

Enabling easy localization for RTL scripts

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Whenever possible, avoid HTML attributes with values of right and left. Use CSS in a linked stylesheet instead.
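
A sketch of the difference, assuming hypothetical stylesheet files named ltr.css and rtl.css and a hypothetical class name. With this arrangement a localizer can swap the stylesheet rather than edit every page.

```html
<!-- Avoid: a direction-specific attribute value in the markup -->
<!-- <p align="right">Last updated 2005</p> -->

<!-- Prefer: in the head, link to a stylesheet -->
<link rel="stylesheet" type="text/css" href="ltr.css" />

<!-- and in the body, use a class -->
<p class="updated">Last updated 2005</p>

<!-- ltr.css contains:  .updated { text-align: right; } -->
<!-- rtl.css, used for the right-to-left version, contains:  .updated { text-align: left; } -->
```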

General use of bidi markup

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Do not use CSS styling to control directionality in XHTML/HTML. Use markup.
Only use bidi markup where it is needed.

Basic setup for pages in RTL scripts

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Add dir="rtl" to the html tag any time the overall document direction is right-to-left.
Do not add dir="rtl" to the body tag.
Use logical ordering, not visual ordering, for Hebrew.
If using an ISO character encoding for Hebrew, choose iso-8859-8-i and use logical ordering.
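
A minimal sketch of a right-to-left page set up along these lines, here an HTML page in Hebrew encoded as UTF-8. The text is stored in logical order (the order in which it is typed and read), and the direction is set once on the html tag. The sample content is illustrative.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- dir="rtl" goes on the html tag; the body tag carries no dir attribute -->
<html lang="he" dir="rtl">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>דוגמה</title>
</head>
<body>
  <p>שלום עולם</p>
</body>
</html>
```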

Changing the directionality of a block element

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Add the dir attribute to a block-level element (only) to change its directionality.
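
For instance, a Hebrew quotation inside an otherwise left-to-right page could be marked up as in this sketch, with dir on the block-level element only. The sample text is illustrative.

```html
<p>This paragraph inherits the left-to-right direction of the page.</p>
<!-- Only this block changes direction; its contents inherit it -->
<blockquote dir="rtl" lang="he">
  <p>שלום עולם</p>
</blockquote>
<p>This paragraph is left-to-right again.</p>
```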

Mixing text direction inline

User agents assessed: IEw, FFox, Moz, Op, NN, Sa, IEm (no support status recorded).

Use a Unicode right-to-left mark (RLM) or left-to-right mark (LRM) to make neutral characters such as punctuation and spaces appear in the right place when they fall between different directional runs.
Use a Unicode right-to-left mark (RLM) or left-to-right mark (LRM) to correctly order separate runs of same direction text separated by neutral characters such as punctuation and spaces.
Use the dir attribute on an inline element to resolve problems with nested directional runs.
For attribute text or element text that allows no internal markup, use Unicode control characters for bidirectional control.
Do not use Unicode control characters for bidirectional control if markup is available.
Do not leave white space at the end of inline elements that mark a directional boundary.
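
The inline techniques above can be sketched in a left-to-right paragraph that contains Hebrew. The sample sentences and the link target he/index.html are hypothetical; the control characters are written as hexadecimal numeric character references (U+200F is RLM, U+202B and U+202C are the right-to-left embedding and pop directional formatting characters).

```html
<!-- RLM after the exclamation mark keeps it with the Hebrew text -->
<p>The sign said "שלום!&#x200F;" in Hebrew.</p>

<!-- dir on an inline element for a Hebrew phrase that contains Latin text;
     the space after the phrase stays outside the span -->
<p>The title is <span dir="rtl" lang="he">שלום W3C עולם</span> in Hebrew.</p>

<!-- Attribute text allows no markup, so control characters are used instead -->
<p><a href="he/index.html" title="&#x202B;שלום W3C עולם&#x202C;">Hebrew version</a></p>
```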

Notes on document use

This document is in early draft form. It is undergoing constant and frequent modification and does not yet contain accurate content.

Use the icons to the right of each section header to view the full text or view resources for a given section.

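The support status shown with each technique indicates whether the technique is supported by a given user agent. The values used in this outline are: No issues; Issues with base version, but not latest; Issues still to date; and a question mark (?).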

Editor: Richard Ishida.

Content created 15 March, 2004. Last update 2005-02-07 17:35