Script direction and languages

Intended audience: anyone who wants to know which scripts are associated with right-to-left text.

Updated 2003-09-15 15:10

Question

What directions are commonly localized languages written in?

Answer

The frequently asked questions below cover the direction of written text. Each question is followed by its answer.

What is a script?

The Unicode Consortium's glossary uses the following definition:

Script: a collection of symbols used to represent text in one or more writing systems.

Microsoft offers the following definition on their globalization web site:

Script: A collection of characters for displaying written text, all of which have a common characteristic that justifies their consideration as a distinct set. One script can be used for several different languages (for example, Latin script, which covers all of Western Europe). Some written languages require multiple scripts (for example, Japanese, which requires at least three scripts: the hiragana and katakana syllabaries and the kanji ideographs imported from China). This sense of the word "script" has nothing to do with programming scripts such as Perl or Visual Basic Scripting Edition (VBScript).

Why is text directionality important to web design?

Knowing the directionality of text, based on the script(s) to be used, is important to web designers and authors because right-to-left text can be more complicated to work with (for beginners), and because it affects the organization and directionality of the page layout. Knowing the writing direction therefore helps in estimating the work involved in creating web pages in a new language.

Which languages are written right-to-left (RTL)?

Languages don't have a direction. Scripts have a writing direction, so a language written in a particular script is written in the direction of that script.

Languages can be written in more than one script. For example, Azeri can be written in any of the Latin, Cyrillic, or Arabic scripts. When written in Latin or Cyrillic scripts, Azeri is written left-to-right (LTR). When written in the Arabic script, it is written right-to-left.
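The distinction can be sketched as a lookup keyed on the script rather than the language. This is only an illustrative sketch: the keys are ISO 15924 script codes and ISO 639-1 language codes, the tables are deliberately tiny, and the function name `text_direction` is an assumption introduced here.

```python
# Direction is a property of the script, not of the language.
# Keys are ISO 15924 script codes; the selection is illustrative only.
SCRIPT_DIRECTION = {
    "Latn": "ltr",  # Latin
    "Cyrl": "ltr",  # Cyrillic
    "Arab": "rtl",  # Arabic
    "Hebr": "rtl",  # Hebrew
}

# Hypothetical table of the scripts a language may be written in.
LANGUAGE_SCRIPTS = {
    "az": ["Latn", "Cyrl", "Arab"],  # Azeri
    "he": ["Hebr"],                  # Hebrew
}


def text_direction(language: str, script: str) -> str:
    """Return 'ltr' or 'rtl' for a language written in the given script."""
    if script not in LANGUAGE_SCRIPTS.get(language, []):
        raise ValueError(f"{language!r} is not listed with script {script!r}")
    return SCRIPT_DIRECTION[script]


print(text_direction("az", "Latn"))  # ltr
print(text_direction("az", "Arab"))  # rtl
```

The same language code yields different directions depending on the script passed in, which is why the direction cannot be stored per language.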

Which script should I use?

If a language can be written in more than one script, which script should a web designer or localizer use, or should the text be provided in all scripts?

The answer will depend on your target audience. The script may change for different countries or regions. The script may also change by legislation or with changes in government policy. For example, to reach the Azeri-speaking population in Iran, you would use Arabic script. From the late 1930s, Cyrillic was the script of choice in Azerbaijan itself, and it became policy in 1940. After the fall of the Soviet Union, a gradual switch to Latin began in 1991, and Latin became mandatory for official uses in 2001. However, for unofficial uses you might want to use Cyrillic for older audiences and Latin for younger audiences, and most likely both to reach the general Azerbaijani population. If you want to reach all Azeri speakers, you would use all three scripts. (Note that there might be terminology and other differences among Azeri speakers in different countries, just as there are differences between English or French speakers in different countries.)
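As a rough sketch of how the Azeri example might translate into content planning, the function below returns BCP 47 language tags with script subtags for a given audience. The function name, its parameters, and the fallback for unknown audiences are assumptions made for illustration. They are not recommendations from this article.

```python
from typing import List


def azeri_language_tags(country: str, include_older_readers: bool = False) -> List[str]:
    """Suggest BCP 47 tags for Azeri content, following the example in the text.

    `country` is an ISO 3166-1 alpha-2 code. The rules are a hypothetical
    simplification of the paragraph above.
    """
    if country == "IR":
        # Azeri speakers in Iran: Arabic script.
        return ["az-Arab-IR"]
    if country == "AZ":
        # Azerbaijan: Latin for official use; Cyrillic may suit older readers.
        tags = ["az-Latn-AZ"]
        if include_older_readers:
            tags.append("az-Cyrl-AZ")
        return tags
    # Unknown audience: cover all three scripts, without a region subtag.
    return ["az-Latn", "az-Cyrl", "az-Arab"]


print(azeri_language_tags("IR"))
print(azeri_language_tags("AZ", include_older_readers=True))
```

Carrying the script in the tag keeps the script choice explicit, so the direction can be derived from the script subtag instead of being guessed from the language.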

You also should be aware that your choice of script may have political, religious, demographic or cultural overtones. In countries where the language of higher learning was Russian, Cyrillic will be used by educated people. Latin is associated with Pan-Turkic movements, and more generally can indicate Western-tending movements. Arabic script has associations with Islamist movements.

More generally, just as you research which languages are required to serve different cultures, you may need to investigate the correct script or scripts to use. There are suggestions in the Directionality of Commonly Requested Languages Table below.

What are some examples of right-to-left scripts?

The following scripts are written right-to-left. The languages listed are (sometimes) written in these scripts. In some cases, the languages may also be written in other scripts. Some of the languages were written in the listed script historically, but are not today.

Note that this list, of necessity, is not complete. There are too many languages in existence to identify them all here. The table is provided just to identify a few right-to-left scripts. It is not intended to guide web authors or developers in choosing scripts for languages.

(Note: many languages are also written in other scripts, which may be left-to-right.)
Right-To-Left Script | Languages
Arabic | Arabic, Azeri/Azerbaijani [1], Bakhtiari, Balochi, Farsi/Persian, Gilaki, Javanese [3], Kashmiri, Kazakh [3], Kurdish (Sorani), Malay [3], Malayalam [3], Pashto, Punjabi, Qashqai, Sindhi, Somali [2], Sulu, Takestani, Turkmen, Uighur, Western Cham, Urdu
Hebrew | Hebrew, Ladino/Judezmo [2], Yiddish
N'ko | Mandekan
Syriac | Assyrian, Modern Aramaic Koine, Syriac
Thaana/Thâna | Dhivehi/Maldivian
Tifinar | Tamashek

Table Footnotes:
1 Azeri/Azerbaijani is written in Latin, Cyrillic or Arabic scripts.
2 Ladino/Judezmo and Somali are typically written in the Latin script today.
3 These languages were historically written in the listed script, but use another script in modern practice.
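The association between script and direction is also recorded in the Unicode character data, where each character carries a bidirectional class. The sketch below uses Python's standard `unicodedata.bidirectional` to guess the direction of a string from its first strongly directional character. The function name and the `'neutral'` return value are assumptions made for illustration, and this heuristic is much simpler than the full Unicode Bidirectional Algorithm.

```python
import unicodedata


def first_strong_direction(text: str) -> str:
    """Return 'rtl', 'ltr', or 'neutral' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # right-to-left letter, or Arabic letter
            return "rtl"
        if bidi_class == "L":          # left-to-right letter
            return "ltr"
    # Only digits, punctuation, spaces, or nothing at all.
    return "neutral"


print(first_strong_direction("שלום"))   # rtl (Hebrew script)
print(first_strong_direction("سلام"))   # rtl (Arabic script)
print(first_strong_direction("Hello"))  # ltr (Latin script)
print(first_strong_direction("123"))    # neutral
```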

Which languages are generally not written in right-to-left scripts?

Languages written in Latin, Cyrillic, (Modern) Greek, Indic and Southeast Asian scripts are written left-to-right. Examples include the modern languages of the Americas, Europe, India, and Southeast Asia.

Chinese, Japanese, and Korean are more flexible in their writing direction. They are generally written left-to-right, or vertically top-to-bottom (with the vertical lines proceeding from right to left). Occasionally, however, they are written right-to-left. Chinese newspapers sometimes combine all of these writing directions on the same page.

What directions are commonly localized languages written in?

The following table indicates the directionality of scripts used for writing languages in the countries listed. The list reflects (more or less) the languages most often asked about by localizers.

Note that many countries have more than one official language, and often have large numbers of speakers of minority languages. Therefore you should not use this list to define your localization strategy, but should independently evaluate your regional market requirements.

For example, Israel has two official languages, Hebrew and Arabic, but Russian and English are also widely used. The languages of China include Putonghua/Mandarin, Cantonese, Wu, Minbei, Minnan, Xiang, Gan, Hakka, and others. The languages of India (the land of 1,000 languages) include English, Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Oriya, Panjabi, Tamil, Telugu, Urdu, Bihari, Kashmiri, Sindhi, and Tibetan.

If you have a question about a language not listed here, send your question to www-international@w3.org with script faq suggestions as the subject.

Country/Region | Script | Direction [1] | Language
Afghanistan | Arabic | RTL | Pashto
Armenia | Armenian | LTR | Armenian
Austria | Latin | LTR | German
Belgium | Latin | LTR | Dutch, French
Brazil | Latin | LTR | Portuguese (Brazilian)
Bulgaria | Cyrillic | LTR | Bulgarian
China, except Hong Kong | Simplified Chinese | LTR or TTB | Mandarin
Croatia | Latin | LTR | Croatian
Czech Republic | Latin | LTR | Czech
Denmark | Latin | LTR | Danish
Estonia | Latin | LTR | Estonian
Finland | Latin | LTR | Finnish
France | Latin | LTR | French
Georgia | Georgian | LTR | Georgian
Germany | Latin | LTR | German
Greece | Greek | LTR | Greek
Hong Kong | Traditional Chinese [2] | LTR or TTB | Cantonese
Hungary | Latin | LTR | Hungarian
India | Devanagari | LTR | Hindi [3]
Israel | Hebrew | RTL | Hebrew
Italy | Latin | LTR | Italian
Japan | Kanji + Hiragana + Katakana | LTR or TTB | Japanese
Korea | Hangul, Hanja | LTR or TTB | Korean
Latin America, except Brazil | Latin | LTR | Spanish
Latvia | Latin | LTR | Latvian
Lithuania | Latin | LTR | Lithuanian
Middle East | Arabic | RTL | Arabic
Netherlands | Latin | LTR | Dutch
North America | Latin | LTR | English, French, Spanish
Norway | Latin | LTR | Norwegian
Pakistan | Arabic | RTL | Urdu
Poland | Latin | LTR | Polish
Portugal | Latin | LTR | Portuguese (Portugal)
Romania | Latin | LTR | Romanian
Russia | Cyrillic | LTR | Russian
Serbia and Montenegro | Cyrillic | LTR | Serbian
Slovakia | Latin | LTR | Slovak
Slovenia | Latin | LTR | Slovenian
Spain | Latin | LTR | Catalan, Spanish
Sweden | Latin | LTR | Swedish
Switzerland | Latin | LTR | French, German, Italian
Taiwan | Traditional Chinese | LTR or TTB | Mandarin
Thailand | Thai | LTR | Thai
Turkey | Latin | LTR | Turkish
United Kingdom | Latin | LTR | English

Table Footnotes:
1 "TTB" is Top-to-bottom, "LTR" is Left-to-right, "RTL" is Right-to-left.
2 Hong Kong script includes characters of the Hong Kong Supplementary Character Set.
3 English language software is often used in India.

By: Tex Texin, XenCraft.

Content first published 2003-09-12. Last substantive update 2003-09-15 15:10 GMT. This version 2011-08-31 9:41 GMT

For the history of document changes, search for qa-scripts in the i18n blog.