Intended audience: anyone who wants to know which scripts are associated with right-to-left text.
Updated 2003-09-15 15:10
What directions are commonly localized languages written in?
The Unicode Consortium's glossary uses the following definition:
Script: a collection of symbols used to represent text in one or more writing systems.
Microsoft offers the following definition on their globalization web site:
Script: A collection of characters for displaying written text, all of which have a common characteristic that justifies their consideration as a distinct set. One script can be used for several different languages (for example, Latin script, which covers all of Western Europe). Some written languages require multiple scripts (for example, Japanese, which requires at least three scripts: the hiragana and katakana syllabaries and the kanji ideographs imported from China). This sense of the word "script" has nothing to do with programming scripts such as Perl or Visual Basic Scripting Edition (VBScript).
Knowing the directionality of text, based on the script(s) to be used, is important to web designers and authors because right-to-left text can be more complicated to work with, especially for beginners, and because it affects the organization and directionality of the page layout. Knowing the writing direction therefore helps in estimating the work involved in creating web pages in a new language.
Languages don't have a direction. Scripts have a writing direction, so a language written in a particular script is written in the direction of that script.
Languages can be written in more than one script. For example, Azeri can be written in any of the Latin, Cyrillic, or Arabic scripts. When written in the Latin or Cyrillic script, Azeri is written left-to-right (LTR). When written in the Arabic script, it is written right-to-left (RTL).
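The point that direction follows the script rather than the language can be sketched as a lookup keyed on the script. The following is a minimal illustration, not part of the original FAQ: it assumes language tags that carry an ISO 15924 script code (such as `az-Latn` or `az-Arab`), and the table `SCRIPT_DIRECTION` and the function names are hypothetical and cover only a handful of scripts.

```python
# Direction is a property of the script, not of the language.
# Keys are ISO 15924 script codes; this table is a small illustrative subset.
SCRIPT_DIRECTION = {
    "Latn": "ltr",
    "Cyrl": "ltr",
    "Grek": "ltr",
    "Arab": "rtl",
    "Hebr": "rtl",
    "Syrc": "rtl",
}


def script_subtag(tag):
    """Return the script subtag of a language tag, or None if it has none."""
    subtags = tag.replace("_", "-").split("-")
    # Skip the first subtag (the language); a script subtag is four letters.
    for subtag in subtags[1:]:
        if len(subtag) == 4 and subtag.isalpha():
            return subtag.title()
    return None


def direction_for_tag(tag):
    """Return 'ltr' or 'rtl' based on the script subtag, or None if unknown."""
    return SCRIPT_DIRECTION.get(script_subtag(tag))


# The same language gets a different direction depending on the script.
for tag in ("az-Latn", "az-Cyrl", "az-Arab"):
    print(tag, direction_for_tag(tag))
```

A bare tag such as `az` returns `None` here, which reflects the text above: without knowing the script, the direction cannot be determined from the language alone.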
If a language can be written in more than one script, which script should a web designer or localizer use, or should the text be provided in all scripts?
The answer will depend on your target audience. The script may change for different countries or regions. The script may also change by legislation or with changes in government policy. For example, to reach the Azeri-speaking population in Iran, you would use Arabic script. From the late 1930s, Cyrillic was the script of choice in Azerbaijan itself and became policy in 1940. Following the fall of the Soviet Union, a gradual switch to Latin began in 1991, and Latin became mandatory for official uses in 2001. However, for unofficial uses you might want to use Cyrillic for older audiences and Latin for younger audiences, and most likely both to reach the general Azerbaijani population. If you want to reach all Azeri speakers, you would use all three scripts. (Note that there might be terminology and other differences among Azeri speakers in different countries, just as there are differences between English or French speakers in different countries.)
You also should be aware that your choice of script may have political, religious, demographic or cultural overtones. In countries where the language of higher learning was Russian, Cyrillic will be used by educated people. Latin is associated with Pan-Turkic movements, and more generally can indicate Western-tending movements. Arabic script has associations with Islamist movements.
More generally, just as you research which languages are required to serve different cultures, you may need to investigate the correct script or scripts to use. There are suggestions in the Directionality of Commonly Requested Languages Table below.
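As a rough illustration of the audience-driven choice described above, the sketch below records the Azeri example as a lookup from audience to scripts. Everything in it is an assumption made for illustration: the table `AUDIENCE_SCRIPTS`, its keys (a language code plus a region code), and the function name are hypothetical, and the entries simply restate the example in the text rather than any authoritative data source.

```python
# Hypothetical lookup built only from the Azeri example in the text.
# Keys are (language, region) pairs; values list scripts in priority order.
AUDIENCE_SCRIPTS = {
    ("az", "IR"): ["Arab"],          # Azeri speakers in Iran
    ("az", "AZ"): ["Latn", "Cyrl"],  # Latin is official; Cyrillic for older readers
}


def scripts_for_audience(language, regions):
    """Collect the scripts needed to reach the given regions, without duplicates."""
    needed = []
    for region in regions:
        for script in AUDIENCE_SCRIPTS.get((language, region), []):
            if script not in needed:
                needed.append(script)
    return needed


# Reaching all Azeri speakers in both regions requires all three scripts.
print(scripts_for_audience("az", ["AZ", "IR"]))
```

An empty result for an unlisted audience is a signal to research that market, not a statement that no script applies.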
The following scripts are written right-to-left. The languages listed are (sometimes) written in these scripts. In some cases, the languages may also be written in other scripts. Some of the languages were written in the listed script historically, but are not today.
Note that this list, of necessity, is not complete. There are too many languages in existence to identify them all here. The table is provided just to identify a few right-to-left scripts. It is not intended to guide web authors or developers in choosing scripts for languages.
(Note many languages are also written in other scripts, which may be left-to-right.)
|Script||Languages|
|Arabic||Arabic, Azeri/Azerbaijani¹, Bakhtiari, Balochi, Farsi/Persian, Gilaki, Javanese³, Kashmiri, Kazakh³, Kurdish (Sorani), Malay³, Malayalam³, Pashto, Punjabi, Qashqai, Sindhi, Somali², Sulu, Takestani, Turkmen, Uighur, Western Cham, Urdu|
|Hebrew||Hebrew, Ladino/Judezmo², Yiddish|
|Syriac||Assyrian, Modern Aramaic Koine, Syriac|
1 Azeri/Azerbaijani is written in Latin, Cyrillic or Arabic scripts.
2 Ladino/Judezmo and Somali are typically written in the Latin script today.
3 These languages were historically written in the listed script, but use another script in modern practice.
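When the script of a piece of text is not known in advance, one common heuristic (not described in this FAQ, and named here as an assumption) is to look at the first strongly directional character. The sketch below uses Python's standard `unicodedata.bidirectional` function, which reports the Unicode bidirectional class of a character: letters of the Hebrew script report `R`, and letters of the Arabic and Syriac scripts report `AL`. The function name `base_direction` is hypothetical.

```python
import unicodedata


def base_direction(text):
    """Guess 'rtl' or 'ltr' from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-type and Arabic-type letters
            return "rtl"
        if bidi_class == "L":          # Latin, Cyrillic, Greek, and so on
            return "ltr"
    return None  # no strong character found (digits, punctuation, empty text)


samples = {
    "Hebrew": "\u05e9\u05dc\u05d5\u05dd",
    "Arabic": "\u0633\u0644\u0627\u0645",
    "Syriac": "\u0710",
    "Latin": "Azeri",
}
for name, text in samples.items():
    print(name, base_direction(text))
```

This is only a guess based on one character. Text that starts with a word in another script will be misjudged, so an explicit declaration of direction by the author remains more reliable.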
Languages written in Latin, Cyrillic, (Modern) Greek, Indic and Southeast Asian scripts are written left-to-right. Examples include the modern languages of the Americas, Europe, India, and Southeast Asia.
Languages that use ideographic scripts (e.g. Japanese, Korean, Chinese) are more flexible in their writing direction. They are generally written left-to-right, or vertically top-to-bottom (with the vertical lines proceeding from right to left). However, they are occasionally written right-to-left. Chinese newspapers sometimes combine all of these writing directions on the same page.
The following table indicates the directionality of scripts used for writing languages in the countries listed. The list reflects (more or less) the languages most often asked about by localizers.
Note that many countries have more than one official language, and often have large numbers of speakers of minority languages. Therefore you should not use this list to define your localization strategy, but should independently evaluate your regional market requirements.
For example, Israel has two official languages, Hebrew and Arabic, but Russian and English are also widely used. The languages of China include Putonghua/Mandarin, Cantonese, Wu, Minbei, Minnan, Xiang, Gan, Hakka, and others. The languages of India (the land of 1,000 languages) include English, Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Oriya, Punjabi, Tamil, Telugu, Urdu, Bihari, Kashmiri, Sindhi, and Tibetan.
If you have a question about a language not listed here, send your question to
script faq suggestions as the
|Country or region||Script||Directionality¹||Languages|
|China, except Hong Kong||Simplified Chinese||LTR or TTB||Mandarin|
|Hong Kong||Traditional Chinese²||LTR or TTB||Cantonese|
|Japan||Kanji + Hiragana + Katakana||LTR or TTB||Japanese|
|Korea||Hangul, Hanja||LTR or TTB||Korean|
|Latin America, except Brazil||Latin||LTR||Spanish|
|North America||Latin||LTR||English, French, Spanish|
|Serbia and Montenegro||Cyrillic||LTR||Serbian|
|Switzerland||Latin||LTR||French, German, Italian|
|Taiwan||Traditional Chinese||LTR or TTB||Mandarin|
1 "TTB" is Top-to-bottom, "LTR" is Left-to-right, "RTL" is Right-to-left.
2 Hong Kong script includes characters of the Hong Kong Supplementary Character Set.
3 English language software is often used in India.
Content first published 2003-09-12. Last substantive update 2003-09-15 15:10 GMT. This version 2011-08-31 9:41 GMT
For the history of document changes, search for qa-scripts in the i18n blog.
Copyright © 2003-2011 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply. Your interactions with this site are in accordance with our public and Member privacy statements.