Intended audience: XHTML/HTML coders (using editors or scripting), XML content authors, schema developers (DTDs, XML Schema, RelaxNG, etc.), and anyone who wants to know whether they can use element names in languages other than English.
Updated 2004-06-28 09:57
Can I write HTML and XML element and attribute tag names in languages and scripts other than English?
HTML and XHTML tags are all pre-defined (in English) and must remain that way if they are to be correctly recognized by user agents (e.g. browsers).
In XML it is possible to define your own tag names. You can do this in any language and script supported by Unicode. (More specifically, XML 1.0 supports selected characters from the Unicode Standard version 2.0, while XML 1.1 supports nearly all characters defined by the Unicode Standard versions 3.0 and above.)
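As an illustration, here is a minimal sketch in Python that parses a small document whose element and attribute names are written in Japanese. The document and its names (本, 題名, 著者, 言語) are invented for this example, and the sketch assumes the standard library's `xml.etree.ElementTree` parser, which handles XML 1.0.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element and attribute names are in Japanese.
doc = '<本 言語="ja"><題名>吾輩は猫である</題名><著者>夏目漱石</著者></本>'

root = ET.fromstring(doc)

# Non-English names are looked up exactly like English ones.
title = root.find("題名").text
author = root.find("著者").text
language = root.attrib["言語"]

print(root.tag, language, title, author)
```

The parser treats these names no differently from ASCII ones. The practical difficulties tend to be human (reading and typing the names) rather than technical.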
Although all XML processors must support Unicode, it is sensible to apply some caution here. People who have to work with a tag set in, say, Chinese, Arabic or Hindi may find it difficult if they cannot read those languages or do not have the right fonts and rendering software on their system. English tag names have an advantage for DTDs that are used by multinational groups, because people from a large number of countries are likely to be able to display the tags and understand what they mean.
On the other hand, non-English tag names can be useful for educational materials. For example, they are common in Japanese XML primers.
Note also that, because numeric character references (NCRs, such as &#233; or &#xE9;) are not allowed in tag names, using non-ASCII tag names requires you to save the document in a character encoding that supports the characters needed. Using a Unicode encoding such as UTF-8 is usually the best approach.
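The following Python sketch shows the difference in practice, again assuming the standard library's `xml.etree.ElementTree` parser. The tag name résumé is made up for the example. The first document writes é directly and is serialized as UTF-8. The second tries to spell the same name with an NCR, which a conforming parser should reject as not well-formed.

```python
import xml.etree.ElementTree as ET

# "é" typed literally in the tag name; the document is encoded as UTF-8.
good = '<?xml version="1.0" encoding="UTF-8"?><résumé>ok</résumé>'.encode("utf-8")
root = ET.fromstring(good)

# The same name with "é" written as an NCR: not allowed inside a tag name.
bad = b"<r&#233;sum&#233;>ok</r&#233;sum&#233;>"
try:
    ET.fromstring(bad)
    ncr_accepted = True
except ET.ParseError:
    ncr_accepted = False

# NCRs remain fine in element content.
content = ET.fromstring(b"<p>caf&#233;</p>").text

print(root.tag, ncr_accepted, content)
```

An encoding that cannot represent the characters (plain ASCII, in this case) leaves no way to write the tag name at all, which is why a Unicode encoding is the safer choice.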
If you are using XML 1.1, almost any character is allowed in tag names, but not every character is a sensible choice. For a set of recommendations about which characters to use, see Appendix B of the XML 1.1 spec.
For specific information about which characters are allowed in XML tags see the further reading listed below.
Content first published 2003-06-09 09:57. Last substantive update 2004-06-28 09:57 GMT. This version 2011-05-04 7:20 GMT
For the history of document changes, search for qa-non-eng-tags in the i18n blog.
Copyright © 2003-2011 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved.