The translate attribute in HTML5 has long been awaited by those involved with translation, since it will improve the translation of content, whether in industrial localization environments or by individuals who want to translate a single page using an online translation service such as those offered by Google, Microsoft and Yandex.
This article discusses what the translate attribute is for, and how it should be used.
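As a brief illustration of the kind of usage the article covers, the fragment below is a minimal sketch of the attribute in markup. The sample sentences, the interface labels and the code line are invented for illustration; only the `translate` attribute and its `yes` and `no` values come from HTML5.

```html
<!-- translate="no" marks content that translation tools should leave as-is.
     The interface labels here are hypothetical. -->
<p>Click the <span translate="no">Resume</span> button on the
  <span translate="no">Status Display</span> panel.</p>

<!-- The value is inherited by descendants, so translate="yes" can
     re-enable translation inside an element marked translate="no". -->
<div translate="no">
  <pre>print("Hello")</pre>
  <p translate="yes">This explanatory note may be translated.</p>
</div>
```

Whether a given translation tool or service honours the attribute depends on that tool, so the markup should be read as a signal rather than a guarantee.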
This article is based on text that was originally published in the WG Note, Internationalization Best Practices: Specifying Language in XHTML & HTML Content. The Note will be updated in due course, at which time the material will be removed from the Note.
The article discusses some of the pros and cons of indicating the language of a page that a link points to when that page is not in the same language as the current content. It also looks at how people have done this in the past using the hreflang attribute.
Because of its history, the article has not been through the normal review process, but comments can be sent using the feedback form.
A future version of the article may look at alternative approaches and implementations, such as those used for European languages.
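For readers unfamiliar with the hreflang attribute mentioned above, the following is a small sketch of how it is typically written. The URL and link text are hypothetical, and the fragment does not reproduce any specific recommendation from the article.

```html
<!-- hreflang describes the language of the page the link points to.
     lang is added here only because the link text itself is in French. -->
<p>A French version is available:
  <a href="/fr/accueil.html" hreflang="fr" lang="fr">Page d'accueil</a>
</p>
```

The attribute is advisory: it tells the reader or user agent what to expect at the destination, but does not change how the current page is processed.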
Unicode Bidirectional Algorithm basics is a repackaging of the initial part of “What you need to know about the bidi algorithm and inline markup” as a standalone article. It provides a gentle introduction to the behaviour of the Unicode Bidirectional Algorithm, and helps you understand why bidirectional text in Arabic, Hebrew, Thaana, Urdu, etc. behaves the way it does.
Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts
This tutorial has been completely rewritten to bring it in line with the current tutorial format. Rather than containing duplicate content, it now introduces the novice to key concepts and points to useful further reading in an organized fashion.
Text direction and structural markup in HTML
This article has been created from material formerly in the tutorial “Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts” and augmented with information about new HTML5 markup constructs that are beginning to see adoption. It should be regarded as a new article, focusing on applying bidi markup to document- and block-level content, including forms.
What you need to know about the bidi algorithm and inline markup
This is an update of an existing article, but it has been almost completely rewritten. The most significant changes are the new parts describing how to apply the new HTML5 constructs which are beginning to see adoption. Additional changes will be needed as HTML5 bidi markup is finalised over the coming months. The article also proposes a simpler way to approach markup of bidi text, particularly useful for those with less experience, that relies less on a deep understanding of the issues involved.
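The announcement does not name the HTML5 constructs in question. Assuming they include the `bdi` element, the sketch below shows the general idea of inline bidi markup; the sentences and the user name are invented for illustration.

```html
<!-- Inline text whose direction is known: set dir (and lang) explicitly
     on a tightly wrapping element. -->
<p>The title is <span dir="rtl" lang="he">פעילות הבינאום, W3C</span> in Hebrew.</p>

<!-- Inline text whose direction is not known in advance, such as a user
     name inserted at run time: bdi isolates it from the surrounding text
     so that it does not disturb the order of the following number. -->
<p><bdi>إيان</bdi>: 3 posts</p>
```

Browser support for these constructs has varied over time, so it is worth testing the result in the user agents you need to support.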
Visual vs. logical ordering of text
This is a new article created from material that has been removed from the previously mentioned articles. It was moved into a separate article because visual ordering is much less important these days, and to avoid duplication. Only a few changes have been made to the content itself.
One tutorial and two articles have been updated, and a new article has been created from material that was moved out of the tutorial. The updates all involve major rewrites of the former text. These changes incorporate up-to-date information about how language declarations are handled in HTML5, and generally refresh and improve the previous material.
The new articles are:
Working with language in HTML (tutorial)
All articles use a new HTML5-based template with additional changes to the boilerplate code.
Some articles are brand new and others were originally part of a tutorial, but have been updated and amplified to bring HTML5 to the fore and incorporate feedback from various readers. The articles are:
- Character encodings: Essential concepts
- Choosing & applying a character encoding
- Declaring character encodings in HTML
- The byte-order mark (BOM) in HTML
- Normalization in HTML and CSS
- Characters or markup?
Together these articles, with several other existing articles that were updated at the same time, provide practical advice to content authors on how to handle character encodings in HTML and CSS.
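As a rough illustration of the subject these articles address, the skeleton below sketches one common way to declare a character encoding in an HTML5 page. The page content is hypothetical, and the choice of UTF-8 is an assumption made for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding early, within the first 1024 bytes of the file. -->
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <p>Text saved as UTF-8: café, naïve, 日本語.</p>
</body>
</html>
```

The declaration only labels the encoding. The file must also be saved in that encoding, and any encoding declared in the HTTP header should agree with it.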
Numerous changes were made to this article to address feedback and also incorporate material on CSS escapes from the character encoding tutorial. This and other changes are described below. View the article.
German, Spanish, Brazilian Portuguese and Iberian Portuguese translators should consider updating it.
Description of changes:
- various parts of the text were rewritten
- the title and the question were changed
- the latest template was applied, along with various new style conventions that affect the markup
- two new sections were added relating to CSS
- substantial changes were made to the Further Reading section
Translators should retranslate the whole article.
The article Who uses Unicode? was rewritten to reflect the fact that Unicode-encoded web pages now account for over 50% of the Web, as determined by Google.
Spanish, Polish and Brazilian Portuguese translators should consider retranslating the article.
The article was updated as follows:
- the title and some of the text were changed to reduce the emphasis on corporate sites
- the first paragraph was modified, and two paragraphs and a sidenote were added to the answer section
- substantial changes were made to the Further Reading section