Unicode Bidirectional Algorithm basics is a repackaging of the initial part of “What you need to know about the bidi algorithm and inline markup” as a standalone article. It provides a gentle introduction to the behaviour of the Unicode Bidirectional Algorithm, and helps you understand why bidirectional text in Arabic, Hebrew, Thaana, Urdu, etc. behaves the way it does.
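As a rough illustration of what the algorithm works from, every Unicode character has a directional type, and the algorithm uses those types to decide how runs of text are ordered. The sketch below is not taken from the article; it simply uses Python's standard `unicodedata` module to list the type of each character in a hypothetical sample string.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, bidi class) pairs for each character in text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Hypothetical sample: Latin letter, Hebrew alef, Arabic alef, space, digit.
sample = "a\u05d0\u0627 1"

for ch, cls in bidi_classes(sample):
    # L = left-to-right, R = right-to-left, AL = Arabic letter,
    # WS = whitespace, EN = European number
    print(f"U+{ord(ch):04X} {cls}")
```

Strongly typed characters (L, R, AL) set the direction of a run, while neutral characters such as spaces take their direction from their surroundings, which is why mixed-direction text can produce unexpected results.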
Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts
This tutorial has been modified to bring it in line with the current tutorial format. Rather than containing duplicate content, it now introduces the novice to key concepts and points to useful further reading in an organized fashion. It has been completely rewritten.
Text direction and structural markup in HTML
This article has been created from material formerly in the tutorial “Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts” and augmented with information about new HTML5 markup constructs that are beginning to see adoption. It should be regarded as a new article, focusing on applying bidi markup to document- and block-level content, including forms.
What you need to know about the bidi algorithm and inline markup
This is an update of an existing article, but it has been almost completely rewritten. The most significant changes are the new parts describing how to apply the new HTML5 constructs which are beginning to see adoption. Additional changes will be needed as HTML5 bidi markup is finalised over the coming months. The article also proposes a simpler way to approach markup of bidi text, particularly useful for those with less experience, that relies less on a deep understanding of the issues involved.
Visual vs. logical ordering of text
This is a new article created from material that has been removed from the previously mentioned articles. The material was moved into a separate article because visual ordering is much less important these days, and to avoid duplication. Only a few changes have been made to the content itself.
One tutorial and two articles have been updated, and a new article has been created from material that was moved out of the tutorial. The updates all involve major rewrites of the former text. These changes incorporate up-to-date information about how language declarations are handled in HTML5, and generally refresh and improve the previous material.
The new articles are:
Working with language in HTML (tutorial)
All articles use a new HTML5-based template with additional changes to the boilerplate code.
Some articles are brand new and others were originally part of a tutorial, but have been updated and amplified to bring HTML5 to the fore and incorporate feedback from various readers. The articles are:
- Character encodings: Essential concepts
- Choosing & applying a character encoding
- Declaring character encodings in HTML
- The byte-order mark (BOM) in HTML
- Normalization in HTML and CSS
- Characters or markup?
Together these articles, with several other existing articles that were updated at the same time, provide practical advice to content authors on how to handle character encodings in HTML and CSS.
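As one small example of the kind of check these topics involve, the sketch below tests whether a byte sequence begins with the UTF-8 byte-order mark. It is an assumption-laden illustration rather than anything taken from the articles: the function names are hypothetical, and only the UTF-8 signature is handled.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte-order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical page content saved with a BOM by an editor.
page = codecs.BOM_UTF8 + "<!DOCTYPE html>".encode("utf-8")
print(has_utf8_bom(page))
print(strip_utf8_bom(page).decode("utf-8"))
```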
Numerous changes were made to this article to address feedback and also incorporate material on CSS escapes from the character encoding tutorial. This and other changes are described below. View the article.
German, Spanish, and Brazilian and Iberian Portuguese translators should consider updating it.
Description of changes:
- various parts of the text were rewritten
- the title and the question were changed
- the latest template was applied, along with various new style conventions that affect the markup
- two new sections were added relating to CSS
- substantial changes were made to the Further Reading section
Translators should retranslate the whole article.
The article Who uses Unicode? was rewritten to reflect the fact that Unicode-encoded web pages now account for over 50% of the Web, as determined by Google.
Spanish, Polish, and Brazilian Portuguese translators should consider retranslating the article.
The article was updated as follows:
- the title and some of the text were changed to reduce the emphasis on corporate sites
- the first paragraph was modified, and two paragraphs and a sidenote were added to the answer section
- substantial changes were made to the Further Reading section
Answers the question: Should I use b and i elements?
The HTML5 specification redefines the b and i elements to have some semantic function, rather than being purely presentational. However, the simple fact that the tag names are ‘b’ for bold and ‘i’ for italic means that people are likely to continue using them as a quick presentational fix.
This article explains why that can be problematic for localization (and indeed for restyling of pages in a single language), and echoes the advice in the specification intended to address those issues.
By Richard Ishida, W3C.
FAQ-based article: Which language tag is right for me? How do I choose language and other subtags?
Following the publication of RFC 5646 earlier this year (replacing RFC 4646 as part of BCP 47), the IANA Subtag Registry now contains almost 8,000 subtags, and the number of subtag types grew with the introduction of extended language subtags. This article tries to simplify the choice of an appropriate language tag for your needs by outlining the necessary decisions in a step-wise fashion.
By Richard Ishida, W3C.
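To make the idea of subtags concrete, here is a minimal sketch that splits a simple language tag into its language, script and region subtags based on their shape. It is an illustration under stated assumptions, not the procedure described in the article: the function name is hypothetical, it does not consult the IANA Subtag Registry, and it ignores extended language, variant, extension and private-use subtags.

```python
def split_language_tag(tag):
    """Split a simple BCP 47 tag of the form language[-script][-region].

    Illustrative only: subtags are recognised by length and character
    type, and nothing is validated against the IANA Subtag Registry.
    """
    parts = tag.split("-")
    result = {"language": parts[0].lower(), "script": None, "region": None}
    for part in parts[1:]:
        is_script = len(part) == 4 and part.isalpha()
        is_region = (len(part) == 2 and part.isalpha()) or (
            len(part) == 3 and part.isdigit()
        )
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = part.title()  # conventionally written as "Hant"
        elif is_region and result["region"] is None:
            result["region"] = part.upper()  # conventionally written as "TW"
        else:
            break  # stop at any subtag this sketch does not handle
    return result


print(split_language_tag("zh-Hant-TW"))
print(split_language_tag("es-419"))
```

A real implementation would look each subtag up in the registry, since shape alone cannot tell you whether a subtag is valid or appropriate.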