Text content in HTML elements with child text nodes, and text in attributes of HTML elements that allow free-form text, may contain characters in the range U+202A to U+202E (the bidirectional-algorithm formatting characters). However, the use of these characters is restricted so that any embedding or overrides generated by these characters do not start and end with different parent elements, and so that all such embeddings and overrides are explicitly terminated by a U+202C POP DIRECTIONAL FORMATTING character. This helps reduce incidences of text being reused in a manner that has unforeseen effects on the bidirectional algorithm.
The aforementioned restrictions are defined by specifying that certain parts of documents form bidirectional-algorithm formatting character ranges, and then imposing a requirement on such ranges.
The string resulting from the concatenation of the data of all of an HTML element's text nodes, if any, is a bidirectional-algorithm formatting character range.
The value of a namespace-less attribute of an HTML element is a bidirectional-algorithm formatting character range.
Any strings that, as described above, are bidirectional-algorithm formatting character ranges must match the
string production in the following ABNF, the character set for which is Unicode. [ABNF]
string = *( plaintext ( embedding / override ) ) plaintext embedding = ( lre / rle ) string pdf override = ( lro / rlo ) string pdf lre = %x202A ; U+202A LEFT-TO-RIGHT EMBEDDING rle = %x202B ; U+202B RIGHT-TO-LEFT EMBEDDING lro = %x202D ; U+202D LEFT-TO-RIGHT OVERRIDE rlo = %x202E ; U+202E RIGHT-TO-LEFT OVERRIDE pdf = %x202C ; U+202C POP DIRECTIONAL FORMATTING plaintext = *( %x0000-2029 / %x202F-10FFFF ) ; any string with no bidirectional-algorithm formatting characters
For convenience, where possible authors will likely prefer to use the
dir attribute, the
bdo element, and the
bdi element, rather than maintaining the bidirectional-algorithm formatting characters manually.