Html-bidi-isolation

Proposals to add isolation for bidi content in HTML5

Authors: Richard Ishida, Aharon Lanin.

Read this first

Unicode 6.3 will shortly be released, and will contain new control codes (RLI, LRI, FSI, PDI) to enable authors to express isolation at the same time as direction in inline bidirectional text. The Unicode Consortium recommends that isolation be used as the default for all future inline bidirectional text embeddings. The CSS Writing Modes specification has already been adapted to accommodate this new development. We now need to ensure that HTML5 encourages and enables content authors to adopt and apply isolation as the default whenever they set direction on inline content.

The use of the bdi element to achieve isolation can continue, and is particularly handy when the direction of the content is unknown. However, it cannot remain the only or even the primary way to achieve isolation in markup: that relegates isolation to being a little-known power tool instead of the default for bidi content, and using a special element for this purpose is impractical in some scenarios.

There are three proposals on the table currently.

Proposal A is to change the semantics of the dir attribute so that isolation is always applied to the content surrounded by the element with the dir attribute.

Proposal B is for new values for dir, specifically r and l. It is offered as an alternative if Proposal A is not acceptable.

Proposal C is for a new direction attribute. It is also offered as an alternative if Proposal A is not acceptable.

The Internationalization Working Group supports Proposal A. This is by far the cleanest approach and the most author-friendly.

If you are new to the idea of bidi isolation, you may want to start with the Background material at the end, then read about each proposal.

For a more in-depth overview of inline direction in HTML, and a description of current capabilities in HTML4 and HTML5, see Creating HTML Pages in Arabic, Hebrew and Other Right-to-left Scripts.


Key objectives

Note that any proposed solution must:

  • move all authors, in time, to using markup that isolates by default while setting base direction
  • support continued use of the dir attribute in browsers that don't yet support the new behavior
  • be intuitive enough for authors to readily use in place of dir=ltr/rtl with its current non-isolating semantics
  • not require that users understand or appreciate the concept of isolation in order to mark up their text

Proposal A: change the dir semantics

The very best solution from the point of view of intuitiveness and ease of implementation would be to simply change the styling associated with the current dir attribute, so that it applies isolation by default.

In other words, the HTML default stylesheet specification for dir="ltr" and dir="rtl" should be changed to result in unicode-bidi:isolate (or, for <bdo>, unicode-bidi:isolate-override), instead of unicode-bidi:embed (or, for <bdo>, unicode-bidi:bidi-override).

Transition

No special steps needed. dir will continue to work as before until the browser adopts the new behaviour.

On the other hand, if a content author wants to get the new behavior before a browser starts complying with the new spec and that browser already supports the isolation changes in CSS, they could add the following to their stylesheet.

[dir='ltr'] { unicode-bidi: embed; unicode-bidi: -webkit-isolate;
    unicode-bidi: -moz-isolate; unicode-bidi: -ms-isolate; unicode-bidi:
    isolate; direction: ltr; }
[dir='rtl'] { unicode-bidi: embed; unicode-bidi: -webkit-isolate;
    unicode-bidi: -moz-isolate; unicode-bidi: -ms-isolate; unicode-bidi:
    isolate; direction: rtl; }
bdo[dir='ltr'] { unicode-bidi: bidi-override; unicode-bidi:
    -webkit-isolate-override; unicode-bidi: -moz-isolate-override;
    unicode-bidi: -ms-isolate-override; unicode-bidi: isolate-override;
    direction: ltr; }
bdo[dir='rtl'] { unicode-bidi: bidi-override; unicode-bidi:
    -webkit-isolate-override; unicode-bidi: -moz-isolate-override;
    unicode-bidi: -ms-isolate-override; unicode-bidi: isolate-override;
    direction: rtl; }


This will have no effect in IE; however, IE8 and above apply something similar to isolation anyway.

Benefits

  • Content authors don't need to learn to do anything different, nor do they need to understand isolation vs. embedding or different usage for block vs. inline elements. Their code just simply works better.
  • Nothing special is needed to cover the transition period while browsers adopt the new semantics: the code will work better in browsers that have implemented the change, but just work the same as before in browsers that haven't.
  • Problems with existing legacy pages (such as badly rendered inserted names before numbers) will be fixed automatically by this change.
  • Once a browser supports isolation it is trivial to implement the change in semantics for dir in the browser.

Drawbacks

  • It is possible that a small number of web pages would behave differently as a result of this change, but this only affects the uncommon cases where people have put in place undesirable workarounds (hacks) in the past. (For details see 'Do we still need non-isolate embeddings?' in the Background section below.) On the other hand, it appears that versions 8 and above of Internet Explorer already apply some type of partial isolation when the dir attribute is used (for details see 'Internet Explorer behaviour' in the Background section below). Furthermore, this change would probably fix other parts of such pages that currently don't work as intended.

Proposal B: new values for dir

Keep the "dir" attribute, add values "L" and "R" (uppercase used here for readability - value is case-insensitive, as usual).

Convince people to move away from using the ltr and rtl values completely in time.

Transition period

  • If CSS can be assumed, trivial - just add the following to your stylesheet:
   [dir='l'] { unicode-bidi: embed; unicode-bidi: -webkit-isolate; unicode-bidi: -moz-isolate; unicode-bidi: -ms-isolate; unicode-bidi: isolate; direction: ltr; }
   [dir='r'] { unicode-bidi: embed; unicode-bidi: -webkit-isolate; unicode-bidi: -moz-isolate; unicode-bidi: -ms-isolate; unicode-bidi: isolate; direction: rtl; }
   bdo[dir='l'] { unicode-bidi: bidi-override; unicode-bidi: -webkit-isolate-override; unicode-bidi: -moz-isolate-override; unicode-bidi: -ms-isolate-override; unicode-bidi: isolate-override; direction: ltr; }
   bdo[dir='r'] { unicode-bidi: bidi-override; unicode-bidi: -webkit-isolate-override; unicode-bidi: -moz-isolate-override; unicode-bidi: -ms-isolate-override; unicode-bidi: isolate-override; direction: rtl; }
  • If CSS can't be assumed, use markup: "dir=ltr/rtl" for blocks and use "<bdi dir=ltr/rtl>...</bdi>" or "<element><bdi dir=ltr/rtl>...</bdi></element>" for inlines.

We can speed up the transition period by getting the required CSS fragment into libraries.

Benefits

  • Keeps the existing "dir" attribute, which is reasonably well-known among authors and so reduces the teaching burden.
  • New values are even shorter than existing values, which helps encourage people to use them.
  • Avoids the risk of transposition typos that the "lri/rli/rtl/ltr" proposal had.
  • The transition plan works slightly better in the future than for Proposal C - the inline fallback with <bdi> can technically continue to be used after browsers implement isolation, but it's less convenient than just dropping the element entirely, so people are more likely to do just the new stuff as it's less work.
  • The transition markup for this is actually fewer characters than for the "direction" attribute in Proposal C's transition markup, in both cases.
  • If an implementation already has support for isolation, making it accept new "dir" values is completely trivial - even less work than making it recognize a new "direction" attribute.

Drawbacks

  • Content authors need to learn to use the new values and move away from the old ones.
  • The bdi workaround for inline elements is cumbersome and requires authors to be aware of and use different approaches for inline vs. block elements.
  • Content authors need to do something special during the transition period, and decide when the transition period is sufficiently complete to allow them to only use the new values.
  • You can't assume CSS is available in all situations, e.g. for blog or tweet aggregators. This means that the CSS-only approach needs to be used with care.
  • rtl and ltr still sound like the default values - the reasons for using r and l are difficult to understand, and so authors may continue to use ltr/rtl without understanding why they should change.
  • During the transition, different approaches are needed for inline vs block elements.
  • The CSS appears not to work on IE7 and below, i.e. if you have <span dir="r">, you will get neither isolation nor embedding. In other words, pages using dir=r will not work at all on IE7.

The value names

Initial feedback from discussion during the i18n telecon raised the question of whether there is a better alternative than r and l as value names.

Note that we need to choose value names that are sufficiently distinct from rtl and ltr to avoid confusion and simple mistakes, and yet need to be intuitive and similar enough to be readily understandable.

One suggestion was to use rl and lr, although it's not clear whether these are far enough removed from rtl/ltr to avoid confusion. Another suggestion was to use rli and lri, which are the same names as used for the isolating Unicode control characters, but content authors should not have to be concerned with what the i stands for. Furthermore, rli/lri are subject to the same typos as are seen when content authors transpose letters in rtl and ltr.

Proposal C: A new 'direction' attribute

This is a proposal for a new attribute for HTML5 called "direction" with values "ltr", "rtl" and "auto". The new attribute can be used on both block and inline elements, but for the latter automatically applies isolation (block elements are already isolated). The existing “dir” attribute retains its current non-isolate semantics, but its use should be discouraged and the intention is that eventually for new content dir will be completely replaced by the direction attribute. In the interim, dir and direction can be used side by side to manage the transition: when both are given the browser should use the direction attribute if it supports it.
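
As a rough illustration of the side-by-side transition markup described above, an inline phrase might be marked up as follows. This is a sketch only: direction is the attribute proposed here and should be assumed not to be implemented in any browser, and the surrounding sentence is invented for the example.

```html
<!-- Hypothetical markup: "direction" is the proposed attribute, not shipped HTML. -->
<!-- dir is kept as the fallback for browsers that do not recognise direction. -->
<p>Top result: <span dir="rtl" direction="rtl">פיצה</span>: 5 reviews</p>
```

Under this proposal, a browser without support for direction would fall back on dir and apply a non-isolating embedding, while a browser with support would use direction and isolate the span.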


Benefits

  • The use of the name 'direction' is clearly understandable by content authors, matches the direction property in CSS, and does not rely on them making choices based on understanding the value of isolation.
  • Authors continue to use rtl and ltr values, so this is intuitive and easy.
  • Using two attributes makes it possible to transition in a way that provides a guaranteed fallback for browsers that don't support the new attribute. If direction is not supported, dir takes over. There is no need for CSS to produce the expected outcome.
  • It allows for a clean break in usage from dir without supposing that authors understand the value of isolation and choose the right values. All advice about how to manage bidi text can simply recommend use of the direction attribute, rather than the dir attribute, and deprecation of dir will encourage authors to make the switch.
  • It doesn't rely on CSS being available.


Drawbacks

  • The name "direction" is significantly longer. This is significant for "plumbing"/"structural" attributes, which should get out of the way and not distract from the "meaningful" attributes and content. It also means that the transition from "dir" to "direction" will be harder, since people like short names.
  • Having two attributes that do the same thing is confusing. We've already suffered through this with things like "lang", and it's painful and confusing to tell exactly which attribute authors should be using.
  • The transition plan (add both "dir" and "direction") makes for ugly markup, which means it's less likely to be used by authors (see the argument about "plumbing" attributes, above). Also, this kind of advice is precisely the kind of thing that is nearly certain to be cargo-culted FOREVER, since there's no functional downside to doing it long after all browsers support the new stuff.

Alternative names for the direction attribute

The name 'direction' is a little on the long side, and we considered various alternatives, such as bdi, bd, idir, etc. The alternatives were all rejected either because they assumed a knowledge of and interest in bidi isolation on the part of the author, or because they didn't work as well on block elements as on inline.

The name 'direction' is a very simple and intuitive name, which can be used anywhere, and, moreover and usefully, it appears to mean the same as dir. In fact, for some authors, 'direction' will be a lot more meaningful and memorable than 'dir'.

Other alternatives that were rejected

This section lists alternative ideas that were considered, and why they were rejected.


Use of bdi

Currently, HTML5 requires the use of <bdi> to get a bidirectional isolate when the first-strong auto-direction heuristics of dir="auto" are inappropriate. To get a bidirectional embedding, on the other hand, one does not need to use an additional element; all you have to do is put a dir="ltr|rtl" on an existing "inline" element. When one considers that bidi isolates are what embeddings should have been all along, and should be used in new documents instead of using the old-style non-isolate embeddings, the fact that it is more difficult to set up an isolate than to set up an embedding begins to look quite strange.

This lack of symmetry is unique to HTML5. In CSS3, the choice between unicode-bidi:embed vs unicode-bidi:isolate, and in Unicode, the choice between LRE|RLE...PDF vs LRI|RLI...PDI is entirely symmetrical.

But, after all, what’s an extra element between friends? Yes, it is easier to write

<a dir="rtl" href="...">פיצה</a>

than to write

<a href="..."><bdi dir="rtl">פיצה</bdi></a>

or

<bdi dir="rtl"><a href="...">פיצה</a></bdi>

but who cares?

We care:

  • As long as isolates are more difficult to set up than embeddings, embeddings will be the default, and isolates the exception; the use of isolates will not replace the use of embeddings.
  • A single attribute has historically been and should continue to be sufficient to do all the bidi in HTML. Why should the preferred way to embed opposite-direction content inline now require the use of both a special-purpose element (<bdi>) and a special attribute (dir)?
  • HTML document authors must be instructed that when a “block” element like <p> gets opposite-direction content, they should indicate it by putting a dir attribute on that element. For “inline” elements, however, it depends. An element like <textarea> or <input> or <option> whose content is inherently “out-of-flow” and thus directionally isolated can also get the dir attribute directly on it. However, when an “ordinary” “inline” element like <cite> gets opposite-direction content, they should not just put the dir attribute directly on it, but on a special <bdi> element especially inserted for that purpose either within the <cite> or around it. (Which, by the way?) As for <a>, put the dir attribute directly on it if it has “block” descendants, but add a <bdi> otherwise. The distinctions are impossible to justify or explain!
  • When an HTML or XHTML document tags a data item with microformatting or some other form of data export, it makes good sense to also indicate the data item’s direction using an attribute on the tagged element, so that consumers of the data will know how to display it properly. It makes little sense to put it on a surrounding element, where consumers of the data will ignore it (unless they bother to ask for the tagged element’s computed direction style) or on an element especially introduced within the tagged element for the purpose of carrying the attribute, suddenly turning what had been a nice plain-text data item into HTML. If the attribute goes on the tagged element, and it happens to be inline, we want it to be isolated, so now the tagged element suddenly has to be <bdi>. Do we need to update the RFCs on microformatting to require the use of <bdi> for all microformatting (except where a “block” element is used)?

In brief, we must make it possible to set up bidi isolates by using a direction attribute alone.

Use dir plus an additional isolate attribute

Another possibility is to create a new attribute such as isolate, which would be used to complement dir. This ensures that lagging browsers display the new-style markup at least as well as HTML4.

One problem with using an additional attribute (such as in <span dir=rtl isolate=yes>...</span>) is that it doesn't encourage use of isolation by default. It also adds a significant, permanent burden for the author creating bidi text since this markup will be used anywhere there are directional changes in pages written in right-to-left scripts (and that's a lot). The additional effort required to create extra markup is no longer insignificant in such a context. It also appears to place a choice before authors which requires them to understand the concepts related to isolation vs. non-isolation: this is actually not something they need to concern themselves with.

Bidi isolation applied by custom styling

If new pages should be using bidi isolates, and we want to make it possible to set up a bidi isolate via the dir attribute alone, but can not break backward compatibility by making dir=”ltr” and dir=”rtl” themselves define isolates by default, perhaps the answer is to recommend to content authors to do so in their own custom stylesheets. It’s easy enough, something like

[dir='ltr'] { unicode-bidi: embed; unicode-bidi: -webkit-isolate;
     unicode-bidi: -moz-isolate; unicode-bidi: -ms-isolate; unicode-bidi:
     isolate; direction: ltr; }
[dir='rtl'] { unicode-bidi: embed; unicode-bidi: -webkit-isolate;
     unicode-bidi: -moz-isolate; unicode-bidi: -ms-isolate; unicode-bidi:
     isolate; direction: rtl; }
bdo[dir='ltr'] { unicode-bidi: bidi-override; unicode-bidi:
     -webkit-isolate-override; unicode-bidi: -moz-isolate-override;
     unicode-bidi: -ms-isolate-override; unicode-bidi: isolate-override;
     direction: ltr; }
bdo[dir='rtl'] { unicode-bidi: bidi-override; unicode-bidi:
     -webkit-isolate-override; unicode-bidi: -moz-isolate-override;
     unicode-bidi: -ms-isolate-override; unicode-bidi: isolate-override;
     direction: rtl; }

Obviously, to avoid breaking existing documents, document authors should only do this for new documents built to use bidi isolates.

Unfortunately, this old/new dichotomy is extremely inconvenient for document authors with a large existing codebase. Any software library that generates HTML that may use the dir attribute must either use it as an isolate or as a non-isolate embedding. Thus, it cannot be used to generate both "old" and "new" pages. The move from "old" to "new" thus cannot be made gradually: either a page and all the software used to generate it is "old", or the page and all the software used to generate it is "new". There is no middle ground.

This approach is also problematic when HTML needs to be converted to plain text, since the choice between LRE|RLE...PDF and LRI|RLI...PDI is no longer apparent in the mark-up, but must be relegated to consulting the style.

Background

Bidirectional isolation

Over the last couple of years, the CSS3 and HTML5 standards have added a new feature to ease dealing with bidirectional text: bidi isolates. Bidirectional isolates are expected to make it much easier to insert text data that contains (or may contain) text of the direction opposite to the context, e.g. Hebrew or Arabic text in an English or Russian-language page, and vice-versa, without unduly affecting the display of the content around it.

A bidirectional isolate directionally isolates its contents from its surroundings:

  • The content inside the isolate has no effect on the bidirectional ordering of the content surrounding the isolate.
  • The content surrounding the isolate has no effect on the bidirectional ordering inside the isolate.
  • The element as a whole has the effect of a neutral character on the visual order of surrounding content, regardless of its dir attribute value.

In HTML, this feature is currently exposed primarily via the new <bdi> element. Thus,

<bdi dir="rtl">פיצה</bdi>: <a href="...">5 reviews</a>

is quite unsurprisingly displayed as:

פיצה‎: 5 reviews

This contrasts to the effect of a traditional bidi embedding, which is what one gets when one puts a dir=”ltr” or dir=”rtl” on an inline element other than <bdi>. An embedding usually has the same effect on the visual ordering of the surrounding content as a strong character of the same direction. Thus, for example,

<span dir="rtl">פיצה</span>: <a href="...">5 reviews</a>

displays the same as without the <span dir=”rtl”> i.e. the rather more surprising and quite useless

פיצה: 5 reviews

where the number “stuck” to the RTL text preceding it by the rules of the Unicode Bidirectional Algorithm for embeddings (as opposed to isolates).

HTML5 also directionally isolates any element that uses the new "auto" value of the dir attribute, which also sets its direction according to its first strong character. And, in fact, dir="auto" is the default for <bdi> (unless it is given an explicit dir="ltr" or dir="rtl"). Despite the considerable overlap in functionality between the <bdi> element and dir="auto", bidirectional isolation and automatic direction are functionally distinct. While dir="auto" does provide bidirectional isolation, it should only be used for content of unknown direction. When the direction is known (e.g. a phone number is always LTR) or when an estimation method other than the first-strong heuristic of dir="auto" must be employed, the dir attribute must be given an explicit "ltr" or "rtl" value, but bidirectional isolation is still equally important.

For both dir=”auto” and <bdi>, bidirectional isolation is actually achieved via the new “isolate” value of the CSS property unicode-bidi. In other words, the exact and only difference between <span dir=”...”> and <bdi dir=”...”> is that the former is by default assigned unicode-bidi:embed, while the latter gets unicode-bidi:isolate.
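
The difference described above can be sketched as a pair of author style rules applied to otherwise identical markup. This is an illustration only, not the text of the HTML default stylesheet: the selectors are assumptions chosen for the example, and it presumes a browser that supports the unprefixed isolate value.

```html
<style>
  /* Illustrative rules only; not the normative default stylesheet. */
  span[dir] { unicode-bidi: embed; }    /* traditional embedding */
  bdi[dir]  { unicode-bidi: isolate; }  /* bidirectional isolate */
</style>

<!-- Embedding: the number can be pulled next to the RTL name. -->
<p><span dir="rtl">פיצה</span>: 5 reviews</p>

<!-- Isolate: the name acts as a neutral; the rest stays in LTR order. -->
<p><bdi dir="rtl">פיצה</bdi>: 5 reviews</p>
```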

The Unicode technical committee is currently in the process of adding bidirectional isolates to Unicode 6.3, with new bidi classes that are the isolate equivalents of LRE, RLE, and PDF: namely LRI, RLI, and PDI. There is also FSI, “first strong isolate”.
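
For reference, the code points assigned to these characters are LRI U+2066, RLI U+2067, FSI U+2068 and PDI U+2069. The fragment below is a minimal sketch of how they might appear as numeric character references where markup is not available, such as inside an attribute value; the title text is invented for the example, and nothing here suggests preferring control characters over markup in element content.

```html
<!-- RLI (U+2067) ... PDI (U+2069) isolate the RTL name inside a plain-text attribute value. -->
<a href="..." title="&#x2067;פיצה&#x2069;: 5 reviews">details</a>

<!-- FSI (U+2068) ... PDI (U+2069): direction taken from the first strong character. -->
<a href="..." title="&#x2068;פיצה&#x2069;: 5 reviews">details</a>
```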

Do we still need non-isolate embeddings?

Over the past year it has become increasingly clear that if the concept of bidirectional isolation had been around and its benefits understood at the time that bidirectional embeddings were being worked into the Unicode standard back in 1999, bidirectional embeddings would have been defined with isolate semantics. LRE and RLE (and thus dir=”ltr” and dir=”rtl” on any inline element) would have been behaving as bidi isolates all along. This statement is not controversial. In fact, it is the current consensus.

The reason that embeddings were originally defined to have (basically) the same effect on the text surrounding them as that of strong characters is that, after all, they wrap strong text. But the only reason that strong characters have such a strong effect in the first place is to heuristically arrive at a reasonable implicit ordering for bidirectional text. These implicit heuristics are misplaced when the exact boundaries of directionality have been explicitly indicated. Thus, it makes perfect sense to make embeddings have the same effect on surrounding text as a neutral character.

Here are a couple of examples of edge cases where authors have put workarounds in place that try to make embedded text behave like isolated text. This bidi cruft only works given the old embed semantics.


Example 1

In LTR content you may have two RTL items side by side that should appear in LTR order, with a separator. This is sometimes the case for lists. For example: "items TSRIF, DNOCES"

In all the following source code samples the characters are shown in strictly logical order in the source text.

If the source code was:

<p>items FIRST - SECOND</p>

you would see "items DNOCES - TSRIF", which may not be what we want in this example.

It is possible (though not recommended) under embed semantics (but not under isolate semantics) to order the two RTL items from left to right if you put a span with dir=ltr around the hyphen, ie.

<p>items FIRST <span dir=ltr>-</span> SECOND</p>

you would see the items listed LTR as desired, ie. "items TSRIF - DNOCES"

Note, however, that this is cruft that only works if you have spaces around the separator. For example, the following, more common example of a list without a space:

<p>items FIRST<span dir=ltr>,</span> SECOND</p>

would not work, and would produce "items ,TSRIF DNOCES"

In the case of the hyphen, if dir has isolating semantics, the isolated items are treated as neutral characters, so you would see: "items DNOCES - TSRIF"

One way to achieve the desired effect in HTML4, regardless of whether the dir attribute uses embed or isolate semantics, is to use an LRM character between the items, like this

<p>items FIRST &lrm;- SECOND</p>

In HTML5, once isolating semantics are supported on dir, the best way would be to wrap each of the opposite-direction phrases in markup, like this:

<p>items <span dir=rtl>FIRST</span> - <span dir=rtl>SECOND</span></p>

This is a more intuitive and effective method of marking up this kind of bidi text.


Example 2

It is entirely possible that two adjacent RTL items need to be displayed in RTL order, for example when showing a heading and body, eg: ".YDOB .GNIDAEH".

If the source code was: <p>HEADING. BODY.</p> you would see "YDOB .GNIDAEH.", with the body's final dot in the wrong visual position. This is of course due to the lack of markup indicating the direction of the heading and body. When we want to add such markup, however, we are faced with a problem when the heading and body are given to us as separate data items each with its own direction. The naive way to handle that is to put the heading and body in successive spans, each with a dir attribute to indicate its direction:

<p><span dir="rtl">HEADING.</span> <span dir="rtl">BODY.</span></p>

The problem is that this only gets the desired ordering under embed semantics. Under isolate semantics, each of the spans is treated as a neutral in the overall ordering, so the actual result is the undesirable ".GNIDAEH .YDOB".

However, the right way to deal with this situation is to **express our desire for a single direction across the two items** explicitly, by putting the two fields in a single directional unit. There are several ways to do that, e.g.

<p dir="rtl">HEADING. <span dir="rtl">BODY.</span></p>

or

<p><span dir="rtl"><span dir="rtl">HEADING.</span> BODY.</span></p>

Either gives the intended display under both embed and isolate semantics.



In cases where adjacent RTL items in LTR content have each been marked up with dir, embed semantics combine the two into a single directional run. For example:

<p>items <span dir=rtl>FIRST</span> <span dir=rtl>SECOND</span></p>

would result in "items DNOCES TSRIF". With isolating semantics, the two RTL items are treated as neutrals, and the result would be "items TSRIF DNOCES". If the single-run ordering is what is actually wanted, the text should have been marked up with a single span or no span at all. That would produce the expected order, i.e. "items DNOCES TSRIF".

Internet Explorer behaviour

In Internet Explorer versions 7 and below, as well as in IE8 and above when working in quirks mode or IE7 mode, inline elements bearing the dir attribute affect the visual ordering of their surroundings just like a strong character of the element's directionality. Thus,

<div dir="ltr">א ==> <span dir="rtl">*</span></div>

is displayed as

* <== א

and

<div dir="ltr"><span dir="rtl">*</span> ==> ב</div>
<div dir="ltr"><span dir="rtl">*</span> ==> 123</div>

are displayed as

ב <== *
123 <== *

This is because the RTL span forms a single RTL run with the RTL character before it, or with the number or RTL character after it, even when separated from it by some neutrals.

Since this is usually the effect that directional embedding is supposed to have under the Unicode Bidirectional Algorithm (and since its exceptions, like empty embeddings and nested embeddings where the inner embedding is at the very beginning or at the very end of the outer embedding, are not commonly encountered), it is fair to say that IE7 closely approximates embedding semantics for the dir attribute on inline elements.

This is no longer true in IE8 and above (except, of course, when working in quirks mode or in IE7 mode). Here, an inline element bearing the dir attribute affects the visual ordering of its surroundings as if it were immediately preceded by an invisible character of the element's directionality, but immediately followed by an invisible character of the parent element's directionality. Thus,

<div dir="ltr">א ==> <span dir="rtl">*</span></div>

is displayed as

* <== א

which is the same as IE7 and below, but

<div dir="ltr"><span dir="rtl">*</span> ==> ב</div>
<div dir="ltr"><span dir="rtl">*</span> ==> 123</div>

are displayed as

* ==> ב
* ==> 123

which is the opposite ordering from IE7, and from all other major browsers.

This unusual approach cannot be said to approximate the standard embedding semantics. While it certainly is not isolation either, its effects are actually the same as isolation when both of the following conditions are satisfied:

  • The dir attribute value assigned to the inline element is the opposite of its parent element's directionality.
  • If the first strong character preceding an inline element with a dir attribute has the same directionality as that element, it too is inside an inline element with a dir attribute.

These conditions are actually more commonly satisfied than not, because:

  • It is usually redundant to put a dir attribute on an element if its parent already has that directionality.
  • If a software application creating a web page bothers to declare the directionality of one piece of opposite-direction text that it needs to display inline, it is likely to do the same for another.

Thus, one could say that under the most common circumstances, the behavior of IE8 and up is closer to isolation than to embedding. But even if that seems like a stretch, it is quite safe to say that currently there is a lack of interoperability in the behavior of the dir attribute between the current versions of IE and of the other major browsers (which continue to follow the current HTML specification and give the dir attribute embedding semantics).

Given that:

  • The currently specified semantics of the dir attribute, embedding, are known to be inferior to (and are deprecated by Unicode in favor of) isolation, and
  • There is already a lack of interoperability in dir attribute behavior between major browsers, and
  • The behavior of one major browser is, under the most common circumstances, akin to isolation

then it seems like the best solution to the dir attribute semantics problem is to simply change the HTML specification for the dir attribute to use isolation instead of embedding.

In other words, the HTML default stylesheet specification for dir="ltr" and dir="rtl" should be changed to result in unicode-bidi:isolate (or, for <bdo>, unicode-bidi:isolate-override), instead of unicode-bidi:embed (or, for <bdo>, unicode-bidi:bidi-override).