This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
As has surfaced in the discussion of bug 10809, it would be helpful to declare invalid any document in which an element's text node children (*not* descendants generally) contain improperly balanced LRE, RLE, LRO, RLO, or PDF characters. In other words, for the purposes of validation, treat every LRE, RLE, LRO, or RLO character as the opening tag of an imaginary element, something like <bidi-formatting>, and PDF as that imaginary element's closing tag. This applies to these characters' entities as well, of course.
Examples of invalid usage:
1. <div>‪</div>
2. <div>‬</div>
3. <div>‬‪</div>
4. <div>‪‪‬</div>
5. <div>‪<br>‪‬</div>
6. <div>‪<span>‬</span></div>
7. <div><span>‪</span>‬</div>
An example of valid (but not recommended!) usage:
<div>‪<span>...</span>‬</div>
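The balancing rule proposed above can be sketched as a small checker. This is only an illustration of the proposal, not the spec's eventual algorithm: it keeps one counter per open element, so only an element's own text node children count toward its balance. The names `BidiBalanceChecker` and `is_bidi_balanced` and the `VOID` list are assumptions made for this sketch, and the input is assumed to be well-formed markup with explicit end tags.

```python
from html.parser import HTMLParser

OPENERS = {"\u202A", "\u202B", "\u202D", "\u202E"}  # LRE, RLE, LRO, RLO
PDF = "\u202C"
# Elements without an end tag (assumed list, enough for this sketch).
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class BidiBalanceChecker(HTMLParser):
    """Tracks open embeddings separately for each open element."""

    def __init__(self):
        # convert_charrefs=True turns &#x202A; etc. into the characters.
        super().__init__(convert_charrefs=True)
        self.depths = [0]  # index 0 stands for text outside any element
        self.valid = True

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.depths.append(0)

    def handle_endtag(self, tag):
        if tag in VOID or len(self.depths) == 1:
            return
        # An element may not close while one of its embeddings is open.
        if self.depths.pop() != 0:
            self.valid = False

    def handle_data(self, data):
        for ch in data:
            if ch in OPENERS:
                self.depths[-1] += 1
            elif ch == PDF:
                if self.depths[-1] == 0:
                    self.valid = False  # PDF with nothing to close
                else:
                    self.depths[-1] -= 1


def is_bidi_balanced(html):
    checker = BidiBalanceChecker()
    checker.feed(html)
    checker.close()
    return checker.valid and all(d == 0 for d in checker.depths)
```

Under these assumptions, all seven invalid examples are rejected and the "valid (but not recommended!)" example is accepted, because the LRE and the PDF are both direct text children of the same <div>.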
This shouldn't be too hard to add to the spec.
What about attribute values?
(In reply to comment #2)
> What about attribute values?
Not sure what you mean.
I mean, should the following be invalid? <p title="‪">
Yes.
(In reply to comment #5)
> Yes.
Yeah, it makes sense. They should be balanced within an attribute value.
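For attribute values, the same balance test applies to each value on its own. A minimal sketch follows, using Python's html.parser; the helper names `balanced` and `unbalanced_attributes` are hypothetical and chosen only for this illustration.

```python
from html.parser import HTMLParser

OPENERS = "\u202A\u202B\u202D\u202E"  # LRE, RLE, LRO, RLO
PDF = "\u202C"


def balanced(value):
    """True if every opener in value is closed by a later PDF."""
    depth = 0
    for ch in value:
        if ch in OPENERS:
            depth += 1
        elif ch == PDF:
            depth -= 1
            if depth < 0:  # PDF with nothing to close
                return False
    return depth == 0


class AttrChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.bad = []  # (tag, attribute name) pairs with unbalanced values

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # value is None for attributes written without a value
            if value is not None and not balanced(value):
                self.bad.append((tag, name))


def unbalanced_attributes(html):
    checker = AttrChecker()
    checker.feed(html)
    checker.close()
    return checker.bad
```

With this sketch, a <p> whose title attribute holds a lone LRE is reported, whether the LRE is written as the raw character or as a character reference.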
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document: http://dev.w3.org/html5/decision-policy/decision-policy.html
Status: Accepted
Change Description: see diff given below
Rationale: Concurred with reporter's comments.
Checked in as WHATWG revision r5754. Check-in comment: Define conformance criteria around bidi formatting characters http://html5.org/tools/web-apps-tracker?from=5753&to=5754
mass-move component to LC1
The checked-in change seems to say that the use of the formatting characters (when restricted as specified) is perfectly fine:
"[Text content] may contain characters in the range U+202A to U+202E (the bidirectional-algorithm formatting characters)."
"Note: *For convenience*, where possible authors will likely prefer to use the dir attribute, the bdo element, and the bdi element, rather than maintaining the bidirectional-algorithm formatting characters manually." (emphasis mine)
The use of the formatting characters, even when they obey the given rules, should still be discouraged. It is *not* equivalent to the use of the dir attribute and the bdo element, for two reasons. (BTW, the bdi element should not be mentioned at all. There is no way to faithfully emulate its behavior using the formatting characters.)
1. The dir attribute sets the element's directionality. The formatting characters don't. That means that they do not affect the proposed CSS4 :dir(ltr|rtl) pseudo-class.
2. When used around an element that introduces a bidi paragraph break, e.g. "LRE <br> PDF" or "LRE <div></div> PDF", the formatting characters go completely haywire, since the paragraph break resets the bidirectional state, so that the effect of the opening character is lost after the paragraph break, and the closing formatting character is unmatched. The effects of the dir attribute, on the other hand, are carefully defined in CSS (via its effect on unicode-bidi) to be reopened after the paragraph break.
Neither of these can be fixed. Thus, the use of the formatting characters, even when they obey the given rules, should be discouraged wherever mark-up can be used instead. The bug as opened suggested ruling certain uses of formatting characters completely invalid. It did not suggest pronouncing the remaining use perfectly fine. Certainly the use of the dir attribute etc. is more than a matter of convenience. It is *the only recommended way* of declaring text direction in HTML (except for those places where mark-up cannot be used, e.g. inside <option> and <title>). The use of both CSS and formatting characters for this purpose is discouraged (for different reasons).
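The second point, that a paragraph break resets the bidirectional state, can be illustrated on plain text. The sketch below is an assumption-laden model, not the full Unicode Bidirectional Algorithm: it uses U+2029 PARAGRAPH SEPARATOR (bidi class B) to stand in for the break that <br> or a block element would introduce, and the function name `scan_embeddings` is hypothetical.

```python
import unicodedata

OPENER_CLASSES = {"LRE", "RLE", "LRO", "RLO"}


def scan_embeddings(text):
    """Return (lost_openers, unmatched_pdfs) for text.

    lost_openers counts embeddings still open when a paragraph ends;
    unmatched_pdfs counts PDFs that have no opener in their paragraph.
    """
    depth = 0
    lost_openers = 0
    unmatched_pdfs = 0
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in OPENER_CLASSES:
            depth += 1
        elif bidi_class == "PDF":
            if depth == 0:
                unmatched_pdfs += 1
            else:
                depth -= 1
        elif bidi_class == "B":
            # A paragraph separator ends every open embedding.
            lost_openers += depth
            depth = 0
    lost_openers += depth  # the end of the text ends the last paragraph
    return lost_openers, unmatched_pdfs
```

For "LRE abc <paragraph separator> def PDF", this reports one lost opener and one unmatched PDF, which is the "haywire" case described above. Without the separator, the same characters balance.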
(In reply to comment #10)
> Certainly the use of the dir attribute etc. is more than a matter of convenience. It is *the only recommended way* of declaring text direction in HTML
While I do not disagree with you on this point (which is to say, I agree), I think we should not go as far as recommending against ("should not"). The BiDi control characters can come in handy when different sources produce the HTML entities and the content, and are sometimes the only practical option available.
Shachar
(In reply to comment #10)
> (BTW, the bdi element should not be mentioned at all. There is no way to faithfully emulate its behavior using the formatting characters.)
I disagree on this point; you can't faithfully emulate <bdi> with formatting characters as it's not equivalent to any one of them, but some of the problems that can be solved with formatting characters (like ‏) are better solved with <bdi>, so this cross-reference should be given.
(In reply to comment #12)
> (In reply to comment #10)
> > (BTW, the bdi element should not be mentioned at all. There is no way to faithfully emulate its behavior using the formatting characters.)
> I disagree on this point; you can't faithfully emulate <bdi> with formatting characters as it's not equivalent to any one of them, but some of the problems that can be solved with formatting characters (like ‏) are better solved with <bdi>, so this cross-reference should be given.
Currently, the sentence says that the mark-up is just a convenience that translates to formatting characters, which is not really true for dir= and <bdo>, and completely untrue for <bdi>. If the sentence is changed to encourage people to use dir=, <bdo>, and <bdi> instead of formatting characters, then I fully agree with fantasai.
(In reply to comment #11)
> (In reply to comment #10)
> > Certainly the use of the dir attribute etc. is more than a matter of convenience. It is *the only recommended way* of declaring text direction in HTML
> While I do not disagree with you on this point (which is to say, I agree), I think we should not go as far as recommending against ("should not"). The BiDi control characters can come in handy when different sources produce the HTML entities and the content, and are sometimes the only practical option available.
> Shachar
The spec could recommend using directional mark-up instead of directional formatting characters whenever feasible. It could also have a note warning that placing an element between an LRE, RLE, LRO, or RLO and its matching PDF does not work well with various HTML and CSS features, and has effects that vary radically depending on the element's style.
EDITOR'S RESPONSE:
Status: Accepted
Change Description: see diff given below
Rationale: Concurred with reporter's comments. Specifically, I changed the spec to encourage authors to use the elements instead, and made the conformance rules not allow "LRE <div></div> PDF".
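A rule of the kind described in this response, disallowing "LRE <div></div> PDF", could be checked roughly as follows. This is a simplified sketch and not the spec's algorithm: it keeps a single running counter, and the `BREAKING` set is a hypothetical, incomplete stand-in for "flow content that is not phrasing content". The names `BreakChecker` and `bidi_break_errors` are likewise assumptions.

```python
from html.parser import HTMLParser

OPENERS = "\u202A\u202B\u202D\u202E"  # LRE, RLE, LRO, RLO
PDF = "\u202C"
# Hypothetical, incomplete list of elements that act as paragraph breaks.
BREAKING = {"div", "p", "ul", "ol", "li", "table", "blockquote", "pre"}


class BreakChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.depth = 0    # embeddings opened since the last break
        self.errors = []

    def _paragraph_break(self, where):
        if self.depth > 0:
            self.errors.append("embedding still open at " + where)
            self.depth = 0  # the break ends every open embedding

    def handle_starttag(self, tag, attrs):
        if tag in BREAKING:
            self._paragraph_break("<" + tag + ">")

    def handle_endtag(self, tag):
        if tag in BREAKING:
            self._paragraph_break("</" + tag + ">")

    def handle_data(self, data):
        for ch in data:
            if ch in OPENERS:
                self.depth += 1
            elif ch == PDF:
                if self.depth == 0:
                    self.errors.append("unmatched PDF")
                else:
                    self.depth -= 1


def bidi_break_errors(html):
    checker = BreakChecker()
    checker.feed(html)
    checker.close()
    if checker.depth > 0:
        checker.errors.append("embedding still open at end of input")
    return checker.errors
```

In this model "LRE <div></div> PDF" yields two errors (the embedding is cut off by the <div>, and the PDF is then unmatched), while an LRE and PDF wrapped around a <span> inside one <div> yields none.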
Checked in as WHATWG revision r6487. Check-in comment: More useful conformance rules and advice for bidi formatting characters http://html5.org/tools/web-apps-tracker?from=6486&to=6487
The change looks great, with two small flaws:
1. The treatment given to an element that is flow content but is not also phrasing content should be extended to <br>, which also serves as a bidi paragraph break, and thus (by design) terminates the effects of the bidi formatting characters.
2. The comment that the formatting characters interact poorly with CSS is too narrow - they also interact poorly with some HTML features (even when used as currently spec'ed). An example:
<div dir=rtl>‪If this works I will eat my <input />.‬</div>
The <input> will have RTL directionality despite being between an LRE and its matching PDF. I am not suggesting adding this example or changing the validity spec - just expanding the note to include some unspecified HTML features (as opposed to just CSS).
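The <input> example can be modelled with a short sketch of how directionality is inherited from the dir attribute. It is deliberately simplified and rests on stated assumptions: only dir=ltr and dir=rtl are handled (dir=auto is ignored), the default is ltr, and the names `DirTracker` and `directionality_of` are hypothetical. The point it shows is that formatting characters in the text never enter the calculation.

```python
from html.parser import HTMLParser

# Elements without an end tag (assumed list, enough for this sketch).
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class DirTracker(HTMLParser):
    def __init__(self, target):
        super().__init__()
        self.target = target
        self.stack = ["ltr"]  # directionality of the open ancestors
        self.found = []       # directionality of each target element seen

    def handle_starttag(self, tag, attrs):
        value = (dict(attrs).get("dir") or "").lower()
        # An explicit dir wins; otherwise inherit from the parent.
        direction = value if value in ("ltr", "rtl") else self.stack[-1]
        if tag == self.target:
            self.found.append(direction)
        if tag not in VOID:
            self.stack.append(direction)

    def handle_endtag(self, tag):
        if tag not in VOID and len(self.stack) > 1:
            self.stack.pop()

    # handle_data is not overridden: text, including LRE/PDF, is ignored.


def directionality_of(html, target):
    tracker = DirTracker(target)
    tracker.feed(html)
    tracker.close()
    return tracker.found
```

Run on the example above, the sketch reports "rtl" for the <input>, since the only input to the calculation is the dir=rtl on the enclosing <div>.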
I'll add something about <br>.
> <div dir=rtl>‪If this works I will eat my <input />.‬</div>
That's not a poor interaction IMHO.
BTW, the I18N WG supported re-opening this bug and Aharon's comments generally (I18N-ACTION-66). In looking at the changes, I note that there may be a very minor typo where it says:
-- The strings resulting from the applying the following algorithm... --
It should say "The string", since "output" is a single string?
(In reply to comment #19)
> It should say "The string", since "output" is a single string?
"output" is a list of strings.
EDITOR'S RESPONSE:
Status: Partially Accepted
Change Description: see diff given below
Rationale: Addressed the <br> issue.
Checked in as WHATWG revision r6533. Check-in comment: Make sure <br> is handled right in the requirements regarding bidi formatting characters. http://html5.org/tools/web-apps-tracker?from=6532&to=6533