This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Comment from the i18n review of: http://dev.w3.org/html5/spec/

Comment 7
At http://www.w3.org/International/reviews/html5-bidi/
Editorial/substantive: S
Tracked by: AL

Location in reviewed document: undefined [http://dev.w3.org/html5/spec/spec.html#contents]

Comment:
This is part of the proposals made by the "Additional Requirements for Bidi in HTML" W3C First Public Working Draft. For a full description of the use cases, please see http://www.w3.org/International/docs/html-bidi-requirements/#newline-as-separator.

Here is the proposal made there:

In elements where line breaks are not collapsed, e.g. <textarea> and elements with white-space:pre|pre-line|pre-wrap, line breaks should constitute UBA paragraph breaks.

When a line break introduces a UBA paragraph break, the base direction of the new UBA paragraph will be determined by the computed direction of the nearest ancestor element whose bidi properties require its contents to be in a separate UBA paragraph (or sequence of paragraphs), e.g. a block element or an element directionally isolated by the ubi attribute (which is being proposed in a separate bug). Furthermore, for every element on the path in between that results in the creation of an embedding or override level, e.g. a <bdo> element or any element with a dir attribute or a value other than "normal" for the unicode-bidi CSS property, the corresponding embedding or override level is re-introduced at the start of the new UBA paragraph (to be closed at the end of the element or the UBA paragraph, whichever comes first).
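The second half of the proposal (re-introducing embeddings and overrides in each new paragraph) can be sketched roughly as follows. This is only an illustration under assumptions: the function name, the tuple-based return value, and the idea of modelling open elements as a list of Unicode control characters are all hypothetical and not taken from any specification. Real layout engines track embedding levels directly rather than inserting characters.

```python
# Hypothetical sketch: models the proposal with explicit bidi control characters.
LRE, RLE = "\u202a", "\u202b"  # left-to-right / right-to-left embedding
LRO, RLO = "\u202d", "\u202e"  # left-to-right / right-to-left override
PDF = "\u202c"                 # pop directional formatting


def split_into_uba_paragraphs(text, base_direction, open_codes):
    """Split preformatted text into UBA paragraphs at each line feed.

    base_direction: "ltr" or "rtl", the computed direction of the nearest
        ancestor that starts a separate UBA paragraph (assumed already known).
    open_codes: control characters opened by elements between that ancestor
        and the text, outermost first (e.g. [RLE] for <span dir=rtl>).

    Returns one (base_direction, paragraph_text) tuple per line. Each
    paragraph re-opens every code and closes it again with PDF.
    """
    prefix = "".join(open_codes)
    suffix = PDF * len(open_codes)
    return [(base_direction, prefix + line + suffix) for line in text.split("\n")]
```

For example, `split_into_uba_paragraphs("ab\ncd", "rtl", [LRO])` yields two right-to-left paragraphs, each wrapped in its own LRO ... PDF pair, so the override does not silently end at the first line break.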
Shouldn't this be defined in CSS instead of HTML, at least the part that ties behavior to specific white-space values? See also bug 10810
I don't understand what HTML has to do here. Could you give an example that doesn't use CSS that demonstrates the ambiguity in the spec today?
(In reply to comment #2)
> I don't understand what HTML has to do here.
>
> Could you give an example that doesn't use CSS that demonstrates the ambiguity
> in the spec today?

Here is an example. Uppercase Latin letters are used to represent RTL characters:

<textarea dir=rtl>
1. IT IS IMPORTANT TO LEARN html.
2. css IS IMPORTANT TOO.
</textarea>

The correct display should be:

.html NRAEL OT TNATROPMI SI TI .1
.OOT TNATROPMI SI css .2

Currently, because the bidi behavior of line breaks in <textarea> has not been explicitly specified, the display in Firefox and Opera is:

html. NRAEL OT TNATROPMI SI TI .1
.OOT TNATROPMI SI 2. css

This is unreadable.

The exact same example could have been given with <pre> instead of <textarea>, and that is why I originally formulated this bug as "In elements where line breaks are not collapsed, e.g. <textarea> and <pre>, line breaks should constitute UBA paragraph breaks." However, I then realized that <div style="white-space:pre"> should also be covered, just like <pre>, and replaced "<pre>" with "elements with white-space:pre|pre-line|pre-wrap". As you point out, that was a mistake: the part about elements with white-space:pre|pre-line|pre-wrap should only be in the CSS spec. The HTML spec should just speak of <textarea> and <pre>.
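The difference between the two displays can be reproduced with a toy model. The sketch below is an assumption-laden simplification, not a conformant UBA implementation: it follows the example's convention (uppercase = RTL, lowercase = LTR), assumes an RTL base direction, and implements only a small subset of the rules (W7, N1/N2, I2, L2). It skips explicit embeddings, number separators, and the trailing-whitespace rule L1. All function names are hypothetical.

```python
SAMPLE = "1. IT IS IMPORTANT TO LEARN html.\n2. css IS IMPORTANT TOO."


def classify(ch):
    # Example convention: uppercase stands in for RTL letters.
    if ch.isupper():
        return "R"
    if ch.islower():
        return "L"
    if ch.isdigit():
        return "EN"
    return "N"  # spaces, punctuation, and a newline treated as whitespace


def resolve_levels(para):
    """Embedding levels for one paragraph whose base direction is RTL (level 1)."""
    types = [classify(ch) for ch in para]
    # W7 (simplified): a digit becomes L if the last strong character was L.
    last_strong = "R"
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"
    # N1/N2: neutrals take the surrounding direction if both sides agree
    # (digits count as R), otherwise the base direction R.
    side = lambda t: "L" if t == "L" else "R"
    i, n = 0, len(types)
    while i < n:
        if types[i] != "N":
            i += 1
            continue
        j = i
        while j < n and types[j] == "N":
            j += 1
        before = side(types[i - 1]) if i > 0 else "R"
        after = side(types[j]) if j < n else "R"
        types[i:j] = [before if before == after else "R"] * (j - i)
        i = j
    # I2: at odd base level 1, R stays at 1; L and digits go up to 2.
    return [1 if t == "R" else 2 for t in types]


def reorder(chars, levels):
    """L2: reverse runs at level >= k, from the highest level down to 1."""
    chars = list(chars)
    for level in (2, 1):
        i = 0
        while i < len(chars):
            if levels[i] < level:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= level:
                j += 1
            chars[i:j] = chars[i:j][::-1]
            i = j
    return "".join(chars)


def display(text, newline_is_separator):
    """Visual order of each line; the newline always forces a line break."""
    paragraphs = text.split("\n") if newline_is_separator else [text]
    lines = []
    for para in paragraphs:
        levels = resolve_levels(para)
        start = 0
        breaks = [i for i, ch in enumerate(para) if ch == "\n"] + [len(para)]
        for end in breaks:
            lines.append(reorder(para[start:end], levels[start:end]))
            start = end + 1
    return lines
```

With this model, `display(SAMPLE, True)` gives the two lines shown above as the correct display, and `display(SAMPLE, False)` gives the Firefox/Opera rendering. In the second case "html.", the newline, and "2. css" resolve to a single left-to-right run, which is why the period and the list number land on the wrong side.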
Doesn't Unicode define the bidi behaviour of U+000A?
(In reply to comment #4)
> Doesn't Unicode define the bidi behaviour of U+000A?

Yes, it does, and the proposal is compatible with it. But the Unicode standard is not definitive for HTML. After all, newlines are usually whitespace-collapsed in HTML. The ambiguity was sufficient to allow Firefox and Opera to treat line breaks as bidi whitespace for all this time.
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: Unicode is a normative reference, and except where otherwise specified, it is therefore entirely normative. The few places where it _is_ otherwise specified are e.g. in the rendering section, where the specification defines the suggested behaviour in terms of CSS, which itself overrides Unicode in certain specific places but by and large defers straight to Unicode, normatively.

So I don't think that this is undefined. Firefox and Opera are wrong here. File bugs, pointing to Unicode. If they disagree, send me links to the bugs where they disagree and I'll help out in whatever way I can.
I filed this Mozilla bug: https://bugzilla.mozilla.org/show_bug.cgi?id=607541
For the record, this bug was actually fixed (or nearly so) by <http://html5.org/tools/web-apps-tracker?from=5669&to=5670>. (I say "nearly" because that change uses the term "newline". I think that at least in the past the precise terminology here would have been "line break", thus covering <CR>, <LF>, and combinations.)
Aharon's suggestion is fine with me. I would like to see the wording tweaked so that U+2028, U+2029, U+000A, and U+000D all constitute UBA paragraph breaks, with this specification applying to both the <pre> and <textarea> elements.

Best,
--C. E. Whitehead
cewcathar@hotmail.com
(In reply to comment #8)
> For the record, this bug was actually fixed (or nearly so) by
> <http://html5.org/tools/web-apps-tracker?from=5669&to=5670>. (I say "nearly"
> because that change uses the term "newline". I think that at least in the past
> the precise terminology here would have been "line break", thus covering <CR>,
> <LF>, and combinations.)

Should we file a new bug requesting this change?
(In reply to comment #10) Filed bug 11436.
Minor correction to Comment 9: U+2028 should probably be excluded from the list. U+2028 is LS (LINE SEPARATOR), which, per UAX #13, only breaks the line without starting a new paragraph.
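One way to check how the Unicode Character Database classifies the characters named in this thread is Python's standard `unicodedata` module. This is a small sketch; the dictionary and function names are illustrative, and the result reflects whatever UCD version ships with the Python interpreter in use.

```python
import unicodedata

# Characters discussed in this thread as possible UBA paragraph breaks.
CANDIDATES = {
    "U+000A LINE FEED": "\u000a",
    "U+000D CARRIAGE RETURN": "\u000d",
    "U+2028 LINE SEPARATOR": "\u2028",
    "U+2029 PARAGRAPH SEPARATOR": "\u2029",
}


def bidi_classes():
    """Map each candidate to its Bidi_Class ("B" = paragraph separator)."""
    return {label: unicodedata.bidirectional(ch) for label, ch in CANDIDATES.items()}


for label, bidi_class in bidi_classes().items():
    print(f"{label}: {bidi_class}")
```

U+000A, U+000D, and U+2029 report class B (paragraph separator), while U+2028 reports WS (whitespace), which is consistent with excluding U+2028 from the list.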