This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 10812 - i18n comment 7 : line breaks in textarea and pre elements
Summary: i18n comment 7 : line breaks in textarea and pre elements
Status: CLOSED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
Reported: 2010-09-29 12:37 UTC by i18n bidi group
Modified: 2011-01-22 20:14 UTC
CC List: 11 users

Description i18n bidi group 2010-09-29 12:37:42 UTC
Comment from the i18n review of:
http://dev.w3.org/html5/spec/

Comment 7
At http://www.w3.org/International/reviews/html5-bidi/
Editorial/substantive: S
Tracked by: AL

Location in reviewed document:
http://dev.w3.org/html5/spec/spec.html#contents

Comment: This is part of the proposals made by the "Additional Requirements for Bidi in HTML" W3C First Public Working Draft. For a full description of the use cases, please see:
http://www.w3.org/International/docs/html-bidi-requirements/#newline-as-separator
Here is the proposal made there:

In elements where line breaks are not collapsed, e.g. <textarea> and elements with white-space:pre|pre-line|pre-wrap, line breaks should constitute UBA paragraph breaks.

When a line break introduces a UBA paragraph break, the base direction of the new UBA paragraph is determined by the computed direction of the nearest ancestor element whose bidi properties require its contents to be in a separate UBA paragraph (or sequence of paragraphs), e.g. a block element or an element directionally isolated by the ubi attribute (which is being proposed in a separate bug). Furthermore, for every element on the path in between that creates an embedding or override level, e.g. a <bdo> element or any element with a dir attribute or a value other than "normal" for the unicode-bidi CSS property, the corresponding embedding or override level is re-introduced at the start of the new UBA paragraph (to be closed at the end of the element or the UBA paragraph, whichever comes first).
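
The re-introduction rule above can be sketched in plain-text terms, with the Unicode embedding and override control characters standing in for the markup. This is an illustration only, under stated assumptions: the function name split_into_uba_paragraphs and its parameters are hypothetical, the base direction is simply passed in as the ancestor's computed direction, and only CR, LF and CRLF are treated as line breaks.

```python
import re

# Unicode directional formatting characters used by the UBA.
LRE, RLE, PDF = "\u202A", "\u202B", "\u202C"  # embeddings and their terminator
LRO, RLO = "\u202D", "\u202E"                 # overrides

def split_into_uba_paragraphs(text, base_direction, open_controls):
    """Split text at line breaks into (base_direction, content) pairs.

    base_direction: computed direction ("ltr" or "rtl") of the nearest
        ancestor that starts a separate UBA paragraph (hypothetical input).
    open_controls: embedding/override characters opened by the elements
        between that ancestor and the text, outermost first.
    Every embedding or override is re-opened at the start of each new
    paragraph and closed again at its end.
    """
    prefix = "".join(open_controls)
    suffix = PDF * len(open_controls)
    lines = re.split(r"\r\n|\r|\n", text)
    return [(base_direction, prefix + line + suffix) for line in lines]

# Text inside an element with dir=rtl, nested in an LTR block.
paragraphs = split_into_uba_paragraphs("FIRST LINE\nSECOND LINE", "ltr", [RLE])
for direction, content in paragraphs:
    print(direction, repr(content))
```

Each resulting paragraph carries the same base direction and the same stack of embeddings, so the second line is laid out exactly as it would be if it were the first.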
Comment 1 Maciej Stachowiak 2010-09-29 16:08:24 UTC
Shouldn't this be defined in CSS instead of HTML, at least the part that ties behavior to specific white-space values?

See also bug 10810
Comment 2 Ian 'Hixie' Hickson 2010-10-05 00:44:32 UTC
I don't understand what HTML has to do here.

Could you give an example that doesn't use CSS that demonstrates the ambiguity in the spec today?
Comment 3 Aharon Lanin 2010-10-06 20:32:48 UTC
(In reply to comment #2)
> I don't understand what HTML has to do here.
> 
> Could you give an example that doesn't use CSS that demonstrates the ambiguity
> in the spec today?

Here is an example. Uppercase Latin letters are used to represent RTL characters:

<textarea dir=rtl>
1. IT IS IMPORTANT TO LEARN html.
2. css IS IMPORTANT TOO.
</textarea>

The correct display should be:

          .html NRAEL OT TNATROPMI SI TI .1
                   .OOT TNATROPMI SI css .2

Currently, because the bidi behavior of line breaks in <textarea> has not been explicitly specified, the display in Firefox and Opera is:

          html. NRAEL OT TNATROPMI SI TI .1
                   .OOT TNATROPMI SI 2. css

This is unreadable.
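
The difference between the two displays can be reproduced with a toy model. What follows is a heavily simplified sketch, not an implementation of the UBA: it assumes an RTL base direction, uses the convention above that uppercase letters stand for RTL characters, and handles only the character types that occur in this example (no explicit embeddings, no Arabic numbers). All function names are hypothetical.

```python
def classify(ch):
    """Bidi type under the example's convention (uppercase = RTL)."""
    if ch.isupper():
        return "R"
    if ch.islower():
        return "L"
    if ch.isdigit():
        return "EN"
    return "N"  # spaces, periods and newlines are all treated as neutrals

def side(t):
    """Numbers influence neighbouring neutrals as if they were R."""
    return "L" if t == "L" else "R"

def resolve_levels(text):
    """Resolve embedding levels for one RTL paragraph (base level 1)."""
    types = [classify(c) for c in text]
    # A number takes type L if the last strong character before it was L.
    last_strong = "R"  # start of an RTL paragraph counts as R
    for i, t in enumerate(types):
        if t in ("L", "R"):
            last_strong = t
        elif t == "EN" and last_strong == "L":
            types[i] = "L"
    # Neutrals take the direction of their neighbours if both sides agree,
    # otherwise the paragraph direction (R).
    i, n = 0, len(types)
    while i < n:
        if types[i] != "N":
            i += 1
            continue
        j = i
        while j < n and types[j] == "N":
            j += 1
        before = side(types[i - 1]) if i > 0 else "R"
        after = side(types[j]) if j < n else "R"
        types[i:j] = [before if before == after else "R"] * (j - i)
        i = j
    # In an RTL paragraph, R stays at level 1; L and numbers go to level 2.
    return [1 if t == "R" else 2 for t in types]

def reorder(line, levels):
    """Reverse runs at level 2, then everything at level 1 or higher."""
    chars = list(line)
    for lvl in (2, 1):
        i = 0
        while i < len(chars):
            if levels[i] < lvl:
                i += 1
                continue
            j = i
            while j < len(chars) and levels[j] >= lvl:
                j += 1
            chars[i:j] = reversed(chars[i:j])
            i = j
    return "".join(chars)

def display(text, newline_is_paragraph_break):
    """Return the visual order of each line of text."""
    if newline_is_paragraph_break:
        return [reorder(l, resolve_levels(l)) for l in text.split("\n")]
    levels = resolve_levels(text)  # the newline is just another neutral
    out, start = [], 0
    for line in text.split("\n"):
        out.append(reorder(line, levels[start:start + len(line)]))
        start += len(line) + 1
    return out

EXAMPLE = "1. IT IS IMPORTANT TO LEARN html.\n2. css IS IMPORTANT TOO."
for mode in (True, False):
    print("\n".join(display(EXAMPLE, mode)), end="\n\n")
```

With the newline as a paragraph break, each line is resolved on its own and the model yields the correct display. With the newline as a neutral, "html.", the newline, and "2. css" resolve into a single LTR run that spans both lines, which produces the misplaced period and number shown above.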

The exact same example could have been given with <pre> instead of <textarea>, and that is why I originally formulated this bug as "In elements where line breaks are not collapsed, e.g. <textarea> and <pre>, line breaks should constitute UBA paragraph breaks."

However, I then realized that <div style="white-space:pre"> should also be covered, just like <pre>, and replaced "<pre>" with "elements with white-space:pre|pre-line|pre-wrap".

As you point out, that was a mistake: the part about elements with white-space:pre|pre-line|pre-wrap should only be in the CSS spec. The HTML spec should just speak of <textarea> and <pre>.
Comment 4 Ian 'Hixie' Hickson 2010-10-12 09:19:37 UTC
Doesn't Unicode define the bidi behaviour of U+000A?
Comment 5 Aharon Lanin 2010-10-12 09:34:23 UTC
(In reply to comment #4)
> Doesn't Unicode define the bidi behaviour of U+000A?

Yes, it does, and the proposal is compatible with it. But the Unicode standard is not definitive for HTML. After all, newlines are usually whitespace-collapsed in HTML. The ambiguity was sufficient to allow Firefox and Opera to treat line breaks as bidi whitespace for all this time.
Comment 6 Ian 'Hixie' Hickson 2010-10-14 07:48:13 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale:

Unicode is a normative reference, and except where otherwise specified, it is therefore entirely normative. The few places where it _is_ otherwise specified are e.g. in the rendering section, where the specification defines the suggested behaviour in terms of CSS, which itself overrides Unicode in certain specific places but by and large defers straight to Unicode, normatively.

So I don't think that this is undefined. Firefox and Opera are wrong here. File bugs, pointing to Unicode. If they disagree, send me links to the bugs where they disagree and I'll help out in whatever way I can.
Comment 7 Ehsan Akhgari [:ehsan] 2010-10-27 02:51:07 UTC
I filed this Mozilla bug:

https://bugzilla.mozilla.org/show_bug.cgi?id=607541
Comment 8 Aharon Lanin 2010-11-08 12:04:34 UTC
For the record, this bug was actually fixed (or nearly so) by <http://html5.org/tools/web-apps-tracker?from=5669&to=5670>. (I say "nearly" because that change uses the term "newline". I think that at least in the past the precise terminology here would have been "line break", thus covering <CR>, <LF>, and combinations.)
Comment 9 CE Whitehead 2010-11-23 23:40:28 UTC
Aharon's suggestion is fine with me. I would like to see the wording tweaked so that U+2028, U+2029, U+000A, and U+000D all constitute UBA paragraph breaks, with this specification applying to both the <pre> and <textarea> elements.

Best,

--C. E. Whitehead
cewcathar@hotmail.com
Comment 10 Ehsan Akhgari [:ehsan] 2010-11-25 06:50:30 UTC
(In reply to comment #8)
> For the record, this bug was actually fixed (or nearly so) by
> <http://html5.org/tools/web-apps-tracker?from=5669&to=5670>. (I say "nearly"
> because that change uses the term "newline". I think that at least in the past
> the precise terminology here would have been "line break", thus covering <CR>,
> <LF>, and combinations.)

Should we file a new bug requesting this change?
Comment 11 Aharon Lanin 2010-11-30 09:24:24 UTC
(In reply to comment #10)

Filed bug 11436.
Comment 12 Amit Aronovitch 2010-12-02 07:29:37 UTC
Minor correction to Comment 9:

U+2028 should probably be excluded from the list. U+2028 is LS (LINE SEPARATOR), which, per UAX #13, only breaks the line without affecting the flow.
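
One way to check which of the characters discussed in Comment 9 are paragraph separators for bidi purposes is to look up their Bidi_Class in the Unicode Character Database. The sketch below uses Python's standard unicodedata module; the helper name is_uba_paragraph_separator is hypothetical, and the result reflects whatever Unicode version the Python build ships with.

```python
import unicodedata

CANDIDATES = {
    "U+000A LINE FEED": "\u000A",
    "U+000D CARRIAGE RETURN": "\u000D",
    "U+2028 LINE SEPARATOR": "\u2028",
    "U+2029 PARAGRAPH SEPARATOR": "\u2029",
}

def is_uba_paragraph_separator(ch):
    """True if the character's Bidi_Class is B (Paragraph Separator)."""
    return unicodedata.bidirectional(ch) == "B"

for name, ch in CANDIDATES.items():
    print(f"{name}: bidi class {unicodedata.bidirectional(ch)}")
```

U+000A, U+000D and U+2029 are reported as class B, while U+2028 is reported as WS (whitespace), which is consistent with excluding it from the list.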