This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 19067 - i18n comment 4 : at least by default, <br> should constitute a bidi paragraph break
Status: RESOLVED LATER
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: HTML WG Bugzilla archive list
URL: http://www.w3.org/Bugs/Public/show_bu...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-09-25 22:03 UTC by contributor
Modified: 2012-09-26 08:26 UTC
27 users

See Also:


Attachments

Description contributor 2012-09-25 22:03:44 UTC
This was cloned from bug 10828 as part of operation LATER convergence.
Originally filed: 2010-09-29 13:17:00 +0000
Original reporter: i18n bidi group <public-i18n-bidi@w3.org>

================================================================================
 #0   i18n bidi group                                 2010-09-29 13:17:48 +0000 
--------------------------------------------------------------------------------
Comment from the i18n review of:
http://dev.w3.org/html5/spec/

Comment 4
At http://www.w3.org/International/reviews/html5-bidi/
Editorial/substantive: S
Tracked by: AL

Location in reviewed document:
undefined [http://dev.w3.org/html5/spec/spec.html#contents]

Comment: This is a part of the proposals made by the "Additional Requirements for Bidi in HTML" W3C First Public Working Draft. For a full description of the use cases, please see
http://www.w3.org/International/docs/html-bidi-requirements/#br-as-separator
Here is the proposal made there:

Support a new HTML element attribute, bidibreak=hard|soft. On a <br> element, the "soft" value means that the <br> is to be treated as a UBA bidi class WS (whitespace) character, as was required in HTML 4 [http://www.w3.org/TR/html4/struct/text.html#edef-BR]. The "hard" value means that the <br> is to be treated as UBA bidi class B, i.e. paragraph break. If neither is specified, the bidibreak attribute value is inherited from the parent. Thus, when specified on an element other than <br>, bidibreak serves to determine the behavior of descendant <br> elements. For the root element, the default is "hard" (which, of course, spreads to every <br> element in the document, unless an intervening element sets bidibreak otherwise).

Alternatively, if and only if all major browser makers reach unanimous consensus that the default value for the root element should be "soft" and commit to implementing it as such to the HTML WG prior to the new HTML specification publication, that too would be fine.

When the author wants to use <br> just to wrap a line without adding bidi separation, <br bidibreak="soft"> will do the trick.

Reasonable use cases for specifying bidibreak="soft" on non-<br> elements would include an element containing poetry, as well as the root element of a document that relies on the bidi behavior specified for <br> by HTML 4.
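
The inheritance rule proposed above can be sketched as follows. This is only an illustration under stated assumptions: the Element class is a hypothetical stand-in for a DOM node rather than any browser API, and bidibreak itself is a proposed attribute, not an implemented one.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Element:
    """Hypothetical stand-in for a DOM element (not a real browser API)."""
    tag: str
    attrs: dict = field(default_factory=dict)
    parent: Optional["Element"] = None


def effective_bidibreak(element: Element) -> str:
    """Resolve the proposed bidibreak value by walking up the ancestors."""
    node = element
    while node is not None:
        value = node.attrs.get("bidibreak")
        if value in ("hard", "soft"):
            return value  # nearest explicit value wins
        node = node.parent
    # Nothing specified up to and including the root: the proposed default.
    return "hard"


def br_bidi_class(br: Element) -> str:
    """Map the resolved value to the UBA bidi class the <br> is treated as."""
    return "B" if effective_bidibreak(br) == "hard" else "WS"


# A poem container opts its <br> descendants into "soft" behavior.
root = Element("html")
poem = Element("div", {"bidibreak": "soft"}, parent=root)
br_in_poem = Element("br", parent=poem)
br_override = Element("br", {"bidibreak": "hard"}, parent=poem)
br_plain = Element("br", parent=root)

print(br_bidi_class(br_in_poem))   # inherits "soft" from the poem container
print(br_bidi_class(br_override))  # explicit "hard" on the <br> itself
print(br_bidi_class(br_plain))     # falls back to the root default
```

Walking up the ancestor chain is one straightforward reading of "inherited from the parent"; an implementation could equally cache the resolved value per element.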

When <br> introduces a UBA paragraph break, the base direction of the new UBA paragraph will be determined by the computed direction of the nearest ancestor element whose bidi properties require its contents to be in a separate UBA paragraph (or sequence of paragraphs), e.g. a block element or an element directionally isolated by the ubi attribute (which is being proposed in a separate bug). Furthermore, for every element between there and the <br> that results in the creation of an embedding or override level, e.g. a <bdo> element or any element with a dir attribute or a value other than "normal" for the unicode-bidi CSS property, the corresponding embedding or override level is re-introduced at the start of the new UBA paragraph (to be closed at the end of the element or the UBA paragraph, whichever comes first).
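
A rough sketch of the "re-introduce the embedding or override levels" step, under simplifying assumptions: ancestors are modeled as plain (tag, dir) pairs, the unicode-bidi CSS property is ignored, and the function name reopened_controls is purely illustrative. The levels are expressed with the Unicode explicit formatting characters LRE, RLE, LRO and RLO.

```python
# Unicode explicit directional formatting characters.
LRE, RLE = "\u202a", "\u202b"  # left-to-right / right-to-left embedding
LRO, RLO = "\u202d", "\u202e"  # left-to-right / right-to-left override


def reopened_controls(ancestors):
    """Return the controls to re-open at the start of the new UBA paragraph.

    `ancestors` lists (tag, dir) pairs, outermost first, for the elements
    between the paragraph-establishing ancestor and the <br>. `dir` is
    "ltr", "rtl", or None. Simplified model: CSS unicode-bidi is ignored.
    """
    controls = []
    for tag, direction in ancestors:
        if direction not in ("ltr", "rtl"):
            continue  # this element created no embedding or override level
        if tag == "bdo":
            controls.append(RLO if direction == "rtl" else LRO)
        else:
            controls.append(RLE if direction == "rtl" else LRE)
    return "".join(controls)


# <span dir="rtl"><em><bdo dir="ltr"> ... <br> ... </bdo></em></span>
prefix = reopened_controls([("span", "rtl"), ("em", None), ("bdo", "ltr")])
print([f"U+{ord(ch):04X}" for ch in prefix])
```

Each re-opened level would still need to be closed at the end of its element or of the UBA paragraph, whichever comes first; that bookkeeping is left out here.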
================================================================================
 #1   Maciej Stachowiak                               2010-09-29 16:36:18 +0000 
--------------------------------------------------------------------------------
If this really needs to be expressed in markup, perhaps a new element would be better.

In particular, having a markup attribute that doesn't correspond to a CSS property but still inherits and affects rendering of other elements is an unusual pattern and would be awkward to implement.

Is there a Unicode character that creates a line break but has Unicode class WS instead of B? If so, that would make it easier to define what happens for the proposed "soft" line breaks.
================================================================================
 #2   CE Whitehead                                    2010-10-06 00:57:04 +0000 
--------------------------------------------------------------------------------
Hi, I see no reason not to have this element adopt -- in the specifications -- the behavior that it currently has in IE -- this works well for most use cases, and so should probably be the default break behavior.

We can then add a soft break that would have -- in the specifications -- the behavior that this element currently is specified as having.  

Best,

--C. E. Whitehead
cewcathar@hotmail.com
================================================================================
 #3   Aharon Lanin                                    2010-10-06 20:59:23 +0000 
--------------------------------------------------------------------------------
(In reply to comment #1)
> If this really needs to be expressed in markup, perhaps a new element would be
> better.
> 
> In particular, having a markup attribute that doesn't correspond to a CSS
> property but still inherits and affects rendering of other elements is an
> unusual pattern and would be awkward to implement.
> 
> Is there a Unicode character that creates a line break but has Unicode class WS
> instead of B? If so, that would make it easier to define what happens for the
> proposed "soft" line breaks.

The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.

Regarding doing this through a new element, it would get the job done, but I have been warned that new elements are problematic in terms of support from existing software (e.g. how would an existing browser know that the new element does not need a closing tag?) and generally very hard to get in.
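
The bidi classes behind the "soft" versus "hard" distinction can be checked against the Unicode Character Database. A minimal sketch using Python's standard unicodedata module (the helper name bidi_class is just for illustration):

```python
import unicodedata


def bidi_class(ch: str) -> str:
    """Return the Unicode bidi class of a single character."""
    return unicodedata.bidirectional(ch)


for name, ch in [
    ("LINE FEED", "\n"),
    ("LINE SEPARATOR", "\u2028"),
    ("PARAGRAPH SEPARATOR", "\u2029"),
]:
    print(f"U+{ord(ch):04X} {name}: {bidi_class(ch)}")
```

LINE SEPARATOR reports class WS, while LINE FEED and PARAGRAPH SEPARATOR report class B, which is why U+2028 behaves as the "soft" break.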
================================================================================
 #4   Maciej Stachowiak                               2010-10-06 22:51:49 +0000 
--------------------------------------------------------------------------------
(In reply to comment #3)
> (In reply to comment #1)
> > If this really needs to be expressed in markup, perhaps a new element would be
> > better.
> > 
> > In particular, having a markup attribute that doesn't correspond to a CSS
> > property but still inherits and affects rendering of other elements is an
> > unusual pattern and would be awkward to implement.
> > 
> > Is there a Unicode character that creates a line break but has Unicode class WS
> > instead of B? If so, that would make it easier to define what happens for the
> > proposed "soft" line breaks.
> 
> The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.
> 
> Regarding doing this through a new element, it would get the job done, but I
> have been warned that new elements are problematic in terms of support from
> existing software (e.g. how would an existing browser know that the new element
> does not need a closing tag?) and generally very hard to get in.

New global attributes are also hard to get in. And in this case, I think an inheriting global attribute is not as clean an approach.

Question: does including U+2028, either as a literal unicode character or as a numeric character reference, get the job done? Or does that character get affected by whitespace collapsing?
================================================================================
 #5   fantasai                                        2010-10-07 09:22:07 +0000 
--------------------------------------------------------------------------------
I believe the LINE SEPARATOR character is not supposed to be affected by white space handling in CSS unless the source document language defines it as equivalent to a LINE FEED or SGML RECORD-START/END token or similar.

(I'll note that implementations don't currently support it very well, though. It's usually either ignored or turned into boxes.)
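
To make the question concrete, here is a small sketch of both halves: decoding the numeric character reference, and a whitespace-collapsing pass that leaves U+2028 alone. The collapsing function is an assumption for illustration only; it treats just space, tab, LF, FF and CR as collapsible and is not a full model of CSS white space processing.

```python
import html
import re


def collapse_whitespace(text: str) -> str:
    """Collapse runs of space, tab, LF, FF and CR into a single space.

    Assumption: only these characters collapse; U+2028 is not among them.
    """
    return re.sub(r"[ \t\n\f\r]+", " ", text)


# The numeric character reference decodes to the literal character.
decoded = html.unescape("first  line&#x2028;second \n line")
collapsed = collapse_whitespace(decoded)

print("\u2028" in decoded)    # the reference became LINE SEPARATOR
print("\u2028" in collapsed)  # and it survives the collapsing pass
print(repr(collapsed))
```

Whether a given browser then renders the surviving character as a line break is a separate matter, as noted above.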
================================================================================
 #6   Aharon Lanin                                    2010-10-11 08:17:22 +0000 
--------------------------------------------------------------------------------
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > If this really needs to be expressed in markup, perhaps a new element would be
> > > better.
> > > 
> > > In particular, having a markup attribute that doesn't correspond to a CSS
> > > property but still inherits and affects rendering of other elements is an
> > > unusual pattern and would be awkward to implement.
> > > 
> > > Is there a Unicode character that creates a line break but has Unicode class WS
> > > instead of B? If so, that would make it easier to define what happens for the
> > > proposed "soft" line breaks.
> > 
> > The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.
> > 
> > Regarding doing this through a new element, it would get the job done, but I
> > have been warned that new elements are problematic in terms of support from
> > existing software (e.g. how would an existing browser know that the new element
> > does not need a closing tag?) and generally very hard to get in.
> 
> New global attributes are also hard to get in. And in this case, I think an
> inheriting global attribute is not as clean an approach.
> 
> Question: does including U+2028, either as a literal unicode character or as a
> numeric character reference, get the job done? Or does that character get
> affected by whitespace collapsing?

As far as I am concerned, either bidibreak or a new element is fine, and I would prefer to leave the choice up to the experts here.

Regarding LINE SEPARATOR, I guess what Maciej is proposing is a change in the spec that explicitly says that it is to be treated as a (bidi-soft) line break in all contexts and is not subject to whitespace collapsing. If so, the PARAGRAPH SEPARATOR (U+2029) should be treated similarly: a bidi-hard line break that is not subject to whitespace collapsing, i.e. exactly the same effect as <br>. That's because these two characters are a pair introduced into Unicode at the same time for the same reason: to provide unambiguous alternatives to newline (and the other line break characters).

Such a solution would also be fine with me (as long as the <br> spec is changed to make it bidi-hard - or the browser manufacturers achieve a unanimous commitment to treat it as bidi-soft).

However, please note that http://unicode.org/reports/tr20/#Line currently says the following about U+2028 and U+2029:

"Problems when used in markup: Including these characters in markup text does not work where it would duplicate the existing markup commands for delimiting paragraphs and lines."

It is up to the HTML experts here to judge whether starting to support these characters in HTML contexts where appropriate mark-up can be used instead would be in keeping with the spirit of HTML, given that apparently this was not considered to be the case at some point in the past.
================================================================================
 #7   Ian 'Hixie' Hickson                             2010-10-12 10:36:29 +0000 
--------------------------------------------------------------------------------
Given how rarely <br> is allowed to be used (basically only in poems and addresses), what's the use case here?
================================================================================
 #8   Aharon Lanin                                    2010-10-13 16:53:12 +0000 
--------------------------------------------------------------------------------
(In reply to comment #7)
> Given how rarely <br> is allowed to be used (basically only in poems and
> addresses), what's the use case here?

There is a huge gap between how <br> is supposed to be used and how it is used in practice.
================================================================================
 #9   Ian 'Hixie' Hickson                             2010-10-13 18:21:55 +0000 
--------------------------------------------------------------------------------
Granted, but the idea when adding new features is to support use cases in whatever way leads to best practices, not to add band-aids to help people continue to write hard-to-maintain code.

What's the use case here?
================================================================================
 #10  Ian 'Hixie' Hickson                             2010-10-15 00:15:15 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: It's unclear what the use case is. Please describe the use case so that this proposal can be properly evaluated.
================================================================================
 #11  Aharon Lanin                                    2010-10-18 11:50:25 +0000 
--------------------------------------------------------------------------------
I have heard that part of what HTML5 is about is bringing what the spec says and what the browsers do more into alignment.

What IE and Webkit do for <br> is treat it as a bidi paragraph break, despite the spec saying otherwise. This is not about to change, because what RTL users expect from <br> is bidi paragraph separation.

For the same reason, Firefox regularly gets bug reports about its treatment of <br> as bidi whitespace. As far as I understand, the developers there would like to accede but don't want to do so as long as the spec says otherwise.

The result is a lack of interoperability that has lasted for many years and will continue to last as long as <br> is specified to be bidi whitespace.

And <br> is used all the time. One instance of use is not even due to poorly educated users, but to the automated translation of plain-text newlines into mark-up. One example of that is Gmail, which generates a <br> every time one enters a newline in a rich-text message. This does not even sound to me like the abuse of <br>. How is Gmail to know whether the author meant the newline to signify the end of a paragraph or simply the means to force the wrapping of a line (as in when manually transforming a paragraph of text to which one is replying into short lines with an > at the beginning of each)?

To leave things as they are is to perpetuate the current lack of interoperability.
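
The automated newline-to-markup translation described above can be sketched in a few lines. This is a generic illustration, not Gmail's actual code; the function name plain_text_to_html is hypothetical.

```python
import html


def plain_text_to_html(text: str) -> str:
    """Escape plain text and turn every newline into a <br>."""
    escaped = html.escape(text)
    # Normalize CRLF and bare CR so each line break yields exactly one <br>.
    normalized = escaped.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", "<br>\n")


print(plain_text_to_html("> quoted line one\r\n> quoted line two\nMy reply"))
```

The converter has no way of telling a paragraph end from a forced wrap, so every newline becomes the same <br>, and its bidi behavior depends entirely on how <br> is specified.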
================================================================================
 #12  Ms2ger                                          2010-10-18 12:17:18 +0000 
--------------------------------------------------------------------------------
So, what you're saying is the following:

* Spec should say to always treat br as a bidi paragraph break
* IE and WebKit do this (test cases to prove this?)
* Gecko gets bug reports about its differing behaviour (link?)
* Opera follows Gecko

Is this correct? That sounds like a much saner solution than adding an attribute.
================================================================================
 #13  Aharon Lanin                                    2010-10-18 14:31:06 +0000 
--------------------------------------------------------------------------------
Created attachment 925 [details]
Test case for whether a browser treats <br> as a UBA paragraph break. If it does, the two arrows will point right. If not, they point left.
================================================================================
 #14  Aharon Lanin                                    2010-10-18 14:44:28 +0000 
--------------------------------------------------------------------------------
(In reply to comment #12)
> So, what you're saying is the following:
> 
> * Spec should say to always treat br as a bidi paragraph break
> * IE and WebKit do this (test cases to prove this?)
> * Gecko gets bug reports about its differing behaviour (link?)
> * Opera follows Gecko
> 
> Is this correct? That sounds like a much saner solution than adding an
> attribute.

This is correct, and is the core of what is being suggested.

However, given that the spec has up to now defined <br> as bidi whitespace, it would seem that a line break with bidi whitespace semantics is apparently useful enough to warrant some way of getting it. I do not feel comfortable getting rid of it completely, without providing some opt-in way of getting it.
================================================================================
 #15  Maciej Stachowiak                               2010-10-18 19:14:00 +0000 
--------------------------------------------------------------------------------
(In reply to comment #12)
> So, what you're saying is the following:
> 
> * Spec should say to always treat br as a bidi paragraph break
> * IE and WebKit do this (test cases to prove this?)
> * Gecko gets bug reports about its differing behaviour (link?)
> * Opera follows Gecko
> 
> Is this correct? That sounds like a much saner solution than adding an
> attribute.

Seems like that could be accomplished simply by removing this line:

"A br element does not separate paragraphs for the purposes of the Unicode bidirectional algorithm. [BIDI]"

That's arguably a separate request from a new mechanism that breaks the line without acting as a paragraph break. For the new mechanism, is &#x2028; a sufficient solution? While not as memorable as <br>, it nonetheless seems less complicated than the bidibreak proposal.
================================================================================
 #16  Ehsan Akhgari [:ehsan]                          2010-10-18 21:38:37 +0000 
--------------------------------------------------------------------------------
(In reply to comment #7)
> Given how rarely <br> is allowed to be used (basically only in poems and
> addresses), what's the use case here?

Also, please note that the problem in comment 0 can also happen in these two use cases.
================================================================================
 #17  Ian 'Hixie' Hickson                             2010-10-19 06:27:42 +0000 
--------------------------------------------------------------------------------
If this request is just to change the <br> element's definition to match IE, then that is definitely something we can do. Should I just change the spec to instead say "A br element must separate paragraphs for the purposes of the Unicode bidirectional algorithm. [BIDI]" ?
================================================================================
 #18  Adil                                            2010-10-20 21:31:56 +0000 
--------------------------------------------------------------------------------
(In reply to comment #17)
> If this request is just to change the <br> element's definition to match IE,
> then that is definitely something we can do. Should I just change the spec to
> instead say "A br element must separate paragraphs for the purposes of the
> Unicode bidirectional algorithm. [BIDI]" ?

I would like to see <br> defined as paragraph separator by default. However, this alone does not solve a specific use case that affects my work. I am developing a web app that displays text extracted from a book or a newspaper in a similar way to this site: http://newspapers.nla.gov.au/ndp/del/article/1118868. 

The requirement is to match exactly the line breaks in the original document regardless of the font width. The problem is, for mixed rtl-ltr text, I need to insert a line break that is not a bidi paragraph break.

If <br> is redefined as a bidi paragraph break instead of a line-break then, in this case, the <br> will give the wrong reordering for the broken line.
================================================================================
 #19  Simon Pieters                                   2010-10-21 06:37:03 +0000 
--------------------------------------------------------------------------------
Would it work to make *two* subsequent <br>s (possibly with whitespace-only text nodes between) a bidi paragraph break?
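
A rough sketch of what detecting such a pair could look like, assuming the markup is available as a string. Matching tags with a regular expression is a simplification for illustration; a real implementation would inspect adjacent DOM nodes instead. The names DOUBLE_BR and split_on_double_br are hypothetical.

```python
import re

# Two <br> tags separated only by optional whitespace.
DOUBLE_BR = re.compile(r"<br\s*/?>\s*<br\s*/?>", re.IGNORECASE)


def split_on_double_br(markup: str) -> list:
    """Split markup wherever two consecutive <br> elements occur."""
    return DOUBLE_BR.split(markup)


# A single <br> stays inside its chunk; the pair acts as the separator.
print(split_on_double_br("one<br>two<br>\n <br>three"))
```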
================================================================================
 #20  Maciej Stachowiak                               2010-10-21 06:47:49 +0000 
--------------------------------------------------------------------------------
(In reply to comment #17)
> If this request is just to change the <br> element's definition to match IE,
> then that is definitely something we can do. Should I just change the spec to
> instead say "A br element must separate paragraphs for the purposes of the
> Unicode bidirectional algorithm. [BIDI]" ?

Absent the explicit requirement to the contrary, doesn't this follow from what the rendering section says about <br> (since it renders as a newline character, which is unicode class B)?
================================================================================
 #21  fantasai                                        2010-10-21 07:33:48 +0000 
--------------------------------------------------------------------------------
You could indeed let it be defined implicitly by the rendering section, if that's what it says. However, given that previous versions of HTML defined <br> as a soft break, and the bidi spec itself cites <br> as an example of a soft break, it's probably better to make it explicit. :)
================================================================================
 #22  Aharon Lanin                                    2010-10-26 03:12:12 +0000 
--------------------------------------------------------------------------------
(In reply to comment #17)
> If this request is just to change the <br> element's definition to match IE,
> then that is definitely something we can do. Should I just change the spec to
> instead say "A br element must separate paragraphs for the purposes of the
> Unicode bidirectional algorithm. [BIDI]" ?

How about this: define <br> to be a bidi paragraph separator, but define <br ubi> to be a "soft" line separator. This would seem to follow from a part of ubi's definition, which is to make the element act on its surroundings as a bidi-neutral character. That way, you don't have to add bidibreak, but we still get a soft <br> when we want one.
================================================================================
 #23  Ian 'Hixie' Hickson                             2010-11-02 22:17:19 +0000 
--------------------------------------------------------------------------------
People. Please. Stop proposing solutions before the problem is clearly stated.

Is the use case in comment 18 the use case that this bug is about? It seems different than the previous discussed problems. Is comment 11's second paragraph the problem? That's a very clearly defined problem, is it the one for which the bug was filed?

Could someone clearly state what the problem is that this bug is about and avoid the temptation to discuss possible solutions?
================================================================================
 #24  Ian 'Hixie' Hickson                             2010-11-03 08:27:27 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: see comment 23. Awaiting clear problem description.
================================================================================
 #25  fantasai                                        2010-11-03 11:37:12 +0000 
--------------------------------------------------------------------------------
This bug seems to have been filed about two related issues:
  1. Current definition of <br> is incompatible with widespread usage and
     implementation. (Comment 11)
  2. Current definition of <br> solves real use cases and its behavior
     needs to be available in HTML. (Comment 18)
================================================================================
 #26  Aharon Lanin                                    2010-11-03 17:23:15 +0000 
--------------------------------------------------------------------------------
(In reply to comment #25)
> This bug seems to have been filed about two related issues:
>   1. Current definition of <br> is incompatible with widespread usage and
>      implementation. (Comment 11)
>   2. Current definition of <br> solves real use cases and its behavior
>      needs to be available in HTML. (Comment 18)

Yes, this is an excellent summary. We want <br>'s default behavior changed to match widespread usage (comment 11), but still want some way to deal with use cases like the one in comment 18.
================================================================================
 #27  Ian 'Hixie' Hickson                             2010-11-03 18:47:49 +0000 
--------------------------------------------------------------------------------
Please file a separate bug for the separate issue. Each bug should be about exactly one issue.
================================================================================
 #28  Aharon Lanin                                    2010-11-03 22:02:41 +0000 
--------------------------------------------------------------------------------
(In reply to comment #27)

This bug is now purely about the need to make <br> a bidi paragraph break, at least by default. Will file a separate bug for the need to be able to force a line wrap with the bidi semantics of LINE SEPARATOR when necessary.
================================================================================
 #29  Aharon Lanin                                    2010-11-03 22:43:19 +0000 
--------------------------------------------------------------------------------
(In reply to comment #28)
> Will file a separate bug for the need to be able to force a
> line wrap with the bidi semantics of LINE SEPARATOR when necessary.

Filed as bug 11211.
================================================================================
 #30  Ian 'Hixie' Hickson                             2010-11-04 06:08:12 +0000 
--------------------------------------------------------------------------------
Gecko and Opera people: could you comment on whether you are ok with changing how <br> works from what you currently do (no effect on bidi) to what WebKit and IE do (treat <br> as a paragraph separator)?

Could you also comment on whether you would like linebreaks in <pre> to be treated the same way? (I don't know that we currently define how those are supposed to be processed from an HTML perspective.)
================================================================================
 #31  Boris Zbarsky                                   2010-11-04 06:13:46 +0000 
--------------------------------------------------------------------------------
Elika, do you recall what we decided here?
================================================================================
 #32  Anne                                            2010-11-04 13:09:56 +0000 
--------------------------------------------------------------------------------
Not having checked with anyone in particular I think we would be fine with making that change. The rendering of <pre> should probably be defined entirely by CSS as it matters for everything with white-space:pre.
================================================================================
 #33  fantasai                                        2010-11-04 13:27:12 +0000 
--------------------------------------------------------------------------------
Per CSS2.1, block boundaries and forced line breaks of bidi class B (everything except LINE SEPARATOR) break the bidi paragraph. So line breaks in <pre> follow the default UAX9 rules.
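
The rule can be illustrated by splitting a string into bidi paragraphs at every character of bidi class B. This is a simplified sketch: block boundaries are out of scope, the separator is dropped rather than kept with the preceding paragraph, and a CR LF pair would produce an empty paragraph. The function name split_bidi_paragraphs is illustrative.

```python
import unicodedata


def split_bidi_paragraphs(text: str) -> list:
    """Split text at characters whose Unicode bidi class is B."""
    paragraphs, current = [], []
    for ch in text:
        if unicodedata.bidirectional(ch) == "B":
            paragraphs.append("".join(current))  # separator ends the paragraph
            current = []
        else:
            current.append(ch)
    paragraphs.append("".join(current))
    return paragraphs


# LINE FEED and PARAGRAPH SEPARATOR split; LINE SEPARATOR (class WS) does not.
print(split_bidi_paragraphs("ab\ncd\u2028ef\u2029gh"))
```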
================================================================================
 #34  fantasai                                        2010-11-04 13:58:08 +0000 
--------------------------------------------------------------------------------
Wrt Gecko, here's a summary of our discussions on the topic:
  http://groups.google.com/group/mozilla.dev.tech.layout/msg/2f14fe783b737cec?

I'll note additionally that the CSSWG resolution to clarify CSS2.1 on this point was after those discussions. This was Issue 145 here:
  http://wiki.csswg.org/spec/css2.1#issue-145
I would expect any CSSWG Members to have spoken up during those discussions if the proposed behavior was a problem. :)
================================================================================
 #35  Ian 'Hixie' Hickson                             2010-11-05 01:02:36 +0000 
--------------------------------------------------------------------------------
I'll take that as a "yes".

Ok, I'll make the change described in comment 30.
================================================================================
 #36  contributor@whatwg.org                          2010-11-05 20:07:09 +0000 
--------------------------------------------------------------------------------
Checked in as WHATWG revision r5670.
Check-in comment: Update <br>'s bidi behavior to match WebKit and IE rather than Gecko and Opera.
http://html5.org/tools/web-apps-tracker?from=5669&to=5670
================================================================================
 #37  Ian 'Hixie' Hickson                             2010-11-05 20:15:34 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given above
Rationale: see long discussion above
================================================================================
 #38  Aharon Lanin                                    2010-11-08 12:14:22 +0000 
--------------------------------------------------------------------------------
The change is great. It actually also addresses 10812, although where it uses the term "newline", the precise terminology would have been (or at least used to be) "line break", thus covering <CR>, <LF>, and combinations.
================================================================================
 #39  Aharon Lanin                                    2010-12-23 16:20:53 +0000 
--------------------------------------------------------------------------------
The reason that this bug was filed is expressed (by me) in comment 11:

"What IE and Webkit do for <br> is treat it as a bidi paragraph break, despite the spec saying otherwise. This is not about to change, because what RTL users expect from <br> is bidi paragraph separation."

Or, as fantasai summarized it in comment 25, "Current definition of <br> is incompatible with widespread usage and implementation."

Well, it turns out that part of these statements is out of date. As I learned a few days ago from Simon Montagu, IE did change, to an extent. While IE7 indeed treated <br> as a bidi paragraph break, even in its standards mode, IE8 treats it as bidi whitespace (i.e. per the HTML4 spec) - when in *its* standards mode. IE8 continues to treat <br> as a bidi paragraph break in its quirks mode and its IE7 compatibility mode.

I guess that this should not be surprising, given that IE's standards mode is about following standards, and HTML4 is the current standard. However, it did surprise me, since I thought I had tested this in IE8 before filing the bug. (I wasn't careful to make sure I was in IE8 standards mode.)

Please note that the other part of the reason for filing this bug remains unaffected: despite the HTML spec limiting <br> to esoteric uses like poetry and addresses, most of the time that <br> is used, it is used as the HTML equivalent of a plain text newline. When used that way in a bidi document, it does not work as intended unless it is a bidi paragraph break.
================================================================================
 #40  Ian 'Hixie' Hickson                             2011-01-08 22:23:30 +0000 
--------------------------------------------------------------------------------
I'm confused as to why this bug is reopened. The original bug is fixed, no? Is it reopened to revert the fix so that instead of being compatible with IE and WebKit, we go back to being compatible with IE and Firefox?
================================================================================
 #41  Shachar Shemesh                                 2011-01-09 03:19:02 +0000 
--------------------------------------------------------------------------------
(In reply to comment #40)
> I'm confused as to why this bug is reopened. The original bug is fixed, no? Is
> it reopened to revert the fix so that instead of being compatible with IE and
> WebKit, we go back to being compatible with IE and Firefox?

From what I understood, the original bug said "please change what HTML4 says should happen as it is incompatible with both what implementations do and what happens in real life".

What Aharon is saying is that he was wrong about the first half of the statement. Upon a re-check, current implementations do follow HTML4 when working in 'standards' mode. This means, to me, that we should avoid causing previously standard compliant behavior to suddenly become non-standard.

In other words, I believe this bug should, under these changed circumstances, be marked "Invalid", and its solution reverted.

I should point out that, as far as I know, Aharon does not share this belief. He thinks that the bug should still be fixed (i.e., the situation should remain as it is), but he was honest enough to state that with the different state of affairs, the discussion should be re-opened.

Shachar
================================================================================
 #42  CE Whitehead                                    2011-01-09 03:44:32 +0000 
--------------------------------------------------------------------------------
(In reply to comment #41)  Hi, will there be a way for a br element to still sometimes constitute a bidi-paragraph break, although no longer by default?  

Thanks.

Best,

--C. E. Whitehead
cewcathar@hotmail.com 
> (In reply to comment #40)
> > I'm confused as to why this bug is reopened. The original bug is fixed, no? Is
> > it reopened to revert the fix so that instead of being compatible with IE and
> > WebKit, we go back to being compatible with IE and Firefox?
> From what I understood, the original bug said "please change what HTML4 says
> should happen as it is incompatible with both what implementations do and what
> happens in real life".
> What Aharon is saying is that he was wrong about the first half of the
> statement. Upon a re-check, current implementations do follow HTML4 when
> working in 'standards' mode. This means, to me, that we should avoid causing
> previously standard compliant behavior to suddenly become non-standard.
> In other words, I believe this bug should, under these changed circumstances,
> be marked "Invalid", and its solution reverted.
> I should point out that, as far as I know, Aharon does not share this belief.
> He thinks that the bug should still be fixed (i.e., the situation should remain
> as it is), but he was honest enough to state that with the different state of
> affairs, the discussion should be re-opened.
> Shachar
================================================================================
 #43  Shachar Shemesh                                 2011-01-09 04:10:32 +0000 
--------------------------------------------------------------------------------
(In reply to comment #42)
> (In reply to comment #41)  Hi, will there be a way for a br element to still
> sometimes constitute a bidi-paragraph break, although no longer by default?  
> 

To me, that seems broken. The whole point behind bidi break on <br> was to make pages that would not consider BiDi "do the right thing". If you have a non-default option, then the pages that would not consider BiDi still wouldn't, and the pages that do can use <p>. The only real use I see for <br> as an optional BiDi break is for applications such as greasemonkey, where the user adds our hypothetical CSS (or whatever) to the <br> in order to fix a broken page.

Shachar
================================================================================
 #44  Aharon Lanin                                    2011-01-09 08:36:27 +0000 
--------------------------------------------------------------------------------
I reopened the bug because I was the one who opened it, and it turns out that part of the information based on which I opened it was incorrect. This deserved to be brought to your attention, and let the chips fall where they may.

In my opinion things should stay as they currently are, i.e. with <br> defined as a bidi paragraph break. As always, I admit that this is theoretically inconsistent with the recommended use of <br>, for things like poetry and addresses. However:

1. This is consistent with the way <br> is actually used, which is as the HTML equivalent of a plain-text line break, which is most often actually a paragraph break. Authors and applications that use <br> that way, e.g. when taking user input into a contentEditable element, do so because it is a lot more convenient to use than <p> or <div>, and (besides the bidi aspect, which most often is at best a secondary concern) it works regardless of how the spec says <br> should be used.

2. The bidi mis-ordering that is caused by the treatment as whitespace of a line break that the author meant as a paragraph break is far worse than the mis-ordering in the opposite case.
================================================================================
 #45  Shachar Shemesh                                 2011-01-09 08:46:43 +0000 
--------------------------------------------------------------------------------
(In reply to comment #44)
> 2. The bidi mis-ordering that is caused by the treatment as whitespace of a
> line break that the author meant as a paragraph break is far worse than the
> mis-ordering in the opposite case.

I should point out that this point is arguable. Also, the misordering done by treating <br> as it should is fixable by placing an RLM/LRM (depending on the desired paragraph direction) before and after the <br>, whereas the misordering as a result of treating <br> as a paragraph break is not fixable at all.
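
The RLM/LRM workaround mentioned above can be sketched as a small text transformation. This is only an illustration: the function name guard_breaks and the paragraph_dir parameter are hypothetical, and the regular expression assumes simple, attribute-free <br> tags.

```python
import re

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, a strong L character
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, a strong R character

# Assumption: only plain <br>, <br/> and <br /> tags need to be matched.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)

def guard_breaks(markup, paragraph_dir):
    """Put a directional mark before and after every <br>.

    The mark matches the desired paragraph direction, so neutral
    characters next to the break resolve against the mark instead of
    against strong text on the other side of the break.
    """
    mark = RLM if paragraph_dir == "rtl" else LRM
    return BR_TAG.sub(lambda m: mark + m.group(0) + mark, markup)
```

For example, guard_breaks("abc!<br>def", "ltr") leaves the tag itself untouched and only adds an LRM on each side of it.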

Shachar
================================================================================
 #46  Adil                                            2011-01-10 11:57:16 +0000 
--------------------------------------------------------------------------------
(In reply to comment #44)
> I reopened the bug because I was the one who opened it, and it turns that part
> of the information based on which I opened it was incorrect. This deserved to
> be brought to your attention, and let the chips fall where they may.
> 

Talking of falling chips - a strong determining factor should be what Microsoft says. Has any attempt been made to contact them? I have contacts within their bidi group and I can ask if you need.
================================================================================
 #47  Aharon Lanin                                    2011-01-10 14:46:35 +0000 
--------------------------------------------------------------------------------
(In reply to comment #46)
> (In reply to comment #44)
> > I reopened the bug because I was the one who opened it, and it turns that part
> > of the information based on which I opened it was incorrect. This deserved to
> > be brought to your attention, and let the chips fall where they may.
> > 
> 
> Talking of falling chips - a strong determining factor should be what Microsoft
> says. Has any attempt been made to contact them? I have contacts within their
> bidi group and I can ask if you need.

A good idea. I have only recently made contact with them, and have not asked about this specifically. I will do so now, but you might as well too.
================================================================================
 #48  Ian 'Hixie' Hickson                             2011-01-13 19:40:48 +0000 
--------------------------------------------------------------------------------
I wouldn't be above making this a quirks vs standards thing, if it meant we could make <br> work right... but that might be tilting at windmills.

Microsoft people: some of the earlier comments request your input.
================================================================================
 #49  Sam Ruby                                        2011-01-17 21:54:38 +0000 
--------------------------------------------------------------------------------
Reminder: - Jan 22, 2010 is the cutoff for escalating bugs for pre-LC consideration - all issues in tracker, calls for proposal issued by this date.
Consequences of missing this date: any further escalations will be treated as a Last Call comment.
================================================================================
 #50  Ian 'Hixie' Hickson                             2011-02-15 01:18:57 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: 

Well as much as I want to change this, realistically it seems that compatibility with IE quirks mode for <br> is going to be more important than compatibility with its standards mode, and I doubt Microsoft are willing to change their quirks mode.

So I guess this gets left as is, unless any of the browsers are willing to actually change the spec to the more sensible model and compatibility be damned. In particular, if WebKit is willing to change to match what the spec used to say (that BR doesn't reset the bidi paragraph level) then that would be a compelling argument to change the spec here.
================================================================================
 #51  Adrian Bateman [MSFT]                           2011-02-15 01:32:09 +0000 
--------------------------------------------------------------------------------
Sorry for the delay in reviewing this - I had to track down the correct people.

We think IE standards mode is the correct behaviour. <br> is intended to be a line break and not a paragraph break. <p> is for paragraphs. The spec says "br elements must not be used for separating thematic groups in a paragraph" and further says that this is an abuse of the <br> element.

<br> should mean a line break within a paragraph and not be treated as a paragraph break. We don't currently plan to change this behaviour in IE9.
================================================================================
 #52  CE Whitehead                                    2011-02-15 21:35:49 +0000 
--------------------------------------------------------------------------------
(In reply to comment #51)
> Sorry for the delay in reviewing this - I had to track down the correct people.
> We think IE standards mode is the correct behaviour. <br> is intended to be a
> line break and not a paragraph break. <p> is for paragraphs. The spec says "br
> elements must not be used for separating thematic groups in a paragraph" and
> further says that this is an abuse of the <br> element.
> <br> should mean a line break within a paragraph and not be treated as a
> paragraph break. We don't currently plan to change this behaviour in IE9.

Hi. This is o.k. for me I guess. I do wonder, however, would it be worthwhile to have a non-default option for break [br] in CSS that behaved as a paragraph break rather than as simply a line break? (The problem with [p] [/p] for some coders is that it's considered best to close it, and it is declared at the beginning of the line, not the end; however, for poetry [br] only makes sense as a line break and not a paragraph break.)

Best,

--C. E. Whitehead
cewcathar@hotmail.com
================================================================================
 #54  Ian 'Hixie' Hickson                             2011-02-25 10:09:45 +0000 
--------------------------------------------------------------------------------
(In reply to comment #51)
> We think IE standards mode is the correct behaviour.

Would you change IE's quirks modes to the same behaviour?

The question is not really what the right behaviour is _in theory_, but what behaviour browsers should apply to existing Web content. If you would not change your quirks mode behaviour, then that is a pretty strong signal that you think that what browsers need to do in practice is what IE's quirks modes do.
================================================================================
 #55  Adrian Bateman [MSFT]                           2011-02-25 14:20:15 +0000 
--------------------------------------------------------------------------------
No, we won't change IE's quirks mode. Quirks mode is supposed to be quirky - not changing it is a pretty strong signal that we don't want pages written 10+ years ago to start looking different. On the other hand we don't think changing standards mode is the right thing to do. Not changing standards mode is a pretty strong signal that we think what standards mode does is the right behaviour.
================================================================================
 #56  Anne                                            2011-02-25 14:50:46 +0000 
--------------------------------------------------------------------------------
Opera would like to minimize the differences between the various modes. And although I cannot speak for other non-Microsoft vendors I believe they feel the same. Introducing new quirks is not nice.
================================================================================
 #57  Adrian Bateman [MSFT]                           2011-02-25 16:14:00 +0000 
--------------------------------------------------------------------------------
Opera, Firefox and IE standards mode all have the same behaviour.
================================================================================
 #58  Boris Zbarsky                                   2011-02-25 17:07:26 +0000 
--------------------------------------------------------------------------------
For what it's worth, last I checked we were strongly considering changing the Gecko behavior.  We just hadn't gotten to it yet.
================================================================================
 #59  fantasai                                        2011-03-23 02:40:47 +0000 
--------------------------------------------------------------------------------
Here's the implementation data from smontagu's tests:

Impl      <BR>     <PRE> CR/LF
===============================
IE7        PS          PS
IE8        LS          LS
IE9        LS          PS
Chrome9   PS/LS       PS/LS
Safari5   PS/LS       PS/LS
FF3.6      LS          LS
Opera11    LS          LS

WebKit's behavior is really weird. Whether the break is LS or PS seems to depend on what type of content is near the break: if there is an *embedded* element after the <br>, it's treated as LS (the RTL effect passes through the <br>).

To summarize, the ideal behavior would be IE9's, i.e.
Ideal      LS          PS
The safest behavior is probably IE7's,
Safe       PS          PS
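
For readers unfamiliar with the LS/PS shorthand in the table, here is a minimal sketch of the difference, assuming LS and PS stand for U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR. The helper names replace_br and bidi_paragraphs are hypothetical; only the paragraph-splitting step of the Unicode bidi algorithm (rule P1) is modelled, not reordering.

```python
import unicodedata

# Assumed stand-ins for the two behaviours listed in the table.
BREAK_AS = {"LS": "\u2028", "PS": "\u2029"}

def replace_br(markup, treat_as):
    """Replace each <br> with the character the implementation treats it as."""
    return markup.replace("<br>", BREAK_AS[treat_as])

def bidi_paragraphs(text):
    """Split text after every character of bidi class B (UBA rule P1).

    The separator stays with the paragraph it ends.
    """
    paragraphs, current = [], []
    for ch in text:
        current.append(ch)
        if unicodedata.bidirectional(ch) == "B":
            paragraphs.append("".join(current))
            current = []
    if current:
        paragraphs.append("".join(current))
    return paragraphs

if __name__ == "__main__":
    for kind in ("LS", "PS"):
        text = replace_br("first line<br>second line", kind)
        print(kind, unicodedata.bidirectional(BREAK_AS[kind]), len(bidi_paragraphs(text)))
```

U+2028 has bidi class WS, so an "LS" break keeps both lines in one bidi paragraph, while U+2029 has class B and starts a new one. That is why embedding state and neutral resolution can carry across an LS-style <br> but not across a PS-style one.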
================================================================================
 #60  CE Whitehead                                    2011-03-24 16:41:31 +0000 
--------------------------------------------------------------------------------
(In reply to comment #59)
> Here's the implementation data from smontagu's tests:
> Impl      <BR>     <PRE> CR/LF
> ===============================
> IE7        PS          PS
> IE8        LS          LS
> IE9        LS          PS
> Chrome9   PS/LS       PS/LS
> Safari5   PS/LS       PS/LS
> FF3.6      LS          LS
> Opera11    LS          LS
> WebKit's behavior is really weird. Whether the break is LS or PS seems to
> depend on what type of content is near the break: if there is an *embedded*
> element after the <br>, it's treated as LS (the RTL effect passes through the
> <br>).
> To summarize, the ideal behavior would be IE9's, i.e.
> Ideal      LS          PS

O.k; under this p is the only way to get paragraph breaks; this does solve the use case described in comment 18 (I assume br clear="all" which is for images would at least force a hard break).
> The safest behavior is probably IE7's,
> Safe       PS          PS

I thought this behavior however was to be relegated to quirks mode only,
and that people who wanted to use break as a paragraph separator would have to be in quirks mode from now on.  But I still have my question about br clear="all"  (but I drop my request to have any other hard bidi break as you all are right; people who use it would know to use the p element).

Best,

--C. E. Whitehead
cewcathar@hotmail.com


================================================================================
 #61  Levi Weintraub                                  2011-04-19 23:18:23 +0000 
--------------------------------------------------------------------------------
In tip of tree WebKit, we now treat br as a paragraph separator that clears all state from Unicode control characters, but has no effect on state from style/DOM (like dir=rtl).
================================================================================
 #62  Levi Weintraub                                  2011-04-19 23:19:17 +0000 
--------------------------------------------------------------------------------
In tip of tree WebKit, we now treat br as a paragraph separator that clears all state from Unicode control characters, but has no effect on state from style/DOM (like dir=rtl). This is not tied to quirks mode.
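
A toy model of the behaviour described above, offered only as an illustration and not as WebKit's actual code: state opened by Unicode control characters is dropped at each <br>, while the direction coming from style/DOM is carried over unchanged. The function name state_per_line and the dictionary keys are hypothetical.

```python
# Explicit directional formatting characters that open an embedding or override.
OPENERS = {"\u202a": "LRE", "\u202b": "RLE", "\u202d": "LRO", "\u202e": "RLO"}
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING closes the most recent opener

def state_per_line(markup, dom_dir):
    """Report, for each <br>-separated line, what is still open at its end.

    The control-character stack starts empty on every line (the <br>
    clears it), but dom_dir, standing in for dir=rtl or CSS direction,
    is the same for every line.
    """
    lines = []
    for line in markup.split("<br>"):
        open_controls = []  # reset at each break: nothing carries over
        for ch in line:
            if ch in OPENERS:
                open_controls.append(OPENERS[ch])
            elif ch == PDF and open_controls:
                open_controls.pop()
        lines.append({"dir": dom_dir, "unclosed_controls": open_controls})
    return lines
```

With an unterminated RLE on the first line, the model reports the RLE as unclosed for that line only; the second line starts clean but keeps the DOM direction.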
================================================================================