This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 10827 - i18n comment 23 : script dialog text direction
Summary: i18n comment 23 : script dialog text direction
Status: CLOSED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
Reported: 2010-09-29 13:04 UTC by i18n bidi group
Modified: 2011-01-22 20:26 UTC

Description i18n bidi group 2010-09-29 13:04:05 UTC
Comment from the i18n review of:
http://dev.w3.org/html5/spec/

Comment 23
At http://www.w3.org/International/reviews/html5-bidi/
Editorial/substantive: S
Tracked by: AL

Location in reviewed document:
undefined [http://dev.w3.org/html5/spec/spec.html#contents]

Comment: This is a part of the proposals made by the "Additional Requirements for Bidi in HTML" W3C First Public Working Draft. For a full description of the use cases, please see http://www.w3.org/International/docs/html-bidi-requirements/#script-dialog. Here is the proposal made there:

The HTML specification should state that plain text passed by page scripts without specifying an explicit direction to whatever services script languages provide for dialog display (e.g. Javascript's alert(), confirm() and prompt()) should be displayed according to the UBA's rules P1, P2, and P3, which estimate the direction of each paragraph according to its first strong character.
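
To make the first-strong rule concrete, here is a minimal JavaScript sketch of the estimation that rules P2 and P3 describe. It is an illustration only, not text from the review or the specification. The function name estimateBaseDirection is hypothetical, and the character classification is an approximation: letters in the Hebrew, Arabic, Syriac, and Thaana scripts are treated as strong right-to-left and every other letter as strong left-to-right, instead of consulting the full Unicode Bidi_Class data.

```javascript
// Hypothetical helper: approximates UBA rules P2/P3 for a single paragraph.
// Assumption: script membership stands in for the real Bidi_Class property.
const RTL_SCRIPT = /\p{Script=Hebrew}|\p{Script=Arabic}|\p{Script=Syriac}|\p{Script=Thaana}/u;
const LETTER = /\p{L}/u;

function estimateBaseDirection(paragraph) {
  for (const ch of paragraph) {            // iterate by code point
    if (ch === '\u200E') return 'ltr';     // LEFT-TO-RIGHT MARK is strong L
    if (ch === '\u200F') return 'rtl';     // RIGHT-TO-LEFT MARK is strong R
    if (LETTER.test(ch)) {
      // P2: the first strong character decides the paragraph direction.
      return RTL_SCRIPT.test(ch) ? 'rtl' : 'ltr';
    }
  }
  // P3: no strong character was found, so the paragraph level defaults to LTR.
  return 'ltr';
}
```

Under these assumptions, a message that begins with Hebrew letters is estimated as right-to-left even when it contains Latin text later on, while digits and punctuation at the start are skipped because they are not strong characters.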
Comment 1 Ian 'Hixie' Hickson 2010-10-11 22:49:56 UTC
Isn't that entirely up to the platform? I don't understand what this has got to do with HTML. Could you give an example of how interoperability would be harmed if we don't have this requirement and implementations implement things differently?
Comment 2 Ian 'Hixie' Hickson 2010-10-12 10:40:46 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: see comment 1
Comment 3 Ehsan Akhgari [:ehsan] 2010-10-18 21:40:59 UTC
(In reply to comment #1)
> Isn't that entirely up to the platform? I don't understand what this has got to
> do with HTML. Could you give an example of how interoperability would be
> harmed if we don't have this requirement and implementations implement things
> differently?

Consider an HTML document written in an RTL language, displayed on an LTR user agent and platform. If the user agent chooses to follow the platform default direction for script dialog text, any text on dialogs launched from this HTML page will have the incorrect base direction, and hence could be affected by text rendering problems.
Comment 4 Ian 'Hixie' Hickson 2010-10-19 06:33:38 UTC
Status: Accepted
Change Description: see diff given below
Rationale:

Ah, excellent. Thanks for the explanation. Done.
Comment 5 contributor 2010-10-19 06:34:36 UTC
Checked in as WHATWG revision r5641.
Check-in comment: Define directional behaviour for window.alert() text.
http://html5.org/tools/web-apps-tracker?from=5640&to=5641
Comment 6 fantasai 2010-11-05 11:52:29 UTC
Looking at the diff, I think you are introducing some confusion with your terminology.

+  Text from scripts (e.g. the argument to
+  <code title="dom-alert">window.alert()</code>) is expected to be
+  rendered as a separate bidirectional algorithm paragraph. <a
+  href="#refsBIDI">[BIDI]</a></p>

Minor comment: as worded, I would s/separate/independent/

But either way, there is some confusion here as I believe it is expected for hard line breaks like CRLF to break the bidi paragraph (as well as the line). That is, the text is not rendered as "a" bidi paragraph, it may potentially be rendered as multiple bidi paragraphs.

Perhaps something like
Text from scripts (e.g. the argument to window.alert()) is expected to be rendered as Unicode plain text. For example, the line and paragraph-breaking behavior of LF is honored, and the base direction of each bidi paragraph is determined according to Unicode's detection rules. [BIDI]
?
Comment 7 Aharon Lanin 2010-11-08 10:42:50 UTC
(In reply to comment #6)
> Looking at the diff, I think you are introducing some confusion with your
> terminology.
> 
> +  Text from scripts (e.g. the argument to
> +  <code title="dom-alert">window.alert()</code>) is expected to be
> +  rendered as a separate bidirectional algorithm paragraph. <a
> +  href="#refsBIDI">[BIDI]</a></p>
> 
> Minor comment: as worded, I would s/separate/independent/
> 
> But either way, there is some confusion here as I believe it is expected for
> hard line breaks like CRLF to break the bidi paragraph (as well as the line).
> That is, the text is not rendered as "a" bidi paragraph, it may potentially be
> rendered as multiple bidi paragraphs.
> 
> Perhaps something like
> Text from scripts (e.g. the argument to window.alert()) is expected to be
> rendered as Unicode plain text. For example, the line and paragraph-breaking
> behavior of LF is honored, and the base direction of each bidi paragraph is
> determined according to Unicode's detection rules. [BIDI]
> ?

I fully agree with Fantasai's comments. As it currently stands, the diff is not sufficient to address the bug. In fact, I think it should be clarified even further, to something like this:

Text from scripts (e.g. the argument to window.alert()) is expected to be rendered as Unicode plain text, with the base direction of each paragraph determined by Unicode detection rules from the paragraph's content. For example, the line and paragraph-breaking behavior of LF is honored, and a paragraph starting with RTL text must appear in an RTL base direction. [BIDI]

It would also be best to move the programming languages select example to appear between "from which the text was obtained." and "Text from scripts".
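
The behavior discussed in comments 6 and 7 (hard line breaks end a bidi paragraph, and each paragraph gets its own base direction) can be sketched as follows. This is an illustrative approximation, not spec text: the names paragraphDirections and firstStrong are hypothetical, the separator list is the set of characters the sketch assumes to be Unicode paragraph separators (CRLF, LF, CR, U+001C to U+001E, U+0085, U+2029), and right-to-left detection is limited to letters of the Hebrew and Arabic scripts.

```javascript
// Hypothetical sketch: split plain text into bidi paragraphs and estimate a
// base direction for each paragraph independently of the others.
const PARAGRAPH_SEPARATOR = /\r\n|[\n\r\u001C-\u001E\u0085\u2029]/;
const RTL_SCRIPT = /\p{Script=Hebrew}|\p{Script=Arabic}/u; // assumption: approximation of strong RTL
const LETTER = /\p{L}/u;

// First-strong estimate for one paragraph (defaults to 'ltr' when no letter is found).
function firstStrong(paragraph) {
  for (const ch of paragraph) {
    if (LETTER.test(ch)) return RTL_SCRIPT.test(ch) ? 'rtl' : 'ltr';
  }
  return 'ltr';
}

// Returns one { text, dir } record per paragraph of the dialog text.
function paragraphDirections(text) {
  return text
    .split(PARAGRAPH_SEPARATOR)
    .map((paragraph) => ({ text: paragraph, dir: firstStrong(paragraph) }));
}
```

With this sketch, a two-line message whose first line is English and whose second line is Hebrew yields one left-to-right paragraph followed by one right-to-left paragraph, rather than a single paragraph with one direction.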
Comment 8 Ian 'Hixie' Hickson 2010-11-10 17:28:15 UTC
Thanks for the comments. I'll fix the text.
Comment 9 Ian 'Hixie' Hickson 2010-11-11 02:16:00 UTC
Status: Partially Accepted
Change Description: see diff given below
Rationale: I've tried to fix this. Let me know if it's still unclear, or if the examples I've used are wrong (this is especially likely given that I don't speak Arabic or Hebrew and the examples involve the bidi algorithm).
Comment 10 contributor 2010-11-11 02:16:16 UTC
Checked in as WHATWG revision r5678.
Check-in comment: tweak wording and add an example to make things clearer
http://html5.org/tools/web-apps-tracker?from=5677&to=5678
Comment 11 Aharon Lanin 2010-11-16 12:00:05 UTC
Sorry, but it still needs tweaking. I would like the paragraph beginning with "A string provided by a script" to read more like this:

--- PROPOSED TEXT ---
Text provided by a script (e.g. the argument to window.alert()) is expected to be displayed according to the rules of the Unicode bidirectional algorithm for splitting text into paragraphs and determining the overall direction of each paragraph. For instance, U+000A LINE FEED (LF) characters are expected to separate paragraphs, and the overall direction of a paragraph starting with a right-to-left character is expected to be right-to-left.
--- END OF PROPOSED TEXT ---

Also, the code example needs to be drastically simplified. It must not deal with inserting data of one direction into a message of a different direction. This is an "advanced" topic that comes up less frequently than using a simple message with no admixtures of data. Unfortunately, inserting data correctly often requires wrapping the data being inserted into the message in LRE|RLE and PDF, followed by an LRM or RLM. Furthermore, due to the imperfections of the direction estimation algorithm mandated by Unicode, one must not start a paragraph with such data, since it will force the paragraph into the wrong direction (unless, as you point out, one uses LRM or RLM first). The code in your example does not deal properly with either of these requirements and would output a garbled message. To avoid having to give an extended lesson on the right way to do bidi, while still giving a correct example, we can simply stay away from the "advanced" topic and stick to a simple message with no inserted data, e.g.:

--- PROPOSED TEXT ---
For example, alert('\u05DC\u05DE\u05D3 HTML \u05D4\u05D9\u05D5\u05DD!') should always result in a message reading "<span dir=rtl>&#1500;&#1502;&#1491; HTML &#1492;&#1497;&#1493;&#1501;!</span>" (not "&#1500;&#1502;&#1491; HTML &#1492;&#1497;&#1493;&#1501;!"), regardless of the language of the user agent UI or the direction of the page or any of its elements.

When necessary, authors can enforce the direction interpreted for any given paragraph by starting it with the Unicode character U+200E LEFT-TO-RIGHT MARK or U+200F RIGHT-TO-LEFT MARK.
--- END OF PROPOSED TEXT ---

(When I indicated "<span dir=rtl>" above I meant that you need to actually put that into the HTML source of the spec.)
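
For readers curious about the "advanced" case that this comment deliberately leaves out of the spec, the wrapping it describes (LRE or RLE before the data, PDF after it, then an LRM or RLM) might look roughly like the sketch below. The helper name embedData and its parameters are hypothetical, and the choice to make the trailing mark match the direction of the surrounding message is an assumption of this sketch, not something stated in the comment.

```javascript
// Unicode bidi formatting characters named in the discussion.
const LRE = '\u202A'; // LEFT-TO-RIGHT EMBEDDING
const RLE = '\u202B'; // RIGHT-TO-LEFT EMBEDDING
const PDF = '\u202C'; // POP DIRECTIONAL FORMATTING
const LRM = '\u200E'; // LEFT-TO-RIGHT MARK
const RLM = '\u200F'; // RIGHT-TO-LEFT MARK

// Hypothetical helper: wrap inserted data so it keeps its own direction
// inside a message of another direction.
// Assumption: the trailing mark follows the direction of the message.
function embedData(data, dataDir, messageDir) {
  const open = dataDir === 'rtl' ? RLE : LRE;
  const mark = messageDir === 'rtl' ? RLM : LRM;
  return open + data + PDF + mark;
}

// Usage: an English (LTR) message with an RTL name inserted at the start.
// The leading LRM keeps the paragraph from being estimated as right-to-left.
const message = LRM + embedData('\u05D3\u05E0\u05D9', 'rtl', 'ltr') + ', welcome back!';
```

The leading mark in the usage line addresses the second requirement mentioned above: without it, a paragraph that starts with embedded right-to-left data would be estimated as right-to-left overall.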
Comment 12 Ian 'Hixie' Hickson 2010-11-30 21:21:32 UTC
(In reply to comment #11)
> Sorry, but it still needs tweaking. I would like the paragraph beginning with
> "A string provided by a script" to read more like this:
> 
> --- PROPOSED TEXT ---
> Text provided by a script (e.g. the argument to window.alert()) is expected to
> be displayed according to the rules of the Unicode bidirectional algorithm for
> splitting text into paragraphs and determining the overall direction of each
> paragraph. For instance, U+000A LINE FEED (LF) characters are expected to
> separate paragraphs, and the overall direction of a paragraph starting
> with a right-to-left character is expected to be right-to-left.
> --- END OF PROPOSED TEXT ---

Could you elaborate on what problem this proposal is intended to solve?


> The code in your example does not deal properly with either of these
> requirements and would output a garbled message.

Right, that's the point of the example. :-)

I've added your proposed example, but I haven't removed the previous one.
Comment 13 contributor 2010-11-30 21:22:00 UTC
Checked in as WHATWG revision r5689.
Check-in comment: Add a new example for alert()'s bidi implications
http://html5.org/tools/web-apps-tracker?from=5688&to=5689
Comment 14 Aharon Lanin 2010-12-01 12:35:34 UTC
(In reply to comment #12)
> Could you elaborate on what problem this proposal is intended to solve?

1. The spec currently says, basically, that script dialog text is to be treated as a set of Unicode Bidi Algorithm paragraphs. The Unicode Bidi Algorithm does specify rules - specifically P2 and P3 - for determining the overall direction of each paragraph. However, it explicitly allows a "higher level protocol" to override P2 and P3 and give the paragraphs whatever direction it wants. In HTML, the exception is actually the rule: the direction of text content is determined by mark-up, which is a higher-level protocol. All current browser versions use the same approach for dialog text and apply to it some higher level protocol, e.g. the direction of the root element or the user agent's UI direction. I do not think anyone has claimed that by doing so, user agents violate the Unicode Bidi Algorithm. Thus, I believe that to require the behavior we want, the HTML spec must give some clear indication that the direction of script dialog text paragraphs must be determined without applying a higher-level protocol. The spec's current requirement does not seem to do so, except in the examples. It mentions the division of the text into paragraphs, but not the directions these paragraphs are to take.

2. [Editorial] It is better to stick to the term "text" instead of switching to the term "string" for no apparent reason ("A string provided by a script").

3. [Editorial] The repetition of the term "bidirectional algorithm" should be avoided if possible.

> > The code in your example does not deal properly with either of these
> > requirements and would output a garbled message.
> 
> Right, that's the point of the example. :-)

Are you saying that you were trying to illustrate what can go wrong? This was not clear to me. I do not think that the spec should give buggy code as an example, especially without stating extremely clearly that it is buggy and without giving a correct version. If you want to illustrate the necessity of using LRM or RLM to enforce a direction, I would recommend giving the correct version instead, e.g.:

--- PROPOSED TEXT ---
When necessary, authors can enforce a particular direction for a given paragraph by starting it with the Unicode U+200E LEFT-TO-RIGHT MARK or U+200F RIGHT-TO-LEFT MARK character.

Consider, for example, the following more complex script:

var s;
if (s = prompt('What is your name?')) {
  alert('\u200E' + s + '! Ok, Fred, ' + s + ', and Wilma will get the car.');
}

The script starts the alert with a LEFT-TO-RIGHT MARK because the alert is in English overall and must be treated as left-to-right to be displayed correctly. Without the LEFT-TO-RIGHT MARK, when the user enters a right-to-left string like "&#1604;&#1575; &#1571;&#1601;&#1607;&#1605;" in response to the prompt, the alert would start with right-to-left text and would be displayed as right-to-left, unintelligibly.
--- END OF PROPOSED TEXT ---
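
The same technique extends to messages with several paragraphs, since each paragraph's direction is estimated separately. The sketch below is a hypothetical helper (forceDirection is not part of any specification or library) that prefixes every LF-separated paragraph with the appropriate mark. It assumes LF is the only paragraph separator used in the message.

```javascript
// Hypothetical helper: force one base direction on every paragraph of a
// dialog message by starting each paragraph with LRM or RLM.
// Assumption: paragraphs are separated by U+000A LINE FEED only.
function forceDirection(text, dir) {
  const mark = dir === 'rtl' ? '\u200F' : '\u200E';
  return text
    .split('\n')
    .map((paragraph) => mark + paragraph)
    .join('\n');
}

// Usage: build the two-line English message before passing it to alert().
// The name is user-provided and may start with right-to-left characters.
function buildGreeting(name) {
  return forceDirection(name + '! Ok, Fred,\n' + name + ', and Wilma will get the car.', 'ltr');
}
```

A page would then call alert(buildGreeting(s)) with the value returned by prompt(), so that both lines are treated as left-to-right whatever the user typed.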
Comment 15 Ian 'Hixie' Hickson 2010-12-03 20:21:45 UTC
> Thus, I believe that to require the
> behavior we want, the HTML spec must give some clear indication that the
> direction of script dialog text paragraphs must be determined without applying
> a higher-level protocol. The spec's current requirement does not seem to do so

Fixed.


> 2. [Editorial] It is better to stick to the term "text" instead of switching to
> the term "string" for no apparent reason ("A string provided by a script").

Scripts provide strings, strings contain text.


> 3. [Editorial] The repetition of the term "bidirectional algorithm" should be
> avoided if possible.

Why? Seems fine to me.


> Are you saying that you were trying to illustrate what can go wrong?

Yes. I've tried to make this clearer.


> I do not think that the spec should give buggy code as an
> example, especially without stating extremely clearly that it is buggy and
> without giving a correct version.

It's not buggy code for most purposes. You're simply not going to be able to convince people writing LTR-only sites (e.g. the majority of Web authors) to worry about what happens when their alert()s happen to start with user-provided RTL text.

I've added an example that handles it correctly though.

Status: Partially Accepted
Change Description: see diff given below
Rationale: see above
Comment 16 contributor 2010-12-03 23:54:27 UTC
Checked in as WHATWG revision r5700.
Check-in comment: Clarify the way bidi interacts with alert()
http://html5.org/tools/web-apps-tracker?from=5699&to=5700
Comment 17 Aharon Lanin 2010-12-05 08:11:29 UTC
(In reply to comment #16)
> Checked in as WHATWG revision r5700.
> Check-in comment: Clarify the way bidi interacts with alert()
> http://html5.org/tools/web-apps-tracker?from=5699&to=5700

For some strange reason, I do not see the change at http://dev.w3.org/html5/spec/Overview.html.

But the change in the diff looks good, with one minor suggestion: instead of saying that "this specification does <em>not</em> provide a higher-level override of rules P2 and P3", which could be interpreted as allowing user agents to come up with their own override, would it be possible to specify that no such override shall be used, e.g. "For the purposes of determining the paragraph level of such text in the bidirectional algorithm, rules P2 and P3 of that algorithm are to be applied without applying a higher-level override."?
Comment 18 CE Whitehead 2010-12-06 02:26:19 UTC
(In reply to comment #15)

> > Are you saying that you were trying to illustrate what can go wrong?
> Yes. I've tried to make this clearer.
> > I do not think that the spec should give buggy code as an
> > example, especially without stating extremely clearly that it is buggy and
> > without giving a correct version.
> It's not buggy code for most purposes. You're simply not going to be able to
> convince people writing LTR-only sites (e.g. the majority of Web authors) to
> worry about what happens when their alert()s happen to start with user-provided
> RTL text.
> I've added an example that handles it correctly though.

Thanks!
Best,

--C. E. Whitehead
cewcathar@hotmail.com
Comment 19 Ian 'Hixie' Hickson 2010-12-07 21:03:36 UTC
> For some strange reason, I do not see the change at
> http://dev.w3.org/html5/spec/Overview.html.

Should be there now, there was some temporary glitch in the commit process. For future reference, the draft at http://whatwg.org/c has all the text discussed here also.

> But the change in the diff looks good, with one minor suggestion: instead of
> saying that "this specification does <em>not</em> provide a higher-level
> override of rules P2 and P3", which could be interpreted as allowing user
> agents to come up with their own override, would it be possible to specify that
> no such override shall be used, e.g. "For the purposes of determining the
> paragraph level of such text in the bidirectional algorithm, rules P2 and P3 of
> that algorithm are to be applied without applying a higher-level override."?

Some other spec might define it, so we can't really do that without risking specs contradicting each other. It doesn't allow UAs to do anything they want, since in the absence of some other spec, the bidi spec is already authoritative and doesn't allow UAs to make up their own rules.
Comment 20 Ian 'Hixie' Hickson 2010-12-07 21:10:29 UTC
Status: Partially Accepted
Change Description: no new change
Rationale: see above