Re: I18N-ISSUE-117: Close definition of dir auto [HTML5-prep]

I wish to respectfully disagree both with the note about dir='auto' (see 
below) and with Addison Phillips's comment.

The note says: "The heuristic used by this state is very crude (it just 
looks at the first character with a strong directionality, in a manner 
analogous to the Paragraph Level determination in the bidirectional 
algorithm)."

This heuristic can be called crude, or simplistic, or simple. 
It has the advantage of following the Unicode Bidirectional Algorithm, 
which is good for interoperability.
As a bidi user myself, I can say that this simple algorithm is pretty 
effective. In the cases where it would not give the expected result, the 
originator of the text can remedy the situation by inserting an invisible 
character (LRM or RLM) at the beginning of the text.
Thus the advice "Authors are urged to only use this value as a last 
resort" seems to me unjustified.

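A minimal Python sketch of the first-strong heuristic and of the LRM/RLM 
remedy may make this concrete. It is an illustration only, not the text of 
HTML5 or of the bidirectional algorithm: the function name 
first_strong_direction and its "ltr" fallback are assumptions made for the 
example, and it ignores details such as embedded markup.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK: invisible, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, bidi class R


def first_strong_direction(text, default="ltr"):
    """Return "ltr" or "rtl" based on the first strong character in text.

    Hypothetical helper for illustration. `default` is what is returned
    when the text has no strong character at all; that fallback is an
    assumption of this sketch.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
        # Weak and neutral characters (digits, spaces, punctuation) are skipped.
    return default


# A Hebrew word followed by a Latin word.
hebrew_first = "\u05e9\u05dc\u05d5\u05dd world"
# A Hebrew sentence that happens to begin with a Latin acronym.
latin_first = "W3C \u05d4\u05d5\u05d0 \u05d0\u05e8\u05d2\u05d5\u05df"

print(first_strong_direction(hebrew_first))        # rtl
print(first_strong_direction(latin_first))         # ltr
print(first_strong_direction(RLM + latin_first))   # rtl
```

The second call is the kind of case where the heuristic guesses wrong: the 
sentence is Hebrew but starts with a Latin acronym. The third call shows 
the remedy: an RLM placed at the beginning of the text becomes the first 
strong character and gives the intended direction.
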
Addison commented: "Does HTML5 need to define auto so closely that no 
user-agent can provide a better algorithm? That seems counter-productive. 
Some room for innovation should be preserved."
I beg to differ. Even if better algorithms can be invented, IMHO 
interoperability here is more important than algorithm effectiveness. This 
heuristic affects the display order of the text. It is not a matter of 
cosmetics, but of readability. It is essential that readers see the text 
exactly in the order that the author intended.
For that to happen, the algorithm must be defined closely, and followed 
punctiliously by all user-agents.
Sorry, but this is not the place for innovation.

I propose the following text to replace the note:

The heuristic used by this state is simple, but effective (it just looks 
at the first character with a strong directionality, in a manner analogous 
to the Paragraph Level determination in the bidirectional algorithm). 
However, there are cases where it gives a wrong result. Authors are urged 
to only use this value when the direction of the text is truly unknown.


By the way, it seems that the reference for this comment should be:
3.2.3.4 The dir attribute
http://www.w3.org/TR/html5/elements.html#the-dir-attribute


Shalom (Regards),  Mati
           Bidi Architect
           Globalization Center Of Competency - Bidirectional Scripts
           IBM Israel
           Fax: +972 2 5870333    Mobile: +972 52 2554160




From:   Internationalization Core Working Group Issue Tracker 
<sysbot+tracker@w3.org>
To:     public-i18n-core@w3.org
Date:   23/07/2011 21:51
Subject:        I18N-ISSUE-117: Close definition of dir auto [HTML5-prep]
Sent by:        public-i18n-core-request@w3.org




I18N-ISSUE-117: Close definition of dir auto [HTML5-prep]

http://www.w3.org/International/track/issues/117

Raised by: Addison Phillips
On product: HTML5-prep

3.2.3.4 The xml:base attribute (XML only)
http://www.w3.org/TR/html5/elements.html#the-xml:base-attribute-xml-only

The dir 'auto' value has this note:

--
The heuristic used by this state is very crude (it just looks at the first 
character with a strong directionality, in a manner analogous to the 
Paragraph Level determination in the bidirectional algorithm). Authors are 
urged to only use this value as a last resort when the direction of the 
text is truly unknown and no better server-side heuristic can be applied.
--

Does HTML5 need to define auto so closely that no user-agent can provide a 
better algorithm? That seems counter-productive. Some room for innovation 
should be preserved.

Received on Sunday, 24 July 2011 10:07:06 UTC