Bug 19888 - select the default value of "dir" according to "lang"
Status: NEW
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: PC Linux
Importance: P2 normal
Assigned To: Robin Berjon
QA Contact: HTML WG Bugzilla archive list
Reported: 2012-11-07 08:12 UTC by Amir E. Aharoni
Modified: 2013-03-11 11:49 UTC (History)
Description Amir E. Aharoni 2012-11-07 08:12:32 UTC
Every multilingual website, such as Wikipedia, must at some point determine the direction of content from the language code, either on the server or on the client. Because this need is universal, it should be built into HTML itself.

HTML4 specifically prohibits deducing the value of the "dir" attribute from the value of the "lang" attribute.

HTML5 in its current state also says that dir must be specified separately and that it must not depend on "lang".

I tried to understand the reasons for this. I asked several people who are knowledgeable in web standards about this, and nobody had a convincing answer. Usually the answers revolved around these issues, all of which seem very easy to resolve:

1. Problem: "The browser doesn't have enough information about the language, so it can't determine the direction from the language."

Solution: Direction is not really a content attribute. Direction is an _inherent_ property of the writing system in which the element's content is written. BCP 47 and CLDR provide sane default values for the writing system of all languages, and the direction can be easily deduced from the writing system. This may have been untrue when HTML4 was released, but CLDR is quite mature now.

2. Problem: "It's not backwards-compatible."

Solution: The default direction can be applied only to documents that specify the right DOCTYPE.

3. Problem: "Some languages can be both rtl and ltr, such as Azerbaijani and Punjabi."

Solution: All language codes have sane standard defaults, including these two languages and all others. In any case, it is completely reasonable to make web authors responsible for specifying the correct and complete language code. They can use, for example, az-Latn, az-Arab, pa-Guru or pa-Arab, or the ISO 639-3 three-letter codes azj, azb, pan and pnb. Resolving bug 19887 will make this even more robust.
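
As a rough illustration of solutions 1 and 3, the lookup could work roughly as sketched below in Python: take an explicit script subtag if the language tag has one, otherwise fall back to a default script for the language, then map the script to a direction. The names default_dir and script_of are hypothetical, and the two tables are a small hand-written sample for illustration only. A real implementation would presumably read this data from CLDR rather than hard-code it.

```python
# Hand-written sample data, NOT real CLDR output: an assumption for illustration.
DEFAULT_SCRIPT = {
    "en": "Latn", "ru": "Cyrl", "he": "Hebr", "ar": "Arab", "fa": "Arab",
    "az": "Latn", "azb": "Arab", "pa": "Guru", "pnb": "Arab",
}
# Incomplete sample of right-to-left scripts (ISO 15924 codes).
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa"}


def script_of(lang_tag):
    """Return the script for a BCP 47 style tag, or None if unknown."""
    subtags = lang_tag.split("-")
    for sub in subtags[1:]:
        if len(sub) == 1:
            break  # singleton: extensions or private use start here
        if len(sub) == 4 and sub.isalpha():
            return sub.title()  # explicit script subtag, e.g. "arab" -> "Arab"
    # No explicit script: fall back to the language's default script.
    return DEFAULT_SCRIPT.get(subtags[0].lower())


def default_dir(lang_tag):
    """Return "rtl", "ltr", or None when the language is not in the table."""
    script = script_of(lang_tag)
    if script is None:
        return None  # unknown language: keep today's behaviour
    return "rtl" if script in RTL_SCRIPTS else "ltr"


for tag in ["he", "az", "az-Arab", "pa-Guru", "pa-arab", "pnb", "en-US"]:
    print(tag, default_dir(tag))
```

Returning None for an unknown language is a deliberate choice in this sketch: the browser would simply apply whatever direction it applies today, so nothing changes for tags the data does not cover.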
Comment 1 Robin Berjon 2013-01-21 15:59:40 UTC
Mass move to "HTML WG"
Comment 2 Robin Berjon 2013-01-21 16:02:25 UTC
Mass move to "HTML WG"
Comment 3 Amir E. Aharoni 2013-02-20 07:47:17 UTC
Maybe, if people don't like this proposal because of backwards compatibility issues, it could be redone in a way similar to
http://www.w3.org/International/wiki/Html-bidi-isolation

that is: leave the attribute "lang" with the current semantics and add a new attribute called "language" that would also apply direction according to language (or, more precisely, writing system).