[Bug 19888] New: select the default value of "dir" according to "lang"

https://www.w3.org/Bugs/Public/show_bug.cgi?id=19888

          Priority: P2
            Bug ID: 19888
                CC: mike@w3.org, robin@w3.org
          Assignee: dave.null@w3.org
           Summary: select the default value of "dir" according to "lang"
        QA Contact: public-html-bugzilla@w3.org
          Severity: normal
    Classification: Unclassified
                OS: Linux
          Reporter: amir.aharoni@mail.huji.ac.il
          Hardware: PC
            Status: NEW
           Version: unspecified
         Component: default
           Product: HTML.next

Every multilingual website, such as Wikipedia, must at some point determine the
direction of content according to the language code, either on the server or on
the client. Since all of them need it, it should be built into HTML itself.

HTML4 specifically prohibits deducing the value of the "dir" attribute from the
value of the "lang" attribute.

HTML5 in its current state also says that dir must be specified separately and
that it must not depend on "lang".

I tried to understand the reasons for this. I asked several people who are
knowledgeable in web standards about this, and nobody had a convincing answer.
Usually the answers revolved around these issues, all of which seem very easy
to resolve:

1. Problem: "The browser doesn't have enough information about the language, so
it can't determine the direction from the language."

Solution: Direction is not really a content attribute. Direction is an
_inherent_ property of the writing system in which the element's content is
written. BCP 47 and CLDR provide sane default values for the writing system of
all languages, and the direction can be easily deduced from the writing system.
This may have been untrue when HTML4 was released, but CLDR is quite mature now.

2. Problem: "It's not backwards-compatible."

Solution: The default direction can be applied only to documents that specify
the right DOCTYPE.

3. Problem: "Some languages can be both rtl and ltr, such as Azerbaijani and
Punjabi."

Solution: All language codes have sane standard defaults, including these two
languages and all others. In any case, it is completely reasonable to put the
responsibility to specify the correct and complete language code in the web
authors' hands. They can use, for example, az-latn, az-arab, pa-guru or pa-arab
or use ISO 639-3 three-letter codes: azj, azb, pan and pnb. Resolving bug 19887
will make this even more robust.
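
The lookup described in points 1 and 3 could be sketched roughly as follows, in
Python. This is only an illustration, not a proposed algorithm: the function
name default_dir and both tables are made up for the example, the default-script
table is a tiny hand-written subset that a real implementation would take from
CLDR's likely-subtags data, and falling back to "ltr" for unknown languages is
an assumption.

```python
# Hypothetical, hand-written subset of "language -> default script" data.
# A real implementation would load this from CLDR instead of hard-coding it.
DEFAULT_SCRIPT = {
    "en": "Latn",
    "he": "Hebr",
    "ar": "Arab",
    "fa": "Arab",
    "az": "Latn",   # Azerbaijani defaults to Latin script
    "pa": "Guru",   # Punjabi defaults to Gurmukhi script
    "azb": "Arab",  # South Azerbaijani
    "pnb": "Arab",  # Western Punjabi
}

# Illustrative subset of right-to-left scripts (ISO 15924 codes).
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo"}


def default_dir(lang_tag: str) -> str:
    """Return "rtl" or "ltr" for a BCP 47 language tag (sketch only)."""
    subtags = lang_tag.split("-")
    language = subtags[0].lower()

    # An explicit script subtag is four ASCII letters, e.g. "Arab" in "az-Arab".
    script = None
    for sub in subtags[1:]:
        if len(sub) == 1:
            break  # singleton: extensions or private use start here
        if len(sub) == 4 and sub.isascii() and sub.isalpha():
            script = sub.title()
            break

    # No explicit script: fall back to the language's default script.
    if script is None:
        script = DEFAULT_SCRIPT.get(language)

    # Assumption: unknown languages keep the existing "ltr" default.
    return "rtl" if script in RTL_SCRIPTS else "ltr"
```

With this sketch, default_dir("az") gives "ltr" while default_dir("az-arab")
gives "rtl", so an explicit script subtag always wins over the default.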

Received on Wednesday, 7 November 2012 08:12:33 UTC