Every multilingual website, such as Wikipedia, must at some point determine the direction of its content from the language code, either on the server or on the client. Since all of them need this, it should be built into HTML itself.
HTML4 specifically prohibits deducing the value of the "dir" attribute from the value of the "lang" attribute.
HTML5 in its current state also says that "dir" must be specified separately and that it must not depend on "lang".
I tried to understand the reasons for this. I asked several people who are knowledgeable in web standards about this, and nobody had a convincing answer. Usually the answers revolved around these issues, all of which seem very easy to resolve:
1. Problem: "The browser doesn't have enough information about the language, so it can't determine the direction from the language."
Solution: Direction is not really a content attribute. Direction is an _inherent_ property of the writing system in which the element's content is written. BCP 47 and CLDR provide sane default values for the writing system of all languages, and the direction can be easily deduced from the writing system. This may have been untrue when HTML4 was released, but CLDR is quite mature now.
2. Problem: "It's not backwards-compatible."
Solution: The default direction can be applied only to documents that specify the right DOCTYPE.
3. Problem: "Some languages can be both rtl and ltr, such as Azerbaijani and Punjabi."
Solution: All language codes have sane standard defaults, including these two languages and all others. In any case, it is completely reasonable to put the responsibility for specifying the correct and complete language code in the web authors' hands. They can use, for example, az-latn, az-arab, pa-guru or pa-arab, or use ISO 639-3 three-letter codes: azj, azb, pan and pnb. Resolving bug 19887 will make this even more robust.
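To make the idea in the solutions above concrete, here is a minimal sketch of deducing direction from a language tag: take the explicit script subtag if there is one, otherwise fall back to a default script for the language, then map the script to a direction. The function name direction_for and both tables are hypothetical and hand-written for illustration; a real implementation would presumably take this data from CLDR rather than hard-code it.

```python
# Illustrative subset of right-to-left scripts (ISO 15924 codes).
# Assumption: a real implementation would read this from CLDR.
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa", "Nkoo"}

# Hypothetical, hand-written default-script table for a few languages.
# CLDR's "likely subtags" data plays this role in practice.
DEFAULT_SCRIPT = {
    "en": "Latn",
    "he": "Hebr",
    "ar": "Arab",
    "fa": "Arab",
    "az": "Latn",
    "azb": "Arab",
    "pa": "Guru",
    "pnb": "Arab",
}


def direction_for(lang_tag: str) -> str:
    """Return "rtl" or "ltr" for a BCP 47-style language tag."""
    subtags = lang_tag.replace("_", "-").split("-")
    language = subtags[0].lower()

    script = None
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            # A singleton starts an extension or private-use section;
            # no script subtag can appear after it.
            break
        if len(subtag) == 4 and subtag.isalpha():
            # Script subtags are exactly four letters, e.g. "Arab".
            script = subtag.title()
            break

    if script is None:
        # No explicit script: use the language's default, and fall
        # back to Latin for languages missing from this small table.
        script = DEFAULT_SCRIPT.get(language, "Latn")

    return "rtl" if script in RTL_SCRIPTS else "ltr"
```

With these tables, direction_for("az") and direction_for("pa") give "ltr", while direction_for("az-arab"), direction_for("pa-arab"), direction_for("azb") and direction_for("pnb") give "rtl". An author who writes the complete language code gets the right direction without touching "dir".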
Mass move to "HTML WG"
Maybe, if people don't like this proposal because of backwards-compatibility issues, it could be redone in a way similar to
that is: leave the attribute "lang" with the current semantics and add a new attribute called "language" that would also apply direction according to language (or, more precisely, writing system).
This would cause backwards-compatibility problems, and there are zero indications of any implementer support for it, nor indications of much webdev support.