JavaScript Internationalization
Issues with Current Spec Wording
- Remove the prohibition that keeps toLowerCase/toUpperCase from case-mapping supplementary characters correctly. (P1)
- Section 15.9.1.8 strongly encourages implementations to apply the current DST rules to past dates rather than the rules actually in effect at the time. Too strong? (P1)
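Both issues can be made concrete with a small sketch. It assumes a current engine that case-maps by code point and ships Intl with historical time zone data, neither of which ES5 requires. The helper names upperPerCodeUnit and hourInZone are hypothetical and exist only for this illustration.

```javascript
// Case-maps each UTF-16 code unit on its own. A lone surrogate has no case
// mapping, so a supplementary letter passes through unchanged.
function upperPerCodeUnit(s) {
  return s.split("").map((unit) => unit.toUpperCase()).join("");
}

const deseretSmallLongI = "\uD801\uDC28";   // U+10428 DESERET SMALL LETTER LONG I
const deseretCapitalLongI = "\uD801\uDC00"; // U+10400 DESERET CAPITAL LETTER LONG I

// Code-point-aware casing (assumed: the engine maps whole code points).
const upperByCodePoint = deseretSmallLongI.toUpperCase();

// Returns the wall-clock hour (0-23) of `date` in the named time zone.
// Assumption: the runtime's Intl data carries historical DST rules.
function hourInZone(date, timeZone) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "numeric",
    hourCycle: "h23",
  }).formatToParts(date);
  return Number(parts.find((part) => part.type === "hour").value);
}

// 1 November 2006, 12:00 UTC. US DST ended on 29 October in 2006, so New York
// was on standard time (UTC-5). Applying the post-2007 rules to this date
// would wrongly put it on daylight time (UTC-4).
const pastInstant = new Date(Date.UTC(2006, 10, 1, 12));
const newYorkHour = hourInZone(pastInstant, "America/New_York");
```

With historical rules the hour comes out as 7; an implementation that follows the 15.9.1.8 recommendation literally would compute 8 for the same instant.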
Locale-related Behavior
((Some aspects could be permitted without major changes; others require major work.))
- Locale parameters for formatting dates, numbers, lists, toLocaleString(#locale). (P1)
- Locale-sensitive sorting (P1)
- Method for obtaining the default locale (P1) and for obtaining available locales (P2)
- Method to obtain default time zone. (P1)
- MessageFormat (P2)
- Date/Time formatting pattern strings (P2)
- TimeZone parameter for formatting dates. (P3)
- Note: IETF BCP 47 language tags are generally considered the standard for identifying locales and language specific formats.
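As an illustration of what these requests could look like in practice, the sketch below uses the Intl object from ECMA-402, which postdates these notes. It is an assumption that the runtime provides Intl with locale data. The helper names sortForLocale, describeDefaults, supportedOf, and formatInZone are hypothetical. MessageFormat and pattern strings are not covered.

```javascript
// Locale-sensitive sorting: a collator's compare method works as a sort callback.
function sortForLocale(words, locale) {
  return [...words].sort(new Intl.Collator(locale).compare);
}

// Default locale and default time zone, read from a formatter's resolved options.
function describeDefaults() {
  const { locale, timeZone } = new Intl.DateTimeFormat().resolvedOptions();
  return { locale, timeZone };
}

// Available locales: filters a list of BCP 47 tags down to the supported ones.
function supportedOf(tags) {
  return Intl.DateTimeFormat.supportedLocalesOf(tags);
}

// Date formatting with both a locale parameter and a time zone parameter.
function formatInZone(date, locale, timeZone) {
  return new Intl.DateTimeFormat(locale, {
    timeZone,
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).format(date);
}

// Number formatting with a locale parameter.
const grouped = new Intl.NumberFormat("en-US").format(1234567.891);

// List formatting with a locale parameter.
const listed = new Intl.ListFormat("en", { style: "long", type: "conjunction" })
  .format(["dates", "numbers", "lists"]);
```

Every locale argument here is a BCP 47 language tag, in line with the note above.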
Regular Expressions
(( Important, but requires major work. ))
- Character classes for complete range of Unicode characters (digit, letter, etc.) (P2)
- Sets and ranges with supplementary characters (P2)
- Grapheme cluster handling (counting, parsing, incrementing) and code point handling. (P2)
- See UTS#18 (http://www.unicode.org/reports/tr18/) and Perl regexp for more.
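A rough sketch of the three requests, written against features that were added to the language after these notes: the `u` flag, `\p{...}` property escapes, and Intl.Segmenter. Availability of all three is an assumption about the runtime, and the helper names are hypothetical.

```javascript
// Character classes over the full Unicode range (property escapes need the `u` flag).
const isAllLetters = (s) => /^\p{L}+$/u.test(s);
const hasDecimalDigit = (s) => /\p{Nd}/u.test(s);

// A range whose endpoints are supplementary characters. The `u` flag makes
// the pattern operate on whole code points rather than surrogate code units.
const inEmoticonsBlock = (s) => /^[\u{1F600}-\u{1F64F}]$/u.test(s);

// Code point handling: string iteration yields code points, not code units.
const countCodePoints = (s) => [...s].length;

// Grapheme cluster counting, assuming Intl.Segmenter is available.
function countGraphemes(s, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  return [...segmenter.segment(s)].length;
}
```

For example, "e" followed by U+0301 COMBINING ACUTE ACCENT is two code points but one grapheme cluster, so countCodePoints and countGraphemes disagree on it.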
Supplementary Character Support and Unicode References
- Track the Unicode version, at least at the major version level (currently 6.0). (P1)
- Remove references to UCS-2 and require UTF-16 support. Require full character set and remove limits to BMP. (P1)
- Unicode escapes to support supplementary characters directly (e.g. \U######, \u{######}) (P3) (regex???)
- Possibly extend fromCharCode() to accept supplementary code points or provide "fromCodePoint()". (P1)
- Add "codePointAt()" to complement "charCodeAt()" to support supplementary characters. (P1)
- The set of Line Terminators is missing some characters. (P1)
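To show what the proposed "fromCodePoint()" and "codePointAt()" would have to do, here is a minimal sketch built only on fromCharCode() and charCodeAt(). The function names codePointToString and codePointAtIndex are hypothetical. The built-in String.fromCodePoint and String.prototype.codePointAt used in the comparison were standardized later and are assumed to be present.

```javascript
// Converts one code point to a string, emitting a surrogate pair when the
// code point lies outside the BMP.
function codePointToString(codePoint) {
  if (!Number.isInteger(codePoint) || codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError("Invalid code point: " + codePoint);
  }
  if (codePoint <= 0xFFFF) {
    return String.fromCharCode(codePoint);
  }
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);  // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits
  return String.fromCharCode(high, low);
}

// Reads the code point that starts at code unit index `index`. An unpaired
// surrogate is returned as is, the same way charCodeAt() would return it.
function codePointAtIndex(str, index) {
  const first = str.charCodeAt(index);
  if (first >= 0xD800 && first <= 0xDBFF && index + 1 < str.length) {
    const second = str.charCodeAt(index + 1);
    if (second >= 0xDC00 && second <= 0xDFFF) {
      return (first - 0xD800) * 0x400 + (second - 0xDC00) + 0x10000;
    }
  }
  return first;
}

// U+1D11E MUSICAL SYMBOL G CLEF, written with the brace escape form.
const gClef = "\u{1D11E}";
const sameAsBuiltIn =
  codePointToString(0x1D11E) === String.fromCodePoint(0x1D11E) &&
  codePointAtIndex(gClef, 0) === gClef.codePointAt(0);
```

The string still holds two UTF-16 code units, so gClef.length is 2 and gClef.charCodeAt(0) returns only the high surrogate 0xD834.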
Supplementary character support is an important requirement. The changes made to the Java programming language in this regard (adding methods that access code points rather than UTF-16 code units) might be an appropriate model. Norbert Lindenberg's article on the choices Sun made is a good reference:
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
Some backup notes and references
Markus's pages:
- http://sites.google.com/site/markusicu/unicode/es/unicode-2003
- http://sites.google.com/site/markusicu/unicode/es/i18n-2003
ECMAScript 5ed:
