[Bug 6746] New: case-insensitivity of other than a-z and A-Z, e.g., diacritics

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6746

           Summary: case-insensitivity of other than a-z and A-Z, e.g.,
                    diacritics
           Product: HTML WG
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P3
         Component: HTML 5: The Markup Language
        AssignedTo: mike@w3.org
        ReportedBy: Nick_Levinson@yahoo.com
         QAContact: public-html-bugzilla@w3.org
                CC: public-html@w3.org


Shouldn't there be a form of case-insensitivity, or a variant of it, that also
covers diacritically-marked letters? Recognizing an option for diacritics and
anything like them would make authoring somewhat easier.

This should also apply to any characters other than a-z and A-Z that exist in
multiple cases. I don't know if there are any other than diacritically-marked
letters, but all that's needed is an abstract definition.

7-bit ASCII has no cased letters other than the 26 in a-z and A-Z, but other
charsets do.

This refers to http://www.w3.org/html/wg/markup-spec/ (Editor's Draft, 24 March
2009, accessed 27 March 2009), section 4. Presumably, it also applies to many
other programming and authoring contexts.

For the HTML 5 standard, I think all that would be needed is a term, such as
_extended-case-insensitivity_. The definition would extend to any pair of
characters that differ only in case. Listing all possible case pairs can be
deferred and done by others, perhaps using a Wiki so anyone can add case pairs
from various alphabets.
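
As a rough illustration of what such a definition could mean in practice, here
is a minimal Python sketch. It assumes the case pairs come from the Unicode
case mappings built into Python rather than from a hand-maintained list; the
function names are hypothetical and are not part of any spec.

```python
def is_case_pair(x: str, y: str) -> bool:
    """Hypothetical check: two distinct single characters that fold to the same text."""
    return len(x) == 1 and len(y) == 1 and x != y and x.casefold() == y.casefold()


def extended_case_insensitive_equal(a: str, b: str) -> bool:
    """Hypothetical comparison: ignore case for every cased letter, not only a-z and A-Z."""
    # casefold() applies Unicode case folding; it does not strip diacritics,
    # so a letter with a tilde never compares equal to the plain letter.
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(is_case_pair("\u00f1", "\u00d1"))   # n-tilde, lower and upper case
    print(is_case_pair("\u00f1", "n"))        # n-tilde versus plain n
    print(extended_case_insensitive_equal("Pe\u00f1a", "PE\u00d1A"))
```

One caveat of this sketch: it does not normalize text, so a precomposed letter
and the same letter written with a separate combining mark would still compare
as different.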

Implementation need not be mandatory. Each user agent designer and each tool
designer could implement it using agreed-upon terminology whenever they choose
to. Once one browser recognizes extended case insensitivity, authors can take
advantage of it.

Example: In a form, a user types their name in sentence case with a tilde over
a lower-case letter. From many form submissions, a list of names is produced in
all capitals. The tilde should be preserved through the case change. It can be
now, but it takes more work to, for instance, write a regular expression that
recognizes such characters case-insensitively. The trend, albeit delayed,
toward internationalization and compatibility with popular use means a growing
expectation that such characters will be accepted just as they are when
hand-written.
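
The form example can be sketched in Python as follows. This is only an
illustration under the assumption that the tool handles text as Unicode; the
helper names are hypothetical.

```python
import re


def to_list_form(name: str) -> str:
    """Hypothetical helper: upper-case a submitted name, keeping its diacritics."""
    return name.upper()


def matches_name(wanted: str, candidate: str) -> bool:
    """Hypothetical helper: case-insensitive whole-string match on Unicode text."""
    # re.escape() keeps the wanted text literal; re.IGNORECASE on str patterns
    # uses Unicode case rules, so it also covers letters outside a-z and A-Z.
    return re.fullmatch(re.escape(wanted), candidate, re.IGNORECASE) is not None


if __name__ == "__main__":
    listed = to_list_form("Pe\u00f1a")          # name typed with a tilde over the n
    print(listed)
    print(matches_name("pe\u00f1a", listed))    # same name, different case
    print(matches_name("pena", listed))         # no tilde, so a different name
```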

Thank you.

-- 
Nick



Received on Sunday, 29 March 2009 06:24:42 UTC