ISSUE-66: find() method sensitivity to Unicode normalization

State: CLOSED
Product: contacts-api
Raised by: Koji Ishii
Opened on: 2011-07-04
Description:
Section 4.2.1 find method
http://dev.w3.org/2009/dap/contacts/#methods

Section 5
http://dev.w3.org/2009/dap/contacts/#contact-search-processing

WG Approved: Yes

As with I18N-ISSUE-65, the find() method and search processing do not clearly define what constitutes a "match". When processing a search, we feel that it should be clear whether Unicode normalization has been applied to the arguments, to the contacts being searched, or to both.

In our WG's opinion, Unicode normalization is desirable when searching, since many keyboards or user-agents generate non-normalized search strings (for example, Vietnamese keyboards vary by vendor). As a result, search strings entered by the user might not match content that uses a different normalization form. For example, a user might enter the pre-composed character U+1EA6 (LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE), or they might enter U+00C2 U+0300 instead. [They could also technically use the fully decomposed sequence U+0041 U+0302 U+0300, although this is less likely.] Ensuring that searches are done in a normalized manner will improve interoperability, since a collection of contacts may have been entered on a variety of devices and into a variety of systems.
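To make the example concrete, the following is a minimal sketch in Python (chosen only for illustration; the Contacts API itself is not a Python API). It uses the standard unicodedata module to show that U+1EA6, U+00C2 U+0300, and U+0041 U+0302 U+0300 differ as raw code point sequences but collapse to a single string under NFC or NFD. The variable names are illustrative assumptions.

```python
import unicodedata

# Three canonically equivalent spellings of the same user-perceived character.
precomposed = "\u1EA6"              # LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND GRAVE
partial = "\u00C2\u0300"            # A WITH CIRCUMFLEX + COMBINING GRAVE ACCENT
decomposed = "\u0041\u0302\u0300"   # A + COMBINING CIRCUMFLEX + COMBINING GRAVE

spellings = (precomposed, partial, decomposed)

# Compared code point by code point, the three spellings are all different.
raw_equal = precomposed == partial == decomposed

# After normalization to a common form, they are identical.
nfc_forms = {unicodedata.normalize("NFC", s) for s in spellings}
nfd_forms = {unicodedata.normalize("NFD", s) for s in spellings}

print(raw_equal)       # False
print(len(nfc_forms))  # 1
print(len(nfd_forms))  # 1
```

Either form works for comparison as long as both sides of the comparison use the same one.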

The I18N WG recommends requiring that comparisons be done in a Unicode-normalized manner. We note that this issue is currently before the TAG, so the guidance here is subject to change. If the TAG were to decide that normalization is undesirable, a health warning would be warranted.
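As a rough illustration of what a normalized comparison could look like, the sketch below normalizes both the search string and each stored value to NFC before testing for a substring match. This is not the algorithm defined by the Contacts API: the function normalized_find, the "name" field, and the substring matching rule are all hypothetical choices made for the example.

```python
import unicodedata

def normalized_find(contacts, query, form="NFC"):
    """Return contacts whose name contains query, comparing in one normalization form.

    Hypothetical helper: the dict layout and substring rule are assumptions.
    """
    needle = unicodedata.normalize(form, query)
    return [
        c for c in contacts
        if needle in unicodedata.normalize(form, c["name"])
    ]

# Contacts entered on different devices, using different spellings of U+1EA6.
contacts = [
    {"name": "\u1EA6n"},              # precomposed
    {"name": "\u0041\u0302\u0300n"},  # fully decomposed
    {"name": "An"},                   # unaccented, should not match
]

query = "\u00C2\u0300n"               # partially composed search string

# A naive code point comparison finds nothing.
naive = [c for c in contacts if query in c["name"]]

# The normalized comparison finds both accented entries.
matches = normalized_find(contacts, query)

print(len(naive))    # 0
print(len(matches))  # 2
```

Normalizing stored values on every search is the simplest approach to show; an implementation might instead normalize contacts once when they are stored.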

For more information on normalization see:

Unicode Standard Annex 15 http://www.unicode.org/reports/tr15/
Character Model-Normalization http://www.w3.org/TR/charmod-norm
Related Action Items:
No related actions
Related emails:
  1. Review of tracker issues for best practices (part II) (from addison@lab126.com on 2015-03-28)
  2. Re: I18N-ISSUE-66: find() method sensitivity to Unicode normalization [Contacts API] (from robin@berjon.com on 2011-07-05)
  3. I18N-ISSUE-66: find() method sensitivity to Unicode normalization [Contacts API] (from sysbot+tracker@w3.org on 2011-07-04)

Related notes:

No additional notes.

Addison Phillips <addisonI18N@gmail.com>, Chair, Richard Ishida <ishida@w3.org>, Bert Bos <bert@w3.org>, Fuqiao Xue <xfq@w3.org>, Atsushi Shimono <atsushi@w3.org>, Staff Contacts