Equivalences: A long list
- Problems already in ASCII: Upper/lower case; l/I/1, 0/O,...
- Precomposed (å) vs. decomposed (a + combining ring above, U+030A) (canonical)
- Singletons (Å from Latin-1, U+00C5, vs. Ångström sign, U+212B) (canonical)
- Croatian digraphs
- Full-width Latin compatibility variants
- Half-width Kana and Hangul compatibility variants
- Vertical compatibility variants (U+FE30...)
- Superscript/subscript variants (numbers and IPA)
- Small form compatibility variants (U+FE50...)
- Enclosed/encircled alphanumerics, Kana, Hangul,...
- Letterlike symbols, Roman numerals,...
- Squared Katakana and Latin abbreviations (units,...)
- Hangul jamo representation alternatives for historical Hangul
- Presence or absence of joiner/non-joiner and other control
characters
- Upper case/lower case distinction
- Distinction between Katakana and Hiragana
- Similar letters from different scripts (e.g. "A" in Latin, Greek, and
Cyrillic)
- CJK ideograph variants (glyph variants introduced due to the source
separation rule, simplifications)
- Various punctuation variants (apostrophes, middle dots, spaces,...)
- Ignorable whitespace, hyphens,...
- Ignorable accents,...
- ...
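
A few of the items above can be checked with a short sketch. It assumes Python's standard unicodedata module; the helper name equivalent() is hypothetical and only for illustration. It shows which pairs collapse under canonical normalization (NFC), which need compatibility normalization (NFKC), and which are not covered by normalization at all.

```python
import unicodedata


def equivalent(a: str, b: str, form: str) -> bool:
    """Hypothetical helper: True if a and b are identical after normalizing to `form`."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Canonical equivalences: NFC is enough.
precomposed = "\u00e5"        # å as a single code point
decomposed = "a\u030a"        # a + combining ring above
angstrom_sign = "\u212b"      # singleton: ANGSTROM SIGN
a_with_ring = "\u00c5"        # Å from Latin-1

# Compatibility equivalences: need NFKC (or NFKD).
fullwidth_a = "\uff21"        # full-width Latin A
halfwidth_ka = "\uff76"       # half-width Katakana KA
croatian_dz = "\u01c6"        # Croatian digraph dž
roman_four = "\u2163"         # Roman numeral four
squared_mhz = "\u3392"        # squared MHz

# Not covered by any normalization form: look-alikes across scripts.
latin_a, greek_alpha, cyrillic_a = "A", "\u0391", "\u0410"

print(equivalent(precomposed, decomposed, "NFC"))
print(equivalent(angstrom_sign, a_with_ring, "NFC"))
print(equivalent(fullwidth_a, "A", "NFC"), equivalent(fullwidth_a, "A", "NFKC"))
print(unicodedata.normalize("NFKC", roman_four))
print(equivalent(latin_a, greek_alpha, "NFKC"), equivalent(latin_a, cyrillic_a, "NFKC"))

# Case distinctions are handled by case folding, not by normalization.
print("A".casefold() == "a".casefold())
```

Items further down the list (Katakana vs. Hiragana, cross-script look-alikes, ignorable accents or punctuation) are outside the normalization forms and would need separate, application-specific mappings.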