- Default ordering of multiple non-spacing marks
- Precomposed/decomposed diacritic character representation
- Hangul jamo vs. johab and jamo representation alternatives
- CJK compatibility ideographs
- Other backwards compatibility duplicated characters
- Separately coded Indic length/AI/AU marks
- Glyphs for vertical variants
- Croatian digraphs, other ligatures (Latin, Arabic, ...)
- Various variant punctuation (apostrophes, middle dots, spaces, ...)
- Half-width/full-width characters (Latin, Katakana and Hangul)
- Vertical variants (U+FE30...)
- Presence or absence of joiner/non-joiner
- Superscript/subscript variants (numbers and IPA)
- Small form variants (U+FE50...)
- Upper case/lower case
- Similar letters from different scripts (varying degrees) (e.g. "A" in Latin, Greek, and Cyrillic)
- Letterlike symbols, Roman numerals (varying degrees)
- Enclosed alphanumerics, katakana, hangul, ...
- Squared katakana (units, ...), squared Latin abbreviations, ...
- CJK ideograph variants (varying degrees, in particular general simplifications, backwards-compatibility non-unifications, JIS 78/83 problems)
- Ignorable whitespace, hyphens, ... (sorting)
- Ignorable accents, ... (sorting)
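
Several of the cases listed above can be tried out with a short sketch. It assumes Python's standard unicodedata module and the Unicode normalization forms (NFC, NFKC), which the list itself does not name; the helper equivalent() and the sample strings are hypothetical and chosen only for illustration. It is not meant as a definitive matching procedure, and it does not cover the sorting-related items.

```python
import unicodedata


def equivalent(a: str, b: str, form: str = "NFC", fold_case: bool = False) -> bool:
    """Hypothetical helper: compare two strings after normalizing both.

    form is one of the Unicode normalization forms ("NFC", "NFD",
    "NFKC", "NFKD"); fold_case additionally applies str.casefold().
    """
    a = unicodedata.normalize(form, a)
    b = unicodedata.normalize(form, b)
    if fold_case:
        a, b = a.casefold(), b.casefold()
    return a == b


# Precomposed vs. decomposed diacritic: U+00E9 vs. "e" + U+0301.
precomposed_matches = equivalent("\u00e9", "e\u0301")

# Ordering of multiple non-spacing marks: dot below and acute in either order.
mark_order_matches = equivalent("a\u0323\u0301", "a\u0301\u0323")

# Hangul: conjoining jamo sequence vs. precomposed syllable U+D55C.
hangul_matches = equivalent("\u1112\u1161\u11ab", "\ud55c")

# CJK compatibility ideograph U+F900 vs. unified ideograph U+8C48.
cjk_compat_matches = equivalent("\uf900", "\u8c48")

# Compatibility variants need the "K" forms: full-width Latin, half-width
# katakana, superscript digit, small form comma, Roman numeral, enclosed
# digit, squared katakana.
fullwidth_matches = equivalent("\uff21", "A", form="NFKC")
halfwidth_matches = equivalent("\uff76", "\u30ab", form="NFKC")
superscript_matches = equivalent("\u00b2", "2", form="NFKC")
small_form_matches = equivalent("\ufe50", ",", form="NFKC")
roman_numeral_matches = equivalent("\u2163", "IV", form="NFKC")
enclosed_matches = equivalent("\u2460", "1", form="NFKC")
squared_matches = equivalent("\u3314", "\u30ad\u30ed", form="NFKC")

# Upper case/lower case is handled by case folding, not by normalization.
case_matches = equivalent("A", "a", fold_case=True)

# Similar letters from different scripts stay distinct under normalization:
# Latin A (U+0041), Greek Alpha (U+0391), Cyrillic A (U+0410).
latin_greek_matches = equivalent("A", "\u0391", form="NFKC")
latin_cyrillic_matches = equivalent("A", "\u0410", form="NFKC")
```

In this sketch the canonical cases (diacritics, mark order, Hangul, CJK compatibility ideographs) already match under NFC, the width, superscript, small form, enclosed and squared variants match only under NFKC, and the look-alike letters from different scripts do not match under any normalization form.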