Hello,

From 7.1.10.1 Restricted Character Sets:

"... If the restricted character set for a datatype contains at least 255 characters or contains non-BMP characters, the character set of the datatype is not restricted and can be omitted from further consideration..."

Appendix E, Deriving Character Sets from XML Schema Regular Expressions, explains how to build character sets. It enumerates the character groups which, if contained in a regular expression atom, cause the character set of the whole expression to be defined as the entire set of XML characters. One of the exceptions is the multi-character escape "\d". By the XSD definition it is equivalent to the category escape "\p{Nd}". But according to the UnicodeData.txt data file of Unicode 5.0.0, this category contains 290 characters (230 BMP and 60 non-BMP).

The exception for "\d" (and "\p{Nd}") is incorrect: after all processing, the expression "\d" is not suitable for datatype encoding with a restricted character set, since the set has more than 255 characters and contains non-BMP characters.

Here are the totals from UnicodeData.txt:

Category   BMP    non-BMP   Total chars   Excl. in EXI
\p{Cc}     65     0         65
\p{Cf}     33     105       138           ?
\p{Co}     2      4         6             X
\p{Cs}     6      0         6
\p{Ll}     1102   532       1634          X
\p{Lm}     167    0         167
\p{Lo}     6009   1954      7963          X
\p{Lt}     31     0         31
\p{Lu}     836    484       1320          X
\p{Mc}     167    8         175           ?
\p{Me}     10     0         10
\p{Mn}     602    278       880           X
\p{Nd}     230    60        290           ?
\p{Nl}     51     159       210           ?
\p{No}     252    84        336           ?
\p{Pc}     10     0         10
\p{Pd}     18     0         18
\p{Pe}     65     0         65
\p{Pf}     9      0         9
\p{Pi}     11     0         11
\p{Po}     260    18        278           ?
\p{Ps}     66     0         66
\p{Sc}     41     0         41
\p{Sk}     99     0         99
\p{Sm}     904    10        914           X
\p{So}     2350   608       2958          X
\p{Zl}     1      0         1
\p{Zp}     1      0         1
\p{Zs}     18     0         18

Regards,
Yuri Delendik
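
A tally like the one in this comment could be reproduced with a short script along the following lines. This is only a sketch, not the method actually used by the commenter: the function names `tally_categories` and `is_restrictable` are hypothetical, and it assumes that each line of UnicodeData.txt counts as one entry (the small totals for \p{Co} and \p{Cs} suggest that range markers were counted as lines rather than expanded). The check in `is_restrictable` simply restates the rule quoted from 7.1.10.1.

```python
from collections import defaultdict


def tally_categories(lines):
    """Count UnicodeData.txt entries per general category.

    Returns {category: [bmp_count, non_bmp_count]}. Each line counts as
    one entry; First/Last range markers are NOT expanded (an assumption).
    """
    counts = defaultdict(lambda: [0, 0])
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code_point = int(fields[0], 16)  # field 0: code point in hex
        category = fields[2]             # field 2: general category
        index = 0 if code_point <= 0xFFFF else 1
        counts[category][index] += 1
    return dict(counts)


def is_restrictable(bmp, non_bmp):
    """Rule quoted from 7.1.10.1: a set with at least 255 characters,
    or with any non-BMP character, is not treated as restricted."""
    return non_bmp == 0 and (bmp + non_bmp) < 255


# A few entries in UnicodeData.txt format, used here as sample input.
sample = [
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
    "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;",
    "1D7CE;MATHEMATICAL BOLD DIGIT ZERO;Nd;0;EN;<font> 0030;0;0;0;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
]

if __name__ == "__main__":
    for category, (bmp, non_bmp) in sorted(tally_categories(sample).items()):
        print(category, bmp, non_bmp, bmp + non_bmp)
```

To run it against the real data, pass an open file object for a local copy of UnicodeData.txt to `tally_categories` in place of `sample`. With the figures reported in the comment for \p{Nd} (230 BMP, 60 non-BMP), `is_restrictable(230, 60)` evaluates to false on both grounds.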