I would like to make a suggestion about the section 'Deriving Character Sets from XML Schema Regular Expressions'. I propose that datatypes whose regular expression contains a "charClassSub" should have no restricted character set.

The reason is that every other part of the regular expression derivation only ever forms a union of characters, which makes it very efficient to determine whether the expression yields a restricted character set. Allowing 'charClassSub' in the derivation complicates this, because the program must now remove characters from the set as well as add them. This can be a problem when the character set contains a large number of characters, as in:

[ -＀-[`-＀]]

That regular expression yields a restricted character set of 64 characters; however, an implementation (admittedly a naive one) may need to store thousands of characters before excluding them in the 'charClassSub' portion of the regular expression.

Another problem is nested 'charClassSub' sets. For example, the following regular expression is allowed:

[A-Z-[B-Z-[C-Z-[D-Z-[E-Z-[...]]]]]]

Both problems make 'charClassSub' problematic in restricted character set derivation.

I thank you for your time.
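To make the first concern concrete, here is a minimal Python sketch of the naive approach described above. It assumes the two range endpoints in the first example are U+0020 (space) and U+FF00, with U+0060 (backtick) starting the subtracted range. The helper names `char_range` and `evaluate_nested` are hypothetical and not taken from any specification. The sketch counts raw code points and does not filter out code points that are not legal XML characters.

```python
def char_range(first, last):
    """Return the set of code points from first to last, inclusive."""
    return set(range(ord(first), ord(last) + 1))


def evaluate_nested(ranges):
    """Evaluate a nested subtraction [r0-[r1-[r2-...]]] naively.

    `ranges` is a list of (first, last) pairs, outermost first.
    Every level materialises a full set before subtracting.
    """
    result = set()
    # Work from the innermost range outwards.
    for first, last in reversed(ranges):
        result = char_range(first, last) - result
    return result


# First example: the base range minus the subtracted range.
base = char_range("\u0020", "\uFF00")      # assumed endpoints, see lead-in
excluded = char_range("\u0060", "\uFF00")
restricted = base - excluded

print(len(base))        # 65249 code points stored before subtraction
print(len(excluded))    # 65185 code points to remove
print(len(restricted))  # 64 code points remain

# Hypothetical three-level class [A-Z-[B-Z-[C-Z]]], chosen only for
# illustration; it is not the elided expression from the comment.
nested = evaluate_nested([("A", "Z"), ("B", "Z"), ("C", "Z")])
print(sorted(chr(cp) for cp in nested)[:3])  # ['A', 'C', 'D']
```

In this sketch, the naive evaluation holds tens of thousands of code points in memory to arrive at a 64-character result, and each additional nesting level repeats the build-then-subtract step.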