This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
The paragraph following the Normal Character ("Char") production in the RE appendix says "Note that a ·normal character· can be represented either as itself, or with a character reference."

Two problems:

1. '-' and '^' are ·normal characters·, but cannot always represent themselves in an RE.

2. A character reference is a string of characters; the productions and accompanying semantics defining REs do not allow such a string to be interpreted other than matching each character autonymously. Character references are used in productions, not REs.
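To make the first problem concrete, here is a minimal sketch. It assumes Python's re module as a stand-in engine; that is not the XSD regular expression dialect, but it treats '-' and '^' inside a character class in the same way for these cases. The helper name matches is hypothetical.

```python
import re

# Python's re module is only a stand-in here; it is NOT the XSD regex dialect.
def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# Inside a character class, an unescaped '-' between two characters is a
# range operator, so it does not represent itself.
hyphen_as_range = matches("[a-c]", "-")
# Escaped, the same '-' is a literal.
hyphen_escaped = matches(r"[a\-c]", "-")

# A leading '^' negates the class, so it does not represent itself either.
caret_negates = matches("[^a]", "a")
# Escaped, the same '^' is a literal.
caret_escaped = matches(r"[\^a]", "^")

print(hyphen_as_range, hyphen_escaped, caret_negates, caret_escaped)
```

In both unescaped cases the character is consumed as syntax, which is the sense in which a ·normal character· cannot always be "represented as itself".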
As is revealed by following the hyperlink, the character references it is referring to are those (such as &#x5B;) used in XML, not those (such as #x5B) used in production rules.

It might be clearer to say something like:

"Note: when regular expressions are written in an XML document, for example in the value attribute of the xs:pattern element, non-ASCII characters can be represented using XML entity or character references. For this reason, the regular expression syntax does not provide any way of representing characters using octal or hexadecimal character codes. The syntax defined here assumes that XML entity and character references have already been expanded."
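The last sentence of the proposed note can be illustrated with a small sketch. It assumes Python's xml.etree.ElementTree as the XML parser and Python's re as a stand-in for an XSD regex engine; the schema fragment, the pattern value, and the helper name pattern_after_xml_parsing are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; the pattern value is invented for this example.
XSD_FRAGMENT = (
    '<xs:pattern xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'value="caf&#xE9;&#x5B;0-9&#x5D;"/>'
)

def pattern_after_xml_parsing(xml_text):
    """Return the value attribute as the XML parser delivers it."""
    return ET.fromstring(xml_text).get("value")

# The XML parser expands the character references before any regex
# processing, so the regex engine never sees '&#xE9;' or '&#x5B;'.
expanded = pattern_after_xml_parsing(XSD_FRAGMENT)

# Python's re is only a stand-in for an XSD regex engine here.
digit_suffix_matches = re.fullmatch(expanded, "caf\u00e97") is not None

print(repr(expanded), digit_suffix_matches)
```

Note that the expanded '[' and ']' act as character-class brackets rather than literal characters, so a character reference is not a way to escape regex syntax; it only changes how the character is written in the XML document.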
(In reply to comment #1)
> As is revealed by following the hyperlink, the character references it is
> referring to are those (such as &#x5B;) used in XML, not those (such as #x5B)
> used in production rules.

You got me. :-( I'm embarrassed. See following.

> It might be clearer to say something like: "Note: when regular expressions are
> written in an XML document, for example in the value attribute of the
> xs:pattern element, non-ASCII characters can be represented using XML entity or
> character references. For this reason, the regular expression syntax does not
> provide any way of representing characters using octal or hexadecimal character
> codes. The syntax defined here assumes that XML entity and character references
> have already been expanded."

Good start. We also will need to comment that people using the mechanism in other situations (such as born-binary derivations in non-XML environments) will have to provide other solutions.
At its telcon of 19 December 2008, the XML Schema WG accepted a proposal presented in http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.dp081203.html with amendments, as a resolution of this issue. The changes have now been integrated into the status-quo document, so I'm marking the issue resolved. DaveP, you know what to do next.