<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>5431</bug_id>
          
          <creation_ts>2008-01-26 02:38:10 +0000</creation_ts>
          <short_desc>Normal characters, character references</short_desc>
          <delta_ts>2009-01-21 02:22:02 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XML Schema</product>
          <component>Datatypes: XSD Part 2</component>
          <version>1.0/1.1 both</version>
          <rep_platform>Macintosh</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard>cluster: regex</status_whiteboard>
          <keywords>editorial, resolved</keywords>
          <priority>P4</priority>
          <bug_severity>minor</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Dave Peterson">davep</reporter>
          <assigned_to name="C. M. Sperberg-McQueen">cmsmcq</assigned_to>
          
          
          <qa_contact name="XML Schema comments list">www-xml-schema-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>18592</commentid>
    <comment_count>0</comment_count>
    <who name="Dave Peterson">davep</who>
    <bug_when>2008-01-26 02:38:10 +0000</bug_when>
    <thetext>The paragraph following the Normal Character (&quot;Char&quot;) production in the RE appendix says &quot;Note that a ·normal character· can be represented either as itself, or with a character reference.&quot;

Two problems: 

1.  &apos;-&apos; and &apos;^&apos; are ·normal characters·, but cannot always represent themselves in an RE.

2.  A character reference is a string of characters; the productions and accompanying semantics defining REs do not allow such a string to be interpreted other than as matching each character autonymously.  Character references are used in productions, not REs.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>18593</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2008-01-26 10:25:15 +0000</bug_when>
    <thetext>As is revealed by following the hyperlink, the character references it is referring to are those (such as &amp;#x23;) used in XML, not those (such as #x5B) used in production rules.

It might be clearer to say something like: &quot;Note: when regular expressions are written in an XML document, for example in the value attribute of the xs:pattern element, non-ASCII characters can be represented using XML entity or character references. For this reason, the regular expression syntax does not provide any way of representing characters using octal or hexadecimal character codes. The syntax defined here assumes that XML entity and character references have already been expanded.&quot;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>18603</commentid>
    <comment_count>2</comment_count>
    <who name="Dave Peterson">davep</who>
    <bug_when>2008-01-27 02:40:23 +0000</bug_when>
    <thetext>(In reply to comment #1)
&gt; As is revealed by following the hyperlink, the character references it is
&gt; referring to are those (such as &amp;#x23;) used in XML, not those (such as #x5B)
&gt; used in production rules.

You got me.  :-(  I&apos;m embarrassed.  See following.

&gt; It might be clearer to say something like: &quot;Note: when regular expressions are
&gt; written in an XML document, for example in the value attribute of the
&gt; xs:pattern element, non-ASCII characters can be represented using XML entity or
&gt; character references. For this reason, the regular expression syntax does not
&gt; provide any way of representing characters using octal or hexadecimal character
&gt; codes. The syntax defined here assumes that XML entity and character references
&gt; have already been expanded.&quot;

Good start.  We will also need to note that people using the mechanism in other situations (such as born-binary derivations in non-XML environments) will have to provide other solutions.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>23195</commentid>
    <comment_count>3</comment_count>
    <who name="C. M. Sperberg-McQueen">cmsmcq</who>
    <bug_when>2009-01-20 23:58:05 +0000</bug_when>
    <thetext>At its telcon of 19 December 2008, the XML Schema WG accepted a proposal
presented in 

  http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.dp081203.html

with amendments, as a resolution of this issue.  The changes have now
been integrated into the status-quo document, so I&apos;m marking the issue 
resolved.

DaveP, you know what to do next.</thetext>
  </long_desc>
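  <!--
  A minimal sketch of the point made in comment 1: when a regular expression
  is written in an XML document, the XML parser expands character references
  before any regex processing happens. The schema fragment, the type name
  "tag", and the helper pattern_value are hypothetical, and Python's re module
  is only a stand-in here, not an XSD regular expression engine.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical schema fragment: the pattern is written with an XML
# character reference (&#x23; stands for "#").
SCHEMA = (
    '<xs:simpleType xmlns:xs="http://www.w3.org/2001/XMLSchema" name="tag">'
    '<xs:restriction base="xs:string">'
    '<xs:pattern value="&#x23;[0-9]+"/>'
    '</xs:restriction>'
    '</xs:simpleType>'
)

XS = "{http://www.w3.org/2001/XMLSchema}"


def pattern_value(schema_text):
    """Return the value attribute of the first xs:pattern, as the parser reports it."""
    root = ET.fromstring(schema_text)
    return root.find(f".//{XS}pattern").get("value")


value = pattern_value(SCHEMA)

# The reference has already been expanded by the XML parser, so the
# regex layer only ever sees the literal character.
print(value)

# Stand-in check with Python's re (an assumption, not an XSD engine):
# the expanded pattern matches "#42", not the unexpanded reference text.
print(re.fullmatch(value, "#42") is not None)
print(re.fullmatch(value, "&#x23;42") is not None)
```

  As comment 2 notes, an environment that does not go through an XML parser
  would have to supply its own way of writing such characters.
  -->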
      
      

    </bug>

</bugzilla>