<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>21425</bug_id>
          
          <creation_ts>2013-03-28 16:40:32 +0000</creation_ts>
          <short_desc>possible regex error in test cases from &quot;fn-matches.re&quot;</short_desc>
          <delta_ts>2013-06-13 10:27:04 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>XQuery 3 &amp; XPath 3 Test Suite</component>
          <version>Working drafts</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Sorin Nasoi">spungi</reporter>
          <assigned_to name="O&apos;Neil Delpratt">oneil</assigned_to>
          <cc>mike</cc>
    
    <cc>paul</cc>
    
    <cc>tim</cc>
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>85100</commentid>
    <comment_count>0</comment_count>
    <who name="Sorin Nasoi">spungi</who>
    <bug_when>2013-03-28 16:40:32 +0000</bug_when>
    <thetext>The test case &quot;re00056&quot; is:

    (every $s in tokenize(&apos;&apos;, &apos;,&apos;)
      satisfies matches($s, &apos;^(?:[^a-d-b-c])$&apos;))
    and
    (every $s in tokenize(&apos;a-b,c-c,ab,cc&apos;, &apos;,&apos;)
      satisfies not(matches($s, &apos;^(?:[^a-d-b-c])$&apos;)))

The regular expression [^a-d-b-c] seems wrong. The &quot;a-d&quot; means &apos;a&apos; through &apos;d&apos;, i.e., &quot;abcd&quot;, and the &quot;b-c&quot; means &apos;b&apos; through &apos;c&apos;, i.e., &quot;bc&quot;. However, the &apos;-&apos; between the &apos;d&apos; and &apos;b&apos; makes no sense. It can&apos;t mean &apos;d&apos; through &apos;b&apos;, since &apos;b&apos; is less than &apos;d&apos;. Nor can it mean &quot;a-d without &apos;b&apos; and without &apos;c&apos;&quot;, i.e., character class subtraction per &lt;http://www.w3.org/TR/xmlschema-2/#nt-charClassSub&gt;, since subtraction requires a bracketed class after the hyphen, as in [a-d-[b-c]].

Similarly, the test case &quot;re00086&quot; is:

    (every $s in tokenize(&apos;,a-1x-7,c-4z-9,a-1z-8a-1z-9,a1z-9,a-1z8,a-1,z-9&apos;, &apos;,&apos;)
      satisfies matches($s, &apos;^(?:[a-c-1-4x-z-7-9]*)$&apos;))
    and
    (every $s in tokenize(&apos;&apos;, &apos;,&apos;)
      satisfies not(matches($s, &apos;^(?:[a-c-1-4x-z-7-9]*)$&apos;)))

The regular expression [a-c-1-4x-z-7-9] seems wrong for the same reason.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>85109</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2013-03-28 17:21:46 +0000</bug_when>
    <thetext>The meaning of hyphens in regular expressions is underspecified in XSD 1.0, but is clarified in XSD 1.1. My advice would be that XPath implementations should follow what XSD 1.1 says, though this is not mandatory. I believe that the tests are consistent with this assumption.

For example, I think that [a-c-1-4x-z-7-9] means [a-c]|-|[1-4]|[x-z]|-|[7-9].</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>85113</commentid>
    <comment_count>2</comment_count>
    <who name="Paul J. Lucas">paul</who>
    <bug_when>2013-03-28 17:50:25 +0000</bug_when>
    <thetext>But according to XML Schema Part 2, section F.1 under &quot;Character Range&quot;:

• The - character is a valid character range only at the beginning or end of a ·positive character group·.

Hence the two -&apos;s in the middle that you break out in your &quot;means&quot; would seem to contradict that.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>85115</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2013-03-28 18:11:58 +0000</bug_when>
    <thetext>Yes, XSD 1.0 is a mess in this area. It also says (two bullets earlier) that &quot;-&quot; is not a valid character range, which contradicts the sentence you cite. If you try and implement what XSD 1.0 says regarding hyphens, you tie yourself in knots, which is why I suggest that following the XSD 1.1 spec is the wisest course.

But you&apos;re probably right that these test results should be marked as being dependent on XSD 1.1 support. I wouldn&apos;t like to say what the correct result is for a processor following the 1.0 rules.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>87103</commentid>
    <comment_count>4</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2013-05-01 13:40:44 +0000</bug_when>
    <thetext>Looking at this again, although XSD 1.0 is confused about hyphens in regexes, I think it is clear that a-b-c-d is not allowed. (The rules are contradictory, but not about this particular case).

So I&apos;m going to split these two tests into XSD 1.0 and XSD 1.1 versions.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>89239</commentid>
    <comment_count>5</comment_count>
    <who name="Tim Mills">tim</who>
    <bug_when>2013-06-13 09:59:10 +0000</bug_when>
    <thetext>If regular expressions of the form 

[a-d-b-c]

are now considered to be errors, I think there are two other tests which are possibly incorrect.
	
re00102: regex contains [a-a-x-x]
K2-MatchesFunc-16: regex contains [0-9-.]</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>89240</commentid>
    <comment_count>6</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2013-06-13 10:27:04 +0000</bug_when>
    <thetext>re00102 - agree, I have split it into two versions.

K2-MatchesFunc-16 - the problem was previously raised in bug #4466 and the test already allows alternative results. But the correct error is surely FORX0002 rather than FORG0001. I&apos;ve split it into two versions.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>