This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 4610 - K2-ReplaceFunc-3
Summary: K2-ReplaceFunc-3
Status: CLOSED FIXED
Alias: None
Product: XML Query Test Suite
Classification: Unclassified
Component: XML Query Test Suite
Version: unspecified
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Frans Englich
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on: 4634
Blocks:
 
Reported: 2007-06-08 17:42 UTC by Andrew Eisenberg
Modified: 2007-07-17 20:09 UTC (History)

Description Andrew Eisenberg 2007-06-08 17:42:29 UTC
The query for K2-ReplaceFunc-3 is:

replace("abcd", "(a)\2(b)", "")

I believe that this test case should raise error FORX0002. A sentence in 7.6.3 fn:replace says:

"An error is raised [err:FORX0002] if the value of $pattern is invalid according to the rules described in section 7.6.1 Regular Expression Syntax."

The bullet that defines back-reference in 7.6.1 Regular Expression Syntax says:

"A back-reference matches the string that was matched by the nth capturing subexpression within the regular expression, that is, the parenthesized subexpression whose opening left parenthesis is the nth unescaped left parenthesis within the regular expression. The closing right parenthesis of this subexpression must occur before the back-reference."


The closing right parenthesis of the 2nd capturing subexpression occurs after the back-reference.
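
The quoted rule can be checked mechanically. The following is a minimal Python sketch, not part of the test suite or of any implementation discussed here: the function name `invalid_backrefs` is hypothetical, and the scan is deliberately simplified (it assumes every unescaped left parenthesis opens a capturing subexpression, handles only single-digit back-references, and ignores character classes).

```python
def invalid_backrefs(pattern):
    """Return back-reference numbers whose group is not yet closed.

    Simplified illustration only: handles backslash escapes and plain
    parentheses; character classes and multi-digit back-references
    are not handled.
    """
    open_stack = []   # group numbers whose ')' has not been seen yet
    closed = set()    # group numbers whose ')' has already been seen
    next_group = 0    # number of unescaped '(' seen so far
    bad = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in "123456789":
                n = int(nxt)
                # The rule: group n must be closed before \n appears.
                if n not in closed:
                    bad.append(n)
            i += 2  # skip the escaped character
            continue
        if ch == "(":
            next_group += 1
            open_stack.append(next_group)
        elif ch == ")" and open_stack:
            closed.add(open_stack.pop())
        i += 1
    return bad


print(invalid_backrefs(r"(a)\2(b)"))  # [2]
print(invalid_backrefs(r"(a)(b)\2"))  # []
```

Under this reading, the pattern in K2-ReplaceFunc-3 is flagged because the second subexpression is closed only after `\2` appears, so an implementation following the rule would raise FORX0002 rather than attempt a match.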
Comment 1 Andrew Eisenberg 2007-06-08 17:49:07 UTC
This problem also occurs in K2-MatchesFunc-8, K2-MatchesFunc-9, and K2-MatchesFunc-10.
 
Comment 2 Frans Englich 2007-06-12 09:00:18 UTC
(I think the tests are slightly different. K2-ReplaceFunc-3 has a back-reference before the corresponding parenthesis, while the back-references in K2-MatchesFunc-* have no corresponding parenthesis at all.)

I agree that raising FORX0002 makes sense from a usability perspective (although my knowledge of regexps is limited), and that the spec says so: "The closing right parenthesis of this subexpression must occur before the back-reference."

However, I remember discussing this with Mike, and we came to the conclusion that back-references that did not match would match the empty string: "If no string is matched by the nth capturing subexpression, the back-reference is interpreted as matching a zero-length string." I failed to find the mail trail from that discussion. Mike, what are your thoughts on this topic?

I'm currently inclined to change the mentioned tests to FORX0002.
Comment 3 Michael Kay 2007-06-12 09:27:49 UTC
I think I have always read the sentence "The closing right parenthesis of this subexpression must occur before the back-reference" as meaning "For a subexpression to match, the closing right parenthesis of this subexpression must occur before the back-reference." So I would interpret this as a no-match with the resulting value being the zero-length string. (The justification for this interpretation is that static parsing of the regex was discussed at the beginning of the paragraph, whereas this sentence appears in the middle of a discussion of the matching semantics.)

However, I'm open to it being an error in the regex. Writing "(abc)\3" can never do anything useful so one might as well strangle it at birth as return arbitrary results.

Andrew, I suggest you raise a bug against the spec on this.
Comment 4 Andrew Eisenberg 2007-06-12 20:13:19 UTC
I have raised this against the F&O specification as Bug #4634.
Comment 5 Frans Englich 2007-07-13 12:21:51 UTC
The tests Andrew pointed out have been aligned with the spec change. XQTS_current.zip is updated.