This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 4634 - [FO] poorly formed back-references
Summary: [FO] poorly formed back-references
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Functions and Operators 1.0
Version: Recommendation
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Depends on:
Blocks: 4610
Reported: 2007-06-12 20:10 UTC by Andrew Eisenberg
Modified: 2007-11-16 09:20 UTC

Description Andrew Eisenberg 2007-06-12 20:10:22 UTC
In section 7.6.1, Regular Expression Syntax, we say:

"Back-references are allowed. ... A back-reference matches the string that was matched by the nth capturing subexpression within the regular expression, that is, the parenthesized subexpression whose opening left parenthesis is the nth unescaped left parenthesis within the regular expression. The closing right parenthesis of this subexpression must occur before the back-reference. ..."

In Bug #4610 we considered the following query:

replace("abcd", "(a)\2(b)", "")

While I interpreted the violation of "must occur" as requiring that an error be raised, Michael Kay interpreted it as causing the back-reference to fail to match a string.

The replace function acknowledges that a pattern can be invalid, saying:

"An error is raised [err:FORX0002] if the value of $pattern is invalid according to the rules described in section 7.6.1 Regular Expression Syntax."

Let me suggest that this be clarified by changing the existing sentence: 

"The closing right parenthesis of this subexpression must occur before the back-reference."

to the following:

"The regular expression is invalid if the closing right parenthesis of this subexpression does not occur before the back-reference."
Comment 1 Michael Kay 2007-06-12 20:38:03 UTC
For consistency this also means that it should be an error to use \3 if no third subexpression exists. So I would suggest changing

"The closing right parenthesis of this subexpression must occur before the back-reference."

to

"The regular expression is invalid if this subexpression does not exist or if its closing right parenthesis occurs after the back-reference."
Comment 2 Michael Kay 2007-06-27 15:03:04 UTC
The WGs accepted the proposal in comment #1
Comment 3 Michael Kay 2007-07-31 20:42:34 UTC
The change has been merged into erratum E4, which affects another sentence in the same paragraph. See bug #4106.
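
For illustration only, the rule accepted from comment #1 can be sketched as a small static check. This is not taken from any XPath processor. The function name find_bad_back_references is hypothetical, and the sketch assumes single-digit back-references (it does not attempt the multi-digit handling touched by erratum E4), that every unescaped "(" outside a character class opens a capturing subexpression, and that the pattern is otherwise well formed.

```python
def find_bad_back_references(pattern):
    """Return (offset, n) for each back-reference to group n that breaks the rule.

    Assumptions of this sketch: back-references are a single digit 1-9,
    every unescaped "(" outside a character class opens a capturing group,
    and the pattern is otherwise well formed.
    """
    opened = 0        # how many capturing groups have been opened so far
    open_groups = []  # numbers of groups opened but not yet closed
    closed = set()    # numbers of groups whose ")" has already been seen
    class_depth = 0   # nesting depth inside [...] character classes
    bad = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if class_depth == 0 and nxt in "123456789":
                n = int(nxt)
                # Invalid if group n does not exist or is not yet closed.
                if n not in closed:
                    bad.append((i, n))
            i += 2  # skip the escaped character
            continue
        if ch == "[":
            class_depth += 1
        elif ch == "]" and class_depth > 0:
            class_depth -= 1
        elif class_depth == 0:
            if ch == "(":
                opened += 1
                open_groups.append(opened)
            elif ch == ")" and open_groups:
                closed.add(open_groups.pop())
        i += 1
    return bad


# The pattern from the description: \2 appears before group 2 is closed.
print(find_bad_back_references(r"(a)\2(b)"))  # [(3, 2)]
# A well-formed back-reference yields no findings.
print(find_bad_back_references(r"(a)\1(b)"))  # []
```

Under the accepted wording, a processor would raise err:FORX0002 for any pattern where a check of this kind reports a finding, which covers both the forward reference in "(a)\2(b)" and a reference such as \3 when no third subexpression exists.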