[Bug 17160] New: [FO3.0] Captured groups in regular expressions.

https://www.w3.org/Bugs/Public/show_bug.cgi?id=17160

           Summary: [FO3.0] Captured groups in regular expressions.
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Functions and Operators 3.0
        AssignedTo: mike@saxonica.com
        ReportedBy: mike@saxonica.com
         QAContact: public-qt-comments@w3.org


ACTION A-510-05 on Michael Kay to raise a bug against the regular expression
specification following up on test bug 15545; the proposal should be to permit
some implementation-dependent variation in the matching of groups within a
regex (for example by a back-reference) when the regex is technically
ambiguous.

First, some editorial housekeeping. Delete this sentence: "The fn:replace
function described below allows access to the parts of the input string that
matched a sub-expression (called captured substrings)." and replace it with:
"Some operations associated with regular expressions (for example,
back-references, and the fn:replace function) allow access to the parts of the
input string that matched a sub-expression (called captured substrings)." 

Then replace this sentence: "If a sub-expression matches more than one substring
(because it is within a construct that allows repetition), then only the last
substring that it matched will be captured." with a new paragraph:

When parentheses are used in a part of the regular expression that is matched
more than once (because it is within a construct that allows repetition), then
only the last substring that it matched will be captured. Note that this rule
is not sufficient in all cases to ensure an unambiguous result, especially in
cases where (a) the regular expression contains nested repeating constructs,
and/or (b) the repeating construct matches a zero-length string. In such cases
it is implementation-dependent which substring is captured. For example, given
the regular expression (a*)+ and the input string "aaaa", an implementation
might legitimately capture either "aaaa" or a zero-length string as the content
of the captured subgroup.
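
To make the ambiguity concrete, here is a minimal sketch that runs the same
pattern through one particular regex engine. It uses Python's re module purely
as an illustration; it is not an XPath processor and its regex dialect differs
from the F&O one. The helper name captured_group is hypothetical. Which of the
two legitimate values is printed depends on the engine, so no expected output
is stated here.

```python
import re


def captured_group(pattern: str, text: str) -> str:
    """Return the content of group 1 after matching the whole of `text`.

    Hypothetical helper for illustration only; it reports what this
    engine captured and says nothing about other implementations.
    """
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"{pattern!r} does not match {text!r}")
    return match.group(1)


# Unambiguous case: the group matches once per iteration, so the last
# substring matched is the final "a".
print(repr(captured_group(r"(a)+", "aaa")))

# Ambiguous case from the proposal: the outer + may or may not perform a
# final zero-length iteration, so group 1 may be "aaaa" or "".
print(repr(captured_group(r"(a*)+", "aaaa")))
```

A test suite that wants to stay portable would accept either "aaaa" or the
zero-length string for the second call, which is the latitude the proposed
paragraph grants.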


Received on Wednesday, 23 May 2012 13:41:47 UTC