[Bug 16809] New: [FO3.0] '$' in regular expressions

https://www.w3.org/Bugs/Public/show_bug.cgi?id=16809

           Summary: [FO3.0] '$' in regular expressions
           Product: XPath / XQuery / XSLT
           Version: Proposed Edited Recommendation
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Functions and Operators 3.0
        AssignedTo: mike@saxonica.com
        ReportedBy: mike@saxonica.com
         QAContact: public-qt-comments@w3.org


In bug #4543 we decided that in multiline mode, a trailing newline should be
disregarded when considering whether we are at the end of the string for the
purpose of matching '$'. However, there was no suggestion that a trailing
newline should be disregarded when not in multiline mode.

Nevertheless, it appears that some popular regex engines do let '$' match before
a trailing newline even outside multiline mode. As evidence, the XSLT 2.0 test
case regex02 expects '$' to match in these circumstances, and the test case has
gone unchallenged for several years.

A new test case has been added to QT3 to help resolve the question.
Test case fn-matches-47 evaluates the following query:

fn:matches(concat('abcd', codepoints-to-string(10), 'defg',
codepoints-to-string(10)), "g$")

and the answer is clearly false given the spec as currently written. It is
reported in email that XmlPrime returns false, while DB2 LUW 9.7 returns true.
Saxon 9.4 also returns true, but the current development build of Saxon, with a
new regex engine, returns false: that's how the problem came to my attention.
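
For comparison, here is a minimal sketch that runs the same input through
Python's re module. Python is not one of the engines named in this report; it
is used only as an example of an engine whose documented behaviour is that '$'
also matches just before a newline at the end of the string, even without
multiline mode. Python's '\Z' anchor matches only at the absolute end of the
string, which is assumed here to correspond to the reading of '$' in the spec
as currently written.

```python
import re

# Same input as fn-matches-47: 'abcd', newline, 'defg', newline.
text = "abcd" + chr(10) + "defg" + chr(10)

# Python's '$' (no MULTILINE flag) also matches just before a final newline.
lenient = re.search(r"g$", text) is not None

# '\Z' matches only at the absolute end of the string, so the trailing
# newline prevents a match.
strict = re.search(r"g\Z", text) is not None

print(lenient, strict)  # True False
```

The two results mirror the split reported above: engines that disregard the
trailing newline return true, while a strict reading returns false.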

We need to decide whether to confirm the current specification or to be
"bug-compatible" with existing implementations. If we do decide to be
"bug-compatible" we need to decide exactly what characters at the end of the
string should be ignored.
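
To illustrate why the exact set of ignored characters matters, the sketch below
probes one engine, Python's re module, with several string endings. Python is
an example chosen for illustration and is not named in this report; the helper
name dollar_matches is hypothetical.

```python
import re

def dollar_matches(ending: str) -> bool:
    """Return True if 'g$' matches 'defg' followed by the given ending."""
    return re.search(r"g$", "defg" + ending) is not None

# Try '$' (no MULTILINE flag) against different trailing characters.
results = {
    "none": dollar_matches(""),
    "LF": dollar_matches("\n"),
    "LF LF": dollar_matches("\n\n"),
    "CR LF": dollar_matches("\r\n"),
    "CR": dollar_matches("\r"),
}

for name, matched in results.items():
    print(f"{name}: {matched}")
```

In Python only a single final line feed is disregarded; two line feeds, a
carriage return, or a CR LF pair all prevent the match. Other engines may draw
the line differently, which is why a "bug-compatible" rule would have to spell
out the set of characters explicitly.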


Received on Friday, 20 April 2012 17:01:47 UTC