This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 29407 - [QT3] analyze-string-008, analyzeString-017a, nested grouping
Status: RESOLVED WORKSFORME
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQuery 3 & XPath 3 Test Suite
Version: Recommendation
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: O'Neil Delpratt
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2016-01-30 04:34 UTC by Abel Braaksma
Modified: 2016-01-31 16:25 UTC

Description Abel Braaksma 2016-01-30 04:34:14 UTC
I have trouble understanding why the expected test result is correct here. The test is: analyze-string("banana", "(a(n?))")

This can be interpreted as:
b  -- non-match
an -- match, group 1: an, group 2: n
an -- match, group 1: an, group 2: n
a  -- match, group 1: a, group 2: <empty> or absent

A valid outcome, I think, is the following:

<fn:analyze-string-result xmlns:fn="http://www.w3.org/2005/xpath-functions">
    <fn:non-match>b</fn:non-match>
    <fn:match>
        <fn:group nr="1">an<fn:group nr="2">n</fn:group></fn:group>
    </fn:match>
    <fn:match>
        <fn:group nr="1">an<fn:group nr="2">n</fn:group></fn:group>
    </fn:match>
    <fn:match>
        <fn:group nr="1">a<fn:group nr="2" /></fn:group>
    </fn:match>
</fn:analyze-string-result>

This is not the same as the current expected outcome, which assumes that "n" is not part of group 1.

I'm unsure whether the spec mandates that an empty group *must* be specified, or *may* be absent. Though I see no harm in including it, I can't readily find this in the spec.

Comment 1 Abel Braaksma 2016-01-30 04:40:56 UTC
The same is true for analyzeString-017a: analyze-string("banana", "(b(x?))")

Currently expected is:

    <fn:match>
        <fn:group nr="1">b</fn:group>
        <fn:group nr="2" />
    </fn:match>
    <fn:non-match>anana</fn:non-match>

But I believe the correct result ought to be:

    <fn:match>
        <fn:group nr="1">b<fn:group nr="2" /></fn:group>
    </fn:match>
    <fn:non-match>anana</fn:non-match>

That is, the groups are nested in the regex, so they should be nested in the result. Otherwise there would be no distinction from a regex such as "(b)(x?)" (which I believe would produce the first result above).
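
To illustrate why the two regexes are hard to tell apart, here is a minimal sketch using Python's `re` module as a stand-in. It is not an XPath engine, and the helper name `group_spans` is made up for this example. It only prints the offsets at which each group matched in "banana".

```python
import re

def group_spans(pattern, text):
    """Return {group number: (start, end)} for the first match of pattern in text."""
    m = re.search(pattern, text)
    return {n: m.span(n) for n in range(1, m.re.groups + 1)}

# Nested grouping versus two sibling groups, applied to the same input.
nested = group_spans(r"(b(x?))", "banana")
siblings = group_spans(r"(b)(x?)", "banana")

print(nested)    # {1: (0, 1), 2: (1, 1)}
print(siblings)  # {1: (0, 1), 2: (1, 1)}
```

Both patterns report the same offsets, so any difference between the two results would have to come from the structure of the regex itself, not from where the groups matched.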

Comment 2 Michael Kay 2016-01-31 10:13:41 UTC
The string value of the XML that results from analyze-string is always the original input string; the function essentially takes the original string and adds markup to show the way in which the regex was (or wasn't) matched. I think the spec makes this clear. Your proposed result doesn't have this property.

You say that "the current result assumes that "n" is not part of group 1". That's not the case. The current result is

<fn:match><fn:group nr="1">a<fn:group nr="2">n</fn:group></fn:group></fn:match>

which shows that the "n" is part of both groups 1 and 2 - the groups are nested.
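
The string-value property can be checked with a rough emulation in Python. This is a sketch only: it uses Python's `re` rather than an XPath regex engine, the function names are invented for the example, and it leaves out groups that are unmatched or empty, which sidesteps the question of how an empty group should be reported. Patterns that can match the empty string are not handled.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

FN = "http://www.w3.org/2005/xpath-functions"

def mark_up_match(text, m):
    """Wrap one match in nested fn:group elements, based on group offsets."""
    # Simplification of this sketch: unmatched or empty groups are omitted.
    groups = sorted(
        (m.start(n), n, m.end(n))
        for n in range(1, m.re.groups + 1)
        if m.start(n) < m.end(n)
    )
    parts, open_ends, nxt = [], [], 0
    for pos in range(m.start(), m.end() + 1):
        # Close the innermost groups that end at this offset.
        while open_ends and open_ends[-1] == pos:
            open_ends.pop()
            parts.append("</fn:group>")
        # Open the groups that start here, outermost (lowest number) first.
        while nxt < len(groups) and groups[nxt][0] == pos:
            _, nr, end = groups[nxt]
            parts.append(f'<fn:group nr="{nr}">')
            open_ends.append(end)
            nxt += 1
        if pos < m.end():
            parts.append(escape(text[pos]))
    return "".join(parts)

def analyze_string(text, pattern):
    """Mark up text as fn:match / fn:non-match elements (emulation only)."""
    parts, last = [f'<fn:analyze-string-result xmlns:fn="{FN}">'], 0
    for m in re.finditer(pattern, text):
        if m.start() > last:
            parts.append(f"<fn:non-match>{escape(text[last:m.start()])}</fn:non-match>")
        parts.append(f"<fn:match>{mark_up_match(text, m)}</fn:match>")
        last = m.end()
    if last < len(text):
        parts.append(f"<fn:non-match>{escape(text[last:])}</fn:non-match>")
    parts.append("</fn:analyze-string-result>")
    return "".join(parts)

def string_value(xml):
    """Concatenate all text nodes, like the XPath string value of the root."""
    return "".join(ET.fromstring(xml).itertext())

result = analyze_string("banana", "(a(n?))")
print(result)
print(string_value(result) == "banana")  # True

# One match from the result proposed in the description, for comparison.
proposed = (f'<fn:match xmlns:fn="{FN}"><fn:group nr="1">an'
            '<fn:group nr="2">n</fn:group></fn:group></fn:match>')
print(string_value(proposed))  # ann
```

The emulated result nests group 2 inside group 1 as `a<fn:group nr="2">n</fn:group>`, and its string value is the original input. The proposed form has the string value "ann" for a match that covers only "an", because it repeats the "n".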

Comment 3 Abel Braaksma 2016-01-31 16:25:07 UTC
Oh my, looking again at these tests, I think I have either been looking at the wrong version locally or made some copy/paste errors.

With a fresh look and your comment, I wholeheartedly agree. It also makes sense that the nesting takes care of the matched strings belonging to more than one group.

I will close with no action.