This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 10555 - [XQ31ReqUC] Pattern matching proposal for XQuery
Summary: [XQ31ReqUC] Pattern matching proposal for XQuery
Status: NEW
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Requirements for Future Versions (show other bugs)
Version: Working drafts
Hardware: All All
: P2 normal
Target Milestone: ---
Assignee: Jim Melton
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-06 10:49 UTC by Pavel Minaev
Modified: 2014-05-20 16:48 UTC (History)
3 users (show)

See Also:


Attachments

Description Pavel Minaev 2010-09-06 10:49:40 UTC
I propose to add a facility to XQuery to enable simple pattern matching, along the lines of XSLT template patterns, but in a more localized way (i.e. no loose "dynamic dispatch" as in XSLT). More specifically, I propose to add a new kind of expression, "match", with syntax similar to the existing "switch" statement, but with expressions in cases replaced by XPath patterns. An example:

    match ($animal)
    case (foo) return 1
    case (foo/bar) return 2
    case (*[@baz]//text()) return 3
    default return 4

Syntax for patterns, as well as semantics of what, precisely, it means to match a pattern, can be taken almost verbatim from XSLT 2.1 spec; the only adjustment that is clearly needed is to remove key(). Some other advanced pattern kinds might also be trimmed to keep things simple.

Similarly to "switch", matching is done in the order, so the first case that matches wins - this allows to start with more specific patterns, and generalize towards the end.

As described, this already matches the expressivity provided by XSLT 1.0 - "xsl:apply-templates" can be recast as a self-recursive function consisting of a single "match", with a "case" for every template. The following is an example taken from XSLT 1.0 Recommendation (http://www.w3.org/TR/xslt#section-Document-Example), and translated to a "match":

    declare function local:apply-templates($nodes)
    {
        for $n in $nodes 
        let $apply-templates := function() { local:apply-templates($n/node()) }
        return match ($n)
        case (doc) return
            <html>
                <head>
                    <title>{$n/$title}</title>
               </head>
                <body>
                    {$apply-templates()}
                </body>
            </html>
    
        case (doc/title) return
            <h1>{$apply-templates()}</h1>
    
        case (chapter/title) return
            <h2>{$apply-templates()}</h2>
    
        case (section/title) return
            <h3>{$apply-templates()}</h3>
    
        case (para) return
            <p>{$apply-templates()}</p>
    
        case (note) return
            <p class="note">
                <b>NOTE: </b>
                {$apply-templates()}
            </p>
    
        case (emph) return
            <em>{$apply-templates()}</em>
    
        (: XSLT built-in template rules :)
    
        case (text() | @*) return
            text {$n}
    
        default return
            $apply-templates()
    }

XSLT-like priorities and modes are straightforward to build on top of that using existing facilities, and it is even possible to implement "xsl:import" and "xsl:apply-imports" in terms of higher-order functions.

An interesting step further would be to also add "next-match()" as a function. That would probably necessitate adding another item to the dynamic context (representing the continuation of the current match within a case).
Comment 1 Michael Kay 2010-09-06 13:02:12 UTC
I think it would be better to avoid the duality that exists in XSLT between patterns and expressions. A pattern should be an expression that returns true or false based on evaluating some condition applied to the context item, so that it can be used anywhere a boolean expression can be used. So if for example the syntax [|pattern|] is used, then [|p|] is an expression that returns true if the context node is a p element.
Comment 2 John Snelson 2010-09-06 13:50:15 UTC
I sympathize with Mike's perspective, but there are other nice properties of XSLT patterns that generic boolean predicates don't have. For instance, a common optimization for XSLT implementations is to rule out patterns that don't match a given node type early on, so that less patterns need to be matched. This ability derives from the fact that patterns match nodes using a specific declarative syntax.
Comment 3 Michael Kay 2010-09-06 17:30:11 UTC
You could restrict the expressions in the match{} construct to be pattern-expressions; what I'm saying is that if patterns are added to the language, they should be usable anywhere expressions are allowed (notably in boolean contexts), rather than being a completely different kind of animal as they are in XSLT.
Comment 4 Pavel Minaev 2010-09-06 20:00:52 UTC
I think it's a very neat twist, actually. I can think of quite a few cases from past experience where the ability to do a simple check against a pattern would be handy; in my original proposal, this is possible with a two-branch "match", but overly verbose. In contrast, the [|pattern|] syntax is nicely composable, especially if added as another option to FilterExpr, so that it can be used in paths in lieu of [], like so:

    $nodes[|foo//bar|]/baz

So it wouldn't really be a boolean expression then, but you could combine it with exists(), or just rely on effective boolean value, to get the desired effect:

    if ($node[|foo//bar|]) then ...

And then, as Michael notes, you could still have "match", just require case expressions to be pattern expressions (on grammar level):

    match ($node)
    case [|foo|] return 1
    case [|foo/bar|] return 2
    case [|*[@baz]//text()|] return 3
    default return 4

Aesthetically, it also has a nice look to it that blends well with the use of | in patterns:

    $nodes[|foo|bar|baz|]
Comment 5 Jonathan Robie 2011-11-19 14:31:17 UTC
Moving to requirements for future versions - this is not in XQuery 3.0.
Comment 6 Jonathan Robie 2011-12-21 19:02:28 UTC
For reference, the Carrot proposal:

http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.html
Comment 7 Jonathan Robie 2012-02-12 13:39:42 UTC
The Carrot Proposal, from Evan Lenz, is relevant here ...

http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.html
https://t.co/kHpTqeRV
Comment 8 Jonathan Robie 2012-02-12 14:06:44 UTC
John Snelson's XML Prague paper shows two more approaches, one functional, one based on function annotations.

www.xmlprague.cz/2012/files/xmlprague-2012-proceedings.pdf

The project is here:

https://github.com/jpcs/transform.xq
Comment 9 Jonathan Robie 2014-05-20 16:48:14 UTC
Assigning to future requirements per Working Group decision (https://lists.w3.org/Archives/Member/w3c-xsl-query/2012Oct/0087.html).