<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>10555</bug_id>
          
          <creation_ts>2010-09-06 10:49:40 +0000</creation_ts>
          <short_desc>[XQ31ReqUC] Pattern matching proposal for XQuery</short_desc>
          <delta_ts>2014-05-20 16:48:27 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>Requirements for Future Versions</component>
          <version>Working drafts</version>
          <rep_platform>All</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Pavel Minaev">int19h</reporter>
          <assigned_to name="Jim Melton">jim.melton</assigned_to>
          <cc>jim.melton</cc>
          <cc>jonathan.robie</cc>
          <cc>mike</cc>
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>38480</commentid>
    <comment_count>0</comment_count>
    <who name="Pavel Minaev">int19h</who>
    <bug_when>2010-09-06 10:49:40 +0000</bug_when>
    <thetext>I propose adding a facility to XQuery for simple pattern matching, along the lines of XSLT template patterns, but in a more localized way (i.e. no loose &quot;dynamic dispatch&quot; as in XSLT). More specifically, I propose a new kind of expression, &quot;match&quot;, with syntax similar to the existing &quot;switch&quot; expression, but with the expressions in cases replaced by XPath patterns. An example:

    match ($animal)
    case (foo) return 1
    case (foo/bar) return 2
    case (*[@baz]//text()) return 3
    default return 4

The syntax for patterns, as well as the semantics of what, precisely, it means to match a pattern, can be taken almost verbatim from the XSLT 2.1 spec; the only adjustment that is clearly needed is to remove key(). Some other advanced pattern kinds might also be trimmed to keep things simple.

Similarly to &quot;switch&quot;, matching is done in order, so the first case that matches wins. This makes it possible to start with more specific patterns and generalize towards the end.
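
For comparison, a minimal sketch of how the four-case example above can be approximated today without new syntax, assuming each pattern is rewritten by hand as a test on the node and its ancestors. The name local:classify and the small test tree are hypothetical, chosen only for illustration:

```xquery
(: Hypothetical helper: hand-written approximation of the four-case example. :)
declare function local:classify($n as node()) as xs:integer
{
    (: Branches are tried in order, so the first test that succeeds wins. :)
    if ($n/self::foo) then 1                            (: pattern foo :)
    else if ($n/self::bar/parent::foo) then 2           (: pattern foo/bar :)
    else if ($n/self::text()/ancestor::*[@baz]) then 3  (: pattern *[@baz]//text() :)
    else 4                                              (: default :)
};

(: A bar element whose parent is foo falls through to the second branch. :)
let $doc := element foo { element bar {} }
return local:classify($doc/bar)
```

Because the branches of the conditional are evaluated in order, the first test that succeeds determines the result, which mirrors the ordering rule described above. The cost is that every pattern has to be turned inside out by hand into steps on the self, parent and ancestor axes.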

As described, the &quot;match&quot; expression already provides the expressivity of XSLT 1.0: &quot;xsl:apply-templates&quot; can be recast as a self-recursive function consisting of a single &quot;match&quot;, with a &quot;case&quot; for every template. The following is an example taken from the XSLT 1.0 Recommendation (http://www.w3.org/TR/xslt#section-Document-Example), and translated to a &quot;match&quot;:

    declare function local:apply-templates($nodes)
    {
        for $n in $nodes 
        let $apply-templates := function() { local:apply-templates($n/node()) }
        return match ($n)
        case (doc) return
            &lt;html&gt;
                &lt;head&gt;
                    &lt;title&gt;{string($n/title)}&lt;/title&gt;
                &lt;/head&gt;
                &lt;body&gt;
                    {$apply-templates()}
                &lt;/body&gt;
            &lt;/html&gt;
    
        case (doc/title) return
            &lt;h1&gt;{$apply-templates()}&lt;/h1&gt;
    
        case (chapter/title) return
            &lt;h2&gt;{$apply-templates()}&lt;/h2&gt;
    
        case (section/title) return
            &lt;h3&gt;{$apply-templates()}&lt;/h3&gt;
    
        case (para) return
            &lt;p&gt;{$apply-templates()}&lt;/p&gt;
    
        case (note) return
            &lt;p class=&quot;note&quot;&gt;
                &lt;b&gt;NOTE: &lt;/b&gt;
                {$apply-templates()}
            &lt;/p&gt;
    
        case (emph) return
            &lt;em&gt;{$apply-templates()}&lt;/em&gt;
    
        (: XSLT built-in template rules :)
    
        case (text() | @*) return
            text {$n}
    
        default return
            $apply-templates()
    }

XSLT-like priorities and modes are straightforward to build on top of that using existing facilities, and it is even possible to implement &quot;xsl:import&quot; and &quot;xsl:apply-imports&quot; in terms of higher-order functions.
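
As one possible illustration of the priority and mode part, the following sketch keeps the rules in a table instead of a single &quot;match&quot; expression. It assumes XQuery 3.1 maps, which are newer than this proposal, and uses hand-written boolean tests in place of patterns; the variable $rules, the function local:apply, and the mode name toc are all hypothetical:

```xquery
xquery version &quot;3.1&quot;;

(: Hypothetical rule table. Each rule has a mode, a priority, a boolean
   test standing in for a pattern, and an action to run on a match. :)
declare variable $rules as map(*)* := (
    map {
        &quot;mode&quot;: &quot;toc&quot;, &quot;priority&quot;: 1,
        &quot;test&quot;: function($n) { exists($n/self::title) },
        &quot;action&quot;: function($n) { &quot;any title&quot; }
    },
    map {
        &quot;mode&quot;: &quot;toc&quot;, &quot;priority&quot;: 2,
        &quot;test&quot;: function($n) { exists($n/self::title/parent::chapter) },
        &quot;action&quot;: function($n) { &quot;chapter title&quot; }
    }
);

(: Picks the highest-priority rule of the given mode whose test succeeds;
   rules with equal priority keep their order in the table. :)
declare function local:apply($mode as xs:string, $n as node()) as item()*
{
    let $best :=
        (for $r in $rules
         where $r?mode eq $mode and $r?test($n)
         stable order by $r?priority descending
         return $r)[1]
    return if (exists($best)) then $best?action($n) else ()
};

let $doc := element chapter { element title {} }
return local:apply(&quot;toc&quot;, $doc/title)
```

Both rules match the title element here, and the second one is selected because of its higher priority, even though it is listed last. With the proposed &quot;match&quot; expression, the same effect would presumably be obtained by ordering the cases and passing the mode as an ordinary function parameter.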

An interesting step further would be to also add &quot;next-match()&quot; as a function. That would probably necessitate adding another item to the dynamic context (representing the continuation of the current match within a case).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>38481</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2010-09-06 13:02:12 +0000</bug_when>
    <thetext>I think it would be better to avoid the duality that exists in XSLT between patterns and expressions. A pattern should be an expression that returns true or false based on evaluating some condition applied to the context item, so that it can be used anywhere a boolean expression can be used. So if for example the syntax [|pattern|] is used, then [|p|] is an expression that returns true if the context node is a p element.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>38482</commentid>
    <comment_count>2</comment_count>
    <who name="John Snelson">john.snelson</who>
    <bug_when>2010-09-06 13:50:15 +0000</bug_when>
    <thetext>I sympathize with Mike&apos;s perspective, but there are other nice properties of XSLT patterns that generic boolean predicates don&apos;t have. For instance, a common optimization for XSLT implementations is to rule out patterns that don&apos;t match a given node type early on, so that fewer patterns need to be matched. This ability derives from the fact that patterns match nodes using a specific declarative syntax.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>38491</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2010-09-06 17:30:11 +0000</bug_when>
    <thetext>You could restrict the expressions in the match{} construct to be pattern-expressions; what I&apos;m saying is that if patterns are added to the language, they should be usable anywhere expressions are allowed (notably in boolean contexts), rather than being a completely different kind of animal as they are in XSLT.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>38492</commentid>
    <comment_count>4</comment_count>
    <who name="Pavel Minaev">int19h</who>
    <bug_when>2010-09-06 20:00:52 +0000</bug_when>
    <thetext>I think it&apos;s a very neat twist, actually. I can think of quite a few cases from past experience where the ability to do a simple check against a pattern would be handy; in my original proposal, this is possible with a two-branch &quot;match&quot;, but overly verbose. In contrast, the [|pattern|] syntax is nicely composable, especially if added as another option to FilterExpr, so that it can be used in paths in lieu of [], like so:

    $nodes[|foo//bar|]/baz

So it wouldn&apos;t really be a boolean expression then, but you could combine it with exists(), or just rely on effective boolean value, to get the desired effect:

    if ($node[|foo//bar|]) then ...
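
For comparison, a sketch of how the two uses above can be approximated in XQuery as it stands, assuming the pattern foo//bar is rewritten by hand as a boolean test. The function name local:is-foo-bar and the sample tree are hypothetical:

```xquery
(: Hypothetical helper: a boolean test equivalent to the pattern foo//bar,
   i.e. a bar element that has a foo element among its ancestors. :)
declare function local:is-foo-bar($n as node()) as xs:boolean
{
    exists($n/self::bar[ancestor::foo])
};

let $doc := element foo { element x { element bar { element baz {} } } }
let $nodes := $doc//*
return (
    (: as a filter, in place of $nodes[|foo//bar|]/baz :)
    $nodes[local:is-foo-bar(.)]/baz,

    (: as a condition, in place of if ($node[|foo//bar|]) then ... :)
    for $node in $nodes
    return if (local:is-foo-bar($node)) then name($node) else ()
)
```

On the sample tree this yields the baz element followed by the string bar. It works, but the pattern has to be inverted into reverse-axis steps by hand, which is the verbosity that the [|pattern|] syntax would remove.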

And then, as Michael notes, you could still have &quot;match&quot;, and just require case expressions to be pattern expressions (at the grammar level):

    match ($node)
    case [|foo|] return 1
    case [|foo/bar|] return 2
    case [|*[@baz]//text()|] return 3
    default return 4

Aesthetically, it also has a nice look to it that blends well with the use of | in patterns:

    $nodes[|foo|bar|baz|]</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>60173</commentid>
    <comment_count>5</comment_count>
    <who name="Jonathan Robie">jonathan.robie</who>
    <bug_when>2011-11-19 14:31:17 +0000</bug_when>
    <thetext>Moving to requirements for future versions - this is not in XQuery 3.0.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>61884</commentid>
    <comment_count>6</comment_count>
    <who name="Jonathan Robie">jonathan.robie</who>
    <bug_when>2011-12-21 19:02:28 +0000</bug_when>
    <thetext>For reference, the Carrot proposal:

http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.html</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>63999</commentid>
    <comment_count>7</comment_count>
    <who name="Jonathan Robie">jonathan.robie</who>
    <bug_when>2012-02-12 13:39:42 +0000</bug_when>
    <thetext>The Carrot Proposal, from Evan Lenz, is relevant here ...

http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.html
https://t.co/kHpTqeRV</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>64000</commentid>
    <comment_count>8</comment_count>
    <who name="Jonathan Robie">jonathan.robie</who>
    <bug_when>2012-02-12 14:06:44 +0000</bug_when>
    <thetext>John Snelson&apos;s XML Prague paper shows two more approaches, one functional, one based on function annotations.

www.xmlprague.cz/2012/files/xmlprague-2012-proceedings.pdf

The project is here:

https://github.com/jpcs/transform.xq</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>106466</commentid>
    <comment_count>9</comment_count>
    <who name="Jonathan Robie">jonathan.robie</who>
    <bug_when>2014-05-20 16:48:14 +0000</bug_when>
    <thetext>Assigning to future requirements per Working Group decision (https://lists.w3.org/Archives/Member/w3c-xsl-query/2012Oct/0087.html).</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>