<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>29507</bug_id>
          
          <creation_ts>2016-02-26 11:52:05 +0000</creation_ts>
          <short_desc>[xslt30] A problem case for streamed grouping</short_desc>
          <delta_ts>2016-10-06 18:42:16 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>XSLT 3.0</component>
          <version>Candidate Recommendation</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Michael Kay">mike</reporter>
          <assigned_to name="Michael Kay">mike</assigned_to>
          <cc>abel.braaksma</cc>
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>125267</commentid>
    <comment_count>0</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-02-26 11:52:05 +0000</bug_when>
    <thetext>I&apos;m having difficulty with this test case:

  &lt;xsl:template name=&quot;g-008&quot; use-when=&quot;true() or $RUN&quot;&gt;
    &lt;out&gt;
      &lt;xsl:stream href=&quot;../docs/books.xml&quot;&gt;
        &lt;xsl:fork&gt;
          &lt;xsl:for-each-group select=&quot;($extra, /BOOKLIST/BOOKS/ITEM)&quot; group-by=&quot;@CAT&quot;&gt;
            &lt;CAT ID=&quot;{current-grouping-key()}&quot;&gt;
              &lt;xsl:copy-of select=&quot;current-group()/PRICE&quot;/&gt;
            &lt;/CAT&gt;
          &lt;/xsl:for-each-group&gt;
        &lt;/xsl:fork&gt;
      &lt;/xsl:stream&gt;
    &lt;/out&gt;
  &lt;/xsl:template&gt;

It seems to be guaranteed-streamable according to the spec, but in practice streaming it is very difficult.

The problem is that current-group()/PRICE needs sorting into document order, because there is no guarantee that current-group() is already in document order: it is not known whether the nodes in $extra will come before or after the nodes in /BOOKLIST/BOOKS/ITEM in document order.

Now arguably, we know that the subset of nodes in current-group()/PRICE which are streamed nodes will be in document order relative to each other, so one could devise a strategy that takes this into account. But this is pretty hard to achieve.

I&apos;d prefer to make this one non-streamable, but I&apos;m not sure of the best way of changing the rules to make it so.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125268</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-02-26 12:07:02 +0000</bug_when>
    <thetext>I think the same problem might apply to the much simpler expression

($extra, /BOOKLIST/BOOKS/ITEM)/PRICE

Under the General Streamability Rules (GSR), the posture of the left-hand side is striding, and I think we tend to assume that a striding expression delivers its nodes in document order, which is not the case here.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125271</commentid>
    <comment_count>2</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-02-26 14:35:47 +0000</bug_when>
    <thetext>I have written a test case to demonstrate the problem: sx-commaExpr-201, which does

&lt;xsl:copy-of select=&quot;($extraItem, /BOOKLIST/BOOKS/ITEM) / PRICE&quot;/&gt;

Note our definition of striding:

[Definition: Striding: indicates that the result of a construct is a sequence of nodes, in document order, that are peers in the sense that none of them is an ancestor or descendant of any other.]

I think we&apos;re best off sticking with this definition, which means that an expression shouldn&apos;t be classified as striding if the results are not in document order.

It&apos;s probably the comma operator that is the main offender here - but it relies on the GSR, so perhaps the GSR is wrong.

Currently (A, B) is striding if A is striding and B is grounded, or vice versa: under GSR 2(d)(iv), if one operand is grounded and motionless and the other is striding and consuming, then the posture and sweep (P&amp;S) of the comma expression is the P&amp;S of the consuming operand.
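
As a rough illustration of the single case quoted above (a Python sketch, not spec text and not any processor's code; the function name and the pair representation are assumptions made for this example), the rule can be read as:

```python
def comma_posture_and_sweep(a, b):
    """Posture and sweep of (A, B) for the one case quoted from GSR 2(d)(iv).

    Each operand is a hypothetical (posture, sweep) pair of strings.
    Returns None for every combination this sketch does not cover.
    """
    for grounded, other in ((a, b), (b, a)):
        if grounded == ("grounded", "motionless") and other == ("striding", "consuming"):
            # The comma expression takes the posture and sweep of the consuming operand.
            return other
    return None


extra_item = ("grounded", "motionless")   # e.g. $extraItem
streamed = ("striding", "consuming")      # e.g. /BOOKLIST/BOOKS/ITEM
print(comma_posture_and_sweep(extra_item, streamed))
```

The result is the same whichever side the grounded operand is on, which is why the expression is classified as striding even though its items need not be in document order.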

Note that if we change this so that ($extraItem, /BOOKLIST/BOOKS/ITEM) is no longer striding, then writing the query as

&lt;xsl:copy-of select=&quot;($extraItem, /BOOKLIST/BOOKS/ITEM) ! PRICE&quot;/&gt;

also fails, even though document order shouldn&apos;t affect this one. (I&apos;ve made this one into test sx-comma-019).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125285</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-02-28 17:38:00 +0000</bug_when>
    <thetext>After debating this with myself on the XSL mailing list, and producing an initial implementation, I believe we can close this as follows.

(A) At the point where we define striding posture, remove the claim that the result of an expression in striding posture is always in document order. Replace this claim with a note that explains that a striding expression may contain a mixture of streamed nodes and grounded items, and the streamed nodes will always be in document order; as a result, some expressions that would normally require sorting into document order, such as (/book/book | $extrabook)/price, are deemed streamable because the sort can be achieved without buffering streamed nodes in memory.

(B) Under &quot;Streamability of path expressions&quot; add a similar note about mixed posture expressions.
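
To show why the sort in (A) needs no buffering of streamed nodes, here is a minimal Python sketch. It is not how any XSLT processor is implemented. It assumes each node is modelled as a hypothetical (document rank, position, value) triple whose natural ordering is document order, and that streamed nodes arrive already in that order.

```python
import heapq


def in_document_order(grounded, streamed):
    """Lazily merge grounded nodes with a stream that is already in document order."""
    # Grounded nodes are fully in memory, so sorting them is unproblematic.
    # heapq.merge is lazy: it pulls one streamed node at a time and never
    # holds more than one of them.
    return heapq.merge(sorted(grounded), streamed)


def stream_prices():
    # Hypothetical streamed input from the document with rank 1.
    for position, price in enumerate(["8.50", "6.25", "9.99"]):
        yield (1, position, price)


# Hypothetical grounded nodes from two other documents (ranks 2 and 0).
extra = [(2, 0, "3.00"), (0, 0, "1.00")]
result = [value for _, _, value in in_document_order(extra, stream_prices())]
print(result)  # ['1.00', '8.50', '6.25', '9.99', '3.00']
```

The grounded nodes end up before or after the streamed ones according to their document's rank, while the streamed nodes pass through in arrival order.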

Note that under the current rules, similar expressions that involve crawling sub-expressions are not streamable: for example (//book | $extrabook) / PRICE is not guaranteed streamable. This is because the LHS of the &quot;/&quot; operator is not a &quot;scannable expression&quot;. We could easily extend the rules for scannable expressions to cover this case, but I don&apos;t propose to do so.

The test cases available are now sx-comma-040 to -043 and -140 to -143, and sx-union-040 to -043 and -140 to -143.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125430</commentid>
    <comment_count>4</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2016-03-10 11:31:16 +0000</bug_when>
    <thetext>For reference, see the discussion at https://lists.w3.org/Archives/Public/public-xsl-wg/2016Mar/0000.html and the previous messages in that thread.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125855</commentid>
    <comment_count>5</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-04-14 15:52:25 +0000</bug_when>
    <thetext>On 2016-04-07, the WG

RESOLVED: to resolve bug 29507 by accepting the proposal in comment 3.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126288</commentid>
    <comment_count>6</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-04-28 16:45:02 +0000</bug_when>
    <thetext>Abel asked for the bug not to be closed until he has had a chance to review the changes in the updated spec.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>126606</commentid>
    <comment_count>7</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2016-05-27 11:26:10 +0000</bug_when>
    <thetext>The changes have been applied.

Under item (B) of the proposal I put the additional note under the General Streamability Rules, not under &quot;Path Expressions&quot; as suggested, because the rules apply more generally than to path expressions.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>