<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>29153</bug_id>
          
          <creation_ts>2015-09-28 13:35:13 +0000</creation_ts>
          <short_desc>[XSLT30] have we created a loop-hole with windowed streaming and copy-of or snapshot?</short_desc>
          <delta_ts>2015-10-30 15:38:05 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>XPath / XQuery / XSLT</product>
          <component>XSLT 3.0</component>
          <version>Last Call drafts</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows NT</op_sys>
          <bug_status>CLOSED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Abel Braaksma">abel.braaksma</reporter>
          <assigned_to name="Michael Kay">mike</assigned_to>
          
          
          <qa_contact name="Mailing list for public feedback on specs from XSL and XML Query WGs">public-qt-comments</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>123338</commentid>
    <comment_count>0</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2015-09-28 13:35:13 +0000</bug_when>
    <thetext>This follows from the discussion in bug 29120, comment 4, where Michael wrote:

&gt; &lt;xsl:for-each select=&quot;copy-of(/*/transaction)&quot;&gt;
&gt;   &lt;xsl:value-of select=&quot;last()&quot;/&gt;
&gt; &lt;/xsl:for-each&gt;

&gt; Here last() is allowed because the context posture is grounded. But we have 
&gt; the same problem, that in effect we have to materialize the entire result of 
&gt; copy-of(), rather than pipelining it one item at a time. 

This does not pose a problem, because the whole argument will be consumed by copy-of(), but what if you rewrite this as the following, where the &quot;consumption&quot; should happen one element at a time?

&lt;xsl:for-each select=&quot;/*/transaction/copy-of(.)&quot;&gt;
  &lt;xsl:value-of select=&quot;last()&quot;/&gt;
&lt;/xsl:for-each&gt;

This is the &quot;suggested&quot; way of writing windowed streaming. Since we take a copy each time, the xsl:for-each gets a grounded posture. But the problem is not just with copy-of:

&lt;xsl:for-each select=&quot;for $x in */* return string(*)&quot;&gt;
  &lt;xsl:value-of select=&quot;last()&quot;/&gt;
&lt;/xsl:for-each&gt;

Which, imo, also suggests that the user intends to stream it, but the only way to know the value of fn:last() is by evaluating the whole tree.

I checked the general streamability rules, and while we have exception rules for higher-order operands like xsl:for-each, they have no effect when the select expression selects a grounded result.

I think the only problem is with fn:last(), so I suggest we add a rule along these lines to section 19.8.9.14, Streamability of the last function (probably with a Note explaining why):

    x) if the fn:last function appears at any depth inside a higher-order 
       operand, then roaming and free-ranging.

This has a few nasty side-effects, so perhaps we can do better. I.e., consider:

a) a/b[last()] (should be prohibited)
b) (1 to 10)[last()] (should be allowed)
c) xsl:for-each/@select=&quot;copy-of(x)&quot;, with last() in the sequence constructor (should be allowed?)
d) same with xsl:iterate
e) same with xsl:for-each-group
f) (for $e in a/b return copy-of($e))[last()]  (maybe allowed?)
g) a/b/copy-of(x)[last()] (should be prohibited, though can be made streamable)

Perhaps there are other scenarios to consider? In the event that we prohibit too much, someone can always create a copy of a node-set as a workaround and use count($x).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123340</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2015-09-28 13:53:57 +0000</bug_when>
    <thetext>Perhaps the rule should be that last() is disallowed if its focus-setting container is consuming (i.e., even if the posture is grounded).

And perhaps the above rule should have an exception that a value comparison or general comparison is allowed if one operand is position() and the other is last(), e.g. 

if (position() = last()) ...

because with some limited lookahead that can be streamed easily enough.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123342</commentid>
    <comment_count>2</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2015-09-28 15:39:56 +0000</bug_when>
    <thetext>&gt; if (position() = last()) ...
&gt;
&gt; because with some limited lookahead that can be streamed easily enough.

In general, I agree, but I find it a tricky rule to get right. What about if(position() + 10 = last()) or if(@x = last())?

If the suggestion is to *only* allow the exact expression &quot;position() = last()&quot;, then still, that can be written in a variety of similar ways.

If we want to allow that, it may be better to:

a) introduce something like fn:is-last, in line with fn:has-children
b) allow xsl:on-completion inside xsl:for-each (in tail position)

I would opt for (a), as it will be simpler to ban fn:last completely (you can then use &quot;if(is-last()) then position() else ()&quot; as an alternative), or at least inside any higher-order context.

For a moment I thought this would also allow the now forbidden a/b[last()] as a/b[is-last()], but that won&apos;t work when combining &quot;a/b[is-last()] | a/c[is-last()]&quot;. So the rule could be &quot;is-last() is motionless and grounded in grounded and climbing postures, and roaming and free-ranging otherwise&quot;.

(I know we aren&apos;t adding any new features, but this is not a proposal for one; the suggestion is to fix the bug.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123343</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2015-09-28 15:55:11 +0000</bug_when>
    <thetext>I think the rule &quot;a value comparison or general comparison is allowed if one operand is position() and the other is last()&quot; is easy enough to state, and easy enough to test for, and flexible enough to handle all the common ways of asking &quot;is this the last item?&quot;.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123361</commentid>
    <comment_count>4</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2015-09-30 01:47:06 +0000</bug_when>
    <thetext>Hmm... This is harder than I thought: what about applying templates with windowed streaming? I have raised this separately here: bug 29161.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123362</commentid>
    <comment_count>5</comment_count>
    <who name="Abel Braaksma">abel.braaksma</who>
    <bug_when>2015-09-30 02:11:47 +0000</bug_when>
    <thetext>&gt; &quot;a value comparison or general comparison is allowed if one operand is 
&gt; position() and the other is last()&quot; is easy enough to state,

Yes, I think you are right. Unfortunately, while pondering over it again, I think it cannot be done (I hope to be wrong though) because of accumulators. Consider:

&lt;xsl:accumulator name=&quot;count&quot; initial-value=&quot;0&quot;&gt;
    &lt;xsl:accumulator-rule match=&quot;*&quot; select=&quot;$value + 1&quot; /&gt;
&lt;/xsl:accumulator&gt;

&lt;xsl:template match=&quot;/&quot;&gt;
    &lt;xsl:for-each select=&quot;*/special/copy-of()&quot;&gt;
        &lt;xsl:if test=&quot;position() = last()&quot;&gt;
            &lt;last&gt;{accumulator-after(&apos;count&apos;)}&lt;/last&gt;
        &lt;/xsl:if&gt;
        &lt;xsl:if test=&quot;position() != last()&quot;&gt;
            &lt;elem&gt;{accumulator-after(&apos;count&apos;)}&lt;/elem&gt;
        &lt;/xsl:if&gt;
    &lt;/xsl:for-each&gt;
&lt;/xsl:template&gt;

If the input stream is something like:

&lt;root&gt;
  &lt;special /&gt;
  &lt;foo /&gt;
  &lt;special /&gt;
  &lt;foo /&gt;
  &lt;foo /&gt;
  ... 1000&apos;s more w/o special ...
  &lt;foo /&gt;
  &lt;special /&gt;
&lt;/root&gt;

Then the output should be something like:

&lt;elem&gt;2&lt;/elem&gt;
&lt;elem&gt;4&lt;/elem&gt;
&lt;last&gt;2038543&lt;/last&gt;

However, upon visiting each &lt;special&gt;, the input stream must proceed to the next &lt;special&gt; to peek whether or not the element is the last in the selection. By doing so, the accumulator function is called, leading to a different outcome, something like:

&lt;elem&gt;4&lt;/elem&gt;
&lt;elem&gt;2038543&lt;/elem&gt;
&lt;last&gt;2038543&lt;/last&gt;

Without streaming this is not a problem, but with streaming we lose the accumulator value after visiting the node; to prevent that, we would have to keep track of each and every accumulated value along the way (iirc).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123366</commentid>
    <comment_count>6</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2015-09-30 09:21:21 +0000</bug_when>
    <thetext>I don&apos;t see a problem with lookahead and accumulators.

Process A does the lookahead, that is:
repeat {
  read node N
  write node N-1
}

Process B reads the output of Process A

Process A knows whether the node it has just passed to Process B is the last in the sequence
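
The two-process arrangement above can be sketched in Python, purely as an illustration. The names with_is_last and process_b, and the running count standing in for an accumulator, are assumptions made for this sketch and are not taken from the spec or from any processor:

```python
def with_is_last(items):
    """Process A: hold back one item so each item is emitted with an is-last flag."""
    it = iter(items)
    try:
        held = next(it)          # read the first node
    except StopIteration:
        return                   # empty input: nothing to emit
    for nxt in it:               # read node N
        yield held, False        # write node N-1, which is known not to be the last
        held = nxt
    yield held, True             # input exhausted: the held node is the last


def process_b(flagged):
    """Process B: consume Process A's output; a running count stands in for an accumulator."""
    count = 0
    out = []
    for node, is_last in flagged:
        count += 1               # hypothetical accumulator, updated only when B sees the node
        out.append(("last" if is_last else "elem", node, count))
    return out


result = process_b(with_is_last(["s1", "s2", "s3"]))
```

In this sketch the stand-in accumulator is advanced only by Process B, so reading one item ahead in Process A does not change the value observed for the item currently being processed.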

Process B evaluates the accumulators.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123735</commentid>
    <comment_count>7</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2015-10-16 10:54:59 +0000</bug_when>
    <thetext>The issues raised here (and in related bugs) have been discussed by the WG in email and at telcons. I proposed that we add text explaining the general principles as drafted here:

https://lists.w3.org/Archives/Public/public-xsl-wg/2015Oct/0013.html

and the WG accepted this approach, with an action to take into account Abel&apos;s comments at

https://lists.w3.org/Archives/Public/public-xsl-wg/2015Oct/0016.html 

I have added this text as a new section 19.8.

I believe the bug can now be closed, but I leave it open for the moment to allow WG review.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>123737</commentid>
    <comment_count>8</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2015-10-16 11:03:36 +0000</bug_when>
    <thetext>*** Bug 29161 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>124006</commentid>
    <comment_count>9</comment_count>
    <who name="Michael Kay">mike</who>
    <bug_when>2015-10-30 11:13:39 +0000</bug_when>
    <thetext>The WG reviewed this bug and determined that the actions already taken were adequate to mark it as resolved.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>