This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 29161 - [XSLT30] setting the context size with xsl:apply-templates during streaming
Summary: [XSLT30] setting the context size with xsl:apply-templates during streaming
Status: CLOSED DUPLICATE of bug 29153
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XSLT 3.0
Version: Last Call drafts
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-09-30 01:45 UTC by Abel Braaksma
Modified: 2015-10-30 15:38 UTC
CC List: 0 users

Description Abel Braaksma 2015-09-30 01:45:15 UTC
We write in 6.3 Applying Template Rules:

<quote>
The rule that matches the Nth item in the sorted sequence is evaluated with that item as the context item, with N as the context position, and with the length of the sorted sequence as the context size.
</quote>

With streaming, we won't be able to set the context size without look-ahead.

This does not matter with patterns (they cannot access the context size of the match selection), but it matters to template bodies. There are essentially two issues here:

1) Using the context size with windowed streaming (bug 29153)
2) Setting the context size during normal streaming.

Example of (1):

<xsl:template match="/">
    <xsl:apply-templates select="*/copy-of(foo)" mode="no-stream" />
</xsl:template>

<xsl:template match="foo" mode="no-stream">
    <xsl:value-of select="last()" />
</xsl:template>

Example of (2):

<xsl:template match="/">
    <xsl:apply-templates select="*" />
</xsl:template>

In the second example, we cannot evaluate fn:last() without penalty, yet the prose in the spec still says that the context size is set, which is not possible without look-ahead. In effect, the context size is undefined when templates are applied during streaming.

I am not sure how to fix this for situation (1). I don't see a simple way to statically determine the use of fn:last() in a non-streaming mode that is accessed by windowed streaming.

One solution might be to define that the context size is not available until the last item is processed (to allow for the use case from bug 29153). Another solution is to always raise a dynamic error, or a static error where possible (which follows from the fact that the size of the input stream is not known).
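
For concreteness, example (1) can be written out as a complete stylesheet. This is only a sketch: the two xsl:mode declarations, the version attribute and the choice of the unnamed mode as the streamable one are assumptions that the snippet above leaves implicit, and whether a particular processor accepts the stylesheet as streamable is not claimed here.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the unnamed mode is the one declared streamable. -->
  <xsl:mode streamable="yes"/>

  <!-- Assumption: "no-stream" is an ordinary, non-streamable mode. -->
  <xsl:mode name="no-stream" streamable="no"/>

  <!-- Streamed template: copy-of(foo) grounds each foo child of the
       outermost element before it is handed to the other mode. -->
  <xsl:template match="/">
    <xsl:apply-templates select="*/copy-of(foo)" mode="no-stream"/>
  </xsl:template>

  <!-- Non-streamed template: last() is the context size, i.e. the number
       of items selected by the xsl:apply-templates above. -->
  <xsl:template match="foo" mode="no-stream">
    <xsl:value-of select="last()"/>
  </xsl:template>

</xsl:stylesheet>
```

Every invocation of the foo template sees the same value of last(), namely the number of copied foo elements, so that value cannot be known until the last foo has been read from the stream.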
Comment 1 Michael Kay 2015-09-30 09:14:40 UTC
For case (2) where the template is declared streamable, it can't use last() so I don't think there's a problem. It doesn't matter that we specify what the context size is, given that there's no way of writing code that depends on its value.

Case (1) - and this also applies to 29153 - is indeed a problem. In a sense we could regard it as a "pipelining" problem rather than a "streaming" problem, in that any use of last() is going to break pipelined evaluation of constructs like filter expressions and thereby increase the use of memory. I think this is probably the best way to tackle it. 

Essentially, if an expression is grounded and consuming, then the processor may or may not need to store the entire result of the expression in memory, and using last() increases the likelihood that it will need to be stored in memory. The same applies to other non-pipelinable operations such as reverse(). Users need to be aware of this but we do not explicitly prohibit these constructs.

I'm less comfortable about doing the same with xsl:merge but at this stage of the game perhaps it's the simplest answer: we should simply issue a health warning against use of last(), but not reject it statically as non-streamable.
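
The pipelining point can be illustrated with a filter expression over a grounded, consuming operand. The stylesheet below is a sketch under assumptions: the element name foo and the select expression are borrowed from the description, and both predicates are chosen here purely for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the unnamed mode is declared streamable. -->
  <xsl:mode streamable="yes"/>

  <xsl:template match="/">
    <!-- Depends only on position(): each copied foo can be passed on as
         soon as it arrives, and nothing after the third item matters. -->
    <xsl:sequence select="(*/copy-of(foo))[position() le 3]"/>

    <!-- Illustrative alternative that depends on last():
           (*/copy-of(foo))[position() gt last() - 3]
         No item can be accepted or rejected until the total number of
         items is known, so copies have to be held back in memory. -->
  </xsl:template>

</xsl:stylesheet>
```

With the position() predicate a processor can deliver items as they arrive; with the last() variant it has to hold items back until the input is exhausted, although how much it buffers is left to the implementation.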
Comment 2 Abel Braaksma 2015-09-30 10:16:27 UTC
> The same applies to other non-pipelinable operations such as reverse().

This won't happen in a sequence constructor, but I think you refer to something like:

a/b/copy-of() => reverse()

But since reverse() takes a sequence as an argument, it differs from last() in that it must first materialize the whole sequence. This is similar to, say, string-join. I.e., even though it cannot be pipelined, the argument is consuming, so in a pipeline or windowed streaming scenario, the semantics of the function make sure that it does not break any rules.

As a result, the above is (from a streaming view) semantically equal to reverse(a/b/copy-of()) and reverse(copy-of(a/b)). After a rewrite to the latter, which cannot be done with last() inside a sequence constructor, the streamability issue goes away (well, technically, it means that the whole sequence will first be grounded and stored in memory, so this is not streaming at all).
Comment 3 Michael Kay 2015-10-16 11:03:36 UTC
I am closing this as a duplicate of bug #29153, since I think the issues raised have all been addressed there.

*** This bug has been marked as a duplicate of bug 29153 ***