29818 – [xslt30] Visibility and Applicability of Accumulators

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	XSLT 3.0
Version:	Candidate Recommendation
Hardware:	PC All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Michael Kay
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2016-09-13 08:26 UTC by Michael Kay
Modified:	2016-10-20 20:57 UTC
CC List:	1 user

Description Michael Kay 2016-09-13 08:26:37 UTC

This arises from the decision that use-accumulators on xsl:source-document, xsl:merge-source, and xsl:mode should apply to unstreamed documents as well as to streamed documents.

For these cases, there is a package that we can call the "originating package of the document" - for xsl:source-document and xsl:merge-source it's the package containing the instruction; for xsl:mode (which controls accumulators for documents in the initial match selection) it's effectively the top-level package. When use-accumulators="#all" is specified, it means all accumulators declared in the originating package. Note that these accumulators necessarily have distinct names.
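
To make the xsl:source-document case concrete, here is a minimal sketch. The accumulator name item-count, the element name item, and the file orders.xml are invented for illustration; nothing here comes from the spec text or from this bug. The stylesheet containing the instruction plays the role of the originating package.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical accumulator: counts the item elements encountered so far. -->
  <xsl:accumulator name="item-count" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="item" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template name="xsl:initial-template">
    <!-- This stylesheet is the originating package of orders.xml (assumed file).
         Writing use-accumulators="#all" instead would select every accumulator
         declared in this package. -->
    <xsl:source-document href="orders.xml" use-accumulators="item-count">
      <total>
        <xsl:value-of select="accumulator-after('item-count')"/>
      </total>
    </xsl:source-document>
  </xsl:template>

</xsl:stylesheet>
```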

(The term "originating package" is new, but the concept already exists in effect for whitespace stripping rules. For most kinds of source document, the whitespace stripping rules of some package apply, and we can refer to that package as the "originating package" of the document.)

Now, accumulator-before() and accumulator-after() require that the accumulator name supplied is both visible in the package containing the function call, and applicable to the tree containing the context item. I wonder why the first condition is necessary. Now that the only accumulators available are those declared in the originating package, the requirement that the accumulator is visible in the invoking package seems to have no useful purpose. And it leads to potential confusion: what if the originating package declares an accumulator A and the invoking package declares a different accumulator also called A? So I would like to suggest dropping the requirement that the accumulator named in accumulator-before/after be visible in the package containing the function call.
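
The potential confusion can be sketched with two packages, shown here as two separate files in one listing. All names (the package URIs, the function a:load, the file data.xml, the element names) are hypothetical, and the sketch deliberately does not claim what a processor does with the call in package B; it only sets up the situation described above.

```xml
<!-- File 1: package A, the originating package of data.xml (hypothetical) -->
<xsl:package name="http://example.com/pkg-a" package-version="1.0" version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:a="http://example.com/pkg-a">

  <!-- Accumulator A as declared in package A: counts section elements. -->
  <xsl:accumulator name="A" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="section" select="$value + 1"/>
  </xsl:accumulator>

  <!-- Public function that loads the document with package A's accumulator. -->
  <xsl:function name="a:load" as="document-node()" visibility="public">
    <xsl:source-document href="data.xml" use-accumulators="A">
      <xsl:sequence select="."/>
    </xsl:source-document>
  </xsl:function>

</xsl:package>

<!-- File 2: package B, the invoking package (hypothetical) -->
<xsl:package name="http://example.com/pkg-b" package-version="1.0" version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:a="http://example.com/pkg-a">

  <xsl:use-package name="http://example.com/pkg-a" package-version="*"/>

  <!-- A different accumulator that happens to share the name A. -->
  <xsl:accumulator name="A" as="xs:string" initial-value="''">
    <xsl:accumulator-rule match="title" select="string(.)"/>
  </xsl:accumulator>

  <xsl:template name="xsl:initial-template" visibility="public">
    <!-- The name A is visible here as package B's accumulator, but the tree
         was originated in package A with package A's accumulator. -->
    <xsl:sequence select="a:load()/accumulator-after('A')"/>
  </xsl:template>

</xsl:package>
```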

Now consider documents/trees that aren't originated using these three instructions: for example, documents read using doc()/document()/collection(); documents created using xsl:document; temporary trees created using xsl:variable; documents created using parse-xml(), parse-xml-fragment(), analyze-string(), json-to-xml(). Which accumulators are applicable? We currently say "all accumulators", but do we mean this in the sense of "#all", or do we mean all accumulators declared in any package of the stylesheet? If accumulator-before(N) or accumulator-after(N) is used, then N is currently taken as a reference to an accumulator defined in the package containing the accumulator-before/after call. Given the changes to use-accumulators, it now feels more consistent that it should be taken as a reference to an accumulator declared in the package where the document originates. For all the above cases (exception noted below) the "originating package" is fairly obvious and not difficult to define, e.g. for doc() it is the package in which the call to doc() appears. This approach seems consistent with the fact that the whitespace stripping rules that apply to a document are the rules in its originating package.
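
As a rough illustration of two of these cases, the sketch below calls accumulator-after() on a document read with doc() and on a temporary tree built with a local xsl:variable. The accumulator item-count, the element names, and the file orders.xml are assumptions made for the example. In a single-package stylesheet like this one, the package containing the function call and the proposed "originating package" coincide, so the question raised above only becomes visible once the call and the doc() or xsl:variable sit in different packages.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical accumulator: counts item elements. -->
  <xsl:accumulator name="item-count" as="xs:integer" initial-value="0">
    <xsl:accumulator-rule match="item" select="$value + 1"/>
  </xsl:accumulator>

  <xsl:template name="xsl:initial-template">
    <!-- Document read with doc(): no use-accumulators attribute is involved. -->
    <xsl:variable name="orders" select="doc('orders.xml')"/>

    <!-- Temporary tree created using xsl:variable. -->
    <xsl:variable name="tmp">
      <list><item/><item/></list>
    </xsl:variable>

    <counts>
      <from-doc>
        <xsl:value-of select="$orders/accumulator-after('item-count')"/>
      </from-doc>
      <from-temporary-tree>
        <xsl:value-of select="$tmp/accumulator-after('item-count')"/>
      </from-temporary-tree>
    </counts>
  </xsl:template>

</xsl:stylesheet>
```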

The exception noted above is a temporary tree constructed using a global xsl:variable. Here there are two candidates for the "originating package": it could be either the declaring package of the global variable, or its containing package. If package A declares a variable V, and package B uses package A, and package B contains a reference to V which is interpreted as a reference to V(B) (that is, the global variable named V whose declaring package is A and whose containing package is B), then does accumulator-before("C") refer to an accumulator declared in B or an accumulator declared in A? I think it's probably most consistent to make it B - the containing package rather than the declaring package.

Comment 1 Michael Kay 2016-10-06 18:34:08 UTC

An alternative approach was suggested by Abel during WG discussion: make accumulator names global and require them to be unique. This isn't as bad as it sounds: it simply obliges people to use namespaces if they want to avoid a clash.

However, I can't say I feel entirely comfortable with it, simply because (as MSpMcQ pointed out) it's a good idea if names of different kinds of object are all handled in the same way.

Looking back at my proposal, there are really two parts to it:

(a) to drop the requirement that the accumulator named in the argument to accumulator-before/accumulator-after be visible in the package containing the call. I don't like this use of static name scope for a dynamic name, but we have it all over the place (e.g. decimal formats and keys) so it's not something that stands out as inconsistent.

(b) to state that for documents such as those read using doc() or collection(), or temporary trees created using xsl:variable, the applicable accumulators are those defined in the "originating package". On reflection, this seems to require a lot of machinery in the specification and in implementations that we could do without.

I think the problem that triggered this bug report was the realisation that a document might have two completely different accumulators with the same name and that this could be very confusing for users. But again, we allow multiple functions or variables or keys or decimal formats with the same name. With keys, we even allow multiple keys with the same name on the same document, so the analogy between keys and accumulators is very close.

On reflection, therefore, I'm inclined to stick with the status quo.

Comment 2 Abel Braaksma 2016-10-06 19:28:36 UTC

The WG asked me to consider the impact of making all accumulators public as a resolution to this issue (action ACTION 2016-10-06-003).

Proposed condensed set of rules:

1) All accumulators are public by default

2) Accumulators cannot be overridden with xsl:override (status quo)

3) Name clashes of declarations are detected statically, names from used packages may not clash with one another

4) Normal import precedence rules on names apply (status quo)

5) Accumulators may be used on any tree, unless this is limited by fn:copy-of etc. on a streamed tree, which itself was limited by @use-accumulators.


These rules have the benefit of removing the complexity introduced by the "originating package" proposal w.r.t. accumulators, but have the downside that a user will not be able to use two packages together if they have conflicting accumulator names (which we could resolve, but only by introducing new language elements or rules).

I don't really think the suggestion in comment 1, that names of different kinds of objects should be handled in the same way, is relevant: we already have different naming rules for different kinds of objects (regrettably so).

However, this proposal does feel a bit like a sledgehammer approach where a much smaller tool could have sufficed. I'll reconsider Mike's proposal.

Comment 3 Michael Kay 2016-10-20 20:57:15 UTC

After a great deal of discussion we concluded that the current spec was viable and that no alternatives had been proposed that were clearly an improvement. This bug is therefore closed with no action.