9139 – [XPath 2.1] Dynamic function calls and context

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	XPath 3.0
Version:	Working drafts
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	John Snelson
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2010-02-24 17:38 UTC by Michael Kay
Modified:	2010-09-14 16:08 UTC
CC List:	4 users

Description Michael Kay 2010-02-24 17:38:58 UTC

The rules for dynamic function calls should make it clear that no context information is passed from the caller's context to the callee's context other than that explicitly mentioned (the variables in the closure). This has the consequence that a dynamic call to a contextual core function like position() is not useful (in fact, it will always fail).

The only place where we say that the focus is cleared on a function call is in 3.1.5 Function Calls, rule 4:

"During evaluation of a function body, the focus (context item, context position, and context size) is undefined, except where it is defined by some expression inside the function body."

but this rule (appearing where it does) only applies to explicit calls on "user-declared functions": it does not apply to dynamic calls on user-declared functions, or to dynamic calls on inline functions, or to dynamic calls on built-in functions such as position(). It should apply to all these cases.

Comment 1 John Snelson 2010-02-24 18:01:37 UTC

(In reply to comment #0)
> The rules for dynamic function calls should make it clear that no context
> information is passed from the caller's context to the callee's context other
> than that explicitly mentioned (the variables in the closure). This has the
> consequence that a dynamic call to a contextual core function like position()
> is not useful (in fact, it will always fail).

I'm not sure that this is a direction that the WG has decided on, although I agree that it's probably the right thing to do.

> The only place where we say that the focus is cleared on a function call is in
> 3.1.5 Function Calls, rule 4:
> 
> "During evaluation of a function body, the focus (context item, context
> position, and context size) is undefined, except where it is defined by some
> expression inside the function body."
> 
> but this rule (appearing where it does) only applies to explicit calls on
> "user-declared functions": it does not apply to dynamic calls on user-declared
> functions, or to dynamic calls on inline functions, or to dynamic calls on
> built-in functions such as position(). It should apply to all these cases.

My reading of the spec would say that that sentence already applies to calls ("dynamic" or otherwise) on user-declared functions, of which inline functions are a type. I agree that the behaviour for built-in functions needs clarifying one way or the other.

Comment 2 Jonathan Robie 2010-02-24 18:13:17 UTC

(In reply to comment #0)

> The only place where we say that the focus is cleared on a function call is in
> 3.1.5 Function Calls, rule 4:
> 
> "During evaluation of a function body, the focus (context item, context
> position, and context size) is undefined, except where it is defined by some
> expression inside the function body."
> 
> but this rule (appearing where it does) only applies to explicit calls on
> "user-declared functions": it does not apply to dynamic calls on user-declared
> functions, or to dynamic calls on inline functions, or to dynamic calls on
> built-in functions such as position(). It should apply to all these cases.

I disagree - the rules for evaluation of a dynamic function are these:

A dynamic function invocation is evaluated as follows:

1. Argument values are calculated for the function item using rules 1 and 2 for evaluation of a function call as defined in 3.1.5 Function Calls.

2. The set of variable values from the function item's closure are added to the dynamic context with a scope of the invocation of the function.

3. The function from the function item is evaluated using the argument values according to rules 3 - 5 for evaluation of a function call as defined in 3.1.5 Function Calls.

When we evaluate rules 3 - 5 above, that includes rule 4, which you reference above.
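
For illustration only (this sketch is not part of the original comment): rule 2 says that the variables in the function item's closure travel with it to the point of invocation. A minimal XQuery 3.0 sketch, in which the names `$offset` and `$add` are hypothetical:

```xquery
xquery version "3.0";

(: $offset is in scope where the inline function is written, so it becomes
   part of the function item's closure. :)
let $offset := 10
let $add := function($n as xs:integer) as xs:integer { $n + $offset }

(: Each $add($i) is a dynamic function invocation: the argument is evaluated
   in the caller, the closure supplies $offset, and the body is then
   evaluated with those bindings. :)
for $i in (1, 2, 3)
return $add($i)
```

This evaluates to the sequence 11, 12, 13. The body needs nothing from the caller beyond its argument and the closure, which is the only context the rules quoted above pass across a dynamic call.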

Comment 3 Michael Kay 2010-02-24 19:39:49 UTC

I see that rule 3 of dynamic function invocation does in fact invoke rules 3-5 of "function call", which I had previously missed.

So as JS says, the remaining question is whether position#0() returns the context position, or throws an error.

Note that the question is rather more important in an XSLT context, where there are many more built-in functions that give access to dynamic context information, for example current-group(), regex-group(), and so on. My feeling is that it's better if a dynamic function call always discards the context information, rather than having it depend on which function is being called.

Comment 4 Jonathan Robie 2010-02-24 19:57:58 UTC

(In reply to comment #3)
> My feeling
> is that it's better if a dynamic function call always discards the context
> information, rather than having it depend on which function is being called.


I agree.

Comment 5 Henry Zongaro 2010-03-03 14:54:30 UTC

I believe there is a closely related problem involving the default collation and base URI components of the static context.  The function literals fn:compare#2, fn:default-collation#0 and fn:static-base-uri#0 might be referenced in a module where the default collation and base URI components have different values than in the module where those function items are finally dynamically invoked.

It seems to me that it would be desirable to use the static context at the point at which the function literal was referenced.

Comment 6 John Snelson 2010-03-03 15:06:27 UTC

(In reply to comment #5)
> It seems to me that it would be desirable to use the static context at the
> point at which the function literal was referenced.

I agree.

Comment 7 Michael Dyck 2010-03-03 19:32:02 UTC

(In reply to comment #5)
> 
> It seems to me that it would be desirable to use the static context at the
> point at which the function literal was referenced.

I think that's pretty much covered by the rule: "During evaluation of a function body, the static context and dynamic context for expression evaluation are defined by the module or expression in which the function is declared, which is not necessarily the same as the context in which the function is called."

Comment 8 Henry Zongaro 2010-03-03 19:58:51 UTC

Regarding Michael Dyck's comment #7, unfortunately that rule about the static context and dynamic context used in evaluating the function body applies only to user-defined functions - what I was speaking of in comment #5 were built-in functions whose behaviour is affected by the static context.

Comment 9 Michael Kay 2010-03-03 20:52:35 UTC

> I think that's pretty much covered by the rule: "During evaluation of a function body, the static context and dynamic context for expression evaluation are defined by the module or expression in which the function is declared, which is not necessarily the same as the context in which the function is called."

Except that functions like static-base-uri() aren't "declared". The suggestion is that if I do

module "a.xq";

declare variable $f as (function() as xs:string) := fn:static-base-uri#0;

module "b.xq";

$f()

then the result should be "a.xq".

I can't say I'm especially comfortable with this: it seems to require some special-casing. Normally static-base-uri() returns the static base URI of the caller, and I think I would expect that if $f is bound to static-base-uri#0, then $f() would also return the static base URI of the caller.

A more important case is perhaps functions like name():

let $myname := if (xxxx) then name#0 else local-name#0

<x/>/$myname()

If the context item is bound to anything here, I think I would expect it to be bound at the point the function is called, not at the point where $myname is declared.

Perhaps it's just too much of a pitfall and we should just disallow binding of context-dependent functions to function items.

Comment 10 Michael Dyck 2010-03-03 21:25:44 UTC

In applying that quote to the current discussion, it wasn't my intention that "the function" should be identified with one of the built-in functions in question, but rather with the code that contains the reference to the built-in function. That code is required to be evaluated wrt the context in which it appears, therefore the reference to the built-in function will be evaluated wrt that context, therefore the function itself will be evaluated wrt that context.

Is it the last "therefore" that's in question? I guess we don't say that a built-in function is evaluated wrt the same static context as the expression that references it. We didn't say that in XQuery 1.0 either, so maybe it's an old hole.

Comment 11 Michael Dyck 2010-03-03 21:36:17 UTC

(In reply to comment #9)
> The suggestion is that if I do
> 
> module "a.xq";
> declare variable $f as (function() as xs:string) := fn:static-base-uri#0;
> 
> module "b.xq";
> $f()
> 
> then the result should be "a.xq".
> 
> I can't say I'm especially comfortable with this

[Presumably module b imports module a.]

If the example were

a.xq:   declare function f() as xs:string { fn:static-base-uri() }
b.xq:   f()

we'd expect "a.xq", wouldn't we? I'm not sure why the first example is different. In both cases, the built-in function is referenced from module a, so that determines the static context that it's evaluated against when it's invoked.
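
To make the two cases concrete, here is a hedged sketch of both modules written out in full (it is not part of the original comment). The namespace URI `http://example.com/a`, the prefix `a`, and the names `$a:f` and `a:g` are assumptions made for the example. The return type is written as `xs:anyURI?` because that is the declared result type of `fn:static-base-uri`.

```xquery
xquery version "3.0";
(: a.xq, a library module (hypothetical namespace) :)
module namespace a = "http://example.com/a";

(: Case 1: a literal function item for the built-in function,
   created in module a. :)
declare variable $a:f as function() as xs:anyURI? := fn:static-base-uri#0;

(: Case 2: a user-declared function whose body calls the built-in. :)
declare function a:g() as xs:anyURI? { fn:static-base-uri() };
```

```xquery
xquery version "3.0";
(: b.xq, a main module that imports module a :)
import module namespace a = "http://example.com/a" at "a.xq";

(: $a:f() is a dynamic call; a:g() is an ordinary static call. :)
($a:f(), a:g())
```

`a:g()` is the case this comment expects to yield the base URI of module a. What `$a:f()` should yield is the question under discussion, so no result is shown for it.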

Comment 12 John Snelson 2010-03-03 21:54:28 UTC

(In reply to comment #11)
> If the example were
> 
> a.xq:   declare function f() as xs:string { fn:static-base-uri() }
> b.xq:   f()
> 
> we'd expect "a.xq", wouldn't we? I'm not sure why the first example is
> different. In both cases, the built-in function is referenced from module a, so
> that determines the static context that it's evaluated against when it's
> invoked.

For functions that return things from the static context, I've got to say that I agree with Michael Dyck. It would be extremely unfortunate if I couldn't tell at compile time everywhere that needed a given piece of information from the static context. One could argue that it would be a contradiction in terms.

Comment 13 Henry Zongaro 2010-03-03 22:16:19 UTC

(In reply to comment #11)

Sorry, Michael D. - I'm getting confused.  Reading comment #10, I thought you
were arguing that the result of the following should be "b.xq"

module "a.xq";
declare variable $f as (function() as xs:string) := fn:static-base-uri#0;

module "b.xq";
$f()

But reading comment #11, I think you're arguing that the result should be
"a.xq".  May I ask you to clarify?

By the way, in comment #5 I was going to add that I thought the current text
implied the result would be "b.xq" for an example like the one above, but that
I thought that would be undesirable behaviour.  I prefer the result "a.xq"
here, because my mental model is that the components of the static context for
a function literal are similar to the variables in the closure of an inline
function.  And I agree with what John Snelson says in comment #12.

Regardless, I'm not sure the WGs considered it one way or the other, and wanted
to make sure there was an explicit decision which behaviour to adopt.

Comment 14 John Snelson 2010-03-24 12:25:22 UTC

I propose that we resolve this bug by adding the following paragraphs to section 3.1.6 "Literal Function Items":

<new>
The static context for evaluation of the function item is inherited from the location of the literal function item expression, with the exception of the static type of the context item which is undefined.

Literal function items cannot access the focus (context item, context position, and context size), which is undefined when they are invoked. It is an error to create a function item for a function which accesses the focus [err:XPDY0002].

Note:

User-declared functions cannot access the focus, so this error only applies to built-in functions like:

fn:position#0
fn:last#0
fn:name#0
fn:namespace-uri#0
fn:local-name#0
fn:number#0
fn:string#0
fn:string-length#0
fn:normalize-space#0
fn:root#0
fn:id#1
fn:idref#1
fn:lang#1
fn:base-uri#1
fn:resolve-uri#1
</new>

This leaves open the XSLT 2.1 question of current-group(), regex-group() etc., which I imagine should be similarly restricted. I lack the words to express this adequately, since a blanket ban on all functions that access the dynamic context disallows functions like current-time(), which should be perfectly reasonable.

PS - Why does an undefined context item raise a dynamic error, rather than a type error?

Comment 15 Michael Kay 2010-03-24 12:54:53 UTC

OK in principle.

I'm not sure why base-uri#1 is on your list; should it be base-uri#0? And I don't think resolve-uri should be there at all.

Do we need to say something about partial function application as well? I think fn:lang(?) is synonymous with fn:lang#1.

If we want a general rule, then it should be a ban on functions that access non-stable parts of the dynamic context; and then we should define which parts of the dynamic context are stable and which aren't (this has some relationship with the exercise Jonathan has been doing in defining the scope of different parts of the static context). current-dateTime(), implicit-timezone(), and doc() are OK because the parts of the dynamic context that they access are stable.

Comment 16 John Snelson 2010-03-24 13:15:16 UTC

(In reply to comment #15)
> OK in principle.
> 
> I'm not sure why base-uri#1 is on your list; should it be base-uri#0? And I
> don't think resolve-uri should be there at all.

I agree - those are errors.

> Do we need to say something about partial function application as well? I think
> fn:lang(?) is synonymous with fn:lang#1.

A good point - so we need an analogous paragraph applying to partial function application.

> If we want a general rule, then it should be a ban on functions that access
> non-stable parts of the dynamic context; and then we should define which parts
> of the dynamic context are stable and which aren't (this has some relationship
> with the exercise Jonathan has been doing in defining the scope of different
> parts of the static context). current-dateTime(), implicit-timezone(), and
> doc() are OK because the parts of the dynamic context that they access are
> stable.

I would have said "dynamically scoped" rather than "stable" - but yes, that's the language that I need to formulate a more general rule.

Comment 17 John Snelson 2010-04-13 17:04:26 UTC

The XQuery and XSLT WGs discussed this bug today and agreed to the solution detailed in comment #14, as amended in comment #15 and comment #16. This bug will be left open to continue discussion of how to adequately extend this rule to cover XSLT 2.1's additional dynamic context components.

Comment 18 Michael Kay 2010-07-15 10:16:22 UTC

This was resolved during the joint meeting this week in Oxford. We agreed on some categories to which functions could be assigned, including the category of being "focus-dependent". Having defined that category, we decided that the functions listed in this bug entry, as well as similar XSLT functions such as regex-group(), can be allocated to it, which will ensure that they cannot be bound to function items. This will entail expanding the concept of the "focus" to embrace all parts of the dynamic context that are not stable for the duration of an execution scope.

See also bug #8221, under which the discussion occurred.

Leaving the bug open because there is still a need to draft detailed text to implement this approach.

Comment 19 John Snelson 2010-09-14 16:08:14 UTC

Closing this bug, as both working groups have resolved this problem.