5795 – CVS: Static Typing: K2-Steps-2, K2-FunctionProlog-14

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED WONTFIX

Alias:	None

Product:	XML Query Test Suite
Classification:	Unclassified
Component:	XML Query Test Suite
Version:	unspecified
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	Frans Englich
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2008-06-24 12:57 UTC by Tim Mills
Modified:	2010-03-16 16:02 UTC
CC List:	4 users

Attachments
Tests involving fn:error() (9.75 KB, application/octet-stream) 2008-07-29 13:05 UTC, Tim Mills
Tests involving fn:error() (14.62 KB, application/octet-stream) 2008-07-30 12:16 UTC, Tim Mills

Description Tim Mills 2008-06-24 12:57:56 UTC

This query can also return the empty sequence.

In this context

e[928] 

is effectively

(fn:error('err:XPDY0002'))[928]

which can be evaluated through type analysis to be the empty sequence.

K2-FunctionProlog-14 has a similar problem.

(:*******************************************************:)
(: Test: K2-Steps-2                                      :)
(: Written by: Frans Englich                             :)
(: Date: 2007-11-22T11:31:21+01:00                       :)
(: Purpose: A numeric predicate combined with a name test inside a function. :)
(:*******************************************************:)
declare function local:myFunc()
{
    e[928]
};
local:myFunc()

Comment 1 Frans Englich 2008-06-24 13:45:28 UTC

But isn't this extremely counterintuitive? The user has done wrong: she's using an axis step inside a function. Instead of being reported as an error, it evaluates to a perfectly valid result: an empty sequence. It's not helpful, in my opinion.

What says that an expression that happens to contain a dynamic error can be rewritten to a call to fn:error? I'd say e[928] isn't effectively equal to (fn:error('err:XPDY0002'))[928], because the latter has the static type none, which the former doesn't have.

Comment 2 Tim Mills 2008-06-24 14:00:10 UTC

Refer to:

http://www.w3.org/TR/xquery/#id-errors-and-opt

"Consider an expression Q that has an operand (sub-expression) E. In general the value of E is a sequence. At an intermediate stage during evaluation of the sequence, some of its items will be known and others will be unknown. If, at such an intermediate stage of evaluation, a processor is able to establish that there are only two possible outcomes of evaluating Q, namely the value V or an error, then the processor may deliver the result V without evaluating further items in the operand E. For this purpose, two values are considered to represent the same outcome if their items are pairwise the same, where nodes are the same if they have the same identity, and values are the same if they are equal and have exactly the same type."

I agree it isn't intuitive, but I'm sure it is correct.  I rather wish the type of fn:error was, in Haskell style:

empty-sequence() IO 

(indicating an expression with side effects) rather than

none

which I believe would help to get rid of these oddities (and clean up XQuery Update).

Comment 3 Frans Englich 2008-06-25 10:43:43 UTC

I can't see how that paragraph applies, I need further help with this!

My understanding is that that paragraph says that an implementation can choose between delivering a sequence of items or an error, if both are the outcome of evaluating an expression.

However in the case of:

declare function local:myFunc()
{
    e[928]
};
local:myFunc()

where does the empty sequence come from? Maybe one can infer it from the fn:error() function, but it hasn't yet been clarified how that one enters the picture.

I agree, the none type surely gives us a lot of trouble :}

Comment 4 Tim Mills 2008-06-25 11:03:29 UTC

The context item has static type 'none' because accessing it is equivalent to calling fn:error (whose return type is none).  So we can rewrite the path expression $fs:dot/e as a call to fn:error.

declare function local:myFunc()
{
    (fn:error('err:XPDY0002'))[928]
};

The quantifier of fn:error is 1, so we either have a sequence of one item or an error.  But we're not interested in the first item, so there's no need to evaluate it to cause an error.  We want item 928 - therefore we get the result as ().
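
The cardinality argument above can be sketched in Python. This is only an illustrative model under stated assumptions, not any processor's implementation: `StaticExpr`, `item_at`, and `undefined_context` are hypothetical names, and `max_items` stands in for whatever upper bound static analysis would derive for the operand.

```python
from dataclasses import dataclass
from typing import Callable, List


class DynamicError(Exception):
    """Stands in for an XQuery dynamic error such as err:XPDY0002."""


@dataclass
class StaticExpr:
    """Hypothetical operand: a static upper bound on length plus a lazy evaluator."""
    max_items: int                          # upper bound assumed known statically
    evaluate: Callable[[], List[object]]    # may raise DynamicError when forced


def item_at(operand: StaticExpr, position: int) -> List[object]:
    """Model of a positional filter operand[position] (1-based)."""
    if position > operand.max_items:
        # The only possible outcomes are () or an error, so the operand
        # is never evaluated and the empty sequence is delivered.
        return []
    items = operand.evaluate()
    return items[position - 1:position]


def undefined_context() -> List[object]:
    """Models access to an undefined context item."""
    raise DynamicError("err:XPDY0002")


# An operand with at most one item, whose evaluation always fails.
failing = StaticExpr(max_items=1, evaluate=undefined_context)
result_928 = item_at(failing, 928)   # the operand is never forced
```

In this model the error surfaces only when position 1 is requested, since that is the only position for which the operand has to be evaluated.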

Comment 5 Frans Englich 2008-06-26 11:12:37 UTC

"The context item has static type 'none' because accessing it is equivalent to
calling fn:error (whose return type is none).  So we can rewrite the path
expression $fs:dot/e as a call to fn:error."

So, if an expression yields an error, it's ok to rewrite it to fn:error? So any expression that raises an error can potentially be rewritten to fn:error and then reduced further using the type none?

I don't see how that holds. I'd say that just because an expression raises an error, that doesn't make it ok to rewrite it to the function fn:error.

Comment 6 Tim Mills 2008-06-26 12:04:05 UTC

> I don't see how that holds. I'd say that just because an expression raises an
> error, that doesn't make it ok to rewrite it to the function fn:error.

We only do this with dynamic errors.

Why wouldn't this rewrite be valid?  Could a user see a difference between an expression raising an error and a call to fn:error?

Does anyone else have an opinion on this?

Comment 7 Michael Kay 2008-06-26 12:10:47 UTC

>So, if an expression yields an error, it's ok to rewrite it to fn:error? So any expression that raises an error can potentially be rewritten to fn:error and then reduced further using the type none?

It seems to me to be perfectly reasonable to rewrite the expression as fn:error() - but not reasonable to then do type inferencing on error() and further reduction that causes the error not to be raised!

Frankly, I think that we haven't formalized error behaviour to the level where such rewrites are safe. I've found that it's generally better to infer a type of item()* for expressions known to throw an error - but that's in an optimistic environment, of course.

Comment 8 Tim Mills 2008-06-26 12:23:04 UTC

Consider:

declare function local:something() as xs:integer 
{
  1
};

(local:something())[928]

Do you think it is reasonable to return () here without evaluating the function?

If we redefine the function as:

declare function local:something() as xs:integer 
{
  fn:error()
};

do you still think it reasonable?

As I've said before, I think fn:error should have been treated in the way that exceptions are treated in Haskell, since the error is a side-effect, and the same goes for updating expressions.

As I'm sure you are aware, the reason fn:error can't be treated as item()* is that static type checking would end up with type check errors all over the place.  e.g.

declare function local:maybe-error($arg as xs:boolean) as xs:integer
{
  if ($arg)
  then fn:error()
  else 1
};

With fn:error() as 'none', this type checks correctly.  Were it item()*, the call to fn:error would cause a static type check error.
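
The type-checking point can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not the Formal Semantics: a static type is modelled as a set of alternatives, `none` as the empty set, and the marker string `"item()*"` as "anything". The names `union` and `is_subtype` are hypothetical.

```python
NONE = frozenset()                     # type of fn:error(): no possible value
INTEGER = frozenset({"xs:integer"})
ITEM_STAR = frozenset({"item()*"})     # hypothetical marker for "any sequence"


def union(a: frozenset, b: frozenset) -> frozenset:
    """Type of `if (...) then A else B`: either branch may be the result."""
    return a | b


def is_subtype(actual: frozenset, declared: frozenset) -> bool:
    """True if every alternative of `actual` is allowed by `declared`."""
    if "item()*" in declared:
        return True
    return actual <= declared


# fn:error() typed as none: the conditional has type xs:integer.
with_none = union(NONE, INTEGER)
# fn:error() typed as item()*: the conditional is too wide for xs:integer.
with_item_star = union(ITEM_STAR, INTEGER)
```

Because `none` is the identity for the union, it vanishes from the type of the conditional, whereas `item()*` widens the result beyond the declared return type.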

Comment 9 Michael Kay 2008-06-26 12:35:18 UTC

>do you still think it reasonable?

I have found it necessary to be pragmatic with error situations: the criterion for what is reasonable is based on how helpful it is to the user. Failing to report a simple error that can be detected statically, and returning an empty sequence instead, does not seem helpful.

I'm well aware that treating error() as item()* doesn't work in a static typing environment, but fortunately I don't have that problem. (Well, not fortunately, it was by deliberate choice...)

But of course we're dealing with a test suite here, and it's testing for conformance not for usability. Unfortunately I don't think that static typing is spec'ed well enough to ensure interoperable results especially in error situations, so I personally think this is a losing battle. There's probably a chain of reasoning that says 42 is a valid answer to the query 2+2.

Comment 10 Tim Mills 2008-06-26 12:47:12 UTC

> But of course we're dealing with a test suite here, and it's testing for
> conformance not for usability. 

You are right.  So do you think it conformant behaviour?

Comment 11 Michael Kay 2008-06-26 13:01:10 UTC

>So do you think it conformant behaviour?

I wouldn't dare to venture an opinion. With static typing, nothing (not even a failure to report a statically detectable error) would surprise me.

But I think there comes a time when a test suite has to take a pragmatic view. The rules on errors and optimization license an implementation to throw an error for the query 2+2. I don't think that means the test suite should list this as an approved result. There comes a time when you have to report that you consider your result conformant even though it's not the expected result that was published.

Comment 12 Frans Englich 2008-06-26 13:08:41 UTC

For what it's worth I'm not convinced this is conformant behavior, but I think we can add a static typing query that follows the line of thinking of the static typing community.

Tim, could you provide a query which unconditionally raises XPDY0002 for static typing implementations? Will "node-test", without quotes, do? I'll add that as a @static-name test.

Comment 13 Tim Mills 2008-06-26 13:12:15 UTC

Thanks Mike.

fn:error really has caused us a lot of head scratching.

On some days, I can convince myself that count(fn:error()) is 1.  I don't like it, but I think it is correct.

I may post a further question to clarify whether both:

(1, fn:error())[1]

and

(fn:error(), 1)[2]

can be conformantly executed without error.
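
To make the question concrete, here is a small Python sketch contrasting lazy and eager evaluation of the two expressions. It is a model only: each operand is a thunk assumed to contribute exactly one item, which is precisely the contested assumption that count(fn:error()) is 1. The names `lazy_item_at` and `eager_item_at` are hypothetical.

```python
from typing import Callable, List


class DynamicError(Exception):
    """Stands in for the error raised by fn:error()."""


def error() -> object:
    """Models a call to fn:error() with no arguments."""
    raise DynamicError("FOER0000")


def lazy_item_at(thunks: List[Callable[[], object]], position: int) -> List[object]:
    """Select the item at a 1-based position, forcing only that one thunk."""
    if position < 1 or position > len(thunks):
        return []
    return [thunks[position - 1]()]


def eager_item_at(thunks: List[Callable[[], object]], position: int) -> List[object]:
    """Evaluate every item first, then select; any error propagates."""
    items = [t() for t in thunks]
    return items[position - 1:position]


first = lazy_item_at([lambda: 1, error], 1)     # models (1, fn:error())[1]
second = lazy_item_at([error, lambda: 1], 2)    # models (fn:error(), 1)[2]
```

The lazy variant returns 1 for both expressions, while the eager variant raises for both; which of these behaviours is conformant is the question being asked.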

Comment 14 Tim Mills 2008-06-26 13:12:25 UTC

Frans - we get this result whether we use static typing or not.  The result comes not so much from type analysis as from the rule quoted in Comment 2.

Comment 15 Tim Mills 2008-06-26 13:14:37 UTC

Is the [928] important?

How about:

declare function local:myFunc()
{
    e[1]
};
local:myFunc()

Comment 16 Frans Englich 2008-06-26 13:29:54 UTC

Added the query in comment #15 as a static typing query for each of the two tests in question. Feel free to verify, and close if everything looks ok.

Comment 17 Tim Mills 2008-06-27 16:14:04 UTC

Frans -

I've looked a little further into why we are returning empty sequence here.

In K2-Steps-2, e[928] ends up being transformed into

fs:item-at($fs:seq, 928)

(see Bug 4841 for details of fs:item-at).

The type checking rule is:

        statEnv |-  QName of func expands to (FS-URI,"item-at")
        statEnv |-  Expr1 : Type1
        --------------------------------------------------------
        statEnv |-  QName(Expr1, Expr2) : prime(Type1) ?

which here is

fs:item-at( something of type none, 928)

prime(none) = none
none? = empty

Therefore it is definitely correct to say that e[928] is the empty sequence.
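
The derivation prime(none) = none, none? = empty can be mimicked with a toy type representation in Python. This is a sketch under simplifying assumptions: a type is reduced to its possible item types plus a flag saying whether the empty sequence is allowed, so `?` and `*` are not distinguished. `SeqType`, `optional`, and `item_at_type` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SeqType:
    """Toy static type: a choice of item types plus whether () is allowed."""
    items: FrozenSet[str]     # possible item types; empty set = no item possible
    allows_empty: bool        # whether the empty sequence is a possible value


NONE = SeqType(frozenset(), False)    # no value at all (the type of an error)
EMPTY = SeqType(frozenset(), True)    # only the empty sequence


def prime(t: SeqType) -> SeqType:
    """Item types of t, ignoring occurrence; prime(none) = none."""
    return SeqType(t.items, False)


def optional(t: SeqType) -> SeqType:
    """The `?` quantifier, i.e. the choice (t | empty)."""
    return SeqType(t.items, True)


def item_at_type(operand: SeqType) -> SeqType:
    """Static type of fs:item-at(operand, n): prime(operand)?."""
    return optional(prime(operand))


inferred = item_at_type(NONE)                          # type for e[928]
integers = SeqType(frozenset({"xs:integer"}), True)    # stands in for xs:integer*
```

In this model `inferred` equals `EMPTY`, while an operand that can actually yield integers keeps `xs:integer` as a possible item type.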

A static typing implementation would be required to raise XPST0005 here (that's an expected result), but any implementation is free to return () here.

Would this persuade you to add this as a possible result?

Comment 18 Tim Mills 2008-06-30 09:01:03 UTC

In Bug 5810, Michael Dyck has confirmed that 

fs:item-at(fn:error(), Expr) 

can be rewritten as the empty sequence by virtue of its static type.

"I don't see how that holds. I'd say that just because an expression raises an
error, that doesn't make it ok to rewrite it to the function fn:error."

This can all be proved without actually performing a rewrite to fn:error.

$fs:dot has type none (because it is an error).

From FS: statEnv |-  axis Axis of none : none

So $fs:dot/child::e is also of type none.

fs:item-at( $fs:dot/child::e (of type none), 928) is empty

Any expression of type empty can be rewritten as the empty sequence.

Comment 19 Michael Kay 2008-06-30 09:28:35 UTC

>Any expression of type empty can be rewritten as the empty sequence.

Section 4 states:

During static analysis, it is a type error for an expression to have the empty type, except for the following expressions and function calls:

    * Empty parentheses (), which denote the empty sequence.
    * The fn:data function and all functions in the fs namespace applied to empty parentheses ().
    * Any function which returns the empty type.

If this rule is to make sense at all, then it must surely be applied before doing any rewrites. You can't rewrite an expression as "()" as a way of getting around this rule.

Comment 20 Tim Mills 2008-06-30 10:32:47 UTC

This is the paragraph which gives rise to XPST0005 - an error specific to the static typing feature (2.3.1 Kinds of Errors, paragraph 7).

If the static typing feature isn't in effect, an implementation isn't required to throw this error.  Incidentally, many static typing implementations appear to ignore this error, apply it selectively (e.g. only in path expressions), or allow it to be switched off.

If the static typing feature is switched off, the extent to which the static typing rules are used during query analysis is not prescribed by the standard.

I'm specifically concerned here with running the tests with static typing feature switched off.  The test case already correctly lists XPST0005 as a valid result.

Comment 21 Frans Englich 2008-07-23 11:25:26 UTC

Regarding: "fs:dot has type none (because it is an error)". Could you elaborate on how the spec backs this up?

Here's how I understand it: if fs:dot has type none because it raises an error (the context is undefined), then XPDY0002 never appears in the cases where subsequent code starts inferring based on the none type. And can the same kind of logic be applied to other expressions? E.g. 1) it's statically detected that an expression will raise an error; 2) the expression's type changes to none (I don't understand why); 3) inference goes to work and the error is not reported.

Comment 22 Tim Mills 2008-07-29 13:05:07 UTC

Created attachment 561
Tests involving fn:error()

Attached is a set of queries which I believe show how fn:error() can be optimized away through static analysis of the queries.

Comment 23 Tim Mills 2008-07-29 13:27:22 UTC

> Regarding: "fs:dot has type none (because it is an error)". Could you elaborate
> on how the spec backs this up?

I can only state the equivalence of a dynamic error (namely accessing an undefined context item) and a call to fn:error.

This is backed up by Formal Semantics, which shows how the dynamic error raised by the "treat as" expression is normalized to a call to fn:error.

[Expr treat as SequenceType]Expr
==
typeswitch ([ Expr ]Expr)
  case $fs:new as SequenceType return $fs:new
  default $fs:new return fn:error()
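
The shape of this normalization can be mirrored in Python as a rough illustration. Everything here is an assumption made for the sketch: `treat_as`, `fn_error`, and `is_integer` are hypothetical names, and the predicate stands in for a SequenceType match.

```python
from typing import Callable


class DynamicError(Exception):
    """Stands in for the error raised by fn:error()."""


def fn_error() -> None:
    """Models a call to fn:error(); it never returns normally."""
    raise DynamicError("dynamic error")


def treat_as(value: object, matches: Callable[[object], bool]) -> object:
    """Model of `Expr treat as SequenceType` after normalization."""
    new = value                  # plays the role of $fs:new
    if matches(new):             # case $fs:new as SequenceType
        return new
    return fn_error()            # default branch: always raises


def is_integer(v: object) -> bool:
    """Hypothetical SequenceType test standing in for xs:integer."""
    return isinstance(v, int) and not isinstance(v, bool)


ok = treat_as(42, is_integer)
```

The dynamic error of the original expression is expressed purely as a call to the error function in the default branch, which is the equivalence being relied on.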

Comment 24 Tim Mills 2008-07-29 15:42:41 UTC

I've just had a play with Saxon to see what it does here.

For the original query:

declare function local:myFunc()
{
    e[928]
};
local:myFunc()

it throws an error, but it returns empty sequence for:

declare function local:myFunc()
{
    e[false()]
};
local:myFunc()

Regarding Comment #1, in both these queries "The user has done wrong: she's
using an axis step inside a function."

Comment 25 Michael Kay 2008-07-29 16:00:30 UTC

I think we're in the realms of usability rather than conformance here: it's going to be very hard to formalize the rules on error handling to ensure all products are compatible in such cases. But I think the test suite needs to make assumptions about what's reasonable: for example it can legitimately assume that the implementation's limit on the length of strings will be greater than 10, and I don't think the expected results should be changed because some implementation blows its limits.

I think that the rewrite from e[928] to error()[928] to () is simply poor design from a usability point of view. Once you've decided there is an error here you should take care to ensure the user knows about it.

The difference with e[false()] is that the sequence of rewrites never went via error() to something else. Yes, arguably the processor should have tried harder to check for error conditions before doing the rewrite, but it's hard to argue against doing context-independent rewrites before starting to look at the context.

Comment 26 Tim Mills 2008-07-29 16:07:28 UTC

We're following the rules for static typing here, which (like it or not) make it quite clear that:

e[1]

is of type empty if the context item's value is undefined.  This isn't really comparable to differences due to implementation limits.

As an aside, I rather think the usability problem here is that this accessing of the undefined context item in a function body is considered to be a dynamic error rather than a static error.

Comment 27 Tim Mills 2008-07-30 12:16:14 UTC

Created attachment 562
Tests involving fn:error()

Attached is an updated set of 35 queries where a call to fn:error() can be optimized away through static analysis of the queries.

Comment 28 Tim Mills 2008-07-30 12:34:43 UTC

"The difference with e[false()] is that the sequence of rewrites never went via
error() to something else. Yes, arguably the processor should have tried harder
to check for error conditions before doing the rewrite, but it's hard to argue
against doing context-independent rewrites before starting to look at the
context."

For the query:

error()[928]

we normalize to:

let $fs:seq_7 as item()* := fn:error()
return fs:item-at($fs:seq_7, 928)

We then perform static type analysis which spots that the expression has type empty.  In the original query, we necessarily look at the static context to get the context item's type.  Once that's done, the only (context-independent) rewrite is to replace an expression with type empty with ().  

I agree that it is hard for a processor to check for error conditions, since the type system does not capture the difference between a function which, say, returns an integer and one which may return an integer or raise an error.

Comment 29 Tim Mills 2008-07-30 13:27:35 UTC

Other queries which can return () or raise XPST0005 are:

K2-Steps-3 
K2-FilterExpr-8
K-SeqRemoveFunc-13 (reported in Bug 5991)

Comment 30 Frans Englich 2008-08-08 10:15:06 UTC

Hi,

Unfortunately there are some administrative details of my W3C membership to attend to, as a result of which I can't do W3C work until the end of August (and let's hope it doesn't stretch out longer than that).

As a result, progress on this report as well as others assigned to me/XQTS will have to wait.

Comment 31 zhen hua liu 2009-02-24 22:17:16 UTC

In K2-Steps-2, the query e[928] refers to the context item. However, the input variable binding is $input-context, not the context item, so the query should result in an XPDY0002 error. Furthermore, the XQuery spec allows an implementation to perform static analysis and to detect and raise such an error statically. However, using static analysis to infer types and compute results when errors have been detected in previous steps is not very sound and not very well defined (at least it is not covered by the XQuery spec or the Formal Semantics spec). So, per working group discussion, we close this bug as resolved.

Comment 32 Jim Melton 2009-03-01 18:56:32 UTC

Based on the solution Zhen provided in comment 31, we're marking this bug RESOLVED/WONTFIX.  If this resolution is satisfactory, please mark the bug CLOSED.