This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 27908 - K2-SeqDocFunc-12
Summary: K2-SeqDocFunc-12
Status: RESOLVED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: XQuery 3 & XPath 3 Test Suite (show other bugs)
Version: Candidate Recommendation
Hardware: PC Windows NT
: P2 normal
Target Milestone: ---
Assignee: O'Neil Delpratt
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-01-27 13:42 UTC by Christian Gruen
Modified: 2016-01-12 17:04 UTC (History)
1 user (show)

See Also:


Attachments

Description Christian Gruen 2015-01-27 13:42:43 UTC
In "K2-SeqDocFunc-12", it is assumed that doc() will never be called. However, this may not necessarily be the case. It may e.g. be called if $fileToOpen is statically bound, and if fn:doc is pre-evaluated, in which case FODC0002 would be raised.

The input of the doc() function (i.e., the atomized value of $fileToOpen) is an empty string. As the behavior of doc() is implementation-defined, there may be implementations that raise an error different to FODC0002.

So I propose we should change the value of $fileToOpen to a string that is guaranteed to raise a well-defined error (e.g. '%gg', leading to 'FODC0005'). 

Does anyone disagree? This is the full test case:

<test><![CDATA[
  declare variable $fileToOpen := <Variable id="_7" name="constComplex2" type="_11c" context="_1" location="f0:17" file="f0" line="17"/>; 
  empty($fileToOpen//*[let $i := @type return doc($fileToOpen)//*[$i]])
]]></test>
<result>
   <any-of>
      <assert-true/>
      <error code="XPST0005"/>
   </any-of>
</result>

This is a revised proposal:

<test><![CDATA[
  declare variable $fileToOpen := '%gg'; 
  empty(<Variable id="_7" name="constComplex2" type="_11c" context="_1" location="f0:17" file="f0" line="17"/>//*[let $i := @type return doc($fileToOpen)//*[$i]])
]]></test>
<result>
   <any-of>
      <assert-true/>
      <error code="XPST0005"/>
      <error code="FODC0005"/>
   </any-of>
</result>
Comment 1 Michael Kay 2015-01-27 15:01:51 UTC
Personally, I think that when $X is empty, $X[ff()] should never raise a dynamic error. I know that our "errors and optimization" rules allow expressions to be rewritten in a way that raises errors that would not be raised by a "rigorous" evaluation, but I think that if we take a literal interpretation of those rules, then any test case can raise any error it likes. So I'm against making this change.

When eager (compile-time) evaluation of expressions fails, the correct strategy is to replace the expression with something like error(...) that delays the reporting of the error until execution time, so the failure only occurs if the expression is actually evaluated.
Comment 2 Christian Gruen 2015-01-27 15:19:06 UTC
Michael, thanks for your assessment. In that case, it could make sense to add some more test cases to ensure that pre-evaluated expressions are replaced with errors. I tested this and similar queries with a number of implementations, and I found out that they all yield different results.

As an example, the following query should probably yield an empty sequence. But the implementations I tested raised either FODC0005 or FODC0002:

  <a/>/*[doc('%gg')]

For the following query, all implementations returned FODC0002:

  <a/>/*[doc('i-do-not.exist')]

The following query returns an empty sequence or the typing eror XPTY0004 (depending on the implementation):

  <a/>/*[1 + '']

What do you think?
Comment 3 Michael Kay 2015-01-27 16:24:09 UTC
Section 2.3.1 says: An implementation can raise a dynamic error for a QueryBody statically only if the query can never execute without raising that error

I think some implementations take this more seriously than others. Arguably, it's contradicted by 2.3.1: "Other than the raising or not raising of errors, the result of evaluating a rewritten expression must conform to the semantics defined in this specification for the original expression." and the note that follows. An extreme reading of this rule says that it's legal to rewrite (2+2) as (error()).

My own view is that a rewrite should never cause an otherwise-unfailing query to fail except possibly in worst-case scenarios for conditions such as overflow.
Comment 4 Christian Gruen 2015-01-28 11:27:14 UTC
Thanks, this explains a lot. I have added two more test cases to doc.xml.
Comment 5 Michael Kay 2015-09-11 11:43:26 UTC
I think these new (but ante-dated) tests fn-doc-038 and fn-doc-039 are incorrect: raising FODC0005 is legitimate.

Looking at the first one, given the query 

<a/>/b[doc('%gg')]

it is perfectly legitimate to rewrite this to

if (boolean(doc('%gg')) then <a/>/b else ()

There is nothing in the spec that says the predicate must only be evaluated if the sequence is non-empty. It may not be a perfect strategy, but it's a legal one.

In previous discussion I cited "An implementation can raise a dynamic error for a QueryBody STATICALLY only if the query can never execute without raising that error". But there is nothing to stop the error being raised DYNAMICALLY. We have no way in the test suite of distinguishing whether an error is raised statically or dynamically, so we can't really test this rule.

I therefore think that FODC0005 is a legitimate outcome.
Comment 6 Christian Gruen 2015-09-12 07:32:22 UTC
Thanks, I can see your point. I am now wondering if it is then not legitimate as well to rewrite the original query to the following expression (which would most probably raise an error)?

  declare variable $fileToOpen := <Variable id="_7" name="constComplex2" type="_11c" context="_1" location="f0:17" file="f0" line="17"/>; 
  let $_ := doc($fileToOpen)
  return empty(
    if(boolean($_)) then (
      $fileToOpen//*[let $i := @type return $_//*[$i]]
    ) else (
    )
  )
Comment 7 Michael Kay 2015-12-02 15:24:45 UTC
The minutes of yesterday's telcon record "Mike Kay to resolve bug 27908 after verifying that FODC0005 should be allowed for K2-SeqDocFunc-12."

The sense of the meeting was that I ought to look at it carefully again before closing it in this way.

Looking first at the original query, K-SeqDocFunc-12, there seem to be three possible things that might happen.

(a) the doc() function is not evaluated, because $fileToOpen//* is an empty sequence. In this case the result of the query is an empty sequence.

(b) doc() is loop-lifted out of the predicate, is evaluated as doc(""), and succeeds. The string "" is a valid relative URI, the test suite documentation says that the base URI of the query is that of the containing XML document, so the resolved URI is the URI of the metadata file fn/doc.xml, and doc("") will return the content of this file. In this case the query again returns an empty sequence.

(c) doc() is loop-lifted out of the predicate, is evaluated as doc(""), and fails. I don't think this should happen. The test suite documentation says (under "Accessing Source Documents") "Source documents (including XML files and text files) accessed by a query are generally described as part of the environment... There are a few exceptions to this rule..." I think the implication of the test suite documentation is that if a call on doc() is executed with a URI that points into test suite filestore, it should successfully retrieve the file identified by that URI. There are other tests that assume this behaviour, for example all the tests in app-CatalogCheck.

So I don't think that the call on doc() should fail even if it is loop-lifted (and as stated earlier, I think it's poor practice to loop-lift it anyway). Therefore I want to push back on allowing FODC0005 as a valid result for this test.

Now consider the two tests fn-doc-038 and fn-doc-039 which were added in response to this bug entry. (They are mis-dated 2007. I have re-dated them 2015-01-28, which is the date of the comment reporting their addition to the test suite).

Despite my comment 5, I think it would be very undesirable for an implementation to actually evaluate the call on doc() in either of these two tests. However, expressions in predicates are not protected from evaluation in the same way as expressions in conditional expressions and typeswitch clauses (and that's a conscious WG decision). So I think we are obliged to allow FODC0005 in the case of fn-doc-038, and FODC0002 in the case of fn-doc-039. I have made these changes.

I am leaving the bug open for further WG review.
Comment 8 Michael Kay 2015-12-08 17:06:05 UTC
Decided we would change the variable declaration to contain an invalid URI as its typed value and thereby legitimize the FODC0005 result.
Comment 9 O'Neil Delpratt 2016-01-12 17:04:57 UTC
The test case K2-SeqDocFunc-12 has been modified with the invalid URI and error code FODC0005 added. Comitted to cvs