This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 1705 - [FS] technical: 5.2 Module Declaration: cyclical imports
Summary: [FS] technical: 5.2 Module Declaration: cyclical imports
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Formal Semantics 1.0 (show other bugs)
Version: Last Call drafts
Hardware: All All
: P2 normal
Target Milestone: ---
Assignee: Jerome Simeon
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-07-18 05:21 UTC by Michael Dyck
Modified: 2006-11-13 05:20 UTC (History)
0 users

See Also:


Attachments

Description Michael Dyck 2005-07-18 05:21:35 UTC
5.2 Module Declaration

"Note that the rule above and the rules for static/dynamic processing of
an "import module" declaration in [5.11 Module Import] are mutually
recursive."
    Yes, and it seems to me that if you attempt to perform SCP or DCP on
    cyclically-importing library modules, the inference will not
    terminate.
    Consider the simplest case: main module M imports namespace N, which
    is the target namespace of library module L, which also imports its
    own target namespace N (explicitly allowed by XQuery 4.11).
    SCP of M's "import module N" relies on
        N =>module_statEnv statEnv1
    which relies on
        L =>stat statEnv
    which involves SCP of L's "import module N",
    which relies on
        N =>module_statEnv statEnv1
    etc.

    To handle this, I think you need another phase (call it Export Context
    Processing), a cursory version of SCP:
        Go through the Prolog, collecting all the "exportable" items
        (variables and their types, functions and their signatures).
        You might call the result an "export environment". The important
        thing is that this phase can skip over module imports, so it
        doesn't care about import-cycles.
    Some overarching mechanism would then be responsible for performing
    ECP on all library modules (of interest), and then using the
    resulting export environments to perform SCP on them all (wherein
    a module import simply pulls in the appropriate export environments).

    The only snag is that if a variable is declared without a type, its
    type must be inferred, which requires that its initializing expression
    be normalized and type-analyzed, which could involve imported
    variables or functions, and thus could not be done with the limited
    information available in ECP.

    For instance consider the two modules:
        --------
        module        namespace L1 = "http://www.example.com/foo";
        import module namespace L2 = "http://www.example.com/foo";
        declare variable $L1:s1 = $L2:s2;
        declare variable $L1:d1 = 10.0;
        --------
        module        namespace L2 = "http://www.example.com/foo";
        import module namespace L1 = "http://www.example.com/foo";
        declare variable $L2:d2 = $L1:d1;
        declare variable $L2:s2 = "foo";
        --------
    Note that there is no circularity involved in evaluating the
    variables:
        $s1 = $s2 = "foo"
        $d2 = $d1 = 10.0
    so err:XQST0054 is not applicable. (Or maybe it is; depends on what
    "circularity" means.) But just inferring their types seems pretty
    tricky.

    Perhaps you need to require that, in a library module, a variable
    declaration must have an explicit type. Or if that's too restrictive,
    say it can omit the type as long as the initializing expression
    doesn't depend on imported variables or functions. Or if that's still
    too restrictive, say it can't depend on variables or functions
    defined in another library module with the same target namespace.

    Or maybe just say that if an initializing expression cannot be
    statically typed because of a circularity, then that's an error too.
Comment 1 Mary Holstege 2005-07-20 00:29:37 UTC
We believe that the Formal Semantics specification does not require these
rules, as cyclic module import is explicitly outlawed (see section 4.11 Module 
Import).
 
This is the official response of the XML Query WG and XSL WG. 
 
We appreciate your feedback on the XML Query specifications.  Please let us know
if this response is satisfactory.  If not, please respond to this message,
explaining your concerns.  If so, please mark the comment CLOSED.
Comment 2 Michael Dyck 2005-07-20 00:46:05 UTC
(In reply to comment #1)
> We believe that the Formal Semantics specification does not require these
> rules, as cyclic module import is explicitly outlawed (see section 4.11 Module 
> Import).

Ahem. "However, it is a static error [err:XQST0073] if the graph of module
imports contains a cycle ..., <em>unless all the modules in the cycle share a
common namespace</em>."

Note that the cases I consider fall under the "unless" clause.
Comment 3 Michael Dyck 2005-09-06 20:50:57 UTC
WONTFIX? Uh, shouldn't there be a comment saying *why* you won't fix it?
Comment 4 Paul Cotton 2005-09-06 22:52:27 UTC
We will consider your feedback at a future joint XML Query and XSL WG meeting.

/paulc
Chair, XML Query WG
Comment 5 Jerome Simeon 2005-09-27 08:35:33 UTC
Reopened the bug to reflect the push back.
- Jerome
Comment 6 Jerome Simeon 2005-09-28 10:12:40 UTC
Michael,

Thank you very much for your feedback on this issue. The working group has
looked at it in more details. We agree there is a problem with the current
specification in the context of cyclic imports within the same namespace.

The working groups have decided to adopt the following resolution for it:

(1) adopt your suggested two phase processing for the prolog which splits
export context processing from static context processing.
(2) solves the issue with static analysis of global variables without
a type declaration by relying on the existing constraint in XQuery which
disallow circularity in variable definitions (see XQuery 1.0 Section 4.14
Variable Declaration) and which already is a static constraint.

Please let us know if you find any further issues with this approach.

Thanks,
- Jerome
On behalf of the XSLT and XML Query Working groups
Comment 7 Michael Dyck 2006-05-02 07:27:14 UTC
As far as I can tell, the CR did not resolve this issue.
Comment 8 Jerome Simeon 2006-08-01 12:56:18 UTC
Implemented as suggested. This will be reflected in the next publication.
- Jerome
Comment 9 Michael Dyck 2006-10-05 00:50:05 UTC
To clarify, would you claim that in my example
    declare variable $L1:s1 = $L2:s2;
"the initializing expression cannot be evaluated because of a circularity"?
Comment 10 Michael Dyck 2006-10-05 01:02:16 UTC
(In reply to comment #6)
> 
> The working groups have decided to adopt the following resolution for it:
> 
> (1) adopt your suggested two phase processing for the prolog which splits
> export context processing from static context processing.
> (2) solves the issue with static analysis of global variables without
> a type declaration by relying on the existing constraint in XQuery which
> disallow circularity in variable definitions

These two points are somewhat contradictory. If [err:XQST0054] is the only safeguard against circularity, it sounds like ECP of a type-decl-less variable decl involves a full STA of the initializing expression, which requires the appropriate static environment, which normally results from SCP. So how can you separate ECP from SCP?
Comment 11 Jerome Simeon 2006-10-10 14:56:37 UTC
About #9. I believe those wouldn't be considered a circularity, but that is only my best understanding of what the XQuery 1.0 specification says, as it is not completely formally specified.

About #10. My understanding is that the XQuery 1.0 specification does not impose a particular way to implements the circularity detection. It does not have to be done through the static context as defined in the XQuery Formal Semantics, although a clever implementation might try and do that. I think what the resolution suggests is that this is not formally specified in FS.
Comment 12 Michael Dyck 2006-10-10 21:19:50 UTC
(In reply to comment #11)
> About #9. I believe those wouldn't be considered a circularity, but that is
> only my best understanding of what the XQuery 1.0 specification says, as it is
> not completely formally specified.

If I want an official ruling, do I need to open a separate Bug?
Comment 13 Michael Dyck 2006-10-11 08:28:11 UTC
> About #10. My understanding is that the XQuery 1.0 specification does not
> impose a particular way to implements the circularity detection.

Supposedly, it doesn't impose a particular way to implement *anything*, it just says that an implementation has to get the specified results. I guess what you mean is that, with respect to circularity detection, it's not even saying what the specified results are. So perhaps the exact meaning of "circularity" is effectively implementation-dependent.

> It does not have to be done through the static context as defined in the
> XQuery Formal Semantics, although a clever implementation might try and
> do that. I think what the resolution suggests is that this is not formally
> specified in FS.

That's not surprising, since the FS doesn't formally specify the conditions of most other static errors either. But this bug isn't really about that. Evaluation-circularity is only relevant to this bug insofar as it could conceivably be defined in a way that would outlaw certain cases that otherwise might be hard to do STA on.

Specifically, if my example were deemed illegal (due to circularity, or whatever), then the FS certainly doesn't need to give a STA for it. But if, as you suggest in Comment #11, it *is* legal, then presumably the FS is obliged to say how one could do STA on it. The current FS doesn't. Neither does my original suggestion. So if the next version of the FS merely adopts my suggestion, then it probably won't either.
Comment 14 Jerome Simeon 2006-10-16 18:19:09 UTC
Hi Michael,

Re-reading the whole thread a nth time. It seems now to me that #6 indicates that the case you mention would be raised as a static error. So I'm starting to feel we are going over the same discussion one more time and it's unclear to me that there is any new evidence. Do you still want to suggest some additional changes to the spec(s)?

Obviously, your proposed changes as adopted still need to be implemented.

Thanks,
- Jerome
Comment 15 Michael Dyck 2006-10-17 01:18:46 UTC
> Obviously, your proposed changes as adopted still need to be implemented.

But my proposed changes don't solve the original problem unless you require var decls in library modules to have an explicit type, which Comment #6 indicates you aren't going to do. So as far as I can tell, this is still an unsolved problem.

Specifically, say you introduce a new auxiliary judgment for ECP:

    exportEnv1 |- PrologDecl =>export exportEnv2

(where the structure of an export environment is roughly a subset of the structure of a static environment). Then the rule for a var decl *with* an explicit type might be:

    exportEnv1 |- VarName of var expands to Variable
    Type = [[ SequenceType ]]_sequencetype
    exportEnv2 = exportEnv1 + varType( Variable => Type )
    ---------------------------------
    exportEnv1 |- declare variable $VarName as SequenceType := Expr 
                     =>export exportEnv2

(similar to 5.14 / SCP / rule 2), and that seems to work. But for a var decl *without* an explicit type, you need something like:

    exportEnv1 |- VarName of var expands to Variable
    (( Type is the static type of Expr ))
    exportEnv2 = exportEnv1 + varType( Variable => Type )
    ---------------------------------
    exportEnv1 |- declare variable $VarName := Expr 
                     =>export exportEnv2

Now, for the real premise 2,
    exportEnv1 |- Expr : Type
won't work, because exportEnv deliberately excludes things that you might need to statically type Expr. And if instead you try 
    statEnv |- Expr : Type
then where on earth does statEnv come from? Even if you do manage to say how statEnv comes to the rule (with the appropriate value/content), it seems likely that that would fall prey to the original M-N-L cycle.

(This isn't really new evidence, but perhaps it clarifies my earlier remarks, and gives a more explicit framework on which to hang your response.)
       
Comment 16 Jerome Simeon 2006-10-17 12:58:10 UTC
I think there is clearly a circularity here. Another rule that falls into this is that the scope of global variables is only what follows them. That does not really  directly apply there, but that is just another variant where the circularity comes from the recursive module import.

So I think the example you mention should be rejected as a circularity. Since this bug is opened again, I believe the working groups will look at it one more time.

- Jerome
Comment 17 Michael Dyck 2006-10-17 20:58:05 UTC
(In reply to comment #16)
> I think there is clearly a circularity here. ...
> So I think the example you mention should be rejected as a circularity.

Please clarify whether you're talking about the M-N-L example (i.e., the statement of the original problem), or the L1-L2 example (an example of the "snag" in my proposed solution to the original problem).
Comment 18 Michael Kay 2006-10-17 21:17:16 UTC
The WG spent a long time this evening discussing this bug, without coming to a conclusion. We did come up with a draft definition of "circularity", and on this definition, your example is not a circularity; moreover, I and I think several others on the WG believe that this query is legal. I think we're going to discuss it among ourselves for a few days and will respond further when we've decided we know what we are talking about - there have been too many factually incorrect responses on this thread already!
Comment 19 Michael Dyck 2006-10-18 04:36:49 UTC
(In reply to comment #18)
> The WG spent a long time this evening discussing this bug, without coming to a
> conclusion.

Well, it sounds like it's getting the attention it deserves, so that's good.

> We did come up with a draft definition of "circularity", and on
> this definition, your example is not a circularity; moreover, I and
> I think several others on the WG believe that this query is legal.

My assessment was that if it's legal, then that poses a serious problem for the (formal) specification of its static semantics. Does it seem that way to the WG too?
Comment 20 Jerome Simeon 2006-10-19 18:05:10 UTC
Yes, I did think it was a serious problem with respect to the approach currently taken for the formal semantics. Now in the process of implementing the resolution for Bug #1704, I realized that we could process the variables at the level of the module declaration, i.e., when the static context is built for all the modules in the same namespace.

After implementation of Bug #1704, the rule looks as follows:

AnyURI  is target namespace of modules Module1 ... Modulen
Module1 = module namespace NCName1 = URILiteral; PrologDeclList1
...
Module1 = module namespace NCName1 = URILiteral; PrologDeclListn
dynEnv |-  URILiteral has atomic value AnyURI
statEnvDefault |- declare namespace NCName = URILiteral;
                    PrologDeclList1 =>stat statEnv1 with PrologDeclList1'
...
statEnvn-1 |-  declare namespace NCName = URILiteral;
                    PrologDeclListn =>stat statEnvn with PrologDeclListn'
--------------------------------------------------------------------------------------
AnyURI =>module_statEnv statEnvn

So I am thinking we could add some non-formally specified judgment that would "sort" the variables in such a way that there is no forward references anymore. If there is no circularity, that judgment should be well-defined.

That judgment could look as follows:

process variable order PrologDeclList1,...,PrologDeclListn to PrologDeclList0

this process would make sure that variables are ordered in a way that no variable declaration is declared after it is used. It would leave function declarations unchanged. That would essentially reflect the intent of the corresponding sentence in XQuery:

<<
The initializing expression for a given variable must be evaluated before the evaluation of any expression that references the variable.
>>

This idea is similar to an intuition proposed by Michael Kay.

The final rule would look as follows:

AnyURI  is target namespace of modules Module1 ... Modulen
Module1 = module namespace NCName1 = URILiteral; PrologDeclList1
...
Module1 = module namespace NCName1 = URILiteral; PrologDeclListn
dynEnv |-  URILiteral has atomic value AnyURI
process variable order PrologDeclList1,...,PrologDeclListn to PrologDeclList0
statEnvDefault |- declare namespace NCName = URILiteral;
                    PrologDeclList0 =>stat statEnv with PrologDeclList0'
--------------------------------------------------------------------------------------
AnyURI =>module_statEnv statEnv

which is not that much more complex.

I don't know if I am missing something here.
Feedback on this proposal would be greatly appreciated.

Thanks,
- Jerome


Comment 21 Michael Dyck 2006-10-19 21:23:00 UTC
It seems like it would be a promising approach, but how would the merged prolog deal with clashes between modules on BoundarySpaceDecl, DefaultCollationDecl, BaseURIDecl, etc.?

(I had a much longer response, but that seems to be the major stumbling-block.)
Comment 22 Jerome Simeon 2006-10-24 15:00:42 UTC
I agree that this is an issue.

Another issue with the proposed resolution of course is that it does not take the
two-phase processing of the module declarations into account.

I would think that we should be able to deal with both at the same time, but that requires a bit more work. I will try and come up with a proposal in the next few days.

- Jerome
Comment 23 Jerome Simeon 2006-10-27 20:19:07 UTC
Here is a much more detailed proposal to address that bug. Unfortunately, it does not fully address the very last comment from Michael Dyck for which there is a serious roadblock. (see comment 4 in the proposal overview).

Suggestions and comments would be very much appreciated.

- Jerome


Proposal Overview
=================

High-level characteristics of the proposal:

 (1) it separates the context building for modules in two parts one
part deals with the context internal to a module, and the context
being exported for a set of modules in the same namespace.
 (2) it breaks the circularity for recursive modules.
 (3) it address the issue with forward variable references by
sorting them appropriately.
 (4) it only partially deals with the issue raised by Michael Dyck
about possible conflicts between declarations in multiple prologs. In
fact there is a very serious road block here, as I think the only way
to deal with it would be to keep track of the specific static
environment *per* variable. This makes things increadibly messy to
specific.

Here is a concrete variant of Michael Dyck's example which shows the
problem:

        module        namespace L1 = "http://www.example.com/foo";
        import module namespace L2 = "http://www.example.com/foo";
        declare boundary-space preserve;
        declare variable $L1:s1 = $L2:s2;
        declare variable $L1:d1 = <a>  { "Hello" }  </a>;
        --------
        module        namespace L2 = "http://www.example.com/foo";
        import module namespace L1 = "http://www.example.com/foo";
        declare boundary-space strip;
        declare variable $L2:d2 = $L1:d1;
        declare variable $L2:s2 = <a>  { "World!" }  </a>;

The point of the example is that you need to do two things at the same
time: (i) re-order the variable to remove forward references, (ii)
keep track of the environment *on a per variable basis*.  This last
bit is the road-block as it is again a pretty fundamental departure
from the formalization approach we have taken throughout the FS
documents. Fixing it will requires a lot of surgical hacking on top of
the already complex proposal below.


Detailed Proposal
=================

In what follows 'SCP' means static context processing.

Changes to judgments
--------------------

*** Replace the judgment:

  statEnv1 |- PrologDecl1 =>stat statEnv2

by two new judgments:

  statEnv1 ; AnyURI |- PrologDecl1 =>stat statEnv2
working only over "setters"
and
  statEnv1 ; statEnv2 |- PrologDecl1 =>stat statEnv3 ; statEnv4
working over variable and function declarations, where the first
environment on each side is the local module environment, and the
second environment is the environment being exported.

The addition of the URI in the first judgment is to prevent module
import of a module in the same namespace. Breaking the recursion.


*** Replace the judgment:

  statEnv1 |- PrologDeclList =>stat statEnv2 with PrologDeclList1

By two distinct judgments:
  statEnv1 |- PrologDeclList =>prestat statEnv2 with PrologDeclList1

which is used to pre-process the declarations for a given prolog,
extending the environment with the "setters" but not the variable and
function declarations, and returning all of the prolog normalized.

and
  statEnvn ; statEnvDefault; AnyURI |- PrologDeclList0 =>stat statEnv0

which processes the prolog's variable and functions declarations,
returning the exportable context.


*** Add the new judgment:

  PrologDeclList1' ... PrologDeclListn' preprocess as PrologDeclList0

Which takes a set of *core* prolog declarations, and reorders the
variable declarations so that they do not depend on forward
references. Because they have already been normalized, the rest of the
prolog should not be necessary for evaluation. [[[ WARNING. There is a
very serious road block here as I think that property is not true for
many properties in the static context. That means that the only way to
deal with this is to keep a different static environment for each
variable declarations which is pretty much a nightmare. ]]]


Changes to the introduction of Section 5
----------------------------------------

*** Replace the static context processing rule by the following rules,
distinguishing between the 'setters and declarations' from the
variable and function declarations. The purpose of those rules is
twofold. First build the internal static environment for a given
module. Second, they completely normalize the declarations for that
module based on that local environment.


------------------------------------------
statEnv ; AnyURI |- =>prestat statEnv with

PrologDecl != VariableDecl or FunctionDecl or OptionDecl
[PrologDecl]_PrologDecl  == PrologDecl1
statEnv1 ; AnyURI |- PrologDecl1 =>stat statEnv2
statEnv2 ; AnyURI |- PrologDeclList =>prestat statEnv3 with PrologDeclList1
---------------------------------------------------------------------------
statEnv1 ; AnyURI |- PrologDecl ;
      PrologDeclList =>prestat statEnv3 with PrologDecl1 ; PrologDeclList1

PrologDecl = VariableDecl or FunctionDecl or OptionDecl
[PrologDecl]_PrologDecl  == PrologDecl1
statEnv1 ; AnyURI |- PrologDeclList =>prestat statEnv1 with PrologDeclList1
---------------------------------------------------------------------------
statEnv1 ; AnyURI |- PrologDecl ;
      PrologDeclList =>prestat statEnv1 with PrologDecl1 ; PrologDeclList1


*** Add the following rules, which processes only variable and
function declarations for the purpose of constructing the "exported"
static environment.

-----------------------------------------------
statEnv1 ; statEnv2 ; AnyURI |- =>stat statEnv2


PrologDecl != VariableDecl or FunctionDecl or OptionDecl
statEnv1 ; statEnv2 ; AnyURI |- PrologDeclList =>stat statEnv3
-------------------------------------------------------------------------
statEnv1 ; statEnv2 ; AnyURI |- ProgDecl; PrologDeclList =>stat statEnv3

PrologDecl = VariableDecl or FunctionDecl or OptionDecl
statEnv1 ; statEnv2 ; AnyURI |- PrologDecl1 =>stat statEnv3 ; statEnv4
statEnv3 ; statEnv4 ; AnyURI |- PrologDeclList =>stat statEnv5
--------------------------------------------------------------------------
statEnv1 ; statEnv2 ; AnyURI |- PrologDecl; PrologDeclList =>stat statEnv5


Changes to Section 5.2 Module Declaration
-----------------------------------------

*** Replace the existing SCP rule by:

AnyURI  is target namespace of modules Module1 ... Modulen
Module1 = module namespace NCName1 = URILiteral; PrologDeclList1
...
Modulen = module namespace NCName1 = URILiteral; PrologDeclListn
dynEnv |- URILiteral has atomic value AnyURI
statEnvDefault ; AnyURI |- declare namespace NCName = URILiteral;
                   PrologDeclList1 =>prestat statEnv1 with PrologDeclList1'
...
statEnvn-1 ; AnyURI |- declare namespace NCName = URILiteral;
                   PrologDeclListn =>prestat statEnvn with PrologDeclListn'
PrologDeclList1' ... PrologDeclListn' preprocess as PrologDeclList0
statEnvn ; statEnvDefault; AnyURI |- PrologDeclList0 =>stat statEnv0
-------------------------------------------------------------------------------
AnyURI =>module_statEnv statEnvn0


Changes to Sections
    5.3 Boundary-space Declaration
    5.4 Default Collation Declaration
    5.5 Base URI Declaration
    5.6 Construction Declaration
    5.7 Ordering Mode Declaration
    5.8 Empty Order Declaration
    5.9 Copy-Namespaces Declaration
    5.10 Schema Import
    5.12 Namespace Declaration
    5.13 Default Namespace Declaration
--------------------------------------

*** In SCP, simply replace the use of:

  statEnv1 |- PrologDecl1 =>stat statEnv2
by
  statEnv1 ; AnyURI |- PrologDecl1 =>stat statEnv2


Changes to Section 5.11 Module Import
-------------------------------------

*** In SCP, replace the rule:

  AnyURI1 =>module_statEnv statEnv1
  ...
  AnyURI1 =>module_statEnv statEnvn
  statEnv extended with static environment statEnv1
              yields statEnv1' for uri AnyURI
  ...
  statEnvn-1 extended with static environment statEnvn
              yields statEnvn' for uri AnyURI
  ----------------------------------------------------------------
  statEnv |-  import module AnyURI1 LocationHints? =>stat statEnvn

By:

  AnyURI0 != AnyURI1
  AnyURI1 =>module_statEnv statEnv1
  statEnv extended with static environment statEnv1
              yields statEnv2 for uri AnyURI1
  -------------------------------------------------------------------------
  statEnv ; AnyURI0 |- import module AnyURI1 LocationHints? =>stat statEnv2

and

  AnyURI0 = AnyURI1
  ------------------------------------------------------------------------
  statEnv ; AnyURI0 |- import module AnyURI1 LocationHints? =>stat statEnv

Note that the module does not import its own context within itself
anymore. (see the corresponding treatment at the beginning of the
proposal).


Replace the rule:

  dynEnv |- URILiteral1 has atomic value AnyURI1
  AnyURI1 =>module_statEnv statEnv1
  ...
  AnyURI1 =>module_statEnv statEnvn
  statEnv extended with static environment statEnv1
             yields statEnv1' for uri AnyURI
  ...
  statEnvn-1 extended with static environment statEnvn
             yields statEnvn' for uri AnyURI
  statEnv' = statEnvn' + namespace(NCName => (passive, AnyURI))
  -------------------------------------------------------------
  statEnv |-  import module namespace NCName = URILiteral1
  LocationHints? =>stat statEnvn'

By the rule:

  dynEnv |- URILiteral1 has atomic value AnyURI1
  AnyURI0 != AnyURI1
  AnyURI1 =>module_statEnv statEnv1
  statEnv extended with static environment statEnv1
              yields statEnv2 for uri AnyURI1
  statEnv3 = statEnvn2 + namespace(NCName => (passive, AnyURI1))
  -----------------------------------------------------------------
  statEnv ; AnyURI0 |- import module namespace NCName = URILiteral1
  LocationHints? =>stat statEnv3

and
  dynEnv |- URILiteral1 has atomic value AnyURI1
  AnyURI0 = AnyURI1
  statEnv1 = statEnv + namespace(NCName => (passive, AnyURI1))
  -----------------------------------------------------------------
  statEnv ; AnyURI0 |- import module namespace NCName = URILiteral1
  LocationHints? =>stat statEnv1


Changes to Section 5.14 Variable Declaration
--------------------------------------------

In SCP, change the rule:

  statEnv |- VarName of var expands to Variable
  statEnv |- Expr : Type
  statEnv1 = statEnv + varType(Variable => Type)
  ------------------------------------------------------------
  statEnv |- declare variable $VarName := Expr =>stat statEnv1

by:

  statEnv1 |- VarName of var expands to Variable
  statEnv1 |- Expr : Type
  statEnv3 = statEnv1 + varType(Variable => Type)
  statEnv4 = statEnv2 + varType(Variable => Type)
  -------------------------------------------------------
  statEnv1; statEnv2 |- declare variable $VarName := Expr
                            =>stat statEnv3;statEnv4

Note that both environments are extended by the variable type, but
that the expression is statically typed in statEnv1 which is the local
environment for that module.

Similarly, change the rule:

  statEnv  |-  VarName of var expands to Variable
  Type = [SequenceType]sequencetype
  statEnv |-  Expr : Type2
  statEnv |-  Type2 <: Type
  statEnv1 = statEnv + varType( Variable => Type)
  ----------------------------------------------------------------------------
  statEnv |- declare variable $VarName as SequenceType := Expr =>stat statEnv2

to:

  statEnv1 |-  VarName of var expands to Variable
  Type = [SequenceType]sequencetype
  statEnv1 |-  Expr : Type2
  statEnv1 |-  Type2 <: Type
  statEnv3 = statEnv1 + varType( Variable => Type)
  statEnv4 = statEnv2 + varType( Variable => Type)
  -----------------------------------------------------------------------
  statEnv1; statEnv2 |- declare variable $VarName as SequenceType := Expr
        =>stat statEnv3;statEnv4


Changes to Section 5.15 Function Declaration
--------------------------------------------

In SCP, similar change as for variables. I.e.,
for every rule:

-- Replace: statEnv |- FunctionDecl =>stat statEnv1
by: statEnv1;statEnv2 |- FunctionDecl =>stat statEnv3;statEnv4

-- Replace: statEnv by: statEnv1 above the line.
-- Replace: statEnv1 = statEnv + funcType(expanded-QName => FunctionSig)
by the two lines:
statEnv3 = statEnv1 + funcType(expanded-QName => FunctionSig)
statEnv4 = statEnv2 + funcType(expanded-QName => FunctionSig)


Comment 24 Michael Kay 2006-10-27 20:57:26 UTC
Could you explain the problem in layman's terms? I can't see how rearranging the variable declarations within a module can materially affect the static context for a particular variable declaration. We have:

((DefaultNamespaceDecl | Setter | NamespaceDecl | Import) Separator)* ((VarDecl | FunctionDecl | OptionDecl) Separator)*

which means that the only effect is that a variable is moved relative to another variable, function, or option declaration. Option declarations don't matter because they have no formal semantics. Variables don't matter because rearranging the order of variables is the whole point. That leaves functions. Variables can contain forwards references to functions, but functions can't contain forwards references to variables. So we need to do the rearrangement in such a way that references from functions to variables are validated before the variables are rearranged. That seems to be all. I appreciate that it's much easier to say this in English than in the formal notation of the spec, but it doesn't look insuperable to me.

Michael Kay
Comment 25 Jerome Simeon 2006-10-27 21:38:11 UTC
Michael, let me try.

The issue is to specify which context for which variable. Static typing takes the static context into account. The static context must verify the following conditions:

(1) the static context for a variable must be based on the local static context for that module. This is based on the syntactic location in a given module and therefor the static context should be different for variables in different modules.

(2) the static context must contain the static type for the previous variables, "previous" here is is based on a reordering which is global to *all* the modules in the same namespace.

I don't know how to reconcile those two conditions. It seems the type information for variables and the rest of the static context have to be propagated completely differently. There might be a simple fix, but I can't see it. The complex fix that comes to mind is to split the context at a finer grained throughout the whole spec!

Hope that is helpful,
- Jerome

Comment 26 Michael Dyck 2006-10-27 22:07:20 UTC
Problem with =>prestat: if the prolog contains a VarDecl (or FuncDecl) that contains a function call, then =>prestat depends on [VarDecl]_PrologDecl (or [FuncDecl]_PrologDecl), which depends on [FunctionCall]_Expr, which needs:
-- a statEnv.funcType with entries for all functions visible at that point
(i.e., all functions declared in or imported by the prolog), and also
-- a statEnv.typeDefn with entries for all atomic types visible at that point.

But in the statEnv that =>prestat hands down to []_PrologDecl, the funcType and
typeDefn components will only have entries from statEnvDefault, so normalization of the function call will fail, in general.

It would probably help to rewrite the rules for FunctionCall so that its normalization required little if anything from the statEnv.
Comment 27 Jerome Simeon 2006-10-27 22:51:38 UTC
Michael,

I'm thinking that the issue you mention about function declarations
is already covered by Bug #1743.

And that the typeDefn should be populated properly as it was before as =>prestat
still propagates the static environment through the normalization rules as before.

Am I missing something?

- Jerome
Comment 28 Michael Dyck 2006-10-28 03:15:48 UTC
(In reply to comment #27)
> 
> I'm thinking that the issue you mention about function declarations
> is already covered by Bug #1743.

Yes, but the proposal in Comment #23 doesn't incorporate the fix for Bug 1743. And if you try, you'll find it doesn't quite work when multiple ("sibling") modules share a target namespace, because statEnv.funcType won't have entries for functions defined in the sibling modules, because the "self-import" that would conceptually bring them in is defined (by =>stat) as basically a no-op/pass-through.

And that's just *normalization* of FunctionCalls. You also need a properly defined statEnv.funcType for their STA, which (in this proposal) occurs when =>stat is applied to the merged PrologDeclList0. So you'd need to apply the fix for Bug 1743 there too.

And note that the presence or absence of a self-import in a module does not have the necessary effect on whether its expressions can "see" the var decls and func decls of the sibling modules when you apply =>stat to the merged PrologDeclList0.

> And that the typeDefn should be populated properly as it was before as
> =>prestat still propagates the static environment through the
> normalization rules as before.

Oops, yes, as =>prestat recurses down the prolog, it invokes =>stat on the setters and accumulates the resulting statEnv, so when you get to normalizing FunctionCalls, statEnv.typeDefn will have the types imported by the prolog.

However, because =>prestat's "output" statEnv only reflects the prolog's setters, it's pointless, in the definition for =>module_statEnv, to use one module's output statEnv as another's input statEnv. The input statEnvs should all be statEnvDefault. (So in fact, there's no reason for =>prestat to have an output statEnv at all.)
Comment 29 Jerome Simeon 2006-11-02 17:27:36 UTC
Here is (yet) another proposal trying to break the Gordian knot on bug
1705.  This is a (preliminary) proposal currently being considered by
the XML Query working group at its face to face meeting. Any early
feedback on it would be greatly appreciated.

- Jerome


Overview
--------

The proposal defines circularity between modules based on whether
variables depend on variables or functions in another module.

This notion outlaws the specific example that was posted in comment #1
for that bug. It allows to order modules based on how variables in one
module depend on variables in another module.

Details follows.


Changes to XQuery 1.0 book
--------------------------

(1) Make 'depends on a variable or function' a definition in XQuery
1.0.

(2) Define formally a notion of module dependency as follows.

<<
A module M1 depends on a module M2 if a variable or function in M1
depends on a variable or function in a module M2, where M1 and M2 are
distinct modules.
>>

NOTE: Based on that definition M1 never depends on M1.

(3) Add a restriction to module import:

<<
If a module depends on itself by any combination of transitive
dependencies, a static error is raised.
>>


Changes to XQuery FS book
-------------------------

Note this proposal also takes bug #1743 into account. What follows are
only the main rules, not all specific changes yet.

*** Add the new judgments:

  Module1 ... Modulen reordered as Module1' ... Modulen'

Which takes a set of Modules, and reorders them based on the absence
of circularity. The order is such that variables in a module Modulei
only depends on variables defined in Module1...Modulei-1.

  statEnv1 |- PrologDecls gather function signatures statEnv2

Which extends the static environment with a set of function signatures.



*** In Section 5.2 Module Declaration, replace the existing SCP rule by:

  AnyURI  is target namespace of modules Module1 ... Modulen
  Module1 ... Modulen reordered to Module1' ... Modulen' (***)
  Module1' = module namespace NCName1 = URILiteral; PrologDeclList1
  ...
  Modulen' = module namespace NCName1 = URILiteral; PrologDeclListn
  statEnvDefault |- PrologDecls gather function signatures statEnv0
  statEnv0 ; AnyURI |- declare namespace NCName = URILiteral;
    PrologDeclList1 =>stat statEnv1 with PrologDeclList1'
   ...
  statEnvn-1 ; AnyURI |- declare namespace NCName = URILiteral;
    PrologDeclList1 =>stat statEnvn with PrologDeclList1'
  ------------------------------------------------------------------
  AnyURI =>module_statEnv statEnvn0

(***) Always exists because there isn't circularity

The main idea is that we can alway order the modules so they do not
depend on eeach other, getting rid of circularity and making sure the
simpler version of SCP which goes through modules in order can
terminate.

NOTE: There should be some changes to the judgment:
  statEnv0 ; AnyURI |- declare namespace NCName = URILiteral;
  PrologDeclList1 =>stat statEnvn with PrologDeclList1' so that it
  passes only the relevant part of the static environment between
  modules.


Changes to Section 5.11 Module Import
-------------------------------------

*** In SCP, replace the rule:

  AnyURI1 =>module_statEnv statEnv1
  ...
  AnyURI1 =>module_statEnv statEnvn
  statEnv extended with static environment statEnv1
              yields statEnv1' for uri AnyURI
  ...
  statEnvn-1 extended with static environment statEnvn
              yields statEnvn' for uri AnyURI
  ----------------------------------------------------------------
  statEnv |-  import module AnyURI1 LocationHints? =>stat statEnvn

By:

  AnyURI =>module_statEnv statEnv
  statEnv extended with static environment statEnv1
              yields statEnv1' for uri AnyURI
  ...
  statEnvn-1 extended with static environment statEnvn
              yields statEnvn' for uri AnyURI
  ----------------------------------------------------------------
  statEnv |-  import module AnyURI LocationHints? =>stat statEnvn


Changes to Section 5.14 Variable Declaration
--------------------------------------------

No changes.

Changes to Section 5.15 Function Declaration
--------------------------------------------

No changes
Comment 30 Michael Dyck 2006-11-02 19:33:17 UTC
When you say "No changes", do you mean "No changes from the 2006-06 CR" ?
Comment 31 Jerome Simeon 2006-11-02 19:45:21 UTC
Michael,
This is a loose statement. It more or less means 'no additional
substantive changes should be required specifically to address bug
1705'. It is not a statement indicating that other changes necessary
for any other bugs will be backtracked.
As indicated in the comment, the proposed resolution does not
contain a complete specific set of changes to FS. Any feedback
on the proposal still would be extremely welcome.
Thanks,
- Jerome
Comment 32 Michael Dyck 2006-11-02 20:58:48 UTC
(In reply to comment #29)
>
> Changes to XQuery 1.0 book
> ...
> If a module depends on itself by any combination of transitive
> dependencies, a static error is raised.

Is this in addition to, or instead of, the current "graph of module imports contains a cycle" error? Because there are some cases that are errors according to the current rule that are not errors according to the proposed rule.

One trivial example is
    ----
    module        namespace A = "http://www.example.com/A";
    import module namespace B = "http://www.example.com/B";
    declare variable $A:a = 'a';
    ----
    module        namespace B = "http://www.example.com/B";
    import module namespace A = "http://www.example.com/A";
    declare variable $B:b = 'b';
    ----
There's certainly a cycle in the graph of module imports (and so, an error for the current rule), but because neither module references any of the other's variables or functions, neither "depends on" the other, so the proposed rule raises no error.

Less trivial:
    --------
    module        namespace S1 = "http://www.example.com/siblings";
    declare variable $S1:s1 = 's1';
    --------
    module        namespace X = "http://www.example.com/x";
    import module namespace S = "http://www.example.com/siblings";
    declare variable $X:x1 = $S:s1;
    declare variable $X:x2 = 'x2';
    --------
    module        namespace S2 = "http://www.example.com/siblings";
    import module namespace X  = "http://www.example.com/x";
    declare variable $S2:s2 = $X:x2;
    --------
Module X imports the 'siblings' namespace, and thereby the modules S1 and S2; S2 imports the 'x' namespace and thereby the module X; thus there is a cycle in the import graph (X -> S2 -> X) and the current rule raises an error. However, in the "depends on" graph, we merely have S2 depends on X (because of $x2) and X depends on S1 (because of $s1); there's no cycle, so the proposed rule does not raise an error.
Comment 33 Jerome Simeon 2006-11-02 21:16:36 UTC
There is no proposal to remove other rules about module imports. This is in addition to it.
- Jerome
Comment 34 Michael Dyck 2006-11-02 22:01:41 UTC
It looks like the proposal of Comment #29 is subject to the M-N-L inference cycle that started this whole bug.
Comment 35 Jerome Simeon 2006-11-02 23:13:45 UTC
The suggested resolution as written is a bit misleading on this I agree. I believe
that the basis for the proposal in comment #23 still apply here. We are just looking for a way to deal with the issues with how static environments collide.

So more precisely:

 -- we still need to distinguish imports within the same namespaces from other imports (in the spirit of comment #23) so that the "exported context" for a given namespace will be computed at once.
 -- in addition that "exported context" will be computed based on the approach suggested in comment #29.

- Jerome


Comment 36 Michael Dyck 2006-11-03 00:05:00 UTC
You can break the M-N-L cycle by adding to 5.11 the "AnyURI0 != AnyURI1" etc. machinery from Comment #23. But then you have the problem I raised in Comment #28: the presence or absence of a self-import in a module has no effect on the statEnv that its expressions "see". However, I think you can fix that...

You say "There should be some changes to the =>stat judgment so that it passes only the relevant part of the static environment between modules." One way to do this would be to remix some more stuff from Comment #28. Make the =>stat judgment be:

    statEnv_N_in ; statEnv_E_in ; AnyURI |- PrologDeclList_in
    =>stat
    statEnv_N_out ; statEnv_E_out with PrologDeclList_out

statEnv_N is the "normal" statEnv, used for all the usual purposes.

statEnv_E is the "export" statEnv, containing (conceptually) the funcType and varType of the current module's "siblings". (In reality, it doesn't contain entries for *all* the functions and variables of a module's siblings, but it does contain all the ones that the module depends on, which is what matters.)

The big =>module_statEnv rule in 5.2 would say:

    ...
    statEnvDefault |- PrologDecls gather function signatures statEnv_E_0
    statEnvDefault ; statEnv_E_0 ; AnyURI |- ...1
        =>stat
        statEnv_1 ; statEnv_E_1 with PrologDeclList1'
    ...
    statEnvDefault ; statEnv_E_[n-1] ; AnyURI |- ...n
        =>stat
        statEnv_n ; statEnv_E_n with PrologDeclListn'
    --------
    AnyURI =>module_statEnv statEnvn_E_n

(Note that all modules start out with just statEnvDefault as their "normal" statEnv.)

So that would take care of initializing the export-env and propagating it between sibling modules (and propagating it out to whatever called =>module_statEnv). =>stat would only update it for var decls (also updating the normal statEnv), and would only consult it for self-imports ("AnyURI0 = AnyURI1"), copying its var + func info into statEnv_N_out.
Comment 37 Michael Dyck 2006-11-04 08:57:37 UTC
And while you're thinking about that, here's some other comments:

> Module1 ... Modulen reordered as Module1' ... Modulen'
> The order is such that variables in a module Modulei
> only depends on variables defined in Module1...Modulei-1.

That isn't quite what you want to say. First, it's the "primed" Modules that satisfy the ordering constraint. Second, it doesn't allow variables in Modulei' to depend on other variables in Modulei' itself, or on variables in Modules other than the listed ones, both of which should be legal. Third, it ignores functions depending on variables: if you try to do STA on a function body before you have the type of a variable that it references, the STA fails.

So you could say:
    The order is such that variables and functions in module Modulei'
    don't depend on variables defined in Modulei+1'...Modulen'.

But it would be simpler to say
    The order is such that Modulei' does not depend on any of
    Modulei+1'...Modulen'.
Or equivalently,
    The order is such that if Modulei' depends on Modulej', then i >= j.

Yes, it's a slightly stronger constraint on the ordering, but the net effect is the same, and since the non-circularity restriction is defined on the Module-Module version of "depends on", it's easier to see that this version (rather than the variables+functions version) of the ordering constraint is guaranteed possible.

------

> AnyURI  is target namespace of modules Module1 ... Modulen
> Module1 ... Modulen reordered to Module1' ... Modulen'

You could merge these two judgments. Instead of introducing the "reordered to" judgment, just say that the list of Modules "returned" by the "is target namespace of" judgment is in the constrained order.

------

> Module1' = module namespace NCName1 = URILiteral; PrologDeclList1
> ...
> Modulen' = module namespace NCName1 = URILiteral; PrologDeclListn

The second NCName1 should be NCNamen, since sibling modules don't have to use the same prefix for their target namespace. And the URILiterals should be subscripted too, since those don't have to be exactly the same either (although they do have to identify the same target namespace, of course).

------

> statEnv0 ; AnyURI |- declare namespace NCName = URILiteral;
>     PrologDeclList1 =>stat statEnv1 with PrologDeclList1'
>  ...
> statEnvn-1 ; AnyURI |- declare namespace NCName = URILiteral;
>     PrologDeclList1 =>stat statEnvn with PrologDeclList1'

In the last premise, the subscripts '1' should be 'n-1'.
Also, the NCNames and URILiterals should be subscripted.

------

> AnyURI =>module_statEnv statEnv
> statEnv extended with static environment statEnv1
>             yields statEnv1' for uri AnyURI
> ...
> statEnvn-1 extended with static environment statEnvn
>             yields statEnvn' for uri AnyURI
> ----------------------------------------------------------------
> statEnv |- import module AnyURI LocationHints? =>stat statEnvn

-- There aren't statEnv1 ... statEnvn any more.
-- You don't want to require that the statEnv returned by =>module_statEnv be the same as the statEnv in the conclusion.
-- An AnyURI is not a URILiteral.

So that should be:

    URILiteral has atomic value AnyURI
    AnyURI =>module_statEnv statEnv2
    statEnv1 extended with static environment statEnv2
                yields statEnv3 for uri AnyURI
    ----------------------------------------------------------------
    statEnv1 |- import module URILiteral LocationHints? =>stat statEnv3
Comment 38 Jerome Simeon 2006-11-08 00:53:50 UTC
Hi Michael,
Yes I think your comment and suggestions totally make sense there. I think that's more or less what I had in mind.
Thanks for the suggestions. I do believe that addresses your earlier problem in #28. Please let me know asap if you still see other issues as I am trying to produce a new version of the document as we speak.
- Jerome

(In reply to comment #36)
> You can break the M-N-L cycle by adding to 5.11 the "AnyURI0 != AnyURI1" etc.
> machinery from Comment #23. But then you have the problem I raised in Comment
> #28: the presence or absence of a self-import in a module has no effect on the
> statEnv that its expressions "see". However, I think you can fix that...
> 
> You say "There should be some changes to the =>stat judgment so that it passes
> only the relevant part of the static environment between modules." One way to
> do this would be to remix some more stuff from Comment #28. Make the =>stat
> judgment be:
> 
>     statEnv_N_in ; statEnv_E_in ; AnyURI |- PrologDeclList_in
>     =>stat
>     statEnv_N_out ; statEnv_E_out with PrologDeclList_out
> 
> statEnv_N is the "normal" statEnv, used for all the usual purposes.
> 
> statEnv_E is the "export" statEnv, containing (conceptually) the funcType and
> varType of the current module's "siblings". (In reality, it doesn't contain
> entries for *all* the functions and variables of a module's siblings, but it
> does contain all the ones that the module depends on, which is what matters.)
> 
> The big =>module_statEnv rule in 5.2 would say:
> 
>     ...
>     statEnvDefault |- PrologDecls gather function signatures statEnv_E_0
>     statEnvDefault ; statEnv_E_0 ; AnyURI |- ...1
>         =>stat
>         statEnv_1 ; statEnv_E_1 with PrologDeclList1'
>     ...
>     statEnvDefault ; statEnv_E_[n-1] ; AnyURI |- ...n
>         =>stat
>         statEnv_n ; statEnv_E_n with PrologDeclListn'
>     --------
>     AnyURI =>module_statEnv statEnvn_E_n
> 
> (Note that all modules start out with just statEnvDefault as their "normal"
> statEnv.)
> 
> So that would take care of initializing the export-env and propagating it
> between sibling modules (and propagating it out to whatever called
> =>module_statEnv). =>stat would only update it for var decls (also updating the
> normal statEnv), and would only consult it for self-imports ("AnyURI0 =
> AnyURI1"), copying its var + func info into statEnv_N_out.
> 

Comment 39 Michael Dyck 2006-11-08 02:50:07 UTC
(In reply to comment #38)
> Please let me know asap if you still see other issues as I am trying to
> produce a new version of the document as we speak.

What did you have in mind for defining "gather function signatures"?
Comment 40 Jerome Simeon 2006-11-09 16:40:47 UTC
(In reply to comment #39)
> (In reply to comment #38)
> > Please let me know asap if you still see other issues as I am trying to
> > produce a new version of the document as we speak.
> 
> What did you have in mind for defining "gather function signatures"?
> 

This is essentially the proposed resolution for #1743. Possibly there is some consolidation that can happen between the various passes over the prolog, but the idea is to first get an environment with all the function declarations.
- Jerome
Comment 41 Michael Dyck 2006-11-10 21:22:35 UTC
(In reply to comment #40)
> >
> > What did you have in mind for defining "gather function signatures"?
> 
> This is essentially the proposed resolution for #1743.

As it stands, the =>stat1 judgment in Bug #1743 puts imported variables and functions in its output env (which made sense for 1743, but is not wanted here), and also will fail in the presence of an import cycle. To fix both of these problems, you need to modify the judgment so that it does minimal processing of prolog decls:

-- For decls that introduce a namespace binding, just add that binding to statEnv.namespace (or statEnv.default_elem_namespace).

-- For function decls, just construct the signature and add it to statEnv.funcType. (This will consult the statEnv, but should only need the namespace bindings from the previous point, to expand QNames.)

-- For all other decls, it's a pass-through.
Comment 42 Jerome Simeon 2006-11-13 05:17:19 UTC
This has been implemented. Here is a sketch of the implementation:

(1) added a judgment:

 statEnv1 |-   PrologDeclList =>sigs statEnv2  

which 'gather' function signatures, as suggested in comment #41.

(2) fixed the main module declaration rule, as suggested in comment #29.

(3) fixed the mapping from URIs to module declarations to return modules in order, as suggested in comment #37, relying on the definition of dependency specified in the XQuery 1.0 book.

(4) broke the module circularity similarly to what was suggested in comments #36.

- Jerome
Comment 43 Jerome Simeon 2006-11-13 05:18:54 UTC
Also, an important part of the implementation is:

(5) split the static context processing for function declarations to:
  -- part1, used to compute the signature used in the judgment =>sigs.
  -- part2, simplified the function declaration rule to access the function signature directly from the context as it has been processed earlier.

- Jerome