This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
5.2 Module Declaration "Note that the rule above and the rules for static/dynamic processing of an "import module" declaration in [5.11 Module Import] are mutually recursive." Yes, and it seems to me that if you attempt to perform SCP or DCP on cyclically-importing library modules, the inference will not terminate. Consider the simplest case: main module M imports namespace N, which is the target namespace of library module L, which also imports its own target namespace N (explicitly allowed by XQuery 4.11). SCP of M's "import module N" relies on N =>module_statEnv statEnv1 which relies on L =>stat statEnv which involves SCP of L's "import module N", which relies on N =>module_statEnv statEnv1 etc. To handle this, I think you need another phase (call it Export Context Processing), a cursory version of SCP: Go through the Prolog, collecting all the "exportable" items (variables and their types, functions and their signatures). You might call the result an "export environment". The important thing is that this phase can skip over module imports, so it doesn't care about import-cycles. Some overarching mechanism would then be responsible for performing ECP on all library modules (of interest), and then using the resulting export environments to perform SCP on them all (wherein a module import simply pulls in the appropriate export environments). The only snag is that if a variable is declared without a type, its type must be inferred, which requires that its initializing expression be normalized and type-analyzed, which could involve imported variables or functions, and thus could not be done with the limited information available in ECP. For instance consider the two modules: -------- module namespace L1 = "http://www.example.com/foo"; import module namespace L2 = "http://www.example.com/foo"; declare variable $L1:s1 = $L2:s2; declare variable $L1:d1 = 10.0; -------- module namespace L2 = "http://www.example.com/foo"; import module namespace L1 = "http://www.example.com/foo"; declare variable $L2:d2 = $L1:d1; declare variable $L2:s2 = "foo"; -------- Note that there is no circularity involved in evaluating the variables: $s1 = $s2 = "foo" $d2 = $d1 = 10.0 so err:XQST0054 is not applicable. (Or maybe it is; depends on what "circularity" means.) But just inferring their types seems pretty tricky. Perhaps you need to require that, in a library module, a variable declaration must have an explicit type. Or if that's too restrictive, say it can omit the type as long as the initializing expression doesn't depend on imported variables or functions. Or if that's still too restrictive, say it can't depend on variables or functions defined in another library module with the same target namespace. Or maybe just say that if an initializing expression cannot be statically typed because of a circularity, then that's an error too.
We believe that the Formal Semantics specification does not require these rules, as cyclic module import is explicitly outlawed (see section 4.11 Module Import). This is the official response of the XML Query WG and XSL WG. We appreciate your feedback on the XML Query specifications. Please let us know if this response is satisfactory. If not, please respond to this message, explaining your concerns. If so, please mark the comment CLOSED.
(In reply to comment #1) > We believe that the Formal Semantics specification does not require these > rules, as cyclic module import is explicitly outlawed (see section 4.11 Module > Import). Ahem. "However, it is a static error [err:XQST0073] if the graph of module imports contains a cycle ..., <em>unless all the modules in the cycle share a common namespace</em>." Note that the cases I consider fall under the "unless" clause.
WONTFIX? Uh, shouldn't there be a comment saying *why* you won't fix it?
We will consider your feedback at a future joint XML Query and XSL WG meeting. /paulc Chair, XML Query WG
Reopened the bug to reflect the push back. - Jerome
Michael, Thank you very much for your feedback on this issue. The working group has looked at it in more details. We agree there is a problem with the current specification in the context of cyclic imports within the same namespace. The working groups have decided to adopt the following resolution for it: (1) adopt your suggested two phase processing for the prolog which splits export context processing from static context processing. (2) solves the issue with static analysis of global variables without a type declaration by relying on the existing constraint in XQuery which disallow circularity in variable definitions (see XQuery 1.0 Section 4.14 Variable Declaration) and which already is a static constraint. Please let us know if you find any further issues with this approach. Thanks, - Jerome On behalf of the XSLT and XML Query Working groups
As far as I can tell, the CR did not resolve this issue.
Implemented as suggested. This will be reflected in the next publication. - Jerome
To clarify, would you claim that in my example declare variable $L1:s1 = $L2:s2; "the initializing expression cannot be evaluated because of a circularity"?
(In reply to comment #6) > > The working groups have decided to adopt the following resolution for it: > > (1) adopt your suggested two phase processing for the prolog which splits > export context processing from static context processing. > (2) solves the issue with static analysis of global variables without > a type declaration by relying on the existing constraint in XQuery which > disallow circularity in variable definitions These two points are somewhat contradictory. If [err:XQST0054] is the only safeguard against circularity, it sounds like ECP of a type-decl-less variable decl involves a full STA of the initializing expression, which requires the appropriate static environment, which normally results from SCP. So how can you separate ECP from SCP?
About #9. I believe those wouldn't be considered a circularity, but that is only my best understanding of what the XQuery 1.0 specification says, as it is not completely formally specified. About #10. My understanding is that the XQuery 1.0 specification does not impose a particular way to implements the circularity detection. It does not have to be done through the static context as defined in the XQuery Formal Semantics, although a clever implementation might try and do that. I think what the resolution suggests is that this is not formally specified in FS.
(In reply to comment #11) > About #9. I believe those wouldn't be considered a circularity, but that is > only my best understanding of what the XQuery 1.0 specification says, as it is > not completely formally specified. If I want an official ruling, do I need to open a separate Bug?
> About #10. My understanding is that the XQuery 1.0 specification does not > impose a particular way to implements the circularity detection. Supposedly, it doesn't impose a particular way to implement *anything*, it just says that an implementation has to get the specified results. I guess what you mean is that, with respect to circularity detection, it's not even saying what the specified results are. So perhaps the exact meaning of "circularity" is effectively implementation-dependent. > It does not have to be done through the static context as defined in the > XQuery Formal Semantics, although a clever implementation might try and > do that. I think what the resolution suggests is that this is not formally > specified in FS. That's not surprising, since the FS doesn't formally specify the conditions of most other static errors either. But this bug isn't really about that. Evaluation-circularity is only relevant to this bug insofar as it could conceivably be defined in a way that would outlaw certain cases that otherwise might be hard to do STA on. Specifically, if my example were deemed illegal (due to circularity, or whatever), then the FS certainly doesn't need to give a STA for it. But if, as you suggest in Comment #11, it *is* legal, then presumably the FS is obliged to say how one could do STA on it. The current FS doesn't. Neither does my original suggestion. So if the next version of the FS merely adopts my suggestion, then it probably won't either.
Hi Michael, Re-reading the whole thread a nth time. It seems now to me that #6 indicates that the case you mention would be raised as a static error. So I'm starting to feel we are going over the same discussion one more time and it's unclear to me that there is any new evidence. Do you still want to suggest some additional changes to the spec(s)? Obviously, your proposed changes as adopted still need to be implemented. Thanks, - Jerome
> Obviously, your proposed changes as adopted still need to be implemented. But my proposed changes don't solve the original problem unless you require var decls in library modules to have an explicit type, which Comment #6 indicates you aren't going to do. So as far as I can tell, this is still an unsolved problem. Specifically, say you introduce a new auxiliary judgment for ECP: exportEnv1 |- PrologDecl =>export exportEnv2 (where the structure of an export environment is roughly a subset of the structure of a static environment). Then the rule for a var decl *with* an explicit type might be: exportEnv1 |- VarName of var expands to Variable Type = [[ SequenceType ]]_sequencetype exportEnv2 = exportEnv1 + varType( Variable => Type ) --------------------------------- exportEnv1 |- declare variable $VarName as SequenceType := Expr =>export exportEnv2 (similar to 5.14 / SCP / rule 2), and that seems to work. But for a var decl *without* an explicit type, you need something like: exportEnv1 |- VarName of var expands to Variable (( Type is the static type of Expr )) exportEnv2 = exportEnv1 + varType( Variable => Type ) --------------------------------- exportEnv1 |- declare variable $VarName := Expr =>export exportEnv2 Now, for the real premise 2, exportEnv1 |- Expr : Type won't work, because exportEnv deliberately excludes things that you might need to statically type Expr. And if instead you try statEnv |- Expr : Type then where on earth does statEnv come from? Even if you do manage to say how statEnv comes to the rule (with the appropriate value/content), it seems likely that that would fall prey to the original M-N-L cycle. (This isn't really new evidence, but perhaps it clarifies my earlier remarks, and gives a more explicit framework on which to hang your response.)
I think there is clearly a circularity here. Another rule that falls into this is that the scope of global variables is only what follows them. That does not really directly apply there, but that is just another variant where the circularity comes from the recursive module import. So I think the example you mention should be rejected as a circularity. Since this bug is opened again, I believe the working groups will look at it one more time. - Jerome
(In reply to comment #16) > I think there is clearly a circularity here. ... > So I think the example you mention should be rejected as a circularity. Please clarify whether you're talking about the M-N-L example (i.e., the statement of the original problem), or the L1-L2 example (an example of the "snag" in my proposed solution to the original problem).
The WG spent a long time this evening discussing this bug, without coming to a conclusion. We did come up with a draft definition of "circularity", and on this definition, your example is not a circularity; moreover, I and I think several others on the WG believe that this query is legal. I think we're going to discuss it among ourselves for a few days and will respond further when we've decided we know what we are talking about - there have been too many factually incorrect responses on this thread already!
(In reply to comment #18) > The WG spent a long time this evening discussing this bug, without coming to a > conclusion. Well, it sounds like it's getting the attention it deserves, so that's good. > We did come up with a draft definition of "circularity", and on > this definition, your example is not a circularity; moreover, I and > I think several others on the WG believe that this query is legal. My assessment was that if it's legal, then that poses a serious problem for the (formal) specification of its static semantics. Does it seem that way to the WG too?
Yes, I did think it was a serious problem with respect to the approach currently taken for the formal semantics. Now in the process of implementing the resolution for Bug #1704, I realized that we could process the variables at the level of the module declaration, i.e., when the static context is built for all the modules in the same namespace. After implementation of Bug #1704, the rule looks as follows: AnyURI is target namespace of modules Module1 ... Modulen Module1 = module namespace NCName1 = URILiteral; PrologDeclList1 ... Module1 = module namespace NCName1 = URILiteral; PrologDeclListn dynEnv |- URILiteral has atomic value AnyURI statEnvDefault |- declare namespace NCName = URILiteral; PrologDeclList1 =>stat statEnv1 with PrologDeclList1' ... statEnvn-1 |- declare namespace NCName = URILiteral; PrologDeclListn =>stat statEnvn with PrologDeclListn' -------------------------------------------------------------------------------------- AnyURI =>module_statEnv statEnvn So I am thinking we could add some non-formally specified judgment that would "sort" the variables in such a way that there is no forward references anymore. If there is no circularity, that judgment should be well-defined. That judgment could look as follows: process variable order PrologDeclList1,...,PrologDeclListn to PrologDeclList0 this process would make sure that variables are ordered in a way that no variable declaration is declared after it is used. It would leave function declarations unchanged. That would essentially reflect the intent of the corresponding sentence in XQuery: << The initializing expression for a given variable must be evaluated before the evaluation of any expression that references the variable. >> This idea is similar to an intuition proposed by Michael Kay. The final rule would look as follows: AnyURI is target namespace of modules Module1 ... Modulen Module1 = module namespace NCName1 = URILiteral; PrologDeclList1 ... Module1 = module namespace NCName1 = URILiteral; PrologDeclListn dynEnv |- URILiteral has atomic value AnyURI process variable order PrologDeclList1,...,PrologDeclListn to PrologDeclList0 statEnvDefault |- declare namespace NCName = URILiteral; PrologDeclList0 =>stat statEnv with PrologDeclList0' -------------------------------------------------------------------------------------- AnyURI =>module_statEnv statEnv which is not that much more complex. I don't know if I am missing something here. Feedback on this proposal would be greatly appreciated. Thanks, - Jerome
It seems like it would be a promising approach, but how would the merged prolog deal with clashes between modules on BoundarySpaceDecl, DefaultCollationDecl, BaseURIDecl, etc.? (I had a much longer response, but that seems to be the major stumbling-block.)
I agree that this is an issue. Another issue with the proposed resolution of course is that it does not take the two-phase processing of the module declarations into account. I would think that we should be able to deal with both at the same time, but that requires a bit more work. I will try and come up with a proposal in the next few days. - Jerome
Here is a much more detailed proposal to address that bug. Unfortunately, it does not fully address the very last comment from Michael Dyck for which there is a serious roadblock. (see comment 4 in the proposal overview). Suggestions and comments would be very much appreciated. - Jerome Proposal Overview ================= High-level characteristics of the proposal: (1) it separates the context building for modules in two parts one part deals with the context internal to a module, and the context being exported for a set of modules in the same namespace. (2) it breaks the circularity for recursive modules. (3) it address the issue with forward variable references by sorting them appropriately. (4) it only partially deals with the issue raised by Michael Dyck about possible conflicts between declarations in multiple prologs. In fact there is a very serious road block here, as I think the only way to deal with it would be to keep track of the specific static environment *per* variable. This makes things increadibly messy to specific. Here is a concrete variant of Michael Dyck's example which shows the problem: module namespace L1 = "http://www.example.com/foo"; import module namespace L2 = "http://www.example.com/foo"; declare boundary-space preserve; declare variable $L1:s1 = $L2:s2; declare variable $L1:d1 = <a> { "Hello" } </a>; -------- module namespace L2 = "http://www.example.com/foo"; import module namespace L1 = "http://www.example.com/foo"; declare boundary-space strip; declare variable $L2:d2 = $L1:d1; declare variable $L2:s2 = <a> { "World!" } </a>; The point of the example is that you need to do two things at the same time: (i) re-order the variable to remove forward references, (ii) keep track of the environment *on a per variable basis*. This last bit is the road-block as it is again a pretty fundamental departure from the formalization approach we have taken throughout the FS documents. Fixing it will requires a lot of surgical hacking on top of the already complex proposal below. Detailed Proposal ================= In what follows 'SCP' means static context processing. Changes to judgments -------------------- *** Replace the judgment: statEnv1 |- PrologDecl1 =>stat statEnv2 by two new judgments: statEnv1 ; AnyURI |- PrologDecl1 =>stat statEnv2 working only over "setters" and statEnv1 ; statEnv2 |- PrologDecl1 =>stat statEnv3 ; statEnv4 working over variable and function declarations, where the first environment on each side is the local module environment, and the second environment is the environment being exported. The addition of the URI in the first judgment is to prevent module import of a module in the same namespace. Breaking the recursion. *** Replace the judgment: statEnv1 |- PrologDeclList =>stat statEnv2 with PrologDeclList1 By two distinct judgments: statEnv1 |- PrologDeclList =>prestat statEnv2 with PrologDeclList1 which is used to pre-process the declarations for a given prolog, extending the environment with the "setters" but not the variable and function declarations, and returning all of the prolog normalized. and statEnvn ; statEnvDefault; AnyURI |- PrologDeclList0 =>stat statEnv0 which processes the prolog's variable and functions declarations, returning the exportable context. *** Add the new judgment: PrologDeclList1' ... PrologDeclListn' preprocess as PrologDeclList0 Which takes a set of *core* prolog declarations, and reorders the variable declarations so that they do not depend on forward references. Because they have already been normalized, the rest of the prolog should not be necessary for evaluation. [[[ WARNING. There is a very serious road block here as I think that property is not true for many properties in the static context. That means that the only way to deal with this is to keep a different static environment for each variable declarations which is pretty much a nightmare. ]]] Changes to the introduction of Section 5 ---------------------------------------- *** Replace the static context processing rule by the following rules, distinguishing between the 'setters and declarations' from the variable and function declarations. The purpose of those rules is twofold. First build the internal static environment for a given module. Second, they completely normalize the declarations for that module based on that local environment. ------------------------------------------ statEnv ; AnyURI |- =>prestat statEnv with PrologDecl != VariableDecl or FunctionDecl or OptionDecl [PrologDecl]_PrologDecl == PrologDecl1 statEnv1 ; AnyURI |- PrologDecl1 =>stat statEnv2 statEnv2 ; AnyURI |- PrologDeclList =>prestat statEnv3 with PrologDeclList1 --------------------------------------------------------------------------- statEnv1 ; AnyURI |- PrologDecl ; PrologDeclList =>prestat statEnv3 with PrologDecl1 ; PrologDeclList1 PrologDecl = VariableDecl or FunctionDecl or OptionDecl [PrologDecl]_PrologDecl == PrologDecl1 statEnv1 ; AnyURI |- PrologDeclList =>prestat statEnv1 with PrologDeclList1 --------------------------------------------------------------------------- statEnv1 ; AnyURI |- PrologDecl ; PrologDeclList =>prestat statEnv1 with PrologDecl1 ; PrologDeclList1 *** Add the following rules, which processes only variable and function declarations for the purpose of constructing the "exported" static environment. ----------------------------------------------- statEnv1 ; statEnv2 ; AnyURI |- =>stat statEnv2 PrologDecl != VariableDecl or FunctionDecl or OptionDecl statEnv1 ; statEnv2 ; AnyURI |- PrologDeclList =>stat statEnv3 ------------------------------------------------------------------------- statEnv1 ; statEnv2 ; AnyURI |- ProgDecl; PrologDeclList =>stat statEnv3 PrologDecl = VariableDecl or FunctionDecl or OptionDecl statEnv1 ; statEnv2 ; AnyURI |- PrologDecl1 =>stat statEnv3 ; statEnv4 statEnv3 ; statEnv4 ; AnyURI |- PrologDeclList =>stat statEnv5 -------------------------------------------------------------------------- statEnv1 ; statEnv2 ; AnyURI |- PrologDecl; PrologDeclList =>stat statEnv5 Changes to Section 5.2 Module Declaration ----------------------------------------- *** Replace the existing SCP rule by: AnyURI is target namespace of modules Module1 ... Modulen Module1 = module namespace NCName1 = URILiteral; PrologDeclList1 ... Modulen = module namespace NCName1 = URILiteral; PrologDeclListn dynEnv |- URILiteral has atomic value AnyURI statEnvDefault ; AnyURI |- declare namespace NCName = URILiteral; PrologDeclList1 =>prestat statEnv1 with PrologDeclList1' ... statEnvn-1 ; AnyURI |- declare namespace NCName = URILiteral; PrologDeclListn =>prestat statEnvn with PrologDeclListn' PrologDeclList1' ... PrologDeclListn' preprocess as PrologDeclList0 statEnvn ; statEnvDefault; AnyURI |- PrologDeclList0 =>stat statEnv0 ------------------------------------------------------------------------------- AnyURI =>module_statEnv statEnvn0 Changes to Sections 5.3 Boundary-space Declaration 5.4 Default Collation Declaration 5.5 Base URI Declaration 5.6 Construction Declaration 5.7 Ordering Mode Declaration 5.8 Empty Order Declaration 5.9 Copy-Namespaces Declaration 5.10 Schema Import 5.12 Namespace Declaration 5.13 Default Namespace Declaration -------------------------------------- *** In SCP, simply replace the use of: statEnv1 |- PrologDecl1 =>stat statEnv2 by statEnv1 ; AnyURI |- PrologDecl1 =>stat statEnv2 Changes to Section 5.11 Module Import ------------------------------------- *** In SCP, replace the rule: AnyURI1 =>module_statEnv statEnv1 ... AnyURI1 =>module_statEnv statEnvn statEnv extended with static environment statEnv1 yields statEnv1' for uri AnyURI ... statEnvn-1 extended with static environment statEnvn yields statEnvn' for uri AnyURI ---------------------------------------------------------------- statEnv |- import module AnyURI1 LocationHints? =>stat statEnvn By: AnyURI0 != AnyURI1 AnyURI1 =>module_statEnv statEnv1 statEnv extended with static environment statEnv1 yields statEnv2 for uri AnyURI1 ------------------------------------------------------------------------- statEnv ; AnyURI0 |- import module AnyURI1 LocationHints? =>stat statEnv2 and AnyURI0 = AnyURI1 ------------------------------------------------------------------------ statEnv ; AnyURI0 |- import module AnyURI1 LocationHints? =>stat statEnv Note that the module does not import its own context within itself anymore. (see the corresponding treatment at the beginning of the proposal). Replace the rule: dynEnv |- URILiteral1 has atomic value AnyURI1 AnyURI1 =>module_statEnv statEnv1 ... AnyURI1 =>module_statEnv statEnvn statEnv extended with static environment statEnv1 yields statEnv1' for uri AnyURI ... statEnvn-1 extended with static environment statEnvn yields statEnvn' for uri AnyURI statEnv' = statEnvn' + namespace(NCName => (passive, AnyURI)) ------------------------------------------------------------- statEnv |- import module namespace NCName = URILiteral1 LocationHints? =>stat statEnvn' By the rule: dynEnv |- URILiteral1 has atomic value AnyURI1 AnyURI0 != AnyURI1 AnyURI1 =>module_statEnv statEnv1 statEnv extended with static environment statEnv1 yields statEnv2 for uri AnyURI1 statEnv3 = statEnvn2 + namespace(NCName => (passive, AnyURI1)) ----------------------------------------------------------------- statEnv ; AnyURI0 |- import module namespace NCName = URILiteral1 LocationHints? =>stat statEnv3 and dynEnv |- URILiteral1 has atomic value AnyURI1 AnyURI0 = AnyURI1 statEnv1 = statEnv + namespace(NCName => (passive, AnyURI1)) ----------------------------------------------------------------- statEnv ; AnyURI0 |- import module namespace NCName = URILiteral1 LocationHints? =>stat statEnv1 Changes to Section 5.14 Variable Declaration -------------------------------------------- In SCP, change the rule: statEnv |- VarName of var expands to Variable statEnv |- Expr : Type statEnv1 = statEnv + varType(Variable => Type) ------------------------------------------------------------ statEnv |- declare variable $VarName := Expr =>stat statEnv1 by: statEnv1 |- VarName of var expands to Variable statEnv1 |- Expr : Type statEnv3 = statEnv1 + varType(Variable => Type) statEnv4 = statEnv2 + varType(Variable => Type) ------------------------------------------------------- statEnv1; statEnv2 |- declare variable $VarName := Expr =>stat statEnv3;statEnv4 Note that both environments are extended by the variable type, but that the expression is statically typed in statEnv1 which is the local environment for that module. Similarly, change the rule: statEnv |- VarName of var expands to Variable Type = [SequenceType]sequencetype statEnv |- Expr : Type2 statEnv |- Type2 <: Type statEnv1 = statEnv + varType( Variable => Type) ---------------------------------------------------------------------------- statEnv |- declare variable $VarName as SequenceType := Expr =>stat statEnv2 to: statEnv1 |- VarName of var expands to Variable Type = [SequenceType]sequencetype statEnv1 |- Expr : Type2 statEnv1 |- Type2 <: Type statEnv3 = statEnv1 + varType( Variable => Type) statEnv4 = statEnv2 + varType( Variable => Type) ----------------------------------------------------------------------- statEnv1; statEnv2 |- declare variable $VarName as SequenceType := Expr =>stat statEnv3;statEnv4 Changes to Section 5.15 Function Declaration -------------------------------------------- In SCP, similar change as for variables. I.e., for every rule: -- Replace: statEnv |- FunctionDecl =>stat statEnv1 by: statEnv1;statEnv2 |- FunctionDecl =>stat statEnv3;statEnv4 -- Replace: statEnv by: statEnv1 above the line. -- Replace: statEnv1 = statEnv + funcType(expanded-QName => FunctionSig) by the two lines: statEnv3 = statEnv1 + funcType(expanded-QName => FunctionSig) statEnv4 = statEnv2 + funcType(expanded-QName => FunctionSig)
Could you explain the problem in layman's terms? I can't see how rearranging the variable declarations within a module can materially affect the static context for a particular variable declaration. We have: ((DefaultNamespaceDecl | Setter | NamespaceDecl | Import) Separator)* ((VarDecl | FunctionDecl | OptionDecl) Separator)* which means that the only effect is that a variable is moved relative to another variable, function, or option declaration. Option declarations don't matter because they have no formal semantics. Variables don't matter because rearranging the order of variables is the whole point. That leaves functions. Variables can contain forwards references to functions, but functions can't contain forwards references to variables. So we need to do the rearrangement in such a way that references from functions to variables are validated before the variables are rearranged. That seems to be all. I appreciate that it's much easier to say this in English than in the formal notation of the spec, but it doesn't look insuperable to me. Michael Kay
Michael, let me try. The issue is to specify which context for which variable. Static typing takes the static context into account. The static context must verify the following conditions: (1) the static context for a variable must be based on the local static context for that module. This is based on the syntactic location in a given module and therefor the static context should be different for variables in different modules. (2) the static context must contain the static type for the previous variables, "previous" here is is based on a reordering which is global to *all* the modules in the same namespace. I don't know how to reconcile those two conditions. It seems the type information for variables and the rest of the static context have to be propagated completely differently. There might be a simple fix, but I can't see it. The complex fix that comes to mind is to split the context at a finer grained throughout the whole spec! Hope that is helpful, - Jerome
Problem with =>prestat: if the prolog contains a VarDecl (or FuncDecl) that contains a function call, then =>prestat depends on [VarDecl]_PrologDecl (or [FuncDecl]_PrologDecl), which depends on [FunctionCall]_Expr, which needs: -- a statEnv.funcType with entries for all functions visible at that point (i.e., all functions declared in or imported by the prolog), and also -- a statEnv.typeDefn with entries for all atomic types visible at that point. But in the statEnv that =>prestat hands down to []_PrologDecl, the funcType and typeDefn components will only have entries from statEnvDefault, so normalization of the function call will fail, in general. It would probably help to rewrite the rules for FunctionCall so that its normalization required little if anything from the statEnv.
Michael, I'm thinking that the issue you mention about function declarations is already covered by Bug #1743. And that the typeDefn should be populated properly as it was before as =>prestat still propagates the static environment through the normalization rules as before. Am I missing something? - Jerome
(In reply to comment #27) > > I'm thinking that the issue you mention about function declarations > is already covered by Bug #1743. Yes, but the proposal in Comment #23 doesn't incorporate the fix for Bug 1743. And if you try, you'll find it doesn't quite work when multiple ("sibling") modules share a target namespace, because statEnv.funcType won't have entries for functions defined in the sibling modules, because the "self-import" that would conceptually bring them in is defined (by =>stat) as basically a no-op/pass-through. And that's just *normalization* of FunctionCalls. You also need a properly defined statEnv.funcType for their STA, which (in this proposal) occurs when =>stat is applied to the merged PrologDeclList0. So you'd need to apply the fix for Bug 1743 there too. And note that the presence or absence of a self-import in a module does not have the necessary effect on whether its expressions can "see" the var decls and func decls of the sibling modules when you apply =>stat to the merged PrologDeclList0. > And that the typeDefn should be populated properly as it was before as > =>prestat still propagates the static environment through the > normalization rules as before. Oops, yes, as =>prestat recurses down the prolog, it invokes =>stat on the setters and accumulates the resulting statEnv, so when you get to normalizing FunctionCalls, statEnv.typeDefn will have the types imported by the prolog. However, because =>prestat's "output" statEnv only reflects the prolog's setters, it's pointless, in the definition for =>module_statEnv, to use one module's output statEnv as another's input statEnv. The input statEnvs should all be statEnvDefault. (So in fact, there's no reason for =>prestat to have an output statEnv at all.)
Here is (yet) another proposal trying to break the Gordian knot on bug 1705. This is a (preliminary) proposal currently being considered by the XML Query working group at its face to face meeting. Any early feedback on it would be greatly appreciated. - Jerome Overview -------- The proposal defines circularity between modules based on whether variables depend on variables or functions in another module. This notion outlaws the specific example that was posted in comment #1 for that bug. It allows to order modules based on how variables in one module depend on variables in another module. Details follows. Changes to XQuery 1.0 book -------------------------- (1) Make 'depends on a variable or function' a definition in XQuery 1.0. (2) Define formally a notion of module dependency as follows. << A module M1 depends on a module M2 if a variable or function in M1 depends on a variable or function in a module M2, where M1 and M2 are distinct modules. >> NOTE: Based on that definition M1 never depends on M1. (3) Add a restriction to module import: << If a module depends on itself by any combination of transitive dependencies, a static error is raised. >> Changes to XQuery FS book ------------------------- Note this proposal also takes bug #1743 into account. What follows are only the main rules, not all specific changes yet. *** Add the new judgments: Module1 ... Modulen reordered as Module1' ... Modulen' Which takes a set of Modules, and reorders them based on the absence of circularity. The order is such that variables in a module Modulei only depends on variables defined in Module1...Modulei-1. statEnv1 |- PrologDecls gather function signatures statEnv2 Which extends the static environment with a set of function signatures. *** In Section 5.2 Module Declaration, replace the existing SCP rule by: AnyURI is target namespace of modules Module1 ... Modulen Module1 ... Modulen reordered to Module1' ... Modulen' (***) Module1' = module namespace NCName1 = URILiteral; PrologDeclList1 ... Modulen' = module namespace NCName1 = URILiteral; PrologDeclListn statEnvDefault |- PrologDecls gather function signatures statEnv0 statEnv0 ; AnyURI |- declare namespace NCName = URILiteral; PrologDeclList1 =>stat statEnv1 with PrologDeclList1' ... statEnvn-1 ; AnyURI |- declare namespace NCName = URILiteral; PrologDeclList1 =>stat statEnvn with PrologDeclList1' ------------------------------------------------------------------ AnyURI =>module_statEnv statEnvn0 (***) Always exists because there isn't circularity The main idea is that we can alway order the modules so they do not depend on eeach other, getting rid of circularity and making sure the simpler version of SCP which goes through modules in order can terminate. NOTE: There should be some changes to the judgment: statEnv0 ; AnyURI |- declare namespace NCName = URILiteral; PrologDeclList1 =>stat statEnvn with PrologDeclList1' so that it passes only the relevant part of the static environment between modules. Changes to Section 5.11 Module Import ------------------------------------- *** In SCP, replace the rule: AnyURI1 =>module_statEnv statEnv1 ... AnyURI1 =>module_statEnv statEnvn statEnv extended with static environment statEnv1 yields statEnv1' for uri AnyURI ... statEnvn-1 extended with static environment statEnvn yields statEnvn' for uri AnyURI ---------------------------------------------------------------- statEnv |- import module AnyURI1 LocationHints? =>stat statEnvn By: AnyURI =>module_statEnv statEnv statEnv extended with static environment statEnv1 yields statEnv1' for uri AnyURI ... statEnvn-1 extended with static environment statEnvn yields statEnvn' for uri AnyURI ---------------------------------------------------------------- statEnv |- import module AnyURI LocationHints? =>stat statEnvn Changes to Section 5.14 Variable Declaration -------------------------------------------- No changes. Changes to Section 5.15 Function Declaration -------------------------------------------- No changes
When you say "No changes", do you mean "No changes from the 2006-06 CR" ?
Michael, This is a loose statement. It more or less means 'no additional substantive changes should be required specifically to address bug 1705'. It is not a statement indicating that other changes necessary for any other bugs will be backtracked. As indicated in the comment, the proposed resolution does not contain a complete specific set of changes to FS. Any feedback on the proposal still would be extremely welcome. Thanks, - Jerome
(In reply to comment #29) > > Changes to XQuery 1.0 book > ... > If a module depends on itself by any combination of transitive > dependencies, a static error is raised. Is this in addition to, or instead of, the current "graph of module imports contains a cycle" error? Because there are some cases that are errors according to the current rule that are not errors according to the proposed rule. One trivial example is ---- module namespace A = "http://www.example.com/A"; import module namespace B = "http://www.example.com/B"; declare variable $A:a = 'a'; ---- module namespace B = "http://www.example.com/B"; import module namespace A = "http://www.example.com/A"; declare variable $B:b = 'b'; ---- There's certainly a cycle in the graph of module imports (and so, an error for the current rule), but because neither module references any of the other's variables or functions, neither "depends on" the other, so the proposed rule raises no error. Less trivial: -------- module namespace S1 = "http://www.example.com/siblings"; declare variable $S1:s1 = 's1'; -------- module namespace X = "http://www.example.com/x"; import module namespace S = "http://www.example.com/siblings"; declare variable $X:x1 = $S:s1; declare variable $X:x2 = 'x2'; -------- module namespace S2 = "http://www.example.com/siblings"; import module namespace X = "http://www.example.com/x"; declare variable $S2:s2 = $X:x2; -------- Module X imports the 'siblings' namespace, and thereby the modules S1 and S2; S2 imports the 'x' namespace and thereby the module X; thus there is a cycle in the import graph (X -> S2 -> X) and the current rule raises an error. However, in the "depends on" graph, we merely have S2 depends on X (because of $x2) and X depends on S1 (because of $s1); there's no cycle, so the proposed rule does not raise an error.
There is no proposal to remove other rules about module imports. This is in addition to it. - Jerome
It looks like the proposal of Comment #29 is subject to the M-N-L inference cycle that started this whole bug.
The suggested resolution as written is a bit misleading on this I agree. I believe that the basis for the proposal in comment #23 still apply here. We are just looking for a way to deal with the issues with how static environments collide. So more precisely: -- we still need to distinguish imports within the same namespaces from other imports (in the spirit of comment #23) so that the "exported context" for a given namespace will be computed at once. -- in addition that "exported context" will be computed based on the approach suggested in comment #29. - Jerome
You can break the M-N-L cycle by adding to 5.11 the "AnyURI0 != AnyURI1" etc. machinery from Comment #23. But then you have the problem I raised in Comment #28: the presence or absence of a self-import in a module has no effect on the statEnv that its expressions "see". However, I think you can fix that... You say "There should be some changes to the =>stat judgment so that it passes only the relevant part of the static environment between modules." One way to do this would be to remix some more stuff from Comment #28. Make the =>stat judgment be: statEnv_N_in ; statEnv_E_in ; AnyURI |- PrologDeclList_in =>stat statEnv_N_out ; statEnv_E_out with PrologDeclList_out statEnv_N is the "normal" statEnv, used for all the usual purposes. statEnv_E is the "export" statEnv, containing (conceptually) the funcType and varType of the current module's "siblings". (In reality, it doesn't contain entries for *all* the functions and variables of a module's siblings, but it does contain all the ones that the module depends on, which is what matters.) The big =>module_statEnv rule in 5.2 would say: ... statEnvDefault |- PrologDecls gather function signatures statEnv_E_0 statEnvDefault ; statEnv_E_0 ; AnyURI |- ...1 =>stat statEnv_1 ; statEnv_E_1 with PrologDeclList1' ... statEnvDefault ; statEnv_E_[n-1] ; AnyURI |- ...n =>stat statEnv_n ; statEnv_E_n with PrologDeclListn' -------- AnyURI =>module_statEnv statEnvn_E_n (Note that all modules start out with just statEnvDefault as their "normal" statEnv.) So that would take care of initializing the export-env and propagating it between sibling modules (and propagating it out to whatever called =>module_statEnv). =>stat would only update it for var decls (also updating the normal statEnv), and would only consult it for self-imports ("AnyURI0 = AnyURI1"), copying its var + func info into statEnv_N_out.
And while you're thinking about that, here's some other comments: > Module1 ... Modulen reordered as Module1' ... Modulen' > The order is such that variables in a module Modulei > only depends on variables defined in Module1...Modulei-1. That isn't quite what you want to say. First, it's the "primed" Modules that satisfy the ordering constraint. Second, it doesn't allow variables in Modulei' to depend on other variables in Modulei' itself, or on variables in Modules other than the listed ones, both of which should be legal. Third, it ignores functions depending on variables: if you try to do STA on a function body before you have the type of a variable that it references, the STA fails. So you could say: The order is such that variables and functions in module Modulei' don't depend on variables defined in Modulei+1'...Modulen'. But it would be simpler to say The order is such that Modulei' does not depend on any of Modulei+1'...Modulen'. Or equivalently, The order is such that if Modulei' depends on Modulej', then i >= j. Yes, it's a slightly stronger constraint on the ordering, but the net effect is the same, and since the non-circularity restriction is defined on the Module-Module version of "depends on", it's easier to see that this version (rather than the variables+functions version) of the ordering constraint is guaranteed possible. ------ > AnyURI is target namespace of modules Module1 ... Modulen > Module1 ... Modulen reordered to Module1' ... Modulen' You could merge these two judgments. Instead of introducing the "reordered to" judgment, just say that the list of Modules "returned" by the "is target namespace of" judgment is in the constrained order. ------ > Module1' = module namespace NCName1 = URILiteral; PrologDeclList1 > ... > Modulen' = module namespace NCName1 = URILiteral; PrologDeclListn The second NCName1 should be NCNamen, since sibling modules don't have to use the same prefix for their target namespace. And the URILiterals should be subscripted too, since those don't have to be exactly the same either (although they do have to identify the same target namespace, of course). ------ > statEnv0 ; AnyURI |- declare namespace NCName = URILiteral; > PrologDeclList1 =>stat statEnv1 with PrologDeclList1' > ... > statEnvn-1 ; AnyURI |- declare namespace NCName = URILiteral; > PrologDeclList1 =>stat statEnvn with PrologDeclList1' In the last premise, the subscripts '1' should be 'n-1'. Also, the NCNames and URILiterals should be subscripted. ------ > AnyURI =>module_statEnv statEnv > statEnv extended with static environment statEnv1 > yields statEnv1' for uri AnyURI > ... > statEnvn-1 extended with static environment statEnvn > yields statEnvn' for uri AnyURI > ---------------------------------------------------------------- > statEnv |- import module AnyURI LocationHints? =>stat statEnvn -- There aren't statEnv1 ... statEnvn any more. -- You don't want to require that the statEnv returned by =>module_statEnv be the same as the statEnv in the conclusion. -- An AnyURI is not a URILiteral. So that should be: URILiteral has atomic value AnyURI AnyURI =>module_statEnv statEnv2 statEnv1 extended with static environment statEnv2 yields statEnv3 for uri AnyURI ---------------------------------------------------------------- statEnv1 |- import module URILiteral LocationHints? =>stat statEnv3
Hi Michael, Yes I think your comment and suggestions totally make sense there. I think that's more or less what I had in mind. Thanks for the suggestions. I do believe that addresses your earlier problem in #28. Please let me know asap if you still see other issues as I am trying to produce a new version of the document as we speak. - Jerome (In reply to comment #36) > You can break the M-N-L cycle by adding to 5.11 the "AnyURI0 != AnyURI1" etc. > machinery from Comment #23. But then you have the problem I raised in Comment > #28: the presence or absence of a self-import in a module has no effect on the > statEnv that its expressions "see". However, I think you can fix that... > > You say "There should be some changes to the =>stat judgment so that it passes > only the relevant part of the static environment between modules." One way to > do this would be to remix some more stuff from Comment #28. Make the =>stat > judgment be: > > statEnv_N_in ; statEnv_E_in ; AnyURI |- PrologDeclList_in > =>stat > statEnv_N_out ; statEnv_E_out with PrologDeclList_out > > statEnv_N is the "normal" statEnv, used for all the usual purposes. > > statEnv_E is the "export" statEnv, containing (conceptually) the funcType and > varType of the current module's "siblings". (In reality, it doesn't contain > entries for *all* the functions and variables of a module's siblings, but it > does contain all the ones that the module depends on, which is what matters.) > > The big =>module_statEnv rule in 5.2 would say: > > ... > statEnvDefault |- PrologDecls gather function signatures statEnv_E_0 > statEnvDefault ; statEnv_E_0 ; AnyURI |- ...1 > =>stat > statEnv_1 ; statEnv_E_1 with PrologDeclList1' > ... > statEnvDefault ; statEnv_E_[n-1] ; AnyURI |- ...n > =>stat > statEnv_n ; statEnv_E_n with PrologDeclListn' > -------- > AnyURI =>module_statEnv statEnvn_E_n > > (Note that all modules start out with just statEnvDefault as their "normal" > statEnv.) > > So that would take care of initializing the export-env and propagating it > between sibling modules (and propagating it out to whatever called > =>module_statEnv). =>stat would only update it for var decls (also updating the > normal statEnv), and would only consult it for self-imports ("AnyURI0 = > AnyURI1"), copying its var + func info into statEnv_N_out. >
(In reply to comment #38) > Please let me know asap if you still see other issues as I am trying to > produce a new version of the document as we speak. What did you have in mind for defining "gather function signatures"?
(In reply to comment #39) > (In reply to comment #38) > > Please let me know asap if you still see other issues as I am trying to > > produce a new version of the document as we speak. > > What did you have in mind for defining "gather function signatures"? > This is essentially the proposed resolution for #1743. Possibly there is some consolidation that can happen between the various passes over the prolog, but the idea is to first get an environment with all the function declarations. - Jerome
(In reply to comment #40) > > > > What did you have in mind for defining "gather function signatures"? > > This is essentially the proposed resolution for #1743. As it stands, the =>stat1 judgment in Bug #1743 puts imported variables and functions in its output env (which made sense for 1743, but is not wanted here), and also will fail in the presence of an import cycle. To fix both of these problems, you need to modify the judgment so that it does minimal processing of prolog decls: -- For decls that introduce a namespace binding, just add that binding to statEnv.namespace (or statEnv.default_elem_namespace). -- For function decls, just construct the signature and add it to statEnv.funcType. (This will consult the statEnv, but should only need the namespace bindings from the previous point, to expand QNames.) -- For all other decls, it's a pass-through.
This has been implemented. Here is a sketch of the implementation: (1) added a judgment: statEnv1 |- PrologDeclList =>sigs statEnv2 which 'gather' function signatures, as suggested in comment #41. (2) fixed the main module declaration rule, as suggested in comment #29. (3) fixed the mapping from URIs to module declarations to return modules in order, as suggested in comment #37, relying on the definition of dependency specified in the XQuery 1.0 book. (4) broke the module circularity similarly to what was suggested in comments #36. - Jerome
Also, an important part of the implementation is: (5) split the static context processing for function declarations to: -- part1, used to compute the signature used in the judgment =>sigs. -- part2, simplified the function declaration rule to access the function signature directly from the context as it has been processed earlier. - Jerome