Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document defines formally the semantics of XQuery 1.0 [XQuery 1.0: A Query Language for XML] XPath 2.0 [XML Path Language (XPath) 2.0].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress. It contains a number of open issues, and should not be considered to be fully stable. Vendors who wish to create preview implementations based on this document do so at their own risk. While this document reflects the general consensus of the working groups, there are still controversial areas that may be subject to change.
This version reflects the most recent semantics of [XPath/XQuery]. Among the most important changes from the previous version of this document are:
Specification of schema context in [7.1.2 Context is] and in [7.2.4 Element and attribute type lookup]. This closes Issue 481.
Specification in [8 Importing Schemas] of the XML Schema features supported by the XQuery type system. This closes Issue 496.
Implementation of global variables. This closes Issue 556.
Implementation of validation declaration. This closes Issue 557.
XQuery 1.0, XPath 2.0, and their formal semantics has been defined jointly by the XML http://www.w3.org/XML/Query Query Working Group and the XSL http://www.w3.org/Style/XSL/ Working Group (both part of the XML Activity http://www.w3.org/XML/Activity.html).
Public feedback is requested on the open issues: Issue 555 (Formal Semantics of Module Import) and and Issue 559 (New Sequence Type needs to be fully implemented in Formal Semantics).
The Working Groups plan to publish a Last Call Working Draft of this document as soon as the remaining open issues are resolved.
Public comments on this document and its open issues are welcome. Comments should be sent to the W3C XPath/XQuery mailing list, public-qt-comments@w3.org (archived at http://lists.w3.org/Archives/Public/public-qt-comments/) with "[FS]" at the beginning of the Subject field.
Patent disclosures relevant to this specification may be found on the XML Query Working Group's patent disclosure page at http://www.w3.org/2002/08/xmlquery-IPR-statements and the XSL Working Group's patent disclosure page at http://www.w3.org/Style/XSL/Disclosures.
1 Introduction
1.1 Normative and
Informative Sections
2 Preliminaries
2.1 Introduction to the Formal
Semantics
2.1.1 Notations from grammar productions
2.1.2 Notations for judgments
2.1.3 Notations for inference rules
2.1.4 Notations for environments
2.1.5 Putting it together
2.2 XML Values
2.2.1 Formal values
2.2.2 Examples of values
2.3 The [XPath/XQuery] Type
System
2.3.1 XML
Schema and the [XPath/XQuery] Type System
2.3.2 Item types
2.3.3 Content models
2.3.4 Top level definitions
2.3.5 Example of a complete Schema
2.4 Processing model and main
judgments
2.4.1 Processing model
2.4.2 Normalization judgment
2.4.3 Static typing judgment
2.4.4 Dynamic evaluation judgment
2.5 Relationship with other
documents
2.5.1 Namespaces
2.5.2 Functions and operators
3 Basics
3.1 Expression
Context
3.1.1 Static Context
3.1.1.1
Resolving QNames to Expanded QNames
3.1.2 Dynamic Context
3.2 Processing
Model
3.3 Documents
3.3.1 Document Order
3.3.2 Atomization
3.3.3 Effective Boolean Value
3.3.4 Input Sources
3.4 Types
3.4.1 Predefined Types
3.4.2 Typed Value and String Value
3.4.3 SequenceType Syntax
3.4.4 SequenceType Matching
3.5 Error Handling
3.5.1 Kinds of Errors
3.5.2 Handling Dynamic Errors
3.5.3 Errors and Optimization
3.6 Optional
Features
3.6.1 Schema Import Feature
3.6.2 Static Typing Feature
3.6.3 Full Axis Feature
3.6.4 Module Feature
3.6.5 Pragmas
3.6.6 Must-Understand Extensions
3.6.7 Static Typing Extensions
4 Expressions
4.1 Primary
Expressions
4.1.1 Literals
4.1.2 Variable References
4.1.3 Parenthesized Expressions
4.1.4 Context Item Expression
4.1.5 Function Calls
4.1.6 [XPath/XQuery] Comments
4.2 Path
Expressions
4.2.1 Steps
4.2.1.1
Axes
4.2.1.2
Node Tests
4.2.2 Predicates
4.2.3 Unabbreviated Syntax
4.2.4 Abbreviated Syntax
4.3 Sequence
Expressions
4.3.1 Constructing Sequences
4.3.2 Combining Sequences
4.4 Arithmetic
Expressions
4.5 Comparison
Expressions
4.5.1 Value Comparisons
4.5.2 General Comparisons
4.5.3 Node
Comparisons
4.6 Logical
Expressions
4.7 Constructors
4.7.1 Direct Element Constructors
4.7.1.1
Attributes
4.7.1.2
Namespace Declaration Attributes
4.7.1.3
Content
4.7.1.4
Whitespace in Element Content
4.7.1.5
Type of a Constructed Element
4.7.2 Other Direct Constructors
4.7.3 Computed Constructors
4.7.3.1
Computed Element Constructors
4.7.3.2
Computed Attribute
Constructors
4.7.3.3
Document Node Constructors
4.7.3.4
Text Nodes Constructors
4.7.3.5
Computed Processing Instruction Constructors
4.7.3.6
Computed Comment Constructors
4.7.3.7
Computed Namespace Constructors
4.7.3.8
Namespace Nodes on Constructed Elements
4.8 [For/FLWR]
Expressions
4.8.1 FLWOR
expressions
4.8.2 For expression
4.8.3 Let
Expression
4.8.4 Order By and Return Clauses
4.9 Unordered Expressions
4.10 Conditional
Expressions
4.11 Quantified
Expressions
4.12 Expressions on
SequenceTypes
4.12.1 Instance Of
4.12.2 Typeswitch
4.12.3 Cast
4.12.4 Castable
4.12.5 Constructor Functions
4.12.6 Treat
4.13 Validate
Expressions
5 Modules and Prologs
5.1 Version Declaration
5.2 Module Declaration
5.3 Base URI Declaration
5.4 Namespace
Declaration
5.5 Default
Namespace Declaration
5.6 Schema
Import
5.7 Module Import
5.8 Variable Declaration
5.9 Validation
Declaration
5.10 Xmlspace Declaration
5.11 Default Collation
Declaraion
5.12 Function Declaration
6 Additional Semantics of
Functions
6.1 Formal
Semantics Functions
6.1.1 The
arithmetic operator pseudo-functions: fs:minus, fs:plus, fs:times, fs:idiv,
fs:div, and fs:mod
6.1.2 The
comparison pseudo-functions: fs:eq, fs:ne, fs:lt, fs:le, fs:gt, and
fs:ge
6.1.3 The fs:convert-operand function
6.1.4 The fs:convert-simple-operand
function
6.1.5 The fs:distinct-doc-order function
6.1.6 The fs:item-sequence-to-node-sequence
function
6.1.7 The fs:item-sequence-to-untypedAtomic
function
6.2 Standard functions with
specific typing rules
6.2.1 The fn:abs, fn:ceiling, fn:floor, fn:round,
and fn:round-half-to-even functions
6.2.2 The fn:collection and fn:doc functions
6.2.3 The fn:data function
6.2.4 The fn:distinct-values function
6.2.5 The fn:error function
6.2.6 The fn:min, fn:max, fn:avg, and fn:sum
functions
6.2.7 The fn:remove function
6.2.8 The fn:reverse function
6.2.9 The fn:subsequence function
6.2.10 The op:union, op:intersect, and op:except
operators
7 Auxiliary Judgments
7.1 Judgments for schema
contexts
7.1.1 Mode is
7.1.2 Context is
7.2 Judgments for accessing
types
7.2.1 Derives
7.2.2 Substitutes
7.2.3 Element and attribute name lookup
7.2.4 Element and attribute type lookup
7.2.5 Extension
7.2.6 Type
adjustment
7.2.7 Type
expansion
7.3 Judgments for step
expressions and filtering
7.3.1 Principal Node Kind
7.3.2 Auxiliary judgments for axis
7.3.2.1
Static semantics of axis
7.3.2.2
Dynamic semantics of axis
7.3.3 Auxiliary judgments for node tests
7.3.3.1
Static typing of node tests
7.3.3.2
Dynamic semantics of node tests
7.3.4 Attribute filtering
7.4 Judgments for type
matching
7.4.1 Matches
7.4.2 Subtype
7.5 Judgments for sequences of
item types
7.6 Judgments for
type promotion
7.7 Judgments for the
validate expression
7.7.1 Builtin attributes
7.7.2 Mixed
content
7.7.3 Type
resolution
7.7.4 Interleaving
7.7.5 Erasure
7.7.5.1
Simply erases
7.7.5.2
Erases
7.7.6 Annotate
7.7.6.1
Simply annotate
7.7.6.2
Nil-annotate
7.7.6.3
Annotate
7.7.7 Validates as
8 Importing Schemas
8.1 Introduction
8.1.1 Features
8.1.2 Organization
8.1.3 Main
mapping rules
8.1.4 Special attributes
8.1.4.1
use
8.1.4.2
minOccurs and maxOccurs
8.1.4.3
mixed
8.1.4.4
nillable
8.1.4.5
substitutionGroup
8.1.5 Anonymous type names
8.2 Schemas as a
whole
8.2.1 Schema
8.2.2 Include
8.2.3 Redefine
8.2.4 Import
8.3 Attribute
Declarations
8.3.1 Global attributes declarations
8.3.2 Local
attribute declarations
8.4 Element Declarations
8.4.1 Global element declarations
8.4.2 Local
element declarations
8.5 Complex Type
Definitions
8.5.1 Global complex type
8.5.2 Local
complex type
8.5.3 Complex type with simple content
8.5.4 Complex type with complex content
8.6 Attribute Uses
8.7 Attribute Group
Definitions
8.7.1 Attribute group definitions
8.7.2 Attribute group reference
8.8 Model Group
Definitions
8.9 Model Groups
8.9.1 All
groups
8.9.2 Choice groups
8.9.3 Sequence groups
8.10 Particles
8.10.1 Element reference
8.10.2 Group reference
8.11 Wildcards
8.11.1 Attribute wildcards
8.11.2 Element wildcards
8.12 Identity-constraint
Definitions
8.13 Notation
Declarations
8.14 Annotation
8.15 Simple Type
Definitions
8.15.1 Global simple type definition
8.15.2 Local simple type definition
8.15.3 Simple type content
A Normalized core grammar
A.1 Core BNF
B Functions and Operators
B.1 Functions and
Operators used in the Formal Semantics
B.2 Mapping of Overloaded
Internal Functions
C References
C.1 Normative
References
C.2 Non-normative References
C.3 Background
References
D XQuery 1.0 and XPath 2.0 Issues
This document defines the formal semantics of XQuery 1.0 and XPath 2.0. The present document is part of a set of documents that together define the XQuery 1.0 and XPath 2.0 languages:
[XQuery 1.0: A Query Language for XML] introduces the XQuery 1.0 language, defines its capabilities from a user-centric view, and defines the language syntax.
[XML Path Language (XPath) 2.0] introduces the XPath 2.0 language, defines its capabilities from a user-centric view, and defines the language syntax.
[Functions and Operators] lists the functions and operators defined for the [XPath/XQuery] language and specifies the required types of their parameters and return value.
[Data Model] formally specifies the data model used by [XPath/XQuery] to represent the content of XML documents. The [XPath/XQuery] language is formally defined by operations on this data model.
[Data Model Serialization] specifies how [XPath/XQuery] data model values are serialized into XML.
The scope and goals for the [XPath/XQuery] language are discussed in the charter of the W3C [XSL/XML Query] Working Group and in the [XPath/XQuery] requirements [XML Query 1.0 Requirements].
This document defines the semantics of [XPath/XQuery] by giving a precise formal meaning to each of the expressions of the [XPath/XQuery] specification in terms of the [XPath/XQuery] data model. This document assumes that the reader is already familiar with the [XPath/XQuery] language.
Two important design aspects of [XPath/XQuery] are that it is functional and that it is typed. These two aspects play an important role in the [XPath/XQuery] Formal Semantics.
[XPath/XQuery] is a functional language. [XPath/XQuery] is built from expressions, rather than statements. Every construct in the language (except for the XQuery query prolog) is an expression and expressions can be composed arbitrarily. The result of one expression can be used as the input to any other expression, as long as the type of the result of the former expression is compatible with the input type of the latter expression with which it is composed. Another characteristic of a functional language is that variables are always passed by value, and a variable's value cannot be modified through side effects.
[XPath/XQuery] is a typed language. Types can be imported from one or more XML Schemas that describe the input documents and the output document, and the [XPath/XQuery] language can then perform operations based on these types. In addition, [XPath/XQuery] supports static type analysis. Static type analysis infers the output type of an expression based on the type of its input expressions. Static typing allows early detection of type errors, and can be used as the basis for certain forms of optimization. The [XPath/XQuery] type system captures most of the features of [Schema Part 1], including global and local element and attribute declarations, complex and simple type definitions, named and anonymous types, derivation by restriction, extension, list and union, substitution groups, and wildcard types. It does not model uniqueness constraints and facet constraints on simple types.
This document is organized as follows. [2 Preliminaries] introduces the notations used to define the [XPath/XQuery] Formal Semantics. These include the formal notations for values in the [XPath/XQuery] data model and for types in XML Schema. The next three sections: [3 Basics], [4 Expressions], and [5 Modules and Prologs] have the same structure as the corresponding sections in the [XQuery 1.0: A Query Language for XML] and [XML Path Language (XPath) 2.0] documents. This allows the reader to quickly find the formal definition of a particular language construct. [3 Basics] defines the semantics for basic [XPath/XQuery] concepts, and [4 Expressions] defines the dynamic and static semantics of each [XPath/XQuery] expression. [5 Modules and Prologs] defines the semantics of the [XPath/XQuery] prolog. [6 Additional Semantics of Functions] defines the static semantics of several functions in [Functions and Operators] and gives the dynamic and static semantics of several supporting functions used in this document. The remaining sections, [7 Auxiliary Judgments] and [8 Importing Schemas], contain material that supports the formal semantics of [XPath/XQuery]. [7 Auxiliary Judgments] defines formal judgments that relate data model values to types, that relate types to types, and that support the formal definition of validation. These judgments are used in the definition of expressions in [4 Expressions]. Lastly, [8 Importing Schemas], specifies how XML Schema documents are imported into the [XPath/XQuery] type system and relates XML Schema types to the [XPath/XQuery] type system.
The and [XML Path Language (XPath) 2.0] documents contain the normative dynamic semantics of [XPath/XQuery]. The dynamic semantic rules in [3 Basics], [4 Expressions], and [5 Modules and Prologs] are informative only.
This document contains the normative static semantics of [XPath/XQuery]. The static semantics rules in [3 Basics], [4 Expressions], [5 Modules and Prologs], and [6 Additional Semantics of Functions] are normative. [3.1.1 Static Context] is normative, because it defines the static context used in the static typing rules. [7 Auxiliary Judgments] is normative, because it contains all the judgments necessary for defining SequenceType Matching.
[8 Importing Schemas] is informative, because the mapping rules from XML Schema to the XQuery type system do not handle every feature of XML Schema.
This section provides the background necessary to understand the Formal Semantics and introduces the notations that are used.
Why a Formal Semantics? The goal of the formal semantics is to complement the [XPath/XQuery] specification ([XQuery 1.0: A Query Language for XML] and [XML Path Language (XPath) 2.0]), by defining the meaning of [XPath/XQuery] expressions with mathematical rigor.
A rigorous formal semantics clarifies the intended meaning of the English specification, ensures that no corner cases are left out, and provides a reference for implementation.
Why use formal notations? Rigor is achieved by the use of formal notations to represent [XPath/XQuery] objects such as expressions, XML values, and XML Schema types, and by the systematic definition of the relationships between those objects to reflect the meaning of the language. In particular, the dynamic semantics relates [XPath/XQuery] expressions to the XML value to which they evaluate, and the static semantics relates [XPath/XQuery] expressions to the XML Schema type that is inferred for that expression.
The Formal Semantics uses several kinds of formal notations to define the relationships between [XPath/XQuery] expressions, XML values, and XML Schema types. This section introduces the notations for judgments, inference rules, and mapping rules as well as the notation for environments, which implement the dynamic and static contexts. The reader already familiar with these notations can skip this section and continue with [2.2 XML Values].
Grammar productions are used to describe "objects" (values, types, [XPath/XQuery] expressions, etc.) manipulated by the Formal Semantics. The Formal Semantics makes use of several kinds of grammar productions.
XQuery grammar productions describe the XQuery language and expressions. XQuery productions are identified by a number, which corresponds to their number in the [XQuery 1.0: A Query Language for XML] document, and are annotated with "(XQuery)". For instance, the following production describes FLWR expressions in XQuery.
[42 (XQuery)] | FLWORExpr |
::= | (ForClause | LetClause)+ WhereClause? OrderByClause? "return" ExprSingle |
For the purpose of this document, the differences between the XQuery 1.0 and the XPath 2.0 grammars are mostly irrelevant. By default, this document uses XQuery 1.0 grammar productions. Whenever the grammar for XPath 2.0 differs from the one for XQuery 1.0, the corresponding XPath 2.0 productions are also given. XPath productions are identified by a number, which corresponds to their number in [XML Path Language (XPath) 2.0], and are annotated with "(XPath)". For instance, the following production describes for expressions in XPath.
[25 (XPath)] | ForExpr |
::= | SimpleForClause "return"
ExprSingle |
XQuery Core grammar productions describe the XQuery Core. The Core grammar is given in [A Normalized core grammar]. Core productions are identified by a number, which corresponds to their number in [A Normalized core grammar], and are annotated by "(Core)". For instance, the following production describes the simpler form of the "FLWR" expression in the XQuery Core.
[33 (Core)] | FLWORExpr |
::= | (ForClause | LetClause) "return" ExprSingle |
The Formal Semantics manipulates "objects" (values, types, expressions, etc.) for which there is no existing grammar production in the [XQuery 1.0: A Query Language for XML] document. In these cases, specific grammar productions are introduced. Notably, additional productions are used to describe values in the [Data Model], and to describe the [XPath/XQuery] type system. Formal Semantics productions are identified by a number, and are annotated by "(Formal)". For instance, the following production describes global type definitions in the [XPath/XQuery] type system.
[43 (Formal)] | Definition |
::= | ("define" "element" ElementName
Substitution? Nillable? TypeReference) |
Note that grammar productions that are specific to the Formal Semantics (i.e., with the "(Formal)" annotation) are not part of [XPath/XQuery]. They are not accessible to the user and are only used in the course of defining the language's semantics.
The basic building block of the formal specification is called a judgment. A judgment expresses whether a property holds or not.
For example:
Notation
The judgment
holds if the object Painting is beautiful.
Notation
Here are three judgments that are used extensively in this document.
The judgment
holds if the expression Expr yields (or evaluates to) the value Value.
The judgment
holds when the expression Expr has type Type.
The judgment
holds when the expression Expr raises the error Error.
A judgment can contain symbols and patterns.
Symbols are purely syntactic and are used to write the judgment itself. In general, symbols in a judgment are chosen to reflect its meaning. For example, 'is beautiful', '=>' and ':' are symbols, the second and third of which should be read "yields", and "has type" respectively.
Patterns are written with italicized words. The name of a pattern is significant: each pattern name corresponds to an "object" (a value, a type, an expression, etc.) that can be substituted legally for the pattern. By convention, all patterns in the Formal Semantics correspond to grammar non-terminals, and are used to represent entities that can be constructed through application of the corresponding grammar production. For example, Expr represents any [XPath/XQuery] expression, and Value represents any value in the [XPath/XQuery] data model.
When applying the judgment, each pattern must be instantiated to an appropriate sort of "object" (value, type, expression, etc). For example, '3 => 3' and '$x+0 => 3' are both instances of the judgment 'Expr => Value'. Note that in the first judgment, '3' corresponds to both the expression '3' (on the left-hand side of the => symbol) and to the the value '3' (on the right-hand side of the => symbol).
Patterns may appear with subscripts (e.g. Expr1, Expr2) to distinguish different instances of the same sort of pattern. Each distinct pattern must be instantiated to a single "object" (value, type, expression, etc.). If the same pattern occurs twice in a judgment description then it should be instantiated with the same "object". For example, '3 => 3' is an instance of the judgment 'Expr1 => Expr1' but '$x+0 => 3' is not since the two expressions '$x+0' and '3' cannot be both instance of the pattern Expr1. The judgment'$x+0 => 3' is an instance of the judgment 'Expr1 => Expr2'.
In a few cases, patterns may have a name that is not exactly the name of a grammar production but is based on it. For instance, a BaseTypeName is a pattern that stands for a type name, as would TypeName, or TypeName2. This usage is limited, and only occurs to improve the readability of some of the inference rules.
Inference rules are used to specify whether a judgment holds or not. Inference rules express the logical relation between judgments and describe how complex judgments can be concluded from simpler premise judgments.
A logical inference rule is written as a collection of premises and a conclusion, written respectively above and below a dividing line:
premise1 ... premisen |
|
conclusion |
All premises and the conclusion are judgments. The interpretation of an inference rule is: if all the premise judgments above the line hold, then the conclusion judgment below the line must also hold.
Here is a simple example of inference rule, which uses the example judgment 'Expr => Value' from above:
$x => 0 3 => 3 |
|
$x + 3 => 3 |
This inference rule expresses the following property of the judgment 'Expr => Value': if the variable expression '$x' yields the value '0', and the literal expression '3' yields the value '3', then the expression '$x + 3' yields the value '3'.
An inference rule may have no premises above the line, which means that the expression below the line always holds:
|
3 => 3 |
This inference rule expresses the following property of the judgment 'Expr => Value': evaluating the literal expression '3' always yields the value '3'.
The two above rules are expressed in terms of specific variables and values, but usually rules are more abstract. That is, the judgments they relate contain patterns. Here is a rule that says that for any variable Variable that yields the integer value Integer, adding '0' yields the same integer value:
Variable => Integer |
|
Variable + 0 => Integer |
As in a judgment, each pattern in a particular inference rule must be instantiated to the same "object" within the entire rule. This means that one can talk about "the value of Variable" instead of the more precise "what Variable is instantiated to in (this particular instantiation of) the inference rule".
Note
In effect, inference rules are just a notation that describes a bottom-up algorithm. In the examples above, the rules describe an evaluation algorithm where the result of an expression depends on the result for its sub-expressions.
Logical inference rules use environments to record information computed during static type analysis or dynamic evaluation so that this information can be used by other logical inference rules. For example, the type signature of a user-defined function in a [expression/query] prolog can be recorded in an environment and used by subsequent rules. Similarly, the value assigned to a variable within a "let" expression can be captured in an environment and used for further evaluations.
An environment is a dictionary that maps a symbol (e.g., a function name or a variable name) to an "object" (e.g., a function body, a type, a value). One can access information in an environment or update the environment.
The two main environments used in the Formal Semantics are: a dynamic environment (dynEnv), which captures the [XPath/XQuery]'s dynamic context, and a static environment (statEnv), which captures the [XPath/XQuery]'s static context. Both are defined in [3.1 Expression Context].
If "env" is an environment, then "env(symbol)" denotes the "object" to which symbol is mapped. The notation is intentionally similar to function application, because an environment can be considered a function from the argument symbol to the "object" to which the symbol is mapped.
This document uses environment groups that group related environments. If "env" is an environment group with the member "mem", then that environment is denoted "env.mem" and the value that it maps symbol to is denoted "env.mem(symbol)". For example, dynEnv.varValue denotes the dynamic environment that maps variables to values and dynEnv.varValue(Variable) denotes the value of the variable Variable in the dynamic environment.
Updating is only defined on environment groups:
"env + mem(symbol => object) " denotes the new environment group that is identical to env except that the mem environment has been updated to map symbol to object. The notation symbol => object indicates that symbol is mapped to object in the new environment.
The following shorthand is also allowed: "env + mem( symbol1 => object1 ; ... ; symboln => objectn ) " in which each symbol is mapped to a corresponding object in the new environment.
This notation is equivalent to nested updates, as in " (env + mem( symbol1 => object1) + ... ) + mem(symboln => objectn)".
If the "object" is a type then the following notation relates a symbol to a type: "env + mem(symbol : object) ".
Updating the environment overrides any previous binding that might exist for the same name. Updating the environment captures the scope of a symbol (e.g., a variable, a namespace prefix, etc.) Also, note that there are no operations to remove entries from environments: this is never necessary because updating an the environment group effectively creates a new extended copy of the original environment group, and the original environment group remains accessible along with the updated copy.
Environments are used in a judgment to capture some of the context in which the judgment is computed, and most judgments are computed assuming that some environment is given. This assumption is denoted by prefixing the judgment with "env |-". The "|-" symbol is called a "turnstile" and is used in almost all inference rules.
For instance, the judgment
is read as: Assuming the dynamic environment dynEnv, the expression Expr yields the value Value.
Putting the above notations together, here is an example of an inference rule that occurs later in this document:
This rule is read as follows: if two expressions Expr1 and Expr2 are known to have the static types types Type1 and Type2 (the two premises above the line), then it is the case that the expression below the line "Expr1 , Expr2" must have the static type "Type1, Type2", which is the sequence of types Type1 and Type2. The above inference rule does not modify the (static) environment.
The following rule defines the static semantics of a "let" expression. The binding of the new variable is captured by an update to the varType component of the original static environment.
|
|||
|
|||
statEnv |- let
$ VarName := Expr1 return Expr2 : Type2 |
This rule is read as follows: First, because the variable is a QName, it is first expanded into an expanded QName. Second, the type Type1 for the "let" input expression Expr1 is computed. Ten the "let" variable with expanded name, expanded-QName with type Type1 is added into the varType component of the static environment group statEnv. Finally, the type Type2 of Expr2 is computed in that new environment.
Editorial note | |
Jonathan suggests that we should explain 'chain' inference rules. I.e., how several inference rules are applied recursively. |
[Data Model] specifies normatively the data model of [XPath/XQuery]. The [XPath/XQuery] language is formally defined by operations on this data model.
This section defines formal notations to denote values in [Data Model]. These notations are used to describe and manipulate values in inference rules, but are not exposed to the [XPath/XQuery] user.
For reference, a summary of the data model is given below, followed by the formal notation for data model values. Although not specified in this document, all the normative constraints specified in [Data Model] apply to the formal notation for data model values.
A value is a sequence of zero or more items. An item is either an atomic value or a node.
An atomic value is a value in the value space of an atomic type and is
labeled with the name of that atomic type. An XML Schema atomic type [Schema Part 2] may be primitive or derived, or xdt:untypedAtomic
.
A node is either an element, an attribute, a document, a text, a comment,
or a processing-instruction node. Elements have a type annotation and contain
a value. Attributes have a type annotation and contain a simple value, which
is a sequence of atomic values. Text nodes always contain one string value of
type xdt:untypedAtomic
,
therefore the corresponding type annotation is omitted.
A type annotation can be either the QName of a declared type or an anonymous type. An anonymous type corresponds to an XML Schema type for which the schema writer did not provide a name. Anonymous type names are not visible to the user, but are generated during schema validation and used to annotate nodes in the data model. By convention, anonymous type names are written in the Formal Semantics as: [Anon0], [Anon1], etc.
Untyped elements (e.g., from well-formed documents) are annotated with
xdt:untypedAny
, untyped
attributes are annotated with xdt:untypedAtomic
, and untyped
atomic values (i.e., text content or attribute content in well-formed
documents) are annotated with xdt:untypedAtomic
.
An element has an optional "nilled" marker. This marker can only be
present if the element has been validated against an element type in the
schema which is "nillable", and the element has no content and an attribute
xsi:nil
set to "true"
.
An element also has a sequence of namespace annotations, which are the set of active namespace declarations that are in-scope for the element. Each namespace annotation is a prefix, URI pair. Namespace annotations are not values, i.e., they are never the result of evaluating an expression, and they only occur as annotations on elements. In examples, we omit the namespace annotations when they are empty. For example, the following two values are equivalent:
element weight of type xs:integer { text { "42" } } {} element weight of type xs:integer { text { "42" } }
[14 (Formal)] | Value |
::= | Item |
[25 (Formal)] | Item |
::= | NodeValue |
[26 (Formal)] | AtomicValue |
::= | AtomicValueContent
TypeAnnotation |
[7 (Formal)] | AtomicValueContent |
::= | String |
[8 (Formal)] | TypeAnnotation |
::= | "of" "type" TypeName |
[16 (Formal)] | ElementValue |
::= | "element" ElementName "nilled"?
TypeAnnotation "{" Value "}" "{" NamespaceAnnotations "}" |
[17 (Formal)] | AttributeValue |
::= | "attribute" AttributeName
TypeAnnotation "{" SimpleValue "}" |
[15 (Formal)] | SimpleValue |
::= | AtomicValue |
[18 (Formal)] | DocumentValue |
::= | "document" "{" Value "}" |
[20 (Formal)] | CommentValue |
::= | "comment" "{" String "}" |
[21 (Formal)] | ProcessingInstructionValue |
::= | "processing-instruction" QName "{"
String "}" |
[19 (Formal)] | TextValue |
::= | "text" "{" String "}" |
[24 (Formal)] | NodeValue |
::= | ElementValue |
[9 (Formal)] | ElementName |
::= | QName |
[12 (Formal)] | AttributeName |
::= | QName |
[27 (Formal)] | TypeName |
::= | QName | AnonymousTypeName |
[13 (Formal)] | AnonymousTypeName |
::= | [Anon1] | [Anon2] | ... |
[22 (Formal)] | NamespaceAnnotations |
::= | NamespaceAnnotation ...
NamespaceAnnotation |
[23 (Formal)] | NamespaceAnnotation |
::= | "namespace" NCName "{" String
"}" |
Notation
In the above grammar, "String" indicates the value space of
xs:string
, "Decimal" indicates the value space of
xs:decimal
, etc.
Note that the same rule about constructing sequences apply to the values
described by that grammar. Notably sequences cannot be nested. For example,
the sequence (10, (1, 2), (), (3, 4))
is equivalent to the
sequence (10, 1, 2, 3, 4)
.
A well-formed document
<fact>The cat weighs <weight units="lbs">12</weight> pounds.</fact>
In the absence of a Schema, this document is represented as
element fact of type xdt:untyped { text { "The cat weighs " }, element weight of type xdt:untyped { attribute units of type xdt:untypedAtomic { "lbs" of type xdt:untypedAtomic } text { "12" } }, text { " pounds." } }
A document before and after validation.
<weight xsi:type="xs:integer">42</weight>
The formal model for values can represent values before and after validation. Before validation, this element is represented as:
element weight of type xdt:untyped { attribute xsi:type of type xdt:untypedAtomic { "xs:integer" of type xdt:untypedAtomic }, text { "42" } }
After validation, this element is represented as:
element weight of type xs:integer { attribute xsi:type of type xs:QName { "xs:integer" of type xs:QName }, 42 of type xs:integer }
An element with a list type
<sizes>1 2 3</sizes>
Before validation, this element is represented as:
element sizes of type xdt:untyped { text { "1 2 3" } }
Assume the following Schema.
<xs:element name="sizes" type="sizesType"/> <xs:simpleType name="sizesType"> <xs:list itemType="sizeType"/> </xs:simpleType> <xs:simpleType name="sizeType"> <xs:restriction base="xs:integer"/> </xs:simpleType>
After validation against this Schema, the element is represented as:
element sizes of type sizesType { 1 of type sizeType, 2 of type sizeType, 3 of type sizeType }
An element with an anonymous type
<sizes>1 2 3</sizes>
Before validation, this element is represented as:
element sizes of type xdt:untyped { text { "1 2 3" } }
Assume the following Schema.
<xs:element name="sizes"> <xs:simpleType> <xs:list itemType="xs:integer"/> </xs:simpleType> </xs:element>
After validation, this element is represented as:
element sizes of type [Anon1] { 1 of type xs:integer, 2 of type xs:integer, 3 of type xs:integer }
where [Anon1]
stands for the internal anonymous name
generated by the system for the sizes
element.
An nillable element with xsi:type
set to true
<sizes xsi:nil="true"/>
Before validation, this element is represented as:
element sizes of type xdt:untyped { attribute xsi:nil of type xdt:untypedAtomic { "true" } }
Assume the following Schema.
<xs:element name="sizes" type="sizesType" nillable="true"/>
After validation against this Schema, the element is represented as:
element sizes nilled of type sizesType { attribute xsi:nil of type xs:boolean { "true" } }
An element with a union type
<sizes>1 two 3 four</sizes>
Before validation, this element is represented as:
element sizes of type xdt:untypedAtomic { text { "1 two 3 four" } }
Assume the following Schema:
<xs:element name="sizes" type="sizesType"/> <xs:simpleType name="sizesType"> <xs:list itemType="sizeType"/> </xs:simpleType> <xs:simpleType name="sizeType"> <xs:union memberType="xs:integer xs:string"/> </xs:simpleType>
After validation against this Schema, the element is represented as:
element sizes of type sizesType { 1 of type xs:integer, "two" of type xs:string, 3 of type xs:integer, "four" of type xs:string }
The [XPath/XQuery] type system is used in the specification of the dynamic and of the static semantics of [XPath/XQuery]. This section introduces formal notations for describing types.
The [XPath/XQuery] type system is based on [Schema Part 1] and [Schema Part 2]. [Schema Part 1] and [Schema Part 2] specify normatively the schema information available in [XPath/XQuery]. Formalizing the treatment of types in [XPath/XQuery], however, requires some adjustments.
Use of formal notations for types. The Formal Semantics uses formal notations for types instead of XML Schema syntax. These notations are used extensively to describe and manipulate types in the inference rules. The formal notations for types introduced here are not exposed to the [XPath/XQuery] user.
Representation of content models. For the purpose of static typing,
the [XPath/XQuery] type system only describes minOccurs, maxOccurs
combinations that correspond to the DTD operators +
,
*
, and ?
. Choices are represented using the DTD
operator |
. All groups are represented using the
&
notation.
Representation of anonymous types. To clarify the semantics, the [XPath/XQuery] type system makes all anonymous types explicit.
Representation of XML Schema simple type facets and identity constraints. For simplicity, XML Schema simple type facets and identity constraints are not formally represented in the [XPath/XQuery] type system. However, an [XPath/XQuery] implementation supporting XML Schema import and validation must must take simple type facets and identity constraints into account.
The mapping from XML Schema into the [XPath/XQuery] type system is given in [8 Importing Schemas]. The rest of this section is organized as follows. [2.3.2 Item types] describes types items, [2.3.3 Content models] describes content models, and [2.3.4 Top level definitions] describe top-level type declarations.
An item type is either an atomic type, an element type, an attribute type, a document node type, a text node type, a comment node type, or a processing instruction type.
[30 (Formal)] | ItemType |
::= | AtomicTypeName | NodeType |
[32 (Formal)] | AtomicTypeName |
::= | QName |
[31 (Formal)] | NodeType |
::= | ElementType |
[33 (Formal)] | ElementType |
::= | "element" ElementName? TypeSpecifier? |
[34 (Formal)] | TypeSpecifier |
::= | Nillable? TypeReference |
[35 (Formal)] | AttributeType |
::= | "attribute" AttributeName?
TypeReference? |
[36 (Formal)] | Nillable |
::= | "nillable" |
[40 (Formal)] | TypeReference |
::= | "of" "type" TypeName |
[51 (Formal)] | DocumentType |
::= | "document" ("{" Type? "}")? |
An element or attribute type has an optional name and an optional type reference. A name alone corresponds to a reference to a global element or attribute declaration. A name with a type reference corresponds to a local element or attribute declaration. The word "element" or "attribute" alone refers to the wildcard types for any element or any attribute. In addition, an element type has an optional nillable flag that indicates whether the element can be nilled or not.
A document type has an optional content type. If no content type is given, then it refers to the wildcard type describing any document.
Note
Generic node types (e.g., node()
), are interpreted in the
type system as union types (e.g., element | attribute | text | comment
| processing-instruction
) and therefore do not appear here. The
semantics of sequence types is described in [3.4.4 SequenceType Matching].
Examples
The following is a text node type
text
The following is a type for all elements
element
The following is a type for all elements of type string
element of type xs:integer
The following is a type for a nillable element of type string
element size nillable of type xs:integer
The following is a reference to a global attribute declaration
attribute sizes
The following is a type for elements with anonymous type
[Anon1]
element sizes of type [Anon1]
Following XML Schema, types in [XPath/XQuery] are composed from item types
by optional, one or more, zero or more, all group, sequence, choice, empty
sequence, or empty choice (written none
).
The type empty
matches the empty sequence. The type
none
matches no values. It is called the empty choice
because it is the identity for choice, that is (Type | none
) = Type)). The type none
is the static
type for [6.2.5 The fn:error function].
[28 (Formal)] | Type |
::= | ItemType |
[29 (Formal)] | Occurrence |
::= | "*" | "+" | "?" |
The [XPath/XQuery] type system includes three binary operators on types: ",", "|" and "&", corresponding respectively to sequence, choice and all groups in Schema. The [XPath/XQuery] type system includes three unary operators on types: "*", "+", and "?", corresponding respectively to zero or more instances of the type, one or more instances of the type, or an optional instance of the type.
The "&" operator builds the "interleaved product" of two types. The
type Type1 &
Type2 matches
any sequence that is an interleaving of a sequence that matches Type1 and a
sequence that matches Type2. The
interleaved product represents all groups in XML Schema. All groups in XML
Schema are restricted to apply only on global or local element declarations
with lower bound 0 or 1, and upper bound 1.
Examples
A sequence of elements
The "," operator builds the "sequence" of two types. For example,
element title of type xs:string, element year of type xs:integer
is a sequence of an element title of type string followed by an element year of type integer.
The union of two element types
The "|" operator builds the "union" of two types. For example,
element editor of type xs:string | element bib:author
means either an element editor of type string, or a reference to the
global element bib:author
.
An all group of two elements
The "&" operator builds the "interleaved product" of two types. For example,
(element a & element b) = element a, element b | element b, element a
which specifies that the a
and b
elements can
occur in any order.
An empty type
The following type matches the empty sequence.
empty
A sequence of zero or more elements
The following type matches zero or more elements each of which can be a
surgeon
or a plumber
.
(element surgeon | element plumber)*
Top level definitions correspond to global element declarations, global attribute declarations and type definitions in XML Schema.
[43 (Formal)] | Definition |
::= | ("define" "element" ElementName
Substitution? Nillable? TypeReference) |
[44 (Formal)] | Substitution |
::= | "substitutes" "for" ElementName |
[37 (Formal)] | TypeDerivation |
::= | ComplexTypeDerivation |
AtomicTypeDerivation |
[38 (Formal)] | ComplexTypeDerivation |
::= | Derivation? Mixed? "{" Type?
"}" |
[39 (Formal)] | AtomicTypeDerivation |
::= | "restricts" AtomicTypeName |
[41 (Formal)] | Derivation |
::= | ("restricts" TypeName) |
[42 (Formal)] | Mixed |
::= | "mixed" |
A type definition has a name (possibly anonymous) and a type derivation.
In the case of a complex type, or a simple type derived by list or union,
derivation indicates if the type is derived by extension or restriction from
its base type, gives its content model, and an optional flag indicating if it
has mixed content. In the case of an atomic type, it just indicates from
which type it is derived. When the type derivation is omitted, the type
derives by restriction from xs:anyType
, as in:
define type Bib { xs:integer } = define type Bib restricts xs:anyType { xs:integer }
Empty content can be indicated with the explicit empty sequence, or omitted, as in:
define type Bib { } = define type Bib { empty }
Global element and attribute declarations always have a name and a reference to a (possibly anonymous) type. A global element declaration also may declare a substitution group for the element and whether the element is nillable.
Examples
A type declaration with complex content
define type Address { element name of type xsd:string, element street of type xsd:string* }
A type declaration with complex content derived by extension
define type USAddress extends Address { element zip name of type xsd:integer }
A type declaration with mixed content
define type Section mixed { (element h1 of type xsd:string | element p of type xsd:string | element div of type Section)* }
A type declaration with simple content derived by restriction
define type SKU restricts xsd:string { xsd:string }
An element declaration
define element address of type Address
An element declaration with a substitution group
define element usaddress substitutes for address of type USAddress
An element declaration which is nillable
define element zip nillable of type xs:integer
Here is a schema describing purchase orders from [XML Schema Part 0].
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <xsd:annotation> <xsd:documentation xml:lang="en"> Purchase order schema for Example.com. Copyright 2000 Example.com. All rights reserved. </xsd:documentation> </xsd:annotation> <xsd:element name="purchaseOrder" type="PurchaseOrderType"/> <xsd:element name="comment" type="xsd:string"/> <xsd:complexType name="PurchaseOrderType"> <xsd:sequence> <xsd:element name="shipTo" type="USAddress"/> <xsd:element name="billTo" type="USAddress"/> <xsd:element ref="comment" minOccurs="0"/> <xsd:element name="items" type="Items"/> </xsd:sequence> <xsd:attribute name="orderDate" type="xsd:date"/> </xsd:complexType> <xsd:complexType name="USAddress"> <xsd:sequence> <xsd:element name="name" type="xsd:string"/> <xsd:element name="street" type="xsd:string"/> <xsd:element name="city" type="xsd:string"/> <xsd:element name="state" type="xsd:string"/> <xsd:element name="zip" type="xsd:decimal"/> </xsd:sequence> <xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/> </xsd:complexType> <xsd:complexType name="Items"> <xsd:sequence> <xsd:element name="item" minOccurs="0" maxOccurs="unbounded"> <xsd:complexType> <xsd:sequence> <xsd:element name="productName" type="xsd:string"/> <xsd:element name="quantity"> <xsd:simpleType> <xsd:restriction base="xsd:positiveInteger"> <xsd:maxExclusive value="100"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <xsd:element name="USPrice" type="xsd:decimal"/> <xsd:element ref="comment" minOccurs="0"/> <xsd:element name="shipDate" type="xsd:date" minOccurs="0"/> </xsd:sequence> <xsd:attribute name="partNum" type="SKU" use="required"/> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <!-- Stock Keeping Unit, a code for identifying products --> <xsd:simpleType name="SKU"> <xsd:restriction base="xsd:string"> <xsd:pattern value="\d{3}-[A-Z]{2}"/> </xsd:restriction> </xsd:simpleType> </xsd:schema>
Here is the mapping of the above schema into the [XPath/XQuery] type system.
namespace xsd = "http://www.w3.org/2001/XMLSchema" define element purchaseOrder of type PurchaseOrderType define element comment of type xsd:string define type PurchaseOrderType { attribute orderDate of type xsd:date?, element shipTo of type USAddress, element billTo of type USAddress, element comment?, element items of type Items } define type USAddress { attribute country of type xsd:NMTOKEN?, element name of type xsd:string, element street of type xsd:string, element city of type xsd:string, element state of type xsd:string, element zip of type xsd:decimal } define type Items { attribute partNum of type SKU, element item of type [Anon1]* } define type [Anon1] { element productName of type xsd:string, element quantity of type [Anon2], element USPrice of type xsd:decimal, element comment?, element shipDate of type xsd:date? } define type [Anon2] restricts xsd:positiveInteger define type SKU restrict xsd:string
Note that the two anonymous types in the item
element
declarations are mapping to types with names [Anon1]
and
[Anon2]
.
The following additional definitions illustrate how more advanced XML Schema features (a complex type derived by extension, an anonymous simple type derived by restriction, and substitution groups) are represented in the [XPath/XQuery] type system.
<complexType name="NYCAddress"> <complexContent> <extension base="USAddress"> <sequence> <element ref="apt"/> </sequence> </extension> </complexContent> </complexType> <element name="apt"> <xsd:simpleType> <xsd:restriction base="xsd:positiveInteger"> <xsd:maxExclusive value="10000"/> </xsd:restriction> </xsd:simpleType> </element> <element name="usaddress" substitutionGroup="address" type="USAddress"/> <element name="nycaddress" substitutionGroup="usaddress" type="NYCAddress"/>
The above definitions are mapped into the [XPath/XQuery] type system as follows:
define type NYCAddress extends USAddress { element apt } define element apt of type [Anon3] define type [Anon3] restricts xsd:positiveInteger define element usaddress substitutes for address of type USAddress define element nycaddress substitutes for usaddress of type NYCAddress
This section defines a processing model for [XPath/XQuery], then defines formal judgments for each key phase in that processing model (normalization, static type analysis and dynamic evaluation).
The [XPath/XQuery] processing model is defined in Section 2.2 Processing ModelXQ, which contains the following figure depicting the processing model.
Figure 1: Processing Model Overview
This processing model is not intended to describe an actual implementation, although a naive implementation might be based upon it. It does not prescribe an implementation technique, but any implementation should produce the same results as obtained by following this processing model and applying the rest of the Formal Semantics specification.
Query processing consists of two phases: a static analysis phase and a dynamic evaluation phase. Static analysis is further divided into four sub-phases. Each phase consumes the result of the previous phase and generates output for the next phase. For each processing phase, we point to the relevant notations introduced later in the document.
The static analysis phase (sometimes also called "compilation"), can be performed on a [expression/query] before examining any input document. Indeed, static analysis can be performed before input documents even exist. Static analysis can detect many errors early, e.g., syntax errors or type errors. If no error occurs, the result of static analysis could be some compiled form of [expression/query], suitable for execution by a compiled-[expression/query] processor. Static analysis consists of the following sub-phases:
Parsing. (Step SQ1 in Figure 1). The grammar for the [XPath/XQuery] syntax is defined in [XQuery 1.0: A Query Language for XML]. Parsing may generate syntax errors. If no error occurs, an internal operation tree of the parsed query is created.
Static Context Processing. (Steps SQ2, SQ3, and SQ4 in Figure 1). The static semantics of [expression/query] depends on the static input context. The static input context needs to be generated before the [expression/query] can be analysed. In XQuery, static the input context may be defined by the processing environment and by declarations in the Query Prolog (See [5 Modules and Prologs]). In XPath, the static input context is defined by the processing environment. The static input context is denoted by statEnv.
Normalization. (Step SQ5 in Figure
1). To simplify the semantics specification, some normalization is performed
on the [expression/query]. The [XPath/XQuery] language provides many powerful
features that make [expression/query]s simpler to write and use, but are also
redundant. For instance, a complex for
expression might be
rewritten as a composition of several simple for
expressions.
The language composed of these simpler [expression/query] is called the
[XPath/XQuery] Core language and is described by a grammar which is
a subset of the XQuery grammar. The grammar of the [XPath/XQuery] Core
language is given in [A Normalized core
grammar].
During the normalization phase, each [XPath/XQuery] [expression/query] is mapped into its equivalent [expression/query] in the core. (Note that this has nothing to do with Unicode Normalization, which works on character strings.) Normalization works by bottom-up application of normalization rules over expressions, starting with normalization of literal expressions and variables.
Specifically the normalization phase is defined in terms of the static part of the context (statEnv) and a [expression/query] (Expr) abstract syntax tree. Formal notations for the normalization phase are introduced in [2.4.2 Normalization judgment].
After normalization, the full semantics is obtained by giving a semantics to the normalized Core [expression/query]. This is done during the last two phases.
Static type analysis. (Step SQ6 in Figure 1). Static type analysis is optional. If this phase is not supported, then normalization is followed directly by dynamic evaluation.
Static type analysis checks whether each [expression/query] is type safe, and if so, determines its static type. Static type analysis is defined only for Core [expression/query]. Static type analysis works by bottom-up application of type inference rules over expressions, taking the type of literals and of input documents into account.
If the [expression/query] is not type-safe, static type analysis yields a type error. For instance, a comparison between an integer value and a string value might be detected as an type error during the static type analysis. If static type analysis succeeds, it yields an abstract syntax tree where each sub-expression is "annotated" with its static type.
More precisely, the static analysis phase is defined in terms of the static context (statEnv) and a core [expression/query] (CoreExpr). Formal notations for the static analysis phase are introduced in [2.4.3 Static typing judgment].
Static typing does not imply that the content of XML documents must be rigidly fixed or even known in advance. The [XPath/XQuery] type system accommodates "flexible" types, such as elements that can contain any content. Schema-less documents are handled in [XPath/XQuery] by associating a standard type with the document, such that it may include any legal XML content.
The dynamic evaluation phase (sometimes also called "execution") evaluates a query on input document(s).
Dynamic Context Processing. (Steps DQ2 and DQ3 in Figure 1).The dynamic semantics of [expression/query] depends on the dynamic input context. The dynamic input context needs to be generated before the [expression/query] can be evaluated. The dynamic input context may be defined by the processing environment and by statements in the Query Prolog (See [5 Modules and Prologs]). In XPath, the dynamic input context is defined by the processing environment. The static input context is denoted by dynEnv.
Dynamic Evaluation. (Steps DQ4 and DQ5 in Figure 1). This phase computes the value of an [expression/query]. The semantics of evaluation is defined only for Core [expression/query] terms. Evaluation works by bottom-up application of evaluation rules over expressions, starting with evaluation of literals and variables. Evaluation may result in a value OR a dynamic error, which may be a non-type error or a type error. If static typing of an expression does not raise a type error, then dynamic evaluation of the same expression will not raise a type error, although dynamic evaluation may raise some non-type error.
The dynamic evaluation phase is defined in terms of the static context (statEnv) and evaluation context (dynEnv), and a core [expression/query] (CoreExpr). Formal notations for the dynamic evaluation phase are introduced in [2.4.4 Dynamic evaluation judgment].
Static type analysis catches only certain classes of errors. For instance,
it can detect a comparison operation applied between incompatible types
(e.g., xs:int
and xs:date
). Some other classes of
errors cannot be detected by the static analysis and are only detected at
evaluation time. For instance, whether an arithmetic expression on 32 bits
integers (xs:int
) yields an out-of-bound value can only be
detected at run-time by looking at the data.
While implementations are free to implement different processing models, the [XPath/XQuery] static semantics relies on the existence of a static type analysis phase that precedes any access to the input data. Statically typed implementations are required to find and report type errors during static analysis, as specified in this document. Dynamically typed implementations are required to find and report type errors during evaluation, but are permitted to report them during static analysis.
Notice that the separation of logical processing into phases is not meant to imply that implementations must separate the static analysis phase from the dynamic evaluation phase; processors may choose to perform all phases simultaneously at evaluation-time and may even mix the phases in their internal implementations. The processing model simply defines the final result.
The above processing phases are all internal to the [XPath/XQuery] processor. They do not deal with how the [XPath/XQuery] processor interacts with the outside world, notably how it accesses actual documents and types. A typical [expression/query] engine would support at least three other important processing phases:
Schema Import Processing. The [XPath/XQuery] type system is based on XML Schema. In order to perform dynamic or static typing, the [XPath/XQuery] processor needs to build type descriptions that correspond to the schema(s) of the input documents. This phase is achieved by mapping all schemas required by the [expression/query] into the [XPath/XQuery] type system. The XML Schema import phase is described in [8 Importing Schemas].
Data Model Generation. Expressions are evaluated on values in the [Data Model]. XML documents must be loaded into the [Data Model] before the evaluation phase. This is described in the [Data Model] and is not discussed further here.
Serialization. Once the [expression/query] is evaluated, processors might want to serialize the result of the [expression/query] as actual XML documents. Serialization of data model instances is described in [Data Model Serialization] and is not discussed further here.
The parsing phase is not specified formally; the formal semantics does not define a formal model for the syntax trees, but uses the [XPath/XQuery] concrete syntax directly. More details about parsing for XQuery 1.0 can be found in the [XQuery 1.0: A Query Language for XML] document and more details about parsing for XPath 2.0 can be found in the [XML Path Language (XPath) 2.0] document. No further discussion of parsing is included here.
Normalization is specified using mapping rules, which describe how a [XPath/XQuery] expression is rewritten into an expression in the [XPath/XQuery] Core. Mapping rules are also used in [8 Importing Schemas] to specify how XML Schemas are imported into the [XPath/XQuery] type system.
Notation
Mapping rules are written using a square bracket notation, as follows:
[Object]Subscript |
== |
Mapped Object |
The original "object" is written above the == sign. The rewritten "object" is written beneath the == sign. The subscript is used to indicate what kind of "object" is mapped, and sometimes to pass some information between mapping rules.
Since normalization is always applied in the presence of a static context, the above rule is a shorthand for:
The static environment is used in certain normalization rules (e.g. for normalization of function calls). To keep the notation simpler, the static environment is not written in the normalization rules, but it is assumed to be available.
Editorial note | |
[Kristoffer/XSL] We should decide whether to use a shorthand notation as suggested or modify the mapping rules throughout. |
The normalization rule that is used to map "top-level" expressions in the [XPath/XQuery] syntax into expressions in the [XPath/XQuery] Core is:
[Expr]Expr |
== |
CoreExpr |
which indicates that the expression Expr is normalized to the expression CoreExpr in the [XPath/XQuery] core (with the implied statEnv).
Example
For instance, the following [expression/query]
for $i in (1, 2), $j in (3, 4) return element pair { ($i,$j) }
is normalized to the core expression
for $i in (1, 2) return for $j in (3, 4) return element pair { ($i,$j) }
in which the "FWLR" expression is mapped into a composition of two simpler "for" expressions.
The static semantics is specified using type inference rules, which relate [XPath/XQuery] expressions to types and specify under what conditions an expression is well typed.
Notation
The judgment
holds when, in the static environment statEnv, the expression Expr has type Type.
Example
The result of static type inference is to associate a static type with every [expression/query], such that any evaluation of that [expression/query] is guaranteed to yield a value that belongs to that type.
For instance, the following expression.
let $v := 3 return $v+5
has type xs:integer
. This can be inferred as follows: the
input literals '3' and '5' have type integer, so the variable $v also has
type integer. Since the sum of two integers is an integer, the complete
expression has type integer.
Note
The type of an expression is computed by inference. Static type inference rules define for each kind of expression how to compute the type of the expression given the types of its sub-expressions. Here is a simple example:
statEnv |-
Expr1 : xs:boolean
statEnv
|- Expr2 : Type2
statEnv
|- Expr3 : Type3 |
|
statEnv |- if
Expr1 then
Expr2 else
Expr3 : ( Type2 | Type3 ) |
This rule states that if the conditional expression of an "if" expression has type boolean, then the type of the entire expression is one of the two types of its "then" and "else" clauses. Note that the resulting type is represented as a union: '(Type2|Type3)'.
The "left half" (the part before the :) of the expression below the line corresponds to some [expression/query], for which a type is computed. If the [expression/query] has been parsed into an internal abstract syntax tree, this usually corresponds to some node in that tree. The expression usually has patterns in it (here Expr1, Expr2, and Expr3) that need to be matched against the children of the node in the abstract syntax tree. The expressions above the line indicate things that need to be computed to use this rule; in this case, the types of the condition expression and the two branches of the if-then-else expression. Once those types are computed (by further applying static inference rules recursively to the expressions on each side), then the type of the expression below the line can be computed. This example illustrates a general feature of the [XPath/XQuery] type system: the type of an expression depends only on the type of its sub-expressions. The overall static type inference algorithm is recursive, following the abstract syntax of the [expression/query]. At each point in the recursion, an appropriate matching inference rule is sought; if at any point there is no applicable rule, then static type inference has failed and the [expression/query] is not type correct.
The dynamic, or operational, semantics is specified using value inference rules, which relate [XPath/XQuery] expressions to values, and in some cases specify the order in which an [XPath/XQuery] expression is evaluated.
Notation
The judgment
holds when, in the static environment statEnv and dynamic environment dynEnv, the expression Expr yields the value Value.
The judgment
holds when, in the static environment statEnv and dynamic environment dynEnv, the expression Expr raises the error Error.
The static environment is used in certain cases (e.g. for type matching) during evaluation. To keep the notation simpler, the static environment is not written in the dynamic inference rules, but it is assumed to be available.
Example
For instance, the following expression.
let $v := 3 return $v+5
yields the integer value 8. This can be inferred as follows: the input literals '3' and '5' denote the values 3 and 5, respectively, so the variable $v has the value 3. Since the sum of 3 and 5 is 8, the complete expression has the value 8.
Note
As with static type inference, logical inference rules are used to determine the value of each expression, given the dynamic environment and the values of its sub-expressions.
The inference rules used for dynamic evaluation, like those for static type inference, follow a bottom-up recursive structure, computing the value of expressions from the values of their sub-expressions.
The Formal Semantics uses the following namespace prefixes.
fn:
for functions and operators from the [Functions and Operators] document.
xs:
for XML Schema components and built-in types.
xdt:
for XML Schema components and built-in types.
All these prefixes are assumed to be bound to the appropriate URIs.
In addition, the Formal Semantics uses the following special prefixes for the means of specification.
dm: for accessors of the [Data Model].
op: for operators in [Functions and Operators].
fs: for functions and types defined in the formal semantics.
These prefixes are always italicized to emphasize that the corresponding functions, variables, and types are abstract: they are not and cannot be made accessible in the host language. None of these special prefixes are given a URI.
The [Functions and Operators] document defines built-in functions available in [XPath/XQuery]. A number of these functions are used to define the [XPath/XQuery] semantics; those functions are listed in [B.1 Functions and Operators used in the Formal Semantics].
Many functions in the [Functions and
Operators] document are generic: they perform operations on
arbitrary components of the data model, e.g., any kind of node, or any
sequence of items. For instance, the fn:distinct-nodes
function
removes duplicates in any sequence of nodes. As a result, the signature given
in the [Functions and Operators] document is
also generic. For instance, the signature of the
fn:distinct-nodes
function is:
fn:distinct-nodes(node*) as node*
As defined, this signature provides little useful type information. For
such functions, better type information can often be obtained by having the
output type depend on the type of input parameters. For instance, if the
function fn:distinct-nodes
is applied on a parameter of type
element a*, element b
, one can easily deduce that the resulting
sequence is a collection of either a
or b
elements.
In order to provide better static typing for those functions, specific typing rules are given in [6 Additional Semantics of Functions].
The organization of this section parallels the organization of Section 2 BasicsXQ.
Introduction
The expression context for a given expression consists of all the information that can affect the result of the expression. This information is organized into the static context and the dynamic context. This section specifies the environments that represent the context information used by [XPath/XQuery] expressions.
The environment group statEnv denotes the environments that are available during static analysis. Static analysis may extend parts of the static environment. The static environment is also available during dynamic evaluation.
If analysis of an expression relies on some component of the static context that has not been assigned a value, a static error is raised. This constraint is formalized in [3.5 Error Handling].
The following environments are part of the static environment group:
statEnv.xpath1.0_compatibility |
|
||||
statEnv.namespace |
|
||||
statEnv.default_elem_namespace |
|
||||
statEnv.default_function_namespace |
|
||||
statEnv.typeDefn |
|
||||
statEnv.elemDecl |
|
||||
statEnv.attrDecl |
|
||||
statEnv.varType |
|
||||
statEnv.funcType |
|
||||
statEnv.collations |
|
||||
statEnv.defaultCollation |
|
||||
statEnv.validationMode |
|
||||
statEnv.validationContext |
|
||||
statEnv.baseUri |
|
||||
statEnv.docType |
|
||||
statEnv.collectionType |
|
The XMLSpace policy in the [XPath/XQuery] static context is not represented in the Formal Semantics. All white-space processing the query text should be handled during parsing, and therefore is formalized in this document.
Environments have an initial state when [expression/query] processing begins, containing, for example, the function signatures of all built-in functions. The initial values for the static context are defined in Section C Context ComponentsXQ and Section C Context ComponentsXP and is denoted by statEnvDefault in the Formal Semantics.
Here is an example that shows how the static environment is modified in response to a namespace definition.
|
||
|
||
statEnv |- declare
namespace NCName = URI Expr* |
This rule reads as follows: "the phrase on the bottom (a namespace declaration in the query prolog followed by a sequence of expressions) is well-typed (accepted by the static type inference rules) within an environment statEnv if the sequence of expressions above the line is well-typed in the environment obtained from statEnv by adding the namespace declaration".
This is a common idiom for adding new information to an environment and passing that environment for use in sub-expressions. If the environment must be updated with a completely new component, the following notation is used:
statEnv [ namespace = (NewEnvironment) ]The helper function fs:active_ns
(statEnv) returns all the active in-scope namespaces in the
given static environment.
For each attribute and element node in Value, such that the node has name expanded-QName in the namespace
URI, the helper function fs:get_ns_from_items
(statEnv, Value) returns the in-scope namespace that
corresponds to URI. This is a reverse-lookup on statEnv.namespace by URI.
A common use of the static environment is to expand a QName by looking up the URI that corresponds to the QName's namespace prefix in the statEnv.namespace environment and by constructing an expanded-QName DM, which contains the URI and the QName's local part.
The auxiliary judgments below expand an element, type, attribute, variable, or function QName by looking up the namespace prefix in statEnv.namespace or, if the QName is unqualified, by using the appropriate default namespace.
Notation
The judgment
holds when the element or type QName expands to the given expanded QName.
The judgment
holds when the attribute QName expands to the given expanded QName.
The judgment
holds when the variable QName expands to the given expanded QName.
The judgment
holds when the function QName expands to the given expanded QName.
Semantics
An element or type QName consisting of a prefix NCName and a local part NCName expands to the URI that corresponds to the prefix and the local part.
statEnv.namespace(NCName1) = URI |
|
statEnv |- NCName1:NCName2of elem/type expands to (URI,NCName2) |
An element or type QName consisting only of a local part NCName expands to the default element/type namespace URI and the local part.
statEnv.default_elem_namespace = URI |
|
statEnv |- NCName of elem/type expands to (URI,NCName) |
An attribute QName consisting of a prefix NCName and a local part NCName expands to the URI that corresponds to the prefix and the local part.
statEnv.namespace(NCName1) = URI |
|
statEnv |- NCName1:NCName2of attr expands to (URI,NCName2) |
An attribute QName consisting only of a local part NCName expands to the empty namespace and the local part.
statEnv.default_elem_namespace = URI |
|
statEnv |- NCName of attr expands to ("",NCName) |
A variable QName consisting of a prefix NCName and a local part NCName expands to the URI that corresponds to the prefix and the local part.
statEnv.namespace(NCName1) = URI |
|
statEnv |- NCName1:NCName2of var expands to (URI,NCName2) |
A variable QName consisting only of a local part NCName expands to the empty namespace and the local part.
statEnv.default_elem_namespace = URI |
|
statEnv |- NCName of var expands to ("",NCName) |
A function QName consisting of a prefix NCName and a local part NCName expands to the URI that corresponds to the prefix and the local part.
statEnv.namespace(NCName1) = URI |
|
statEnv |- NCName1:NCName2 of func expands to (URI,NCName2) |
A function QName consisting only of a local part NCName expands to the default function namespace URI and the local part.
statEnv.default_function_namespace = URI |
|
statEnv |- NCName of func expands to (URI,NCName) |
The environment group dynEnv denotes the group of environments built and used during dynamic evaluation.
If evaluation of an expression relies on some component of the dynamic context that has not been assigned a value, a dynamic error is raised. This constraint is formalized in [3.5 Error Handling].
The following environments are part of evaluation environment group:
dynEnv.funcDefn |
|
|||||
dynEnv.varValue |
|
|||||
dynEnv.dateTime |
|
|||||
dynEnv.timezone |
|
|||||
dynEnv.docValue |
|
|||||
dynEnv.collectionValue |
|
The initial values for the dynamic context are defined in Section C Context ComponentsXQ and Section C Context ComponentsXP and is denoted by dynEnvDefault in the Formal Semantics.
The following Formal Semantics built-in variables represent the context item, context position, and context size properties of the dynamic context:
Built-in Variable | Represents: |
$ fs:dot |
context item |
$ fs:position |
context position |
$ fs:last |
context size |
Variables with the "fs" namespace prefix are reserved for use in the definition of the Formal Semantics. It is a static error to define a variable in the "fs" namespace.
Values of $
fs:dot
and
$
fs:position
and $
fs:last
can be
obtained by invoking the fn:position
and fn:last
functions, respectively.
A simplified version of the processing model, used as the basis for formalization is given in [2.4 Processing model and main judgments].
[XPath/XQuery] is most generally used to process documents. The representation of a document is normatively defined in [Data Model]. The functions used to access documents and collections are normatively defined in [Functions and Operators].
Document order is defined in [Data Model].
Atomization converts an item sequence into a sequence of atomic values and
is implemented by the fn:data
function. Atomization is applied
in contexts where an arbitrary sequence of items is used where a sequence of
atomic values is expected.
If a sequence of items is encountered where a boolean value is expected,
the item sequence's effective boolean value is used. The
fn:boolean
function returns the effective boolean value of an
item sequence.
[XPath/XQuery] has a set of functions that provide access to input data. These functions are of particular importance because they provide a way in which an expression can reference a document or a collection of documents. The dynamic semantics of these three input functions are described in more detail in [Functions and Operators].
Editorial note | |
Mary: What should be the normative definition of the xdt types? These definitions must be normative for the purpose of the type rules in this document. |
All the built-in types of XML Schema are recognized by [XPath/XQuery]. In addition, [XPath/XQuery] recognizes the predefined types Section 1.5.1 xdt:anyAtomicTypeFO, Section 1.5.2 xdt:untypedAtomicFO and Section 1.5.3 xdt:untypedAnyFO and the duration subtypes Section 9.2.1 xdt:yearMonthDurationFO and Section 9.2.2 xdt:dayTimeDurationFO. The representation of those types in the [XPath/XQuery] type system is given below.
[Definition: The following type definition of
xs:anyType
reflects the semantics of the Ur
type from Schema in the [XPath/XQuery] type system.]
define type xs:anyType restricts xs:anyType { attribute*, ( xdt:anyAtomicType* | ( element? & text? & comment? & processing-instruction? )* ) }
[Definition: The following type definition of
xs:anySimpleType
reflects the semantics of
the Ur simple type from Schema in the [XPath/XQuery] type system.]
define type xs:anySimpleType restricts xs:anyType { ( xs:string | xs:boolean | xs:decimal | xs:float | xs:double | xs:duration | xs:dateTime | xs:time | xs:date | xs:gYearMonth | xs:gYear | xs:gMonthDay | xs:gDay | xs:gMonth | xs:hexBinary | xs:base64Binary | xs:anyURI | xs:QName | xs:NOTATION )* }
The name of the Ur simple type is xs:anySimpleType
. It is
derived by restriction from xs:anyType
, its content a sequence
of the union of all primitive atomic types.
[Definition: The following type definition of
xdt:anyAtomicType
reflects the semantics of xdt:anyAtomicType
in the
[XPath/XQuery] type system.]
define type xdt:anyAtomicType restricts xs:anySimpleType { ( xs:string | xs:boolean | xs:decimal | xs:float | xs:double | xs:duration | xs:dateTime | xs:time | xs:date | xs:gYearMonth | xs:gYear | xs:gMonthDay | xs:gDay | xs:gMonth | xs:hexBinary | xs:base64Binary | xs:anyURI | xs:QName | xs:NOTATION | xdt:untypedAtomic) }
[Definition: The following type definitions of the XML Schema primitive types reflects the semantics of the primitive types from Schema in the [XPath/XQuery] type system.]
define type xs:string restricts xdt:anyAtomicType define type xs:boolean restricts xdt:anyAtomicType define type xs:decimal restricts xdt:anyAtomicType define type xs:float restricts xdt:anyAtomicType define type xs:double restricts xdt:anyAtomicType define type xs:duration restricts xdt:anyAtomicType define type xs:dateTime restricts xdt:anyAtomicType define type xs:time restricts xdt:anyAtomicType define type xs:date restricts xdt:anyAtomicType define type xs:gYearMonth restricts xdt:anyAtomicType define type xs:gYear restricts xdt:anyAtomicType define type xs:gMonthDay restricts xdt:anyAtomicType define type xs:gDay restricts xdt:anyAtomicType define type xs:gMonth restricts xdt:anyAtomicType define type xs:hexBinary restricts xdt:anyAtomicType define type xs:base64Binary restricts xdt:anyAtomicType define type xs:anyURI restricts xdt:anyAtomicType define type xs:QName restricts xdt:anyAtomicType define type xs:NOTATION restricts xdt:anyAtomicType
All of those primitive types derive from xdt:anyAtomicType
. Note that the
value space of each atomic type (such as xs:string
) does not
appear. The value space for each type is built-in and is as defined in
[Schema Part 2].
[Definition: The type xdt:untypedAtomic
is
defined as follows.]
define type xdt:untypedAtomic restricts xdt:anyAtomicType
Note that this rule does not indicate the value space of xdt:untypedAtomic
. By definition,
xdt:untypedAtomic
has the
same value space as xs:string
.
The following example shows two atomic values. The first one is a value of type string containing "Database". The second one is an untyped value containing "Database". Both are using a string as content, but they have different type annotations.
"Databases" of type xs:string "Databases" of type xdt:untypedAtomic
[Definition: The type xdt:untypedAny
is defined as
follows.]
define type xdt:untypedAny restricts xs:anyType { attribute of type xdt:untypedAtomic*, ( element of type xdt:untypedAny? & text? & comment? & processing-instruction? )* }
[Definition: The following type definitions of the XML Schema derived types reflects the semantics of the XML Schema types derived derived by restriction from another atomic type.]
define type xs:normalizedString restricts xs:string define type xs:token restricts xs:normalizedString define type xs:language restricts xs:token define type xs:NMTOKEN restricts xs:token define type xs:Name restricts xs:token define type xs:NCName restricts xs:Name define type xs:ID restricts xs:Name define type xs:IDREF restricts xs:Name define type xs:ENTITY restricts xs:Name define type xs:integer restricts xs:decimal define type xs:nonPositiveInteger restricts xs:integer define type xs:negativeInteger restricts xs:nonPositiveInteger define type xs:long restricts xs:integer define type xs:int restricts xs:long define type xs:short restricts xs:int define type xs:byte restricts xs:short define type xs:nonNegativeInteger restricts xs:integer define type xs:unsignedLong restricts xs:nonNegativeInteger define type xs:unsignedInt restricts xs:unsignedLong define type xs:unsignedShort restricts xs:unsignedInt define type xs:unsignedByte restricts xs:unsignedShort define type xs:positiveInteger restricts xs:nonNegativeInteger
Three XML Schema built-in derived types are derived by list, as follows.
Note that those derive directly from xs:anySimpleType
, since
they are derived by list, and that their value space is defined using a "one
or more" occurrence indicator.
define type xs:NMTOKENS restricts xs:anySimpleType { xs:NMTOKEN+ } define type xs:IDREFS restricts xs:anySimpleType { xs:IDREF+ } define type xs:ENTITIES restricts xs:anySimpleType { xs:ENTITY+ }
For example, here is an element whose content is of type
xs:IDREFS
.
element a of type xs:IDREFS { "id1" of type xs:IDREF, "id2" of type xs:IDREF, "id3" of type xs:IDREF }
Note that the type name xs:IDREFS
derives from
xs:anySimpleType
, but not from xs:IDREF
. As a
consequence, calling the following three XQuery functions with the element
a
as a parameter succeeds for f1
and
f2
, but raises a type error for f3
.
define function f1($x as element(*,xs:anySimpleType)) { $x } define function f2($x as element(*,xs:IDREFS)) { $x } define function f3($x as element(*,xs:IDREF)) { $x }
[Definition: The totally ordered duration types,
xdt:yearMonthDuration
and
xdt:dayTimeDuration
, are
derived by restriction from xs:duration
.]
define type xdt:yearMonthDuration restricts xs:duration define type xdt:dayTimeDuration restricts xs:duration
The typed value of a node is computed by the fn:data
function, and the string value of a node is computed by the
fn:string
function, defined in [Functions and Operators]. The normative definitions
of typed value and string value are defined in [Data Model].
Editorial note | |
Mary: Jerome is working on this section. |
Introduction
SequenceTypes can be used in [XPath/XQuery] to refer to a type imported from a schema (see [5 Modules and Prologs]). SequenceTypes are used to declare the types of function parameters and in several kinds of [XPath/XQuery] expressions.
The syntax of SequenceTypes is described by the following grammar productions.
[125 (XQuery)] | SequenceType |
::= | (ItemType OccurrenceIndicator?) |
[144 (XQuery)] | OccurrenceIndicator |
::= | "?" | "*" | "+" |
[127 (XQuery)] | ItemType |
::= | AtomicType | KindTest | ("item" "(" ")") |
[126 (XQuery)] | AtomicType |
::= | QName |
[128 (XQuery)] | KindTest |
::= | DocumentTest |
[137 (XQuery)] | PITest |
::= | "processing-instruction" "(" (NCName | StringLiteral)? ")" |
[139 (XQuery)] | CommentTest |
::= | "comment" "(" ")" |
[140 (XQuery)] | TextTest |
::= | "text" "(" ")" |
[141 (XQuery)] | AnyKindTest |
::= | "node" "(" ")" |
[138 (XQuery)] | DocumentTest |
::= | "document-node" "(" ElementTest? ")" |
[129 (XQuery)] | ElementTest |
::= | "element" "(" ((SchemaContextPath ElementName) |
[131 (XQuery)] | ElementName |
::= | QName |
[130 (XQuery)] | AttributeTest |
::= | "attribute" "(" ((SchemaContextPath AttributeName) |
[132 (XQuery)] | AttributeName |
::= | QName |
[142 (XQuery)] | SchemaContextPath |
::= | SchemaGlobalContext
"/" (SchemaContextStep
"/")* |
[14 (XQuery)] | SchemaGlobalContext |
::= | QName | SchemaGlobalTypeName |
[15 (XQuery)] | SchemaContextStep |
::= | QName |
[13 (XQuery)] | SchemaGlobalTypeName |
::= | "type" "(" QName ")" |
[133 (XQuery)] | TypeName |
::= | QName |
[134 (XQuery)] | ElementNameOrWildcard |
::= | ElementName | "*" |
The semantics of SequenceTypes is defined by means of normalization rules from SequenceTypes to the [XPath/XQuery] type system (see [2.3 The [XPath/XQuery] Type System]).
Core Grammar
The Core grammar productions for sequence types are:
[86 (Core)] | SequenceType |
::= | (ItemType OccurrenceIndicator?) |
[105 (Core)] | OccurrenceIndicator |
::= | "?" | "*" | "+" |
[88 (Core)] | ItemType |
::= | AtomicType | KindTest | ("item" "(" ")") |
[87 (Core)] | AtomicType |
::= | QName |
[89 (Core)] | KindTest |
::= | DocumentTest |
[98 (Core)] | PITest |
::= | "processing-instruction" "(" (NCName | StringLiteral)? ")" |
[100 (Core)] | CommentTest |
::= | "comment" "(" ")" |
[101 (Core)] | TextTest |
::= | "text" "(" ")" |
[102 (Core)] | AnyKindTest |
::= | "node" "(" ")" |
[99 (Core)] | DocumentTest |
::= | "document-node" "(" ElementTest? ")" |
[90 (Core)] | ElementTest |
::= | "element" "(" ((SchemaContextPath ElementName) |
[92 (Core)] | ElementName |
::= | QName |
[91 (Core)] | AttributeTest |
::= | "attribute" "(" ((SchemaContextPath AttributeName) |
[93 (Core)] | AttributeName |
::= | QName |
[103 (Core)] | SchemaContextPath |
::= | SchemaGlobalContext "/"
(SchemaContextStep
"/")* |
[10 (Core)] | SchemaGlobalContext |
::= | QName | SchemaGlobalTypeName |
[11 (Core)] | SchemaContextStep |
::= | QName |
[9 (Core)] | SchemaGlobalTypeName |
::= | "type" "(" QName ")" |
[94 (Core)] | TypeName |
::= | QName |
Editorial note | |
Note that normalization on SequenceTypes does not occur during the normalization phase but whenever a dynamic or static rule requires it. The reason for this deviation from the processing model is that the result of SequenceType normalization is not part of the [XPath/XQuery] syntax (See Issue 432 (FS-Issue-0089)). SequenceType normalization is the only occurrence of such a deviation in the Formal Semantics. |
Introduction
During processing of a query, it is sometimes necessary to determine whether a given value matches a type that was declared using the SequenceType syntax. This process is known as SequenceType matching, and is formally specified in [7.4 Judgments for type matching].
Sequence type matching is defined between a value and a type, rules to normalize a sequence type to the [XPath/XQuery] type system are necessary.
Notation
To define normalization of SequenceTypes to the [XPath/XQuery] type system, the following auxiliary mapping rule is used.
[SequenceType]sequencetype |
== |
Type |
specifies that SequenceType is mapped to Type, in the [XPath/XQuery] type system.
OccurenceIndicators are left unchanged when normalizing SequenceTypes into [XPath/XQuery] types. Each kind of SequenceType component is normalized separately into the [XPath/XQuery] type system.
[ItemType OccurrenceIndicator]sequencetype |
== |
[ItemType]sequencetype OccurrenceIndicator |
The "empty" sequence type is mapped to the empty sequence type in the [XPath/XQuery] type system.
[empty()]sequencetype |
== |
empty |
All of the KindTest SequenceType components are mapped directly into the [XPath/XQuery] type system.
[element()]sequencetype |
== |
element |
[element(*)]sequencetype |
== |
element |
[element(*,*)]sequencetype |
== |
element |
[attribute()]sequencetype |
== |
attribute |
[attribute(*)]sequencetype |
== |
attribute |
[attribute(*,*)]sequencetype |
== |
attribute |
The mapping still does not handle SequenceTypes using SchemaContext and nillable annotation (See Issue 481 (FS-Issue-0138)). The following rules map references to a global element or attribute, without taking the schema context or nillable into account.
[element(ElementName)]sequencetype |
== |
element ElementName |
[attribute(AttributeName)]sequencetype |
== |
attribute AttributeName |
[element(ElementName,TypeName)]sequencetype |
== |
element ElementName of type TypeName |
[attribute(AttributeName,TypeName)]sequencetype |
== |
attribute AttributeName of type TypeName |
[element(ElementName,TypeName nillable)]sequencetype |
== |
element ElementName nillable of type TypeName |
Document nodes, text nodes, and atomic types are left unchanged.
[text()]sequencetype |
== |
text |
[document-node()]sequencetype |
== |
document |
[document-node(ElementTest)]sequencetype |
== |
document { [ElementTest]sequencetype } |
[comment()]sequencetype |
== |
comment |
[processing-instruction()]sequencetype |
== |
processing-instruction |
[processing-instruction(String)]sequencetype |
== |
processing-instruction |
The SequenceType components "node()" and "item()" correspond to wildcard
types. node()
indicates that any node is allowed and
item()
indicates that any node or atomic value is allowed. The
following mapping rules make use of the corresponding wildcard types.
[node()]sequencetype |
== |
(element | attribute | text | document | comment | processing-instruction) |
[item()]sequencetype |
== |
(element | attribute | text |
document | comment | processing-instruction | xdt:anyAtomicType ) |
Editorial note | |
Jerome: The Formal Semantics makes
use of fs:numeric which is not in XML Schema. This is
necessary for the specification of some of XPath type conversion rules. |
Expressions can raise errors during static analysis or dynamic evaluation.
The [Functions and Operators] [XQuery 1.0: A Query Language for XML], and [XML Path Language (XPath) 2.0] specify the conditions under
which an expression or operator raises an error. The user may raise an error
explicitly by calling the fn:error
function, which takes an
optional item as an argument.
Notation
The symbol Error denotes any error. We distinguish between a static errorXQ (denoted by statError), a type errorXQ (denoted by typeError), and a generic dynamic errorXQ (denoted by dynError), which represents all dynamic errors. A static error is raised during static analysis. A type error may be raised during static analysis or dynamic evaluation. A dynamic error is raised during dynamic evaluation. Non-type static errors are not formalized in this document.
Editorial note | |
We use a generic dynamic error. The [Functions and Operators] and [XQuery 1.0: A Query Language for XML] documents raise specific error conditions, but because these error conditions can be implemented in any way, we do not formalize them here. |
In general, when an error is raised during evaluation of some expression Expr, the error is propogated to the expression Expr1 in which Expr is evaluated. The expression Expr1, in turn, propogates the error to the expression in which Expr1 is evaluated, and so on, until the error is returned to the query environment.
Since most expressions propogate errors as described, we use one inference rule to specify this default behavior. The rule below states that if any sub-expression Expri of expression Expr raises an error dynError then Expr also raises dynError.
There are several expressions, such as [4.6 Logical Expressions] and [4.11 Quantified Expressions], that do not necessarily propogate an error raised by some sub-expression. For each such expression, we give specific error inference rules.
If analysis (evaluation) of an expression relies on some component of the static (dynamic) context that has not been assigned a value, a static (dynamic) error is raised. The following two rules handle all those cases when a component of an environment is accessed but not defined.
In [XPath/XQuery], the detection and reporting of dynamic errors is implementation dependent. This permits different implementations to choose to evaluate or optimize an expression in different ways. When an implementation is able to evaluate an expression without evaluating some subexpression, the implementation is never required to evaluate that subexpression solely to determine whether it raises a dynamic error. For example, if a function parameter is never used in the body of the function, an implementation may choose whether to evaluate the expression bound to that parameter in a function call. Similarly, if the variable bound by a let expression is never used in the corresponding return expression, the implementation is not required to evaluate the expression to which the variable is bound.
For simplicity, the dynamic inference rules in Formal Semantics define an eager evaluation semantics for all expressions, i.e., all sub-expressions are evaluated regardless of whether their values are necessary to evaluate the containing expression. For example, every function parameter is evaluated before the body of the function is evaluated, and the expression bound to a let variable is always evaluated. The dynamic semantics rules in the Formal Semantics do not formalize the more flexible evaluation strategy above.
For example, in the following expression, the dynamic semantics rules of the Formal Semantics would raise a dynamic error because a path expression may not be applied to an atomic value. An implementation, however, may not raise an error, because the path expression is not necessary to evaluate the containing let expression.
let $x := 1/foobar return 1
However, the static semantic rules, by definition, are conservative, and therefore a type is computed for every subexpression. In the example above, a static type error would be raised because a path expression may be applied to an atomic value.
The semantics of XML Schema import is described in [8 Importing Schemas] and [5.6 Schema Import].
The semantics of the static typing feature is specified by the static inference rules in the Formal Semantics.
The semantics of all axes are specified in [4.2.1.1 Axes].
The semantics of main modules is specified in [5 Modules and Prologs]. The semantics of library modules and module import are not currently not specified.
Because pragmas are implementation defined, the Formal Semantics of extensions cannot be specified.
Because extensions are implementation defined, the Formal Semantics of extensions cannot be specified.
For some expressions, the static typing rules defined in the Formal
Semantics are not very precise (see, for instance, the type inference rules
for the ancestor axes — parent, ancestor, and ancestor-or-self—
and for the function fn:root
function. Some implementations may
wish to support more precise static typing rules.
A conforming implementation may provide a static typing extensionXQ, which is a type inference rule that infers a more precise static type than that inferred by the type inference rules in the Formal Semantics. Given an expression Expr, the type of Expr given by the extension must be a subtype of the type of Expr given by the Formal Semantics. This constraint is expressed formally by the following judgment:
This section gives the semantics of all the [XPath/XQuery] expressions. The organization of this section parallels the organization of Section 3 ExpressionsXQ.
[40 (XQuery)] | Expr |
::= | ExprSingle ("," ExprSingle)* |
[41 (XQuery)] | ExprSingle |
::= | FLWORExpr |
[15 (XPath)] | XPath |
::= | Expr? |
For each expression, a short description and the relevant grammar productions are given. The semantics of an expression includes the normalization, static analysis, and dynamic evaluation phases. Recall that normalization rules translate [XPath/XQuery] syntax into Core syntax. In the sections that contain normalization rules, the Core grammar productions into which the expression is normalized are also provided. After normalization, sections on static type inference and dynamic evaluation define the static type and dynamic value for the Core expression.
Core Grammar
The Core grammar production for expressions is:
[31 (Core)] | Expr |
::= | ExprSingle ("," ExprSingle)* |
[32 (Core)] | ExprSingle |
::= | FLWORExpr |
It is a static error for any expression other than the empty-sequence
expression ()
, or for a function with the explicit
empty()
return type, to have the empty type. The rule below
enforces the constraint that any expression other than ()
, or
one function that is meant to return the empty type (e.g., the
fn:trace()
function) to have a non-empty type. The static typing
rule for ()
is in [4.1.3
Parenthesized Expressions].
|
||
|
||
|
The rule is written this way, because for any expression such that no static type rule applies to the expression, a static type error is raised. That is, the absence of an applicable static rule indicates an error. For example, if an expression is not the empty-sequence expression but has the empty type, the above rule does not apply and therefore a static error is raised.
Example
The above rule can catch common mistakes, such as the misspelling of an element or attribute name or referencing of an element or attribute that does not exist. For instance, the following path expression
$x/title
raises a static error if the type of variable $x
does not
include any title
children elements.
Primary expressions are the basic primitives of the language.They include literals, variables, function calls, and the parenthesized expressions.
[75 (XQuery)] | PrimaryExpr |
::= | Literal | VarRef | ParenthesizedExpr | ContextItemExpr |
FunctionCall | Constructor |
[76 (XQuery)] | VarRef |
::= | "$" VarName |
[20 (XQuery)] | VarName |
::= | QName |
Core Grammar
The Core grammar productions for primary expressions are:
[51 (Core)] | PrimaryExpr |
::= | Literal | VarRef | ParenthesizedExpr | FunctionCall | Constructor |
[52 (Core)] | VarRef |
::= | "$" VarName |
[15 (Core)] | VarName |
::= | QName |
Introduction
A literal is a direct syntactic representation of an atomic value. [XPath/XQuery] supports two kinds of literals: string literals and numeric literals.
[94 (XQuery)] | Literal |
::= | NumericLiteral | StringLiteral |
[95 (XQuery)] | NumericLiteral |
::= | IntegerLiteral | DecimalLiteral | DoubleLiteral |
[7 (XQuery)] | IntegerLiteral |
::= | Digits |
[8 (XQuery)] | DecimalLiteral |
::= | ("." Digits) | (Digits "." [0-9]*) |
[9 (XQuery)] | DoubleLiteral |
::= | (("." Digits) | (Digits ("." [0-9]*)?)) ("e" | "E") ("+" | "-")?
Digits |
[10 (XQuery)] | StringLiteral |
::= | ('"' (PredefinedEntityRef | CharRef | ('"' '"') | [^"&])* '"') | ("'"
(PredefinedEntityRef | CharRef | ("'" "'") | [^'&])* "'") |
[22 (XQuery)] | PredefinedEntityRef |
::= | "&" ("lt" | "gt" | "amp" | "quot" | "apos") ";" |
[24 (XQuery)] | CharRef |
::= | "&#" (Digits | ("x" HexDigits)) ";" |
[16 (XQuery)] | Digits |
::= | [0-9]+ |
[23 (XQuery)] | HexDigits |
::= | ([0-9] | [a-f] | [A-F])+ |
Core Grammar
The Core grammar productions for literals are:
[64 (Core)] | Literal |
::= | NumericLiteral | StringLiteral |
[65 (Core)] | NumericLiteral |
::= | IntegerLiteral | DecimalLiteral | DoubleLiteral |
[3 (Core)] | IntegerLiteral |
::= | Digits |
[4 (Core)] | DecimalLiteral |
::= | ("." Digits) | (Digits "." [0-9]*) |
[5 (Core)] | DoubleLiteral |
::= | (("." Digits) | (Digits ("." [0-9]*)?)) ("e" | "E") ("+" | "-")?
Digits |
[6 (Core)] | StringLiteral |
::= | ('"' (('"' '"') | [^"])* '"') | ("'" (("'" "'") | [^'])*
"'") |
[12 (Core)] | Digits |
::= | [0-9]+ |
All literals are Core expressions, therefore no normalization rules are required for literals. Predefined entity references and character references in strings are resolved to characters as part of parsing and therefore they do not occur in the Core grammar.
In the static semantics, the type of an integer literal is simply xs:integer:
In the dynamic semantics, an integer literal is evaluated by constructing an atomic value in the data model, which consists of the literal value and its type:
The formal definitions of decimal, double, and string literals are analogous to those for integer.
|
||
|
|
||
|
Literal expressions never raise a dynamic error.
Introduction
A variable evaluates to the value to which the variable's QName is bound in the dynamic context.
A variable is a Core expression, therefore no normalization rule is required for a variable.
In the static semantics, the type of a variable is simply its type in the static type environment statEnv.varType:
|
||
|
||
|
If the variable is not bound in the static environment, a static error is raised.
In the dynamic semantics, a variable is evaluated by "looking up" its value in dynEnv.varValue:
|
||
|
||
|
If the variable is not bound in the dynamic environment, a dynamic error is raised; the default rules in [3.5 Error Handling] cover this error.
[96 (XQuery)] | ParenthesizedExpr |
::= | "(" Expr? ")" |
Core Grammar
The Core grammar production for parenthesized expressions is:
[66 (Core)] | ParenthesizedExpr |
::= | "(" Expr? ")" |
The empty-sequence expression ()
always has type
empty
. It is a static error for any expression other than
()
to have the empty type (see [4
Expressions].)
The empty-sequence expression evaluates to the empty sequence.
The default rules for propogating errors, described in [3.5 Error Handling] apply to parenthesized expressions.
[41] | ContextItemExpr |
::= | "." |
Introduction
A context item expression evaluates to the context item, which may be either a node or an atomic value.
A context item expression is normalized to the built-in variable $
fs:dot
.
Introduction
A function call consists of a QName followed by a parenthesized list of zero or more expressions. In [XPath/XQuery], the actual argument to a function is called an argument and the formal argument of a function is called a parameter. We use the same terminology here.
[97 (XQuery)] | FunctionCall |
::= | QName "(" (ExprSingle (","
ExprSingle)*)? ")" |
Because [XPath/XQuery] implicitly converts the values of function arguments, a normalization step is required.
Notation
Normalization of function calls uses an auxiliary mapping []FunctionArgument(SequenceType) used to insert conversions of function arguments that depend only on the expected SequenceType of the corresponding parameters. It is defined as follows:
[Expr]FunctionArgument(SequenceType) |
== |
[[[Expr]AtomizeAtomic(SequenceType)]Extract(SequenceType)]Convert(SequenceType) |
where
[Expr]AtomizeAtomic(SequenceType) denotes
fn:data (Expr) |
If SequenceType <: xdt:anyAtomic* and Expr : Type and not(Type=empty ) |
|
Expr | Otherwise |
which specifies that if the function expects atomic parameters, then
fn:data
is called to obtain them.
[Expr]Extract(SequenceType) denotes
fn:subsequence (Expr,1,1) |
If statEnv.xpath1.0_compatibility is true and SequenceType <: item? | |
Expr | Otherwise |
which specifies that if the backwards compatibility mode is set, then the first node of the sequence passed as an argument is selected.
[Expr]Convert(SequenceType) denotes
fs:convert-simple-operand (
Expr,PrototypicalValue) |
If SequenceType <: xs:anySimpleType |
Expr | Otherwise |
where PrototypicalValue has the following values for each possible SequenceType:
Editorial note | |
Todo: insert dummy prototypical values for each of the 44+4 types... |
Note
The fs:convert-simple-operand
function takes a PrototypicalValue, which is a value of the target
type, to ensure that conversion to base types is possible even though types
are not first class objects in [XPath/XQuery].
Core Grammar
The Core grammar production for function calls is:
[67 (Core)] | FunctionCall |
::= | QName "(" (ExprSingle ("," ExprSingle)*)? ")" |
Each argument expression in a function call is normalized to its corresponding Core expression by applying []FunctionArgument(SequenceType) for each argument with the expected SequenceType for the argument inserted.
[ QName (Expr1, ..., Exprn) ]Expr |
== |
QName ( [Expr1]FunctionArgument(SequenceType1), ..., [Exprn]FunctionArgument(SequenceTypen)) ) |
Note that this normalization rule depends on the static environment containing function signatures and is the only normalization rule that depends on statEnv. Furthermore notice that the normalization is only well-defined when it is guaranteed that overloading is restricted to atomic types with the same quantifier. This is presently the case.
Editorial note | |
Proposal (Kris): The following rules attempt to solve the "atomic overload protection" problem by factoring typing over atomic unions. Also it is made explicit that specific typing rules for functions (in [6 Additional Semantics of Functions]) take precedence. |
To typecheck a Core function call we first check in Section [6 Additional Semantics of Functions] if there is a specialized typing rule for the function, and, if so, use it. Otherwise, the function signatures matching the function name and arity are retrieved from the static environment. The type of each argument to the function must be a subtype of a type that is promotable to the corresponding function parameter type of the function; if the expected type is a union of atomic types then this check is performed separately for each possibility.
The first rule bootstraps type checking of a function call: It expands the function's QName and then applies the function call rule for the expanded function call:
|
||||
|
||||
|
For a function call in which the static type of one of the expressions passed as argument is a union of atomic types, the function call is type checked once separately for each atomic type in that union. The static type of the entire function call expression is then the union of the types computed in each case, as follows:
|
|||||
|
|||||
|
Note
Notice that this semantics makes sense since the type declared for a function parameter, which uses the sequence types syntax, cannot itself be a union.
Finally, the following auxilliary rule type checks a function call in which none of the actual arguments has a type that is a union of atomic types. The rule looks up the function in the static environment and checks that some signature for the function satisfies the following constraint: the type of each actual argument is a subtype of some type that can be promoted to the type of the correponding function parameter. In this case, the function call is well typed and the result type is the return type specified in the function's signature.
|
||||||||
|
||||||||
|
The function body itself is not analyzed for each invocation: static typing of the function definition itself guarantees that the function body always returns a value of the declared return type.
Notice that the static typing rule checks the function signature in order to determine whether a function exists rather than just the function arity: this is consistent because it will reject function calls with the wrong arity in addition to function calls with the right arity but incompatible parameter types.
Editorial note | |
We could consider changing the language document to talk about type signatures instead of arities but I (Kris) do not personally believe that we it is important...and what IS important (for the dynamic function call rule below) is that the [Functions and Operators] specification does not overload functions with more than one definition per arity! |
Based on a function's name and parameter types, the function body is retrieved from the dynamic environment.
If the function is a user-defined function then it is evaluated as follows. First, the rule evaluates each actual function argument expression. Next, a match is searched for among the function's possible declaration signatures, retrieved from statEnv.funcType. If the function is not present in the environment, or there is no matching declaration signature, a type error is raised. Otherwise, the function body and formal variables are obtained from dynEnv.funcDefn for the matching declaration signature. The rule then extends dynEnv.varValue by binding each formal variable to its corresponding value (converted by the normalization as required for the expected type and backwards compatibility flag), and evaluates the body of the function in the new environment. The resulting value is the value of the function call.
|
|||||||||||
|
|||||||||||
|
Note that the function body is evaluated in the default (top-level) environment extended with just the parameter bindings. Note also that input values and output values are matched against the types declared for the function. If static analysis was performed, all these checks are guaranteed to be true and may be omitted.
If the function is a built-in or external function then the rule is somewhat simpler:
|
|||||||||||
|
|||||||||||
|
Calls to built-in or external functions use the following auxiliary judgment to evaluate the built-in or external function:
"The built-in or external function F (from [Data Model], [Functions and Operators], [6 Additional Semantics of Functions], or as defined in dynEnvDefault.funcDefn) applied to the given parameter values yields the specified result value" | ||
|
||
|
Editorial note | |
Issue:Built-in and external function calls must be defined more precisely. See Issue 522 (FS-Issue-0179). |
If the evaluation of any actual argument raises an error, the function call can raise an error. This rule applies to both user-defined and built-in functions. Note that if more than one expression may raise an error, the function call may raise any one of the errors.
|
|||||
|
|||||
|
If, for all possible function signatures, the evaluation of some actual argument yields a value that cannot be promoted to the corresponding formal type of the parameter, the function call raises a type error. This rule applies to both user-defined and built-in functions.
|
||||||
|
||||||
|
If the evaluation of the function call to a user-defined function yields a value that cannot be promoted to the corresponding return type of the function, the function call raises a type error.
|
|||||||||||
|
|||||||||||
|
If the evaluation of the function call to a built-in or external function yields a value that cannot be promoted to the corresponding return type of the function, the built-in or external function call raises a type error.
|
|||||||||||
|
|||||||||||
|
Built-in function calls use the following auxiliary judgment to evaluate the built-in function call. If the built-in function raises an error, the function call raises an error.
"The built-in function F (from data model, type constructor, or functions and operators) applied to the given parameter raises an error" | ||
|
||
|
Editorial note | |
Issue:Built-in function calls must be defined more precisely. See Issue 522 (FS-Issue-0179). |
[3 (XQuery)] | ExprComment |
::= | "(:" (ExprCommentContent | ExprComment)* ":)" |
[4 (XQuery)] | ExprCommentContent |
::= | Char |
Comments are lexical constructs only, and have no meaning within the query and therefore have no formal semantics.
Introduction
Path expressions are used to locate nodes within a tree. There are two kinds of path expressions, absolute path expressions and relative path expressions. An absolute path expression is a rooted relative path expression. A relative path expression is composed of a sequence of steps.
[69 (XQuery)] | PathExpr |
::= | ("/" RelativePathExpr?) |
[70 (XQuery)] | RelativePathExpr |
::= | StepExpr (("/" | "//") StepExpr)* |
Core Grammar
PathExpr and RelativePathExpr are fully normalized, therefore they have no corresponding productions in the Core. The grammar for path expressions in the Core starts with the StepExpr production.
Absolute path expressions are path expressions starting with the
/
or //
symbols, indicating that the expression
must be applied on the root node in the current context. The root node in the
current context is the greatest ancestor of the context node. The following
two rules normalize absolute path expressions to relative ones. They use the
fn:root
function, which returns the greatest ancestor of its
argument node. The treat expressions guarantee that the value bound to the
context variable $
fs:dot
is a
document node.
["/"]Expr |
== |
(fn:root (self::node()) treat as document-node()) |
["/" RelativePathExpr]Expr |
== |
[((fn:root (self::node())) treat as document-node())
"/" RelativePathExpr]Expr |
["//" RelativePathExpr]Expr |
== |
[((fn:root (self::node())) treat as document-node) "/"
descendant-or-self::node() "/"
RelativePathExpr]Expr |
A composite relative path expression (using /
) is normalized
into a for
expression by concatenating the sequences obtained by
mapping each node of the left-hand side in document order to the sequence it
generates on the right-hand side. The call to the fs:distinct-doc-order
function ensures that the result is in document order without duplicates. The
dynamic context is defined by binding the
$
fs:dot
,
$
fs:sequence
,
$
fs:position
and
$
fs:last
variables.
Note that sorting by document order enforces the restriction that input and output sequences contains only nodes.
[StepExpr1 "/" StepExpr2]Expr | ||||||
== | ||||||
|
Note that for this section uses some auxiliary judgments which are defined in [7.3 Judgments for step expressions and filtering].
Introduction
[71 (XQuery)] | StepExpr |
::= | AxisStep | FilterStep |
[72 (XQuery)] | AxisStep |
::= | (ForwardStep | ReverseStep) Predicates |
[73 (XQuery)] | FilterStep |
::= | PrimaryExpr Predicates |
[85 (XQuery)] | ForwardStep |
::= | (ForwardAxis NodeTest) | AbbrevForwardStep |
[86 (XQuery)] | ReverseStep |
::= | (ReverseAxis NodeTest) | AbbrevReverseStep |
Core Grammar
The Core grammar productions for XPath steps are:
[50 (Core)] | AxisStep |
::= | ForwardStep | ReverseStep |
[57 (Core)] | ForwardStep |
::= | ForwardAxis NodeTest |
[58 (Core)] | ReverseStep |
::= | ReverseAxis NodeTest |
Note
Step expressions can be followed by predicates. Normalization of predicates uses the following auxiliary mapping rule: []Predicates, which is specified in [4.2.2 Predicates]. Normalization for step expressions also uses the following auxiliary mapping rule: []Axis, which is specified in [4.2.1.1 Axes].
Normalization of predicates need to distinguish between forward steps, reverse steps, and primary expressions.
As explained in the [XPath/XQuery] document, applying a step in XPath
changes the focus (or context). The change of focus is made explicit by the
normalization rule below, which binds the variable $
fs:dot
to the node
currently being processed, and the variable $
fs:position
to the
position (i.e., the position within the input sequence) of that node.
There are two sets of normalization rules for Predicates. The first set of
rules apply when the predicate is a numeric literal or the expression
last()
. The second set of rules apply to all predicate
expressions other than numeric literals and the expression
last()
. In the first case, the normalization rules provides a
more precise static type than if the general rules were applied.
When the predicate expression is a numeric literal or the
fn:last
function, the following normalization rules apply.
[ForwardStep Predicates "[" Numeric "]"]Expr | ||
== | ||
|
[ForwardStep Predicates "["
fn:last () "]"]Expr |
|||
== | |||
|
When predicates are applied on a reverse step, the position variable is bound in reverse document order.
[ReverseStep Predicates "[" Numeric "]"]Expr | ||||
== | ||||
|
When the step is a reverse axis, then the last item in the context sequence is the first in document order.
[ReverseStep Predicates "["
fn:last () "]"]Expr |
||
== | ||
|
When predicates are applied on a primary expression, the input sequence is processed in sequence order and the context is bound as in the case of forward axes. In that case, the sequence can contain both nodes and atomic values.
[PrimaryExpr Predicates "[" Numeric "]"]Expr | ||
== | ||
|
The normalization rules above all use the function
fn:subsequence
to select a particular item. The static typing
rules for this function are defined in [6.2.9 The fn:subsequence function].
When predicates are applied on a forward step, the input sequence is first
sorted in document order and duplicates are removed. The context is changed
by binding the $
fs:dot
variable to
each node in document order.
[ForwardStep Predicates "[" Expr "]"]Expr | ||||
== | ||||
|
When predicates are applied on a reverse step, the input sequence is first
sorted in document order and duplicates are removed. The context is changed
by binding the $
fs:dot
variable to
each node in document order.
[ReverseStep Predicates "[" Expr "]"]Expr | |||||
== | |||||
|
When predicates are applied on a primary expression, the input sequence is processed in sequence order and the context variable is bound to each item in the input sequence, which may contain both nodes and atomic values.
[PrimaryExpr Predicates "[" Expr "]"]Expr | ||||
== | ||||
|
Finally, a stand-alone forward or reverse step is normalized by the auxiliary normalization rule for Axis.
The static semantics of an Axis NodeTest pair is obtained by retrieving the type of the context node, and applying the two filters (the Axis, and then the NodeTest with a PrincipalNodeKind) on the result.
|
||||||
|
||||||
statEnv |- Axis NodeTest : Type3 |
Note
Note that the second judgment in the inference rule requires that the context item be a node, guaranteeing that a type error is raised when the context item is an atomic value.
The dynamic semantics of an Axis NodeTest pair is obtained by retrieving the context node, and applying the two filters (Axis, then NodeTest) on the result. The application of each filter is expressed through the filter judgment as follows.
|
||||||
|
||||||
dynEnv |- Axis
NodeTest =>
fs:distinct-doc-order (Value3) |
Note
Note that the second judgment in the inference rule guarantees that the context item is bound to a node.
If the context item is not a node, the evaluation of an axis node test expression raises a dynamic error.
Introduction
[89 (XQuery)] | ForwardAxis |
::= | ("child" "::") |
[90 (XQuery)] | ReverseAxis |
::= | "parent" "::" |
Core Grammar
The Core grammar productions for XPath axis are:
[59 (Core)] | ForwardAxis |
::= | ("child" "::") |
[60 (Core)] | ReverseAxis |
::= | "parent" "::" |
Notation
Normalization of axis uses the following auxiliary mapping rule: []Axis.
Normalization for all axis is specified as follows.
The semantics of the following(-sibling) and preceding(-sibling) axes are expressed by mapping them to Core expressions, all other axes are part of the Core and therefore are left unchanged through normalization.
[following:: NodeTest]Axis |
== |
[ancestor-or-self::node()/following-sibling::node()/descendant-or-self::
NodeTest]Expr |
All other forward axes are part of the core [XPath/XQuery] and handled by the normalization rules below:
[child:: NodeTest]Axis |
== |
child::
NodeTest |
[attribute:: NodeTest]Axis |
== |
attribute::
NodeTest |
[self:: NodeTest]Axis |
== |
self::
NodeTest |
[descendant:: NodeTest]Axis |
== |
descendant::
NodeTest |
[descendant-or-self:: NodeTest]Axis |
== |
descendant-or-self:: NodeTest |
[namespace:: NodeTest]Axis |
== |
namespace::
NodeTest |
Reverse axes:
[preceding:: NodeTest]Axis |
== |
[ancestor-or-self::node()/preceding-sibling::node()/descendant-or-self::
NodeTest]Expr |
All other reverse axes are part of the core:
[parent:: NodeTest]Axis |
== |
parent::
NodeTest |
[ancestor:: NodeTest]Axis |
== |
ancestor::
NodeTest |
[ancestor-or-self:: NodeTest]Axis |
== |
ancestor-or-self::
NodeTest |
Introduction
A node test is a condition applied on the nodes selected by an axis step. Node tests are described by the following grammar productions.
[91 (XQuery)] | NodeTest |
::= | KindTest | NameTest |
[92 (XQuery)] | NameTest |
::= | QName | Wildcard |
[93 (XQuery)] | Wildcard |
::= | "*" |
Core Grammar
The Core grammar productions for node tests are:
[61 (Core)] | NodeTest |
::= | KindTest | NameTest |
[62 (Core)] | NameTest |
::= | QName | Wildcard |
[63 (Core)] | Wildcard |
::= | "*" |
Introduction
Predicates are composed of zero or more expressions enclosed in square brackets.
[77 (XQuery)] | Predicates |
::= | ("[" Expr "]")* |
Notation
Normalization of predicates uses the following auxiliary mapping rule: []Predicates.
Predicates in path expressions are normalized with a special mapping rule:
[Expr]Predicates | |||
== | |||
|
Note that the semantics of predicates whose parameter is a numeric value
also works for other numeric than integer values, in which case the
op:numeric-equal
returns false when compared to a position. For
example the expression //a[3.4]
is allowed and always returns
the empty sequence)
The corresponding Section in the [XPath/XQuery] document just contains examples.
[87 (XQuery)] | AbbrevForwardStep |
::= | "@"? NodeTest |
[88 (XQuery)] | AbbrevReverseStep |
::= | ".." |
Here are normalization rules for the abbreviated syntax.
[ @ NameTest ]Expr |
== |
attribute :: NameTest |
Introduction
[XPath/XQuery] supports operators to construct and combine sequences. A sequence is an ordered collection of zero or more items. An item is either an atomic value or a node.
[40 (XQuery)] | Expr |
::= | ExprSingle ("," ExprSingle)* |
[62 (XQuery)] | RangeExpr |
::= | AdditiveExpr ( "to" AdditiveExpr )? |
Core Grammar
The Core grammar productions for sequence expressions are:
[31 (Core)] | Expr |
::= | ExprSingle ("," ExprSingle)* |
A sequence expression is normalized into a sequence of normalized single expressions:
The type of the sequence expression is the sequence over the types of the individual expressions.
Each expression in the sequence is evaluated and the resulting values are concatenated into one sequence.
The default rules for propogating errors, described in [3.5 Error Handling] apply to sequence expressions.
The range operator is normalized to the op:to
operator.
The static semantics of the op:to operator is defined in [Functions and Operators].
The dynamic semantics of the op:to operator is defined in [Functions and Operators].
The error semantics of the op:to operator is defined in [Functions and Operators].
[XPath/XQuery] provides several operators for combining sequences.
[66 (XQuery)] | UnionExpr |
::= | IntersectExceptExpr (
("union" | "|") IntersectExceptExpr )* |
[67 (XQuery)] | IntersectExceptExpr |
::= | ValueExpr ( ("intersect" |
"except") ValueExpr )* |
[68 (XQuery)] | ValueExpr |
::= | ValidateExpr | PathExpr |
[42 (XPath)] | ValueExpr |
::= | PathExpr |
Core Grammar
The corresponding core grammar productions are:
[49 (Core)] | ValueExpr |
::= | ValidateExpr | AxisStep |
Notation
The union, intersect, and except expressions are normalized into function calls to the appropriate functions. The mapping function []SequenceOp is defined by the following tables:
SequenceOp | [SequenceOp]SequenceOp |
"union" | op:union |
"|" | op:union |
"intersect" | op:intersect |
"except" | op:except |
[Expr1 SequenceOp Expr2]Expr |
== |
[SequenceOp]SequenceOp ( [Expr1]Expr, [Expr2]Expr ) |
The static semantics of the functions that operate on sequences are defined in [6 Additional Semantics of Functions].
The dynamic semantics for function calls is given in [4.1.5 Function Calls].
The error semantics for function calls is given in [4.1.5 Function Calls].
[XPath/XQuery] provides arithmetic operators for addition, subtraction, multiplication, division, and modulus, in their usual binary and unary forms.
[63 (XQuery)] | AdditiveExpr |
::= | MultiplicativeExpr (
("+" | "-") MultiplicativeExpr
)* |
[64 (XQuery)] | MultiplicativeExpr |
::= | UnaryExpr ( ("*" | "div" |
"idiv" | "mod") UnaryExpr )* |
[65 (XQuery)] | UnaryExpr |
::= | ("-" | "+")* UnionExpr |
Notation
The mapping function []ArithOp is defined by the following table:
ArithOp | [ArithOp]ArithOp |
"+" | fs:plus |
"-" | fs:minus |
"*" | fs:times |
"div" | fs:div |
"mod" | fs:mod |
Core Grammar
There are no Core grammar rules for value comparisons as they are normalized to other Core expressions.
The normalization rules for all the arithmetic operators except
idiv
first atomize each argument by applying
fn:data
and then apply the internal function fs:convert-operand
to each
argument. If the first argument to this function has type xdt:untypedAtomic
, then the first
argument is cast to a double, otherwise it is returned unchanged. The
overloaded internal function corresponding to the arithmetic operator is then
applied to the two converted arguments. The table above maps the operators to
the corresponding internal function. The mapping from the overloaded internal
functions to the corresponding monomorphic function is given in [B.2 Mapping of Overloaded Internal
Functions].
[Expr1 ArithOp Expr2]Expr | ||||
== | ||||
|
The normalization rules for the idiv
operator are similar,
but instead of casting arguments with type xdt:untypedAtomic
to
xs:double
, they are cast to xs:integer
.
[Expr1 idiv
Expr2]Expr |
||||
== | ||||
|
The unary operators are mapped similarly.
[+ Expr]Expr |
== |
fs:plus(0, fs:convert-operand (fn:data (
[Expr]Expr), 1.0E0)) |
[- Expr]Expr |
== |
fs:minus(0, fs:convert-operand (fn:data (
[Expr]Expr), 1.0E0)) |
The static semantics for function calls is given in [4.1.5 Function Calls].
The dynamic semantics for function calls is given in [4.1.5 Function Calls].
The error semantics for function calls is given in [4.1.5 Function Calls].
Introduction
Comparison expressions allow two values to be compared. [XPath/XQuery] provides four kinds of comparison expressions, called value comparisons, general comparisons, node comparisons, and order comparisons.
[61 (XQuery)] | ComparisonExpr |
::= | RangeExpr ( (ValueComp |
[83 (XQuery)] | ValueComp |
::= | "eq" | "ne" | "lt" | "le" | "gt" | "ge" |
[82 (XQuery)] | GeneralComp |
::= | "=" | "!=" | "<" | "<=" | ">" | ">=" |
[84 (XQuery)] | NodeComp |
::= | "is" | "<<" | ">>" |
Notation
The mapping function []ValueOp is defined by the following table:
ValueOp | [ValueOp]ValueOp |
"eq " |
fs:eq |
"ne " |
fs:ne |
"lt " |
fs:lt |
"le " |
fs:le |
"gt " |
fs:gt |
"ge " |
fs:ge |
Core Grammar
There are no Core grammar rules for value comparisons as they are normalized to other Core expressions.
The normalization rules for the value comparison operators first atomize
each argument by applying fn:data
and then apply the internal
function fs:convert-operand
defined
in [6.1.3 The fs:convert-operand
function]. If the first argument to this function has type xdt:untypedAtomic
, then the first
argument is cast to a string, otherwise it is returned unchanged. The
overloaded internal function corresponding to the value comparison operator
is then applied to the two converted arguments. The table above maps the
value operators to the corresponding internal function. The mapping from the
overloaded internal functions to the corresponding monomorphic function is
given in [B.2 Mapping of Overloaded Internal
Functions].
[Expr1 ValueOp Expr2]Expr | ||||
== | ||||
|
The static semantics for function calls is given in [4.1.5 Function Calls]. The comparison
functions all have return type xs:boolean
, as specified in
[Functions and Operators].
The dynamic semantics for function calls is given in [4.1.5 Function Calls].
The error semantics rules for function calls is given in [4.1.5 Function Calls].
Introduction
General comparisons are defined by adding existential semantics to value
comparisons. The operands of a general comparison may be sequences of any
length. The result of a general comparison is always true
or
false
.
Notation
For convenience, GeneralOp denotes the operators "=
",
"!=
", "<
", "<=
",
">
", or ">=
".
The function []GeneralOp is defined by the following table:
GeneralOp | [GeneralOp]GeneralOp |
"= " |
fs:eq |
"!= " |
fs:ne |
"< " |
fs:lt |
"<= " |
fs:le |
"> " |
fs:gt |
">= " |
fs:ge |
Core Grammar
There are no Core grammar rules for general comparisons as they are normalized to existentially quantified Core expressions.
The normalization rule for a general comparison expression first atomizes
each argument by applying fn:data
and then applies the
existentially quantified some expression to each sequence. The internal
function fs:convert-operand
is
applied to each pair of atomic values. If the first argument to this function
has type xdt:untypedAtomic
,
then the first argument is cast to type of the second argument. If the second
argument has type xdt:untypedAtomic
, the first
argument is cast to a string. The overloaded internal function corresponding
to the general comparison operator is then applied to the two converted
values.
[Expr1 GeneralOp Expr2]Expr | |||||
== | |||||
|
Core Grammar
There are no Core grammar rules for node comparisons as they are normalized to other Core expressions.
The normalization rules for node comparisons map each argument expression and then apply the internal function corresponding to the node comparison operator. The internal function are defined in [B.2 Mapping of Overloaded Internal Functions].
The static semantics for the internal functions are defined in [B.2 Mapping of Overloaded Internal Functions].
The dynamic semantics for internal function is defined in [B.2 Mapping of Overloaded Internal Functions].
The error semantics rules for function calls is given in [4.1.5 Function Calls].
Introduction
A logical expression is either an and-expression or an
or-expression. The value of a logical expression is always one of the
boolean values: true
or false
.
[55 (XQuery)] | OrExpr |
::= | AndExpr ( "or" AndExpr )* |
[56 (XQuery)] | AndExpr |
::= | InstanceofExpr ( "and"
InstanceofExpr )* |
Core Grammar
The Core grammar productions for logical expressions are:
[45 (Core)] | OrExpr |
::= | AndExpr ( "or" AndExpr )* |
[46 (Core)] | AndExpr |
::= | CastableExpr ( "and" CastableExpr )* |
The normalization rules for "and
" and "or
" first
get the effective boolean value of each argument, then apply the appropriate
Core operator.
The logical expressions require that each subexpression have type
xs:boolean
. The result type is also xs:boolean
.
The dynamic semantics of logical expressions is non-deterministic. This
non-determinism permits implementations to use short-circuit evaluation
strategies when evaluating logical expressions. In the expression,
Expr1 and
Expr2, if either expression raises
an error or evaluates to false, the entire expression may raise an error or
evaluate to false. In the expression, Expr1 or
Expr2, if either expression raises an error or evaluates to
true, the entire expression may raise an error or evaluate to true.
[XPath/XQuery] supports two forms of constructors: a direct constructors that supports literal XML syntax for elements, attributes, and text nodes, and a computed constructors that can be used to construct element and attribute nodes, possibly with computed names, and also document and text nodes. All direct constructors are normalized into computed constructors, i.e., there are no direct-constructor expressions in the Core.
Introduction
The static and dynamic semantics of the direct forms of element and attribute constructors is obtained after normalization to computed element constructors.
Notation
The auxiliary mapping rules []ElementContent, and []ElementContent-unit are defined in this section and are used for the normalization of the content of direct element constructors.
Core Grammar
The Core grammar productions for constructors are:
[55 (Core)] | Constructor |
::= | ComputedConstructor |
[76 (Core)] | EnclosedExpr |
::= | "{" Expr "}" |
Core Grammar
There are no Core grammar rules for direct XML element or attribute constructors as they are normalized to computed constructors.
We start with the rules for normalizing a direct element constructor's content. We distinguish between direct element constructors that contain only one element-content unit and those that contain more than one element-content unit. An element-content unit is a contiguous sequence of literal characters (character references, escaped braces, and predefined entity references), one enclosed expression, one direct element constructor, one XML comment, or one XML processing instruction. Here are three direct element constructors that each contain one element-content unit:
<date>{ xsd:date("2003-03-18") }</date> <name>Dizzy Gillespe</name> <comment><!-- Just a comment --></comment>
The first contains one enclosed expression, the second contains one contiguous sequence of characters, and the third contains one XML comment.
The next example contains six element-content units:
<address> <!-- Dizzy's address --> { 123 }-0A <street>Roosevelt Ave.</street> Flushing, NY { 11368 } </address>
It contains one XML comment, followed by one enclosed expression that contains the integer 123, one contiguous sequence of characters ("-0A "), one direct XML element constructor, one contiguous sequence of characters (" Flushing, NY"), and one enclosed expression that contains the integer 11368. Evaluating this expression yields this element value:
<address> <!-- Dizzy's address --> 123-0A <street>Roosevelt Ave.</street> Flushing, NY 11368 </address>
Adjacent element-content units are convenient because they permit arbitrary interleaving of text and atomic data. During evaluation, atomic values are converted to text nodes containing the string representations of the atomic values, and then adjacent text nodes are concatenated together. In the example above, the integer 123 is converted to a string and concatenated with "-0A" and the result is a single text node containing "123-0A".
In general, we do not want to convert all atomic values to text nodes, especially when performing static-type analysis, because we lose useful type information. For example, if we normalize the first example above as follows, we lose the important information that the user constructed a date value, not just a text node containing an arbitrary string:
<date>{ xsd:date("2003-03-18") }</date> (normalization that loses type information) == element date { text { "2003-03-18" } }
So to preserve useful type information, we distinguish between direct element constructor's that contain one element-content unit and those that contain more than one (because multiple element-content units commonly denote concatenatation of atomic data and text). Here is the normalization of the first and fourth examples above:
<date>{ xsd:date("2003-03-18") }</date> == element date { xsd:date("2003-03-18") } <address> <!-- Dizzy's address --> { 123 }-0A <street>Roosevelt Ave.</street> Flushing, NY { 11368 } </address> == element address { fs:item-sequence-to-node-sequence( comment { " Dizzy's address "}, 123, text { "-0A "}, element street {"Roosevelt Ave."}, text { " Flushing, NY " }, 11368 ) }
Editorial note | |
Mary 2003-03-21: Michael Kay has suggested that instead of distinguishing between direct element constructor's that contain one element-content unit and those that contain more than one element-content unit, we should add a new kind of enclosed expression delimited with {| |} that does not apply the conversion rules and therefore preserves the type information. For example, |
<date>{| xsd:date("2003-03-18") |}</date> == element date { xsd:date("2003-03-18") }
Given the distinction between direct element constructors that we made above, we give two normalization rules for a direct element constructor's content. If the direct element constructor contains exactly one element-content unit, we simply normalize that unit by applying the normalization rule for the element content:
[ ElementContent1 ]ElementContent-unit |
== |
[ ElementContent1 ]ElementContent |
If the direct element constructor contains more than one element-content
unit, we normalize each unit individually and construct a sequence of the
normalized results interleaved with empty text nodes. The empty text nodes
guarantee that the results of evaluating consecutive element-content units
can be distinguished. Then we apply the function fs:item-sequence-to-node-sequence
.
Section 3.7.1
Direct Element ConstructorsXQ specifies the
rules for converting a sequence of atomic values and nodes into a sequence of
nodes before element construction. The special function fs:item-sequence-to-node-sequence
implements these conversion rules.
[ElementContent1 ..., ElementContentn]ElementContent-unit, n > 1 |
== |
fs:item-sequence-to-node-sequence (
[ ElementContent1
]ElementContent , text { "" },
..., text { "" }, [
ElementContentn]ElementContent) |
We need to distinguish between multiple element-content units, because the
rule for converting sequences of atomic values into strings apply to
sequences within distinct enclosed expressions. The empty text nodes are
eliminated during evaluation of fs:item-sequence-to-node-sequence
when consecutive text nodes are coalesced into a single text node. The text
node guarantees that a whitespace character will not be inserted between
atomic values computed by distinct enclosed expressions. For example, here is
an expression, its normalization, and the resulting XML value:
<example>{ 1 }{ 2 }</example> == element example { fs:item-sequence-to-node-sequence ((1, text {""}, 2)) } ==> <example>12</example>
In the absence of the empty text node, the expression would evaluate to the following incorrect value:
<example>{ 1 }{ 2 }</example> (incorrect normalization) == element example { fs:item-sequence-to-node-sequence ((1, 2)) } (incorrect value) ==> <example>1 2</example>
Now that we have explained the normalization rules for direct element content, we give the rules for the two forms of direct XML element constructors. Note that the direct attribute constructors are normalized twice: the []NamespaceAttrs normalizes the namespace-declaration attributes and []Attribute normalizes all other attributes that are not namespace declarations.
[98 (XQuery)] | DirElemConstructor |
::= | "<" QName AttributeList
("/>" | (">" ElementContent*
"</" QName S? ">")) |
[ < QName AttributeList > ElementContent* </ QName S? > ]Expr |
== |
element [QName]Expr{ [ AttributeList ]NamespaceAttrs , [ AttributeList ]Attribute , [ ElementContent* ]ElementContent } |
[ < QName AttributeList /> ]Expr |
== |
element [QName]Expr { [ AttributeList ]NamespaceAttrs , [ AttributeList ]Attribute } |
Next, we give the normalization rules for each element-content unit. The normalization rule for a contiguous sequence of characters assumes:
that the significant whitespace characters in element constructors have been preserved, as described in [4.7.1.4 Whitespace in Element Content];
that character references have been resolved to individual characters and predefined entity references have been resolved to sequences of characters, and
that the rule is applied to the longest contiguous sequence of characters.
The following normalization rule takes the longest consecutive sequence of individual characters that include literal characters, escaped curly braces, character references, and predefined entity references and normalizes the character sequence as a text node containing the string of characters..
[(Char | "{{" | "}}" | CharRef | PredefinedEntityRef)+]ElementContent |
== |
text {
fn:codepoints-to-string ((Char | "{{" | "}}" |
CharRef | PredefinedEntityRef)+) } |
XML processing instructions and comments in element content are normalized by applying the standard normalization rules for expressions, which appear in [4.7.2 Other Direct Constructors].
[XmlProcessingInstruction]ElementContent |
== |
[XmlProcessingInstruction]Expr |
[XmlComment]ElementContent |
== |
[XmlComment]Expr |
An enclosed expression in element content is normalized by normalizing each individual expression in its expression sequence and then constructing a sequence of the normalized values:
[ { Expr1, ..., Exprn } ]ElementContent |
== |
[ Expr1 ]Expr , ..., [ Exprn]Expr |
There are no additional static type rules for direct XML element or attribute constructors.
There are no additional dynamic evaluation rules for direct XML element or attribute constructors.
There are no additional error semantics rules for direct XML element or attribute constructors.
Like literal XML element constructors, literal XML attribute constructors are normalized to computed attribute constructors.
[110 (XQuery)] | AttributeList |
::= | (S (QName S? "=" S? AttributeValue)?)* |
[111 (XQuery)] | AttributeValue |
::= | ('"' (EscapeQuot | QuotAttrValueContent)* '"') |
Notation
The auxiliary mapping rules []Attribute, []AttributeContent, and []AttributeContent-unit, are defined in this section and are used for the normalization of the content of direct attribute constructors.
Direct attribute may contain namespace-declaration attributes. The normalization rules for attributes ignore namespace-declaration attributes -- they are handled by the normalization rules in [4.7.1.2 Namespace Declaration Attributes].
An AttributeList is normalized by the following rule, which maps each of the individual attribute-value expressions in the attribute list and constructs a sequence of the normalized values.
[
|
|||
== | |||
|
Namespace-declaration attributes, i.e., those
attributes whose prefix is xmlns
are ignored by mapping them to
the empty sequence.
|
||
== | ||
() |
All attributes that are not namespace-declaration attributes are mapped to computed attribute constructors.
|
||
== | ||
attribute [Prefix:LocalPart ]Expr { [AttributeValue]AttributeContent} |
As with literal XML elements, we need to distinguish between direct attribute constructors that contain one attribute-content unit and those that contain multiple attribute-content units, because the rule for converting sequences of atomic values into strings are applied to sequences within distinct enclosed expressions. If the direct attribute constructor contains exactly one attribute-content unit, we simply normalize that unit by applying the normalization rule for the attribute content:
[ AttributeValueContent1 ]AttributeContent-unit |
== |
[AttributeValueContent1]AttributeContent |
If the direct attribute constructor contains more
than one attribute-content unit, we normalize each unit individually and
construct a sequence of the normalized results interleaved with empty text
nodes. The empty text nodes guarantee that the results of evaluating
consecutive attribute-content units can be distinguished. Then we apply the
function fs:item-sequence-to-untypedAtomic
,
which applies the appropriate conversion rules to the normalized attribute
content:
[ AttributeValueContent1 ..., AttributeValueContentn ]AttributeContent-unit, n > 1 |
== |
fs:item-sequence-to-untypedAtomic (
[ AttributeValueContent1
]AttributeContent , text { ""
}, ..., text {""}, [
AttributeValueContentn]AttributeContent) |
Literal characters, escaped curly braces, character references, and predefined entity references in attribute content are treated as in element content. In addition, the normalization rule for characters in attributes assumes:
that an escaped single or double quote is converted to an individual single or double quote.
The following normalization rules take the longest consecutive sequence of individual characters that include literal characters, escaped curly braces, escaped quotes, character references, predefined entity references, and escaped single and double quotes and normalizes the character sequence as a string.
[( Char | CharRef | EscapeQuot | EscapeApos | PredefinedEntityRef ) +]AttributeContent |
== |
fn:codepoints-to-string (( Char | CharRef |
EscapeQuot | EscapeApos | PredefinedEntityRef
)+) |
We normalize an enclosed expression in attribute content by normalizing each individual expression in its expression sequence and then construct a sequence of the normalized values:
[ { Expr0, ..., Exprn } ]AttributeContent |
== |
([ Expr0 ]Expr , ..., [ Exprn]Expr) |
Notation
The auxiliary mapping rules []NamespaceAttr, and []NamespaceAttrs are defined in this section and are used for the normalization of namespace declaration attributes.
Direct attribute may contain namespace-declaration attributes. The normalization rules for namespace-declaration attributes ignore all non-namespace attributes -- they are handled by the normalization rules in [4.7.1.1 Attributes].
An AttributeList containing namespace-declaration attributes is normalized by the following rule, which maps each of the individual namespace-declaration attributes in the attribute list and constructs a sequence of the normalized namespace attribute values.
[
|
|||
== | |||
|
Attributes whose prefix is not xmlns
are ignored by mapping
them to the empty sequence.
|
||
== | ||
() |
Namespace-declaration attributes are normalized to computed namespace constructors.
|
||
== | ||
namespace LocalPart { [AttributeValue]AttributeContent} |
The rules for normalizing element content are given above in [4.7.1 Direct Element Constructors].
Section 3.7.1.4
Whitespace in Element ContentXQ describes how
whitespace in element and attribute constructors is processed depending on
the value of the xmlspace
declaration in the query prolog. the
Formal Semantics assumes that the rules for handling whitespace are applied
prior to normalization rules, for example, during parsing of a query.
Therefore, there are no formal rules for handling whitespace.
Direct element constructors are normalized into computed element constructors. The rules for assigning a type to a constructed element are given in [4.7.3.1 Computed Element Constructors].
[106 (XQuery)] | CdataSection |
::= | "<![CDATA[" Char* "]]>" |
[107 (XQuery)] | XmlPI |
::= | "<?" PITarget Char*
"?>" |
[18 (XQuery)] | PITarget |
::= | NCName |
[108 (XQuery)] | XmlComment |
::= | "<!--" Char* "-->" |
Core Grammar
The Core grammar productions for other constructors and comments are:
[75 (Core)] | XmlComment |
::= | "<!--" Char* "-->" |
A literal XML character data (CDATA) section is normalized into a computed text-node constructor by applying the rule for converting characters to a text node in element content.
[<![CDATA[" Char* "]]>]ElementContent |
== |
[Char*]ElementContent |
A literal XML processing instruction is normalized into a computed processing-instruction constructor; its character content is converted to a string as in attribute content.
[<? NCName Char* ?>"]Expr |
== |
processing-instruction NCName { [Char*]AttributeContent } |
A literal XML comment is normalized into a computed comment constructor; its character content is converted to a string as in attribute content.
[<!-- Char* -->]Expr |
== |
comment { [Char*]AttributeContent } |
There are no additional static type rules for CDATA or direct processing-instruction constructors.
There are no additional dynamic evaluation rules for CDATA or direct processing-instruction constructors.
There are no additional error semantics rules for CDATA or direct processing-instruction constructors.
[81 (XQuery)] | ComputedConstructor |
::= | CompElemConstructor |
[100 (XQuery)] | CompElemConstructor |
::= | (("element" QName "{") | ("element" "{" Expr "}" "{")) Expr?
"}" |
[102 (XQuery)] | CompAttrConstructor |
::= | (("attribute" QName "{") | ("attribute" "{" Expr "}" "{")) Expr?
"}" |
[99 (XQuery)] | CompDocConstructor |
::= | "document" "{" Expr "}" |
[105 (XQuery)] | CompTextConstructor |
::= | "text" "{" Expr? "}" |
[103 (XQuery)] | CompXmlPI |
::= | (("processing-instruction" NCName "{") | ("processing-instruction"
"{" Expr "}" "{")) Expr? "}" |
[104 (XQuery)] | CompXmlComment |
::= | "comment" "{" Expr "}" |
[101 (XQuery)] | CompNSConstructor |
::= | ("namespace" NCName "{") Expr
"}" |
Notation
Computed namespace constructors may occur explicitly in a computed element constructor or may be the result of normalizing namespace-declaration attributes contained in direct element constructors. For computed namespace constructors that occur explicitly in a query, the immediately enclosing expression of the computed namespace constructor must be a computed element constructor; otherwise, as specified a static error is raised. To simplify typing and evaluation, we assume that the mapping function []SplitNamespaces reorders the expressions in a computed element constructor so that all computed namespace constructors preceed all other expressions. We do not define this mapping function formally.
Core Grammar
The Core grammar rule for computed element constructors is:
[69 (Core)] | CompElemConstructor |
::= | (("element" QName "{") | ("element"
"{" Expr "}" "{")) Expr? "}" |
Computed element constructors are normalized by mapping their name and content expression.
[element QName { Expr }]Expr |
== |
element [QName]Expr { [ [Expr]SplitNamespaces ]Expr } |
When the name of an element is computed, the normalization rule also
checks that the value of the element's name is a xs:QName
or a
xs:string
. If the name expression returns a string, that string
is implicitly cast to a QName by using the xs:QName
constructor
function. The resulting expanded QName is used as the name of the constructed
element.
[element { Expr1 } { Expr2 }]Expr | |||||||
== | |||||||
|
The normalization rules of direct element and attribute constructors leave us with only the computed forms of constructors. The static and dynamic semantic rules are defined on all the computed forms. The computed element constructor itself has two forms: one in which the element name is a literal QName, and the other in which the element name is a computed expression.
We start with the static rule for an element constructor with a computed
name expression, because it is the simplest rule. Because the element's name
cannot be known until runtime, the element is given the wildcard type,
element * of type xdt:untypedAny
. The computed name
expression must have type xs:QName
and the content expression
must have a type of zero-or-more attributes followed by zero-or-more element,
text, comment, or processing-instruction nodes. Note that a computed
namespace constructor has the empty type and therefore does not effect the
type of the element's content.
|
|||
|
|||
statEnv |- element {
Expr1 } {
Expr2 } : element * of type xdt:untypedAny |
In the first rule for a computed element constructor with a literal QName,
we consider the case in which the validation mode is set to "skip", which
means that all type information in the element is ignored and the new element
is labeled with the type xdt:untypedAny
. The content expression
must have a type of zero-or-more attributes followed by zero-or-more element,
text, comment, or processing-instruction nodes. The result type is an element
of type xdt:untypedAny
.
|
||||
|
||||
statEnv |- element
QName { Expr } : element expanded-QName of type xdt:untypedAny |
In the remaining rules, the validation mode is either "lax" or "strict", and the static rules for element constructors check that the type of the element's content expression are valid with respect to the type of the element being constructed.
Recall that in the static semantics of arithmetic, general comparison, and
function calls, the static rules are liberal when an expression with
type xdt:untypedAtomic
is
used in the context where an atomic typed expression is required. In these
cases, it is not possible to determine statically whether the evaluation-time
cast of the xdt:untypedAtomic
value to the
required type will succeed, but the static rules optimistically assume that
they will. In all other cases, the static rules are conservative and
require that the type of the argument expression be a subtype of the required
type. We apply a similar static semantics to element constructors in which
the required type of the constructed element is a simple type, i.e., a
subtype of xdt:anyAtomicType
*
The first rule below applies to element constructors whose required type is a simple type. If the content expression is a sequence of expressions all of which have type untypedAtomic or text, we apply a liberal static rule (i.e., assume the run-time validation will succeed) and assign the appropriate static type:
|
||||||
|
||||||
statEnv |- element QName { Expr } : element expanded-QName of type TypeName |
The second rule also applies to element constructors whose required type
is a simple type and whose content expression is not a sequence of
expressions of type xdt:untypedAtomic
or text. In this
case, the static rule is conservative and requires that the type of the
context expression be a subtype of the element's required type.
|
|||||||
|
|||||||
statEnv |- element QName { Expr } : element expanded-QName of type TypeName |
The third rule applies to element constructors whose required type is complex. In this case, the static type rule is conservative: the type of the element's content expression must be a subtype of its required type.
|
|||||||
|
|||||||
statEnv |- element QName { Expr } : element expanded-QName of type TypeName |
The following rules take a computed element constructor expression and construct an element node. The dynamic semantics for computed element constructors is the most complex of all expressions in XQuery. Here is how to read the rule below.
First, the element's name is expanded into a qualified name.
Second, the element's content expression is partitioned into the computed namespace constructors and all other expressions, and the computed namespace constructors are evaluated, yielding a sequence of namespace annotations. The static environment is extended to include the new namespace annotations, which are all active.
Third, the function fs:item-sequence-to-node-sequence
is applied to the element's content expression (excluding computed namespace
constructors); this function call is evaluated in the new static and dynamic
environment. Recall from [4.7.1 Direct
Element Constructors] that during normalization, we do not convert
the content of direct element constructors that contain one element-content
unit. This guarantees that useful type information is preserved for static
analysis. Since the conversion function fs:item-sequence-to-node-sequence
was not applied to all element constructors during normalization, we have to
apply it at evaluation time. (Obviously, it is possible to elide the
application of fs:item-sequence-to-node-sequence
injected during normalization and the application injected during
evaluation.) The resulting value Value0 must
match zero-or-more attributes followed by zero-or-more element, text,
processing-instruction or comment nodes.
Fourth, The namespace annotations are concatenated with the list of active namespaces in the namespace environment statEnv.namespace and the namespaces corresponding to the element's name and all attributes names. The resulting sequence is the sequence of namespace annotations for the element.
Lastly, recall that an element constructor automatically validates the
constructed element node, using the validation mode and validation context
from its static context. The last step in evaluation is to construct a new
element with type annotation xdt:untypedAny
and with content
Value0
and then validate the element using the validation mode and context from the
static environment.
|
|||||||||||||||
|
|||||||||||||||
statEnv dynEnv |- element QName { Expr } => Value1 |
The dynamic evaluation of an element constructor with a computed name is similar.
|
|||||||||||||||
|
|||||||||||||||
statEnv dynEnv |- element { Expr1 } { Expr2 } => Value2 |
The default rules for propogating errors, described in [3.5 Error Handling] apply to element constructors.
In addition, a computed element constructor with a computed name raises a
type error if the name value is not a xs:QName
.
|
||
|
||
dynEnv |- element { Expr1 } { Expr2 } raises typeError |
Both forms of computed element constructors raise a type error if the element's content is not a sequence of attributes followed by a sequence of element, text, comment, processing-instruction nodes, or atomic values.
Core Grammar
The Core grammar rule for computed attribute constructors is:
[71 (Core)] | CompAttrConstructor |
::= | (("attribute" QName "{") |
("attribute" "{" Expr "}" "{")) Expr? "}" |
Computed attribute constructors are normalized by mapping their name and content expression in the same way that computed element constructors are normalized.
[attribute { Expr1 } { Expr2 }]Expr | |||||||
== | |||||||
|
The normalization rules for direct attribute constructors leave us with only the computed form of the attribute constructors. Like a computed element constructors, a computed attribute constructor has two forms: one in which the attribute name is a literal QName, and the other in which the attribute name is a computed expression.
We start with the static rule for an attribute constructor with a computed
name expression, because it is the simplest rule. The computed name
expression must have type xs:QName
. The result type is an
attribute of type xdt:untypedAtomic
.
|
||
|
||
statEnv |- attribute {
Expr1 } {
Expr2 } : attribute * of type xdt:untypedAtomic |
In the first rule for a computed attribute constructor with a literal
QName, we consider the case in which the validation mode is set to "skip",
which means that all type information in the attribute is ignored and the new
attribute is labeled with the type xdt:untypedAtomic
.
|
|||
|
|||
statEnv |- attribute
QName { Expr } : attribute expanded-QName of type xdt:untypedAtomic |
In the remaining rules, the validation mode is either "lax" or "strict", and the static rules for attribute constructors check that the type of the attribute's content expression are valid with respect to the type of the attribute being constructed.
As in element constructors, the static rules are liberal when a single
xdt:untypedAtomic
content
expression is provided as an argument and conservative, otherwise.
If the content expression is a sequence of expressions all of which are untypedAtomic, we apply a liberal static rule (i.e., assume the validation will succeed) and assign the appropriate static type:
|
|||||
|
|||||
statEnv |- attribute QName { Expr } : attribute expanded-QName TypeReference |
In the second case, the static type rule is conservative: the type of the attribute's content expression must be a subtype of its required type.
|
|||||||
|
|||||||
statEnv |- attribute QName { Expr } : attribute expanded-QName of type TypeName |
The following rules take a computed attribute constructor expression and
construct an attribute node. The rules are similar to those rules for element
constructors. First, the attribute's name is expanded into a qualified name.
Second, the function fs:item-sequence-to-untypedAtomic
is applied to the content expression and this function call is evaluated in
the dynamic environment. Recall from [4.7.3.2 Computed Attribute
Constructors] that during normalization, we do not convert the
content of direct attribute constructors that contain one attribute-content
unit. This guarantees that useful type information is preserved for static
analysis. Since the conversion function fs:item-sequence-to-untypedAtomic
was not applied to all attribute constructors during normalization, we have
to apply it at evaluation time. (As before, it is possible to elide the
application of fs:item-sequence-to-untypedAtomic
injected during normalization and the application injected during
evaluation.)
|
|||
|
|||
dynEnv |- attribute
QName { Expr } => attribute expanded-QName of type xdt:untypedAtomic { Value } |
|
|||
|
|||
dynEnv |- attribute {
Expr } { Expr } => attribute { Value0 } of
type xdt:untypedAtomic {
Value } |
The default rules for propogating errors, described in [3.5 Error Handling] apply to attribute
constructors. In addition, an attribute constructor with a computed name
raises a type error if the name value is not a xs:QName
. the
xmlns
namespace.
|
||
|
||
dynEnv |- attribute { Expr1 } { Expr2 } raises typeError |
A dynamic error is raised if the namespace URI of the attribute's QName,
whether known statically or dynamically, is in the xmlns
namespace.
|
|||
|
|||
dynEnv |- attribute { Expr1 } { Expr2 } raises dynError |
Core Grammar
The Core grammar rule for a computed document constructor is:
[68 (Core)] | CompDocConstructor |
::= | "document" "{" Expr "}" |
A document node constructor contains an expression, which must evaluate to a sequence of element, text, comment, or processing-instruction nodes. Section 3.7.3.3 Document Node ConstructorsXQ specifies the rules for converting a sequence of atomic values and nodes into a sequence of nodes before document construction. The built-in function [6.1.6 The fs:item-sequence-to-node-sequence function] implements this conversion.
[document { Expr }]Expr |
== |
document { fs:item-sequence-to-node-sequence (
[Expr]Expr) } |
The static semantics checks that the type of the argument expression is a
sequence of element, text, processing-instruction, and comment nodes. The
type of the entire expression is a document
with the given
content type.
|
||
|
||
statEnv |- document { Expr } : document { Type } |
The dynamic semantics checks that the argument expression evaluates to a value that is a sequence of element, text, processing-instruction, or comment nodes. The entire expression evaluates to a new document node value. Note that the type annotations for all the nodes in content of a document node are eliminated; the erases to judgment performs this erasure.
|
||
|
||
dynEnv |- document { Expr } => document { Value } |
The default rules for propogating errors, described in [3.5 Error Handling] apply to document node constructors. In addition, if the argument expression evaluates to a value that is not a sequence of element, text, processing-instruction, or comment nodes, a type error is raised.
Core Grammar
The Core grammar rule for a computed text constructor is:
[74 (Core)] | CompTextConstructor |
::= | "text" "{" Expr? "}" |
A text node constructor contains an expression, which must evaluate to an
xs:string
value. Section 3.7.3.4 Text Node
ConstructorsXQ specifies the rules for
converting a sequence of atomic values into a string prior to construction of
a text node. Each node is replaced by its string value. For each adjacent
sequence of one or more atomic values returned by an enclosed expression, a
untyped atomic value is constructed, containing the canonical lexical
representation of all the atomic values, with a single blank character
inserted between adjacent values. As formal specification of these conversion
rules is not instructive, [6.1.7
The fs:item-sequence-to-untypedAtomic function] implements this
conversion.
[text { Expr }]Expr |
== |
text { (fs:item-sequence-to-untypedAtomic (
[Expr]Expr)) cast as
xs:string } |
The static semantics checks that the argument expression has type
xs:string
. The type of the entire expression is an zero-or-one
text
node type. The type is zero-or-one, because no text node is
constructed if the argument of the text node constructor is the empty
string.
The dynamic semantics checks that the argument expression evaluates a
value of type xs:string
. The entire expression evaluates to a
new text node.
The default rules for propogating errors, described in [3.5 Error Handling] apply to text node constructors. In addition, if the argument expression evaluates to a value that is not a string, a type error is raised.
Core Grammar
The Core grammar rule for computed processing-instruction constructors is:
[72 (Core)] | CompXmlPI |
::= | (("processing-instruction" NCName
"{") | ("processing-instruction" "{" Expr "}"
"{")) Expr? "}" |
Computed processing-instruction constructors are normalized by mapping their name and content expression in the same way that computed element and attribute constructors are normalized.
[pi QName { Expr }]Expr |
== |
pi [QName]Expr { ( fs:item-sequence-to-untypedAtomic
([Expr]Expr ))
cast as xs:string } |
[pi { Expr1 } { Expr2 }]Expr | ||||||||
== | ||||||||
|
The static typing rules for processing-instruction constructors are straightforward.
statEnv |-
Expr1 : xs:QName
statEnv |-
Expr2 : xs:string |
|
statEnv |- pi { Expr1 } { Expr2 } : processing-instruction |
The dynamic evaluation rules for computed processing instructions are straightforward.
|
||
|
||
dynEnv |- pi NCName { Expr } => processing-instruction NCName { Value } |
|
|||
|
|||
dynEnv |- pi { Expr1 } { Expr2 } => processing-instruction Value1 { Value2 } |
The default rules for propogating errors, described in [3.5 Error Handling] apply to computed processing-instruction constructors. The normalization rules guarantee that a dynamic error is raised if the target expression is not a string or a QName.
Core Grammar
The Core grammar rule for computed comment constructors is:
[73 (Core)] | CompXmlComment |
::= | "comment" "{" Expr "}" |
Computed processing-instruction constructors are normalized by mapping their content expression.
[comment { Expr }]Expr |
== |
comment { (fs:item-sequence-to-untypedAtomic
([Expr]Expr))
cast as xs:string } |
The static typing rule for computed comment constructors is straightforward.
The dynamic evaluation rule for computed comment constructors is straightforward.
|
||
|
||
dynEnv |- comment { Expr } => comment { Value } |
The default rules for propogating errors, described in [3.5 Error Handling] apply to computed comment constructors.
Core Grammar
The Core grammar rule for computed namespace constructors is:
[70 (Core)] | CompNSConstructor |
::= | ("namespace" NCName "{") Expr "}" |
Computed namespace constructors are normalized by mapping their name and content expression in the same way as computed processing-instruction constructors.
[namespace NCName { Expr }]Expr |
== |
namespace [NCName]Expr { ( fs:item-sequence-to-untypedAtomic
([Expr]Expr ))
cast as xs:string } |
A computed namespace constructor may only occur within a computed element constructor. The result of evaluating a computed element constructor is a namespace annotation, which annotates the element thats results from evaluating the containing computed element constructor. Because the computed namespace constructor has no effect on the type of the element that it annotates, it is given the empty-sequence type.
The dynamic evaluation rules for a computed namespace constructor are given in the evaluation rules for computed element constructors in [4.7.3.1 Computed Element Constructors].
The default rules for propogating errors, described in [3.5 Error Handling] apply to computed namespace constructors.
The semantics of namespace nodes on constructed elements in specified in [4.7.1 Direct Element Constructors] and [4.7.3.1 Computed Element Constructors].
Introduction
[XPath/XQuery] provides [For/FLWR] expressions for iteration, for binding variables to intermediate results, and filtering bound variables according to a predicate.
A FLWRExpr in XQuery 1.0 consists of a sequence of ForClauses and LetClauses, followed by an optional WhereClause, followed by the , as described by the following grammar productions. Each variable binding is preceded by an optional type declaration which specify the type expected for the variable.
[42 (XQuery)] | FLWORExpr |
::= | (ForClause | LetClause)+ WhereClause? OrderByClause? "return" ExprSingle |
[43 (XQuery)] | ForClause |
::= | "for" "$" VarName TypeDeclaration? PositionalVar? "in" ExprSingle ("," "$" VarName TypeDeclaration? PositionalVar? "in" ExprSingle)* |
[45 (XQuery)] | LetClause |
::= | "let" "$" VarName TypeDeclaration? ":=" ExprSingle ("," "$" VarName TypeDeclaration? ":=" ExprSingle)* |
[123 (XQuery)] | TypeDeclaration |
::= | "as" SequenceType |
[44 (XQuery)] | PositionalVar |
::= | "at" "$" VarName |
[46 (XQuery)] | WhereClause |
::= | "where" Expr |
[47 (XQuery)] | OrderByClause |
::= | ("order" "by" | "stable" "order" "by") OrderSpecList |
[48 (XQuery)] | OrderSpecList |
::= | OrderSpec ("," OrderSpec)* |
[49 (XQuery)] | OrderSpec |
::= | ExprSingle OrderModifier |
[50 (XQuery)] | OrderModifier |
::= | ("ascending" | "descending")? (("empty" "greatest") | ("empty"
"least"))? ("collation" StringLiteral)? |
[25 (XPath)] | ForExpr |
::= | SimpleForClause "return"
ExprSingle |
[26 (XPath)] | SimpleForClause |
::= | "for" "$" VarName "in" ExprSingle ("," "$" VarName "in"
ExprSingle)* |
Notation
Individual [For/FLWR] clauses are normalized by means of the auxiliary normalization rules:
Where FLWORClause can be any either a ForClause, a LetClause, a WhereClause, or an OrderByClause. The OrderByClause is discussed in [4.8.4 Order By and Return Clauses].
Normalized FLWOR expressions restrict a For and Let clause to bind only one variable. Otherwise, the Core FLWOR expression is the same as the XQuery FLWOR expression.
Notation
The auxiliary rule []FLWOR(Expr) normalizes a For, Let, or Where clause in a expression. Note that the rule takes the remainder of the FLWOR expression (other For, Let, or Where clauses and the Return clause) as a parameter in Expr.
The [For/FLWR] expressions include the FLWRExpr of XQuery and the ForExpr of XPath. The normalization rule for ForExpr is simple: It simply unrolls a ForExpr that binds multiple variables into nested ForExprs, each of which bind one variable.
[for VarRef0 in Expr0, ..., VarRefn in Exprn return Expr ]Expr | ||||
== | ||||
|
Full FLWRExpr expressions are normalized to nested core expressions using two sets of normalization rules. Note that some of the rules also accept ungrammatical FLWRExprs such as "where Expr1 return Expr2". This does not matter, as normalization is always applied on parsed [XPath/XQuery] expressions, and ungrammatical FLWRExprs would be rejected by the parser beforehand.
The first set of rules is applied on a full [For/FLWR] expression, splitting it at the clause level, then applying further normalization on each separate clause.
[ (ForClause | LetClause | WhereClause | OrderByClause) FLWRExpr ]Expr |
== |
[(ForClause | LetClause | WhereClause | OrderByClause)]FLWOR([FLWRExpr]Expr) |
[ (ForClause | LetClause | WhereClause | OrderByClause) return Expr ]Expr |
== |
[(ForClause | LetClause | WhereClause | OrderByClause)]FLWOR([Expr]Expr) |
Then each [For/FLWR] clause is normalized separately. A ForClause may bind more than one variable, whereas a For expression in the [XPath/XQuery] Core binds and iterates over only one variable. Therefore, a ForClause is normalized to nested for expressions:
[
|
|||
== | |||
|
Note that the additional Expr parameter of the auxiliary normalization rule is used as the final return expression.
Likewise, a LetClause clause is normalized to nested let expressions, each of which binds one variable:
[
|
|||
== | |||
|
A WhereClause is normalized to an IfExpr, with the else-branch returning the empty sequence:
Example
The following simple example illustrates, how a FLWRExpr is
normalized. The for
expression in the example below is used to
iterate over two collections, binding variables $i
and
$j
to items in these collections. It uses a let
clause to binds the local variable $k
to the sum of both
numbers, and a where
clause to select only those numbers that
have a sum equal to or greater than the integer 5
.
for $i as xs:integer in (1, 2), $j in (3, 4) let $k := $i + $j where $k >= 5 return <tuple> <i> { $i } </i> <j> { $j } </j> </tuple>
By the first set of rules, this is normalized to (except for the operators and element constructor which are not treated here):
for $i as xs:integer in (1, 2) return for $j in (3, 4) return let $k := $i + $j return if ($k >= 5) then <tuple> <i> { $i } </i> <j> { $j } </j> </tuple> else ()
For each binding of $i
to an item in the sequence (1 ,
2)
the inner for
expression iterates over the sequence
(3 , 4)
to produce tuples ordered by the ordering of the outer
sequence and then by the ordering of the inner sequence. This core expression
eventually results in the following document fragment:
(<tuple> <i>1</i> <j>4</j> </tuple>, <tuple> <i>2</i> <j>3</j> </tuple>, <tuple> <i>2</i> <j>4</j> </tuple>)
Core Grammar
After normalization, single for
expressions are described by
the following Core grammar production.
[33 (Core)] | FLWORExpr |
::= | (ForClause | LetClause) "return" ExprSingle |
[34 (Core)] | ForClause |
::= | "for" "$" VarName TypeDeclaration? PositionalVar? "in" ExprSingle |
[35 (Core)] | PositionalVar |
::= | "at" "$" VarName |
[84 (Core)] | TypeDeclaration |
::= | "as" SequenceType |
A single for
expression is typed as follows: First Type1 of the
iteration expression Expr1 is
inferred. Then the prime type of Type1 - prime(Type1) - is
determined. This is a union over all item types in Type1 (see also
[7.5 Judgments for sequences of item
types]). With the variable component of the static environment
statEnv extended with VarRef1 as
type prime(Type1), the type
Type2 of
Expr2 is inferred. Because the
for
expression iterates over the result of
Expr1, the final type of the
iteration is Type2 multiplied with the possible number of items in Type1 (one,
?
, *
, or +
). This number is determined
by the auxiliary type-function quantifier(Type1).
|
|||
|
|||
statEnv |- for VarRef1 in Expr1 return Expr2 : Type2 · quantifier(Type1) |
When a positional variable Variablepos is present, the static environment is also extended with
the positional variable typed as an xs:integer
.
|
|||
|
|||
statEnv |- for VarRef1 at VarRefpos in Expr1 return Expr2 : Type2 · quantifier(Type1) |
When a type declaration is present, the static semantics also checks that the type of the input expression is a subtype of the declared type and extends the static environment by typing VarRef1 with type Type0. This semantics is specified by the following typing rule.
|
|||||
|
|||||
statEnv |- for VarRef1 as SequenceType in Expr1 return Expr2 : Type2 · quantifier(Type1) |
The last rule contains a For expression that contains a type declaration and a positional variable. When the positional variable is present, the static environment is also extended with the positional variable typed as an integer.
|
|||||
|
|||||
statEnv |- for VarRef1 as SequenceType at VarRefpos in Expr1 return Expr2 : Type2 · quantifier(Type1) |
Example
For example, if $example
is bound to the sequence 10.0,
1.0E1, 10
of type xsd:decimal, xsd:float, xsd:integer
,
then the query
for $s in $example return $s * 2
is typed as follows:
(1) prime(xsd:decimal, xsd:float, xsd:integer) = xsd:decimal | xsd:float | xsd:integer (2) quantifier(xsd:decimal, xsd:float, xsd:integer) = + (3) $s : xsd:decimal | xsd:float | xsd:integer (4) $s * 2 : xsd:decimal | xsd:float | xsd:integer (5) result-type : ( xsd:decimal | xsd:float | xsd:integer ) +
This result-type is not the most specific type possible. It does not take
into account the order of elements in the input type, and it ignores the
individual and overall number of elements in the input type. The most
specific type possible is: element out {element one {}}, element out
{element two {}}, element out {element three {}}
. However, inferring
such a specific type for arbitrary input types and arbitrary return clauses
requires significantly more complex type inference rules. In addition, if put
into the context of an element, the specific type violates the "unique
particle attribution" restriction of XML schema, which requires that an
element must have a unique content model within a particular context.
The evaluation of a for
expression distinguishes two cases:
If the iteration expression Expr1
evaluates to the empty sequence, then the entire expression evaluates to the
empty sequence:
dynEnv |- Expr1 => Value empty(Value) |
|
dynEnv |- for VarRef1 TypeDeclaration? in Expr1 return Expr2 => () |
Otherwise, the iteration expression Expr1, is evaluated to produce the sequence Item1, ...,
Itemn.
For each item Itemi in this sequence, the body of the for
expression Expr2 is evaluated in
the environment dynEnv extended
with VarRef1 bound to Itemi. This
produces values Valuei, ..., Valuen which
are concatenated to produce the result sequence.
|
|||||
|
|||||
dynEnv |- for VarRef in Expr1 return Expr2 => Value1 ,..., Valuen |
The following rule is the same as the rule above, but includes the optional positional variable VarRefpos If present, VarRefpos is bound to the position of the item in the input sequence, i.e., the value i.
|
||||||
|
||||||
dynEnv |- for VarRef at VarRefpos in Expr1 return Expr2 => Value1 ,..., Valuen |
When a type declaration is present, the dynamic semantics also checks that each item in the result of evaluating Expr1 matches the declared type. This semantics is specified by the following dynamic rule.
|
||||||||
|
||||||||
dynEnv |- for VarRef as SequenceType in Expr1 return Expr2 => gr_Value1; ,..., Valuen |
The last rule covers a For expresstion that contains a type declaration and a positional variable.
|
|||||||||
|
|||||||||
dynEnv |- for VarRef as SequenceType at VarRefpos in Expr1 return Expr2 => Value1 ,..., Valuen |
Note that this definition allows non-deterministic evaluation of the resulting sequence, since the judgments above the inference rule can be evaluated in any order.
If evaluation of the first expression raises an error, the entire expression raises an error. This rule applies to all forms of a For expression, i.e., those with or without a type declaration or positional variable.
dynEnv |- Expr1 raises Error |
|
dynEnv |- for Variable1 TypeDeclaration1? PositionalVar1? in Expr1 return Expr2 raises Error |
If any evaluation of the body of the for expression raises an error, then the entire expression raises an error. This rule applies to for expressions with or without the type declaration.
|
||||
|
||||
dynEnv |- for VarRef TypeDeclaration? in Expr1 return Expr2 raises Error |
|
||||
|
||||
dynEnv |- for VarRef TypeDeclaration? at VarRefpos in Expr1 return Expr2 raises Error |
When a type declaration is present, a type error is raised if any item in the result of evaluating Expr1 does not match the declared type.
|
||||
|
||||
dynEnv |- for VarRef as SequenceType PositionalVar1? in Expr1 return Expr2 raises typeError |
Example
Note that if the expression in the return
clause results in a
sequence, sequences are never nested in the [XPath/XQuery] data model. For
instance, in the following for expression:
for $i in (1,2) return (<i> {$i} </i>, <negi> {-$i} </negi>)
each iteration in the for
results in a sequence of two
elements, which are then concatenated and flattened in the resulting
sequence:
(<i>1</i>, <negi>-1</negi>, <i>2</i>, <negi>-2</negi>)
Core Grammar
After normalization, single let
expressions are described by
the following Core grammar production.
[36 (Core)] | LetClause |
::= | "let" "$" VarName TypeDeclaration? ":=" ExprSingle |
A let
expression extends the type environment statEnv with
Variable1 of type Type1 inferred
from Expr1, and infers the type of
Expr2 in the extended environment
to produce the result type Type2.
|
||
|
||
statEnv |- let VarRef := Expr1 return Expr2 : Type2 |
When a type declaration is present, the static semantics also checks that the type of the input expression is a subtype of the declared type and extends the static environment by typing Variable1 with type Type0. This semantics is specified by the following static rule.
|
|||||
|
|||||
statEnv |- let VarRef1 as SequenceType := Expr1 return Expr2 : Type2 |
A let
expression extends the dynamic environment dynEnv with Variable bound to
Value1
returned by Expr1, and evaluates
Expr2 in the extended environment
to produce Value2.
|
|||
|
|||
dynEnv |- let VarRef1 := Expr1 return Expr2 => Value2 |
When a type declaration is present, the dynamic semantics also checks that the result of evaluating Expr1 matches the declared type. This semantics is specified as the following dynamic rule.
|
|||||
|
|||||
dynEnv |- let VarRef1 as SequenceType := Expr1 return Expr2 => Value2 |
The default rules for propogating errors, described in [3.5 Error Handling] apply to let expressions. In addition, in the case that a type declaration is present, a type error is raised if the result of evaluating Expr1 does not match the declared type.
|
||||
|
||||
dynEnv |- let VarRef1 as SequenceType := Expr1 return Expr2 raises typeError |
Example
Note the use of the environment discipline to define the scope of each
variable. For instance, in the following nested let
expression:
let $k := 5 return let $k := $k + 1 return $k+1
the outermost let
expression binds variable $k
to the integer 5
in the environment, then the expression
$k+1
is computed, yielding value 6
, to which the
second variable $k
is bound. The expression then results in the
final integer 7
.
[37 (Core)] | OrderByClause |
::= | ("order" "by" | "stable" "order" "by") OrderSpecList |
[38 (Core)] | OrderSpecList |
::= | OrderSpec ("," OrderSpec)* |
[39 (Core)] | OrderSpec |
::= | ExprSingle OrderModifier |
[40 (Core)] | OrderModifier |
::= | ("ascending" | "descending")? (("empty" "greatest") | ("empty"
"least"))? ("collation" StringLiteral)? |
Introduction
The dynamic semantics of the OrderByClause is not specified formally. Because an OrderByClause does not effect the type of a expression, the static semantics of a expression with an OrderByClause is equivalent to the static semantics of the same expression with the OrderByClause omitted. The dynamic semantics is not specified formally as it would require the introduction of tuples, which are not supported in the [XPath/XQuery] data model. The dynamic semantics of the order-by clause can be found in Section 3.8.3 Order By and Return ClausesXQ.
Notation
To define normalization of OrderBy, the following auxiliary mapping rule is used.
[OrderSpecList]OrderSpecList |
== |
Expr |
which specify that OrderSpecList is mapped to Expr.
An OrderByClause is normalized to a Let clause, nested For expressions, and atomization, which guarantees that the OrderSpecList is well typed. Note that if evaluated dynamically, the normalization of OrderByClause given here does not express the required sorting semantics, but this normalization does provide the correct static type.
[ stable? order by OrderSpecList]FLWOR(Expr) |
== |
let $fs:new0 := [OrderSpecList]OrderSpecList return Expr |
Each OrderSpec is normalized the auxiliary atomization normalization rule.
[ Expr OrderModifier, OrderSpecList]OrderSpecList |
== |
for $fs:new in
fn:data (Expr) [OrderSpecList]OrderSpecList |
[ Expr OrderModifier]OrderSpecList |
== |
for $fs:new in
fn:data (Expr) return () |
Introduction
The fn:unordered
function returns its input sequence in any
order.
Notation
The dynamic semantics for unordered use an auxiliary judgments which disregards order between the items in a sequence.
The following judgment
dynEnv |- Value1 permutes to Value2holds if the sequence of items in Value2 is a permutation of the sequence of items in Value1.
The static semantics for unordered uses the auxiliary type functions prime(Type) and quantifier(Type); which are defined in [7.5 Judgments for sequences of item types]. The type of each argument is determined, and then prime(.) and quantifier(.) are applied to the sequence type (Type1, Type2).
The dynamic semantics for unordered is specified using the auxiliary judgments permutes to as follows.
|
|||
|
|||
|
Note that the permutes to judgments is non deterministic. There are many sequences that are permutations of a given sequence. Any of those permutations satisfies the above semantics.
The default rules for propogating errors, described in [3.5 Error Handling] apply to the unordered expression.
Introduction
A conditional expression supports conditional evaluation of one of two expressions.
[54 (XQuery)] | IfExpr |
::= | "if" "(" Expr ")" "then" ExprSingle "else" ExprSingle |
Core Grammar
The Core grammar rule for the conditional expression is:
[44 (Core)] | IfExpr |
::= | "if" "(" Expr ")" "then" ExprSingle "else" ExprSingle |
[if (Expr1) then Expr2 else Expr3]Expr | |
== | |
|
statEnv |-
Expr1 : xs:boolean
statEnv |-
Expr2 : Type2
statEnv
|- Expr3 : Type3 |
|
statEnv |- if
(Expr1) then
Expr2 else
Expr3 : (Type2 | Type3) |
If the conditional's boolean expression Expr1 evaluates to true, Expr2 is evaluated and its value is produced. If the conditional's boolean expression evaluates to false, Expr3 is evaluated and its value is produced. Note that the existence of two separate evaluation rules ensures that only one branch of the conditional is evaluated.
dynEnv |- Expr1 => true dynEnv |- Expr2 => Value2 |
|
dynEnv |- if Expr1 then Expr2 else Expr3 => Value2 |
dynEnv |- Expr1 => false dynEnv |- Expr3 => Value3 |
|
dynEnv |- if
Expr1 then
Expr2 else
Expr3 => Value3 |
If the conditional's boolean expression raises an error, then the conditional expression raises an error.
If the conditional's boolean expression Expr1 evaluates to true, and Expr2 raises an error, then the conditional expression raises an error. Conversely, if the conditional's boolean expression evaluates to false, and Expr3 raises an error, then the conditional raises an error.
Introduction
[XPath/XQuery] defines two quantification expressions:
[51 (XQuery)] | QuantifiedExpr |
::= | (("some" "$") | ("every" "$")) VarName TypeDeclaration? "in" ExprSingle ("," "$" VarName TypeDeclaration? "in" ExprSingle)* "satisfies" ExprSingle |
[27 (XPath)] | QuantifiedExpr |
::= | (("some" "$") | ("every" "$")) VarName "in" ExprSingle ("," "$"
VarName "in" ExprSingle)* "satisfies" ExprSingle |
Core Grammar
The quantified expressions are in the core.
[41 (Core)] | QuantifiedExpr |
::= | (("some" "$") | ("every" "$")) VarName TypeDeclaration? "in" ExprSingle ("," "$" VarName TypeDeclaration? "in" ExprSingle)* "satisfies" ExprSingle |
The quantified expressions are normalized into nested Core quantified expressions, each of which binds one variable.
[some VarRef1 in Expr1, ..., VarRefn in Exprn satisfies Expr]Expr | |||||
== | |||||
|
[every VarRef1 in Expr1, ..., VarRefn in Exprn satisfies Expr]Expr | |||||
== | |||||
|
The static semantics of the quantified expressions uses the prime operator on types, which is explained in [7.5 Judgments for sequences of item types]. These rules are similar to those for For expressions in [4.8.2 For expression].
|
|||
|
|||
statEnv |- some VarRef1 in
Expr1 satisfies
Expr2 : xs:boolean |
The next rule is for SomeExpr with the optional type declaration.
|
|||||
|
|||||
statEnv |- some VarRef1 as
SequenceType in Expr1
satisfies Expr2 : xs:boolean |
The next rule is for EveryExpr without the optional type declaration.
|
|||
|
|||
statEnv |- every VarRef1 in
Expr1 satisfies
Expr2 : xs:boolean |
The next rule is for EveryExpr with the optional type declaration.
|
|||||
|
|||||
statEnv |- every VarRef1 as
SequenceType in Expr1
satisfies Expr2 : xs:boolean |
The existentially quantified "some" expression yields true if any evaluation of the satisfies expression yields true. The existentially quantified "some" expression yields false if every evaluation of the satisfies expression is false. A quantified expression may raise an error if any evaluation of the satisfies expression raises an error. The dynamic semantics of quantified expressions is non-deterministic. This non-determinism permits implementations to use short-circuit evaluation strategies when evaluating quantified expressions.
|
||||
|
||||
dynEnv |- some VarRef1 in Expr1 satisfies Expr2 => true |
The next rule is for SomeExpr with the optional type declaration, in which some evaluation of the satisfies expression yields true.
|
||||||
|
||||||
dynEnv |- some VarRef1 as SequenceType in Expr1 satisfies Expr2 => true |
The next rule is for SomeExpr without the optional type declaration, in which all evaluations of the satisfies expression yields false.
|
|||||
|
|||||
dynEnv |- some VarRef1 in Expr1 satisfies Expr2 => false |
The next rule is for SomeExpr with the optional type declaration, in which all evaluations of the satisfies expression yields false.
|
||||||||
|
||||||||
dynEnv |- some VarRef1 as SequenceType in Expr1 satisfies Expr2 => false |
The universally quantified "every" expression yields false if any evaluation of the satisfies expression yields false. The universally quantified "every" expression yields true if every evaluation of the satisfies expression is true.
|
||||
|
||||
dynEnv |- every VarRef1 in Expr1 satisfies Expr2 => false |
The next rule is for EveryExpr with the optional type declaration, in which some evaluation of the satisfies expression yields false.
|
||||||
|
||||||
dynEnv |- every VarRef1 as SequenceType in Expr1 satisfies Expr2 => false |
The next rule is for EveryExpr in which all evaluations of the satisfies expression yields true.
|
|||||
|
|||||
dynEnv |- every VarRef1 in Expr1 satisfies Expr2 => true |
The next rule is for EveryExpr with the optional type declaration in which all evaluations of the satisfies expression yields true.
|
||||||||
|
||||||||
dynEnv |- every VarRef1 as SequenceType in Expr1 satisfies Expr2 => true |
An existentially quantified expression raises an error if:
evaluation of the first expression raises an error,
any evaluation of the satisfies expression raises an error.
|
||
|
||
dynEnv |- some VarRef1 TypeDeclaration ? in Expr1 satisfies Expr2 raises Error |
|
||||
|
||||
dynEnv |- some VarRef1 TypeDeclaration? in Expr1 satisfies Expr2 raises Error |
|
||||
|
||||
dynEnv |- every VarRef1 in Expr1 satisfies Expr2 raises Error |
An existentially quantified expression raises a type error if evaluation of the first expression yields a value, and some item in this value does not match the declared type of the variable.
Introduction
Expressions on SequenceTypes are expressions whose semantics depends on the type of some of the sub-expressions to which they are applied. The syntax of SequenceType expressions is described in [3.4.3 SequenceType Syntax].
[57 (XQuery)] | InstanceofExpr |
::= | TreatExpr ( "instance" "of"
SequenceType )? |
Introduction
The SequenceType expression "Expr instance of SequenceType" is true if and only if the result of evaluating expression Expr is an instance of the type referred to by SequenceType.
An InstanceofExpr expression is normalized into a TypeswitchExpr expression. Note that the following normalization rule uses a variable $fs:new, which is a newly created variable which must not conflict with any variables already in scope. This variable is necessary to comply with the syntax of typeswitch expressions in the core [XPath/XQuery], but is never used.
[52 (XQuery)] | TypeswitchExpr |
::= | "typeswitch" "(" Expr ")" CaseClause+ "default" ("$" VarName)? "return" ExprSingle |
[53 (XQuery)] | CaseClause |
::= | "case" ("$" VarName "as")?
SequenceType "return" ExprSingle |
Introduction
The typeswitch expression chooses one of several expressions to evaluate based on the dynamic type of an input value.
Each branch of a typeswitch expression may have an optional VarRef, which is bound the value of the input expression. This variable is optional in [XPath/XQuery] but mandatory in the [XPath/XQuery] core. One of the reasons for having this variable is that it is assigned a specific type for the corresponding branch.
Core Grammar
The Core grammar productions for typeswitch are:
[42 (Core)] | TypeswitchExpr |
::= | "typeswitch" "(" Expr ")" CaseClause+ "default" ("$" VarName)? "return" ExprSingle |
[43 (Core)] | CaseClause |
::= | "case" ("$" VarName "as")? SequenceType "return" ExprSingle |
Notation
To define normalization of case clauses to the [XPath/XQuery] core, the following auxiliary mapping rule is used.
[CaseClause]Case |
== |
CaseClause |
specifies that CaseClause is mapped to CaseClause, in the [XPath/XQuery] type system.
Normalization of a typeswitch expression guarantees that every branch has an associated VarRef. The following normalization rule adds a newly created variable that does not appear in the rest of the query. Note that $fs:new is a newly generated variable that must not conflict with any variables already in scope and that is not used in any of the sub-expressions.
[
|
|||||
== | |||||
|
Notation
The following auxiliary grammar production is used to identify branches of the typeswitch.
[76 (Formal)] | CaseRules |
::= | ("case" "$" VarName "as" SequenceType "return" Expr CaseRules) |
("default" "$" VarName "return" Expr) |
The following judgment
statEnv |- Type1 case CaseClause : Typeis used in the static of typeswitch. It indicates that under the environment statEnv, and with the input type of the typeswitch being Type1, the given case rule yields the type Type.
The following judgment
dynEnv |- Value1 against CaseRules => Value2is used in the dynamic semantics of typeswitch. It indicates that under the environment dynEnv, with the input value of the typeswitch being Value1, the given case rules yields the value Value2.
The static typing rules for the typeswitch expression are simple. Each case clause and the default clause of the typeswitch is typed independently. The type of the entire typeswitch expression is the union of the types of all the clauses.
|
|||||||||
|
|||||||||
|
To type one case clause, the case variable is assigned the type of the case clause CaseType and the body of the clause is typed in the extended environment. Thus, the type of a case clause is independent of the type of the input expression.
|
|||
|
|||
Type0 case case VarRef as SequenceType return Expr : Type |
To type the default clause, the variable is assigned the type of the input expression and the body of the default clause is typed in the extended environment.
The evaluation of a typeswitch proceeds as follows. First, the input
expression is evaluated, yielding an input value. The effective case
is the first case
clause such that the input value matches the
SequenceType in the case
clause. The return
clause
of the effective case is evaluated and the value of the return
expression is the value of the typeswitch expression.
|
|||
|
|||
dynEnv |- typeswitch (Expr) CaseRules => Value1 |
If the value matches the sequence type, the following rule applies: It
extends the environment by binding the variable Variable to Value0 and
evaluates the body of the return
clause.
|
||||
|
||||
dynEnv |- Value0 against case VarRef SequenceType return Expr CaseRules => Value1 |
If the value does not match the sequence type, the current case is not evaluated, and the remaining case rules are evaluated order by applying the inference rule recursively.
|
||
|
||
dynEnv |- Value0 against case SequenceType VarRef return Expr CaseRules => Value1 |
The last rule states that the default
branch of a typeswitch
expression always evaluates to the value of its return
clause.
statEnv |- VarRef of var expands to Variable dynEnv + varValue(Variable => Value0) |- Expr => Value1 |
|
dynEnv |- Value0 against default VarRef return Expr => Value1 |
The default rules for propogating errors, described in [3.5 Error Handling] apply to the typeswitch expression. In particular, if evaluation of the input expression or evaluation of any case rule raises an error, then the typeswitch expression raises an error.
[60 (XQuery)] | CastExpr |
::= | ComparisonExpr ( "cast"
"as" SingleType )? |
[124 (XQuery)] | SingleType |
::= | AtomicType "?"? |
Cast expressions change the atomic type of an atomic value to a given atomic type.
Core Grammar
The Core grammar production for cast as is:
[48 (Core)] | CastExpr |
::= | ValueExpr ( "cast" "as" SingleType )? |
[85 (Core)] | SingleType |
::= | AtomicType "?"? |
Introduction
The expression "( Expr ) cast as AtomicType " is used to change the atomic type of an atomic value from one atomic type to another. It changes both the type and value of the result of an expression, and can only be applied to an atomic value.
The semantics of cast expressions depends on the specification given in Section 17 CastingFO. For any source and target primitive types, the casting table in Section 17 CastingFO indicates whether the cast from the source type to the target type is permitted. When a cast is permitted, the detailed dynamic rules for cast in Section 17 CastingFO are applied. These rules are not specified further here.
The normalization of cast applies atomization to its argument. The type declaration asserts that the result is a single atomic value. The second normalization rule applies when the target type is optional.
[(Expr) cast as AtomicType ]Expr | ||
== | ||
|
[(Expr) cast as AtomicType? ]Expr | ||||
== | ||||
|
Notation
The following judgments access the contents of the casting table in Section 17 CastingFO and are used to specify the static and dynamic semantics of casting.
The judgment:
holds if casting from atomic type AtomicType1 to AtomicType2 is always possible (Y), may be possible (M), or is not allowed (N), as specified in Section 17 CastingFO. The judgment also holds for derived atomic types. The following rule specifies that for any pair of types AtomicType1 and AtomicType2 such that AtomicType1 is derived from some primitive type AtomicType1' and AtomicType2 is derived from some primitive type AtomicType2', AtomicType1 may be cast to AtomicType2 if and only if AtomicType1' may be cast to AtomicType2'.
|
||||
|
||||
statEnv |- AtomicType1 cast allowed AtomicType2 = X |
The following auxilliary rule denotes the result of applying the casting rules in Section 17 CastingFO, such that Value1 cast to Type2 yields the value Value2.
If the cast table indicates the cast is not allowed (N), a static type error is raised. Otherwise, the following static typing rules give the static semantics of "cast as" expression.
|
|||
|
|||
statEnv |- (Expr) cast as AtomicType2 : AtomicType2 |
|
|||
|
|||
statEnv |- (Expr) cast as AtomicType2 : AtomicType2 |
If the cast is allowed (Y or M), the following evaluation rules apply the casting rules on the result of the input expression.
|
||||
|
||||
|
|
||||
|
||||
|
Note that in the case that the casting table indicates "M", the casting
operation is allowed but might fail during evaluation if the input value does
not satsify the lexical and value constraints of the target atomic type
(e.g., attempting to cast the string "VRAI" into xs:boolean
). In
that case, the dynamic evaluation raises a dynamic error.
The default rules for propogating errors, described in [3.5 Error Handling] apply to the cast expression. In particular, if Expr1 raises an error, then the cast expression raises an error. In addition, if the casting table returns "N", the cast is not allowed, and the dynamic semantics raises a type error.
|
|||
|
|||
|
[59 (XQuery)] | CastableExpr |
::= | CastExpr ( "castable" "as"
SingleType )? |
Castable expressions check whether a value can be cast to a given type.
Core Grammar
The Core grammar production for castable is:
[47 (Core)] | CastableExpr |
::= | CastExpr ( "castable" "as"
SingleType )? |
The normalization of castable simply maps its expression argument.
[Expr castable as AtomicType]Expr | ||
== | ||
|
[Expr castable as AtomicType]Expr | ||
== | ||
|
The type of a castable expression is always a boolean.
|
statEnv |- Expr
castable as AtomicType : xs:boolean |
If casting succeeds, then the 'castable' expression evaluates to true.
|
|||
|
|||
|
If casting raises a dynamic error, 'castable as' evaluates to false.
|
||
|
||
|
The default rules for propogating errors, described in [3.5 Error Handling] apply to the castable expression.
Constructor functions provide an alternative syntax for casting.
Constructor functions for atomic types are normalized to explicit
cast as
expressions.
[AtomicType(Expr)]Expr |
== |
[(Expr) cast as AtomicType ]Expr |
[58 (XQuery)] | TreatExpr |
::= | CastableExpr ( "treat" "as"
SequenceType )? |
Introduction
The expression "Expr treat as SequenceType", can be used to change the static type of the result of an expression without changing its value. The treat-as expression raises a dynamic error if the dynamic type of the input value does not match the specified type.
Treat as expressions are normalized to typeswitch expressions. Note that the following normalization rule uses a variable $fs:new, which is a newly created variable that does not conflict with any variables already in scope.
[78 (XQuery)] | ValidateExpr |
::= | "validate" SchemaMode?
SchemaContext? "{" Expr "}" |
[12 (XQuery)] | SchemaMode |
::= | "lax" | "strict" | "skip" |
[79 (XQuery)] | SchemaContext |
::= | ("context" SchemaContextLoc) | "global" |
[143 (XQuery)] | SchemaContextLoc |
::= | (SchemaContextPath?
QName) | SchemaGlobalTypeName |
[142 (XQuery)] | SchemaContextPath |
::= | SchemaGlobalContext
"/" (SchemaContextStep
"/")* |
[14 (XQuery)] | SchemaGlobalContext |
::= | QName | SchemaGlobalTypeName |
[15 (XQuery)] | SchemaContextStep |
::= | QName |
Core Grammar
The Core grammar production for validate is:
[53 (Core)] | ValidateExpr |
::= | "validate" SchemaMode? SchemaContext? "{" Expr "}" |
[8 (Core)] | SchemaMode |
::= | "lax" | "strict" | "skip" |
[54 (Core)] | SchemaContext |
::= | ("context" SchemaContextLoc) | "global" |
[104 (Core)] | SchemaContextLoc |
::= | (SchemaContextPath?
QName) | SchemaGlobalTypeName |
[103 (Core)] | SchemaContextPath |
::= | SchemaGlobalContext "/"
(SchemaContextStep
"/")* |
[10 (Core)] | SchemaGlobalContext |
::= | QName | SchemaGlobalTypeName |
[11 (Core)] | SchemaContextStep |
::= | QName |
A validate
expression validates its argument with respect to
the in-scope schema definitions, using the schema validation process
described in [Schema Part 1]. The argument to a
validate expression must be either element or document node. Validation
replaces element and attribute nodes with new nodes that have their own
identity and contain type annotations and defaults created by the validation
process.
Static typing of the validate operation is defined by the following rule. Note the use of a subtyping check to ensure that the type of the expression to validate is either an element or a well-formed document node (i.e., with only one root element and no text nodes).
|
|||||||
|
|||||||
statEnv |- validate SchemaMode? SchemaContext? { Expr } : Type2 |
The normative dynamic semantics of validation is specified in Section 3.13 Validate ExpressionsXQ. The effect of validation of a data model value is equivalent to:
The above steps are expressed formally by the "erasure" and "annotation"
judgments. Formally, validation removes existing type annotations from nodes
("erasure"), and it re-validates the corresponding data model instance,
possibly adding new type annotations to nodes ("annotation"). Both erasure
and annotation are described formally in [7.7
Judgments for the validate expression]. Indeed, the conjunction of
erasure and annotation provides a formal model for a large part of actual
schema validation. The semantics of the validate
expression is
specified as follows.
When the validation mode is either given explicitly as "skip" or is set to
"skip" in statEnv.validationMode, the dynamic
semantics simply erases all existing type annotations and re-labels all
element nodes with xdt:untypedAny
and all attribute nodes
with xdt:untypedAtomic
.
|
||||
|
||||
dynEnv |- validate SchemaMode? SchemaContext? { Expr } => ElementValue2 |
|
||||
|
||||
dynEnv |- validate SchemaMode? SchemaContext? { Expr } => document { ElementValue2 } |
When the validation mode is either given explicitly as "lax" or "strict" or is set to "lax" or "strict" in statEnv.validationMode, the dynamic semantics erases all existing type annotations and then re-validates against the definition of the given element in the given schema context.
|
|||||||
|
|||||||
dynEnv |- validate SchemaMode? SchemaContext? { Expr } => ElementValue3 |
|
|||||||
|
|||||||
dynEnv |- validate SchemaMode? SchemaContext? { Expr } => document { ElementValue3 } |
Note that validation depends upon the order of types in a union type, but that type matching does not.
The default rules for propogating errors, described in [3.5 Error Handling] apply to the validate expression.
Introduction
XQuery supports modules as defined in Section 4 Modules and PrologsXQ. A main moduleXQ contains a PrologXQ followed by a query bodyXQ. A query has exactly one main module. In a main module, the query bodyXQ can be evaluated, and its value is the result of the query. A library moduleXQ contains a module declaration followed by a PrologXQ.
Editorial note | |
The Formal Semantics currently specifies the semantics of main modules only. The semantics of library modules and module import are not specified. See Issue 555 (Formal Semantics of Module Import). |
The Prolog is a sequence of declarations and definitions that effect query processing. The Prolog can be used, for example, to define namespace prefixes, import type definitions from XML Schemas, and define functions and variables. Namespace declarations and schema imports always precede function definitions, as specified by the following grammar productions.
[30 (XQuery)] | Module |
::= | VersionDecl? (MainModule | LibraryModule) |
[31 (XQuery)] | MainModule |
::= | Prolog QueryBody |
[32 (XQuery)] | LibraryModule |
::= | ModuleDecl Prolog |
[34 (XQuery)] | Prolog |
::= | ((NamespaceDecl |
[39 (XQuery)] | QueryBody |
::= | Expr |
Function declarations are globally scoped, that is, the use of a function name in a function call may precede declaration of the function. Variable declarations are lexically scoped, i.e., variable declarations must precede variable uses.
Core Grammar
Core grammar production for the prolog are as follows.
[21 (Core)] | Module |
::= | VersionDecl? (MainModule | LibraryModule) |
[22 (Core)] | MainModule |
::= | Prolog QueryBody |
[23 (Core)] | LibraryModule |
::= | ModuleDecl Prolog |
[25 (Core)] | Prolog |
::= | ((NamespaceDecl |
[30 (Core)] | QueryBody |
::= | Expr |
Notation
The Prolog contains a variety of declarations that specify the initial static and dynamic context of the query. The following formal grammar productions represent any Prolog declaration.
[77 (Formal)] | PrologDeclList |
::= | PrologDecl* |
[78 (Formal)] | PrologDecl |
::= | NamespaceDecl |
The function []PrologDecl takes a prolog declaration and maps it into its equivalent declaration in the Core grammar.
[PrologDecl]PrologDecl |
== |
PrologDecl |
The following auxiliary judgments are applied when processing the declarations in the prolog. The effect of the judgment is to process each prolog declaration in order, constructing a new static environment from the static environment constructed from previous prolog declarations.
The judgment:
holds if the sequence of prolog declarations PrologDeclList yields the static environment statEnv and the normalized sequence of prolog declarations in the Core grammar.
The judgment:
holds if under the static environment statEnv1, the single prolog declaration PrologDecl yields the new static environment statEnv2.
Similary, the judgment:
holds if under the dynamic environment dynEnv, the single prolog declaration PrologDecl yields the new dynamic environment dynEnv1.
andProlog declarations are processed in the order they are encountered. The normalization of a prolog declaration PrologDecl depends on the static context processing of all previous prolog declarations. In turn, static context processing of PrologDecl depends on the normalization of the PrologDecl. For example, because variables are lexically scoped, the normalization and static context processing of a variable declaration depends on the normalization and static context processing of all previous variable declarations. Therefore, the normalization phase and static context processing are interleaved, with normalization preceding static context processing for each prolog declaration.
The following inference rules express this dependency. The first rule specifies that for an empty sequence of prolog declarations, the initial static environment is the default static context.
|
||
|
The next rule interleaves normalization and static context processing. The result of static context processing and normalization is a static context and the normalized prolog declarations.
|
||||
|
||||
|
Static typing of a main module follows context processing and normalization. Context processing and normalization of a main module applies the rules above to the prolog, then using the resulting static environment statEnv, the query body is normalized into a Core expression, and the static typing rules are applied to this Core expression.
|
|||
|
|||
|
The rules for initializing the dynamic context are simpler. The first rule specifies for an empty sequence of prolog declarations, the initial dynamic environment is the default dynamic context.
The second rule simply computes the dynamic environment by processing the prolog declarations in order.
|
|||
|
|||
|
Dynamic evalution of a main module applies the rules for dynamic-context processing to the prolog declarations, then using the resulting dynamic environment dynEnv, the dynamic evaluation rules are applied to the normalized query body.
[36 (XQuery)] | VersionDecl |
::= | "xquery" "version" StringLiteral Separator |
Core Grammar
The corresponding core grammar production is.
[27 (Core)] | VersionDecl |
::= | "xquery" "version" StringLiteral Separator |
A version declaration specifies the applicable XQuery syntax and semantics for a module. An XQuery implementation must raise a static error when processing a query labeled with a version that the implementation does not support. The Formal Semantics is specified for XQuery 1.0 and does not specify this static error formally.
[33 (XQuery)] | ModuleDecl |
::= | "module" "namespace" NCName "=" StringLiteral Separator |
Core Grammar
The corresponding core grammar production is.
[24 (Core)] | ModuleDecl |
::= | "module" "namespace" NCName "="
StringLiteral Separator |
Editorial note | |
Module declarations only occur in library modules, which are not specified formally in this draft. |
[117 (XQuery)] | BaseURIDecl |
::= | "declare" "base-uri" StringLiteral |
Core Grammar
The corresponding core grammar production is.
[78 (Core)] | BaseURIDecl |
::= | "declare" "base-uri" StringLiteral |
The base URI declaration is already in the Core grammar, so normalization rule is necessary.
A base URI declaration specifies the base URI property of the static context, which is used when resolving relative URIs within a module. A static error is raised if more than one base URI declaration is declared in a query prolog.
The base URI declaration does not effect the dynamic context.
[118 (XQuery)] | NamespaceDecl |
::= | "declare" "namespace" NCName "=" StringLiteral |
Core Grammar
The corresponding core grammar production is.
[79 (Core)] | NamespaceDecl |
::= | "declare" "namespace" NCName "="
StringLiteral |
The namespace declaration is in the Core grammar, so no normalization rules are necessary.
A namespace declaration adds a new (prefix,uri) binding in the namespace component of the static environment. All namespace declarations in the prolog are passive declarations. Namespace declaration attributes of element constructors are active declarations.
statEnv1 = statEnv + namespace(NCName => (passive, String)) | ||
|
||
|
The namespace declaration does not effect the dynamic context.
[119 (XQuery)] | DefaultNamespaceDecl |
::= | (("declare" "default" "element") | ("declare" "default"
"function")) "namespace" StringLiteral |
Core Grammar
The corresponding core grammar production is.
[80 (Core)] | DefaultNamespaceDecl |
::= | (("declare" "default" "element") | ("declare" "default"
"function")) "namespace" StringLiteral |
The default namespace declarations are in the Core grammar, so no normalization rules are necessary.
A default element namespace declaration changes the default element namespace prefix binding in the namespace component of the static environment.
statEnv1 = statEnv + default_elem_namespace( String) | ||
|
||
|
A default function namespace declaration changes the default function namespace prefix binding in the namespace component of the static environment.
statEnv1 = statEnv + default_function_namespace( String) | ||
|
||
|
Note that multiple declarations of the same namespace prefix in the Prolog result in a static error. However, a declaration of a namespace in the Prolog can override a prefix that has been predeclared in the static context.
The default namespace declarations do not effect the dynamic context.
[146 (XQuery)] | SchemaImport |
::= | "import" "schema" SchemaPrefix? StringLiteral ("at" StringLiteral)? |
[147 (XQuery)] | SchemaPrefix |
::= | ("namespace" NCName "=") | ("default" "element"
"namespace") |
The semantics of Schema Import is described in terms of the [XPath/XQuery] type system. The process of converting an XML Schema into a sequence of type declarations is described in Section [8 Importing Schemas]. This section describes how the resulting sequence of type declarations is added into the static context when the Prolog is processed.
Core Grammar
The corresponding core grammar productions are.
[107 (Core)] | SchemaImport |
::= | "import" "schema" SchemaPrefix? StringLiteral ("at" StringLiteral)? |
[108 (Core)] | SchemaPrefix |
::= | ("namespace" NCName "=") |
("default" "element" "namespace") |
Schema import declarations are in the Core grammar, so no normalization rules are necessary.
Notation
The following auxiliary judgments are used when processing schema imports.
The judgment:
holds if under the static environment statEnv1, the sequence of type definitions Definition* yields the new static environment statEnv2.
The judgment:
holds if under the static environment statEnv1, the single definition Definition yields the new static environment statEnv2.
A schema imported into a query is first mapped into the [XPath/XQuery] type system, which yields a sequence of XQuery type definitions. The rules for mapping the imported schema begins in [8.2 Schemas as a whole]. Each type definition in an imported schema is then added to the static environment.
|
|||
|
|||
|
The schema import declaration may also assign an element/type namespace prefix to the URI of the imported schema, or assign the default element namespace to the URI of the imported schema.
|
||||
|
||||
|
|
||||
|
||||
|
An empty sequence of type definitions yields the input environment.
Each type definition is added into the static environment.
|
|||
|
|||
|
Each type, element, or attribute declaration is added respectively to the type, element and attribute declarations components of the static environment.
|
|||
|
|||
|
|
|||
|
|||
|
|
|||
|
|||
|
Note that it is a static error to import two schemas that both define the same name in the same symbol space and in the same scope, that is multiple top-level definitions of the same type, element, or attribute name raises a static error. For instance, a query may not import two schemas that include top-level element declarations for two elements with the same expanded name.
The schema import declarations do not effect the dynamic context.
[37 (XQuery)] | ModuleImport |
::= | "import" "module" ("namespace" NCName "=")? StringLiteral ("at" StringLiteral)? |
Core Grammar
The corresponding core grammar production is.
[28 (Core)] | ModuleImport |
::= | "import" "module" ("namespace" NCName "=")? StringLiteral ("at" StringLiteral)? |
Editorial note | |
Issue: The semantics of Module Import is not formally specified. See Issue 555. |
[38 (XQuery)] | VarDecl |
::= | "declare" "variable" "$" VarName
TypeDeclaration? (("{" Expr "}") | "external") |
Core Grammar
The corresponding core grammar production is.
[29 (Core)] | VarDecl |
::= | "declare" "variable" "$" VarName
TypeDeclaration? (("{" Expr "}") | "external") |
Normalization of a variable declaration normalizes the variable and its corresponding expression, if it is present.
[ declare variable VarRef as SequenceType { Expr } ]PrologDecl |
== |
declare variable VarRef as SequenceType [Expr]Expr |
[ declare variable VarRef as SequenceType external ]PrologDecl |
== |
declare variable VarRef as SequenceType external |
A variable declaration updates the variable component of the static context by associating the given variable with a static type. If the variable declaration has a type declaration, the static type of the variable is simply the specified type.
|
|||
|
|||
|
If a variable declaration does not have a type declaration but has an associated expression, then the static type of the variable is the static type of the expression.
|
|||
|
|||
|
If a variable declaration does not have a type declaration and is
external, then the static type of the variable is
xs:anyType
.
|
|||
|
|||
|
To evaluate a variable declaration, its associated expression is evaluated, and the dynamic context is updated with the variable bound to the resulting value.
|
||||
|
||||
|
Dynamic evaluation does not apply to externally defined variables. The evaluation environment must provide the values of external variables in the initial dynamic context (dynEnvDefault).
[145 (XQuery)] | ValidationDecl |
::= | "declare" "validation" SchemaMode |
Core Grammar
The corresponding core grammar production is.
[106 (Core)] | ValidationDecl |
::= | "declare" "validation" SchemaMode |
The validation declaration is in the Core grammar, so no normalization rules are necessary.
The validation declaration updates the validation context of the static context. The validation context is used in the definition of the element constructor and validation expressions.
statEnv1 = statEnv + validationContext( SchemaContext) | ||
|
||
|
The validation declaration does not effect the dynamic context.
[115 (XQuery)] | XMLSpaceDecl |
::= | "declare" "xmlspace" ("preserve" | "strip") |
The xmlspace declaration is not specified formally as the Formal Semantics is defined on the Core language, which is an abstract, not concrete, syntax and is typically the result of parsing phase described in [2.4.1 Processing model].
[116 (XQuery)] | DefaultCollationDecl |
::= | "declare" "default" "collation" StringLiteral |
Core Grammar
The corresponding core grammar production is.
[77 (Core)] | DefaultCollationDecl |
::= | "declare" "default" "collation" StringLiteral |
The default collation declaration is in the Core grammar, so no normalization rules are necessary.
The default collation declaration updates the collations environment of the static context. The collations environment is used by several functions in [Functions and Operators], but is otherwise not used in the Formal Semantics.
statEnv1 = statEnv + collations( String) | ||
|
||
|
The default collation declaration does not effect the dynamic context.
Introduction
User-defined functions specify the name of the function, the names and types of the parameters, and the type of the result. The function body defines how the result of the function is computed from its parameters.
[120 (XQuery)] | FunctionDecl |
::= | "declare" "function" QName "(" ParamList? (")" | (")" "as" SequenceType)) (EnclosedExpr | "external") |
[121 (XQuery)] | ParamList |
::= | Param ("," Param)* |
[122 (XQuery)] | Param |
::= | "$" VarName TypeDeclaration? |
Core Grammar
The corresponding core grammar productions are.
[81 (Core)] | FunctionDecl |
::= | "declare" "function" QName "("
ParamList? (")" | (")" "as" SequenceType)) (EnclosedExpr | "external") |
[82 (Core)] | ParamList |
::= | Param ("," Param)* |
[83 (Core)] | Param |
::= | "$" VarName TypeDeclaration? |
Notation
The following auxiliary mapping rule is used for the normalization of parameters in function definitions: []Param.
Parameters without a declared typed are given the item* sequence type.
The parameter list and body of a user-defined function are all normalized into Core expressions.
[ define function QName ( ParamList? ) as SequenceType EnclosedExpr ]PrologDecl |
== |
define function QName ( [ParamList?]Param ) as SequenceType [EnclosedExpr]Expr |
If the return type of the function is not provided, it is given the
item*
sequence type.
[define function QName ( ParamList? ) EnclosedExpr ]PrologDecl |
== |
define function QName( [ParamList?]Param ) as item* [EnclosedExpr]Expr |
Externally defined functions are normalized similarly.
[ define function QName ( ParamList? ) as SequenceType external] |
== |
define function QName( [ParamList?]Param ) as SequenceType external |
[define function QName ( ParamList? ) external ]PrologDecl |
== |
define function [QName] ( [ParamList?]Param ) as item* external |
Because functions are mutually referential, all function signatures must be defined in the static environment before static type analysis is applied to the function bodies.
|
|||
|
|||
|
The static typing rules for function bodies follows normalization and processing of the static context. The typing rules below constructs a new environment in which each variable has the given expected type, then the static type of the function's body is computed under the new environment. The function body's type must be a subtype of the expected return type. If type checking fails, a static error is raised. Otherwise, static typing of the function has no other effect, as function signatures are already inside the static environment.
|
|||
|
|||
statEnv |- define function QName (VarRef1 as SequenceType1, ··· VarRefn as SequenceTypen) as SequenceTyper { Expr } |
The bodies of external functions are not available and therefore cannot by type checked. To ensure type soundness, the evaluation environment must guarantee that the value returned by the external function matches the expected return type.
|
statEnv |- define function QName ( VarRef1 as SequenceType1 , ··· VarRefn as SequenceTypen ) as SequenceTyper external |
A function declaration updates the dynamic context. The function name with arity N is associated with the given function body. The number of arguments is required, because XQuery permits overloading of function names as long as each function signature has a different number of arguments.
|
||||
|
||||
dynEnv |- define function QName ( VarRef1 as SequenceType1 , ··· VarRefn as SequenceTypen ) as SequenceTyper { Expr } =>dyn dynEnv' |
An external function declaration does not effect the dynamic environment. The evaluation environment must provide the implementation for externally defined functions.
|
dynEnv |- define function QName ( Variable1 as SequenceType1 , ··· Variablen as SequenceTypen ) as SequenceTyper external =>dyn dynEnv |
The dynamic semantics of a function body are applied when the function is called and is described in [4.1.5 Function Calls].
This section defines a number of auxilliary functions required to define the formal semantics of [XPath/XQuery], and gives special static typing rules for some functions in [Functions and Operators].
Introduction
This section gives the definition and semantics of functions that are used in the formal semantics but are not in [Functions and Operators]. Their dynamic semantics are defined in the same informal style as in the [Functions and Operators] document. The static semantics of some formal-semantics functions require custom typing rules.
minus
,
fs:plus
, fs:times
,
fs:idiv
, fs:div
, and
fs:mod
These functions are overloaded placeholders with signatures and semantics as specified in the table in Appendix [B.2 Mapping of Overloaded Internal Functions].
eq
, fs:ne
,
fs:lt
, fs:le
,
fs:gt
, and fs:ge
These functions are overloaded placeholders with signatures and semantics as specified in the table of Appendix [B.2 Mapping of Overloaded Internal Functions].
convert-operand
functionfs:convert-operand
($actual
as
item *
, $expected
as
xdt:anyAtomicType
) as
xdt:anyAtomicType
?
The formal-semantics function fs:convert-operand
converts
the operands of arithmetic and comparison operators as follows:
If $actual
is the empty sequence, returns the empty
sequence.
If statEnv.xpath1.0_compatibility is false and
$actual
is of type xdt:untypedAtomic
, then
if $expected
is of type xdt:untypedAtomic
, returns
$actual
cast to xs:string;
if $expected
is of numeric type, returns $actual
cast to xs:double
otherwise returns $actual
cast to the type of
$expected
.
If statEnv.xpath1.0_compatibility is true, and
$actual
is not of numeric type, then returns the value of the
expression
fn:number
(fn:subsequence
($actual,1,1
)).
Editorial note | |
New Issue: According to Michael Kay this converts too much (for example dates) to numbers in backwards compatibility mode. One fix would be to make the test inclusive yet sufficiently general to capture all the actual 1.0 cases, i.e., say something like "$actual : (node* | xdt:untypedAtomic* | xs:string*)". [Kris] |
Otherwise, the operand is returned unchanged.
No conversion is needed for numeric (or empty) operands.
|
|||
|
|||
statEnv |- fs:convert-operand (Expr
1, Expr2) : Type1 |
In XPath 1.0 backwards compatibility mode, if the static type of the
operand is not a single numeric, the static type of the expression is the
static type of the expression that extracts the first element of the operand
and then converts that item to a double (by applying
fn:number
).
|
|||||
|
|||||
statEnv |- fs:convert-operand (Expr
1, Expr2) : Type |
Editorial note | |
This rule does not seem quite right...to discuss. [Kris] |
Pairs of untyped atomic operands are converted to strings.
|
||||
|
||||
statEnv |- fs:convert-operand (Expr
1, Expr2) : xs:string · quantifier (Type1) |
When an untyped operand is paired with a numeric operand, it is converted to xs:double.
|
|||||
|
|||||
statEnv |- fs:convert-operand (Expr
1, Expr2) : xs:double · quantifier (Type1) |
Finally, an untyped atomic operand not dealt with by the above rules is converted to the type of the other operand.
|
|||||
|
|||||
statEnv |- fs:convert-operand (Expr
1, Expr2) : Type2 · quantifier(Type1) |
convert-simple-operand
functionfs:convert-simple-operand
($actual
as
item *
, $expected
as
xdt:anyAtomicType
) as
xdt:anyAtomicTypeAtomic
*
The formal-semantics function fs:convert-simple-operand
is used to convert the value of the $actual
argument such that
it matches the type of the $expected
argument (or matches a
sequence of that type).
The dynamic semantics of this function are as follows:
If statEnv.xpath1.0_compatibility is true and
the $expected
argument is of type xs:string
(or a
type derived from xs:string
) but the $actual
argument is not of type xs:string (or a type derived from xs:string), then
returns the value of fn:string
($actual)
.
If statEnv.xpath1.0_compatibility is true and
the $expected
argument is of numeric type (or a type derived
from a numeric type) but the $actual
argument is not of numeric
type, then returns the value of fn:number($actual)
.
If statEnv.xpath1.0_compatibility is false, then
for each item in $actual
argument that is of type
xdt:untypedAtomic, that item is cast to the type of the
$expected
argument, and the resulting sequence is returned.
Otherwise, return the $actual
argument unchanged.
The following static semantics rules correspond to the dynamic semantics rules given above.
|
|||||
|
|||||
statEnv |- fs:convert-simple-operand (
Expr1, Expr2) :
xs:string |
|
|||||
|
|||||
statEnv |- fs:convert-simple-operand (
Expr1, Expr2) :
xs:double |
|
||||||
|
||||||
statEnv |- fs:convert-simple-operand (
Expr1, Expr2) :
Type3 ·
quantifier(Type1) |
distinct-doc-order
functionfs:distinct-doc-order
($nodes
as
node *
) as
node *
The fs:distinct-doc-order
function sorts its input sequence of nodes by document order and removes
duplicates.
The function expects a sequence of nodes as input and returns a sequence of prime types, which are defined in [7.5 Judgments for sequences of item types].
|
||
|
||
|
item-sequence-to-node-sequence
functionfs:item-sequence-to-node-sequence
($items
as
item *
) as
node *
Th fs:item-sequence-to-node-sequence
function converts a sequence of item values to nodes by applying the
normative rules in Section 3.7.3.1 Computed
Element ConstructorsXQ.
Notation
The type function items_to_nodes(Type) converts all atomic types in a type to the text type. The items_to_nodes function is defined by induction as follows.
items_to_nodes(AtomicTypeName ) | = | text |
items_to_nodes(NodeType) | = | NodeType |
items_to_nodes(Type1 , Type2) | = | items_to_nodes(Type1) , items_to_nodes(Type2) |
items_to_nodes(Type1 & Type2) | = | items_to_nodes(Type1) & items_to_nodes(Type2) |
items_to_nodes(Type1 | Type2) | = | items_to_nodes(Type1) | items_to_nodes(Type2) |
items_to_nodes(Type ?) | = | items_to_nodes(Type) ? |
items_to_nodes(Type *) | = | items_to_nodes(Type) * |
items_to_nodes(Type +) | = | items_to_nodes(Type) + |
|
||
|
||
|
item-sequence-to-untypedAtomic
functionfs:item-sequence-to-untypedAtomic
($items
as
item *
) as
xdt:untypedAtomic
The fs:item-sequence-to-untypedAtomic
function converts a sequence of item values to a string of type xdt:untypedAtomic
by applying the
normative rules in Section 3.7.3.2 Computed
Attribute ConstructorsXQ.
In summary, each node is replaced by its string value. For each adjacent sequence of one or more atomic values returned by an enclosed expression, a string is constructed, containing the canonical lexical representation of all the atomic values, with a single blank character inserted between adjacent values.
There are no special static typing rules for this function.
Introduction
This section gives static typing rules for functions in [Functions and Operators] for which the standard typing rule based on the function's signature can be made more precise. All functions that are not mentioned here have static semantics as descrbed by the generic static rule of section [4.1.5 Function Calls]. The rules in this section always give more precise type information than the generic rule.
fn:abs
,
fn:ceiling
, fn:floor
, fn:round
, and
fn:round-half-to-even
functionsThe typing rules for the fn:abs
, fn:ceiling
,
fn:floor
, fn:round
, and
fn:round-half-to-even
functions promote their input type to the
(least) base primitive numeric type from which the input type is derived.
Parameters of type xdt:untypedAtomic
are always
promoted to xs:double
. Instead of writing a separate judgment
for each function, we write one rule with function variable F, which
is one of the fn:abs
, fn:ceiling
,
fn:floor
, fn:round
, or
fn:round-half-to-even
functions.
|
|||||
|
|||||
statEnv |- F (Expr) : Type1 · quantifier(Type) |
fn:collection
and fn:doc
functionsThe fn:collection
function as described in the [Functions and Operators] document, takes a
string-valued expression, which denotes a URI, and returns a value.
If the argument to fn:collection
is a literal String
expression and that string is defined in statEnv.collectionType, then the result type is the type corresponding to
the String in statEnv.collectionType.
Otherwise, if the argument is not a literal string or is a string but not defined in statEnv.collectionType, then we don't know anything about the URI, and the static type is node*:
statEnv |- statEnv.collectionType(String) not defined |
|
statEnv |-
fn:collection (Expr) : node * |
The static type of the fn:doc
function has similar static
rules, but, in addition, requires that the static type of the URI be any
document:
|
||||
|
||||
statEnv |-
fn:doc (String) : Type |
Otherwise, if the argument is not a literal string or is not defined in the domain of statEnv.docType, then we don't know anything about the URI, and the static type is document:
fn:data
functionIntroduction
The fn:data
function converts a sequence of items to a
sequence of atomic values.
Notation
Infering the type for the fn:data
function is
done by applying the data on
auxiliary judgment, using the same approach as for the XPath steps.
The general rule for fn:data
is to apply the filter data on to the prime type of its
argument type, then apply the quantifier to the result:
|
|||
|
|||
statEnv |-
fn:data (Expr) : Type1 · quantifier(Type) |
When applied to none, data on yields none.
When applied to empty, data on yields empty.
When applied to the union of two types, data on is applied to each of the two types. The resulting types are combined into a factored type. This rule is necessary because data on may return a sequence of atomic types.
|
|||
|
|||
statEnv |- data on (Type1|Type2) : prime(Type1'|Type2') · quantifier(Type1'|Type2') |
When applied to an atomic type, data on simply returns the atomic type:
When applied to comment or processing instruction node types, data on returns
xs:string
When applied to text, and document node types, data on returns xdt:untypedAtomic
When applied to element node types with type annotation xdt:untypedAny
, the data on filter returns xdt:untypedAtomic
.
|
||
|
||
statEnv |- data on ElementType : xdt:untypedAtomic |
When applied to an attribute node type, the data on filter returns the attribute's simple type.
|
|||
|
|||
statEnv |- data on AttributeType : Type |
When applied to an element type whose type annotation denotes a simple type or a complex type of simple content, data on returns the element's simple type.
|
||||
|
||||
statEnv |- data on ElementType : Type1 |
When applied to an element type whose type annotation denotes a complex
type of mixed content, the data
on filter returns xdt:untypedAtomic
.
|
|||
|
|||
statEnv |- data on ElementType : xdt:untypedAtomic |
The data on filter is not defined on any element type whose type annotation denotes a complex type of complex content and therefore apply data on to such a node raises a static error.
Example
Consider the following variables and its corresponding static type.
$x : (element price { attribute currency { xs:string }, xs:decimal } | element price_code { xs:integer })
Applying the fn:data
function on that variable results in the
following type.
fn:data($x) : (xs:decimal | xs:integer)
Because the input type is a choice, applying the data on filter results in a choice of simple types for
the output of the fn:data
function.
fn:distinct-values
functionThe fn:distinct-values
function expects a sequence of atomic
values as input and returns a sequence of prime types, which are defined in
[7.5 Judgments for sequences of item
types].
fn:min
, fn:max
, fn:avg
, and
fn:sum
functionsThe dynamic evaluation rules for the fn:min
,
fn:max
, and fn:avg
convert any item of type
xdt:untypedAtomic
in the
input sequence to xs:double
, then attempt to promote all values
in the input sequence to values that are comparable. The static typing rules
reflect the dynamic rules.
Semantics
The type function convert_untypedAtomic
takes a prime type and converts all occurrences of the type xdt:untypedAtomic
to a target type.
It is defined by induction as follows.
convert_untypedAtomic(xdt:untypedAtomic , Type) |
= | Type |
convert_untypedAtomic(ItemType, Type) | = | ItemType (ItemType is not xdt:untypedAtomic ) |
convert_untypedAtomic(empty , Type) |
= | empty |
convert_untypedAtomic(none , Type) |
= | none |
convert_untypedAtomic(Type1 | Type2, Type) | = | convert_untypedAtomic(Type1, Type) | convert_untypedAtomic(Type2, Type) |
The function aggregate_quantifier converts the input type quantifier zero-or-more or zero-or-one to the result type quantifier zero-or-one, and converts the input type quantifier one or one-or-more, to the result type quantifier one.
aggregate_quantifier(? ) |
= | ? |
aggregate_quantifier(* ) |
= | ? |
aggregate_quantifier(1 ) |
= | 1 |
aggregate_quantifier(+ ) |
= | 1 |
Now we can define the static typing rules for the aggregate functions.
First, the input type is converted to a prime type. Second, the type function
convert_untypedAtomic is applied to the prime type,
yielding a new prime type, in which occurrences of xdt:untypedAtomic
are converted to
xs:double
. Third, the judgment can be promoted to is applied to the new prime
type and a target type T. The result type is T combined
with the aggregate quantifier of the input type.
Instead of writing a separate judgment for each function and target type,
we write one rule with function variable F and target variable
T. When the function variable F is fn:min
or
fn:max
, the target type T must be one of
xs:string
, xs:integer
, xs:decimal
,
xs:float
, xs:double
, xdt:yearMonthDuration
, or xdt:dayTimeDuration
. When the function
variable F is fn:avg
, the target type T must
be one of xs:decimal
, xs:float
,
xs:double
, xdt:yearMonthDuration
, or xdt:dayTimeDuration
.
|
||||
|
||||
statEnv |- F (Expr) : Type1 · aggregate_quantifier(quantifier(Type)) |
The static typing rules for fn:sum
are similar to those for
the aggregate functions above, in which T must be one of
xs:integer
, xs:decimal
, xs:float
,
xs:double
, xdt:yearMonthDuration
, or xdt:dayTimeDuration
.
The fn:sum
function has two forms. The first form takes two
arguments: The first argument is the input sequence and the second argument
is the value that should be returned if input sequence is empty. In this
case, the result type is the union of the target type T and the type
of the second argument.
|
|||||
|
|||||
statEnv |- fn:sum
(Expr1,
Expr2) : T | Type2 |
The second form of fn:sum
takes one argument, with an
implicit second argument of the integer value 0. In this case, the result
type is the union of target type T and xs:integer
.
|
||||
|
||||
statEnv |- fn:sum
(Expr1) : T |
xs:integer |
fn:remove
functionThe typing rules for the fn:remove
function returns the
factored type of the function's input sequence.
fn:reverse
functionThe typing rules for the fn:reverse
function returns the
factored type of the function's input sequence.
fn:subsequence
functionThe fn:subsequence
function has special typing rules when its
second argument is the numeric literal value 1 or the built-in variable
$
fs:last
.
These rules provide better typing for path expressions such as
Expr[1] and Expr[fn:last
()].
If the type of the input expression has exactly one or one-or-more items,
then fn:subsequence
, when selecting the first item, has the
prime typenn input type.
|
||
|
||
|
The same rule applies when the last item in the input sequence is selected.
|
||
|
||
|
If the type of the input expression has zero or more items, then
fn:subsequence
, when selecting the first item, has zero-or-one
of the prime type of the input type.
|
||
|
||
|
The same rule applies when the last item in the input sequence is selected.
|
||
|
||
|
The last rule applies to all other applications of the
fn:subsequence
function.
op:union
,
op:intersect
, and op:except
operatorsThe static semantics for op:union
uses the auxiliary type
functions prime(Type) and quantifier(Type); which
are defined in [7.5 Judgments for sequences of item
types]. The type of each argument is determined, and then prime(.) and quantifier(.) are applied to the sequence type (Type1, Type2).
|
||
|
||
|
The static semantics of op:intersect
is analogous to that for
op:union
. Because an intersection may be empty, the result type
is optional.
|
|||
|
|||
|
The static semantics of op:except
follows. The type of the
second argument is ignored as it does not contribute to the result type. As
with op:intersect
, the result of op:except
may be
the empty sequence.
This section defines auxiliary judgments used in the course of defining the formal semantics.
Notation
The judgment
holds when the possibly optional schema mode resolves to the given schema mode.
Semantics
An absent schema mode resolves to the schema mode in statEnv.validationMode.
statEnv.validationMode = SchemaMode |
|
statEnv |- is SchemaMode |
A present schema mode resolves to itself.
Notation
The judgment
holds when the possibly optional schema context path resolves to the given type name.
Semantics
An absent schema context path resolves to the schema context in statEnv.validationContext.
A schema context path consisting only of a type name resolves to the type name.
An element name is resolved to the type name of the global element with the given element name. The rest of the context is resolved recursively in the context of the element's type.
|
|||
|
|||
statEnv |- ElementName/(SchemaContextStep*) is TypeName1 |
A type name followed by an element name is resolved to the type name of the element declaration that is associated with the given element name in the type associated with the first type name. The element may be either a locally or globally declared element. The rest of the schema path context is resolved recursively in the context of the element's type.
|
|||
|
|||
statEnv |- type(TypeName)/ElementName(/SchemaContextStep*) is TypeName2 |
Introduction
This section defined several auxiliary judgments to access components of the [XPath/XQuery] type system. The first two judgements (derives from and substitutes for) are used to access the type and element name hierarchies from the schema. The other judgments (name lookup, type lookup, extended by, adjusts to and expands to) are used to lookup the meaning of element or attribute types from the schema. These judgments are used in many expressions, notably in the specification of type matching (See [7.4 Judgments for type matching]), validation (See [7.7 Judgments for the validate expression]), and the static semantics for step expressions (See [7.3 Judgments for step expressions and filtering]).
Notation
The judgment
holds when the first type name derives from the second type name.
Example
For example, assuming the extended XML Schema given in section [2.3.5 Example of a complete Schema], then the following judgments hold.
USAddress derives from xs:anyType NYCAddress derives from USAddress NYCAddress derives from xs:anyType xsd:positiveInteger derives from xsd:integer xsd:integer derives from xs:anySimpleType [Anon3] derives from xsd:positiveInteger [Anon3] derives from xsd:integer [Anon3] derives from xs:anySimpleType [Anon3] derives from xs:anyType
Note
Derivation is a partial order. It is reflexive and transitive by the definition below. It is asymmetric because no cycles are allowed in derivation by restriction or extension.
Semantics
This judgment is specified by the following rules.
Some rules have hypotheses that simply list a type, element, or attribute declaration.
Every type name derives from itself.
Every type name derives from the type it is declared to derive from by restriction or extension.
|
|||
|
|||
statEnv |- TypeName derives from BaseTypeName |
|
|||
|
|||
statEnv |- TypeName derives from BaseTypeName |
Derivation is transitive.
The substitutes judgment is used to know whether an element name is in the substitution group of another element name.
Notation
The judgment
holds when the first element name substitutes for the second element name.
Example
For example, assuming the extended XML Schema given in section [2.3.5 Example of a complete Schema], then the following judgments hold.
usaddress substitutes for address nyaddress substitutes for usaddress nyaddress substitutes for address
Note
Substitution is a partial order. It is reflexive and transitive by the definition below. It is asymmetric because no cycles are allowed in substitution groups.
Semantics
The substitutes judgment for element names is specified by the following rules.
Every element name substitutes for itself.
Every element name substitutes for the element it is declared to substitute for.
statEnv.elemDecl(ElementName) => define element ElementName substitutes for BaseElementName Nillable? TypeReference |
|
statEnv |- ElementName substitutes for BaseElementName |
Substitution is transitive.
The name lookup judgments are used to obtain the appropriate type annotations from any given attribute or element type, given an element or attribute name.
Notation
The judgment
holds when matching an element with the given element name against the given element type requires that the element be nillable as indicated and matches the type reference.
Example
For example, assuming the extended XML Schema given in section [2.3.5 Example of a complete Schema], then the following judgments hold.
comment name lookup element comment yields of type xsd:string size name lookup element size nillable of type xs:integer yields nillable of type xsd:string apt name lookup element apt yields of type [Anon3] nycaddress name lookup element address yields of type NYCAddress
Note that when the element name is in a substitution group, the name
lookup returns the type name corresponding to the original element name (here
the type NYCAddress
for the element nycaddress
,
instead of Address
for the element address
).
Semantics
This judgment is specified by the following rules.
If the element type is a reference to a global element, then name lookup yields the type reference in the element declaration for the given element name. The given element name must be in the substitution group of the global element.
|
|||
|
|||
statEnv |- ElementName1 name lookup element ElementName2 yields Nillable? TypeReference |
If the given element name matches the element name in the element type, and the element type contains a type reference, then name lookup yields that type reference.
|
statEnv |- ElementName name lookup element ElementName Nillable? TypeReference yields Nillable? TypeReference |
If the element type has no element name but contains a type reference, then name lookup yields the type reference.
|
statEnv |- ElementName name lookup element TypeReference yields TypeReference |
If the element type has no element name and no type reference, then name
lookup yields xs:anyType
.
|
statEnv |- ElementName name lookup element yields of type xs:anyType |
Notation
The judgment
holds when matching an attribute with the given attribute name against the given attribute type matches the type reference.
Example
For example, assuming the extended XML Schema given in section [2.3.5 Example of a complete Schema], then the following judgments hold.
orderDate name lookup attribute orderDate of type xsd:date yields of type xsd:date? orderDate name lookup attribute of type xsd:date yields of type xsd:date?
Semantics
This judgment is specified by the following rules.
If the attribute type is a reference to a global attribute, then name lookup yields the type reference in the attribute declaration for the given attribute name.
|
||
|
||
statEnv |- AttributeName name lookup attribute AttributeName yields TypeReference |
If the given attribute name matches the attribute name in the attribute type, and the attribute type contains a type reference, then name lookup yields that type reference.
|
statEnv |- AttributeName name lookup attribute AttributeName TypeReference yields TypeReference |
If the attribute type has no attribute name but contains a type reference, then name lookup yields the type reference.
|
statEnv |- AttributeName name lookup attribute TypeReference yields TypeReference |
If the attribute type has no attribute name and no type reference, then
name lookup yields xs:anySimpleType
.
|
statEnv |- AttributeName name lookup attribute
yields of type
xs:anySimpleType |
The type lookup judgments are used to obtain the appropriate type reference for an attribute or element within a type context.
Notation
The judgment
holds when the element type accessed in the given type context is optionally nillable and has the given type reference.
Semantics
The element type lookup judgments are specified by the following rules.
If the type context is absent, the element type must be a reference to a global element, in which case type lookup yields the type reference in the global element declaration with the given element name.
|
||
|
||
statEnv |- element ElementName in context type lookup Nillable? TypeReference |
If the type context is present, type lookup yields the type reference of the element declaration with the given element name in the type context. This requires expanding the type context to a type expression and then finding the reference to the given element within the type expression. This rule depends on the auxiliary rule referenced in, defined below.
|
|||
|
|||
statEnv |- element ElementName in context TypeName type lookup TypeReference |
In the case of a local element type, type lookup yields the corresponding type reference.
|
statEnv |- element ElementName Nillable? TypeReference in context TypeName? type lookup Nillable? TypeReference |
If the element type has no element name but contains a type reference, then type lookup yields that type reference.
|
statEnv |- element Nillable? TypeReference in context TypeName? type lookup TypeReference |
If the element type has no element name and no type reference, then lookup
yields xs:anyType
.
|
statEnv |- element in context TypeName? type lookup of type xs:anyType |
Notation
The judgment
holds when the element name referenced in the given type expression has the given type reference.
Semantics
Resolving a reference to a global or local element in a type expression is specified by the following rules. Note that the rules for resolving a reference to an element in a complex type expression are non-deterministic. For example, to resolve an element reference in a sequence type (Type1, Type2), it is sufficient to resolve the element reference in either Type1 or Type2. Similar rules apply to type expressions that include the choice and interleave type operators. These non-deterministic rules are sufficient, because in Section 3.8.6 Constraints on Model Group Schema ComponentsXS1, multiple declarations of the same element in a single model group (i.e., type expression) must all have the same top-level type definition.
The following four rules resolve element references in repetition, sequence, choice, and interleave type expressions.
|
||
|
||
statEnv |- element ElementName referenced in Type Occurrence has type TypeReference |
|
||
|
||
statEnv |- element ElementName referenced in Type1, Type2 has type TypeReference |
|
||
|
||
statEnv |- element ElementName referenced in Type1 & Type2 has type TypeReference |
|
||
|
||
statEnv |- element ElementName referenced in Type1 | Type2 has type TypeReference |
A type expression that refers to a global element resolves to the global element's type.
|
||
|
||
statEnv |- element ElementName referenced in element ElementName has type TypeReference |
A type expression that refers to a local element resolves to the local element's type.
|
statEnv |- element ElementName referenced in element ElementName Nillable? TypeReference has type TypeReference |
Notation
The judgment
holds when the attribute type accessed in the given type context up has the given type reference.
Semantics
This judgment is specified by the following rules.
If the type context is absent, the attribute type must be a reference to a global attribute, in which case type lookup yields the type reference in the global attribute declaration with the given attribute name.
|
||
|
||
statEnv |- attribute AttributeName in context type lookup TypeReference |
If the type context is present, type lookup yields the type reference of the attribute declaration with the given attribute name in the type context. This requires expanding the type context to a type expression and then finding the reference to the given attribute within the type expression.
|
|||
|
|||
statEnv |- attribute AttributeName in context TypeName type lookup TypeReference |
In the case of a local attribute type, type lookup yields the corresponding type reference.
|
statEnv |- attribute AttributeName TypeReference in context TypeName? type lookup TypeReference |
If the attribute type has no attribute name but contains a type reference, then type lookup yields the type reference.
|
statEnv |- attribute TypeReference in context TypeName? type lookup TypeReference |
If the attribute type has no attribute name and no type reference, then
type lookup yields xs:anySimpleType
.
|
statEnv |- attribute in context TypeName? type lookup of type
xs:anySimpleType |
Notation
The judgment
holds when the attribute name referenced in the given type expression has the given type reference.
Semantics
The rules for resolving a reference to a global or local attribute in a type expression are similar to those for resolving a reference to a global or local element in a type expression. However, because attribute references may occur only in sequence and interleave types, but not in repetitions or choice types, there are is one referenced in rule for resolving an attribute reference in a sequence type and one for interleave type.
|
||
|
||
statEnv |- attribute AttributeName referenced in Type1, Type2 has type TypeReference |
|
||
|
||
statEnv |- attribute AttributeName referenced in Type1 & Type2 has type TypeReference |
A type expression that refers to a global attribute resolves to the global attribute's type.
|
||
|
||
statEnv |- attribute AttributeName referenced in attribute AttributeName has type TypeReference |
A type expression that refers to a local attribute resolves to the local attribute's type.
|
statEnv |- attribute AttributeName referenced in attribute AttributeName TypeReference has type TypeReference |
Notation
The judgment
holds when the result of extending Type1 by Type2 is Type.
Semantics
This judgment is specified by the following rules.
Any element may optionally include the four built-in attributes xsi:type, xsi:nil, xsi:schemaLocation, or xsi:noNamespaceSchemaLocation. The adjust judgment is used to build a complete type description from a (possibly mixed) type description.
Notation
The judgment
holds when the second type is the same as the first, with the four built-in attributes added, and with text nodes added if the type is mixed.
Semantics
This judgment is specified by the following rules.
If the type is flagged as mixed, then mix the type and extend it by the built-in attributes.
|
||||
|
||||
statEnv |- mixed Type1 adjusts to Type4 |
Otherwise, just extend the type by the built-in attributes.
|
|||
|
|||
statEnv |- Type1 adjusts to Type3 |
The definition of BuiltInAttributes appears in [7.7.1 Builtin attributes].
The expands to judgment is used to compute the union of all types derived by extension from a given type. In the case the type is nillable it also makes sure the content model allows the empty sequence.
Notation
The judgment
holds when expanding the (nillable) type reference results in the given type.
Semantics
This judgment is specified by the following rules.
If the type is nillable, then its expansion is optional.
statEnv |- TypeReference expands to Type |
|
statEnv |- nillable TypeReference expands to Type? |
The type definition for the type reference is contained in its expansion.
|
||||
|
||||
statEnv |- of type TypeName expands to Type2 |
If there is a type deriving from the given type, then it is contained in its expansion.
|
|||
|
|||
statEnv |- of type TypeName1 expands to (Type1 | Type2) |
Editorial note | |
MF (13 Mar 2003) : I'm not sure about the expansion rule above. Jerome, please check. |
Note
Note that the expand judgment can result in many possible types, depending on how much expansion is performed. We are only interested in the biggest expansion, which is defined by the following auxiliary judgment.
Notation
The judgment
holds when the (nillable) type reference fully expands to the given type.
Semantics
Nillable? TypeReference fully expands to Type1 holds if and only if, Nillable? TypeReference expands to Type1 and there exists no type Type2 such that Nillable? TypeReference expands to Type2 and Type1 <: Type2.
Introduction
Step expressions are one of the elementary operations in [XPath/XQuery]. Steps select nodes reachable from the root of an XML tree. Defining the semantics of step expressions requires a detailed analysis of all the possible cases of axis and node tests.
This section introduces auxiliary judgments used to define the semantics of step expressions. The first judgment captures the notion of principal node kind in XPath. The second and third judgments describe how an axis or a node test filters a type and is used to define the static semantics of Step Expressions. The fourth judgment describes how an axis or a node test filters a sequence of nodes and is used to define the dynamic semantics of Step Expressions. A fifth judgment that accesses the value of an attribute is introduced for convenience.
Notation
The following auxiliary grammar production describe principal node types (See [XML Path Language (XPath) 2.0]).
[70 (Formal)] | PrincipalNodeKind |
::= | "element" | "attribute" | "namespace" |
Notation
The judgment
holds when PrincipalNodeKind is the principal node kind for Axis.
Example
For example, the following judgments hold.
child:: principal element descendant:: principal element preceding:: principal element attribute:: principal attribute namespace:: principal namespace
Semantics
This judgment is specified by the following rules.
The principal node type for the attribute axis is attribute.
|
attribute:: principal attribute |
The principal node type for the namespace axis is namespace.
|
namespace:: principal namespace |
The principal node type for all other axis is element.
Axis != attribute::
Axis != namespace:: |
|
Axis principal element |
Notation
The following judgment
holds when applying the axis Axis on type Type1 yields the type Type2.
Example
For example, assuming the extended XML Schema given in section [2.3.5 Example of a complete Schema], then the following judgments hold.
axis child:: of element of type xsd:string : text axis child:: of element items of type Items : element item of type [Anon1]* axis child:: of element purchaseOrder : element shipTo of type USAddress, element billTo of type USAddress, element ipo:comment?, element items of type Items axis attribute:: of element of type xsd:string : empty
Semantics
This judgment is specified by the following rules.
The first set of rules are used to process the content each individual item type in the input content model. Note that those rules always apply to prime types (see [7.5 Judgments for sequences of item types]).
|
|||
|
|||
statEnv |- axis Axis of Type1|Type2 : Type3|Type4 |
The following rules specifies how the judgment applies to each Axis.
Applying the self axis to a node type results in the same node type.
Editorial note | |
Mary: Michael R points out that if the simple content is empty then no text node will be constructed, therefore the type is text? Also, not clear whether the PIs and comments should be injected into the result type here or not. |
In the case of an element type, the static type of the child axis is
obtained by type lookup and expansion of the resulting type, then removal of
the types which are attributes. If the extracted type is empty
,
the child axis has type empty
.
|
||||
|
||||
statEnv |- axis child::
of ElementType : empty |
If the extracted type is a simple type, the child axis has type text.
|
||||
|
||||
statEnv |- axis child::
of ElementType : text |
In the case of an element type with complex content type, the static type of the child axis is obtained by type lookup and expansion of the resulting type, then removal of the types which are attributes.
|
||||
|
||||
statEnv |- axis child::
of ElementType : prime(ChildType) · quantifier(ChildType) |
In the case of an attribute type, the static type of the child axis is empty.
|
statEnv |- axis child::
of AttributeType : empty |
In the case of a text node type, the static type of the child axis is empty.
In the case of a comment node type, the static type of the child axis is empty.
In the case of a processing-instruction node type, the static type of the child axis is empty.
In case of a document node type, the static type of the child axis is the factored type of the document node type;.