The presentation of this document has been augmented to identify changes from a previous version. Three kinds of changes are highlighted: new, added text, changed text, and deleted text.
This document is also available in these non-normative formats: XML and Revisions to LC Draft.
Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document defines formally the semantics of XQuery 1.0 [XQuery 1.0: A Query Language for XML] XPath 2.0 [XML Path Language (XPath) 2.0].
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a CandidateRecommendationasdescribed in the ProcessDocument.This document is intendedfor review by W3C membersand otherinterested parties. The publication ofthis document constitutes a callfor implementationsof this specification.This specification will remain aCandidate Recommendation until at least 2006-02-28.
Thisdocument has been jointly produced by the XML Query Working Group (WG) and the XSL Working Group, bothof whichare partof the XMLActivity.Publication asaCandidateRecommendationdoesnotimplyendorsement by the W3CMembership. This is adraft document andmay be updated, replaced or obsoleted by other documentsat any time. It is inappropriate to cite thisdocument as other than work in progress.
Thisdraft includescorrections and changes based on publiccommentsrecorded in the W3CpublicBugzilla repository (http://www.w3.org/Bugs/Public/) usedfor Last Callissues tracking. A list of substantivechanges since the Last CallWorking Draft of 04April 2005 can be found in [F Revision Log].
Comments on this document should be made in W3C's publicBugzilla system(instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery mailing list, public-qt-comments@w3.org. Itwill be very helpful if you include the string [FS] in the subject line of your comment, whether made in Bugzilla or in email. Each Bugzilla entry and email message should contain only one comment. Archives of thecomments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/ .
The XML Query and XPath Test Suite isunder development. Implementors are encouraged to run thistest suite andreport theirresults.
The patent policy for this document is the 5 February 2004 W3C Patent Policy. Patent disclosures relevant to this specification may be found on the XML Query Working Group's patent disclosure page [and the XSL Working Group's patent disclosure page]. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose this information in accordance with section 6 of the W3C Patent Policy.
1 Introduction
1.1 Normative and Informative
Sections
2 Preliminaries
2.1 Introduction to the Formal Semantics
2.1.1 Notations from grammar productions
2.1.2 Notations for judgments
2.1.3 Notations for environments
2.1.4 Notations for inference rulestogether
2.1.5 Putting it together
2.2 URIs, Namespaces, and Prefixes
2.3 XML Values
2.3.1 Formal values
2.3.2 Examples of values
2.4 The [XPath/XQuery] Type System
2.4.1 XML Schema and the [XPath/XQuery] Type System
2.4.2 Item types
2.4.3 Content models
2.4.4 Top level definitions
2.4.5 Example of a complete Schema
2.5 Functions and operators
3 Basics
3.1 Expression Context
3.1.1 Static Context
3.1.1.1 Resolving QNames to Expanded QNames
3.1.2 Dynamic Context
3.2 Processing Model
3.2.1 Processing model
3.2.2 Normalization judgment
3.2.3 Static typing judgment
3.2.4 Dynamic evaluation judgment
3.3 Error Handling
3.4 Concepts
3.4.1 Document Order
3.4.2 Atomization
3.4.3 Effective Boolean Value
3.4.4 Input Sources
3.4.5 URI Literals
3.5 Types
3.5.1 Predefined Schema Types
3.5.2 Typed Value and String Value
3.5.3 SequenceType Syntax
3.5.4 SequenceType Matching
3.6 Comments
3.7 XML-defined Terminals
4 Expressions
4.1 Primary Expressions
4.1.1 Literals
4.1.2 Variable References
4.1.3 Parenthesized Expressions
4.1.4 Context Item Expression
4.1.5 Function Calls
4.2 Path Expressions
4.2.1 Steps
4.2.1.1 Axes
4.2.1.2 Node Tests
4.2.2 Predicates
4.2.3 Unabbreviated Syntax
4.2.4 Abbreviated Syntax
4.3 Sequence Expressions
4.3.1 Constructing Sequences
4.3.2 Filter Expressions
4.3.3 Combining Node Sequences
4.4 Arithmetic Expressions
4.5 Comparison Expressions
4.5.1 Value Comparisons
4.5.2 General Comparisons
4.5.3 Node Comparisons
4.6 Logical Expressions
4.7 Constructors
4.7.1 Direct Element Constructors
4.7.1.1 Attributes
4.7.1.2 Namespace Declaration Attributes
4.7.1.3 Content
4.7.1.4 Whitespace in Element Content
4.7.2 Other Direct Constructors
4.7.3 Computed Constructors
4.7.3.1 Computed Element Constructors
4.7.3.2 Computed Attribute Constructors
4.7.3.3 Document Node Constructors
4.7.3.4 Text Node Constructors
4.7.3.5 Computed Processing Instruction Constructors
4.7.3.6 Computed Comment Constructors
4.7.4 In-scope Namespaces of a Constructed Element
4.8 [For/FLWOR] Expressions
4.8.1 FLWOR expressions
4.8.2 For expression
4.8.3 Let Expression
4.8.4 Order By and Return Clauses
4.9 Ordered and Unordered Expressions
4.10 Conditional Expressions
4.11 Quantified Expressions
4.12 Expressions on SequenceTypes
4.12.1 Instance Of
4.12.2 Typeswitch
4.12.3 Cast
4.12.4 Castable
4.12.5 Constructor Functions
4.12.6 Treat
4.13 Validate Expressions
4.13.1 Validating an Element Node
4.13.2 Validating a Document Node
4.14 Extension Expressions
5 Modules and Prologs
5.1 Version Declaration
5.2 Module Declaration
5.3 Boundary-space Declaration
5.4 Default Collation Declaration
5.5 Base URI Declaration
5.6 Construction Declaration
5.7 Ordering Mode Declaration
5.8 Empty Order Declaration
5.9 Copy-Namespaces Declaration
5.10 Schema Import
5.11 Module Import
5.12 Namespace Declaration
5.13 Default Namespace Declaration
5.14 Variable Declaration
5.15 Function Declaration
5.16 Option Declaration
6 Conformance
6.1 Static Typing Feature
6.1.1 Static Typing Extensions
7 Additional Semantics of Functions
7.1 Formal Semantics Functions
7.1.1 The fs:convert-operand function
7.1.2 The fs:convert-simple-operand function
7.1.3 The fs:distinct-doc-order function
7.1.4 The fs:distinct-doc-order-or-atomic-sequence function
7.1.5 The fs:item-sequence-to-node-sequence function
7.1.6 The fs:item-sequence-to-untypedAtomic function
7.1.7 The fs:item-sequence-to-untypedAtomic-PI function
7.1.8 The fs:item-sequence-to-untypedAtomic-text function
7.1.9 The fs:item-sequence-to-untypedAtomic-comment function
7.1.10 The fs:apply-ordering-mode
function
7.1.11 The fs:to function
7.2 Standard functions with specific typing rules
7.2.1 The fn:last context function
7.2.2 The fn:position context function
7.2.3 The fn:abs, fn:ceiling, fn:floor,
fn:round, and fn:round-half-to-even functions
7.2.4 The fn:boolean function
7.2.5 The fn:collection and fn:doc functions
7.2.6 The fn:data function
7.2.7 The fn:distinct-values function
7.2.8 The fn:unordered function
7.2.9 The fn:error function
7.2.10 The fn:min, fn:max, fn:avg, and fn:sum
functions
7.2.11 The fn:remove function
7.2.12 The fn:reverse function
7.2.13 The fn:subsequence function
7.2.14 The op:union, op:intersect, and
op:except operators
7.2.15 The fn:insert-before function
7.2.16 The fn:zero-or-one, fn:one-or-more, and
fn:exactly-one functions
8 Auxiliary Judgments
8.1 Judgments for accessing types
8.1.1 Derives from
8.1.2 Substitutes for
8.1.3 Element and attribute name lookup (Dynamic)
8.1.4 Element and attribute type lookup (Static)
8.1.5 Extension
8.1.6 Mixed content
8.1.7 Type adjustment
8.1.8 Builtin attributes
8.1.9 Type expansion
8.1.10 Union interpretation of derived types
8.2 Judgments for step expressions and filtering
8.2.1 Principal Node Kind
8.2.2 Auxiliary judgments for axes
8.2.2.1 Static semantics of axes
8.2.2.1.1 Inference rules for all axis
8.2.2.1.2 Inference rules for the self axis
8.2.2.1.3 Inference rules for the child axis
8.2.2.1.4 Inference rules for the attribute axis
8.2.2.1.5 Inference rules for the parent axis
8.2.2.1.6 Inference rules for the namespace axis
8.2.2.1.7 Inference rules for the descendant axis
8.2.2.1.8 Inference rules for the descendant-or-self axis
8.2.2.1.9 Inference rules for the ancestor axis
8.2.2.1.10 Inference rules for the ancestor-or-self axis
8.2.2.2 Dynamic semantics of axes
8.2.3 Auxiliary judgments for node tests
8.2.3.1 Static semantics of node tests
8.2.3.1.1 Name Tests
8.2.3.1.2 Kind Tests
8.2.3.2 Dynamic semantics of node tests
8.2.3.2.1 Name Tests
8.2.3.2.2 Kind Tests
8.3 Judgments for type matching
8.3.1 Matches
8.3.2 Subtype and Type equality
8.4 Judgments for FLWOR and other expressions on sequences
8.5 Judgments for function calls
8.5.1 Type promotion
8.6 Judgments for validation modes and contexts
8.6.1 Elements in validation mode
A Normalized core grammar
A.1 Core BNF
B Functions and Operators
B.1 Functions and Operators used in the Formal Semantics
B.2 Mapping of Overloaded Internal Functions
C Importing Schemas
C.1 Introduction
C.1.1 Features
C.1.2 Organization
C.1.3 Main mapping rules
C.1.4 Special attributes
C.1.4.1 use, default, and fixed
C.1.4.2 minOccurs, maxOccurs, minLength, maxLength, and length
C.1.4.3 mixed
C.1.4.4 nillable
C.1.4.5 substitutionGroup
C.1.5 Anonymous type names
C.2 Schemas as a whole
C.2.1 Schema
C.2.2 Include
C.2.3 Redefine
C.2.4 Import
C.3 Attribute Declarations
C.3.1 Global attributes declarations
C.3.2 Local attribute declarations
C.4 Element Declarations
C.4.1 Global element declarations
C.4.2 Local element declarations
C.5 Complex Type Definitions
C.5.1 Global complex type
C.5.2 Local complex type
C.5.3 Complex type with simple content
C.5.4 Complex type with complex content
C.6 Attribute Uses
C.7 Attribute Group Definitions
C.7.1 Attribute group definitions
C.7.2 Attribute group reference
C.8 Model Group Definitions
C.9 Model Groups
C.9.1 All groups
C.9.2 Choice groups
C.9.3 Sequence groups
C.10 Particles
C.10.1 Element reference
C.10.2 Group reference
C.11 Wildcards
C.11.1 Attribute wildcards
C.11.2 Element wildcards
C.12 Identity-constraint Definitions
C.13 Notation Declarations
C.14 Annotation
C.15 Simple Type Definitions
C.15.1 Global simple type definition
C.15.2 Local simple type definition
C.15.3 Simple type content
D References
D.1 Normative References
D.2 Non-normative References
D.3 Background References
E Auxiliary Judgments for Validation (Non-Normative)
E.1 Judgments for the validate expression
E.1.1 Type resolution
E.1.2 Interleaving
E.1.3 Attribute filtering
E.1.4 Erasure
E.1.4.1 Simply erases
E.1.4.2 Erases
E.1.5 Annotate
E.1.5.1 Simply annotate
E.1.5.2 Nil-annotate
E.1.5.3 Annotate
F Revision Log (Non-Normative)
F.1 15 September 2005
This document defines the formal semantics of XQuery 1.0 and XPath 2.0. The present document is part of a set of documents that together define the XQuery 1.0 and XPath 2.0 languages:
[XQuery 1.0: A Query Language for XML] introduces the XQuery 1.0 language, defines its capabilities from a user-centric view, and defines the language syntax.
[XML Path Language (XPath) 2.0] introduces the XPath 2.0 language, defines its capabilities from a user-centric view, and defines the language syntax.
[Functions and Operators] lists the functions and operators defined for the [XPath/XQuery] language and specifies the required types of their parameters and return value.
[Data Model] formally specifies the data model used by [XPath/XQuery] to represent the content of XML documents. The [XPath/XQuery] language is formally defined by operations on this data model.
[Data Model Serialization] specifies how [XPath/XQuery] data model values are serialized into XML.
The scope and goals for the [XPath/XQuery] language are discussed in the charter of the W3C [XSL/XML Query] Working Group and in the [XPath/XQuery] requirements [XML Query 1.0 Requirements].
This document defines the semantics of [XPath/XQuery] by giving a precise formal meaning to each of the expressions of the [XPath/XQuery] specification in terms of the [XPath/XQuery] data model. This document assumes that the reader is already familiar with the [XPath/XQuery] language. This document defines the formal semantics for XPath 2.0 only when the XPath 1.0 backward compatibility rules are not in effect.
Two important design aspects of [XPath/XQuery] are that it is functional and that it is typed. These two aspects play an important role in the [XPath/XQuery] Formal Semantics.
[XPath/XQuery] is a functional language. [XPath/XQuery] is built from expressions, rather than statements. Every construct in the language (except for the XQuery query prolog) is an expression and expressions can be composed arbitrarily. The result of one expression can be used as the input to any other expression, as long as the type of the result of the former expression is compatible with the input type of the latter expression with which it is composed. Another characteristic of a functional language is that variables are always passed by value, and a variable's value cannot be modified through side effects.
[XPath/XQuery] is a typed language. Types can be imported from one or more XML Schemas that describe the input documents and the output document, and the [XPath/XQuery] language can then perform operations based on these types. In addition, [XPath/XQuery] supports static type analysis. Static type analysis infers the output type of an expression based on the type of its input expressions. In addition to inferring the type an expression for the user, static typing allows early detection of type errors, and can be used as the basis for certain classes of optimization. The [XPath/XQuery] type system captures most of the features of [Schema Part 1], including global and local element and attribute declarations, complex and simple type definitions, named and anonymous types, derivation by restriction, extension, list and union, substitution groups, and wildcard types. It does not model uniqueness constraints and facet constraints on simple types.
This document is organized as follows. [2 Preliminaries] introduces the notations used to define the [XPath/XQuery] Formal Semantics. These include the formal notations for values in the [XPath/XQuery] data model and for types in XML Schema. The next three sections: [3 Basics], [4 Expressions], and [5 Modules and Prologs] have the same structure as the corresponding sections in the [XQuery 1.0: A Query Language for XML] and [XML Path Language (XPath) 2.0] documents. This allows the reader to quickly find the formal definition of a particular language construct. [3 Basics] defines the semantics for basic [XPath/XQuery] concepts, and [4 Expressions] defines the dynamic and static semantics of each [XPath/XQuery] expression. [5 Modules and Prologs] defines the semantics of the [XPath/XQuery] prolog. [7 Additional Semantics of Functions] defines the static semantics of several functions in [Functions and Operators] and gives the dynamic and static semantics of several supporting functions used in this document. The remaining sections, [8 Auxiliary Judgments] and [C Importing Schemas], contain material that supports the formal semantics of [XPath/XQuery]. [8 Auxiliary Judgments] defines formal judgments that relate data model values to types, that relate types to types, and that support the formal definition of validation. These judgments are used in the definition of expressions in [4 Expressions]. Lastly, [C Importing Schemas], specifies how XML Schema documents are imported into the [XPath/XQuery] type system and relates XML Schema types to the [XPath/XQuery] type system.
Certain aspects of language processing are described in this specification as implementation-defined or implementation-dependent.
[Definition: Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementor for each particular implementation.]
[Definition: Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementor for any particular implementation.]
A language aspect described in this specification as implementation-defined or implementation dependent may be further constrained by the specifications of a host language in which XPath or XQuery is embedded.
This document contains the normative static semantics of [XPath/XQuery]. The static semantics rules in [3 Basics], [4 Expressions], [5 Modules and Prologs], and [7 Additional Semantics of Functions] are normative. [3.1.1 Static Context] is normative, because it defines the static context used in the static typing rules. [8 Auxiliary Judgments] is normative, because it contains all the judgments necessary for defining SequenceType Matching.
The dynamic semantics of [XPath/XQuery] are normatively defined in [XQuery 1.0: A Query Language for XML] and [XML Path Language (XPath) 2.0]. In this document, the dynamic semantic rules in [3 Basics], [4 Expressions], and [5 Modules and Prologs], the examples, and the material labeled as "Note" are provided for explanatory purposes and are not normative.
The mapping rules from XML Schema to the XQuery type system provided in [C Importing Schemas], and the formal semantics of XML Schema validation in [E Auxiliary Judgments for Validation] are informative and do not handle every feature of XML Schema.
This section provides the background necessary to understand the Formal Semantics, introduces the notations that are used, and explains its relationship to other documents.
Why a Formal Semantics? The goal of the formal semantics is to complement the [XPath/XQuery] specification ([XQuery 1.0: A Query Language for XML] and [XML Path Language (XPath) 2.0]), by defining the meaning of [XPath/XQuery] expressions with mathematical rigor.
A rigorous formal semantics clarifies the intended meaning of the English specification, ensures that no corner cases are left out, and provides a reference for implementation.
Why use formal notations? Rigor is achieved by the use of formal notations to represent [XPath/XQuery] objects such as expressions, XML values, and XML Schema types, and by the systematic definition of the relationships between those objects to reflect the meaning of the language. In particular, the dynamic semantics relates [XPath/XQuery] expressions to the XML value to which they evaluate, and the static semantics relates [XPath/XQuery] expressions to the XML Schema type that is inferred for that expression.
The Formal Semantics uses several kinds of formal notations to define the relationships between [XPath/XQuery] expressions, XML values, and XML Schema types. This section introduces the notations for judgments, inference rules, and mapping rules as well as the notation for environments, which implement the dynamic and static contexts. The reader already familiar with these notations can skip this section and continue with [2.3 XML Values].
Grammar productions are used to describe "objects" (values, types, [XPath/XQuery] expressions, etc.) manipulated by the Formal Semantics. The Formal Semantics makes use of several kinds of grammar productions: productions from the [XPath/XQuery] grammar itself, productions for a subset of the [XPath/XQuery] language called the XQuery Core which is used throughout this document, and other productions used for formal specification only such as for the XQuery type system.
XQuery grammar productions describe the XQuery language and expressions. XQuery productions are identified by a number, which corresponds to their number in the [XQuery 1.0: A Query Language for XML] document, and are marked with "(XQuery)". For instance, the following production describes FLWOR expressions in XQuery.
| [33 (XQuery)] | FLWORExprXQ | ::= | (ForClause | LetClause)+ WhereClause? OrderByClause? "return" ExprSingle |
For the purpose of this document, the differences between the XQuery 1.0 and the XPath 2.0 grammars are mostly irrelevant. By default, this document uses XQuery 1.0 grammar productions. Whenever the grammar for XPath 2.0 differs from the one for XQuery 1.0, the corresponding XPath 2.0 productions are also given. XPath productions are identified by a number, which corresponds to their number in [XML Path Language (XPath) 2.0], and are marked with "(XPath)". For instance, the following production describes for expressions in XPath.
| [4 (XPath)] | ForExprXP | ::= | SimpleForClause "return" ExprSingle |
XQuery Core grammar productions describe the XQuery Core. The Core grammar is given in [A Normalized core grammar]. Core productions are identified by a number, which corresponds to their number in [A Normalized core grammar], and are marked with "(Core)". For instance, the following production describes the simpler form of the "FLWOR" expression in the XQuery Core.
| [32 (Core)] | FLWORExpr | ::= | (ForClause | LetClause) "return" ExprSingle |
The Formal Semantics manipulates "objects" (values, types, expressions, etc.) for which there is no existing grammar production in the [XQuery 1.0: A Query Language for XML] document. In these cases, specific grammar productions are introduced. Notably, additional productions are used to describe values in the [Data Model], and to describe the [XPath/XQuery] type system. Formal Semantics productions are identified by a number, and are marked by "(Formal)". For instance, the following production describes global type definitions in the [XPath/XQuery] type system.
| [39 (Formal)] | Definition | ::= | ("define" "element" ElementName OptSubstitution OptNillableTypeReference) |
Note that grammar productions that are specific to the Formal Semantics (i.e., marked with "(Formal)") are not part of [XPath/XQuery]. They are not accessible to the user and are only used in the course of defining the languages' semantics.
Grammar non-terminals are used extensively in this document to represent objects in judgments(see the next section). As a convenience, non-terminals used in judgmentslink to the appropriate grammar production.
The basic building block of the formal specification is called a judgment. A judgment expresses whether a property holds or not.
For example:
Notation
The judgment
holds if the object Object is a positive integer.
A judgment may hold (if it is true) or not hold (if it is false). For instance '1 is a positive integer' holds and '-1 is a positive integer' does not hold.
Notation
Here are two other example judgments.
The judgment
holds if the expression Expr yields (or evaluates to) the value Value.
The judgment
holds if the expression Expr has the type Type.
Most other judgments used in this document are short English sentences intended to reflect their meaning, and written in bold fonts. For instance, the judgment
holds if PrincipalNodeKind is the principal node kind for the axis Axis.
A judgment can contain symbols and patterns.
Symbols are purely syntactic and are used to write the judgment itself. In general, symbols in a judgment are chosen to reflect its meaning. For example, 'is beautiful', '=>' and ':' are symbols, the second and third of which should be read "yields", and "has type" respectively.
Patterns are usedto represent objects that can be constructedfrom a given grammar production. In patterns, italicizedwords correspond to non-terminals in the grammar.The nameof those non-terminals is significant, and may be instantiated only to an "object"(a value, a type, anexpression, etc.) that can be substitutedlegally for that non-terminal. For example, 'Expr' is a pattern that stands forevery [XPath/XQuery] expressions, 'Expr1 + Expr2'is a patternthat standsfor every addition expression, 'element a {Value}'isa pattern that standsfor every value in the [XPath/XQuery]data model that is an 'a' element.
Non-terminals in a patternmay appear with subscripts (e.g. Expr1, Expr2) to distinguish different instances of the same sort of pattern. Insome cases, non-terminals in a pattern may have a name that is not exactly the name of thatnon terminal,but is based on it. For instance, a BaseTypeName is a pattern that stands for a type name, as would TypeName, or TypeName2. This usage is limited, and only occurs to improve the readability of some of the inference rules.
Wheninstantiatingthe judgment, each pattern mustbeinstantiated toan appropriate sort of "object" (value,type, expression, etc). For example,'3 => 3'and'$x+0=> 3' are bothinstances of the judgment 'Expr => Value'. Notethatinthefirst judgment,'3' corresponds toboththe expression '3' (on the left-handside of the => symbol)and to the value '3' (onthe right-handside of the => symbol).
In some cases,inference rules may need to use the factthat acertain judgment does not hold. not( Judgment)holds iffJudgmentdoes not hold.
Insome cases, a pattern may be instantiated to avaluewithin a finite set of pre-determinedvalues. We may writethat setof possiblevalues using the in judgment.For instance, the judgment
holdsif the patternColorhas either the value blue or the value green.
An environment component is a dictionary that maps a symbol (e.g., a function name or a variable name) to an "object" (e.g., a function body, a type, a value). One can access information in an environment component or update it.
If "envComp" is an environment component,then "envComp(symbol)" denotes the "object" to which symbol is mapped. The notation is intentionally similar to function application, because an environment component can be considered a function from the argument symbol to the "object" to which the symbol is mapped.
This document uses environments that group related environment components. If "env" is an environment containing the environment component "envComp", that environment component is denoted "env.envComp". The value that symbol is mapped to in that environment component is denoted "env.envComp(symbol)".
The two main environmentsused in the Formal Semantics are: a dynamic environment (dynEnv), which models the [XPath/XQuery]'s dynamic context, and a static environment (statEnv), which models the [XPath/XQuery]'s static context. Both are defined in [3.1 Expression Context].
For example, dynEnv.varValue denotes the dynamic environment component that maps variables to values and dynEnv.varValue(Variable) denotes the value of the variable Variable in the dynamic context.
Environmentsare used in a judgment to capture some of the context in which the judgment is computed, and most judgments are computed assuming that some environment is given. This assumption is denoted by prefixing the judgment with "env |-". The "|-" symbol is called a "turnstile" and is used in almost all inference rules.
For instance, the judgment
is read as: Assuming the dynamic environment dynEnv, the expression Expr yields the value Value.
Environmentscan be updated, using the following notation:
"env.envComp(symbol => object) " denotes the new environment that is identical to env except that the environment component envComp has been updated to map symbol to object. The notation symbol => object indicates that symbol is mapped to object in the new environment.
In case the environment component contains only a constant value (e.g., the ordering mode which can only be either ordered or unordered), the following notation is used to set its value. "env.envComp( object ) ".
The following shorthand is also allowed: "env.envComp( symbol1 => object1 ; ... ; symboln => objectn ) " in which each symbol is mapped to a corresponding object in the new environment.
This notation is equivalent to nested updates, as in " (env + envComp( symbol1 => object1) + ... ) + env(symboln => objectn)".
Updating an environment creates a copy of the original environment and overrides any previous binding that might exist for the same name and the same component in that environment. Updating the environment is used to capture the scope of a symbol (e.g., for variables, namespace prefixes, etc). For instance, in the following expression
let $x := 1 return let $x := $x + 2 return $x - 3
each let expression changes the dynamic context by binding a
new variable to a new value. Each different context is
represented by a different environment. The original
environment, in which the expression 1 is
evaluated, does not contain any binding for variable
$x. This environment is updated a first time with a
binding of variable $x to the value 1,
and this environment is used for the evaluation of the
expression $x + 2. Then it is updated a second time
with a binding of variable $x to the value
3, and this environment is used for the evaluation
of the expression $x - 3.
Also, note that there are no operations to remove entries from environments. This is never necessary as updating an environment effectively creates a new extended copy of the original environment, leaving the original environment accessible wherever it is in scope along with the updated copy.
Inference rules are used to specify how to infer whether a given judgment holds ornot. Inference rules express the logical relationbetween judgments and describe how complex judgments can be concluded from simpler premise judgments.
A logical inference ruleis written as a collection of premises and a conclusion, written respectively above and below a dividing line, as follows:
| premise1 ... premisen |
| conclusion |
All premises and the conclusion are judgments. From a logical point of view, an inference rule is a deduction that if the premises hold, then the conclusion holds as well. In that sense, the previous inference rule has a similar meaning as the following logical statement.
IF premise1
AND ...
AND premisen
THEN conclusion
Here is a simple example of inference rule, which uses specific instances of the example judgment 'Expr => Value' from above:
| $x => 0 3 => 3 |
| $x + 3 => 3 |
This inference rule expresses the following property: if the variable expression '$x' yields the value '0', and the literal expression '3' yields the value '3', thenthe expression '$x + 3' yields the value '3'.
An inference rule may have no premises above the line, which means that the expression below the line always holds. For instance:
| 3 => 3 |
This inference rule expresses the following property: evaluating the literal expression '3' always yields the value '3'.
The two above rules are expressed in terms of specific expressions and values, but usually rules are more abstract. That is, the judgments are not fully instantiated. Here is a rule that says that for any variable Variablethat yields the integer value Integer, adding '0' yields the same integer value:
Each occurrence of a given pattern in a particular inference rule must be instantiated to the same "object" within the entire rule. This means that one can talk about "the value of Variable" instead of the value bound to the first (second, etc) occurrence of VarRef.
Here is an example of a rule occurring later in this document.
This rule is read as follows: if two expressions Expr1 and Expr2 are known to have the static types Type1 and Type2 (the two premises above the line), then it is the case that the sequenceexpression "Expr1 , Expr2" hasthe static type "Type1, Type2", which is the sequence of types Type1 and Type2. Note that thisinference rule does not modify the static environment.
The following rule defines the static semantics of a "let" expression. The binding of the new variable is captured by an update to the varType component of the original static environment.
| |||
statEnv |-
let
$VarName := Expr1
return
Expr2 : Type2
|
This rule is read as follows: First, because the variable is a QName, it is first expanded into an expanded QName. Second, the type Type1 for the "let" input expression Expr1 is computed. Then the "let" variable with expanded name, expanded-QName with type Type1 is added into the varType component of the static environment statEnv. Finally, the type Type2 of Expr2 is computed in that new environment.
In isolation,each inference rule describes a fragment of the semantics for a given judgment.Put together, inference rules describe possible inferences that can be used to decide whetherthat a particular judgment hold.
For a given judgment, and a set of inference rules, if that judgment can be inferred to betrue, the inference succeeds. In mostcases, the inference will proceed by proving intermediatejudgments, following the consequences from one judgment to thenext by applying successiveinference rules.
Suchinference is a mechanism which can be used to describe both statictype analysis and dynamic evaluation. Morespecifically, performing static typing consists in provingthat thefollowingjudgmentholdsfor a givenexpression Expr .
If the judgment holds for a given type Type,this type is a possible static type for the expression. If there exists no type for which this judgment holds, then static typing fails and a static type error is returned to the user.
Consider the following expression.
fn:count((1,2,3))
Using the static typing rules given for expressions in the
restof this document, onecan deduce that the expression is of
type xs:integer through the following inference.
statEnv |- 1 : xs:integer (from typing of literals)
statEnv |- 2 : xs:integer (from typing of literals)
--------------------------------------------------- (sequence)
statEnv |- 1,2 : xs:integer, xs:integer
statEnv |- 3 : xs:integer
----------------------------------------------------- (sequence)
statEnv |- 1,2,3 : xs:integer, xs:integer, xs:integer
declare function fn:count($x as item()*) as xs:integer
statEnv |- xs:integer,xs:integer,xs:integer <: item*
---------------------------------------------------------- (function call)
statEnv |- fn:count((1,2,3)) : xs:integer
Conversly, consider the following expression.
fn:nilled((1,2,3))
Using the static typing rules given for expressions in the rest of this document, one can apply inference rules up to the following point.
....
----------------------------------------------------- (sequence)
statEnv |- 1,2,3 : xs:integer, xs:integer, xs:integer
However, there is no rule thatcan infer the type of
fn:nilled((1,2,3)), because the static typing rules
for function calls will only hold if the type ofthe function
parameters is a subtype of the expected type. However, here
(xs:integer,xs:integer,xs:integer) is not a node
type, which is the expectedtype for the function
fn:nilled.
Note that in some cases, the inference can only proceedthrough the appropriate changes to the environment. For instance, consider the following expression.
let $x := 1 return ($x,$x)
Using the static typing rules given for expressions in the
rest of this document, one can deduce that the expression is of
type (xs:integer,xs:integer) through the following
inference.
statEnv0.varType = ()
-------------------------- (literal)
statEnv0 |- 1 : xs:integer
statEnv1 = statEnv0 + varType($x => xs:integer)
statEnv1.varType($x) = xs:integer
--------------------------------- (variable reference)
statEnv1 |- $x : xs:integer
statEnv1.varType($x) = xs:integer
--------------------------------- (variable reference)
statEnv1 |- $x : xs:integer
------------------------------------------- (sequence)
statEnv1 |- ($x,$x) : xs:integer,xs:integer
-------------------------------------------------------------- (let)
statEnv0 |- let $x := 1 return ($x,$x) : xs:integer,xs:integer
This example illustrates how each rule is applied to individual sub-expressions, and how the environment is used to maintain the relevant context information.
The Formal Semantics does not formally specify the adjustment of relative URIs according to a base URI. All URIs used in this document are assumed to be absolute URIs.
The Formal Semantics uses the following namespace prefixes.
fn: for functions and operators from the
[Functions and Operators] document.
xs: for XML Schema components and
built-in types.
xdt: for [XPath/XQuery] built-in
types.
All these prefixes are assumed to be bound to the appropriate URIs.
In addition, the Formal Semantics uses the following special prefixes for specification purposes.
dm: for accessors of the [Data Model].
op: for operators in [Functions and Operators].
fs: for functions and types defined in the formal semantics.
These prefixes are always italicized to emphasize that the corresponding functions, variables, and types are abstract: they are not and cannot be made accessible in [XPath/XQuery]. None of these special prefixes are given an explicit URI, but they behave as if they had one for the purposes of namespace resolution.
The [XPath/XQuery] language is defined over values of the [XPath/XQuery] data model. The [XPath/XQuery] data model is defined normatively in [Data Model]. We define the formal notation that is used in this document to describe and manipulate values in inference rules. Formal values are used for specification purposes only and are not exposed to the [XPath/XQuery] user.
This section gives the grammar for formal values, along with a summary of the corresponding data model properties. In the context of this document, all constraints on values that are specified in [Data Model] are assumed to hold.
A value is a sequence of zero or more items. An item is either an atomic value or a node.
An atomic value is a value in the value space of an atomic
type, labeled with the name of that atomic type. An atomic type
is either a primitive or derived atomic type according to XML
Schema [Schema Part 2], xdt:untypedAtomic, or
xdt:anyAtomicType.
A node is either an element, an attribute, a document, a text, a comment, or a processing-instruction node.
Element nodes have a type annotationXQ and contain a complex value or a
simple value. Attribute nodes have a type annotationXQ and contain a simple value. Text
nodes always contain one string value of type
xdt:untypedAtomic, therefore the corresponding type annotation
is omitted in the formal notation of a text node. Document nodes
do not have a type annotation and contain a sequence of element,
text, comment, or processing-instruction nodes.
A simple value is a sequence of atomic values.
A complex value is a sequence of attribute nodes followed by a sequence of element, text, comment, or processing-instruction nodes.
A type annotationXQ can be either the QName of a declared type or an anonymous type. An anonymous type corresponds to an XML Schema type for which the schema writer did not provide a name. Anonymous type names are not visible to the user, but are generated during schema validation and used to annotate nodes in the data model. By convention, anonymous type names are written using the fs: Formal Semantics prefix: fs:anon0, fs:anon1, etc.
Formal values are defined by the following grammar.
| [7 (Formal)] | Value | ::= | Item |
| [21 (Formal)] | Item | ::= | NodeValue |
| [22 (Formal)] | AtomicValue | ::= | AtomicValueContent TypeAnnotation? |
| [1 (Formal)] | AtomicValueContent | ::= | String |
| [2 (Formal)] | TypeAnnotation | ::= | "of" "type" TypeName |
| [9 (Formal)] | ElementValue | ::= | "element" ElementName "nilled"? TypeAnnotation? "{" Value "}" ("{" NamespaceBindings "}")? |
| [10 (Formal)] | AttributeValue | ::= | "attribute" AttributeName TypeAnnotation? "{" SimpleValue "}" |
| [8 (Formal)] | SimpleValue | ::= | AtomicValue |
| [11 (Formal)] | DocumentValue | ::= | "document" "{" Value "}" |
| [13 (Formal)] | CommentValue | ::= | "comment" "{" String "}" |
| [14 (Formal)] | ProcessingInstructionValue | ::= | "processing-instruction" NCName "{" String "}" |
| [12 (Formal)] | TextValue | ::= | "text" "{" String "}" |
| [20 (Formal)] | NodeValue | ::= | ElementValue |
| [3 (Formal)] | ElementName | ::= | QName |
| [6 (Formal)] | AttributeName | ::= | QName |
| [23 (Formal)] | TypeName | ::= | QName |
| [15 (Formal)] | NamespaceBindings | ::= | NamespaceBinding ("," NamespaceBinding)* |
| [17 (Formal)] | NamespaceBinding | ::= | "namespace" NCName "{" AnyURI "}" |
Notation
In that grammar, "String" indicates the value
space of xs:string, "Decimal" indicates the
value space of xs:decimal, etc.
Element (resp. attributes) without type annotations, are
assumed to have the type annotation xs:anyType
(resp. xs:anySimpleType). Atomic values without type
annotations, are assumed to have a type annotation which is the
base type for the corresponding value. For instance,
"Hello, World!" is equivalent to "Hello,
World!" of type xs:string.
Untyped elements (e.g., from well-formed documents) have the
type annotationXQ xdt:untyped,
untyped attributes have the type annotationXQ xdt:untypedAtomic, and untyped
atomic values have the type annotationXQ xdt:untypedAtomic.
An element has an optional "nilled" marker. This
marker ispresent only if the element has been validated
against
an element type in the schema which is "nillable",
and the element has no content and an
attribute
xsi:nil set to "true".
An element also has a sequence of namespace bindings, which are the set of in-scope namespaces for that element. Each namespace binding is a prefix, URI pair. Elements without namespace bindings are assumed to have an empty set of in-scope namespaces.
Note:
In [XPath], the in-scope namespaces of an element node are represented by a collection of namespace nodes arranged on a namespace axis, which is optional and deprecated in [XML Path Language (XPath) 2.0]. XQuery does not support the namespace axis and does not represent namespace bindings in the form of nodes.
A well-formed document
<fact>The cat weighs <weight units="lbs">12</weight> pounds.</fact>
In the absence of a Schema, this document is represented as
element fact of type xdt:untyped {
text { "The cat weighs " },
element weight of type xdt:untyped {
attribute units of type xdt:untypedAtomic {
"lbs" of type xdt:untypedAtomic
}
text { "12" }
},
text { " pounds." }
}
A document before and after validation.
<weight xsi:type="xs:integer">42</weight>
The formal model for values can represent values before and after validation. Before validation, this element is represented as:
element weight of type xdt:untyped {
attribute xsi:type of type xdt:untypedAtomic {
"xs:integer" of type xdt:untypedAtomic
},
text { "42" }
}
After validation, this element is represented as:
element weight of type xs:integer {
attribute xsi:type of type xs:QName {
"xs:integer" of type xs:QName
},
42 of type xs:integer
}
An element with a list type
<sizes>1 2 3</sizes>
Before validation, this element is represented as:
element sizes of type xdt:untyped {
text { "1 2 3" }
}
Assume the following Schema.
<xs:element name="sizes" type="sizesType"/>
<xs:simpleType name="sizesType">
<xs:list itemType="sizeType"/>
</xs:simpleType>
<xs:simpleType name="sizeType">
<xs:restriction base="xs:integer"/>
</xs:simpleType>
After validation against this Schema, the element is represented as:
element sizes of type sizesType {
1 of type sizeType,
2 of type sizeType,
3 of type sizeType
}
An element with an anonymous type
<sizes>1 2 3</sizes>
Before validation, this element is represented as:
element sizes of type xdt:untyped {
text { "1 2 3" }
}
Assume the following Schema.
<xs:element name="sizes">
<xs:simpleType>
<xs:list itemType="xs:integer"/>
</xs:simpleType>
</xs:element>
After validation, this element is represented as:
element sizes of type fs:anon1 {
1 of type xs:integer,
2 of type xs:integer,
3 of type xs:integer
}
where fs:anon1 stands for the internal anonymous name
generated by the system for the sizes element.
A nillable element with xsi:type set to
true:
<sizes xsi:nil="true"/>
Before validation, this element is represented as:
element sizes of type xdt:untyped {
attribute xsi:nil of type xdt:untypedAtomic { "true" of type xdt:untypedAtomic }
}
Assume the following Schema.
<xs:element name="sizes" type="sizesType" nillable="true"/>
After validation against this Schema, the element is represented as:
element sizes nilled of type sizesType {
attribute xsi:nil of type xs:boolean { true of type xs:boolean }
}
An element with a union type
<sizes>1 two 3 four</sizes>
Before validation, this element is represented as:
element sizes of type xdt:untyped {
text { "1 two 3 four" }
}
Assume the following Schema:
<xs:element name="sizes" type="sizesType"/>
<xs:simpleType name="sizesType">
<xs:list itemType="sizeType"/>
</xs:simpleType>
<xs:simpleType name="sizeType">
<xs:union memberType="xs:integer xs:string"/>
</xs:simpleType>
After validation against this Schema, the element is represented as:
element sizes of type sizesType {
1 of type xs:integer,
"two" of type xs:string,
3 of type xs:integer,
"four" of type xs:string
}
The [XPath/XQuery] type system is used in the specification of the dynamic and of the static semantics of [XPath/XQuery]. This section introduces formal notations for describing types.
The [XPath/XQuery] type system is based on [Schema Part 1] and [Schema Part 2]. [Schema Part 1] and [Schema Part 2] specify normatively the type information available in [XPath/XQuery]. We define the formal notation that is used in this document to describe and manipulate types in inference rules. Formal types are used for specification purposes only and are not exposed to the [XPath/XQuery] user.
Representation of content models. For the
purpose of static typing, the [XPath/XQuery] type system only
describes minOccurs, maxOccurs, and minLength, maxLength on list
types for the occurrences that correspond to the DTD operators
+, *, and ?. Choices are
represented using the DTD operator |. All
groups are represented using the interleaving operator
(&).
Representation of anonymous types. To clarify the semantics, the [XPath/XQuery] type system makes all anonymous types explicit.
Representation of XML Schema simple type facets and identity constraints. For simplicity, XML Schema simple type facets and identity constraints are not formally represented in the [XPath/XQuery] type system. However, an [XPath/XQuery] implementation supporting XML Schema import and validation must take simple type facets and identity constraints into account.
This document describe types in the [XPath/XQuery] types system, as well as the operations and properties over those types which are used to define the [XPath/XQuery] static typing feature. The two most important properties are whether a data instances matches a type, and whether a type is a subtype of another. Those properties are described in [8.3 Judgments for type matching]. This document does not describe all other possible properties over those types.
The mapping from XML Schema into the [XPath/XQuery] type system is given in [C Importing Schemas]. The rest of this section is organized as follows. [2.4.2 Item types] describes item types, [2.4.3 Content models] describes content models, and [2.4.4 Top level definitions] describe top-level type declarations.
An item type is either an atomic type, an element type, an attribute type, a document node type, a text node type, a comment node type, or a processing instruction type. We distinguish between document nodes, attribute nodes, and nodes that can occur in element content (elements, comments, processing instructions, and text nodes), as we need to refer to element content types later in the formal semantics.
| [25 (Formal)] | FormalItemType | ::= | AtomicTypeName | NodeType |
| [28 (Formal)] | AtomicTypeName | ::= | QName |
| [26 (Formal)] | NodeType | ::= | DocumentType |
| [27 (Formal)] | ElementContentType | ::= | ElementType |
| [29 (Formal)] | ElementType | ::= | "element" ElementNameOrWildcard OptTypeSpecifier |
| [29 (Formal)a] | ElementNameOrWildcard | ::= | QName | "*" |
| [29 (Formal)b] | AttributeNameOrWildcard | ::= | QName |"*" |
| [29 (Formal)c] | OptTypeSpecifier | ::= | TypeSpecifier? |
| [30 (Formal)] | TypeSpecifier | ::= | OptNillableTypeReference |
| [31 (Formal)] | AttributeType | ::= | "attribute" AttributeNameOrWildcardOptTypeReference |
| [31 (Formal)a] | OptNillable | ::= | Nillable? |
| [32 (Formal)] | Nillable | ::= | "nillable" |
| [32 (Formal)a] | OptTypeReference | ::= | TypeReference? |
| [32 (Formal)b] | TypeReference | ::= | "of" "type" TypeName |
| [48 (Formal)] | DocumentType | ::= | "document" ("{" Type "}")? |
An element or attribute type has an optional name and an optional type reference. A name alone corresponds to a reference to a global element or attribute declaration. A name with a type reference corresponds to a local element or attribute declaration. The word "element" or "attribute" alone refers to the wildcard types for any element or any attribute. In addition, an element type has an optional nillable flag that indicates whether the element can be nilled or not.
A document type has an optional content type. If no content type is given, then the type is treated as being the wildcard type for documents, i.e., a sequence of text and element nodes. For consistency with element nodes, PIs and comments are not indicated in that wildcard type, but may occur in instances.
Note
Generic node types (e.g., node()) such as used
in the SequenceType production, are interpreted in the type
system as a union of thecorresponding node types (e.g.,
element,attribute,text,commentand processing-instruction
nodes)and therefore do not appear in the grammar. The
semantics of sequence types is described in [3.5.4 SequenceType Matching].
Examples
The following is a text node type
text
The following is a type for all elements
element * of type xs:anyType
The following is a type for all elements of type string
element * of type xs:string
The following is a type for a nillable element of type
string and with name size
element size nillable of type xs:string
The following is a reference to a global attribute declaration
attribute sizes
The following is a type for elements with anonymous type fs:anon1:
element sizes of type fs:anon1
Following XML Schema, types in [XPath/XQuery] are composed from
item types by optional, one or more, zero or more, all
group, sequence, choice, empty
sequence (written empty), or empty choice (written
none).
The type empty matches the empty sequence. The type
none matches no values. none is the identity for
choice, that is (Type | none) =
Type. The type none is the static type
for [7.2.9 The fn:error function].
| [24 (Formal)] | Type | ::= | FormalItemType |
The [XPath/XQuery] type system includes three binary operators on types: ",", "|" and "&", corresponding respectively to sequence, choice and all groups in Schema. The [XPath/XQuery] type system includes three unary operators on types: "*", "+", and "?", corresponding respectively to zero or more instances of the type, one or more instances of the type, or an optional instance of the type.
The "&" operator builds the "interleaved product" of two types. The type Type1 & Type2 matches any sequence that is an interleaving of two sequences of items, Value1 and Value2,with Value1matching Type1 and Value2 matchingType2. The interleaving of two sequences of items Value1 and Value2 is any sequence Value0 such that there is an ordered partition of Value0 into the two sub-sequences Value1 and Value2. The interleaved product captures the semantics of all groups in XML Schema, but is more general as it applies to arbitrary types. All groups in XML Schema are restricted to apply only on global or local element declarations with minOccurs 0 or 1, and maxOccurs 1.
For example, consider the types Type1 =
xs:integer,xs:integer,xs:integer and Type2 =
xs:string,xs:string. Value1 = (1,2,3)
matches the type Type1 and Value2 =
("a","b") matches the type Type2. Any of the
following Value0 are interleavings of Value1 and
Value2, and therefore match the type (Type1 &
Type2):
Value0 = (1,2,3,"a","b")
Value0 = (1,2,"a",3,"b")
Value0 = (1,2,"a","b",3)
Value0 = (1,"a",2,3,"b")
Value0 = (1,"a",2,"b",3)
Value0 = (1,"a","b",2,3)
Value0 = ("a",1,2,3,"b")
Value0 = ("a",1,2,"b",3)
Value0 = ("a",1,"b",2,3)
Value0 = ("a","b",1,2,3)
Types precedence order. To improve readability when writing types, we assume the following precedence order between operators on types.
Parenthesis can be used to enforce precedence. For instance
xs:string | xs:integer, xs:float*
is equivalent to
xs:string | (xs:integer, (xs:float*))
and a different precedence can be obtained by writing
((xs:string | xs:integer), xs:float)*
Examples
A sequence of elements
The "," operator builds the "sequence" of two types. For example,
element title of type xs:string, element year of type xs:integer
is a sequence of an element title of type string followed by an element year of type integer.
The union of two element types
The "|" operator builds the "union" of two types. For example,
element editor of type xs:string | element bib:author
means either an element editor of type string, or a
reference to the global element bib:author.
An all group of two elements
The "&" operator builds the "interleaved product" of two types. For example,
(element a & element b) =
element a, element b
| element b, element a
which specifies that the a and b
elements can occur in any order.
An empty type
The following type matches the empty sequence.
empty
A sequence of zero or more elements
The following type matches zero or more elements each of
which can be a surgeon or a
plumber.
(element surgeon | element plumber)*
Top level definitions correspond to global element declarations, global attribute declarations and type definitions in XML Schema.
| [40 (Formal)] | Definitions | ::= | (Definition Definitions)? |
| [39 (Formal)] | Definition | ::= | ("define" "element" ElementName OptSubstitution OptNillableTypeReference) |
| [39 (Formal)a] | OptSubstitution | ::= | Substitution? |
| [41 (Formal)] | Substitution | ::= | "substitutes" "for" ElementName |
| [33 (Formal)] | TypeDerivation | ::= | ComplexTypeDerivation | AtomicTypeDerivation |
| [34 (Formal)] | ComplexTypeDerivation | ::= | Derivation? OptMixed"{" Type? "}" |
| [35 (Formal)] | AtomicTypeDerivation | ::= | "restricts" AtomicTypeName |
| [37 (Formal)] | Derivation | ::= | ("restricts" TypeName) |
| [37 (Formal)a] | OptMixed | ::= | Mixed? |
| [38 (Formal)] | Mixed | ::= | "mixed" |
A type definition has a name (possibly anonymous) and a type derivation. In the case of a complex type, the derivation indicates whether it is derived by extension or restriction, its base type, and its content model, with an optional flag indicating if it has mixed content.
Example
For instance, the following complex type
<complexType name="UKAddress">
<complexContent>
<extension base="ipo:Address">
<sequence>
<element name="postcode" type="ipo:UKPostcode"/>
</sequence>
<attribute name="exportCode" type="positiveInteger" fixed="1"/>
</extension>
</complexContent>
</complexType>
is represented as follows
define type UKAddress extends ipo:Address {
attribute exportCode of type ipo:UKPostcode,
element postcode of type positiveInteger
};
Example
In the case of simple types derived by union or list, the
derivation is always a restriction from the base type
xs:anySimpleType, and has a content which is a union of the
member types, or a repetition of the item type. For instance, the two following simple type declarations
<xsd:simpleType name="listOfMyIntType"> <xsd:list itemType="myInteger"/> </xsd:simpleType> <xsd:simpleType name="zipUnion"> <xsd:union memberTypes="USState FrenchRegion"/> </xsd:simpleType>
are represented as follows
define type listOfMyIntType restricts xs:anySimpleType {
myInteger*
}
define type zipUnion restricts xs:anySimpleType {
USState | FrenchRegion
}
Example
In the case of an atomic type, it just indicates its base type. For instance, the following type definition
<xsd:simpleType name="SKU">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:restriction>
</xsd:simpleType>
is represented as follow
define type SKU restrict xsd:string;
Example
When the type derivation is omitted, the type derives by
restriction from xs:anyType. For instance:
define type Bib { element book* } =
define type Bib restricts xs:anyType { element book* }
Example
Empty content can be indicated with the explicit empty sequence, or omitted, as in:
define type Bib { } =
define type Bib { empty }
Global element and attribute declarations always have a name and a reference to a (possibly anonymous) type. A global element declaration also may declare a substitution group for the element and whether the element is nillable.
Example
A type declaration with one element name of type
xs:string follows by one or more elements street of type
xs:string.
define type Address {
element name of type xs:string,
element street of type xs:string*
}
Example
A type declaration with complex content derived by extension
define type USAddress extends Address {
element zip name of type xs:integer
}
Example
A type declaration with mixed content
define type Section mixed {
(element h1 of type xs:string |
element p of type xs:string |
element div of type Section)*
}
Example
A type declaration with simple content derived by restriction
define type SKU restricts xs:string
Example
An element declaration
define element address of type Address
Example
An element declaration with a substitution group
define element usaddress substitutes for address of type USAddress
Example
An element declaration which is nillable
define element zip nillable of type xs:integer
Here is a schema describing purchase orders from [XML Schema Part 0].
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:annotation>
<xsd:documentation xml:lang="en">
Purchase order schema for Example.com.
Copyright 2000 Example.com. All rights reserved.
</xsd:documentation>
</xsd:annotation>
<xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
<xsd:element name="comment" type="xsd:string"/>
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
<xsd:complexType name="USAddress">
<xsd:sequence>
<xsd:element name="name" type="xsd:string"/>
<xsd:element name="street" type="xsd:string"/>
<xsd:element name="city" type="xsd:string"/>
<xsd:element name="state" type="xsd:string"/>
<xsd:element name="zip" type="xsd:decimal"/>
</xsd:sequence>
<xsd:attribute name="country" type="xsd:NMTOKEN" fixed="US"/>
</xsd:complexType>
<xsd:complexType name="Items">
<xsd:sequence>
<xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="productName" type="xsd:string"/>
<xsd:element name="quantity">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="100"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:element>
<xsd:element name="USPrice" type="xsd:decimal"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="shipDate" type="xsd:date" minOccurs="0"/>
</xsd:sequence>
<xsd:attribute name="partNum" type="SKU" use="required"/>
</xsd:complexType>
</xsd:element>
</xsd:sequence>
</xsd:complexType>
<!-- Stock Keeping Unit, a code for identifying products -->
<xsd:simpleType name="SKU">
<xsd:restriction base="xsd:string">
<xsd:pattern value="\d{3}-[A-Z]{2}"/>
</xsd:restriction>
</xsd:simpleType>
</xsd:schema>
Here is the mapping of the above schema into the [XPath/XQuery] type system.
declare namespace xsd = "http://www.w3.org/2001/XMLSchema";
define element purchaseOrder of type PurchaseOrderType;
define element comment of type xsd:string;
define type PurchaseOrderType {
attribute orderDate of type xsd:date?,
element shipTo of type USAddress,
element billTo of type USAddress,
element comment?,
element items of type Items
};
define type USAddress {
attribute country of type xsd:NMTOKEN,
element name of type xsd:string,
element street of type xsd:string,
element city of type xsd:string,
element state of type xsd:string,
element zip of type xsd:decimal
};
define type Items {
attribute partNum of type SKU,
element item of type fs:anon1*
};
define type fs:anon1 {
element productName of type xsd:string,
element quantity of type fs:anon2,
element USPrice of type xsd:decimal,
element comment?,
element shipDate of type xsd:date?
};
define type fs:anon2 restricts xsd:positiveInteger;
define type SKU restrict xsd:string;
Note that the two anonymous types in the item
element declarations are mapping to types with names fs:anon1
and fs:anon2.
The following additional definitions illustrate how more advanced XML Schema features (a complex type derived by extension, an anonymous simple type derived by restriction, and substitution groups) are represented in the [XPath/XQuery] type system.
<complexType name="NYCAddress">
<complexContent>
<extension base="USAddress">
<sequence>
<element ref="apt"/>
</sequence>
</extension>
</complexContent>
</complexType>
<element name="apt">
<xsd:simpleType>
<xsd:restriction base="xsd:positiveInteger">
<xsd:maxExclusive value="10000"/>
</xsd:restriction>
</xsd:simpleType>
</element>
<element name="usaddress" substitutionGroup="address" type="USAddress"/>
<element name="nycaddress" substitutionGroup="usaddress" type="NYCAddress"/>
The above definitions are mapped into the [XPath/XQuery] type system as follows:
define type NYCAddress extends USAddress {
element apt
}
define element apt of type fs:anon3
define type fs:anon3 restricts xsd:positiveInteger
define element usaddress substitutes for address of type USAddress
define element nycaddress substitutes for usaddress of type NYCAddress
The [Functions and Operators] document defines built-in functions available in [XPath/XQuery]. A number of these functions are used to define the [XPath/XQuery] semantics; those functions are listed in [B.1 Functions and Operators used in the Formal Semantics].
Many functions in the [Functions and Operators] document are
generic: they perform operations on arbitrary
components of the data model, e.g., any kind of node, or any
sequence of items. For instance, the fn:unordered
returns its input sequence in an implementation-dependent
order. The signature of the fn:unordered function
takes arbitrary items as input and output:
fn:unordered($sourceSeq as item()*) as item()*
As defined, this signature provides little useful type
information. For such functions, better type information can often
be obtained by having the output type depend on the type of input
parameters. For instance, if the function
fn:unordered is applied on a sequence of
a elements, the result is also a sequence of
a elements.
In order to provide better static typing for those functions, specific typing rules are given in [7 Additional Semantics of Functions].
The organization of this section parallels the organization of Section 2 BasicsXQ.
Introduction
The expression context for a given expression consists of all the information that can affect the result of the expression. This information is organized into the static context and the dynamic context. This section specifies the environments that represent the context information used by [XPath/XQuery] expressions.
Notation
Weintroduce the following auxiliary grammar production to describe functionsignatures.
| [94 (Formal)] | FunctionSig | ::= | "declare" "function" QName "(" TypeList? ")" "as" SequenceType |
| [95 (Formal)] | TypeList | ::= | SequenceType ("," SequenceType)* |
statEnv denotes the environment available during static analysis. Static analysis may extend parts of the static environment. The static environment is also available during dynamic evaluation.
If analysis of an expression relies on some component of the static context that has not been assigned a value, a static error is raised.
The following environment components are part of the static environment:
| statEnv.xpath1.0_compatibility |
| |||||
| statEnv.namespace |
| |||||
| statEnv.default_elem_namespace |
| |||||
| statEnv.default_function_namespace |
| |||||
| statEnv.typeDefn |
| |||||
| statEnv.elemDecl |
| |||||
| statEnv.attrDecl |
| |||||
| statEnv.varType |
| |||||
| statEnv.funcType |
| |||||
| statEnv.collations |
| |||||
| statEnv.defaultCollation |
| |||||
| statEnv.constructionMode |
| |||||
| statEnv.orderingMode |
| |||||
| statEnv.defaultEmptySequenceOrder |
| |||||
| statEnv.boundarySpace |
| |||||
| statEnv.copyNamespacesMode |
| |||||
| statEnv.baseURI |
| |||||
| statEnv.docType |
| |||||
| statEnv.collectionType |
| |||||
| statEnv.defaultCollectionType |
|
Note that the boundary-space behavior is not formally specified in this document.
Environments have an initial state when [expression/query] processing begins, containing, for example, the function signatures of all built-in functions. The initial values for the static context are defined in Section C Context ComponentsXQ and Section C Context ComponentsXP and is denoted by statEnvDefault in the Formal Semantics.
Here is an example that shows how the static environment is modified in response to a namespace definition.
| ||
statEnv |-
declare namespace NCName = URI; Expr
|
This rule reads as follows: "the phrase on the bottom (a namespace declaration in the query prolog followed by a sequence of expressions) is well-typed (accepted by the static type inference rules) within an environment statEnv if the sequence of expressions above the line is well-typed in the environment obtained from statEnv by adding the namespace declaration".
The helper function
fs:active_ns(statEnv) returns all the active in-scope
namespaces in the given static environment.
For each attribute and element node in
Value, such that the node has name expanded-QName in the
namespace URI, the helper function
fs:get_static_ns_from_items(statEnv, Value) returns the
in-scope namespace that corresponds to URI. This is a
reverse-lookup on statEnv.namespace by URI.
A common use of the static environment is to expand a
QName by looking up the URI that corresponds to the QName's
namespace prefix in the statEnv.namespace environment component and by
constructing an expanded-QNameDM,
which contains the URI and the QName's local part. Element and
type names may be in the null namespace, that is, there is no
URI associated with their namespace prefix. The null namespace
is denoted by the special value
#NULL-NAMESPACE.
The auxiliary judgments below expand an element, type, attribute, variable, or function QName by looking up the namespace prefix in statEnv.namespace or, if the QName is unqualified, by using the appropriate default namespace.
Notation
The judgment
holds when the element or type QName expands to the given expanded QName.
The judgment
holds when the attribute QName expands to the given expanded QName.
The judgment
holds when the variable QName expands to the given expanded QName.
The judgment
holds when the function QName expands to the given expanded QName.
Semantics
Note that none of the inference rules can infer a resolved name in the case a given namespace prefix is bound to the (#UNDECLARED) value. As a result, namespace resolution will fail if the implementation supports [XML Names 1.1] and a given namespace prefixed as been undeclared.
An element or type QName consisting of a prefix NCName and a local part NCName expands to the URI (or the null namespace) corresponding to that prefix and the local part.
| statEnv.namespace(NCName1) = URI-or-#NULL-NAMESPACE |
| statEnv |- NCName1:NCName2 of elem/type expands to (URI-or-#NULL-NAMESPACE, NCName2) |
An element or type QName consisting only of a local part NCName expands to the default element/type namespace and the local part.
| statEnv.default_elem_namespace = URI-or-#NULL-NAMESPACE |
| statEnv |- NCName of elem/type expands to (URI-or-#NULL-NAMESPACE, NCName) |
An attribute QName consisting of a prefix NCName and a local part NCName expands to the URI (or the null namespace) corresponding to the prefix and the local part.
| statEnv.namespace(NCName1) = URI-or-#NULL-NAMESPACE |
| statEnv |- NCName1:NCName2 of attr expands to (URI-or-#NULL-NAMESPACE, NCName2) |
An attribute QName consisting only of a local part NCName expands to the null namespace and the local part.
| statEnv |- NCName of attr expands to (#NULL-NAMESPACE, NCName) |
A variable QName consisting of a prefix NCName and a local part NCName expands to the URI that corresponds to the prefix and the local part.
| statEnv.namespace(NCName1) = URI |
| statEnv |- NCName1:NCName2 of var expands to (URI, NCName2) |
A variable QName consisting only of a local part NCName expands to the null namespace and the local part.
| statEnv |- NCName of var expands to (#NULL-NAMESPACE, NCName) |
A function QName consisting of a prefix NCName and a local part NCName expands to the URI that corresponds to the prefix and the local part.
| statEnv.namespace(NCName1) = URI |
| statEnv |- NCName1:NCName2 of func expands to (URI, NCName2) |
A function QName consisting only of a local part NCName expands to the default function namespace URI and the local part.
| statEnv.default_function_namespace = URI |
| statEnv |- NCName of func expands to (URI, NCName) |
dynEnv denotes the environment availableduring dynamic evaluation. Dynamic evaluation mayextend parts of the dynamic environment.
If evaluation of an expression relies on some component of the dynamic context that has not been assigned a value, a dynamic error is raised.
The following environment componentsare part of the dynamic environment:
| dynEnv.varValue |
| ||||
| dynEnv.funcDefn |
| ||||
| dynEnv.dateTime |
| ||||
| dynEnv.timezone |
| ||||
| dynEnv.docValue |
| ||||
| dynEnv.collectionValue |
| ||||
| dynEnv.defaultCollectionValue |
|
The initial values for the dynamic context are defined in Section C Context ComponentsXQ and Section C Context ComponentsXP. The corresponding initial dynamicenvironment is denoted by dynEnvDefault in the Formal Semantics.
The following Formal Semantics variables represent the context item, context position, and context size properties of the dynamic context:
| Built-in Variable | Represents: |
$fs:dot | context item |
$fs:position | context position |
$fs:last | context size |
Within this document, variableswith the "fs" prefix are
reserved for use in the
formalspecification. Values of
$fs:position and $fs:last can be obtained by invoking the
fn:position and fn:last functions, respectively.
This section reviews the processing model for [XPath/XQuery].The [XPath/XQuery] processing model is defined normatively in Section 2.2 Processing ModelXQ. This section also explains how the main notations (normalization rules, static type inference, and dynamic evaluation) relate to the phases in that processing model.
The following figure depicts the [XPath/XQuery] processing model

Figure 1: Processing Model Overview
This processing model is not intended to describe an actual implementation, although a naive implementation might be based upon it. It does not prescribe an implementation technique, but any implementation should produce the same results as obtained by following this processing model and applying the rest of the Formal Semantics specification.
Query processing consists of two phases: a static analysis phase and a dynamic evaluation phase. Static analysis is further divided into four sub-phases. Each phase consumes the result of the previous phase and generates output for the next phase. For each processing phase, we point to the relevant notations introduced later in the document.
[Definition: The static analysis phase depends on the expression itself and on the static context. The static analysis phase does not depend on input data (other than schemas).]
The purpose of the static analysis phase is to detect errors, e.g., syntax errors or type errors, at compile time rather than at run-time. If no error occurs, the result of static analysis could be some compiled form of [expression/query], suitable for execution by a compiled-[expression/query] processor. Static analysis consists of the following sub-phases:
Parsing. (Step SQ1 in Figure 1). The grammar for the [XPath/XQuery] syntax is defined in [XQuery 1.0: A Query Language for XML]. Parsing may generate syntax errors. If no error occurs, an internal operation tree of the parsed query is created.
Static Context Processing. (Steps SQ2, SQ3, and SQ4 in Figure 1). The static semantics of [expression/query] depends on the input static context. The input static context needs to be generated before the [expression/query] can be analysed. In XQuery, the input static context may be defined by the processing environment and by declarations in the Query Prolog (See [5 Modules and Prologs]). In XPath, the input static context is defined by the processing environment. The static context is denoted by statEnv.
Normalization.
(Step SQ5 in Figure 1). To simplify the semantics
specification, some normalization is performed on the
[expression/query]. The [XPath/XQuery] language provides many powerful
features that make [expression/query]s simpler to write and use,
but are also redundant. For instance, a complex
for expression might be rewritten as a
composition of several simple for
expressions. The language composed of these simpler
[expression/query] is called the [XPath/XQuery] Core
language and is described by a grammar which is a
subset of the XQuery grammar. The grammar of the [XPath/XQuery]
Core language is given in [A Normalized core grammar].
During the normalization phase, each [XPath/XQuery] [expression/query] is mapped into its equivalent [expression/query] in the Core. (Note that this has nothing to do with Unicode Normalization, which works on character strings.) Normalization works by recursive application of the normalization rules over agiven expression.
Specifically the normalization phase is defined in terms of the static part of the context (statEnv) and a [expression/query] (Expr) abstract syntax tree. Formal notations for the normalization phase are introduced in [3.2.2 Normalization judgment].
After normalization, the full semantics is obtained by giving a semantics to the normalized Core [expression/query]. This is done during the last two phases.
Static type analysis. (Step SQ6 in Figure 1). Static type analysis is optional. If this phase is not supported, then normalization is followed directly by dynamic evaluation.
Static type analysis checks whether each [expression/query] is type safe, and if so, determines its static type. Static type analysis is defined only for Core [expression/query]. Static type analysis works by recursive application of the type inference rules overagiven expression.
If the [expression/query] is not type-safe, static type analysis yields a type error. For instance, a comparison between an integer value and a string value might be detected as an type error during the static type analysis. If static type analysis succeeds, it yields an abstract syntax tree where each sub-expression is associated with its static type.
More precisely, the static analysis phase is defined in terms of the static context (statEnv) and a Core [expression/query] (CoreExpr). Formal notations for the static analysis phase are introduced in [3.2.3 Static typing judgment].
Static typing does not imply that the content of XML documents must be rigidly fixed or even known in advance. The [XPath/XQuery] type system accommodates "flexible" types, such as elements that can contain any content. Schema-less documents are handled in [XPath/XQuery] by associating a standard type with the document, such that it may include any legal XML content.
If the static analysis phase succeeds, the dynamic evaluation phase (sometimes also called "execution") evaluates a query on input document(s).
Dynamic Context Processing. (Steps DQ2 and DQ3 in Figure 1).The dynamic semantics of [expression/query] depends on the dynamic input context. The dynamic input context needs to be generated before the [expression/query] can be evaluated. The dynamic input context may be defined by the processing environment and by statements in the Query Prolog (See [5 Modules and Prologs]). In XPath, the dynamic input context is defined by the processing environment. The static input context is denoted by dynEnv.
Dynamic Evaluation. (Steps DQ4 and DQ5 in Figure 1). This phase computes the value of an [expression/query]. The semantics of evaluation is defined only for Core [expression/query] terms. The formal description of evaluation works by recursive application of the evaluation rules over agiven expression. (Note that in practice some implementations may prefer top-down evaluation strategies.) Evaluation may result in a value OR a dynamic error, which may be a non-type error or a type error. If static typing of an expression does not raise a type error, then dynamic evaluation of the same expression will not raise a type error (and thus dynamic type checking can be avoided when static typing is enabled). Dynamic evaluation may still raise a non-type error.
The dynamic evaluation phase is defined in terms of the static context (statEnv) and evaluation context (dynEnv), and a Core [expression/query] (CoreExpr). Formal notations for the dynamic evaluation phase are introduced in [3.2.4 Dynamic evaluation judgment].
Static type analysis catches only certain classes of
errors. For instance, it can detect a comparison operation
applied between incompatible types (e.g., xs:int
and xs:date). Some other classes of errors cannot
be detected by the static analysis and are only detected at
evaluation time. For instance, whether an arithmetic expression
on 32 bit integers (xs:int) yields an out-of-bound
value can only be detected at run-time by looking at the
data.
While implementations are free to implement different processing models, the [XPath/XQuery] static semantics relies on the existence of a static type analysis phase that precedes any access to the input data.
The above processing phases are all internal to the [XPath/XQuery] processor. They do not deal with how the [XPath/XQuery] processor interacts with the outside world, notably how it accesses actual documents and types. A typical [expression/query] engine would support at least three other important processing phases:
Schema Import Processing. The [XPath/XQuery] type system is based on XML Schema. In order to perform dynamic or static typing, the [XPath/XQuery] processor needs to build type descriptions that correspond to the schema(s) of the input documents. This phase is achieved by mapping all schemas required by the [expression/query] into the [XPath/XQuery] type system. The XML Schema import phase is described in [C Importing Schemas].
Data Model Generation. Expressions are evaluated on values in the [Data Model]. XML documents must be loaded into the [Data Model] before the evaluation phase. This is described in the [Data Model] and is not discussed further here.
Serialization. Once the [expression/query] is evaluated, processors might want to serialize the result of the [expression/query] as actual XML documents. Serialization of data model instances is described in [Data Model Serialization] and is not discussed further here.
The parsing phase is not specified formally; the formal semantics does not define a formal model for the syntax trees, but uses the [XPath/XQuery] concrete syntax directly. More details about parsing for XQuery 1.0 can be found in the [XQuery 1.0: A Query Language for XML] document and more details about parsing for XPath 2.0 can be found in the [XML Path Language (XPath) 2.0] document. No further discussion of parsing is included here.
Normalization is specified using mapping rules, which describe how a [XPath/XQuery] expression is rewritten into an expression in the [XPath/XQuery] Core. Mapping rules are also used in [C Importing Schemas] to specify how XML Schemas are imported into the [XPath/XQuery] type system.
Notation
Mapping rules are written using a square bracket notation, as follows:
| [Object]Subscript |
| == |
| Mapped Object |
The original "object" is written above the == sign. The rewritten "object" is written beneath the == sign. The subscript is used to indicate what kind of "object" is mapped, and sometimes to pass some information between mapping rules.
Since normalization is always applied in the presence of a static context, the above rule is a shorthand for:
The static environment is used in certain normalization rules (e.g. for normalization of function calls). To keep the notation simpler, the static environment is not written in the normalization rules, but it is assumed to be available.
The normalization rule that is used to map "top-level" expressions in the [XPath/XQuery] syntax into expressions in the [XPath/XQuery] Core is:
| [Expr]Expr |
| == |
| CoreExpr |
which indicates that the expression Expr is normalized to the expression CoreExpr in the [XPath/XQuery] Core (with the implied statEnv).
Example
For instance, the following [expression/query]
for $i in (1, 2),
$j in (3, 4)
return
element pair { ($i,$j) }
is normalized to the Core expression
for $i in (1, 2) return
for $j in (3, 4) return
element pair { ($i,$j) }
in which the "FWLR" expression is mapped into a composition of two simpler "for" expressions.
The static semantics is specified using type inference rules, which relate [XPath/XQuery] expressions to types and specify under what conditions an expression is well typed.
Notation
The judgment
holds when, in the static environment statEnv, the expression Expr has type Type.
Example
The result of static type inference is to associate a static type with every [expression/query], such that any evaluation of that [expression/query] is guaranteed to yield a value that belongs to that type.
For instance, the following expression.
let $v := 3 return $v+5
has type xs:integer. This can be inferred as follows: the
input literals '3' and '5' have type integer, so the variable
$v also has type integer. Since the sum of two integers is an
integer, the complete expression has type integer.
Note
The type of an expression is computed by inference. Static type inference rules define for each kind of expression how to compute the type of the expression given the types of its sub-expressions. Here is a simple example:
statEnv |-
Expr1 : xs:boolean
statEnv |-
Expr2 : Type2
statEnv |-
Expr3 : Type3
|
statEnv |-
if Expr1
then Expr2
else Expr3 : ( Type2 | Type3 )
|
This rule states that if the conditional expression of an "if" expression has type boolean, then the type of the entire expression is one of the two types of its "then" and "else" clauses. Note that the resulting type is represented as a union: '(Type2|Type3)'.
The "left half" (the part before the :) of the expression below the line corresponds to some [expression/query], for which a type is computed. If the [expression/query] has been parsed into an internal abstract syntax tree, this usually corresponds to some node in that tree. The expression usually has patterns in it (here Expr1, Expr2, and Expr3) that need to be matched against the children of the node in the abstract syntax tree. The expressions above the line indicate things that need to be computed to use this rule; in this case, the types of the condition expression and the two branches of the if-then-else expression. Once those types are computed (by further applying static inference rules recursively to the expressions on each side), then the type of the expression below the line can be computed. This example illustrates a general feature of the [XPath/XQuery] type system: the type of an expression depends only on the type of its sub-expressions. The overall static type inference algorithm is recursive, following the abstract syntax of the [expression/query]. At each point in the recursion, an appropriate matching inference rule is sought; if at any point there is no applicable rule, then static type inference has failed and the [expression/query] is not type correct.
The dynamic, or operational, semantics is specified using value inference rules, which relate [XPath/XQuery] expressions to values, and in some cases specify the order in which an [XPath/XQuery] expression is evaluated.
Notation
The judgment
holds when, in the static environment statEnv and dynamic environment dynEnv, the expression Expr yields the value Value.
The static environment is used in certain cases (e.g. for type matching) during evaluation. To keep the notation simpler, the static environment is not written in the dynamic inference rules, but it is assumed to be available.
Example
For instance, the following expression.
let $v := 3 return $v+5
yields the integer value 8. This can be inferred as follows: the input literals '3' and '5' denote the values 3 and 5, respectively, so the variable $v has the value 3. Since the sum of 3 and 5 is 8, the complete expression has the value 8.
Note
As with static type inference, logical inference rules are used to determine the value of each expression, given the dynamic environment and the values of its sub-expressions.
The inference rules used for dynamic evaluation, like those for static type inference, follow a bottom-up recursive structure, computing the value of expressions from the values of their sub-expressions.
Expressions can raise errors during static analysis or dynamic
evaluation. The [Functions and Operators] [XQuery 1.0: A Query Language for XML], and [XML Path Language (XPath) 2.0]
specify the conditions under which an expression or operator
raises an error. The user may raise an error explicitly by
calling the fn:error function, which takes an optional item as
an argument.
This document does not describe formally the conditions under which dynamic errors are raised. Notably, it does not specify the error codes or the rules about errors and optimization, as described in [XQuery 1.0: A Query Language for XML]. Instead, this document describe the rules necessary to statically detect the subset of the [XPath/XQuery] dynamic errors known as type errorXQ.
[XPath/XQuery] is most generally used to process documents. The representation of a document is normatively defined in [Data Model]. The functions used to access documents and collections are normatively defined in [Functions and Operators].
Document order is defined in [Data Model].
Atomization converts an item sequence into a sequence of
atomic values and is implemented by the fn:data function.
Atomization is applied to a value when the value is used in a
context in which a sequence of atomic values is required.
If a sequence of items is encountered where a boolean value
is expected, the item sequence's effective boolean value is
used. The fn:boolean function returns the effective boolean
value of an item sequence.
[XPath/XQuery] has a set of functions that provide access to input data. These functions are of particular importance because they provide a way in which an expression can reference a document or a collection of documents. The dynamic semantics of these three input functions are described in more detail in [Functions and Operators].
In certain places in the XQuery grammar, a statically known valid absolute URI is required. These places are denoted by the grammatical symbol URILiteral, and are treated as described in [XQuery 1.0: A Query Language for XML].
All the built-in types of XML Schema are recognized by
[XPath/XQuery]. In addition, [XPath/XQuery] recognizes the predefined
types xdt:anyAtomicType, xdt:untypedAtomic and xdt:untyped
and the duration subtypes xdt:yearMonthDuration and
xdt:dayTimeDuration . The definition of those types in the
[XPath/XQuery] type system is given below.
[Definition: The following
type definition of xs:anyType
reflects the semantics of the Ur type from Schema in the
[XPath/XQuery] type system.]
define type xs:anyType restricts xs:anyType {
attribute * of type xs:anySimpleType*,
( xdt:anyAtomicType* | ( element * of type xs:anyType | text | comment | processing-instruction )* )
}
[Definition:
The following type definition of
xs:anySimpleType reflects the
semantics of the Ur simple type from Schema in the [XPath/XQuery]
type system.]
define type xs:anySimpleType restricts xs:anyType {
xdt:anyAtomicType*
}
The name of the Ur simple type is xs:anySimpleType. It is
derived by restriction from xs:anyType, its content is a
sequence any atomic types.
[Definition: The following type definition
of xdt:anyAtomicType reflects
the semantics of xdt:anyAtomicType in the [XPath/XQuery] type
system.]
define type xdt:anyAtomicType restricts xs:anySimpleType {
( xs:string
| xs:boolean
| xs:decimal
| xs:float
| xs:double
| xs:duration
| xs:dateTime
| xs:time
| xs:date
| xs:gYearMonth
| xs:gYear
| xs:gMonthDay
| xs:gDay
| xs:gMonth
| xs:hexBinary
| xs:base64Binary
| xs:anyURI
| xs:QName
| xs:NOTATION
| xdt:untypedAtomic )
}
[Definition: The following type definitions of the XML Schema primitive types reflect the semantics of the primitive types from Schema in the [XPath/XQuery] type system.]
define type xs:string restricts xdt:anyAtomicType define type xs:boolean restricts xdt:anyAtomicType define type xs:decimal restricts xdt:anyAtomicType define type xs:float restricts xdt:anyAtomicType define type xs:double restricts xdt:anyAtomicType define type xs:duration restricts xdt:anyAtomicType define type xs:dateTime restricts xdt:anyAtomicType define type xs:time restricts xdt:anyAtomicType define type xs:date restricts xdt:anyAtomicType define type xs:gYearMonth restricts xdt:anyAtomicType define type xs:gYear restricts xdt:anyAtomicType define type xs:gMonthDay restricts xdt:anyAtomicType define type xs:gDay restricts xdt:anyAtomicType define type xs:gMonth restricts xdt:anyAtomicType define type xs:hexBinary restricts xdt:anyAtomicType define type xs:base64Binary restricts xdt:anyAtomicType define type xs:anyURI restricts xdt:anyAtomicType define type xs:QName restricts xdt:anyAtomicType define type xs:NOTATION restricts xdt:anyAtomicType
All of those primitive types derive from
xdt:anyAtomicType. Note that the value space of each atomic
type (such as xs:string) does not appear. The value space for
each type is built-in and is as defined in [Schema Part 2].
[Definition: The type
xdt:untypedAtomic is defined as
follows.]
define type xdt:untypedAtomic restricts xdt:anyAtomicType
Note that this rule does not indicate the value space of
xdt:untypedAtomic. By definition, xdt:untypedAtomic has the
same value space as xs:string.
The following example shows two atomic values. The first one is a value of type string containing "Database". The second one is an untyped atomic value containing "Database".
"Databases" of type xs:string "Databases" of type xdt:untypedAtomic
[Definition: The type
xdt:untyped is defined as
follows.]
define type xdt:untyped restricts xs:anyType {
attribute * of type xdt:untypedAtomic*,
( element * of type xdt:untyped | text | comment | processing-instruction )*
}
[Definition: The following type definitions of the XML Schema derived types reflect the semantics of the XML Schema types derived by restriction from another atomic type.]
define type xs:normalizedString restricts xs:string define type xs:token restricts xs:normalizedString define type xs:language restricts xs:token define type xs:NMTOKEN restricts xs:token define type xs:Name restricts xs:token define type xs:NCName restricts xs:Name define type xs:ID restricts xs:Name define type xs:IDREF restricts xs:Name define type xs:ENTITY restricts xs:Name define type xs:integer restricts xs:decimal define type xs:nonPositiveInteger restricts xs:integer define type xs:negativeInteger restricts xs:nonPositiveInteger define type xs:long restricts xs:integer define type xs:int restricts xs:long define type xs:short restricts xs:int define type xs:byte restricts xs:short define type xs:nonNegativeInteger restricts xs:integer define type xs:unsignedLong restricts xs:nonNegativeInteger define type xs:unsignedInt restricts xs:unsignedLong define type xs:unsignedShort restricts xs:unsignedInt define type xs:unsignedByte restricts xs:unsignedShort define type xs:positiveInteger restricts xs:nonNegativeInteger
Three XML Schema built-in derived types are derived by list,
as follows. Note that those derive directly from
xs:anySimpleType, since they are derived by list, and that
their value space is defined using a "one or more"
occurrence indicator.
define type xs:NMTOKENS restricts xs:anySimpleType { xs:NMTOKEN+ }
define type xs:IDREFS restricts xs:anySimpleType { xs:IDREF+ }
define type xs:ENTITIES restricts xs:anySimpleType { xs:ENTITY+ }
For example, here is an element whose content is of type
xs:IDREFS.
element a of type xs:IDREFS {
"id1" of type xs:IDREF,
"id2" of type xs:IDREF,
"id3" of type xs:IDREF
}
Note that the type name xs:IDREFS derives from
xs:anySimpleType, but not from xs:IDREF. As a consequence,
calling the following three XQuery functions with the element
a as a parameter succeeds for f1 and
f2, but raises a type error for
f3.
declare function f1($x as element(*,xs:anySimpleType)) { $x }
declare function f2($x as element(*,xs:IDREFS)) { $x }
declare function f3($x as element(*,xs:IDREF)) { $x }
[Definition: The totally ordered duration types,
xdt:yearMonthDuration and xdt:dayTimeDuration , are derived
by restriction from xs:duration.]
define type xdt:yearMonthDuration restricts xs:duration define type xdt:dayTimeDuration restricts xs:duration
[Definition: In addition,
the Formal Semantics uses the additional type fs:numeric. This
type is necessary for the specification of some of XPath type
conversion rules. It is defined as follows.]
define type fs:numeric restricts xdt:anyAtomicType { xs:decimal | xs:float | xs:double }
The typed value of a node is computed by the fn:data
function, and the string value of a node is computed by the
fn:string function, defined in [Functions and Operators].
The normative definitions of typed value and string value are defined in [Data Model].
Introduction
Sequence types can be used in [XPath/XQuery] to refer to an XML Schema type. Sequence types are used to declare the types of function parameters and in several [XPath/XQuery] expressions.
The syntax of sequence types is described by the following grammar productions.
| [119 (XQuery)] | SequenceTypeXQ | ::= | ("empty-sequence" "(" ")") |
| [121 (XQuery)] | ItemTypeXQ | ::= | KindTest | ("item" "(" ")") | AtomicType |
| [120 (XQuery)] | OccurrenceIndicatorXQ | ::= | "?" | "*" | "+" |
| [122 (XQuery)] | AtomicTypeXQ | ::= | QName |
| [123 (XQuery)] | KindTestXQ | ::= | DocumentTest |
| [125 (XQuery)] | DocumentTestXQ | ::= | "document-node" "(" (ElementTest | SchemaElementTest)? ")" |
| [133 (XQuery)] | ElementTestXQ | ::= | "element" "(" (ElementNameOrWildcard ("," TypeName "?"?)?)? ")" |
| [135 (XQuery)] | SchemaElementTestXQ | ::= | "schema-element" "(" ElementDeclaration ")" |
| [136 (XQuery)] | ElementDeclarationXQ | ::= | ElementName |
| [129 (XQuery)] | AttributeTestXQ | ::= | "attribute" "(" (AttribNameOrWildcard ("," TypeName)?)? ")" |
| [131 (XQuery)] | SchemaAttributeTestXQ | ::= | "schema-attribute" "(" AttributeDeclaration ")" |
| [132 (XQuery)] | AttributeDeclarationXQ | ::= | AttributeName |
| [134 (XQuery)] | ElementNameOrWildcardXQ | ::= | ElementName | "*" |
| [138 (XQuery)] | ElementNameXQ | ::= | QName |
| [130 (XQuery)] | AttribNameOrWildcardXQ | ::= | AttributeName | "*" |
| [137 (XQuery)] | AttributeNameXQ | ::= | QName |
| [139 (XQuery)] | TypeNameXQ | ::= | QName |
| [128 (XQuery)] | PITestXQ | ::= | "processing-instruction" "(" (NCName | StringLiteral)? ")" |
| [127 (XQuery)] | CommentTestXQ | ::= | "comment" "(" ")" |
| [126 (XQuery)] | TextTestXQ | ::= | "text" "(" ")" |
| [124 (XQuery)] | AnyKindTestXQ | ::= | "node" "(" ")" |
Core Grammar
The Core grammar productions for sequence types are:
| [83 (Core)] | SequenceType | ::= | ("empty-sequence" "(" ")") |
| [85 (Core)] | ItemType | ::= | KindTest | ("item" "(" ")") | AtomicType |
| [84 (Core)] | OccurrenceIndicator | ::= | "?" | "*" | "+" |
| [86 (Core)] | AtomicType | ::= | QName |
| [87 (Core)] | KindTest | ::= | DocumentTest |
| [89 (Core)] | DocumentTest | ::= | "document-node" "(" (ElementTest | SchemaElementTest)? ")" |
| [97 (Core)] | ElementTest | ::= | "element" "(" (ElementNameOrWildcard ("," TypeName "?"?)?)? ")" |
| [99 (Core)] | SchemaElementTest | ::= | "schema-element" "(" ElementDeclaration ")" |
| [100 (Core)] | ElementDeclaration | ::= | ElementName |
| [93 (Core)] | AttributeTest | ::= | "attribute" "(" (AttribNameOrWildcard ("," TypeName)?)? ")" |
| [95 (Core)] | SchemaAttributeTest | ::= | "schema-attribute" "(" AttributeDeclaration ")" |
| [96 (Core)] | AttributeDeclaration | ::= | AttributeName |
| [98 (Core)] | ElementNameOrWildcard | ::= | ElementName | "*" |
| [102 (Core)] | ElementName | ::= | QName |
| [94 (Core)] | AttribNameOrWildcard | ::= | AttributeName | "*" |
| [101 (Core)] | AttributeName | ::= | QName |
| [103 (Core)] | TypeName | ::= | QName |
| [92 (Core)] | PITest | ::= | "processing-instruction" "(" (NCName | StringLiteral)? ")" |
| [91 (Core)] | CommentTest | ::= | "comment" "(" ")" |
| [90 (Core)] | TextTest | ::= | "text" "(" ")" |
| [88 (Core)] | AnyKindTest | ::= | "node" "(" ")" |
The semantics of SequenceTypes is defined by means of normalization rules from SequenceTypes into types in the [XPath/XQuery] type system (See [2.4 The [XPath/XQuery] Type System]).
However, the [XPath/XQuery] type system not being part of the [XPath/XQuery] syntax, the SequenceType syntax is still part of the [XPath/XQuery] Core. Normalization from SequenceTypes to types is not applied during the normalization phase but whenever a dynamic or static rule requires it. Normalization of SequenceTypes is the only example of normalization that does not yield an expression in the [XPath/XQuery] Core and that occurs on-demand in dynamic or static rules.
Introduction
During processing of a query, it is sometimes necessary to determine whether a given value matches a type that was declared using the SequenceType syntax. This process is known as SequenceType matching, and is formally specified in [8.3 Judgments for type matching].
Notation
To define normalization of SequenceTypes to the [XPath/XQuery] type system, the following auxiliary mapping rule is used.
| [SequenceType]sequencetype |
| == |
| Type |
specifies that SequenceType is mapped to a Type, in the [XPath/XQuery] type system.
OccurenceIndicators are left unchanged when normalizing SequenceTypes into [XPath/XQuery] types. Each kind of SequenceType component is normalized separately into the [XPath/XQuery] type system.
| [ItemType OccurrenceIndicator]sequencetype |
| == |
| [ItemType]sequencetype OccurrenceIndicator |
The "empty-sequence()" sequence type is mapped to the empty type.
| [empty-sequence()]sequencetype |
| == |
| empty |
An atomic type is normalized to itself in the [XPath/XQuery] type system.
An "element" SequenceType without content or with a wildcard and no type name is normalized into a wildcard element type.
| [element()]sequencetype |
| == |
element * of type xs:anyType |
| [element(*)]sequencetype |
| == |
element * of type xs:anyType |
An "element" SequenceType with a wildcard and a type name is normalized into a wildcard element type with a corresponding type name. The presence of a "?" after the type name indicates a nillable element.
| [element(*,TypeName)]sequencetype |
| == |
| element * of type TypeName |
| [element(*,TypeName?)]sequencetype |
| == |
| element * nillable of type TypeName |
An "element" SequenceType with a name and a type name is normalized into an element type with a corresponding type name. The presence of a "?" after the type name indicates a nillable element.
| [element(ElementName,TypeName)]sequencetype |
| == |
| element ElementName of type TypeName |
| [element(ElementName,TypeName?)]sequencetype |
| == |
| element ElementName nillable of type TypeName |
An "element" SequenceType with only a name is normalized into a nillable element type with a corresponding name. The reason for the normalization to allow nillable elements is because the semantics of SequenceTypes in that case allows it to match every possible element with that names, regardless of its type or nilled property.
| [element(ElementName)]sequencetype |
| == |
element ElementName nillable of type xs:anyType
|
A "schema-element" SequenceType with an element declaration is normalized into a reference to the corresponding global element declaration.
| [schema-element(ElementName)]sequencetype |
| == |
| element ElementName |
An "attribute" SequenceType without content or with a wildcard and no type name is normalized into a wildcard attribute type.
| [attribute()]sequencetype |
| == |
attribute * of type xs:anySimpleType |
| [attribute(*)]sequencetype |
| == |
attribute * of type xs:anySimpleType |
An "attribute" SequenceType with a wildcard and a type name is normalized into a wildcard attribute type with a corresponding type name.
| [attribute(*,TypeName)]sequencetype |
| == |
| attribute * of type TypeName |
An "attribute" SequenceType with a name and a type name is normalized into an attribute type with a corresponding type name.
| [attribute(AttributeName,TypeName)]sequencetype |
| == |
| attribute AttributeName of type TypeName |
A "schema-attribute" SequenceType with an attribute declaration is normalized into a reference to the corresponding global attribute declaration.
| [schema-attribute(AttributeName)]sequencetype |
| == |
| attribute AttributeName |
A "document-node()" sequence types is normalized into the corresponding document type.
| [document-node()]sequencetype |
| == |
document { (element * of type xs:anyType |
A "document-node" sequence type with an element test (resp. a schema element test) is normalized into the corresponding document type, whose content is the normalization of the element test (resp. schema element test), interleaved with an arbitrary sequence of processing instruction, comment, and text nodes.
| [document-node(ElementTest)]sequencetype |
| == |
| document { [ElementTest]sequencetype & ( processing-instruction | comment ) *} |
| [document-node(SchemaElementTest)]sequencetype |
| == |
| document { [SchemaElementTest]sequencetype & ( processing-instruction | comment ) *} |
A "processing-instruction()" SequenceType is normalized into the corresponding processing-instruction type.
| [processing-instruction()]sequencetype |
| == |
| processing-instruction |
The [XPath/XQuery] type system does not model the target of a processing-instruction, which is treated as a dynamic property. Therefore a "processing-instruction" SequenceType with a string or NCName parameter is normalized into an optional processing-instruction type.
| [processing-instruction(StringLiteral)]sequencetype |
| == |
| processing-instruction? |
| [processing-instruction(NCName)]sequencetype |
| == |
| processing-instruction? |
A "comment()" SequenceType is normalized into the corresponding comment type.
| [comment()]sequencetype |
| == |
| comment |
A "text()" SequenceType is normalized into the corresponding text type.
| [text()]sequencetype |
| == |
| text |
The "node()" SequenceType denotes any node. It is normalized into a choice between the corresponding wildcard types for each kind of node.
| [node()]sequencetype |
| == |
(element * of type xs:anyTypexs:anySimpleTypexs:anyType |
The "item()" SequenceType denotes any node or atomic value. It is normalized into a choice between the corresponding wildcard types for each kind of nodes or atomic values.
| [item()]sequencetype |
| == |
(element * of type xs:anyTypexs:anySimpleTypexs:anyTypexdt:anyAtomicType )
|
| [151 (XQuery)] | CommentXQ | ::= | "(:" (CommentContents | Comment)* ":)" |
| [159 (XQuery)] | CommentContentsXQ | ::= | (Char+ - (Char* ('(:' | ':)') Char*)) |
Comments are lexical constructs only, and have no effect on the meaning of the query, and therefore do not have any formal semantics.
The following terminals are defined by XML.
| [152 (XQuery)] | PITarget | ::= | [http://www.w3.org/TR/REC-xml#NT-PITarget]XML |
| [153 (XQuery)] | CharRef | ::= | [http://www.w3.org/TR/REC-xml#NT-CharRef]XML |
| [154 (XQuery)] | QName | ::= | [http://www.w3.org/TR/REC-xml-names/#NT-QName]Names |
| [155 (XQuery)] | NCName | ::= | [http://www.w3.org/TR/REC-xml-names/#NT-NCName]Names |
| [156 (XQuery)] | S | ::= | [http://www.w3.org/TR/REC-xml#NT-S]XML |
| [157 (XQuery)] | Char | ::= | [http://www.w3.org/TR/REC-xml#NT-Char]XML |
This section gives the semantics of all the [XPath/XQuery] expressions. The organization of this section parallels the organization of Section 3 ExpressionsXQ.
| [31 (XQuery)] | ExprXQ | ::= | ExprSingle ("," ExprSingle)* |
| [32 (XQuery)] | ExprSingleXQ | ::= | FLWORExpr |
| [1 (XPath)] | XPathXP | ::= | Expr |
For each expression, a short description and the relevant grammar productions are given. The semantics of an expression includes the normalization, static analysis, and dynamic evaluation phases. Recall that normalization rules translate [XPath/XQuery] syntax into Core syntax. In the sections that contain normalization rules, the Core grammar productions into which the expression is normalized are also provided. After normalization, sections on static type inference and dynamic evaluation define the static type and dynamic value for the Core expression.
Core Grammar
The Core grammar productions for expressions are:
| [30 (Core)] | Expr | ::= | ExprSingle ("," ExprSingle)* |
| [31 (Core)] | ExprSingle | ::= | FLWORExpr |
It is a static type error for any expression to have the empty type, except for the following expressions and functions:
Empty parenthesis (), which denote the
empty sequence.
The fn:data function and all functions in the
fs namespace applied to empty
parenthesis().
Any function which returns the empty type.
The rule below enforces the above constraints. It is a static type error, if the following conditions hold for a given expression Expr.
| ||||
|
In general, static type errors are raised whenever there is no static type inference rules which can compute the type of a given expression. This is the reason for the absence of a formal post-condition in this rules. There is indeed a rule that infers the type for expression Expr, however the inferred type is empty and still a static type error must be raised.
Example
The above rule is useful in catching common mistakes, such as the misspelling of an element or attribute name or referencing of an element or attribute that does not exist. For instance, the following path expression
$x/title
raises a static type error if the type of variable
$x does not include any
title children elements.
Primary expressions are the basic primitives of the language. Theyinclude literals, variables, function calls, and the parenthesized expressions.
| [84 (XQuery)] | PrimaryExprXQ | ::= | Literal | VarRef | ParenthesizedExpr | ContextItemExpr | FunctionCall | OrderedExpr | UnorderedExpr | Constructor |
Core Grammar
The Core grammar production for primary expressions is:
| [63 (Core)] | PrimaryExpr | ::= | Literal | VarRef | ParenthesizedExpr | FunctionCall | Constructor |
Introduction
A literal is a direct syntactic representation of an atomic value. [XPath/XQuery] supports two kinds of literals: string literals and numeric literals.
| [85 (XQuery)] | LiteralXQ | ::= | NumericLiteral | StringLiteral |
| [86 (XQuery)] | NumericLiteralXQ | ::= | IntegerLiteral | DecimalLiteral | DoubleLiteral |
| [141 (XQuery)] | IntegerLiteralXQ | ::= | Digits |
| [142 (XQuery)] | DecimalLiteralXQ | ::= | ("." Digits) | (Digits "." [0-9]*) |
| [143 (XQuery)] | DoubleLiteralXQ | ::= | (("." Digits) | (Digits ("." [0-9]*)?)) [eE] [+-]? Digits |
| [144 (XQuery)] | StringLiteralXQ | ::= | ('"' (PredefinedEntityRef | CharRef | EscapeQuot | [^"&])* '"') | ("'" (PredefinedEntityRef | CharRef | EscapeApos | [^'&])* "'") |
| [144 (XQuery)a] | URILiteral | ::= | StringLiteral |
| [145 (XQuery)] | PredefinedEntityRefXQ | ::= | "&" ("lt" | "gt" | "amp" | "quot" | "apos") ";" |
| [158 (XQuery)] | DigitsXQ | ::= | [0-9]+ |
Core Grammar
The Core grammar productions for literals are:
| [64 (Core)] | Literal | ::= | NumericLiteral | StringLiteral |
| [65 (Core)] | NumericLiteral | ::= | IntegerLiteral | DecimalLiteral | DoubleLiteral |
| [105 (Core)] | IntegerLiteral | ::= | Digits |
| [106 (Core)] | DecimalLiteral | ::= | ("." Digits) | (Digits "." [0-9]*) |
| [107 (Core)] | DoubleLiteral | ::= | (("." Digits) | (Digits ("." [0-9]*)?)) [eE] [+-]? Digits |
| [108 (Core)] | StringLiteral | ::= | ('"' (EscapeQuot | [^"])* '"') | ("'" (EscapeApos | [^'])* "'") |
| [119 (Core)] | Digits | ::= | [0-9]+ |
Notation
To define the dynamic semantics of literals, we introduce the following auxiliary judgments.
The judgment
Holds if the literal expression LiteralExpr corresponds to the value AtomicValue. This judgment yields an atomic value, according to the rules described in [XQuery 1.0: A Query Language for XML]. Notably, this judgment deals with handling of literal overflows for numeric literals, and handling of character references, and predefined entity references for string literals.
Literalsare left unchanged through normalization.
| [IntegerLiteral]Expr |
| == |
| IntegerLiteral |
| [DecimalLiteral]PrologDecl |
| == |
| DecimalLiteral |
| [DoubleLiteral]Expr |
| == |
| DoubleLiteral |
| [StringLiteral]Expr |
| == |
| StringLiteral |
In the static semantics, the type of an integer literal is simply xs:integer:
In the dynamic semantics, an integer literal is evaluated by constructing an atomic value in the data model, which consists of the literal value and its type:
| ||
|
The formal definitions of decimal, double, and string literals are analogous to those for integer.
| ||
|
| ||
|
| ||
|
Introduction
A variable evaluates to the value to which the variable's QName is bound in the dynamic context.
| [87 (XQuery)] | VarRefXQ | ::= | "$" VarName |
| [88 (XQuery)] | VarNameXQ | ::= | QName |
Core Grammar
The Core grammar productions for variable references are:
| [66 (Core)] | VarRef | ::= | "$" VarName |
| [67 (Core)] | VarName | ::= | QName |
Variablereferences are left unchanged through normalization.
In the static semantics, the type of a variable is simply its type in the static environment statEnv.varType:
| |||
|
If the variable is not bound in the static environment, a static type error is raised.
In the dynamic semantics, a locally declared variable is evaluated by "looking up" its value in dynEnv.varValue:
| |||
|
In the dynamic semantics, a reference to a variable imported from a module is evaluated by accessing the dynamic context of the module in which the variable is declared.
| |||||
|
| [89 (XQuery)] | ParenthesizedExprXQ | ::= | "(" Expr? ")" |
Core Grammar
The Core grammar production for parenthesized expressions is:
| [68 (Core)] | ParenthesizedExpr | ::= | "(" Expr? ")" |
Empty parenthesis () always have the empty
type. Remember that it is a static type error for most expressions
other than () to have the empty type (see [4 Expressions] for the complete rule.)
Empty parenthesis () evaluate to the empty
sequence.
| [90 (XQuery)] | ContextItemExprXQ | ::= | "." |
Introduction
A context item expression evaluates to the context item, which may be either a node or an atomic value.
A context item expression is normalized to the
built-in
variable $fs:dot. Because it can only be bound through the
external context or a path expression, there is no need for
a specific typing rule to enforce that its value is a
singleton item.
Introduction
A function call consists of a QName followed by a parenthesized list of zero or more expressions. In [XPath/XQuery], the actual argument to a function is called an argument and the formal argument of a function is called a parameter. We use the same terminology here.
| [93 (XQuery)] | FunctionCallXQ | ::= | QName "(" (ExprSingle ("," ExprSingle)*)? ")" |
Because [XPath/XQuery] implicitly converts the values of function arguments, a normalization step is required.
Core Grammar
The Core grammar production for function calls is:
| [71 (Core)] | FunctionCall | ::= | QName "(" (ExprSingle ("," ExprSingle)*)? ")" |
Notation
Normalization of function calls uses an auxiliary mapping []FunctionArgument(SequenceType) used to insert conversions of function arguments that depend only on the expected SequenceType of the corresponding parameters. It is defined as follows:
| [Expr]FunctionArgument(SequenceType) |
| == |
| [[[Expr]Expr]AtomizeAtomic(SequenceType)]Convert(SequenceType) |
where
[Expr]AtomizeAtomic(SequenceType) denotes
fn:data(Expr) | If
[SequenceType]sequencetype
<: xdt:anyAtomicType* | |
| Expr | Otherwise |
which specifies that if the function expects atomic
parameters, then fn:data is called to obtain them.
[Expr]Convert(SequenceType) denotes
fs:convert-simple-operand(Expr,PrototypicalValue) | If
[SequenceType]sequencetype
<: xdt:anyAtomicType* |
| Expr | Otherwise |
where PrototypicalValue is a built-in
atomic
value used to encode the expected atomic type (for instance
the value 1.0 if the expected type
is
xs:decimal). A value is used here since [XPath/XQuery]
expressions cannot operate directly on types. Which value is
chosen does not have any impact on the actual semantics,
only its actual atomic type matters.
Note
The fs:convert-simple-operand function takes a
PrototypicalValue, which is a value of the target
type, to ensure that conversion to base types is possible
even though types are not first class objects in
[XPath/XQuery].
Each argument expression in a function call is normalized to its corresponding Core expression by applying []FunctionArgument(SequenceType) for each argument with the expected SequenceType for the argument inserted.
| [ QName (Expr1, ..., Exprn) ]Expr |
| == |
| QName ( [Expr1]FunctionArgument(SequenceType1), ..., [Exprn]FunctionArgument(SequenceTypen) ) |
Note that this normalization rule depends on the function signatures, which is used to get the types of the function parameters (SequenceType1,...,SequenceTypen). For user-defined functions, the function signature can be obtained from the XQuery prolog where the function is declared. For built-in functions, the signature is given in the [Functions and Operators] document. For overloaded built-in functions, several signatures may exists, however, because they all correspond to sequences of atomic values, they all result in the same normalization.
Different sets of static typing rules are used to type check function calls depending on which of the following categories the belong to: overloaded internal functions, built-in functions with a specific typing rule, and other built-in and user-defined functions.
The following rule is common to all those categories, and is used to bootstrap type inference, by first looking-up the expanded QName for the function, then applying the appropriate set of inference rule depending on the category in which the function is.
| ||||||
|
The following depends on the kind of function call.
If the expanded QName for the function corresponds to one of the overloaded internal fs: functions listed in [B.2 Mapping of Overloaded Internal Functions], the rules in [B.2 Mapping of Overloaded Internal Functions] are applied.
If the expanded QName for the function corresponds to one of the built-in functions with a specialized typing rule, listed in [7 Additional Semantics of Functions], the rules in [7 Additional Semantics of Functions] are applied.
Otherwise, the following general rule is applied.
The rule looks up the function in the static environment and checks that some signature for the function satisfies the following constraint: the type of each actual argument is a subtype of some type that can be promoted to the type of the corresponding function parameter. In this case, the function call is well typed and the result type is the return type specified in the function's signature.
| |||||
|
The function body itself is not analyzed for each invocation: static typing of the function definition itself guarantees that the function body always returns a value of the declared return type.
Notice that the static context contains at most one function declaration for each function. This is possible since the treatment of overloaded operators is done through a set of specific rules which do not require access to the environment. See [B.2 Mapping of Overloaded Internal Functions].
Notation
The following auxiliary judgment
holdswhen applying the function with expandedQName expanded-QName,and parameters of type (Type1,...,Typen)onthe values(Value1,...,Valuen)yields the value Value.
Thatjudgment is definedbelow for each kind of function (user-defined,built-in, external, and importedfunctions).
The followingrule appliesto allthe differentkinds of functionsusing the previouslydefined judgment.
| ||||||||||||
|
Firstthe function name is expanded, and the expandedname is usedto retrieve the function signature from the staticenvironment. Then, the rule evaluates each function argument expression,and the resulting values are promoted accordingto the expectedtype for the function. The result of evaluatingthe functionis obtained through the auxiliary judgment previously defined,and the resulting value is promoted according to theexpected return type.
In case the function is a user definedfunction in a main module, the expression body is retrieved from the dynamicenvironment and used to compute the value of the function. The rule extends dynEnv.varValueby binding each formal variable to its corresponding value, and evaluates thebody of the function in the new environment. The resulting valueis the value of the function call.
| ||||
|
Note that the function body is evaluated in the dynamic environment containing the main module declarations.
The rule for evaluating an function imported from a module is similar to that for evaluating a user-defined function in a main module, except that the function call is evaluated in the dynamic context of the module in which it is declared, and that the appropriate additional type matching must be performed.
| |||||||||||
|
If the function is a built-in function (resp. special formal semantics function),the value returned by thefunction is the one specified in [Functions and Operators] (resp.[7 Additional Semantics of Functions]).
| |||
|
If the function is an external function, the value returned by the function is implementation-defined.
| |||
|
Introduction
Path expressions are used to locate nodes within a tree. There are two kinds of path expressions, absolute path expressions and relative path expressions. An absolute path expression is a rooted relative path expression. A relative path expression is composed of a sequence of steps.
| [68 (XQuery)] | PathExprXQ | ::= | ("/" RelativePathExpr?) |
| [69 (XQuery)] | RelativePathExprXQ | ::= | StepExpr (("/" | "//") StepExpr)* |
Core Grammar
PathExpr and RelativePathExpr are fully normalized, therefore they have no corresponding productions in the Core. The grammar for path expressions in the Core starts with the StepExpr production.
Absolute path expressions are path expressions starting with
the / or // symbols, indicating that
the expression must be applied on the root node in the current
context. The root node in the current context is the greatest
ancestor of the context node. The following two rules normalize
absolute path expressions to relative ones. They use the
fn:root function, which returns the greatest ancestor of its
argument node. The treat expressions guarantee that the value
bound to the context variable $fs:dot is a document node.
| [/]Expr |
| == |
(fn:root(self::node()) treat as document-node())
|
| [/ RelativePathExpr]Expr |
| == |
[((fn:root(self::node())) treat as document-node()) /
RelativePathExpr]Expr
|
| [// RelativePathExpr]Expr |
| == |
[((fn:root(self::node())) treat as document-node) /
descendant-or-self::node() /
RelativePathExpr]Expr
|
| [ StepExpr // RelativePathExpr ]Expr |
| == |
| [StepExpr / descendant-or-self::node() / RelativePathExpr]Expr |
A composite relative path expression (using /)
is normalized into a for expression by
concatenating the sequences obtained by mapping each node of the
left-hand side in document order to the sequence it generates on
the right-hand side. The call to the fs:distinct-doc-order
function ensures that the result is in document order without
duplicates. The dynamic context is defined by binding the
$fs:dot,
$fs:sequence,
$fs:position and
$fs:last variables.
Note that sorting by document order enforces the restriction that input and output sequences contains only nodes, and that the last step in a path expression may actually return atomic values.
| [StepExpr / RelativePathExpr]Expr | |||||||
| == | |||||||
|
Note that this section uses some auxiliary judgments which are defined in [8.2 Judgments for step expressions and filtering].
Introduction
| [70 (XQuery)] | StepExprXQ | ::= | FilterExpr | AxisStep |
| [71 (XQuery)] | AxisStepXQ | ::= | (ReverseStep | ForwardStep) PredicateList |
| [72 (XQuery)] | ForwardStepXQ | ::= | (ForwardAxis NodeTest) | AbbrevForwardStep |
| [75 (XQuery)] | ReverseStepXQ | ::= | (ReverseAxis NodeTest) | AbbrevReverseStep |
| [82 (XQuery)] | PredicateListXQ | ::= | Predicate* |
Core Grammar
The Core grammar productions for XPath steps are:
| [54 (Core)] | StepExpr | ::= | PrimaryExpr | AxisStep |
| [55 (Core)] | AxisStep | ::= | ReverseStep | ForwardStep |
| [56 (Core)] | ForwardStep | ::= | ForwardAxis NodeTest |
| [58 (Core)] | ReverseStep | ::= | ReverseAxis NodeTest |
Note
Step expressions can be followed by predicates. Normalization of predicates uses the following auxiliary mapping rule: []Predicates, which is specified in [4.2.2 Predicates]. Normalization for step expressions also uses the following auxiliary mapping rule: []Axis, which is specified in [4.2.1.1 Axes].
Normalization of predicates need to distinguish between forward steps, reverse steps, and primary expressions.
As explained in the [XPath/XQuery] document, applying a step in
XPath changes the focus (or context). The change of focus is
made explicit by the normalization rule below, which binds the
variable $fs:dot to the node currently being processed, and
the variable $fs:position to the position (i.e., the position
within the input sequence) of that node.
There are two sets of normalization rules for Predicates.
The first set of rules apply when the predicate is a numeric
literal or the expression last(). The second set
of rules apply to all predicate expressions other than numeric
literals and the expression last(). In the first
case, the normalization rules provides a more precise static
type than if the general rules were applied.
When the predicate expression is a numeric literal or the
fn:last function, the following normalization rules
apply.
| [ForwardStep PredicateList [ NumericLiteral ]]Expr | ||
| == | ||
|
[ForwardStep PredicateList [
fn:last() ]]Expr
| |||
| == | |||
|
When predicates are applied on a reverse step, the position variable is bound in reverse document order.
| [ReverseStep PredicateList [ NumericLiteral ]]Expr | ||||
| == | ||||
|
When the step is a reverse axis, then the last item in the context sequence is the first in document order.
[ReverseStep PredicateList [
fn:last() ]]Expr
| ||
| == | ||
|
The normalization rules above all use the function
fn:subsequence to select a particular item. The static
typing rules for this function are defined in [7.2.13 The fn:subsequence function].
When predicates are applied on a forward step, the input
sequence is first sorted in document order and duplicates are
removed. The context is changed by binding the $fs:dot
variable to each node in document order.
| [ForwardStep PredicateList [ Expr ]]Expr | ||||
| == | ||||
|
When predicates are applied on a reverse step, the input
sequence is first sorted in document order and duplicates are
removed. The context is changed by binding the $fs:dot
variable to each node in document order.
| [ReverseStep PredicateList [ Expr ]]Expr | |||||
| == | |||||
|
Finally, a stand-alone forward or reverse step is normalized by the auxiliary normalization rule for Axis.
| [ForwardStep]Expr |
| == |
fs:apply-ordering-mode([ForwardStep]Axis)
|
| [ReverseStep]Expr |
| == |
fs:apply-ordering-mode([ReverseStep]Axis)
|
The static semantics of an Axis NodeTest pair is obtained by retrieving the type of the context node, and applying the two filters (the Axis, and then the NodeTest with a PrincipalNodeKind) on the result.
| ||||||
| statEnv |- Axis NodeTest : Type3 |
Note
Note that the second judgment in the inference rule requires that the context item be a node, guaranteeing that a type error is raised when the context item is an atomic value.
The dynamic semantics of an Axis NodeTest pair is obtained by retrieving the context node, and applying the two filters (Axis, then NodeTest) on the result. The application of each filter is expressed through the filter judgment as follows.
| ||||||
dynEnv |-
Axis NodeTest => fs:distinct-doc-order(Value3)
|
Note
Note that the second judgment in the inference rule guarantees that the context item is bound to a node.
Introduction
The XQuery grammar for forward and reverse axis is as follows.
| [73 (XQuery)] | ForwardAxisXQ | ::= | ("child" "::") |
| [76 (XQuery)] | ReverseAxisXQ | ::= | ("parent" "::") |
In the case of XPath, forward axis also contain the
namespace:: axis.
| [30 (XPath)] | ForwardAxisXP | ::= | ("child" "::") |
Core Grammar
The Core grammar productions for XPath axis are:
| [57 (Core)] | ForwardAxis | ::= | ("child" "::") |
| [59 (Core)] | ReverseAxis | ::= | ("parent" "::") |
Notation
The normalization of axes uses the following auxiliary mapping rule: []Axis.
The normalization for all axes is specified as follows.
The semantics of the following(-sibling) and preceding(-sibling) axes are expressed by mapping them to Core expressions. All other axes are part of the Core and therefore are left unchanged through normalization.
[following-sibling::
NodeTest]Axis
|
| == |
| [let $e := . return $e/parent::node()/child:: NodeTest [.>>$e]]Expr |
[following::
NodeTest]Axis
|
| == |
[ancestor-or-self::node()/following-sibling::node()/descendant-or-self::NodeTest]Expr
|
All other forward axes are part of the Core [XPath/XQuery] and handled by the normalization rules below:
[child:: NodeTest]Axis
|
| == |
child:: NodeTest
|
[attribute:: NodeTest]Axis
|
| == |
attribute:: NodeTest
|
[self:: NodeTest]Axis
|
| == |
self:: NodeTest
|
[descendant:: NodeTest]Axis
|
| == |
descendant:: NodeTest
|
[descendant-or-self:: NodeTest]Axis
|
| == |
descendant-or-self:: NodeTest
|
[namespace:: NodeTest]Axis
|
| == |
namespace:: NodeTest
|
Reverse axes:
[preceding-sibling:: NodeTest]Axis
|
| == |
| [let $e := . return $e/parent::node()/child:: NodeTest [.<<$e]]Expr |
[preceding:: NodeTest]Axis
|
| == |
[ancestor-or-self::node()/preceding-sibling::node()/descendant-or-self::NodeTest]Expr
|
All other reverse axes are part of the Core [XPath/XQuery] and handled by the normalization rules below:
[parent:: NodeTest]Axis
|
| == |
parent:: NodeTest
|
[ancestor:: NodeTest]Axis
|
| == |
ancestor:: NodeTest
|
[ancestor-or-self:: NodeTest]Axis
|
| == |
ancestor-or-self:: NodeTest
|
Introduction
A node test is a condition applied on the nodes selected by an axis step. Node tests are described by the following grammar productions.
| [78 (XQuery)] | NodeTestXQ | ::= | KindTest | NameTest |
| [79 (XQuery)] | NameTestXQ | ::= | QName | Wildcard |
| [80 (XQuery)] | WildcardXQ | ::= | "*" |
Core Grammar
The Core grammar productions for node tests are:
| [60 (Core)] | NodeTest | ::= | KindTest | NameTest |
| [61 (Core)] | NameTest | ::= | QName | Wildcard |
| [62 (Core)] | Wildcard | ::= | "*" |
Notation
For convenience, we will use the grammar non-terminals Prefix, and LocalPart, both of which are NCNames, in some of the inference rules. They are defined by the following grammar productions.
| [18 (Formal)] | Prefix | ::= | NCName |
| [19 (Formal)] | LocalPart | ::= | NCName |
Introduction
A predicate consists of an expression, called a predicate expression, enclosed in square brackets.
| [83 (XQuery)] | PredicateXQ | ::= | "[" Expr "]" |
Notation
Normalization of predicates uses the following auxiliary mapping rule: []Predicates.
Predicates in path expressions are normalized with a special mapping rule:
| [Expr]Predicates | |||
| == | |||
|
Note that the semantics of predicates whose input
expression returns a numeric value also work if that value is
not an integer. In those cases the
op:numeric-equal returns false when compared to a
position. For example the expression //a[3.4]
returns the empty sequence)
The corresponding Section in the [XPath/XQuery] document just contains examples.
| [74 (XQuery)] | AbbrevForwardStepXQ | ::= | "@"? NodeTest |
| [77 (XQuery)] | AbbrevReverseStepXQ | ::= | ".." |
Here are normalization rules for the abbreviated syntax.
| [ @ NameTest ]Expr |
| == |
| attribute :: NameTest |
Introduction
[XPath/XQuery] supports operators to construct and combine sequences. A sequence is an ordered collection of zero or more items. An item is either an atomic value or a node.
| [31 (XQuery)] | ExprXQ | ::= | ExprSingle ("," ExprSingle)* |
| [49 (XQuery)] | RangeExprXQ | ::= | AdditiveExpr ( "to" AdditiveExpr )? |
Core Grammar
The Core grammar production for sequence expressions is:
| [30 (Core)] | Expr | ::= | ExprSingle ("," ExprSingle)* |
A sequence expression is normalized into a sequence of normalized single expressions:
The type of the sequence expression is the sequence over the types of the individual expressions.
Each expression in the sequence is evaluated and the resulting values are concatenated into one sequence.
The range operator is normalized to the fs:to
function.
The static semantics of the fs:to function is defined in
[Functions and Operators].
The dynamic semantics of the op:to operator is defined in [Functions and Operators].
Introduction
| [81 (XQuery)] | FilterExprXQ | ::= | PrimaryExpr PredicateList |
Core Grammar
There are no Core grammar productions for filter expressions as they are normalized to other Core expressions.
When a predicate with a numeric literal or the
last() expression is applied on a primary
expression, it is normalized using the fn:subsequence
function. This results in a more precise static type for those
cases.
| [PrimaryExpr PredicateList [ NumericLiteral ]]Expr | ||
| == | ||
|
| [PrimaryExpr PredicateList [ last() ]]Expr | ||
| == | ||
|
In the general case, when a predicate is applied on a primary expression, it is normalized to a FLWOR expression as follows. The input sequence is processed in sequence order and the context item is bound to each item in the input sequence.
| [PrimaryExpr PredicateList [ Expr ]]Expr | ||||
| == | ||||
|
There are no additional static type rules for filter expressions.
There are no additional dynamic evaluation rules for filter expressions.
[XPath/XQuery] provides several operators for combining sequences of nodes.
| [52 (XQuery)] | UnionExprXQ | ::= | IntersectExceptExpr ( ("union" | "|") IntersectExceptExpr )* |
| [53 (XQuery)] | IntersectExceptExprXQ | ::= | InstanceofExpr ( ("intersect" | "except") InstanceofExpr )* |
Notation
The union, intersect, and except expressions are normalized into function calls to the appropriate functions. The mapping function []SequenceOp is defined by the following table:
| SequenceOp | [SequenceOp]SequenceOp |
| "union" | op:union |
| "|" | op:union |
| "intersect" | op:intersect |
| "except" | op:except |
Operators for combining node sequences are normalized as follows.
| [Expr1 SequenceOp Expr2]Expr |
| == |
fs:apply-ordering-mode ([SequenceOp]SequenceOp (
[Expr1]Expr, [Expr2]Expr ))
|
The static semantics of the operators that combine sequences are defined in [7.2.14 The op:union, op:intersect, and op:except operators].
The dynamic semantics for function calls is given in [4.1.5 Function Calls].
[XPath/XQuery] provides arithmetic operators for addition, subtraction, multiplication, division, and modulus, in their usual binary and unary forms.
| [50 (XQuery)] | AdditiveExprXQ | ::= | MultiplicativeExpr ( ("+" | "-") MultiplicativeExpr )* |
| [51 (XQuery)] | MultiplicativeExprXQ | ::= | UnionExpr ( ("*" | "div" | "idiv" | "mod") UnionExpr )* |
| [58 (XQuery)] | UnaryExprXQ | ::= | ("-" | "+")* ValueExpr |
| [59 (XQuery)] | ValueExprXQ | ::= | ValidateExpr | PathExpr | ExtensionExpr |
Core Grammar
The Core grammar production for arithmetic expressions is:
| [48 (Core)] | ValueExpr | ::= | ValidateExpr | StepExpr | ExtensionExpr |
Notation
The mapping functions []ArithOp and UnaryArithOp are defined by the following tables:
| ArithOp | [ArithOp]ArithOp |
| "+" | fs:plus |
| "-" | fs:minus |
| "*" | fs:times |
| "div" | fs:div |
| "mod" | fs:mod |
| UnaryArithOp | [UnaryArithOp]UnaryArithOp |
| "+" | fs:unary-plus |
| "-" | fs:unary-minus |
Core Grammar
There are no Core grammar productions for arithmetic expressions as they are normalized to other Core expressions.
The normalization rules for all the arithmetic operators
except idiv first atomize each argument by applying
fn:data and then apply the internal function
fs:convert-operand to each argument. If the first argument to
this function has type xdt:untypedAtomic, then the first
argument is cast to a double, otherwise it is returned unchanged.
The overloaded internal function corresponding to the arithmetic
operator is then applied to the two converted arguments. The
table above maps the operators to the corresponding internal
function. The mapping from the overloaded internal functions to
the corresponding non-overloaded function is given in [B.2 Mapping of Overloaded Internal Functions].
| [Expr1 ArithOp Expr2]Expr | ||||
| == | ||||
|
The normalization rules for the idiv operator are similar,
but instead of casting arguments with type xdt:untypedAtomic to
xs:double, they are cast to xs:integer.
[Expr1 idiv Expr2]Expr
| ||||
| == | ||||
|
The unary operators are mapped similarly.
[+ Expr]Expr
|
| == |
fs:unary-plus(fs:convert-operand(fn:data(([Expr]Expr)), 1.0E0))
|
[- Expr]Expr
|
| == |
fs:unary-minus(0, fs:convert-operand(fn:data(([Expr]Expr)), 1.0E0))
|
The static semantics for function calls is given in [4.1.5 Function Calls].
The dynamic semantics for function calls is given in [4.1.5 Function Calls].