XQuery 1.0 and XPath 2.0 Data Model

1 Introduction

This document defines the XQuery 1.0 and XPath 2.0 Data Model, which is the data model of [XPath 2.0], [XSLT 2.0] and [XQuery 1.0: A Query Language for XML].

The XQuery 1.0 and XPath 2.0 Data Model (henceforth "data model") serves two purposes. First, it defines precisely the information contained in the input to an XSLT or XQuery processor. Second, it defines all permissible values of expressions in the XSLT, XQuery, and XPath languages. A language is closed with respect to a data model if the value of every expression in the language is guaranteed to be in the data model. XSLT 2.0, XQuery 1.0, and XPath 2.0 are all closed with respect to the data model.

The data model is based on the [XML Information Set] (henceforth "Infoset"), but it requires the following new features to meet the [XPath Requirements Version 2.0] and [XML Query Requirements]:

Support for XML Schema types. The XML Schema recommendations define features, such as structures ([XMLSchema Part 1]) and simple data types ([XMLSchema Part 2]), that extend the XML Information Set with precise type information.
Representation of collections of documents and of complex values. ([XML Query Requirements])

As with the Infoset, the XQuery 1.0 and XPath 2.0 Data Model specifies what information in the documents is accessible, but it does not specify the programming-language interfaces or bindings used to represent or access the data.

Every value handled by the data model is a sequence of zero or more items. An item is either a node or an atomic value. A node is defined in 4 Nodes and is one of seven node kinds. An atomic value encapsulates an XML Schema atomic type and a corresponding value of that type. Atomic values are defined in 5 Atomic Values. A sequence is an ordered collection of nodes, atomic values, or any mixture of nodes and atomic values. A sequence cannot be a member of a sequence. A single item appearing on its own is modeled as a sequence containing one item. Sequences are defined in 6 Sequences.
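As an informal illustration of the sequence rules in the preceding paragraph, the following Python sketch models a sequence as a flat tuple. The function name make_sequence is hypothetical and not part of the data model, and flattening nested input is only one way to honor the rule that a sequence cannot be a member of a sequence (an assumption; rejecting nested input would be equally consistent with the text).

```python
def make_sequence(*values):
    """Build a flat sequence (modeled here as a tuple) of items.

    Assumption: nested tuples are flattened, so that no sequence ever
    becomes a member of another sequence. A lone item becomes a
    singleton sequence.
    """
    flat = []
    for value in values:
        if isinstance(value, tuple):
            flat.extend(make_sequence(*value))  # flatten nested sequences
        else:
            flat.append(value)  # an item: a node or an atomic value
    return tuple(flat)


empty = make_sequence()            # the empty sequence
single = make_sequence(42)         # one item, modeled as a singleton sequence
mixed = make_sequence(1, (2, (3,)), "a")
```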

Note:

In XPath 1.0, the data model only defines nodes. The primitive data types (number, boolean, string, node-set) are part of the expression language, not the data model.

The data model can represent various values including not only the input and the output of a stylesheet or query, but all values of expressions used during the intermediate calculations. Examples include the input document or document repository (represented as a document node or a sequence of document nodes), the result of a path expression (represented as a sequence of nodes), the result of an arithmetic or a logical expression (represented as an atomic value), a sequence expression resulting in a sequence of items, etc. Examples of values that cannot be expressed directly by the data model include schema components and atomic values whose type is not an XML Schema atomic type.

In this document, we provide a precise definition of how values in the XQuery 1.0 and XPath 2.0 Data Model are constructed and accessed, and how they relate to values in the Infoset. We note wherever the XQuery 1.0 and XPath 2.0 Data Model differs from that of XPath 1.0.

2 Notation and Pseudo-code Syntax

In addition to prose, we define two sets of functions to explain the data model: accessors and constructors. The accessors and constructors defined by the data model are shown with the prefix dm. The prefix is always shown in italics to emphasize that these functions are abstract; they exist to explain the interface between the data model and specifications that rely on the data model: they are not and cannot be made accessible directly from the host language.

See [Issue-0033: Unclear relationship between values passed to the constructor, and those returned by the accessor].

The signature of accessors and constructors is shown using the same style as [XQuery 1.0 and XPath 2.0 Functions and Operators]. For example:

dm:typed-value($n as Node) as AtomicValue*

In the pseudo-code syntax, the term Node denotes the category of node values, AtomicValue denotes the category of atomic values, and Item refers to the category of either node values or atomic values.

Some accessors and constructors can accept or return sequences. The following notation is used to denote sequence values:

V* denotes a sequence of zero or more items of type V.
V? denotes a sequence of exactly zero or one items of type V.
V+ denotes a sequence of one or more items of type V.

In a sequence, V may be a Node or AtomicValue, or the union (choice) of several categories of Items.
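The occurrence indicators can be read as constraints on sequence length. The sketch below is illustrative only: matches_occurrence is a hypothetical helper, and the use of an empty string to mean "exactly one item" is an assumption made for the example, not notation defined by this document.

```python
def matches_occurrence(sequence, indicator):
    """Return True if the length of `sequence` satisfies `indicator`.

    indicator is "*" (zero or more), "?" (zero or one), "+" (one or
    more), or "" (assumed here to mean exactly one item).
    """
    n = len(sequence)
    if indicator == "*":
        return True
    if indicator == "?":
        return n <= 1
    if indicator == "+":
        return n >= 1
    if indicator == "":
        return n == 1
    raise ValueError(f"unknown occurrence indicator: {indicator!r}")
```

For example, a result declared as Node? may be the empty sequence or a singleton, but never a sequence of two nodes.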

There are some functions in the data model that are partial functions. We use the occurrence indicators ? or * when specifying the return type of such functions. For example, a node may have one parent node or no parent. If the node argument has a parent, the dm:parent accessor returns a singleton sequence. If the node argument does not have a parent, it returns the empty sequence. The signature of dm:parent specifies that it returns an empty sequence or a sequence containing one node:

dm:parent($n as Node) as Node?

Note:

The XPath 1.0 data model defines accessors, but does not define constructors.

This document relies on the [XML Information Set]. Information items and properties are indicated by the styles information item and [property], respectively.

This document frequently uses the term expanded-QName. [Definition: An expanded-QName is a pair of values consisting of a namespace URI and a local name. They belong to the value space of the XML Schema type xs:QName. When this document refers to xs:QName we always mean the value space, i.e. a namespace URI, local name pair (and not the lexical space referring to constructs of the form prefix:local-name).]

3 Concepts

3.1 Node Identity

Because XML documents are tree-structured, we define the data model using conventional terminology for trees. The data model is a node-labeled, tree-shaped graph, but also includes a concept of node identity. The identity of a node is established when a node constructor is applied to create the node: each application of a node constructor creates a new node that is identical to itself, and not identical to any other node (see 4 Nodes).
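Node identity behaves much like object identity in a programming language. The following sketch uses a hypothetical ElementNode class (not defined by the data model) to show two nodes with equal content but distinct identities.

```python
class ElementNode:
    """Hypothetical node: identity is object identity, not content."""

    def __init__(self, name):
        self.name = name


# Each application of the constructor yields a node with a new identity.
first = ElementNode("para")
second = ElementNode("para")

same_content = first.name == second.name   # the two nodes look alike
same_identity = first is second            # but they are different nodes
```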

This concept should not be confused with the concept of a unique ID, which is a unique name assigned to an element by the author to represent references using ID/IDREF correlation.

3.2 Document Order

[Definition: A document order is defined on all the nodes in a document. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order returned by a pre-order, depth-first, left-to-right traversal of the data model.] There is precisely one document order and it satisfies the following constraints.

The document node is the first node.
The relative order of siblings is determined by their order in the XML representation. A node N1 occurs before a node N2 in document order if and only if the start of N1 occurs before the start of N2 in the XML document.
Element nodes occur before their children; children occur before following-siblings.
Namespace nodes immediately follow the element node with which they are associated. The relative order of namespace nodes is stable but implementation-dependent.
Attribute nodes immediately follow the namespace nodes of the element with which they are associated. The relative order of attribute nodes is stable but implementation-dependent.
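The constraints above can be sketched as a traversal that visits each node before its children, with an element's namespace nodes and then its attribute nodes placed immediately after the element. The Node class and document_order function are hypothetical names used only for this illustration. The order among namespace nodes, and among attribute nodes, is implementation-dependent; the sketch simply uses list order.

```python
class Node:
    """Hypothetical minimal node for illustrating document order."""

    def __init__(self, kind, name="", children=(), attributes=(), namespaces=()):
        self.kind = kind
        self.name = name
        self.children = list(children)
        self.attributes = list(attributes)
        self.namespaces = list(namespaces)


def document_order(node):
    """Yield the node, its namespace nodes, its attribute nodes, and
    then each child subtree from left to right."""
    yield node
    yield from node.namespaces
    yield from node.attributes
    for child in node.children:
        yield from document_order(child)


doc = Node("document", children=[
    Node("element", "a",
         children=[Node("text", "t"), Node("element", "b")],
         attributes=[Node("attribute", "id")],
         namespaces=[Node("namespace", "xml")]),
])
order = [(n.kind, n.name) for n in document_order(doc)]
```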

Reverse document order is the reverse of document order.

The relative order of nodes in distinct documents is implementation-dependent but stable. In other words, given two distinct documents A and B, if a node in document A is before a node in document B, then every node in document A is before every node in document B.

Note:

The relative order of free-floating nodes (those not in a document) is not defined. See [Issue-0050: Relative order of free-floating nodes].

3.3 XML Schemas and the XML Information Set

The data model is defined in terms of the [XML Information Set] after XML Schema validity assessment. XML Schema validity assessment is the process of assessing an XML element information item with respect to an XML Schema and augmenting it and some or all of its descendants with properties that provide information about validity and type assignment. [Definition: The result of schema validity assessment is an augmented Infoset, known as the Post Schema-Validation Infoset, or PSVI.]

The data model supports well-formed XML documents conforming to [Namespaces in XML]. XML documents that are not well-formed are not XML, by definition. XML documents that do not conform to [Namespaces in XML] are not supported (they are not supported by [XML Information Set]).

In other words, the data model supports the following classes of XML documents:

Well-formed documents conforming to [Namespaces in XML],
DTD-valid documents conforming to [Namespaces in XML], and
W3C XML Schema-validated documents.

The data model supports some kinds of values that are not supported by [XML Information Set]. Examples of these are well-formed document fragments, sequences of fragments or sequences of documents. The data model also supports values that are not nodes. Examples of these are atomic values, sequences of atomic values, or sequences mixing nodes and atomic values. These are necessary to be able to represent the results of intermediate expressions in the data model during expression processing.

Schema-validated documents include documents in which some elements or attributes have been validated by "lax" or "skip" validation ([XMLSchema Part 1]).

An "incompletely validated document" is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.

The data model supports incompletely validated documents.

Note:

This implies accommodation for the case where both a DTD and a schema are applied. This will probably require some reconciliation of the [attribute type] property with type information from the PSVI. See issues [Issue-0004: Schema/DTD], [Issue-0081: Schema-less documents with a DTD].

In addition to specifying the transformation from the Post Schema-Validation Infoset (PSVI) to the data model, this document also specifies the transformation from the data model back to the XML Information Set. This is a useful notion that can be used for defining serialization and validation. Serialization can be viewed as a two-step process: first transforming to the XML Infoset and then to an XML document. Validation is described conceptually as a process of mapping the data model to the XML Infoset, followed by XML Schema validation producing a PSVI, which is then loaded into the data model.

See [Issue-0085: Globally declared namespaces in the infoset].

3.4 Types

The data model supports a representation of named types as expanded-QNames. Named types include both the built-in types defined by [XMLSchema Part 2] and user-defined types declared in a schema and imported by a stylesheet or query. Since named types in XML Schema are global, an expanded-QName uniquely identifies such a type. The namespace name of the expanded-QName is the target namespace of the schema and its local name is the name of the type. The data model does not uniquely identify anonymous types and represents them by xs:anyType or xs:anySimpleType.

The data model associates type information with element nodes, attribute nodes and atomic values. If this type information is something other than xs:anyType or xs:anySimpleType, the item is guaranteed to be a valid instance of that type as defined by XML Schema.

The data model defines an accessor dm:type that returns an expanded-QName corresponding to the type of the element node, attribute node or atomic value. It returns xs:anyType or xs:anySimpleType if the type is locally declared, if no type information exists, or if the item failed W3C XML Schema validity assessment.

When no type information exists for an element or an attribute node we frequently use the terminology "element with unknown type" or "attribute with unknown simple type".

The data model does not represent element or attribute declaration schema components, but it supports various type-related operations. The semantics of such operations, e.g. checking if a particular instance of an element node has a given type, are defined in [XQuery 1.0 Formal Semantics].

3.5 Typed Value and String Value

The content of a text, attribute, or element node can be interpreted in two ways: as a string value or as a typed value. For these types of nodes, the typed value can be extracted by the dm:typed-value accessor, and the string value can be extracted by the dm:string-value accessor.

The string value of a node is a single xs:string derived from the content of the node as described in the definitions of the accessor functions for each kind of node.

The typed value of a node is a sequence of atomic values derived from its string value and its type in a way that is consistent with schema validation, as described in the definitions of the accessor functions for each kind of node.

[Issue-0089: How is typed-value calculated?]

3.6 Mapping PSV Infoset additions to Types

This section specifies how the type of an element or attribute node is computed from the PSVI properties that specify validity and type assessment for the node's corresponding information item.

See [Issue-0077: PSVI to Type mapping dependence on conformance levels].

A PSVI element or attribute information item has a [validity] property. The [validity] property may be "valid", "invalid", or "notKnown" and reflects the outcome of schema-validity assessment. The only information that can be inferred from an invalid or not known validity value is that the information item is well-formed. Therefore, we must associate some general type information with the element or attribute node if it is not known to be valid.

The precise definition of the type of an element or attribute information item depends on the properties of the PSVI. XML Schema only guarantees the existence of either the [type definition] property, or the [type definition namespace], [type definition name] and [type definition anonymous] properties. If the type definition refers to a union type, there are further properties defined, that refer to the type definition which actually validated the item's normalized value. These properties are either the [member type definition], or the [member type definition namespace], [member type definition name] and [member type definition anonymous] properties. If these are available, the type of an element or attribute will refer to the member type that actually validated the schema normalized value.

The type of an element or attribute information item is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:

If the [validity] property exists and is "valid":
- If [member type definition] exists and its {name} property is present:
  - The {target namespace} and {name} properties of the [member type definition] property.
- If the [type definition] property exists and its {name} property is present:
  - The {target namespace} and {name} properties of the [type definition] property.
- If [member type definition anonymous] exists and is false: the [member type definition namespace] and the [member type definition name].
- If [type definition anonymous] exists and is false: the [type definition namespace] and the [type definition name].
Otherwise, xs:anyType for elements or xs:anySimpleType for attributes.
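The "first applicable item" rule above can be read as a cascade of checks. The sketch below is an informal reading of that list, not a normative algorithm. It assumes a hypothetical encoding in which a PSVI information item is a Python dict keyed by property name, and in which a type definition component is a dict with "name" and "target namespace" keys; the function name type_of is likewise an assumption.

```python
XS = "http://www.w3.org/2001/XMLSchema"


def type_of(item, is_attribute=False):
    """Return (namespace, local-name) for a PSVI item given as a dict."""
    fallback = (XS, "anySimpleType" if is_attribute else "anyType")
    if item.get("validity") != "valid":
        return fallback
    # Component-valued properties whose {name} is present, member type first.
    for prop in ("member type definition", "type definition"):
        component = item.get(prop)
        if component is not None and component.get("name") is not None:
            return (component.get("target namespace"), component["name"])
    # Flattened namespace/name/anonymous properties, member type first.
    for prefix in ("member type definition", "type definition"):
        if item.get(prefix + " anonymous") is False:
            return (item.get(prefix + " namespace"), item.get(prefix + " name"))
    return fallback
```

Under this reading, an item whose [validity] is anything other than "valid" always falls through to xs:anyType or xs:anySimpleType, regardless of which type definition properties are present.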

Editorial note
The above definition is currently under discussion. It is very likely that a change will be made in a future draft to reflect a more precise definition covering derived types and the possible usage of generated type identifiers when no type names are available. See [Issue-0076: PSVI to Type mapping supporting derived types].

3.7 Comments, Processing Instructions, and Whitespace

Although the data model is able to represent comments, processing instructions, and insignificant whitespace, preservation of this information may be unnecessary and onerous for some applications.

An instance of the data model can be constructed from an Infoset, a PSVI, or from some other data source entirely. Different applications may or may not choose to construct nodes in the data model to represent comments, processing instructions, and insignificant white space. These decisions are considered outside the scope of the data model. Consequently the data model makes no attempt to control or identify the sort of processing in this regard that an application uses to construct a data model instance.

4 Nodes

The category of Node values contains seven distinct kinds of nodes: document, element, attribute, text, namespace, processing instruction, and comment. The seven kinds of nodes are defined in the following subsections.

Each kind of node has its own constructor. The effect of a node constructor is to create a new node with a unique identity, distinct from all other nodes.

A tree contains a root plus all nodes that are reachable directly or indirectly from the root via the dm:children, dm:attributes, and dm:namespaces accessors. Every node belongs to exactly one tree, and every tree has exactly one root node. A tree whose root node is a document node is referred to as a document. A tree whose root node is some other kind of node is referred to as a fragment.
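Because every node belongs to exactly one tree, the root of a node's tree can be found by following parent links until none remains. The sketch below illustrates the distinction between documents and fragments; Node, root, and tree_kind are hypothetical names chosen for the example.

```python
class Node:
    """Hypothetical node carrying only a kind and a parent link."""

    def __init__(self, kind, parent=None):
        self.kind = kind
        self.parent = parent


def root(node):
    """Follow parent links upward until a parentless node is reached."""
    while node.parent is not None:
        node = node.parent
    return node


def tree_kind(node):
    """Classify the tree containing `node` as a document or a fragment."""
    return "document" if root(node).kind == "document" else "fragment"
```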

4.1 Accessors

A set of accessors is defined on all seven kinds of Nodes. Some accessors return a constant empty sequence on certain node kinds.

See [Issue-0091: Support for substitution groups].

In order for applications to be able to operate on instances of the data model, the model must expose properties of the items it contains. The data model does this by defining a family of accessor functions. These are not functions in the literal sense: they are not available for users or applications to call directly. Rather, they are descriptions of the interface that an implementation of the data model must expose to applications. Functions and operators available to end-users are described in [XQuery 1.0 and XPath 2.0 Functions and Operators].

The following table summarizes the accessor functions and the types of values that they return if called on a node of each type.

	Document Node	Element Node	Attribute Node	Namespace Node	P.I. Node	Comment Node	Text Node
dm:base-uri	xs:anyURI?	xs:anyURI?	xs:anyURI?	xs:anyURI?	xs:anyURI?	xs:anyURI?	xs:anyURI?
dm:node-kind	xs:string	xs:string	xs:string	xs:string	xs:string	xs:string	xs:string
dm:node-name	()	xs:QName	xs:QName	xs:QName?	()	()	()
dm:parent	()	Node?	Node?	Node?	Node?	Node?	Node?
dm:string-value	xs:string	xs:string	xs:string	xs:string	xs:string	xs:string	xs:string
dm:typed-value	()	Value?	Value?	()	()	()	Value?
dm:type	()	xs:QName?	xs:QName?	()	()	()	()
dm:children	Node+	Node*	()	()	()	()	()
dm:attributes	()	Node*	()	()	()	()	()
dm:namespaces	()	Node*	()	()	()	()	()

4.1.1 `base-uri` Accessor

dm:base-uri($n as Node) as xs:anyURI?

The dm:base-uri accessor returns a sequence containing zero or one URI references.

Document, element, and processing-instruction nodes have a base-uri property. If that property is non-empty, its value is returned.

If the accessor is called on a node that does not have a base-uri property, or whose base-uri property is empty, the base-uri of that node's parent is returned. If the node has no parent, an error is raised. See [Issue-0087: base-uri should return ()?]
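The lookup described above amounts to walking up the parent chain until a non-empty base-uri property is found. The following sketch illustrates that reading. The Node class and the choice of LookupError for the error case are assumptions made for the example; this document does not specify how the error is reported.

```python
class Node:
    """Hypothetical node with an optional base-uri property and parent."""

    def __init__(self, kind, base_uri=None, parent=None):
        self.kind = kind
        self.base_uri = base_uri
        self.parent = parent


def base_uri(node):
    """Return the node's own non-empty base-uri, else its parent's."""
    if node.base_uri:  # property present and non-empty
        return node.base_uri
    if node.parent is None:
        # Assumption: the error is modeled as a Python exception.
        raise LookupError("no base-uri property and no parent")
    return base_uri(node.parent)
```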

4.1.2 `node-kind` Accessor

dm:node-kind($n as Node) as xs:string

The dm:node-kind accessor returns a string value identifying the kind of node on which the accessor was called. One of the following values is returned:

"document" for document nodes.
"element" for element nodes.
"attribute" for attribute nodes.
"text" for text nodes.
"namespace" for namespace nodes.
"processing-instruction" for processing instruction nodes.
"comment" for comment nodes.

4.1.3 `node-name` Accessor

dm:node-name($n as Node) as xs:QName?

The dm:node-name accessor returns a sequence of zero or one xs:QNames.

For element and attribute nodes, dm:node-name returns the qualified name of the element or attribute.
For processing-instruction nodes, dm:node-name returns an xs:QName with the processing instruction target name in the local-name and no namespace URI.
For namespace nodes, dm:node-name returns an xs:QName with the prefix of the namespace declaration in the local-name and no namespace URI. If the namespace declaration declares the default namespace, which has no prefix, an empty sequence is returned.
Some implementations may not preserve information about the prefixes declared. In these cases, the dm:node-name accessor returns the empty sequence when applied to namespace nodes.

4.1.4 `parent` Accessor

dm:parent($n as Node) as Node?

The dm:parent accessor returns a sequence containing zero or one nodes.

For nodes that have a parent, dm:parent returns the parent node. For all other nodes, it returns the empty sequence.

If the return value is not the empty sequence, it will always be either an element node or a document node.

4.1.5 `string-value` Accessor

dm:string-value($n as Node) as xs:string

The dm:string-value accessor returns a string representation of the node.

For some kinds of nodes, this is part of the node; for other kinds of nodes, it is computed from the dm:string-value of its descendant nodes.

The dm:string-value accessor can be used to recover the lexical representation of an atomic value. The details of converting an atomic value to its string representation are described in the "Casting Functions" section of [XQuery 1.0 and XPath 2.0 Functions and Operators]. In particular if the atomic value's type is primitive, dm:string-value returns the atomic value's canonical lexical representation for that primitive type as specified in [XMLSchema Part 2]. If the atomic value's type is derived, the lexical representation depends on whether a value is supplied for the type's pattern facet: If no such value is supplied dm:string-value returns the atomic value's canonical lexical representation for the primitive base type. Otherwise dm:string-value returns a lexical representation that matches the value specified for the pattern facet. (This case includes xs:integers.) See [Issue-0072: Lexical representation of Schema primitive types].

Note:

Using the canonical lexical representation for atomic values as described above may not always be compatible with XPath 1.0.

4.1.6 `typed-value` Accessor

dm:typed-value($n as Node) as AtomicValue*

The dm:typed-value accessor returns the typed-value of the node, which is a sequence of zero or more atomic values. The typed-value is closely related to the node's string-value and its type. For instance, when the node's string-value is "3.14" and its type is xs:decimal, the typed-value is a sequence containing the atomic value 3.14 of type xs:decimal. In fact, when the type is an atomic type, typed-value is always the atomic-value constructed from the string-value and the type.
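For the atomic case, the relationship between string-value, type, and typed-value can be sketched as a lookup from type name to a constructor. The table ATOMIC_CONSTRUCTORS and the function typed_value are hypothetical, and only three XML Schema types are mapped to rough Python equivalents; real lexical-to-value mapping follows [XMLSchema Part 2] and is more involved.

```python
from decimal import Decimal

# Hypothetical mapping from a few XML Schema atomic types to Python
# constructors; chosen only for illustration.
ATOMIC_CONSTRUCTORS = {
    "xs:decimal": Decimal,
    "xs:integer": int,
    "xs:string": str,
}


def typed_value(string_value, type_name):
    """Return a singleton sequence holding the atomic value built from
    the string-value and the (atomic) type name."""
    constructor = ATOMIC_CONSTRUCTORS[type_name]  # KeyError if unmapped
    return (constructor(string_value),)


pi_ish = typed_value("3.14", "xs:decimal")
```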

In the general case, dm:typed-value constructs a sequence of atomic values. These values are derived from the string-value of the element and its type, in such a way as to be consistent with validation.

See [Issue-0080: Typed value of Document, PI and Comment nodes].

If the node (or node kind) has no typed value, the empty sequence is returned.

4.1.7 `type` Accessor

dm:type($n as Node) as xs:QName?

The dm:type accessor returns a sequence containing zero or one xs:QName.

For element nodes, dm:type returns the globally declared QName of the type of the node or xs:anyType if it is locally declared or no type information exists.

For attribute nodes, dm:type returns the globally declared QName of the type of the node or xs:anySimpleType if it is locally declared or no type information exists.

For other node kinds, it always returns the empty sequence.

4.1.8 `children` Accessor

dm:children($n as Node) as Node*

The dm:children accessor returns a sequence containing zero or more nodes.

For document and element nodes, it returns the nodes that are the children of that node. It returns the empty sequence for document and element nodes that have no children.

For all other nodes, it always returns the empty sequence.

A document node or an element node is the parent of each of its child nodes. Nodes never share children: if two nodes have distinct identities, then no child of one node will be a child of the other node.

4.1.9 `attributes` Accessor

dm:attributes($n as Node) as AttributeNode*

The dm:attributes accessor returns a sequence containing zero or more attribute nodes.

For element nodes, these are the attributes of the node. For all other nodes, it always returns the empty sequence.

4.1.10 `namespaces` Accessor

dm:namespaces($n as Node) as NamespaceNode*

The dm:namespaces accessor returns a sequence containing zero or more namespace nodes.

For element nodes, these are the namespaces of the node. For all other nodes, it always returns the empty sequence.

4.2 Documents

4.2.1 Overview

Document nodes encapsulate XML documents. Documents have the following properties:

base-uri, possibly empty.
children

Document nodes must satisfy the following constraints.

A document node's children may not contain two consecutive text nodes. Consecutive text nodes are collapsed by the document constructor into one text node.
If a node N is a child of a document D, then the parent of N must be D.
If a node N has a parent document D, then N must be among the children of D.
Every child of a document must be distinct.

In a well-formed document, the children of the document node consist exclusively of element nodes, processing-instruction nodes, and comment nodes, and exactly one of these children is an element node. A document node in the data model is more permissive: it allows more than one element node as a child and also permits text nodes as children. See [Issue-0074: Do we need Document fragments].

Note:

Document nodes and XPath 1.0 root nodes are essentially identical.

4.2.2 Constructor

A document node can be constructed using dm:document-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes a non-empty sequence of child nodes and an optional base URI value as arguments.

dm:document-node($children as Node+) as DocumentNode

dm:document-node($children as Node+, $base-uri as xs:anyURI) as DocumentNode

The sequence of nodes passed as $children must not be empty and must consist only of element, processing instruction, comment, and text nodes. See [Issue-0090: Documents can be empty].

If consecutive text nodes are specified as the children of a document node they are collapsed into one text node whose string value is the concatenation of the string values of the consecutive text nodes.
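The collapsing of consecutive text nodes can be sketched as a single pass over the child sequence. The Node class and collapse_text function are hypothetical names used for illustration; the sketch covers only the merging step, not the rest of the constructor.

```python
class Node:
    """Hypothetical child node with a kind and textual content."""

    def __init__(self, kind, content=""):
        self.kind = kind
        self.content = content


def collapse_text(children):
    """Merge each run of consecutive text nodes into one text node whose
    content is the concatenation of the run's contents."""
    result = []
    for child in children:
        if child.kind == "text" and result and result[-1].kind == "text":
            merged = Node("text", result[-1].content + child.content)
            result[-1] = merged
        else:
            result.append(child)
    return result
```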

4.2.3 Accessors

Accessor	Returns:
dm:base-uri	The value of the base-uri property
dm:node-kind	"`document`"
dm:node-name	()
dm:parent	()
dm:string-value	The concatenation of the string-values of all the text node descendants of the document in document order
dm:typed-value	()
dm:type	()
dm:children	The children of the document node
dm:attributes	()
dm:namespaces	()
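As an illustration of the dm:string-value row in the table above, the following sketch concatenates the content of all text node descendants in document order, skipping comments and processing instructions. The Node class and document_string_value function are hypothetical and exist only for this example.

```python
class Node:
    """Hypothetical node with a kind, textual content, and children."""

    def __init__(self, kind, content="", children=()):
        self.kind = kind
        self.content = content
        self.children = list(children)


def document_string_value(node):
    """Concatenate the content of every text node descendant, visiting
    children from left to right and descending into elements."""
    parts = []
    for child in node.children:
        if child.kind == "text":
            parts.append(child.content)
        elif child.kind == "element":
            parts.append(document_string_value(child))
        # comment and processing-instruction nodes contribute nothing
    return "".join(parts)
```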

Two additional accessors are defined on document nodes:

dm:unparsed-entity-system-id($node as DocumentNode, $entityname as xs:string) as xs:string?

The dm:unparsed-entity-system-id accessor returns the system identifier of an unparsed external entity declared in the specified document. If no entity with the name specified in $entityname exists, or if the entity is not an external unparsed entity, the empty sequence is returned.

dm:unparsed-entity-public-id($node as DocumentNode, $entityname as xs:string) as xs:string?

The dm:unparsed-entity-public-id accessor returns the public identifier of an unparsed external entity declared in the specified document. If no entity with the name specified in $entityname exists, or if the entity is not an external unparsed entity, or if the entity has no public identifier, the empty sequence is returned.

4.2.4 PSVI to Datamodel Mapping

When a data model fragment is created from the PSVI, a document information item is mapped to a Document Node. The precise transformation is described by specifying the PSVI property corresponding to each argument of the document node constructor:

Argument	Value:
`$base-uri`	The value of the [base URI] property
`$children`	The sequence of nodes constructed from the information items found in the [children] property.

To construct the value of the $children argument, for each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text node is constructed and that sequence of nodes is used as the value. If present among the [children], the [document type declaration] information item is ignored.

Note:

There is no way to determine what DTD might apply to the data model. See [Issue-0042: System Id and Public Id are not exposed].

4.2.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps a Document Node to a document information item. The properties of the document information item are constructed as follows:

Property	Value:
[base URI]	The value returned by the dm:base-uri accessor
[children]	The sequence of information items constructed from the nodes returned by the dm:children accessor. In other words, for each node returned by the dm:children accessor, a corresponding information item is constructed and that sequence of information items is used as the value for the [children] property.
[notations], [unparsed entities], [character encoding scheme], [standalone], [version], [all declarations processed]	The values of these properties are implementation-defined but must be consistent with the rest of the Infoset constructed.

Note:

Since Document Nodes are more permissive than document information items, the resulting InfoSet may be invalid.

4.3 Elements

4.3.1 Overview

Element nodes encapsulate XML elements. Elements have the following properties:

base-uri, possibly empty.
node-name
parent
type, possibly empty
children, possibly empty
attributes, possibly empty
namespaces, possibly empty

Element nodes must satisfy the following constraints.

An element node's children may not contain two consecutive text nodes. Consecutive text nodes are collapsed by the element constructor into one text node.
If a node N is a child of an element E, then the parent of N must be E.
If a node N has a parent element E, then N must be among the children of E.
Every child of an element must be distinct.
The attributes of an element must have distinct names.
The namespace nodes of an element must have distinct names. At most one of the namespace nodes of an element has no name (this is the default namespace). A namespace node whose namespace URI is the zero-length string must have no name. No namespace node may have the name "xmlns".

The element node constructor ensures that the first three constraints are satisfied.

Note:

The data model does not enforce a constraint that the namespaces of an element must be a superset of the namespaces of its parent, nor does it enforce a constraint that the namespaces of an element must include namespace nodes for each of the namespace URIs used in the element name and the names of its attributes, or of namespace URIs used in the content of elements and attributes of type xs:QName. Applications of the data model (such as XSLT and XQuery) may enforce such constraints in particular circumstances, but these constraints are not part of the data model.

4.3.2 Constructor

An element node can be constructed using dm:element-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes an expanded-QName, a sequence of namespace nodes, a sequence of attribute nodes, a sequence of child nodes, the node's type, and an optional base URI as arguments.

`dm:element-node`(	`$qname`	`as` `xs:QName`,
	`$nsnodes`	`as` `NamespaceNode*`,
	`$attrnodes`	`as` `AttributeNode*`,
	`$children`	`as` `Node*`,
	`$type`	`as` `xs:QName`) `as` `ElementNode`

`dm:element-node`(	`$qname`	`as` `xs:QName`,
	`$nsnodes`	`as` `NamespaceNode*`,
	`$attrnodes`	`as` `AttributeNode*`,
	`$children`	`as` `Node*`,
	`$type`	`as` `xs:QName`,
	`$base-uri`	`as` `xs:anyURI`) `as` `ElementNode`

The sequence of nodes passed as $children must consist only of element, processing instruction, comment, and text nodes.

If consecutive text nodes are specified as the children of an element node they are collapsed into one text node whose string value is the concatenation of the string values of the consecutive text nodes.

To guarantee that the parent-child relationship is invertible, (i.e. that the parent of any child of a node is itself and that any node that has a parent is among its parent's children), the element constructors logically create a copy of all of their namespace, attribute, and children arguments and set the parent property of these nodes to the newly created element node. As long as the parent-child constraint is satisfied, an implementation of the data model may choose to use specialized techniques to avoid creating physical copies of the arguments to an element constructor. See [Issue-0052: Element constructor copies nodes?].

The data model permits element nodes without parents. In fact element nodes created by the element node constructor never have parents unless they are enclosed in other node constructors. Such nodes may represent partial results during expression processing.
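The constructor behavior described in this section can be sketched as follows. This is a non-normative illustration in Python, not a definition of dm:element-node: the Node class and the functions copy_node and element_node are hypothetical, a plain string stands in for the expanded-QName, and the namespace, attribute, type, and base URI arguments are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # eq=False keeps identity-based comparison, mirroring node identity
class Node:
    kind: str                      # "element", "text", "comment", ...
    name: Optional[str] = None     # stand-in for the expanded-QName
    content: str = ""              # string content of text and comment nodes
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def copy_node(node: Node) -> Node:
    """Logical copy: a new node with its own identity and no parent."""
    clone = Node(node.kind, node.name, node.content)
    for child in node.children:
        child_copy = copy_node(child)
        child_copy.parent = clone
        clone.children.append(child_copy)
    return clone


def element_node(name: str, children: List[Node]) -> Node:
    """Hypothetical analogue of dm:element-node, restricted to name and children."""
    allowed = {"element", "processing-instruction", "comment", "text"}
    element = Node("element", name)
    for child in children:
        if child.kind not in allowed:
            raise ValueError(f"{child.kind} node is not allowed as a child")
        last = element.children[-1] if element.children else None
        if child.kind == "text" and last is not None and last.kind == "text":
            # Collapse consecutive text nodes by concatenating their string values.
            last.content += child.content
            continue
        clone = copy_node(child)
        clone.parent = element     # guarantees the parent-child relationship is invertible
        element.children.append(clone)
    return element
```

Because the arguments are copied, the nodes passed to element_node are left without a parent, and the new element itself has no parent until it is passed to another constructor.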

4.3.3 Accessors

Accessor	Returns:
dm:base-uri	The base URI of the element or its parent
dm:node-kind	"`element`"
dm:node-name	The QName of the element
dm:parent	The parent element or document node
dm:string-value	The concatenation of the string-values of all the text node descendants of the element in document order
dm:typed-value	The typed value of the node
dm:type	Returns the QName of the element's globally declared type or the empty sequence if the element's type is locally declared or is unavailable
dm:children	The children of the element node
dm:attributes	The attributes of the element node
dm:namespaces	The namespaces of the element node

The dm:base-uri accessor returns the base-uri property of the element node, if it exists. If it does not exist, the base URI of the element's parent is returned. In other words, the value returned must be the same as the value of dm:base-uri(dm:parent()), although implementations are not required to implement it in that way.
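The fallback to the parent's base URI can be sketched as a short recursive function. This is a non-normative illustration in Python; the ElementNode class and the dm_base_uri function are hypothetical, and None is used to model both an empty base-uri property and the empty sequence.

```python
from typing import Optional


class ElementNode:
    def __init__(self, name: str, base_uri: Optional[str] = None,
                 parent: Optional["ElementNode"] = None):
        self.name = name
        self.base_uri = base_uri   # the base-uri property; None models "empty"
        self.parent = parent


def dm_base_uri(node: Optional[ElementNode]) -> Optional[str]:
    """Return the node's own base URI, or else the base URI of its parent."""
    if node is None:
        return None                # no parent left to consult
    if node.base_uri is not None:
        return node.base_uri
    return dm_base_uri(node.parent)
```

For example, an element with an empty base-uri property whose parent has the base URI "http://www.example.com/catalog.xml" yields that URI.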

The accessors dm:namespaces and dm:attributes return the same set of namespace and attribute nodes (respectively) that were supplied to the constructor, but they are not constrained to return them in the same order. See [Issue-0086: Nodes returned by namespaces and attributes].

The dm:parent accessor returns the empty sequence if the element has no parent.

If the element node's type is xs:anyType, the dm:typed-value accessor returns the node's string value as xs:anySimpleType. If the type is a complex type with complex content, invoking dm:typed-value raises an error.

Editorial note
The dm:string-value of a node and the result of casting the dm:typed-value of a node to a string may give different results under this definition. The issue of whether to allow or possibly mandate that dm:string-value return the same result as the string-value of the typed value is still under discussion. See [Issue-0079: String-value vs. string-value of the typed-value].

4.3.4 PSVI to Data Model Mapping

When a data model fragment is created from the PSVI, an element information item is mapped to an Element Node. The precise transformation is described by specifying the PSVI property corresponding to each argument of the element node constructor.

$qname

An xs:QName constructed from the [local name] property and the [namespace name] property

$nsnodes

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property.

Implementations may provide mechanisms to allow some or all of the namespaces in the [in-scope namespaces] property to be discarded from the data model.

$attrnodes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

The special nature of xsi:nil is still being discussed, see [Issue-0071: Magic Attributes].

$children

If the [schema normalized value] PSVI property exists, a single text node whose string value is the value of that property.
Otherwise, the sequence of nodes constructed in the following way from the information items found in the [children] property: for each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text node is constructed.
Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.
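The grouping of adjacent character information items into text nodes can be sketched as follows. This is a non-normative illustration in Python that assumes information items are modeled as (kind, value) pairs, where an item of kind "character" carries a single character; the function name children_to_nodes is hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

Item = Tuple[str, str]  # assumption: (kind, value)


def children_to_nodes(items: List[Item]) -> List[Item]:
    """Map [children] information items to nodes, one text node per character run."""
    nodes: List[Item] = []
    # groupby yields maximal runs of consecutive items that share the same key.
    for is_character, run in groupby(items, key=lambda item: item[0] == "character"):
        if is_character:
            nodes.append(("text", "".join(value for _, value in run)))
        else:
            nodes.extend(run)      # element, processing instruction, comment items
    return nodes
```

Since each run of characters is maximal, the result never contains two consecutive text nodes.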

$type

The xs:QName computed as described in 3.6 Mapping PSV Infoset additions to Types. Note that if the type referenced would be a union type then type refers to the member type that actually validated the schema normalized value.

The unique ID of the element node is an identifier optionally assigned by the user. It corresponds to the [normalized value] property of the attribute information item in the [attributes] property that has a type ID, if one exists.

Note:

Using this definition, only IDs declared in a DTD are effective. See [Issue-0004: Schema/DTD]. Even so, this definition is not backward compatible with XPath 1.0. See [Issue-0038: XPath 1.0 treatment of non-unique IDs]. Furthermore, it doesn't even work as spec'd, see [Issue-0044: Unable to construct an element with unique ID].

4.3.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps an Element Node to an element information item. The properties of the element information item are constructed as follows:

Property	Value:
[namespace name]	The namespace name of the QName returned by the dm:node-name accessor
[local name]	The local name of the QName returned by the dm:node-name accessor
[prefix]	An appropriate namespace prefix, as described below
[children]	The sequence of information items constructed from the nodes returned by the dm:children accessor. In other words, for each node returned by the dm:children accessor, a corresponding information item is constructed and that sequence of information items is used as the value for the [children] property.
[attributes]	The sequence of attribute information items constructed from the nodes returned by the dm:attributes accessor.
[in-scope namespaces]	The sequence of namespace information items constructed from the nodes returned by the dm:namespaces accessor.
[base URI]	The value returned by the dm:base-uri accessor
[parent]	The information item constructed from the node returned by the dm:parent accessor. If the node has no parent, the property must be left absent and the resulting InfoSet will not be valid.
[namespace attributes]	The sequence of namespace information items constructed from the nodes that are present in the difference between the sequence of nodes returned by the dm:namespaces accessor on this element and the sequence of nodes returned by the dm:namespaces accessor of this element's dm:parent.

An implementation must construct the value of the [prefix] property as if the following algorithm were applied: if the element has at least one namespace node whose namespace URI is the same as the namespace name of the QName returned by the dm:node-name accessor, the algorithm returns the local part of the name of that namespace node, or the empty string if the namespace node has no name. If there are several such namespace nodes, it chooses one of them arbitrarily. If there is no such namespace node, it generates an arbitrary prefix that is distinct from the dm:node-name of any of the element's namespaces.

If a new prefix is generated, a corresponding namespace information item must be added to the [in-scope namespaces] property of the element information item. The namespace information item must associate the generated prefix with the namespace name of the QName returned by the element's dm:node-name accessor.
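The prefix algorithm can be sketched as follows. This is a non-normative illustration in Python under several assumptions: the in-scope namespaces are modeled as a dictionary from prefix to namespace URI, the empty string stands for a namespace node with no name, the element name has a non-empty namespace name, and generated prefixes follow a hypothetical "ns0", "ns1", ... scheme. The function name element_prefix is likewise hypothetical.

```python
from itertools import count
from typing import Dict, Tuple


def element_prefix(namespace_name: str,
                   in_scope: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (prefix, in-scope namespaces) for an element information item."""
    for prefix, uri in in_scope.items():
        if uri == namespace_name:
            return prefix, in_scope    # any matching namespace node may be chosen
    for n in count():
        candidate = f"ns{n}"           # hypothetical naming scheme
        if candidate not in in_scope:
            # A generated prefix must also be added to [in-scope namespaces].
            updated = dict(in_scope)
            updated[candidate] = namespace_name
            return candidate, updated
```

The sketch picks the first matching namespace node in dictionary order; the algorithm itself permits any of them to be chosen.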

Note:

If the implementation has allowed in-scope namespaces to be discarded from the data model, then these namespaces may need to be reintroduced when creating an InfoSet in order to ensure that the InfoSet corresponds to a document that is namespace well-formed as defined in [XML Namespaces].

Note:

The algorithm used to calculate namespace attributes will need to be adjusted to cater for XML Namespaces 1.1, which allows the "undeclaration" of all namespaces, whether they have a prefix or not.

4.4 Attributes

4.4.1 Overview

Attribute nodes encapsulate XML attributes. Attributes have the following properties:

node-name
parent
type, possibly empty

The above information associated with an attribute node is set during construction and can be accessed later via the accessor functions.

For convenience, the element node that owns this attribute is called its "parent" even though an attribute node is not a "child" of its parent element.

4.4.2 Constructor

An attribute node can be constructed using dm:attribute-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes the attribute's expanded-QName, string value, and type.

`dm:attribute-node`(	`$qname`	`as` `xs:QName`,
	`$value`	`as` `xs:string`,
	`$type`	`as` `xs:QName`) `as` `AttributeNode`

Like all other node constructors, the attribute node constructor has the effect of creating a new node with a unique identity, distinct from all other nodes.

4.4.3 Accessors

Accessor	Returns:
dm:base-uri	()
dm:node-kind	"`attribute`"
dm:node-name	The QName of the attribute
dm:parent	The parent element node
dm:string-value	The value of the attribute
dm:typed-value	The typed value of the node
dm:type	Returns the QName of the attribute's globally declared type or the empty sequence if the attribute's type is locally declared or is unavailable
dm:children	()
dm:attributes	()
dm:namespaces	()

Editorial note
The dm:string-value of a node and the result of casting the dm:typed-value of a node to a string may give different results under this definition. The issue of whether to allow or possibly mandate that dm:string-value return the same result as the string-value of the typed value is still under discussion. See [Issue-0079: String-value vs. string-value of the typed-value].

4.4.4 PSVI to Data Model Mapping

When a data model fragment is created from the PSVI an attribute information item is mapped to an Attribute Node. The precise transformation is described by specifying the PSVI property corresponding to each argument of the attribute node constructor.

$qname

An xs:QName constructed from the [local name] property and the [namespace name] property

$value

The [schema normalized value] PSVI property if that exists, or
the [normalized value] property.

$type

4.4.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps an Attribute Node to an attribute information item. The properties of the corresponding attribute information item are constructed as follows:

Property	Value:
[namespace name]	The namespace name of the QName returned by the dm:node-name accessor
[local name]	The local name of the QName returned by the dm:node-name accessor
[prefix]	An appropriate namespace prefix, as described below
[normalized value]	The value returned by the dm:string-value accessor
[owner element]	The information item constructed from the node returned by the dm:parent accessor. If the node has no parent, the property must be left absent and the resulting InfoSet will not be valid.
[specified], [attribute type], [references]	The values of these properties are implementation-defined but must be consistent with the rest of the InfoSet constructed.

An implementation must construct the value of the [prefix] property in the following way: if the attribute node has a parent, the prefix is constructed in the same way that a prefix would be constructed for that element; otherwise a prefix is chosen arbitrarily, and no attempt is made to associate the prefix with the namespace URI.

4.5 Namespaces

4.5.1 Overview

Namespace nodes encapsulate XML namespaces. Namespaces have the following properties:

prefix, possibly empty.
uri

In XPath 1.0, namespace nodes were directly accessible by applications, by means of the namespace axis. In XPath 2.0 the namespace axis is deprecated, and it is not available at all in XQuery 1.0. XPath 2.0 implementations are not required to expose the namespace axis, though they may do so if they wish to offer backwards compatibility. The information held in namespace nodes is instead made available to applications using two functions defined in [Functions and Operators], namely xf:get-in-scope-namespaces and xf:get-namespace-uri-for-prefix. Certain properties of namespace nodes are not exposed by these functions: in particular, properties related to the identity of namespace nodes, their parentage, and their position in document order. Implementations that do not expose the namespace axis can therefore avoid the overhead of maintaining this information.

The above information associated with a namespace node is set during construction and can be accessed later via the accessor functions.

Each element node has an associated set of namespace nodes corresponding to its in-scope namespaces, and that element node acts as the parent of those namespace nodes.

4.5.2 Constructor

A namespace node can be constructed using dm:namespace-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes a namespace prefix and the URI of the namespace being declared.

dm:namespace-node($prefix as xs:string?, $uri as xs:string) as NamespaceNode

The namespace prefix may be the empty sequence. If the URI is the zero-length string, the prefix must be the empty sequence.

4.5.3 Accessors

Accessor	Returns:
dm:base-uri	fn:error, see [Issue-0087: base-uri should return ()?].
dm:node-kind	"`namespace`"
dm:node-name	A QName with the namespace prefix in the local-name and an empty URI
dm:parent	()
dm:string-value	The namespace name (URI) of the node
dm:typed-value	()
dm:type	()
dm:children	()
dm:attributes	()
dm:namespaces	()

4.5.4 PSVI to Data Model Mapping

When a data model fragment is created from the PSVI, a namespace information item is mapped to a Namespace Node. The precise transformation is described by specifying the PSVI property corresponding to each argument of the namespace node constructor.

$prefix: The [prefix] property.
$uri: The [namespace name] property.

4.5.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps a Namespace Node to a namespace information item. The properties of the namespace information item are constructed as follows:

Property	Value:
[prefix]	An appropriate namespace prefix, as described below
[namespace name]	The value returned by the dm:string-value accessor

4.6 Processing Instructions

4.6.1 Overview

Processing instruction nodes encapsulate XML processing instructions. Processing instructions have the following properties:

target
content
parent, possibly empty

4.6.2 Constructor

A processing instruction node can be constructed using dm:processing-instruction-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes an NCName, a string, and an optional base URI.

`dm:processing-instruction-node`(	`$target`	`as` `xs:NCName`,
	`$content`	`as` `xs:string`) `as` `ProcessingInstructionNode`

`dm:processing-instruction-node`(	`$target`	`as` `xs:NCName`,
	`$content`	`as` `xs:string`,
	`$base-uri`	`as` `xs:anyURI`) `as` `ProcessingInstructionNode`

The string '?>' may not occur within a processing instruction's target or content value ([XML Recommendation]).

4.6.3 Accessors

Accessor	Returns:
dm:base-uri	()
dm:node-kind	"`processing-instruction`"
dm:node-name	A QName with the processing-instruction target in the local-name and an empty URI
dm:parent	The parent element or document node
dm:string-value	The content of the processing-instruction
dm:typed-value	()
dm:type	()
dm:children	()
dm:attributes	()
dm:namespaces	()

4.6.4 PSVI to Data Model Mapping

When a data model fragment is created from the PSVI, a processing instruction information item is mapped to a Processing Instruction Node. The precise transformation is described by specifying the PSVI property corresponding to each argument of the processing instruction node constructor.

$target: The value of the [target] property.
$content: The value of the [content] property.
$base-uri: The value of the [base URI] property.

There are no processing instruction nodes for processing instructions that are children of a document type declaration information item.

4.6.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps a Processing Instruction Node to a processing instruction information item. The properties of the processing instruction information item are constructed as follows:

Property	Value:
[target]	The local name of the QName returned by the dm:node-name accessor
[content]	The value of the dm:string-value accessor
[parent]	The value of the dm:parent accessor.
[notation]	??? UNKNOWN ???
[base URI]	The value of the dm:base-uri accessor

4.7 Comments

4.7.1 Overview

Comment nodes encapsulate XML comments. Comments have the following properties:

the content
parent

4.7.2 Constructor

A comment node can be constructed using dm:comment-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes a string value.

dm:comment-node($content as xs:string) as CommentNode

The string "--" (two consecutive hyphens) must not occur within a comment's string value ([XML Recommendation]).

4.7.3 Accessors

Accessor	Returns:
dm:base-uri	()
dm:node-kind	"`comment`"
dm:node-name	()
dm:parent	The parent element or document node
dm:string-value	The content of the comment
dm:typed-value	()
dm:type	()
dm:children	()
dm:attributes	()
dm:namespaces	()

4.7.4 PSVI to Data Model Mapping

When a data model fragment is created from the PSVI a comment information item is mapped to a Comment Node. The precise transformation is described by specifying the PSVI property corresponding to each argument of the comment node constructor.

$content: The value of the [content] property.

There are no comment nodes for comments that are children of a document type declaration information item.

4.7.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps a Comment Node to a comment information item. The properties of the corresponding comment information item are constructed as follows:

Property	Value:
[content]	The value of the dm:string-value accessor
[parent]	??? WRONG: DM to Infoset Conversion!!! ??? The value of the dm:parent accessor

4.8 Text

4.8.1 Overview

Text nodes encapsulate XML character content. Text has the following properties:

content
parent

Text nodes must satisfy the following constraint:

A text node cannot contain the empty string as its content.

In addition, document and element nodes impose the constraint that two consecutive text nodes can never occur as adjacent siblings.

4.8.2 Constructor

A text node can be constructed using dm:text-node which returns a new node with unique identity, distinct from all other nodes.

The constructor takes a string value.

dm:text-node($content as xs:string) as TextNode

4.8.3 Accessors

Accessor	Returns:
dm:base-uri	()
dm:node-kind	"`text`"
dm:node-name	()
dm:parent	The parent element or document node
dm:string-value	The text content
dm:typed-value	The string value of the node as `xs:anySimpleType`
dm:type	()
dm:children	()
dm:attributes	()
dm:namespaces	()

4.8.4 PSVI to Data Model Mapping

When a data model fragment is created from the PSVI, a maximal sequence of consecutive character information items is mapped to a Text Node. The precise transformation is described by specifying the PSVI property corresponding to the argument of the text node constructor.

$content: A string comprised of characters that correspond to the [character code] properties of each of the character information items.

Note:

The string-value is not W3C normalized as described in the Character Model for the World Wide Web version 1.0 draft. See [Issue-0045: Text nodes are not W3C-normalized text].

4.8.5 Data Model to Infoset Mapping

The mapping of the data model to the XML Information Set maps a Text Node to a sequence of character information items. The properties of the corresponding character information items are constructed as follows:

Property	Value:
[character code]	The ISO 10646 character code of the character in question
[element content whitespace]	A boolean that is `true` if and only if the entire Text Node consists of white space and the parent of the Text Node exists and is an element and the element content type of the element is not "mixed". [Issue-0088: Content type is not preserved]
[parent]

5 Atomic Values

[Definition: An atomic value is a value in the value space of an atomic type, labeled with that atomic type.] The typed values of nodes whose type is unknown (for instance because they have not been validated) are labeled with the type xs:anySimpleType. [Definition: An atomic type is a primitive simple type or a type derived by restriction from a primitive simple type. Types derived by list or union are not atomic.]

The primitive simple types are those defined by XML Schema [XMLSchema Part 2]: xs:string, xs:boolean, xs:decimal, xs:float, xs:double, xs:duration, xs:dateTime, xs:time, xs:date, xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gDay, xs:gMonth, xs:hexBinary, xs:base64Binary, xs:anyURI, xs:QName, and xs:NOTATION. A derived atomic type is derived by restriction and has a primitive base type and a set of constraining facets.

Editorial note
Is "primitive" base type really what was intended?

The value space of the atomic values is the union of the value spaces of the nineteen primitive XML Schema types. This value space clearly includes those atomic values whose type is primitive, but it also includes those whose type is derived, as derivation by restriction always limits the value space.

An XML Schema simple type [XMLSchema Part 2] may be primitive or derived by restriction, list, or union.

The values of nodes whose type is an XML Schema primitive simple type or is derived by restriction from an XML Schema primitive simple type are represented as atomic values of that type.
The values of nodes whose type is derived by list from an XML Schema primitive type are represented by a sequence of atomic values whose type is the item type.
The values of nodes whose type is derived by union from an XML Schema primitive type are represented by a sequence of atomic values whose type is one of the individual types from the union. The union type information is lost and only the specific type of each individual item is retained.
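The treatment of list and union types can be sketched as follows. This is a non-normative illustration in Python: the AtomicValue class and the two functions are hypothetical, a union is modeled as an ordered list of (type name, parser) pairs, and Python's int and str stand in for the lexical rules of xs:integer and xs:token, which they only approximate.

```python
from typing import Callable, List, NamedTuple, Sequence, Tuple


class AtomicValue(NamedTuple):
    type_name: str   # the atomic type that labels the value
    value: object


MemberType = Tuple[str, Callable[[str], object]]


def typed_value_of_union(lexical: str, members: Sequence[MemberType]) -> AtomicValue:
    """Label the value with the first member type that accepts it."""
    for type_name, parse in members:
        try:
            return AtomicValue(type_name, parse(lexical))
        except ValueError:
            continue             # try the next member type
    raise ValueError(f"{lexical!r} is not valid for any member type")


def typed_value_of_list(lexical: str, members: Sequence[MemberType]) -> List[AtomicValue]:
    """A list type yields one atomic value per whitespace-separated token."""
    return [typed_value_of_union(token, members) for token in lexical.split()]
```

With the member types xs:integer and xs:token, the lexical value "8 M 10" yields three atomic values labeled xs:integer, xs:token, and xs:integer respectively; no trace of the union type remains in the result.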

An atomic value can be constructed from the value's lexical representation. Given a string and an atomic type, the atomic value is constructed in such a way as to be consistent with validation. In particular the construction takes into consideration the facets of the type. If the string does not represent a valid value of the type, an error is raised. When xs:anySimpleType is specified as the type, no validation takes place. The details of the construction are described in the "Constructor Functions" and the related "Casting Functions" section of [XQuery 1.0 and XPath 2.0 Functions and Operators].
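Construction from a lexical representation can be sketched as follows. This is a non-normative illustration in Python and does not reproduce the constructor functions of [XQuery 1.0 and XPath 2.0 Functions and Operators]: the function construct_atomic is hypothetical, only xs:integer, xs:boolean, and xs:anySimpleType are handled, and the only facet considered is an optional enumeration supplied by the caller.

```python
import re
from typing import Optional, Set, Tuple

_INTEGER = re.compile(r"[+-]?[0-9]+")


def construct_atomic(lexical: str, type_name: str,
                     enumeration: Optional[Set[object]] = None) -> Tuple[str, object]:
    """Return a (type name, value) pair, or raise ValueError if the string is invalid."""
    if type_name == "xs:anySimpleType":
        return (type_name, lexical)          # no validation takes place
    text = lexical.strip()                   # approximates whitespace collapsing
    if type_name == "xs:integer":
        if not _INTEGER.fullmatch(text):
            raise ValueError(f"{lexical!r} is not a valid xs:integer")
        value: object = int(text)
    elif type_name == "xs:boolean":
        if text not in ("true", "false", "1", "0"):
            raise ValueError(f"{lexical!r} is not a valid xs:boolean")
        value = text in ("true", "1")
    else:
        raise NotImplementedError(type_name)
    if enumeration is not None and value not in enumeration:
        raise ValueError(f"{lexical!r} violates the enumeration facet")
    return (type_name, value)
```

A string that parses as an integer but falls outside the supplied enumeration is rejected, which illustrates how the facets of a derived type take part in the construction.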

6 Sequences

A sequence is an ordered collection of zero or more items. An item may be a node or an atomic value, i.e. a sequence may contain nodes, atomic values, or any mixture of nodes and atomic values. When a node is added to a sequence, its identity remains the same. Consequently a node may occur in more than one sequence and a sequence may contain duplicate items. Sequences are "flat": they may not contain other sequences.

An important characteristic of the data model is that there is no distinction between an item (a node or an atomic value) and a singleton sequence containing that item. An item is equivalent to a singleton sequence containing that item and vice versa.
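The flatness of sequences and the equivalence of an item with its singleton sequence can be sketched as follows. This is a non-normative illustration in Python that assumes every value is represented as a tuple of items; the function name seq is hypothetical.

```python
def seq(*items):
    """Build a flat sequence; any sequence passed as an argument is spliced in."""
    flat = []
    for item in items:
        if isinstance(item, tuple):
            flat.extend(seq(*item))   # sequences never nest
        else:
            flat.append(item)
    return tuple(flat)
```

Under this representation seq(5) and seq(seq(5)) are the same value, seq(1, seq(2, 3), seq()) has exactly three items, and duplicate items are retained.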

Note:

Sequences replace node-sets from XPath 1.0. In XPath 1.0, node-sets do not contain duplicates. In generalizing node-sets to sequences in XPath 2.0, duplicate removal is provided by functions on node sequences.

See [Issue-0025: Types of Sequences].

A collection of documents is represented in the data model as a sequence of document nodes (see [Issue-0023: Support for document repositories]).

A sequence has no identity. Equality comparison of sequences is performed only by comparing the items of the sequences.

A XML Information Set Conformance

This specification conforms to the XML Information Set [XML Information Set]. The following information items must be exposed by the infoset producer to construct a data model fragment:

The Document Information Item with [base URI] and [children] properties.
Element Information Items with [children], [attributes], [in-scope namespaces], [local name], [namespace name], [parent] properties.
Attribute Information Items with [namespace name], [local name], [normalized value], [owner element] properties.
Character Information Items with [character code] and [parent] properties.
Processing Instruction Information Items with [target], [content] and [parent] properties.
Comment Information Items with [content] and [parent] properties.
Namespace Information Items with [prefix] and [namespace name] properties.

Other information items and properties made available by the Infoset processor are ignored. In addition to the properties above, the following properties from the PSV Infoset are required:

[validity], [type definition], [type definition namespace], [type definition name], [type definition anonymous], [member type definition], [member type definition namespace], [member type definition name], [member type definition anonymous] and [schema normalized value] properties on Element Information Items.
[validity], [type definition], [type definition namespace], [type definition name], [type definition anonymous], [member type definition], [member type definition namespace], [member type definition name], [member type definition anonymous] and [schema normalized value] properties on Attribute Information Items.

B References

XML Information Set: World Wide Web Consortium, XML Information Set (Infoset). See http://www.w3.org/TR/xml-infoset/.
XML Recommendation: World Wide Web Consortium, Extensible Markup Language (XML) 1.0 (Second Edition). See http://www.w3.org/TR/REC-xml.
Namespaces in XML: World Wide Web Consortium, Namespaces in XML. See http://www.w3.org/TR/REC-xml-names.
XQuery 1.0 and XPath 2.0 Functions and Operators: World Wide Web Consortium, XQuery 1.0 and XPath 2.0 Functions and Operators. See http://www.w3.org/TR/xquery-operators/.
XML Schema: Formal Description: World Wide Web Consortium, XML Schema: Formal Description, Working Draft, March 2001. See http://www.w3.org/TR/xmlschema-formal/.
XMLSchema Part 1: World Wide Web Consortium, XML Schema Part 1: Structures. See http://www.w3.org/TR/xmlschema-1.
XMLSchema Part 2: World Wide Web Consortium, XML Schema Part 2: Datatypes. See http://www.w3.org/TR/xmlschema-2.

C References (Non-Normative)

W3C Style Activity: World Wide Web Consortium, Style Activity. See http://www.w3.org/Style/Activity.
W3C XML Activity: World Wide Web Consortium, XML Activity. See http://www.w3.org/XML/Activity.
XML Query Data Model: World Wide Web Consortium, XML Query Data Model, Working Draft, Feb 2001. See http://www.w3.org/TR/2001/WD-query-datamodel-20010215/.
XPath: World Wide Web Consortium, XML Path Language (XPath): Version 1.0. November, 1999. See http://www.w3.org/TR/xpath.html.
XPath Requirements Version 2.0: World Wide Web Consortium, XPath Requirements Version 2.0. See http://www.w3.org/TR/xpath20req.
XPath 2.0: World Wide Web Consortium, XML Path Language (XPath): Version 2.0. See http://www.w3.org/TR/xpath20/.
XML Pointer Language (XPointer): World Wide Web Consortium, XML Pointer Language (XPointer). See http://www.w3.org/TR/xptr/.
XSLT 2.0: World Wide Web Consortium, XSL Transformations Language (XSLT): Version 2.0. See http://www.w3.org/TR/xslt20/.
XSLT 1.0: World Wide Web Consortium, XSL Transformations Language (XSLT): Version 1.0. See http://www.w3.org/TR/xslt.
XQuery 1.0 Formal Semantics: World Wide Web Consortium, XQuery 1.0 Formal Semantics. See http://www.w3.org/TR/query-semantics/.
XML Query Working Group: World Wide Web Consortium, XML Query Working Group. Home page: http://www.w3.org/XML/Activity#query-wg.
XSL Working Group: World Wide Web Consortium, XSL Working Group. Home page: http://www.w3.org/Style/XSL/.
XQuery 1.0: A Query Language for XML: World Wide Web Consortium, XQuery 1.0: A Query Language for XML. See http://www.w3.org/TR/xquery/.
XML Query Requirements: World Wide Web Consortium, XML Query Requirements. See http://www.w3.org/TR/2001/WD-xmlquery-req-20010215.

D Example (Non-Normative)

We use the following XML document to illustrate the information contained in a data model fragment:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="dm-example.xsl"?>
<!DOCTYPE catalog [
<!ELEMENT tshirt ANY>
<!ATTLIST tshirt code ID #REQUIRED>
<!ELEMENT album ANY>
<!ATTLIST album code ID #REQUIRED>
<!ELEMENT price ANY>
<!ATTLIST price currency CDATA 'USD'>
<!ENTITY copy '&#169;'>
]>
<catalog xmlns="http://www.example.com/catalog"
         xmlns:html="http://www.w3.org/1999/xhtml"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.example.com/catalog
                             dm-example.xsd"
         version="0.1">

<tshirt code="T1534017" label=" Staind : Been Awhile "
        xlink:href="http://store.artistdirect.com/store/prod/detail/0,,1655091,00.html"
        sizes="M L XL">
  <title> Staind: Been Awhile Tee Black (1-sided) </title>
  <description>
    <html:p>
      Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath
      the large 'Flock &amp; Weld' Staind logo. A very unique logo that looks as
      cool as it feels!
    </html:p>
  </description>
  <price> 25.00 </price>
</tshirt>

<album code="A1481344" label=" Staind : Its Been A While "
       formats="CD">
  <title> It's Been A While </title>
  <description xsi:nil="true" />
  <price currency="USD"> 10.99 </price>
  <artist> Staind </artist>
</album>

</catalog>

The document is associated with the URI "http://www.example.com/catalog.xml", and is valid with respect to the following XML schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cat="http://www.example.com/catalog"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           targetNamespace="http://www.example.com/catalog"
           elementFormDefault="qualified">

<xs:import namespace="http://www.w3.org/XML/1998/namespace"
           schemaLocation="http://www.w3.org/2001/xml.xsd" />

<xs:import namespace="http://www.w3.org/1999/xlink"
           schemaLocation="http://www.cs.rpi.edu/~puninj/XGMML/xlinks-2001.xsd" />

<xs:element name="catalog">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="cat:_item" maxOccurs="unbounded" />
    </xs:sequence>
    <xs:attribute name="version" type="xs:string" fixed="0.1" use="required" />
    <xs:attribute ref="xml:base" />
  </xs:complexType>
</xs:element>

<xs:element name="_item" type="cat:itemType" abstract="true" />

<xs:complexType name="itemType">
  <xs:sequence>
    <xs:element name="title" type="xs:token" />
    <xs:element name="description" type="cat:description" nillable="true" />
    <xs:element name="price" type="cat:price" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute name="label" type="xs:token" />
  <xs:attribute name="code" type="xs:ID" use="required" />
  <xs:attributeGroup ref="xlink:simpleLink" />
</xs:complexType>

<xs:element name="tshirt" type="cat:tshirtType" substitutionGroup="cat:_item" />

<xs:complexType name="tshirtType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:attribute name="sizes" type="cat:clothesSizes" use="required" />
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:simpleType name="clothesSizes">
  <xs:union memberTypes="cat:sizeList">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:enumeration value="oneSize" />
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>

<xs:simpleType name="sizeList">
  <xs:restriction>
    <xs:simpleType>
      <xs:list itemType="cat:clothesSize" />
    </xs:simpleType>
    <xs:minLength value="1" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="clothesSize">
  <xs:union memberTypes="cat:numberedSize cat:categorySize" />
</xs:simpleType>

<xs:simpleType name="numberedSize">
  <xs:restriction base="xs:integer">
    <xs:enumeration value="4" />
    <xs:enumeration value="6" />
    <xs:enumeration value="8" />
    <xs:enumeration value="10" />
    <xs:enumeration value="12" />
    <xs:enumeration value="14" />
    <xs:enumeration value="16" />
    <xs:enumeration value="18" />
    <xs:enumeration value="20" />
    <xs:enumeration value="22" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="categorySize">
  <xs:restriction base="xs:token">
    <xs:enumeration value="XS" />
    <xs:enumeration value="S" />
    <xs:enumeration value="M" />
    <xs:enumeration value="L" />
    <xs:enumeration value="XL" />
    <xs:enumeration value="XXL" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="album" type="cat:albumType" substitutionGroup="cat:_item" />

<xs:complexType name="albumType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:sequence>
        <xs:element name="artist" type="xs:string" />
      </xs:sequence>
      <xs:attribute name="formats" type="cat:formatsType" use="required" />
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:simpleType name="formatsType">
  <xs:list itemType="cat:formatType" />
</xs:simpleType>

<xs:simpleType name="formatType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="CD" />
    <xs:enumeration value="MiniDisc" />
    <xs:enumeration value="tape" />
    <xs:enumeration value="vinyl" />
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="description" mixed="true">
  <xs:sequence>
    <xs:any namespace="http://www.w3.org/1999/xhtml" processContents="lax"
            minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:complexType name="price">
  <xs:simpleContent>
    <xs:extension base="cat:monetaryAmount">
      <xs:attribute name="currency" type="cat:currencyType" default="USD" />
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:simpleType name="currencyType">
  <xs:restriction base="xs:token">
    <xs:pattern value="[A-Z]{3}" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="monetaryAmount">
  <xs:restriction base="xs:decimal">
    <xs:fractionDigits value="2" />
    <xs:pattern value="\d+\.\d{2}" />
  </xs:restriction>
</xs:simpleType>

</xs:schema>
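
The two pattern facets in the schema (on cat:monetaryAmount and cat:currencyType) can be tried out with a short Python sketch. It is only an approximation under stated assumptions: the function names are hypothetical, Python's `re` syntax stands in for XML Schema regular expressions, and whitespace is collapsed before matching because both base types use the collapse whitespace rule.

```python
import re

# Pattern facets copied from the schema. XML Schema patterns must match the
# whole lexical value, so re.fullmatch is used rather than re.search.
MONETARY_AMOUNT = re.compile(r"\d+\.\d{2}")
CURRENCY_TYPE = re.compile(r"[A-Z]{3}")


def collapse(lexical: str) -> str:
    """Approximate the 'collapse' whitespace rule: trim, then squeeze runs."""
    trimmed = lexical.strip(" \t\r\n")
    return " ".join(re.split(r"[ \t\r\n]+", trimmed))


def is_monetary_amount(lexical: str) -> bool:
    """Hypothetical helper: does the value satisfy the monetaryAmount pattern?"""
    return MONETARY_AMOUNT.fullmatch(collapse(lexical)) is not None


def is_currency(lexical: str) -> bool:
    """Hypothetical helper: does the value satisfy the currencyType pattern?"""
    return CURRENCY_TYPE.fullmatch(collapse(lexical)) is not None


print(is_monetary_amount(" 25.00 "), is_monetary_amount("25"), is_currency("USD"))
```

The price content " 25.00 " from the instance document passes because the surrounding spaces are removed before the pattern is applied; "25" fails because the pattern requires exactly two fraction digits.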

This example exposes the data model for a document that has an associated schema and has been validated successfully against it. In general, an XML Schema is not required: the data model can represent a schemaless, well-formed XML document using the rules described in 3.4 Types.

The XML document is represented by the data-model constructors below. The value D1 represents a document node; the values E1, E2, etc. represent element nodes; the values A1, A2, etc. represent attribute nodes; the values N1, N2, etc. represent namespace nodes; the values P1, P2, etc. represent processing-instruction nodes; the values T1, T2, etc. represent text nodes.

For brevity:

The data model doesn't include whitespace-only text nodes.
Literal strings are shown without the xs:string() constructor.
Literal decimals are shown without the xs:decimal() constructor.
Nodes are referred to using the syntax [nodeID]
QNames are used with the following prefixes:
xs http://www.w3.org/2001/XMLSchema
xsi http://www.w3.org/2001/XMLSchema-instance
cat http://www.example.com/catalog
xlink http://www.w3.org/1999/xlink
html http://www.w3.org/1999/xhtml
The abbreviation "\n" is used in string literals to represent a newline character; this isn't supported in XPath, but it makes this presentation clearer.
Accessors that return the empty sequence have been omitted.

// Document node D1
dm:base-uri(D1)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(D1)	=	"document"
dm:string-value(D1)	=	" Staind: Been Awhile Tee Black (1-sided) \n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n 25.00 It's Been A While 10.99 Staind "
dm:children(D1)	=	([P1], [E1])

// Namespace node N1
dm:node-kind(N1)	=	"namespace"
dm:node-name(N1)	=	xs:QName("", "xml")
dm:string-value(N1)	=	"http://www.w3.org/XML/1998/namespace"

// Namespace node N2
dm:node-kind(N2)	=	"namespace"
dm:node-name(N2)	=	()
dm:string-value(N2)	=	"http://www.example.com/catalog"

// Namespace node N3
dm:node-kind(N3)	=	"namespace"
dm:node-name(N3)	=	xs:QName("", "html")
dm:string-value(N3)	=	"http://www.w3.org/1999/xhtml"

// Namespace node N4
dm:node-kind(N4)	=	"namespace"
dm:node-name(N4)	=	xs:QName("", "xlink")
dm:string-value(N4)	=	"http://www.w3.org/1999/xlink"

// Namespace node N5
dm:node-kind(N5)	=	"namespace"
dm:node-name(N5)	=	xs:QName("", "xsi")
dm:string-value(N5)	=	"http://www.w3.org/2001/XMLSchema-instance"

// Processing Instruction node P1
dm:base-uri(P1)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(P1)	=	"processing-instruction"
dm:node-name(P1)	=	xs:QName("", "xml-stylesheet")
dm:string-value(P1)	=	"type="text/xsl" href="dm-example.xsl""
dm:parent(P1)	=	([D1])

// Element node E1
dm:base-uri(E1)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E1)	=	"element"
dm:node-name(E1)	=	xs:QName("http://www.example.com/catalog", "catalog")
dm:string-value(E1)	=	" Staind: Been Awhile Tee Black (1-sided) \n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n 25.00 It's Been A While 10.99 Staind "
dm:typed-value(E1)	=	fn:error()
// xs:anyType because of the anonymous type definition
dm:type(E1)	=	xs:anyType
dm:parent(E1)	=	([D1])
dm:children(E1)	=	([E2], [E7])
dm:attributes(E1)	=	([A1], [A2])
dm:namespaces(E1)	=	([N1], [N2], [N3], [N4], [N5])

// Attribute node A1
dm:node-kind(A1)	=	"attribute"
dm:node-name(A1)	=	xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:schemaLocation")
dm:string-value(A1)	=	"http://www.example.com/catalog dm-example.xsd"
dm:typed-value(A1)	=	(xs:anyURI("http://www.example.com/catalog"), xs:anyURI("dm-example.xsd"))
dm:type(A1)	=	xs:anySimpleType
dm:parent(A1)	=	([E1])

// Attribute node A2
dm:node-kind(A2)	=	"attribute"
dm:node-name(A2)	=	xs:QName("", "version")
dm:string-value(A2)	=	"0.1"
dm:typed-value(A2)	=	"0.1"
dm:type(A2)	=	xs:string
dm:parent(A2)	=	([E1])

// Element node E2
dm:base-uri(E2)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E2)	=	"element"
dm:node-name(E2)	=	xs:QName("http://www.example.com/catalog", "tshirt")
dm:string-value(E2)	=	" Staind: Been Awhile Tee Black (1-sided) \n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n 25.00 "
dm:typed-value(E2)	=	fn:error()
dm:type(E2)	=	cat:tshirtType
dm:parent(E2)	=	([E1])
dm:children(E2)	=	([E3], [E4], [E6])
dm:attributes(E2)	=	([A3], [A4], [A5], [A6])
dm:namespaces(E2)	=	([N1], [N2], [N3], [N4], [N5])

// Attribute node A3
dm:node-kind(A3)	=	"attribute"
dm:node-name(A3)	=	xs:QName("", "code")
dm:string-value(A3)	=	"T1534017"
dm:typed-value(A3)	=	xs:ID("T1534017")
dm:type(A3)	=	xs:ID
dm:parent(A3)	=	([E2])

// Attribute node A4
dm:node-kind(A4)	=	"attribute"
dm:node-name(A4)	=	xs:QName("", "label")
dm:string-value(A4)	=	"Staind : Been Awhile"
dm:typed-value(A4)	=	xs:token("Staind : Been Awhile")
dm:type(A4)	=	xs:token
dm:parent(A4)	=	([E2])

// Attribute node A5
dm:node-kind(A5)	=	"attribute"
dm:node-name(A5)	=	xs:QName("http://www.w3.org/1999/xlink", "xlink:href")
dm:string-value(A5)	=	"http://store.artistdirect.com/store/prod/detail/0,,1655091,00.html"
dm:typed-value(A5)	=	xs:anyURI("http://store.artistdirect.com/store/prod/detail/0,,1655091,00.html")
dm:type(A5)	=	xs:anyURI
dm:parent(A5)	=	([E2])

// Attribute node A6
dm:node-kind(A6)	=	"attribute"
dm:node-name(A6)	=	xs:QName("", "sizes")
dm:string-value(A6)	=	"M L XL"
dm:typed-value(A6)	=	(xs:anySimpleType("M"), xs:anySimpleType("L"), xs:anySimpleType("XL"))
dm:type(A6)	=	cat:sizeList
dm:parent(A6)	=	([E2])

// Element node E3
dm:base-uri(E3)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E3)	=	"element"
dm:node-name(E3)	=	xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E3)	=	"Staind: Been Awhile Tee Black (1-sided)"
dm:typed-value(E3)	=	xs:token("Staind: Been Awhile Tee Black (1-sided)")
dm:type(E3)	=	xs:token
dm:parent(E3)	=	([E2])
dm:children(E3)	=	([T1])
dm:attributes(E3)	=	()
dm:namespaces(E3)	=	([N1], [N2], [N3], [N4], [N5])

// Text node T1
dm:base-uri(T1)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T1)	=	"text"
dm:string-value(T1)	=	"Staind: Been Awhile Tee Black (1-sided)"
dm:typed-value(T1)	=	xs:anySimpleType("Staind: Been Awhile Tee Black (1-sided)")
dm:type(T1)	=	xs:anySimpleType
dm:parent(T1)	=	([E3])

// Element node E4
dm:base-uri(E4)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E4)	=	"element"
dm:node-name(E4)	=	xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E4)	=	"\n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n "
dm:typed-value(E4)	=	fn:error()
dm:type(E4)	=	cat:description
dm:parent(E4)	=	([E2])
dm:children(E4)	=	([E5])
dm:attributes(E4)	=	()
dm:namespaces(E4)	=	([N1], [N2], [N3], [N4], [N5])

// Element node E5
dm:base-uri(E5)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E5)	=	"element"
dm:node-name(E5)	=	xs:QName("http://www.w3.org/1999/xhtml", "html:p")
dm:string-value(E5)	=	"\n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n "
dm:typed-value(E5)	=	fn:error() or same as string-value???
dm:type(E5)	=	xs:anyType
dm:parent(E5)	=	([E4])
dm:children(E5)	=	([T2])
dm:attributes(E5)	=	()
dm:namespaces(E5)	=	([N1], [N2], [N3], [N4], [N5])

// Text node T2
dm:base-uri(T2)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T2)	=	"text"
dm:string-value(T2)	=	"\n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n "
dm:typed-value(T2)	=	xs:anySimpleType("\n Lyrics from the hit song 'It's Been Awhile' are shown in white, beneath\n the large 'Flock & Weld' Staind logo. A very unique logo that looks as\n cool as it feels!\n ")
dm:type(T2)	=	xs:anySimpleType
dm:parent(T2)	=	([E5])

// Element node E6
dm:base-uri(E6)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E6)	=	"element"
dm:node-name(E6)	=	xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E6)	=	"25.00"
// The typed-value is based on the content type of the complex type for the element
dm:typed-value(E6)	=	cat:monetaryAmount(25.0)
dm:type(E6)	=	cat:price
dm:parent(E6)	=	([E2])
dm:children(E6)	=	([T3])
dm:attributes(E6)	=	([A7])
dm:namespaces(E6)	=	([N1], [N2], [N3], [N4], [N5])

// Attribute node A7
dm:node-kind(A7)	=	"attribute"
dm:node-name(A7)	=	xs:QName("", "currency")
dm:string-value(A7)	=	"USD"
dm:typed-value(A7)	=	cat:currencyType("USD")
dm:type(A7)	=	cat:currencyType
dm:parent(A7)	=	([E6])

// Text node T3
dm:base-uri(T3)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T3)	=	"text"
dm:string-value(T3)	=	"25.00"
dm:typed-value(T3)	=	xs:anySimpleType("25.00")
dm:type(T3)	=	xs:anySimpleType
dm:parent(T3)	=	([E6])

// Element node E7
dm:base-uri(E7)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E7)	=	"element"
dm:node-name(E7)	=	xs:QName("http://www.example.com/catalog", "album")
dm:string-value(E7)	=	" It's Been A While 10.99 Staind "
dm:typed-value(E7)	=	fn:error()
dm:type(E7)	=	cat:albumType
dm:parent(E7)	=	([E1])
dm:children(E7)	=	([E8], [E9], [E10], [E11])
dm:attributes(E7)	=	([A8], [A9], [A10])
dm:namespaces(E7)	=	([N1], [N2], [N3], [N4], [N5])

// Attribute node A8
dm:node-kind(A8)	=	"attribute"
dm:node-name(A8)	=	xs:QName("", "code")
dm:string-value(A8)	=	"A1481344"
dm:typed-value(A8)	=	xs:ID("A1481344")
dm:type(A8)	=	xs:ID
dm:parent(A8)	=	([E7])

// Attribute node A9
dm:node-kind(A9)	=	"attribute"
dm:node-name(A9)	=	xs:QName("", "label")
dm:string-value(A9)	=	"Staind : Its Been A While"
dm:typed-value(A9)	=	xs:token("Staind : Its Been A While")
dm:type(A9)	=	xs:token
dm:parent(A9)	=	([E7])

// Attribute node A10
dm:node-kind(A10)	=	"attribute"
dm:node-name(A10)	=	xs:QName("", "formats")
dm:string-value(A10)	=	"CD"
dm:typed-value(A10)	=	cat:formatType("CD")
dm:type(A10)	=	cat:formatsType
dm:parent(A10)	=	([E7])

// Element node E8
dm:base-uri(E8)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E8)	=	"element"
dm:node-name(E8)	=	xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E8)	=	"It's Been A While"
dm:typed-value(E8)	=	xs:token("It's Been A While")
dm:type(E8)	=	xs:token
dm:parent(E8)	=	([E7])
dm:children(E8)	=	([T4])
dm:attributes(E8)	=	()
dm:namespaces(E8)	=	([N1], [N2], [N3], [N4], [N5])

// Text node T4
dm:base-uri(T4)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T4)	=	"text"
dm:string-value(T4)	=	"It's Been A While"
dm:typed-value(T4)	=	xs:anySimpleType("It's Been A While")
dm:type(T4)	=	xs:anySimpleType
dm:parent(T4)	=	([E8])

// Element node E9
dm:base-uri(E9)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E9)	=	"element"
dm:node-name(E9)	=	xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E9)	=	""
// xsi:nil is true so the typed value is the empty sequence
dm:typed-value(E9)	=	()
dm:type(E9)	=	cat:description
dm:parent(E9)	=	([E7])
dm:children(E9)	=	()
dm:attributes(E9)	=	([A11])
dm:namespaces(E9)	=	([N1], [N2], [N3], [N4], [N5])

// Attribute node A11
dm:node-kind(A11)	=	"attribute"
dm:node-name(A11)	=	xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:nil")
dm:string-value(A11)	=	"true"
dm:typed-value(A11)	=	xs:boolean("true")
dm:type(A11)	=	xs:boolean
dm:parent(A11)	=	([E9])

// Element node E10
dm:base-uri(E10)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E10)	=	"element"
dm:node-name(E10)	=	xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E10)	=	"10.99"
dm:typed-value(E10)	=	cat:monetaryAmount(10.99)
dm:type(E10)	=	cat:price
dm:parent(E10)	=	([E7])
dm:children(E10)	=	([T5])
dm:attributes(E10)	=	([A12])
dm:namespaces(E10)	=	([N1], [N2], [N3], [N4], [N5])

// Attribute node A12
dm:node-kind(A12)	=	"attribute"
dm:node-name(A12)	=	xs:QName("", "currency")
dm:string-value(A12)	=	"USD"
dm:typed-value(A12)	=	cat:currencyType("USD")
dm:type(A12)	=	cat:currencyType
dm:parent(A12)	=	([E10])

// Text node T5
dm:base-uri(T5)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T5)	=	"text"
dm:string-value(T5)	=	"10.99"
dm:typed-value(T5)	=	xs:anySimpleType("10.99")
dm:type(T5)	=	xs:anySimpleType
dm:parent(T5)	=	([E10])

// Element node E11
dm:base-uri(E11)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E11)	=	"element"
dm:node-name(E11)	=	xs:QName("http://www.example.com/catalog", "artist")
dm:string-value(E11)	=	" Staind "
dm:typed-value(E11)	=	" Staind "
dm:type(E11)	=	xs:string
dm:parent(E11)	=	([E7])
dm:children(E11)	=	([T6])
dm:attributes(E11)	=	()
dm:namespaces(E11)	=	([N1], [N2], [N3], [N4], [N5])

// Text node T6
dm:base-uri(T6)	=	xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T6)	=	"text"
dm:string-value(T6)	=	" Staind "
dm:typed-value(T6)	=	xs:anySimpleType(" Staind ")
dm:type(T6)	=	xs:anySimpleType
dm:parent(T6)	=	([E11])
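
The dm:string-value entries for element nodes in the listing are the concatenation of the string-values of their descendant text nodes, in document order. A minimal Python sketch of that rule follows; it uses the standard library's ElementTree rather than any data model API, the `string_value` helper is a hypothetical name, and the fragment is a cut-down copy of the album element. Unlike the listing, ElementTree keeps whitespace-only text between elements.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the album element from the example document.
FRAGMENT = """<album xmlns="http://www.example.com/catalog" code="A1481344">
  <title> It's Been A While </title>
  <price currency="USD"> 10.99 </price>
  <artist> Staind </artist>
</album>"""

CAT = "{http://www.example.com/catalog}"


def string_value(element: ET.Element) -> str:
    """Concatenate all descendant text in document order (attributes excluded)."""
    return "".join(element.itertext())


album = ET.fromstring(FRAGMENT)
artist = album.find(CAT + "artist")

print(repr(string_value(artist)))
print(string_value(album).split())
```

Attribute values such as code and currency do not contribute to the result, which matches the listing: the string-value of E7 contains the title, price and artist text only.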

A graphical representation of the data model for the preceding example is shown below. Document order in this representation can be found by following the traditional in-order, left-to-right, depth-first traversal; however, because the image has been rotated for easier presentation, this appears to be in-order, bottom-to-top, depth-first order.

Graphical depiction of the example data model.
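
The depth-first traversal described above can be sketched in Python. The `Node` class and `document_order` function are hypothetical illustrations, not part of the data model; namespace nodes are left out, and the sketch assumes that an element's attributes come after the element itself and before its children.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical stand-in for a data model node; not the spec's API."""
    node_id: str
    attributes: List["Node"] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def document_order(node: Node) -> Iterator[Node]:
    """Yield a node, then its attributes, then each child subtree, depth first."""
    yield node
    yield from node.attributes
    for child in node.children:
        yield from document_order(child)


# A cut-down version of the example tree; most nodes are omitted for brevity.
e3 = Node("E3", children=[Node("T1")])
e6 = Node("E6", attributes=[Node("A7")], children=[Node("T3")])
e2 = Node("E2", attributes=[Node("A3")], children=[e3, e6])
d1 = Node("D1", children=[Node("E1", children=[e2])])

order = [n.node_id for n in document_order(d1)]
print(order)
```

Each node is visited before any of its descendants, so the document node always comes first and a later sibling's subtree is never entered until the earlier sibling's subtree is exhausted.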

E Open Issues (Non-Normative)

The issues in this section serve as a design history for this document. The ordering of issues is irrelevant. Each issue has a unique id of the form Issue-<dddd> (where d is a digit). This can be used for referring to the issue by <url-of-this-document>#Issue-<dddd>. Furthermore, each issue has a mnemonic header, a date, an optional description, and an optional resolution.

Some of the issues contain references to W3C internal archives. These are marked with "members only". Some of the descriptions of the resolved issues are obsolete with respect to the current version of the document.

Starting with the November 2002 publication, only issues that are still open are displayed. All of the issues are still available in the XML sources for this document.

Issue-0004: Schema/DTD

Date: Oct-2000

Raised by: Datamodel Editors

Effects: (Issue-0004, 2, This implies accommodation for the case where both a DTD and a schema are applied. This will probably require some reconciliation of the [attribute type] property with type information from the PSVI. See issues , . )

3.3 XML Schemas and the XML Information Set
4.3.4 PSVI to Data Model Mapping

Description: A document may refer to a DTD and have an associated schema. Currently, content model from the DTD is ignored, as are unique IDs from the schema. A coherent priority or merging strategy is needed.

Any strategy developed must also address the issue of types derived from xs:ID.

Issue-0023: Support for document repositories

Date: 27-Mar-2001

Raised by: XPath 2.0 Task Force

Effects: (Issue-0023, 1, A collection of documents is represented in the data model as a sequence of document nodes (see ).)

6 Sequences

Description: Many people would like to see support for document repositories in XPath 2.0 with a corresponding notion in the data model. A document repository is easy to model as a sequence or bag of document nodes. It may have some additional properties; an ordered repository, for example, would define an order among all the nodes in the repository.

Issue-0025: Types of Sequences

Date: 27-Apr-2001

Raised by: Mike Kay

Effects: (Issue-0025, 1, See .)

6 Sequences

Description: Should sequence values carry their type as do simple typed values and element and attribute nodes?

Issue-0033: Unclear relationship between values passed to the constructor, and those returned by the accessor

Date: 28-April-2001

Raised by: James Clark

Effects: (Issue-0033, 1, See .)

2 Notation and Pseudo-code Syntax

Description: http://lists.w3.org/Archives/Member/w3c-xsl-query/2001Apr/0312.html (members only). Asks for inference rules, especially for the constructor, describing when values returned by an accessor are the same as those set by the corresponding constructor. Especially unclear are the collapsing of adjacent text nodes, base URIs, and namespace declarations.

Issue-0037: Axis functions

Date: 19-July-2001

Raised by: XSL WG

Description: Define (somewhere other than the data model document?) axis functions for non-primitive axes like descendant-or-self.

Issue-0038: XPath 1.0 treatment of non-unique IDs

Date: 13-August-2001

Raised by: Datamodel Editors

Effects: (Issue-0038, 1, Using this definition, only IDs declared in a DTD are effective. See . Even so, this definition is not backward compatible with XPath 1.0. See . Furthermore, it doesn't even work as spec'd, see . )

4.3.4 PSVI to Data Model Mapping

Description: From XPath 1.0: "If an XML processor reports two elements in a document as having the same unique ID (which is possible only if the document is invalid) then the second element in document order must be treated as not having a unique ID." This has not been incorporated into this document.
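
The XPath 1.0 rule quoted above amounts to a "first element wins" lookup. The following Python sketch shows one way to express it; `build_id_index` and its input format are hypothetical, and the third element in the usage example is invented purely to create a duplicate ID.

```python
from typing import Dict, Iterable, Tuple


def build_id_index(elements: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each ID value to the first element, in document order, that carries it.

    `elements` is a hypothetical stream of (element label, ID value) pairs
    that is already sorted in document order.
    """
    index: Dict[str, str] = {}
    for label, id_value in elements:
        # setdefault keeps the first element and ignores later duplicates,
        # so a later element is treated as not having a unique ID.
        index.setdefault(id_value, label)
    return index


# E2 and E7 come from the example; "E12" is an invented duplicate.
in_document_order = [("E2", "T1534017"), ("E7", "A1481344"), ("E12", "T1534017")]
index = build_id_index(in_document_order)
print(index)
```

With this index, a lookup of "T1534017" returns E2 and never the invented later element, which is the behaviour XPath 1.0 requires for invalid documents.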

Issue-0040: Setting and examining construction flags

Date: 15-August-2001

Raised by: Jim Melton

Description: [Jim] found [him]self wondering how those flags (parameters) get set/passed. More importantly, can a process ask whether "this instance" of the data model has those flags set or not? If so, how? If not, why not?

Issue-0042: System Id and Public Id are not exposed

Date: 15-August-2001

Raised by: Jim Melton

Effects: (Issue-0042, 1, There is no way to determine what DTD might apply to the data model. See .)

4.2.4 PSVI to Datamodel Mapping

Description: In our model, there is no way for a query to determine what DTD is relevant for the data model instance. That seems like a piece of information that might be wanted occasionally (though probably not often).

Issue-0044: Unable to construct an element with unique ID

Date: 15-August-2001

Raised by: Jonathan Marsh

Effects: (Issue-0044, 1, Using this definition, only IDs declared in a DTD are effective. See . Even so, this definition is not backward compatible with XPath 1.0. See . Furthermore, it doesn't even work as spec'd, see . )

4.3.4 PSVI to Data Model Mapping

Description: The unique ID property is defined on an element node, but is a function of an attribute information item. When an element node is constructed it is given an attribute node - not an info item. An attribute node is insufficient to remember the appropriate properties from the infoset in order for the element constructor to detect when an attribute is an ID declared in the DTD.

Issue-0045: Text nodes are not W3C-normalized text

Date: 2001-08-17

Raised by: Jim Melton

Effects: (Issue-0045, 1, The string-value is not W3C normalized as described in the Character Model for the World Wide Web version 1.0 draft. See .)

4.8.4 PSVI to Data Model Mapping

Description: The Character Model for the World Wide Web version 1.0 working draft defines W3C-normalized text. The algorithm for constructing text nodes from character information items does not perform normalization to this form. Should it?

Issue-0050: Relative order of free-floating nodes

Date: 2001-08-17

Raised by: Jim Melton

Effects: (Issue-0050, 2, The relative order of free-floating nodes (those not in a document) is not defined. See .)

3.2 Document Order
Issue-0058: Node constructors formalism of questionable value

Description: Are newly-constructed nodes in any particular order, such as some kind of document order? Does the order of these nodes have any relationship to the document order of the "input" data model instance? In fact, is the process being described properly characterized as "create a new data model instance from information derived from an existing data model instance", or something similar? Can more than one "new" document instance be created by a single query? In the fourth paragraph [Section 4], we see the phrase "the document node" --- is this the "existing document"'s document node, the "new document"'s document node, both, neither? Can more than one "existing" document instance be the source of information for a query?

Issue-0052: Element constructor copies nodes?

Date: 28-August-2001

Raised by: Michael Kay

Effects: (Issue-0052, 1, To guarantee that the parent-child relationship is invertible, (i.e. that the parent of any child of a node is itself and that any node that has a parent is among its parent's children), the element constructors logically create a copy of all of their namespace, attribute, and children arguments and set the parent property of these nodes to the newly created element node. As long as the parent-child constraint is satisfied, an implementation of the data model may choose to use specialized techniques to avoid creating physical copies of the arguments to an element constructor. See .)

4.3.2 Constructor

Description: In section 4.2 Elements, the notion that the constructor makes a copy of the supplied child nodes seems strange. It's hard to square this with the definition of node identity. Also, I don't see why the provision is needed here, but not for the document node constructor. Wouldn't it be cleaner to define a precondition that all the child nodes supplied to the constructor must be parentless?
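
The alternative raised in this issue, a precondition that supplied children must be parentless, can be sketched in Python as follows. The `Node` class and `is_invertible` check are hypothetical and cover only the parent and children properties; they are not the constructors defined in this document.

```python
from typing import List, Optional


class Node:
    """Hypothetical node type used only for this illustration."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = []
        for child in children or []:
            # Precondition suggested in the issue: children must be parentless,
            # so no copy is needed and node identity is untouched.
            if child.parent is not None:
                raise ValueError(f"{child.name} already has a parent")
            child.parent = self
            self.children.append(child)


def is_invertible(node: Node) -> bool:
    """Check that the parent of every child is the node itself, recursively."""
    return all(c.parent is node and is_invertible(c) for c in node.children)


title = Node("title")
tshirt = Node("tshirt", [title])
print(is_invertible(tshirt))
```

Passing `title` to a second constructor call, for example `Node("album", [title])`, raises `ValueError` in this sketch, whereas the copying approach described in the Effects text would silently create a new node with a new identity.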

Issue-0058: Node constructors formalism of questionable value

Date: 25-September-2001

Raised by: Michael Kay

Description: Node constructors. I'm a bit concerned that this is lacking in rigor. Some of this is exposed in [Issue-0050: Relative order of free-floating nodes]. The idea that the constructor for a parent node (Element or Document) takes a copy of the supplied children node doesn't seem to be fully worked out. What does "taking a copy" mean, where is it defined? It has to be a deep copy to make sense; it has to preserve its name, its type, and its children, but not its base URI or its node identity, and it acquires a new position in document order. What happens about the namespace nodes when an element or attribute is copied? Altogether, I'm worried that this idea of node constructors looks formal, but is actually just as informal as the 1.0 specification. It's actually a very procedural description, and I can't really see why it's needed: if it's intended as a target vocabulary for the formal semantics of the language, then it's a pretty shaky foundation. I'd be much happier with a model that only defines the valid states in the system; if we are going to define the permitted state transitions, we need to be much more rigorous about it.

Issue-0059: Pseudo-formalism provides no value

Date: 25-September-2001

Raised by: Michael Kay

Description: "The sequence-map function applies its first function argument to each member of its second sequence argument and returns a new sequence containing the result of applying the function to each member of the sequence." I really wonder whether it is a good idea to define the data model using a pseudo-formal language that we make up and half-explain as we go along? If we can't use an existing formal notation like Z or VDM that has a good specification we can reference, we should do the whole thing in English.
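
For comparison, the quoted sentence can be written as a few lines of Python. This is a sketch only: `sequence_map` is a hypothetical name, Python lists stand in for sequences, and the flattening of list results is an assumption based on the rule that a sequence cannot be a member of a sequence.

```python
from typing import Any, Callable, Iterable, List


def sequence_map(func: Callable[[Any], Any], seq: Iterable[Any]) -> List[Any]:
    """Apply func to each member of seq and return the results as one flat list."""
    result: List[Any] = []
    for item in seq:
        value = func(item)
        if isinstance(value, (list, tuple)):
            # Assumption: a result that is itself a sequence is spliced in,
            # because sequences do not nest in the data model.
            result.extend(value)
        else:
            result.append(value)
    return result


print(sequence_map(str.upper, ["m", "l", "xl"]))
print(sequence_map(lambda n: [n, n], [1, 2]))
```

A function that returns an empty list for every member yields an empty result, mirroring the way the empty sequence disappears when sequences are concatenated.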

Issue-0063: Is prefix preserved?

Date: 16-October-2001

Raised by: Michael Kay

Description: Although XSLT and the XPath 2.0 data model agree that element and attribute nodes do not hold a namespace prefix, XSLT has always hinted that prefixes might be preserved through a transformation where possible. http://lists.w3.org/Archives/Member/w3c-xsl-query/2001Oct/0036.html (members only).

[JM: the ability to recover the lexical value of an xs:QName simple type seems useful, perhaps even necessary. It is at least needed to support the name() function. A better description of the xs:QName accessors is required, perhaps unifying xs:QName and expanded-QName.]

Issue-0069: Canonical form for derived types.

Date: 4-December-2001

Raised by: XPath Task Force

Description: Should derived types have a canonical form? Should we ask XML Schema to fix this?

Issue-0071: Magic Attributes

Date: 7-December-2001

Raised by: Michael Rys

Description: Should the following attributes be represented in the data model? xmlns attributes, xml:lang, xml:space, xsi:nil (etc). If so, how?

Resolution: Namespace declarations do not appear in the attributes property. All of the other attributes except xsi:nil do.

This issue remains open while the issue of xsi:nil is resolved.

Issue-0072: Lexical representation of Schema primitive types

Date: 13-January-2002

Raised by: Jeni Tennison

Effects: (Issue-0072, 1, The string-value accessor can be used to recover the lexical representation of an atomic value. The details of converting an atomic value to its string representation are described in the Casting Functions section of . In particular if the atomic value's type is primitive, string-value returns the atomic value's canonical lexical representation for that primitive type as specified in . If the atomic value's type is derived, the lexical representation depends on whether a value is supplied for the type's pattern facet: If no such value is supplied string-value returns the atomic value's canonical lexical representation for the primitive base type. Otherwise string-value returns a lexical representation that matches the value specified for the pattern facet. (This case includes xs:integers.) See .)

4.1.5 string-value Accessor

Description: Unfortunately the XML Schema Datatypes Rec doesn't detail the canonical lexical representation of all of the primitive types. In particular, no canonical lexical representation is specified for:

xs:string, xs:base64Binary, xs:anyURI (but that's OK, I think we can guess)
xs:duration - presumably the lexical representation contains all components of the duration (years, months, days, hours, minutes and seconds), even those that occur 0 times? Or are these omitted? In the latter case, what's the canonical lexical representation of PT0S? Since the number of seconds can be a decimal, is this decimal represented with a decimal point (i.e. using the canonical lexical representation for xs:decimal)?
xs:date - what happens to the timezone component? Presumably, unlike xs:dateTime and xs:time, this isn't normalized to Z? (And similarly for xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gMonth, and xs:gDay.)
xs:QName and xs:NOTATION - these are the trickiest (their value spaces are the same). The XML Schema Rec states that the lexical representation of a QName depends on the in-scope namespaces. Does this mean the ones in the query/stylesheet or the ones from the source document? What if there's more than one namespace declaration for the namespace URI? What if there aren't any?

http://lists.w3.org/Archives/Public/www-xml-query-comments/2002Jan/0268.html
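
The xs:QName question can be made concrete with a small Python sketch. The function name and the dictionary of bindings are hypothetical; the point is only that one expanded name may correspond to several lexical QNames, or to none, depending on which in-scope namespaces are consulted.

```python
from typing import Dict, List


def candidate_lexical_qnames(namespace_uri: str, local_name: str,
                             in_scope: Dict[str, str]) -> List[str]:
    """Return every lexical QName that the given bindings allow.

    `in_scope` maps prefix -> namespace URI; the empty prefix stands for the
    default namespace.
    """
    names: List[str] = []
    for prefix, uri in sorted(in_scope.items()):
        if uri == namespace_uri:
            names.append(f"{prefix}:{local_name}" if prefix else local_name)
    return names


CATALOG = "http://www.example.com/catalog"
bindings = {"": CATALOG, "cat": CATALOG, "html": "http://www.w3.org/1999/xhtml"}

print(candidate_lexical_qnames(CATALOG, "price", bindings))
print(candidate_lexical_qnames("http://www.w3.org/1999/xlink", "href", bindings))
```

The first call finds two candidates and the second finds none, which are exactly the two problem cases raised in the description.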

Issue-0074: Do we need Document fragments

Date: 12-February-2002

Raised by: Michael Rys

Effects: (Issue-0074, 1, In a well-formed document, the children of the document node consist exclusively of element nodes, processing-instruction nodes, and comment nodes, and exactly one of these children is an element node. A document node in the data model is more permissive: it allows more than one element node as a child and also permits text nodes as children. See . )

4.2.1 Overview

Description: Currently the document node in the data model is permissive in the number of element nodes it can directly contain, whereas the Infoset allows only a single element node. Since preserving the single-element constraint is important for ensuring that queries generate only well-formed documents, the question is whether the data model needs to introduce a permissive docfragment node and keep the document node non-permissive. See http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Feb/0111.html (members only).

Issue-0076: PSVI to Type mapping supporting derived types

Date: 29-July-2002

Raised by: Marton Nagy

Effects: (Issue-0076, 1, The above definition is currently under discussion. It is very likely that a change will be made in a future draft to reflect a more precise definition covering derived types and the possible usage of generated type identifiers when no type names are available. See . )

3.6 Mapping PSV Infoset additions to Types

Description: The last bullet in the definition of the PSVI to Type mapping does not handle derived types properly. In those cases we should not bottom out at xs:anyType, but attempt to use the rules all over again with the type from which the current type is derived. Also note that these rules need to be slightly different if we decide to use generated type identifiers when no type names are available.
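
The fallback suggested above can be sketched as follows. This is a minimal illustration in Python, not the normative mapping: the `TypeDef` class, its `name` and `base` fields, and the helper `nearest_named_type` are hypothetical, and the sketch assumes each type definition records the type from which it is derived.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TypeDef:
    """Hypothetical stand-in for a PSVI type definition."""
    name: Optional[str]          # None models an anonymous type
    base: Optional["TypeDef"]    # the type this one is derived from, if any

def nearest_named_type(type_def: TypeDef) -> str:
    """Walk the derivation chain until a named type is found."""
    current: Optional[TypeDef] = type_def
    while current is not None:
        if current.name is not None:
            return current.name
        current = current.base
    # Only reached when no type in the chain has a name.
    return "xs:anyType"

# An anonymous type derived from the hypothetical named type my:price,
# which is itself derived from xs:decimal.
any_type = TypeDef("xs:anyType", None)
decimal = TypeDef("xs:decimal", any_type)
price = TypeDef("my:price", decimal)
anonymous = TypeDef(None, price)
```

Here `nearest_named_type(anonymous)` yields `my:price` rather than bottoming out at `xs:anyType`. If generated type identifiers were adopted instead, the loop would not be needed for anonymous types.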

Issue-0077: PSVI to Type mapping dependence on conformance levels

Date: 29-July-2002

Raised by: XPath Task Force

Effects: (Issue-0077, 1, See .)

3.6 Mapping PSV Infoset additions to Types

Description: The rules specifying the computation of the types from the PSVI do not yet reflect that this process depends on the conformance level of the processor. For Basic, are any type annotations preserved? The Query/XPath book says the data types are mapped to the nearest supertype; the Data Model needs to agree with this.

Issue-0079: String-value vs. string-value of the typed-value

Date: 31-July-2002

Raised by: Marton Nagy

Effects: (Issue-0079, 2, The string-value of a node and the result of casting the typed-value of a node to a string may give different results under this definition. The issue of whether to allow or possibly mandate that string-value return the same result as the string-value of the typed value is still under discussion. See . )

4.3.3 Accessors
4.4.3 Accessors

Description: Note that the current definition of dm:string-value($n) is such that it may differ from dm:string-value(dm:typed-value($n)) for some element or attribute nodes $n. For instance, given a node dm:element-node(my:a,(),(),dm:text-node("01"),xs:integer), its string-value gives "01", but the string-value of the typed-value is "1".
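
The discrepancy can be reproduced with a short sketch. It is written in Python purely for illustration. The two function names are hypothetical, and Python's `int`/`str` round trip is assumed to stand in for casting to xs:integer and back to a string.

```python
def string_value(text_content: str) -> str:
    # The string-value is the text content exactly as it appears.
    return text_content

def string_value_of_typed_value(text_content: str) -> str:
    # Assumption: the node is typed xs:integer, so the text is first
    # converted to an integer and then rendered back as a string.
    return str(int(text_content))

raw = "01"
print(string_value(raw))                 # prints 01
print(string_value_of_typed_value(raw))  # prints 1
```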

The possible resolutions are that we (a) prohibit, (b) allow or (c) mandate dm:string-value($n) to be dm:string-value(dm:typed-value($n)). The current text reflects (a). Option (b) would result in non-interoperable implementations. Option (c) is promising. In that case the element constructor would need to be a little smart and change the text node to contain the string value of dm:atomic-value-sequence applied to the string-value of the original text node and the type passed to the constructor. A similar change would need to be made to the attribute constructor.

Issue-0080: Typed value of Document, PI and Comment nodes

Date: 1-August-2002

Raised by: Mary Fernandez

Effects: (Issue-0080, 1, See .)

4.1.6 typed-value Accessor

Description: There is currently an asymmetry between the data model and the F&O documents. In the data model, the typed-value accessor on document, comment, and PI nodes returns the empty sequence. In the F&O document, the data() function applied to a document, namespace, comment or processing instruction node raises an error.

My intuition is that data() should be defined on all nodes, just as string-value() is defined. Raising an error on document, comment, and PI nodes seems draconian and has tripped me up in writing queries that iterate over a variety of nodes. If data() returns an error on such nodes, one ends up with a lot of code checking what kind of node is bound to a variable, etc.
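
The practical cost of the asymmetry can be sketched as follows. This is an illustration in Python under simplifying assumptions: nodes are modeled as hypothetical (kind, text) pairs, typed values are left as strings, and only the three node kinds named for the data model accessor are treated specially.

```python
EMPTY_KINDS = {"document", "comment", "processing-instruction"}

def dm_typed_value(kind: str, text: str) -> tuple:
    """Data model behavior as described: the empty sequence for these kinds."""
    return () if kind in EMPTY_KINDS else (text,)

def fo_data(kind: str, text: str) -> tuple:
    """F&O behavior as described: an error for these kinds."""
    if kind in EMPTY_KINDS:
        raise TypeError(f"data() is not defined on {kind} nodes")
    return (text,)

nodes = [("element", "42"), ("comment", "note"), ("attribute", "x")]

# With the accessor, a mixed sequence of nodes is processed uniformly.
values = [v for kind, text in nodes for v in dm_typed_value(kind, text)]

# With the erroring function, the caller must test the node kind first.
guarded = [v for kind, text in nodes
           if kind not in EMPTY_KINDS
           for v in fo_data(kind, text)]
```

Both lists end up with the same contents, but the second requires the node-kind check that the description objects to.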

Issue-0081: Schema-less documents with a DTD

Date: 10-Sep-2002

Raised by: Editors

Effects: (Issue-0081, 1, This implies accommodation for the case where both a DTD and a schema are applied. This will probably require some reconciliation of the [attribute type] property with type information from the PSVI. See issues , . )

3.3 XML Schemas and the XML Information Set

Description: If a document has a DTD, do we really have to lose the IDness of all ID attributes and other information that's available from DTD validation?

Issue-0082: Identifying element/attribute type

Date: 17-Aug-2002

Raised by: Jeni Tennison

Description: Note that currently the last bullet point cannot be reached from a legal PSVI because every element in a PSVI must have one of the combinations of properties listed. Elements whose type definition is anonymous still have a [type definition] property, it's just that the type definition's [name] is absent (the property exists, I think, but it has no value). Under the scheme above, such elements would have type whose namespace was the target namespace of the schema and whose name was nothing. http://lists.w3.org/Archives/Public/public-qt-comments/2002Aug/0018.html

Issue-0083: Distinction between {name} and [name]

Date: 17-Aug-2002

Raised by: Jeni Tennison

Description: I'm not sure what distinction you're making between [name] and {name}. As I understand it in the XML Schema spec, {name} is a property on a schema component while [name] is a property on an information item in the PSVI. Since you're talking about properties in the PSVI, I believe you should be using the notation [name] rather than {name}. http://lists.w3.org/Archives/Public/public-qt-comments/2002Aug/0018.html

Issue-0085: Globally declared namespaces in the infoset

Date: 2002-10-23

Raised by: Norman Walsh

Effects: (Issue-0085, 1, See .)

3.3 XML Schemas and the XML Information Set

Description: There is an open issue about how to map namespaces declared globally in the query prolog during the data model to infoset transformation.

This issue replaces an editorial note in the previous draft that said there was an issue.

Issue-0086: Nodes returned by dm:namespaces and dm:attributes

Date: 2002-10-17

Raised by: Jonathan Marsh

Effects: (Issue-0086, 1, The accessors namespaces and attributes return the same set of namespace and attribute nodes (respectively) that were supplied to the constructor, but they are not constrained to return them in the same order. See )

4.3.3 Accessors

Description: Does the constraint that the same set of nodes is returned restrict the possibilities for implementation? Should this constraint be relaxed?

For instance, an XML 1.0 store such as a DOM stores namespace information as attributes, and has no mechanism for undeclaring arbitrary namespaces. If a series of constructor functions are called to construct a data model instance that has fewer namespaces on children than on a parent element, the store will be unable to represent this, and might return extra namespace nodes. I claim (without proof :-) that these extra namespace nodes are harmless, and constraining implementations in this way is a burden.

Issue-0087: dm:base-uri should return ()?

Date: 2002-10-29

Raised by: Norman Walsh

Effects: (Issue-0087, 2, If the accessor is called on a node that does not have a base-uri property, or whose base-uri property is empty, the base-uri of that node's parent is returned. If the node has no parent, an error is raised. See )

4.1.1 base-uri Accessor
4.5.3 Accessors

Description: Calling dm:base-uri on a node that has no (transitive) base URI property raises an error. Would it be better to return ()?

See http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Oct/att-0275/01-2002-10-16-pres.html.
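
The two behaviors under discussion can be sketched as follows. This is a minimal Python illustration, not the accessor's definition: the `Node` class and both function names are hypothetical, and the empty sequence () is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    """Hypothetical node carrying only the properties this sketch needs."""
    base_uri: Optional[str] = None
    parent: Optional["Node"] = None

def base_uri_strict(node: Node) -> str:
    """Current text: inherit from the parent, raise an error if none is found."""
    current: Optional[Node] = node
    while current is not None:
        if current.base_uri:      # None and "" both count as "no base URI"
            return current.base_uri
        current = current.parent
    raise ValueError("node has no (transitive) base URI")

def base_uri_lenient(node: Node) -> Optional[str]:
    """Alternative raised by this issue: return () instead of an error."""
    try:
        return base_uri_strict(node)
    except ValueError:
        return None

root = Node(base_uri="http://www.example.com/catalog.xml")
child = Node(parent=root)
orphan = Node()
```

Here `child` inherits the base URI of `root` under either definition. Only `orphan` distinguishes the two.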

Issue-0088: Content type is not preserved

Date: 2002-10-30

Raised by: Norman Walsh

Effects: (Issue-0088, 1, A boolean that is true if and only if the entire Text Node consists of white space and the parent of the Text Node exists and is an element and the element content type of the element is not mixed. )

4.8.5 Data Model to Infoset Mapping

Description: Calculating the value of [element content whitespace] requires knowing the element's content type which doesn't seem to be available.
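
The rule quoted in the Effects entry can be sketched as follows. This is a hedged Python illustration: the `Element` and `TextNode` classes are hypothetical, and the `content_type` field is precisely the information that, according to this issue, the data model does not appear to make available.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Element:
    # Hypothetical property; assumed values include "element-only" and "mixed".
    content_type: str

@dataclass
class TextNode:
    content: str
    parent: Optional[Element] = None

def element_content_whitespace(text: TextNode) -> bool:
    """True iff the text is all white space, the parent is an element,
    and that element's content type is not mixed."""
    return (
        text.content != ""
        and text.content.strip(" \t\r\n") == ""
        and isinstance(text.parent, Element)
        and text.parent.content_type != "mixed"
    )
```

Without some way to obtain `content_type`, the third condition cannot be evaluated, which is the gap this issue records.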

Issue-0089: How is typed-value calculated?

Date: 2002-11-01

Raised by: Norman Walsh

Effects: (Issue-0089, 1, )

3.5 Typed Value and String Value

Description: I don't think the spec is clear enough on how typed-value is calculated. I think the following proposal would work, but perhaps I'm wrong or perhaps we've already got another proposal somewhere else.

If the element has a [schema normalized value] property:
- () if the property is "absent"
- Otherwise the result of casting the [schema normalized value] to the appropriate type (the element type or its content type).
Otherwise ().
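
The proposal above can be sketched as follows. This is a minimal Python illustration rather than a definitive algorithm: the dictionary of properties, the `ABSENT` marker, and the `cast` parameter are assumptions introduced for the example, and the empty sequence () is modeled as an empty tuple.

```python
from typing import Callable

ABSENT = object()   # hypothetical marker for a property whose value is "absent"

def typed_value(properties: dict, cast: Callable[[str], object]) -> tuple:
    """Compute the typed value of an element from its infoset properties.

    `cast` stands in for casting to the element type or its content type.
    """
    if "schema normalized value" not in properties:
        return ()                 # the element has no such property
    value = properties["schema normalized value"]
    if value is ABSENT:
        return ()                 # the property is present but "absent"
    return (cast(value),)

with_value = typed_value({"schema normalized value": "42"}, int)
absent = typed_value({"schema normalized value": ABSENT}, int)
missing = typed_value({}, int)
```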

Issue-0090: Documents can be empty

Date: 2002-11-04

Raised by: Michael Kay

Effects: (Issue-0090, 1, The sequence of nodes passed as $children must not be empty and must consist only of element, processing instruction, comment, and text nodes. See .)

4.2.2 Constructor

Description: I notice that there's a requirement that a document node has at least one child. This seems to have been in previous drafts, but I overlooked it. In XSLT it's legal to create a document node with no children.

http://lists.w3.org/Archives/Member/w3c-xsl-wg/2002Nov/0005.html
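
The constructor constraint and the relaxation implied by this issue can be sketched as follows. This is an illustration in Python. The function name, the `allow_empty` flag, and the use of strings for node kinds are all assumptions made for the example.

```python
ALLOWED_KINDS = {"element", "processing-instruction", "comment", "text"}

def valid_document_children(child_kinds, allow_empty=False):
    """Check the $children argument of the document constructor.

    allow_empty=False models the current text (at least one child required);
    allow_empty=True models the XSLT behavior described in this issue.
    """
    if not child_kinds and not allow_empty:
        return False
    return all(kind in ALLOWED_KINDS for kind in child_kinds)
```

For example, `valid_document_children([])` is rejected under the current text, while `valid_document_children([], allow_empty=True)` is accepted.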

Issue-0091: Support for substitution groups

Date: 2002-11-01

Raised by: Jeni Tennison

Effects: (Issue-0091, 1, See .)

4.1 Accessors

Description: I suggest that we add a dm:substitution-groups() accessor to the data model that returns a sequence of xs:QNames derived from the {substitution group affiliations} of the [element declaration] reported for the element and use this list to work out whether an element "has been validated and found to be a member of a substitution group whose head element has the required name" rather than guessing on the basis of the element's name (and type).

http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Nov/0007.html.

F Recently Closed Issues (Non-Normative)

There are 16 recently closed issues. These issues have been resolved since the last publication.

Issue-0028: Whitespace handling

Date: 04-May-2001

Raised by: Jonathan Marsh

Description: Whitespace handling needs to be more explicit. In the presence of a schema we have full knowledge of which whitespace is significant and which isn't, and can either mark whitespace as insignificant (and thus exclude it from text() and string-range() for instance), or automatically suppress whitespace in the data model. The former is appropriate given the dual representation of text nodes and values, the latter is appropriate if we only expose values.

Resolution: Closed. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Oct/att-0275/01-2002-10-16-pres.html.

Issue-0030: Base URI is a property of element nodes

Date: 04-May-2001

Raised by: Jonathan Marsh

Description: With external entities, and now with XML Base, the base URI can be scoped to various parts of the document. A base URI property should be added to Element Nodes, and the constructor and infoset mapping updated. Otherwise relative URIs in content cannot be correctly resolved.

Processing instructions also require an infoset-derived base URI. The base URI of attributes, for instance, should probably not be the empty sequence, if that does not adequately imply that the base URI of the element should be used instead.

Resolution: Closed. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Oct/att-0275/01-2002-10-16-pres.html.

Issue-0032: Keys and key references not represented

Date: 17-May-2001

Raised by: Query

Description: Note that the data model does not currently represent key values and key reference values as described in XML Schema Part 1 : Structures [XMLSchema Part 1]. In a future draft of this document, keys and key references will be represented in the data model.

Resolution: Not in V1. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Oct/att-0275/01-2002-10-16-pres.html.

Issue-0034: Interaction of insignificant whitespace with comments

Date: 8-May-2001

Raised by: Michael Kay

Description: http://lists.w3.org/Archives/Member/w3c-xsl-query/2001May/0053.html (members only). Clarify whether whitespace is classified as insignificant before or after PI and comment removal.

Resolution: NW: insignificant whitespace is orthogonal to PIs and comments. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Sep/0043.html

Issue-0035: Eliminate heterogeneous sequences

Date: 8-May-2001

Raised by: XSL WG (Michael Kay)

Description: http://lists.w3.org/Archives/Member/w3c-xsl-query/2001May/0054.html (members only), http://lists.w3.org/Archives/Member/w3c-xsl-query/2001May/0048.html (members only). Simplify operations such as distinct() by disallowing sequences mixing nodes and simple typed values. Suggests converting nodes in such a heterogeneous sequence to their typed values.

Resolution: Heterogeneous sequences are here to stay.

Issue-0039: Parent of namespace nodes

Date: 13-August-2001

Raised by: Datamodel Editors

Description: In XPath 1.0 namespace nodes have a parent. Should we adopt the XPath 1.0 behavior, the current behavior (no parent), or some other parent (e.g. the document)?

Resolution: Closed by Mike Kay's namespace proposal

Issue-0051: Document order of shared namespace nodes

Date: 28-August-2001

Raised by: Michael Kay

Description: Section 3.2 states that in the document ordering, the namespace nodes of an element follow the element but precede its attributes. This is inconsistent with the idea, suggested but not spelled out in 4.4, that a namespace node can be shared by several elements. In fact, the question of namespace node identity is not really tackled. My view is that namespace node identity should be determined by the combination of (document identity, namespace prefix, namespace URI), that the parent of a namespace node should be the document node, and that namespace nodes should be ordered after every other node in the document. (This is easier for implementations than placing them at the start of the document, because the number of namespace nodes is not known until parsing is complete).

Resolution: Closed by Mike Kay's namespace proposal

Issue-0055: Effect of xsi:nil

Date: 6-September-2001

Raised by: XSL WG

Description: The XSL WG wishes xsi:nil="true" to result in a typed-value of the empty sequence. This allows the differentiation of a null string value and an empty string.

Resolution: This is a duplicate of the resolved XPath Issue 0021: Handling of xsi:nil on Input. See the review of the XPath issues (members only). Accordingly closed this issue, updated the document to reflect the decision and added the related data model Issue-0071.

Issue-0057: Support for XSLT whitespace stripping

Date: 25-September-2001

Raised by: Michael Kay

Description: XSLT allows a stylesheet to designate elements whose whitespace is to be stripped. We need to support this in the data model, or possibly elsewhere.

Similarly, a data model instance for a stylesheet has provisions for stripping comments and processing instructions.

Resolution: NW: this functionality does not need to be supported in the data model. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Sep/0071.html

Issue-0060: Sharing namespace nodes

Date: 25-September-2001

Raised by: Michael Kay

Description: We need to say something about the identity of namespace nodes, and about the fact that two namespace nodes for the same prefix and uri may need to be combined when an element is added to a document. Also "the element constructor logically creates a copy of all of its namespace..." - namespace nodes do not need to be copied(?)

Resolution: Closed by Mike Kay's namespace proposal

Issue-0061: No access to prefix on free-floating attributes

Date: 25-September-2001

Raised by: Michael Kay

Description: An observation: if an attribute node has no parent element (a floating attribute), then there is also no access to any namespace nodes. This makes it impossible to support the XPath 1.0 name() function, which returns a lexical QName for the node by finding a prefix that maps to the node's namespace URI.

[JM: Sounds like we need a QName object that is a triple, local-name, namespace-uri, and prefix, to support XSLT.]

Resolution: Closed by Mike Kay's namespace proposal

Issue-0062: Namespace fixups required

Date: 16-October-2001

Raised by: Michael Kay

Description: The current XSLT draft includes a substantial piece of text on namespace fixup. This was introduced in the XSLT 1.1 working draft, and is basically designed to ensure that when an element or attribute is added to a result tree, namespace nodes are also added to declare any namespace URI used by the element or attribute. (At XSLT 1.0, this was handled at serialization time, but this had to change when temporary trees became accessible to the stylesheet). The description of namespace fixup logically belongs with the description of the node construction process in the data model document. http://lists.w3.org/Archives/Member/w3c-xsl-query/2001Oct/0036.html (members only).

Resolution: Closed by Mike Kay's namespace proposal

Issue-0070: Should the name accessor return "" or ()?

Date: 5-December-2001

Raised by: XPath Task Force

Description: Should the name accessor return "" or () for nodes that have no name?

Resolution: Returns ().

Issue-0075: Support for unparsed entities

Date: 26-June-2002

Raised by: Michael Kay

Description: XSL WG has a requirement to access unparsed entities in a document. This is needed to support the XSLT 1.0 unparsed-entity-uri() function, and the unparsed-entity-public-id() function which we are adding for XSLT 2.0. In XSLT 1.0, unparsed entities were not described in the XPath data model, but in an XSLT addition to the data model. This solution is unsatisfactory: the data model should describe all the information that is available. The information is available in the InfoSet so there is no architectural problem in providing it. There is room for debate about how it is best provided (we don't really want another node type if we can help it), but the information should be there. We are not proposing, at this stage, that the functions unparsed-entity-uri() and unparsed-entity-public-id() be moved from XSLT into XPath, though that can easily be done if people want them. The information about unparsed entities in the data model would therefore not be available to XQuery users or to XPath users outside an XSLT environment.

Resolution: Closed: added data model accessors. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Oct/att-0275/01-2002-10-16-pres.html

Issue-0078: Data model does not represent CDATA sections

Date: 31-July-2002

Raised by: Michael Kay

Description: It is not clear how the current data model can represent or accommodate CDATA sections. (This is also captured as Issue 0293 on the XPath issue list.)

Resolution: Subsumed by query issue #293. http://lists.w3.org/Archives/Member/w3c-xsl-query/2002Oct/att-0275/01-2002-10-16-pres.html.

Issue-0084: How are unparsed entities represented?

Date: 26-Sep-2002

Raised by: Norman Walsh

Description: Suppose I have an attribute that has the type xs:ENTITY and the value "foo". How can I find out the public and system identifiers of the external unparsed entity "foo"?

Resolution: Duplicate of [Issue-0075: Support for unparsed entities] (closed).

xs	http://www.w3.org/2001/XMLSchema
xsi	http://www.w3.org/2001/XMLSchema-instance
cat	http://www.example.com/catalog
xlink	http://www.w3.org/1999/xlink
html	http://www.w3.org/1999/xhtml

XQuery 1.0 and XPath 2.0 Data Model

W3C Working Draft 15 November 2002

Abstract

Status of this Document

Table of Contents

Appendices

1 Introduction

2 Notation and Pseudo-code Syntax

3 Concepts

3.1 Node Identity

3.2 Document Order

3.3 XML Schemas and the XML Information Set

3.4 Types

3.5 Typed Value and String Value

3.6 Mapping PSV Infoset additions to Types

3.7 Comments, Processing Instructions, and Whitespace

4 Nodes

4.1 Accessors

4.1.1 dm:base-uri Accessor

4.1.2 node-kind Accessor

4.1.3 node-name Accessor

4.1.4 parent Accessor

4.1.5 string-value Accessor

4.1.6 typed-value Accessor

4.1.7 type Accessor

4.1.8 children Accessor

4.1.9 attributes Accessor

4.1.10 namespaces Accessor

4.2 Documents

4.2.1 Overview

4.2.2 Constructor

4.2.3 Accessors

4.2.4 PSVI to Datamodel Mapping

4.2.5 Data Model to Infoset Mapping

4.3 Elements

4.3.1 Overview

4.3.2 Constructor

4.3.3 Accessors

4.3.4 PSVI to Data Model Mapping

4.3.5 Data Model to Infoset Mapping

4.4 Attributes

4.4.1 Overview

4.4.2 Constructor

4.4.3 Accessors

4.4.4 PSVI to Data Model Mapping

4.4.5 Data Model to Infoset Mapping

4.5 Namespaces

4.5.1 Overview

4.5.2 Constructor

4.5.3 Accessors

4.5.4 PSVI to Data Model Mapping

4.5.5 Data Model to Infoset Mapping

4.6 Processing Instructions

4.6.1 Overview

4.6.2 Constructor

4.6.3 Accessors

4.6.4 PSVI to Data Model Mapping

4.6.5 Data Model to Infoset Mapping

4.7 Comments

4.7.1 Overview

4.7.2 Constructor

4.7.3 Accessors

4.7.4 PSVI to Data Model Mapping

4.7.5 Data Model to Infoset Mapping

4.8 Text

4.8.1 Overview

4.8.2 Constructor

4.8.3 Accessors

4.8.4 PSVI to Data Model Mapping

4.8.5 Data Model to Infoset Mapping

5 Atomic Values

6 Sequences

A XML Information Set Conformance

B References

C References (Non-Normative)

D Example (Non-Normative)

E Open Issues (Non-Normative)

F Recently Closed Issues (Non-Normative)
