W3C

XQuery 1.0 and XPath 2.0 Data Model (XDM)

W3C Candidate Recommendation 8 June 2006

This version:
Latest version:
http://www.w3.org/TR/xpath-datamodel/
Previous versions:
Editors:
Mary Fernández (XML Query WG), AT&T Labs <mff@research.att.com>
Ashok Malhotra (XML Query and XSL WGs), Oracle Corporation <ashok.malhotra@alum.mit.edu>
Jonathan Marsh (XSL WG), Microsoft <jmarsh@microsoft.com>
Marton Nagy (XML Query WG), Science Applications International Corporation (SAIC) <marton.nagy@saic.com>
Norman Walsh (XSL WG), Sun Microsystems <Norman.Walsh@Sun.COM>

This document is also available in these non-normative formats: XML and Recent revisions.


Abstract

This document defines the W3C XQuery 1.0 and XPath 2.0 Data Model (XDM), which is the data model of [XPath 2.0], [XSLT 2.0], and [XQuery], and any other specifications that reference it. This data model is based on the [XPath 1.0] data model and earlier work on an [XML Query Data Model]. This document is the result of joint work by the [XSL Working Group] and the [XML Query Working Group].

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

On 3 November 2005, this specification was published as a Candidate Recommendation, and a Call for Implementations was announced. This revision is published in order to give visibility to the technical decisions that have been made so far during this phase of the process and to allow review by W3C Members and other interested parties. The maturity level of the specification remains unchanged, and the work is on track to move forward to the Proposed Recommendation stage when the exit criteria for the current phase have been met. Publication as a Candidate Recommendation does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress. This specification will remain a Candidate Recommendation until at least 28 February 2006.

This document has been jointly produced by the XML Query Working Group and the XSL Working Group, both of which are part of the XML Activity.

This draft includes corrections and changes based on public comments recorded in the W3C public Bugzilla repository (http://www.w3.org/Bugs/Public/) used for Last Call issues tracking. A list of substantive changes since the Last Call Working Draft of 04 April 2005 can be found in G Change Log.

Comments on this document should be made in W3C's public Bugzilla system (instructions can be found at http://www.w3.org/XML/2005/04/qt-bugzilla). If access to that system is not feasible, you may send your comments to the W3C XSLT/XPath/XQuery mailing list, public-qt-comments@w3.org. It will be very helpful if you include the string [DM] in the subject line of your comment, whether made in Bugzilla or in email. Each Bugzilla entry and email message should contain only one comment. Archives of the comments and responses are available at http://lists.w3.org/Archives/Public/public-qt-comments/.

The Working Groups solicit user and implementor feedback, especially on the whitespace collapsing semantics imposed by 6.7.3 Construction from an Infoset and 6.7.4 Construction from a PSVI, which are incompatible with current common practice.

The XML Query and XPath Test Suite is under development. Implementors are encouraged to run this test suite and report their results. A preliminary XQuery Test Suite Result Summary has been prepared that contains information submitted for several implementations.

The XML Query and XSL WGs plan to submit this specification for consideration as a W3C Proposed Recommendation as soon as both the XQuery 1.0 specification and the XSLT 2.0 specification have been submitted for consideration as a W3C Proposed Recommendation.

This document was produced by groups operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the XML Query Working Group and also maintains a public list of any patent disclosures made in connection with the deliverables of the XSL Working Group; those pages also include instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) with respect to this specification should disclose the information in accordance with section 6 of the W3C Patent Policy.

Table of Contents

1 Introduction
2 Concepts
    2.1 Terminology
    2.2 Notation
    2.3 Node Identity
    2.4 Document Order
    2.5 Sequences
    2.6 Types
        2.6.1 Representation of Types
        2.6.2 Predefined Types
        2.6.3 Type Hierarchy
        2.6.4 Atomic Values
        2.6.5 String Values
3 Data Model Construction
    3.1 Direct Construction
    3.2 Construction from an Infoset
    3.3 Construction from a PSVI
        3.3.1 Mapping PSVI Additions to Node Properties
            3.3.1.1 Element and Attribute Node Type Names
            3.3.1.2 Typed Value Determination
            3.3.1.3 Relationship Between Typed-Value and String-Value
            3.3.1.4 Pattern Facets
        3.3.2 Dates and Times
        3.3.3 QNames and NOTATIONS
4 Infoset Mapping
5 Accessors
    5.1 attributes Accessor
    5.2 base-uri Accessor
    5.3 children Accessor
    5.4 document-uri Accessor
    5.5 is-id Accessor
    5.6 is-idrefs Accessor
    5.7 namespace-bindings Accessor
    5.8 namespace-nodes Accessor
    5.9 nilled Accessor
    5.10 node-kind Accessor
    5.11 node-name Accessor
    5.12 parent Accessor
    5.13 string-value Accessor
    5.14 type-name Accessor
    5.15 typed-value Accessor
    5.16 unparsed-entity-public-id Accessor
    5.17 unparsed-entity-system-id Accessor
6 Nodes
    6.1 Document Nodes
        6.1.1 Overview
        6.1.2 Accessors
        6.1.3 Construction from an Infoset
        6.1.4 Construction from a PSVI
        6.1.5 Infoset Mapping
    6.2 Element Nodes
        6.2.1 Overview
        6.2.2 Accessors
        6.2.3 Construction from an Infoset
        6.2.4 Construction from a PSVI
        6.2.5 Infoset Mapping
    6.3 Attribute Nodes
        6.3.1 Overview
        6.3.2 Accessors
        6.3.3 Construction from an Infoset
        6.3.4 Construction from a PSVI
        6.3.5 Infoset Mapping
    6.4 Namespace Nodes
        6.4.1 Overview
        6.4.2 Accessors
        6.4.3 Construction from an Infoset
        6.4.4 Construction from a PSVI
        6.4.5 Infoset Mapping
    6.5 Processing Instruction Nodes
        6.5.1 Overview
        6.5.2 Accessors
        6.5.3 Construction from an Infoset
        6.5.4 Construction from a PSVI
        6.5.5 Infoset Mapping
    6.6 Comment Nodes
        6.6.1 Overview
        6.6.2 Accessors
        6.6.3 Construction from an Infoset
        6.6.4 Construction from a PSVI
        6.6.5 Infoset Mapping
    6.7 Text Nodes
        6.7.1 Overview
        6.7.2 Accessors
        6.7.3 Construction from an Infoset
        6.7.4 Construction from a PSVI
        6.7.5 Infoset Mapping
7 Conformance

Appendices

A XML Information Set Conformance
B References
    B.1 Normative References
    B.2 Other References
C Schema for the Extended XS Namespace
D Glossary (Non-Normative)
E Example (Non-Normative)
F Implementation-Defined and Implementation-Dependent Items (Non-Normative)
    F.1 Implementation-Defined Items
    F.2 Implementation-Dependent Items
G Change Log (Non-Normative)
    G.1 Changes Since 3 November 2005
    G.2 Changes Since 15 September 2005
    G.3 Changes Since 4 April 2005
H Accessor Summary (Non-normative)
I Infoset Construction Summary (Non-normative)
J PSVI Construction Summary (Non-normative)
K Infoset Mapping Summary (Non-normative)


1 Introduction

This document defines the XQuery 1.0 and XPath 2.0 Data Model, which is the data model of [XPath 2.0], [XSLT 2.0] and [XQuery].

The XQuery 1.0 and XPath 2.0 Data Model (henceforth "data model") serves two purposes. First, it defines the information contained in the input to an XSLT or XQuery processor. Second, it defines all permissible values of expressions in the XSLT, XQuery, and XPath languages. A language is closed with respect to a data model if the value of every expression in the language is guaranteed to be in the data model. XSLT 2.0, XQuery 1.0, and XPath 2.0 are all closed with respect to the data model.

The data model is based on the [Infoset] (henceforth "Infoset"), but it requires the following new features to meet the [XPath 2.0 Requirements] and [XML Query Requirements]:

As with the Infoset, the XQuery 1.0 and XPath 2.0 Data Model specifies what information in the documents is accessible, but it does not specify the programming-language interfaces or bindings used to represent or access the data.

The data model can represent various values including not only the input and the output of a stylesheet or query, but all values of expressions used during the intermediate calculations. Examples include the input document or document repository (represented as a Document Node or a sequence of Document Nodes), the result of a path expression (represented as a sequence of nodes), the result of an arithmetic or a logical expression (represented as an atomic value), a sequence expression resulting in a sequence of items, etc.

This document provides a precise definition of the properties of nodes in the XQuery 1.0 and XPath 2.0 Data Model, how they are accessed, and how they relate to values in the Infoset and PSVI.

2 Concepts

This section outlines a number of general concepts that apply throughout this specification.

2.1 Terminology

For a full glossary of terms, see D Glossary.

In this specification the words must, must not, should, should not, may and recommended are to be interpreted as described in [RFC 2119].

This specification distinguishes between the data model as a general concept and specific items (documents, elements, atomic values, etc.) that are concrete examples of the data model by identifying all concrete examples as instances of the data model.

[Definition: Every instance of the data model is a sequence.]

[Definition: A sequence is an ordered collection of zero or more items.] A sequence cannot be a member of a sequence. A single item appearing on its own is modeled as a sequence containing one item. Sequences are defined in 2.5 Sequences.
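
As an illustration only, the flattening rule can be sketched in Python. The representation of a sequence as a plain list and the helper name make_sequence are assumptions of this sketch, not part of the data model.

```python
from typing import Any, List


def make_sequence(*parts: Any) -> List[Any]:
    """Build a flat sequence; nested sequences are spliced in, never nested."""
    result: List[Any] = []
    for part in parts:
        if isinstance(part, list):
            # A sequence cannot be a member of a sequence, so splice its items.
            result.extend(make_sequence(*part))
        else:
            # A single item on its own behaves like a sequence of length one.
            result.append(part)
    return result
```

Under these assumptions, make_sequence(5) and make_sequence([5]) yield the same one-item sequence, and an empty sequence contributes nothing when combined with others.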

[Definition: An item is either a node or an atomic value.]

Every node is one of the seven kinds of nodes defined in 6 Nodes. Nodes form a tree that consists of a root node plus all the nodes that are reachable directly or indirectly from the root node via the dm:children, dm:attributes, and dm:namespace-nodes accessors. Every node belongs to exactly one tree, and every tree has exactly one root node.

[Definition: A tree whose root node is a Document Node is referred to as a document.]

[Definition: A tree whose root node is not a Document Node is referred to as a fragment.]

[Definition: An atomic value is a value in the value space of an atomic type and is labeled with the name of that atomic type.]

[Definition: An atomic type is a primitive simple type or a type derived by restriction from another atomic type.] (Types derived by list or union are not atomic.)

[Definition: There are 23 primitive simple types: the 19 defined in Section 3.2 Primitive datatypesXS2 of [Schema Part 2] and xs:untypedAtomic, xs:anyAtomicType, xs:dayTimeDuration, and xs:yearMonthDuration], defined in 2.6 Types.

A type is represented in the data model by an expanded-QName.

[Definition: An expanded-QName is a set of three values consisting of a possibly empty prefix, a possibly empty namespace URI, and a local name.] See 3.3.3 QNames and NOTATIONS.

[Definition: Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementor for each particular implementation.]

[Definition: Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementor for any particular implementation.]

Within this specification, the term URI refers to a Uniform Resource Identifier as defined in [RFC 3986] and extended in [RFC 3987] with the new name IRI. The term URI has been retained in preference to IRI to avoid introducing new names for concepts such as “Base URI” that are defined or referenced across the whole family of XML specifications.

In all cases where this specification leaves the behavior implementation-defined or implementation-dependent, the implementation has the option of providing mechanisms that allow the user to influence the behavior.

2.3 Node Identity

Each node has a unique identity. Every node in an instance of the data model is unique: identical to itself, and not identical to any other node. (Atomic values do not have identity; every instance of the value “5” as an integer is identical to every other instance of the value “5” as an integer.)

Note:

The concept of node identity should not be confused with the concept of a unique ID, which is a unique name assigned to an element by the author to represent references using ID/IDREF correlation.
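
The distinction between node identity and atomic value equality can be sketched in Python as follows. The Node class and the (type name, value) tuple used for atomic values are hypothetical stand-ins chosen for this sketch; Python object identity plays the role of node identity.

```python
class Node:
    """Hypothetical node; default object identity stands in for node identity."""

    def __init__(self, name: str) -> None:
        self.name = name


a = Node("para")
b = Node("para")

same_node = a is a            # a node is identical to itself
distinct_nodes = a is not b   # same content, but two different nodes

# Atomic values have no identity: only the type name and the value matter.
five_1 = ("xs:integer", 5)
five_2 = ("xs:integer", 5)
atomic_equal = five_1 == five_2
```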

2.4 Document Order

[Definition: A document order is defined among all the nodes accessible during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order in which nodes appear in the XML serialization of a document.] [Definition: Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.]

Within a tree, document order satisfies the following constraints:

  1. The root node is the first node.

  2. Every node occurs before all of its children and descendants.

  3. Namespace Nodes immediately follow the Element Node with which they are associated. The relative order of Namespace Nodes is stable but implementation-dependent.

  4. Attribute Nodes immediately follow the Namespace Nodes of the element with which they are associated. If there are no Namespace Nodes associated with a given element, then the Attribute Nodes associated with that element immediately follow the element. The relative order of Attribute Nodes is stable but implementation-dependent.

  5. The relative order of siblings is the order in which they occur in the children property of their parent node.

  6. Children and descendants occur before following siblings.

The relative order of nodes in distinct trees is stable but implementation-dependent, subject to the following constraint: If any node in a given tree, T1, occurs before any node in a different tree, T2, then all nodes in T1 are before all nodes in T2.
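
A minimal sketch of the within-tree constraints, assuming a hypothetical Node class with namespaces, attributes, and children lists. The relative order of Namespace Nodes and of Attribute Nodes is implementation-dependent; this sketch simply uses list order.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass(eq=False)  # eq=False keeps identity-based comparison for nodes
class Node:
    kind: str
    name: str = ""
    namespaces: List["Node"] = field(default_factory=list)
    attributes: List["Node"] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def document_order(node: Node) -> Iterator[Node]:
    """Yield the nodes of a tree in document order."""
    yield node                       # a node precedes its descendants
    yield from node.namespaces       # namespace nodes follow their element
    yield from node.attributes       # attribute nodes follow the namespace nodes
    for child in node.children:      # siblings in the order of the children property
        yield from document_order(child)  # descendants precede following siblings


# Example tree: a document containing <a id="..."><b>t</b><c/></a>
doc = Node("document", children=[
    Node("element", "a",
         namespaces=[Node("namespace", "xml")],
         attributes=[Node("attribute", "id")],
         children=[
             Node("element", "b", children=[Node("text", "t")]),
             Node("element", "c"),
         ]),
])
order = [n.name for n in document_order(doc)]
```

Ordering nodes from distinct trees would additionally require a stable, implementation-chosen order among the trees themselves, which this sketch does not model.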

2.6 Types

The data model supports strongly typed languages such as [XPath 2.0] and [XQuery] that have a type system based on [Schema Part 1]. The type system is formally defined in [Formal Semantics].

Every item in the data model has both a value and a type. In addition to nodes, the data model can represent atomic values like the number 5 or the string “Hello World.” For each of these atomic values, the data model contains both the value of the item (such as 5 or “Hello World”) and its type name (such as xs:integer or xs:string).

2.6.1 Representation of Types

The data model uses expanded-QNames to represent the names of schema types, which include the built-in types defined by [Schema Part 2], five additional types defined by this specification, and may include other user- or implementation-defined types.

For XML Schema types, the namespace name of the expanded-QName is the {target namespace} property of the type definition, and its local name is the {name} property of the type definition.

The data model relies on the fact that an expanded-QName uniquely identifies every named type. Although it is possible for different schemas to define different types with the same expanded-QName, at most one of them can be used in any given validation episode. The data model cannot support environments where different types with the same expanded-QName are available.

For anonymous types, the processor must construct an anonymous type name that is distinct from the name of every named type and the name of every other anonymous type. [Definition: An anonymous type name is an implementation-dependent, unique type name provided by the processor for every anonymous type declared in the schemas available.] Anonymous type names must be globally unique across all anonymous types that are accessible to the processor. In the formalism of this specification, the anonymous type names are assumed to be xs:QNames, but in practice implementations are not required to use xs:QNames to represent the implementation-dependent names of anonymous types.

The scope over which the names of anonymous types must be meaningful and distinct depends on the processing context. It is the responsibility of the host language to define the size and scope of the processing context.

The data model does not represent element or attribute declaration schema components, but it supports various type-related operations. The semantics of other operations, for example, checking if a particular instance of an Element Node has a given schema type, are defined in [Formal Semantics].

2.6.2 Predefined Types

In addition to the 19 types defined in Section 3.2 Primitive datatypesXS2 of [Schema Part 2], the data model defines five additional types: xs:anyAtomicType, xs:untyped, xs:untypedAtomic, xs:dayTimeDuration, and xs:yearMonthDuration. These types are defined in the XML Schema namespace with permission of the XML Schema Working Group, which is expected to add them to some future version of XML Schema.

xs:untyped

The datatype xs:untyped denotes the dynamic type of an element node that has not been validated, or has been validated in skip mode. No predefined types are derived from xs:untyped.

xs:untypedAtomic

The datatype xs:untypedAtomic denotes untyped atomic data, such as text that has not been assigned a more specific type. An attribute that has been validated in skip mode is represented in the Data Model by an attribute node with the type xs:untypedAtomic. No predefined types are derived from xs:untypedAtomic.

xs:anyAtomicType

The datatype xs:anyAtomicType is an atomic type that includes all atomic values (and no values that are not atomic). Its base type is xs:anySimpleType, from which all simple types, including atomic, list, and union types, are derived. All primitive atomic types, such as xs:decimal and xs:string, have xs:anyAtomicType as their base type.

xs:dayTimeDuration

The type xs:dayTimeDuration is derived from xs:duration by restricting its lexical representation to contain only the days, hours, minutes and seconds components. The value space of xs:dayTimeDuration is the set of fractional second values. The components of xs:dayTimeDuration correspond to the day, hour, minute and second components defined in Section 5.5.3.2 of [ISO 8601]. The type xs:dayTimeDuration is derived from xs:duration as follows:

<xs:simpleType name='dayTimeDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[^YM]*[DT].*"/>
  </xs:restriction>
</xs:simpleType>

xs:yearMonthDuration

The type xs:yearMonthDuration is derived from xs:duration by restricting its lexical representation to contain only the year and month components. The value space of xs:yearMonthDuration is the set of xs:integer month values. The year and month components of xs:yearMonthDuration correspond to the Gregorian year and month components defined in section 5.5.3.2 of [ISO 8601], respectively.

The type xs:yearMonthDuration is derived from xs:duration as follows:

<xs:simpleType name='yearMonthDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[^DT]*"/>
  </xs:restriction>
</xs:simpleType>
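
As a rough illustration, the two pattern facets can be exercised in Python. XML Schema patterns are implicitly anchored to the whole lexical form, so the sketch uses re.fullmatch. It checks only the pattern facet and assumes the input is already a valid xs:duration lexical form; the function names are chosen for this sketch.

```python
import re

# Pattern facets copied from the xs:dayTimeDuration and
# xs:yearMonthDuration schema fragments.
DAY_TIME = re.compile(r"[^YM]*[DT].*")
YEAR_MONTH = re.compile(r"[^DT]*")


def matches_day_time(lexical: str) -> bool:
    """True if a valid xs:duration lexical form also fits xs:dayTimeDuration."""
    return DAY_TIME.fullmatch(lexical) is not None


def matches_year_month(lexical: str) -> bool:
    """True if a valid xs:duration lexical form also fits xs:yearMonthDuration."""
    return YEAR_MONTH.fullmatch(lexical) is not None
```

For example, "P1DT2H" and "PT5M" satisfy the first pattern, "P1Y2M" satisfies the second, and a mixed duration such as "P1Y2MT3H" satisfies neither.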

A schema for these types is provided in C Schema for the Extended XS Namespace.

2.6.3 Type Hierarchy

The diagram below shows how the nodes, primitive simple types, and user defined types fit together into a hierarchy.

The xs:IDREFS, xs:NMTOKENS, xs:ENTITIES and user-defined list and union types are special types in that these types are lists or unions rather than true subtypes.

[Figure: type hierarchy graphic]

2.6.4 Atomic Values

An atomic value can be constructed from a lexical representation. Given a string and an atomic type, the atomic value is constructed in such a way as to be consistent with schema validation. If the string does not represent a valid value of the type, an error is raised. When xs:untypedAtomic is specified as the type, no validation takes place. The details of the construction are described in Section 5 Constructor FunctionsFO and the related Section 17 CastingFO of [Functions and Operators].
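
The following Python sketch illustrates the idea for two types only. The function name construct_atomic and the (type name, value) tuple representation are assumptions of this sketch; the normative rules are those of [Functions and Operators].

```python
import re
from typing import Any, Tuple

_XML_WHITESPACE = " \t\r\n"


def construct_atomic(lexical: str, type_name: str) -> Tuple[str, Any]:
    """Return a (type name, value) pair, or raise ValueError if invalid."""
    if type_name == "xs:untypedAtomic":
        # No validation takes place for xs:untypedAtomic.
        return (type_name, lexical)
    # Both types handled below collapse leading and trailing whitespace.
    text = lexical.strip(_XML_WHITESPACE)
    if type_name == "xs:integer":
        if re.fullmatch(r"[+-]?[0-9]+", text) is None:
            raise ValueError(f"{lexical!r} is not a valid xs:integer")
        return (type_name, int(text))
    if type_name == "xs:boolean":
        if text in ("true", "1"):
            return (type_name, True)
        if text in ("false", "0"):
            return (type_name, False)
        raise ValueError(f"{lexical!r} is not a valid xs:boolean")
    raise NotImplementedError(f"type {type_name} is not covered by this sketch")
```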

3 Data Model Construction

This section describes the constraints on instances of the data model.

The data model supports well-formed XML documents conforming to [Namespaces in XML] or [Namespaces in XML 1.1]. Documents that are not well-formed are, by definition, not XML. XML documents that do not conform to [Namespaces in XML] or [Namespaces in XML 1.1] are not supported (nor are they supported by [Infoset]).

In other words, the data model supports the following classes of XML documents:

This document describes how to construct an instance of the data model from an infoset ([Infoset]) or a Post Schema Validation Infoset (PSVI), the augmented infoset produced by an XML Schema validation episode.

An instance of the data model can also be constructed directly through application APIs, or from non-XML sources such as relational tables in a database. Regardless of how an instance of the data model is constructed, every node and atomic value in the data model must have a typed-value that is consistent with its type.

The data model supports some kinds of values that are not supported by [Infoset]. Examples of these are document fragments and sequences of Document Nodes. The data model also supports values that are not nodes. Examples of these are sequences of atomic values, or sequences mixing nodes and atomic values. These are necessary to be able to represent the results of intermediate expressions in the data model during expression processing.

3.3 Construction from a PSVI

An instance of the data model can be constructed from a PSVI, whose element and attribute information items have been strictly assessed, laxly assessed, or have not been assessed. Constructing an instance of the data model from a PSVI must be consistent with the description provided in this section and with the description provided for each node kind.

Data model construction requires that the PSVI provide unique names for all anonymous schema types.

Note:

[Schema Part 1] does not require all schema processors to provide unique names for anonymous schema types. In order to build an instance of the data model from a PSVI produced by a processor that does not provide the names, some post-processing will be required in order to assure that they are all uniquely identified before construction begins.

[Definition: An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.]

The data model supports incompletely validated documents. Elements and attributes that are not valid are treated as having unknown types.

The most significant difference between Infoset construction and PSVI construction occurs in the area of schema type assignment. Other differences can also arise from schema processing: default attribute and element values may be provided, white space normalization of element content may occur, and the user-supplied lexical form of elements and attributes with atomic schema types may be lost.

3.3.1 Mapping PSVI Additions to Node Properties

A PSVI element or attribute information item may have a [validity] property. The [validity] property may be "valid", "invalid", or "notKnown" and reflects the outcome of schema-validity assessment. In the data model, precise schema type information is exposed for Element and Attribute Nodes that are "valid". Nodes that are not "valid" are treated as if they were simply well-formed XML and only very general schema type information is associated with them.

3.3.1.1 Element and Attribute Node Type Names

The precise definition of the schema type of an element or attribute information item depends on the properties of the PSVI. In the PSVI, [Schema Part 1] only guarantees the existence of either the [type definition] property, or the [type definition namespace], [type definition name] and [type definition anonymous] properties. If the type definition refers to a union type, there are further properties defined that refer to the type definition which actually validated the item's normalized value. These properties are not used to determine the schema type of the node but they may be used to determine the typed value of the node, as described in 3.3.1.2 Typed Value Determination.

The type depends on the [validity] and [validation attempted] properties in the PSVI. If:

  • The [validity] and [validation attempted] properties exist and have the values "valid" and "full", respectively, the schema type of an element or attribute information item is represented by an expanded-QName whose namespace and local name correspond to the first applicable items in the following list:

    • If the [type definition] property exists:

      • If the {name} property is not absent, the {target namespace} and {name} properties of the [type definition] property;

      • Otherwise, the namespace and local name of the appropriate anonymous type name.

    • If [type definition anonymous] exists:

      • If it is false: the [type definition namespace] and the [type definition name] properties;

      • Otherwise, the namespace and local name of the appropriate anonymous type name.

  • The [validity] property exists and is "invalid", or the [validation attempted] property exists and is "partial", the schema type of an element is xs:anyType and the type of an attribute is xs:anySimpleType.

  • The [validity] property exists and is "notKnown", and the [validation attempted] property exists and is "none", the schema type of an element is xs:untyped and the type of an attribute is xs:untypedAtomic.

  • The [validity] or [validation attempted] properties do not exist, the schema type of an element is xs:untyped and the type of an attribute is xs:untypedAtomic.

The prefix associated with the type names is implementation-dependent.
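
The case analysis above can be sketched in Python. The function name, the use of None for an absent PSVI property, and the psvi_type_name parameter (the already resolved named or anonymous type name) are assumptions of this sketch. Combinations that the list does not mention raise an error here rather than guess.

```python
from typing import Optional


def node_type_name(
    kind: str,                            # "element" or "attribute"
    validity: Optional[str],              # None models an absent property
    validation_attempted: Optional[str],  # None models an absent property
    psvi_type_name: Optional[str] = None,
) -> str:
    is_element = kind == "element"
    if validity == "valid" and validation_attempted == "full":
        if psvi_type_name is None:
            raise ValueError("a valid item needs a resolved type name")
        return psvi_type_name
    if validity == "invalid" or validation_attempted == "partial":
        return "xs:anyType" if is_element else "xs:anySimpleType"
    if validity == "notKnown" and validation_attempted == "none":
        return "xs:untyped" if is_element else "xs:untypedAtomic"
    if validity is None or validation_attempted is None:
        return "xs:untyped" if is_element else "xs:untypedAtomic"
    # Not covered by the list above; this sketch refuses to guess.
    raise ValueError("combination not covered by the case analysis")
```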

3.3.1.2 Typed Value Determination

This section describes how the typed value of an Element or Attribute Node is computed from an element or attribute PSVI information item, where the information item has either a simple type or a complex type with simple content. For other kinds of Element Nodes, see 6.2.4 Construction from a PSVI; for other kinds of Attribute Nodes, see 6.3.4 Construction from a PSVI.

The typed value of Attribute Nodes and some Element Nodes is a sequence of atomic values. The types of the items in the typed value of a node may differ from the type of the node itself. This section describes how the typed value of a node is derived from the properties of an information item in a PSVI.

The types of the items in the typed value of a node are determined by a recursive process called typed value determination. This process begins with T, the schema type of the node itself, as represented in the PSVI. The type T has a variety, which is either atomic, union, or list. The typed value determination process is defined as follows:

  • If the nilled property of the node in question is true, then the typed value is the empty sequence.

  • If T is xs:anySimpleType, the typed value is the [schema normalized value] as an instance of xs:untypedAtomic.

  • If the {variety} of T is atomic, the typed value is an instance of T derived from the [schema normalized value] in a way consistent with XML Schema validation.

  • If the {variety} of T is union, then the type of the typed value is determined by the type definition that actually validated the content of the node, as follows:

    • If [member type definition] exists: If the {name} property exists, the {target namespace} and {name} properties of the [member type definition]; otherwise, the appropriate anonymous type name.

    • If [member type definition anonymous] exists: If it is false, the [member type definition namespace] and [member type definition name] properties; otherwise, the appropriate anonymous type name.

    The resulting type is substituted for T, and the typed value determination process is invoked recursively.

  • If the {variety} of T is list, the [schema normalized value] of the node is considered to be a space-separated list of lexical forms, each of which has its own type. For each of these lexical forms, the type of the corresponding item is found in {item type definition}. This type is then substituted for T, and the typed value determination process is invoked recursively for each member of the list.

The typed value determination process is guaranteed to result in a sequence of atomic values, each having a well-defined atomic type. This sequence of atomic values, in turn, determines the typed-value property of the node in the data model.
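
A simplified Python sketch of this recursion follows. The SimpleType class, its parse callback, and the member_type argument (standing in for the member type reported by the PSVI) are hypothetical. Lists whose item type is a union are outside the sketch, since the per-item member type is not modeled.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

Atomic = Tuple[str, Any]  # (type name, value)


@dataclass
class SimpleType:
    name: str
    variety: str                                  # "atomic", "union", or "list"
    parse: Optional[Callable[[str], Any]] = None  # used for atomic types
    item_type: Optional["SimpleType"] = None      # used for list types


def typed_value(t: SimpleType, normalized: str, nilled: bool = False,
                member_type: Optional[SimpleType] = None) -> List[Atomic]:
    if nilled:
        return []                                 # nilled: empty sequence
    if t.name == "xs:anySimpleType":
        return [("xs:untypedAtomic", normalized)]
    if t.variety == "atomic":
        return [(t.name, t.parse(normalized))]
    if t.variety == "union":
        if member_type is None:
            raise ValueError("union type needs the validating member type")
        return typed_value(member_type, normalized)   # substitute and recurse
    if t.variety == "list":
        result: List[Atomic] = []
        for lexical in normalized.split():        # space-separated lexical forms
            result.extend(typed_value(t.item_type, lexical))
        return result
    raise ValueError(f"unknown variety {t.variety!r}")


# Example types, defined only for this illustration.
xs_integer = SimpleType("xs:integer", "atomic", parse=int)
int_list = SimpleType("my:ints", "list", item_type=xs_integer)
int_or_string = SimpleType("my:intOrString", "union")
```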

3.3.1.3 Relationship Between Typed-Value and String-Value

Element and attribute nodes have both typed-value and string-value properties. However, implementations are allowed some flexibility in how these properties are stored. An implementation may choose to store the string-value only and derive the typed-value from it, or to store the typed-value only and derive the string-value from it, or to store both the string-value and the typed-value.

In order to permit these various implementation strategies, some variations in the string value of a node are defined as insignificant. Implementations that store only the typed value of a node are permitted to return a string value that is different from the original lexical form of the node content. For example, consider the following element:

Assuming that the node is valid, it has a typed value of 30 as an xs:integer. An implementation may return either "30" or "0030" as the string value of the node. Any string that is a valid lexical representation of the typed value is acceptable. In this specification, we express this rule by saying that the relationship between the string value of a node and its typed value must be "consistent with schema validation."
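
For the xs:integer case, the rule can be illustrated with a small Python check. The function name is chosen for this sketch, and only xs:integer is covered.

```python
import re

_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_consistent_integer_string(string_value: str, typed_value: int) -> bool:
    """True if string_value is a valid xs:integer lexical form of typed_value."""
    if _INTEGER_LEXICAL.fullmatch(string_value) is None:
        return False
    return int(string_value) == typed_value
```

Both "30" and "0030" pass this check against the typed value 30, while "31" does not.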

If an implementation stores only the string-value of a node, the following considerations apply:

3.3.2 Dates and Times

The date and time types require special attention. This section applies to implementations that store the typed value of xs:dateTime, xs:date, xs:time, xs:gYearMonth, xs:gYear, xs:gMonthDay, xs:gMonth, xs:gDay, and types that are derived from them. These are known collectively as the date/time types in this specification.

The values of the date/time types are represented in the data model using seven components:

year

An xs:integer.

month

An xs:integer between 1 and 12, inclusive.

day

An xs:integer between 1 and 31, inclusive, possibly restricted further depending on the values of month and year.

hour

An xs:integer between 0 and 23, inclusive.

minute

An xs:integer between 0 and 59, inclusive.

second

An xs:decimal greater than or equal to zero and less than 60. Leap seconds are not supported.

timezone

An xs:dayTimeDuration between -PT14H00M and PT14H00M, inclusive. All timezone values must be an integral number of minutes.

Components that are intrinsic to the datatype (for example, day, month, and year in an xs:date) are required; components that can never be part of a datatype (for example, years in an xs:time) must be missing. Missing components are represented by the empty sequence. When a component is present, it contains the “local value” that has not been normalized in any way. The timezone component is optional for all the date/time datatypes.

Thus, the lexical xs:dateTime representation “2003-01-02T11:30:00-05:00” is stored as “{2003, 1, 2, 11, 30, 0.0, -PT05H00M}”. The value of the lexical representation “2003-01-16T16:30:00” is stored as “{2003, 1, 16, 16, 30, 0.0, ()}” because it has no timezone. The value of the lexical xs:gDay representation “---30+10:30” is stored as “{(), (), 30, (), (), (), PT10H30M}”.

The lexical form “24:00:00” is normalized in the component model. As an xs:time, it is stored as “{(), (), (), 0, 0, 0.0, ()}”, and the xs:dateTime representation “1999-12-31T24:00:00” is stored as “{2000, 1, 1, 0, 0, 0.0, ()}”.
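The component model for xs:dateTime can be sketched in Python as follows. This is a non-normative illustration under stated assumptions: the timezone is modeled as a signed number of minutes rather than an xs:dayTimeDuration, the empty sequence is modeled as `None`, range checks on the individual components are omitted, and the "24:00:00" roll-over relies on `datetime.date`, which only supports years 1 through 9999. The function name `datetime_components` is hypothetical.

```python
import re
from datetime import date, timedelta
from decimal import Decimal

_DATETIME = re.compile(
    r"(-?[0-9]{4,})-([0-9]{2})-([0-9]{2})"
    r"T([0-9]{2}):([0-9]{2}):([0-9]{2}(?:\.[0-9]+)?)"
    r"(Z|[+-][0-9]{2}:[0-9]{2})?"
)

def datetime_components(lexical: str):
    """Return (year, month, day, hour, minute, second, timezone)."""
    m = _DATETIME.fullmatch(lexical)
    if m is None:
        raise ValueError(f"not an xs:dateTime lexical form: {lexical!r}")
    year, month, day, hour, minute = (int(m.group(i)) for i in range(1, 6))
    second = Decimal(m.group(6))

    tz = m.group(7)
    if tz is None:
        timezone = None          # missing component: the empty sequence
    elif tz == "Z":
        timezone = 0
    else:
        sign = -1 if tz[0] == "-" else 1
        timezone = sign * (int(tz[1:3]) * 60 + int(tz[4:6]))

    if hour == 24:
        # Only 24:00:00 is a legal use of hour 24; it denotes the
        # first instant of the following day.
        if minute != 0 or second != 0:
            raise ValueError("hour 24 requires 00 minutes and 00 seconds")
        rolled = date(year, month, day) + timedelta(days=1)
        year, month, day, hour = rolled.year, rolled.month, rolled.day, 0

    return (year, month, day, hour, minute, second, timezone)
```

The local values are kept as written: the timezone is recorded alongside the other components and is not applied to them.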

Note: Implementations are permitted to store date/time values in any representation that's convenient for them, provided that the individual properties can be accessed and modified.

3.3.3 QNames and NOTATIONS

The QName and NOTATION data types require special attention. The following sections apply to xs:QName, xs:NOTATION, and types derived from them. These types are referred to collectively as “qualified names”.

As defined in XML Schema, the lexical space for qualified names includes a local name and an optional namespace prefix. The value space for qualified names contains a local name and an optional namespace URI. Therefore, it is not possible to derive a lexical value from the typed value, or vice versa, without access to some context that defines the namespace bindings.

When qualified names exist as values of nodes in a well-formed document, it is always possible to determine such a namespace context. However, the data model also allows qualified names to exist as freestanding atomic values, or as the name or value of a parentless attribute node, and in these cases no namespace context is available.

In this Data Model, therefore, the value space for qualified names contains a local-name, an optional namespace URI, and an optional prefix. The prefix is used only when producing a lexical representation of the value, that is, when casting the value to a string. The prefix plays no part in other operations involving qualified names: in particular, two qualified names are equal if their local names and namespace URIs match, regardless of whether they have the same prefix.
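A minimal, non-normative Python sketch of this value space is given below. The class name `QName` and its field names are illustrative assumptions; the point is only that the prefix is carried along for serialization but excluded from comparison.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class QName:
    local_name: str
    namespace_uri: Optional[str] = None
    # compare=False removes the prefix from both equality and hashing.
    prefix: Optional[str] = field(default=None, compare=False)

    def lexical(self) -> str:
        """Lexical representation, as produced when casting to a string."""
        if self.prefix:
            return f"{self.prefix}:{self.local_name}"
        return self.local_name
```

Two values built with the same local name and namespace URI but different prefixes compare equal, while their lexical representations differ.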

The following consistency constraints apply:

4 Infoset Mapping

This specification describes how to map each kind of node to the corresponding information item. This mapping produces an Infoset; it does not and cannot produce a PSVI. Validation must be used to obtain a PSVI for a (portion of a) data model instance.

An Infoset can also be constructed by serializing an instance of the data model and parsing it. Serialization is governed by [Serialization].

5 Accessors

A set of accessors is defined on nodes in the data model. For consistency, all the accessors are defined on every kind of node, although several accessors return a constant empty sequence on some kinds of nodes.

In order for processors to be able to operate on instances of the data model, the model must expose the properties of the items it contains. The data model does this by defining a family of accessor functions. These are not functions in the literal sense; they are not available for users or applications to call directly. Rather they are descriptions of the information that an implementation of the data model must expose to applications. Functions and operators available to end-users are described in [Functions and Operators].

Some typed values in the data model are undefined. Attempting to access an undefined property is always an error. Behavior in these cases is implementation-defined and the host language is responsible for determining the result.

5.8 namespace-nodes Accessor

The dm:namespace-nodes accessor returns the dynamic, in-scope namespaces associated with a node as a sequence containing zero or more Namespace Nodes. The order of Namespace Nodes is stable but implementation-dependent.

It is defined on all seven node kinds.

Note: this accessor and the namespace-bindings accessor provide two views of the same information. Implementations that do not need to expose Namespace Nodes might choose not to implement this accessor.

5.9 nilled Accessor

The dm:nilled accessor returns true if the node is "nilled". [Schema Part 1] introduced the nilled mechanism to signal that an element should be accepted as valid when it has no content even when it has a content type which does not require or even necessarily allow empty content.

It is defined on all seven node kinds.

6 Nodes

[Definition: There are seven kinds of Nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.] Each kind of node is described in the following sections.

Note:

A host language based on the Data Model may specify that its usage of the Data Model does not include namespace nodes. Namespace nodes are used only in the "namespaces" property of an element node, which records the bindings of namespace prefixes to namespace URIs. These bindings may be represented either by means of namespace nodes or by using an alternative, implementation-dependent representation.

All nodes must satisfy the following general constraints:

  1. Every node must have a unique identity, distinct from all other nodes.

  2. The children property of a node must not contain two consecutive Text Nodes.

  3. The children property of a node must not contain any empty Text Nodes.

  4. The children and attributes properties of a node must not contain two nodes with the same identity.
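The general constraints above can be checked mechanically. The following non-normative Python sketch assumes a deliberately small node representation (the `Node` class and the `violations` function are hypothetical) and models node identity with Python object identity, which satisfies the first constraint by construction.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(eq=False)  # eq=False keeps identity-based comparison
class Node:
    kind: str                      # e.g. "element", "text", "comment"
    content: str = ""              # used by text nodes
    children: List["Node"] = field(default_factory=list)
    attributes: List["Node"] = field(default_factory=list)

def violations(node: Node) -> List[str]:
    """Report which of constraints 2 to 4 the given node violates."""
    problems = []
    kids = node.children
    # Constraint 2: no two consecutive Text Nodes among the children.
    if any(a.kind == "text" and b.kind == "text" for a, b in zip(kids, kids[1:])):
        problems.append("consecutive text nodes")
    # Constraint 3: no empty Text Nodes among the children.
    if any(k.kind == "text" and k.content == "" for k in kids):
        problems.append("empty text node")
    # Constraint 4: no node appears twice among children and attributes.
    identities = [id(n) for n in kids + node.attributes]
    if len(identities) != len(set(identities)):
        problems.append("duplicate node identity")
    return problems
```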

6.1 Document Nodes

6.1.1 Overview

Document Nodes encapsulate XML documents. Documents have the following properties:

Document Nodes must satisfy the following constraints.

  1. If the children property is not empty, it must consist exclusively of Element, Processing Instruction, Comment, and Text Nodes. Attribute, Namespace, and Document Nodes can never appear as children.

  2. If a node N is among the children of a Document Node D, then the parent of N must be D.

  3. If a node N has a parent Document Node D, then N must be among the children of D.

In the [Infoset], a document information item must have at least one child, its children must consist exclusively of element information items, processing instruction information items and comment information items, and exactly one of the children must be an element information item. This data model is more permissive: a Document Node may be empty, it may have more than one Element Node as a child, and it also permits Text Nodes as children.

Implementations that support DTD processing and access to the unparsed entity accessors use the unparsed-entities property to associate information about an unordered collection of unparsed entities with a Document Node. This property is accessed indirectly through the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id functions.

6.1.3 Construction from an Infoset

The document information item is required. A Document Node is constructed for each document information item.

The following infoset properties are required: [children] and [base URI].

The following infoset properties are optional: [unparsed entities].

Document Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, and comment found in the [children] property, a corresponding Element, Processing Instruction, or Comment Node is constructed and that sequence of nodes is used as the value of the children property.

If present among the [children], the document type declaration information item is ignored.

unparsed-entities

If the [unparsed entities] property is present and is not the empty set, the values of the unparsed entity information items must be used to support the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id accessors.

The internal structure of the values of the unparsed-entities property is implementation-defined.

string-value

The concatenation of the string-values of all its Text Node descendants in document order. If the document has no such descendants, the zero-length string.

typed-value

The dm:string-value of the node as an xs:untypedAtomic value.

document-uri

The document-uri property holds the absolute URI for the resource from which the document node was constructed, if one is available and can be made absolute. For example, if a collection of documents is returned by the fn:collection function, the document-uri property may serve to distinguish between them even though each has the same base-uri property.

If the document-uri is not the empty sequence, then the following constraint must hold: evaluating fn:doc() with the document-uri as its argument must return the document node that provided the value of the document-uri property.

In other words, for any Document Node $arg, either fn:document-uri($arg) must return the empty sequence or fn:doc(fn:document-uri($arg)) must return $arg.
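The document-uri constraint can be read as a round-trip property. The non-normative Python sketch below assumes a hypothetical `DocumentStore` that stands in for whatever mechanism a host language uses to resolve fn:doc(); `None` models the empty sequence, and none of these names are defined by this specification.

```python
from typing import Dict, Optional

class DocumentNode:
    def __init__(self, document_uri: Optional[str] = None):
        self.document_uri = document_uri  # None models the empty sequence

class DocumentStore:
    """Hypothetical resolver: returns the same node for the same URI."""
    def __init__(self) -> None:
        self._by_uri: Dict[str, DocumentNode] = {}

    def doc(self, uri: str) -> DocumentNode:
        if uri not in self._by_uri:
            self._by_uri[uri] = DocumentNode(document_uri=uri)
        return self._by_uri[uri]

def satisfies_document_uri_constraint(store: DocumentStore,
                                      node: DocumentNode) -> bool:
    """Either document-uri is empty, or doc(document-uri) is the node itself."""
    if node.document_uri is None:
        return True
    return store.doc(node.document_uri) is node
```

A node that merely carries the same URI as a stored document, without being the node that the store returns, does not satisfy the constraint.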

6.2 Element Nodes

6.2.1 Overview

Element Nodes encapsulate XML elements. Elements have the following properties:

Element Nodes must satisfy the following constraints.

  1. If the children property is not empty, it must consist exclusively of Element, Processing Instruction, Comment, and Text Nodes. Attribute, Namespace, and Document Nodes can never appear as children.

  2. The Attribute Nodes of an element must have distinct xs:QNames.

  3. If a node N is among the children of an element E, then the parent of N must be E.

  4. Exclusive of Attribute and Namespace Nodes, if a node N has a parent element E, then N must be among the children of E. (Attribute and Namespace Nodes have a parent, but they do not appear among the children of their parent.)

    The data model permits Element Nodes without parents (to represent partial results during expression processing, for example). Such Element Nodes must not appear among the children of any other node.

  5. If an Attribute Node A is among the attributes of an element E, then the parent of A must be E.

  6. If an Attribute Node A has a parent element E, then A must be among the attributes of E.

    The data model permits Attribute Nodes without parents. Such Attribute Nodes must not appear among the attributes of any Element Node.

  7. If a Namespace Node N is among the namespaces of an element E, then the parent of N must be E.

  8. If a Namespace Node N has a parent element E, then N must be among the namespaces of E.

    The data model permits Namespace Nodes without parents. Such Namespace Nodes must not appear among the namespaces of any Element Node. This constraint is irrelevant for implementations that do not support Namespace Nodes.

  9. If the dm:type-name of an Element Node is xs:untyped, then the dm:type-name of all its descendant elements must also be xs:untyped and the dm:type-name of all its Attribute Nodes must be xs:untypedAtomic.

  10. If the dm:type-name of an Element Node is xs:untyped, then the nilled property must be false.

  11. If the nilled property is true, then the children property must not contain Element Nodes or Text Nodes.

  12. For every expanded QName that appears in the dm:node-name of the element, the dm:node-name of any Attribute Node among the attributes of the element, or in any value of type xs:QName or xs:NOTATION (or any type derived from those types) that appears in the typed-value of the element or the typed-value of any of its attributes, if the expanded QName has a non-empty URI, then there must be a prefix binding for this URI among the namespaces of this Element Node.

    If any of the expanded QNames has an empty URI, then there must not be any binding among the namespaces of this Element Node which binds the empty prefix to a URI.

  13. Every element must include a Namespace Node and/or namespace binding for the prefix xml bound to the URI http://www.w3.org/XML/1998/namespace and there must be no other prefix bound to that URI.

  14. The string-value property of an Element Node must be the concatenation of the string-values of all its Text Node descendants in document order or, if the element has no such descendants, the zero-length string.
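The string-value rule in the last constraint can be illustrated with Python's standard `xml.etree.ElementTree` module. This is a non-normative sketch: ElementTree is not an implementation of this data model (it stores character data in `text` and `tail` attributes rather than in Text Nodes), but its `itertext()` method yields that character data in document order, so joining the results approximates the concatenation described above. The function name `string_value` is hypothetical.

```python
import xml.etree.ElementTree as ET

def string_value(element: ET.Element) -> str:
    """Concatenate all descendant character data in document order."""
    return "".join(element.itertext())

paragraph = ET.fromstring("<p>Hello, <b>XDM</b> world<empty/></p>")
print(string_value(paragraph))  # Hello, XDM world
```

An element with no character data among its descendants yields the zero-length string.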

6.2.3 Construction from an Infoset

The element information items are required. An Element Node is constructed for each element information item.

The following infoset properties are required: [namespace name], [local name], [children], [attributes], [in-scope namespaces], [base URI], and [parent].

Element Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

node-name

An xs:QName constructed from the [prefix], [local name], and [namespace name] properties.

parent

The node that corresponds to the value of the [parent] property or the empty sequence if there is no parent.

type-name

All Element Nodes constructed from an infoset have the type xs:untyped.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

Default and fixed attributes provided by the DTD are added to the [attributes] and are therefore included in the data model attributes of an element.

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property. Implementations that do not support Namespace Nodes may simply preserve the relevant bindings in this property.

Implementations may ignore namespace information items for namespaces which are not known to be used. A namespace is known to be used if:

Note: applications may rely on namespaces that are not known to be used, for example when QNames are used in content and that content does not have a type of xs:QName. Such applications may have difficulty processing data models where some namespaces have been ignored.

nilled

All Element Nodes constructed from an infoset have a nilled property of "false".

string-value

The string-value is constructed from the character information item [children] of the element and all its descendants. The precise rules for selecting significant character information items and constructing characters from them are described in 6.7.3 Construction from an Infoset of 6.7 Text Nodes.

This process is equivalent to concatenating the dm:string-values of all of the Text Node descendants of the resulting Element Node.

If the element has no such descendants, the string-value is the empty string.

typed-value

The string-value as an xs:untypedAtomic.

is-id

All Element Nodes constructed from an infoset have an is-id property of "false".

is-idrefs

All Element Nodes constructed from an infoset have an is-idrefs property of "false".

6.2.4 Construction from a PSVI

The following Element Node properties are affected by PSVI properties.

type-name

The type-name is determined as described in 3.3.1.1 Element and Attribute Node Type Names.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

For elements with schema simple types, or complex types with simple content, if the [schema normalized value] PSVI property exists, the processor may use a sequence of nodes containing the Processing Instruction and Comment Nodes corresponding to the processing instruction and comment information items found in the [children] property, plus an optional single Text Node whose string value is the [schema normalized value] for the children property. If the [schema normalized value] is the empty string, the Text Node must not be present, otherwise it must be present.

The relative order of Processing Instruction and Comment Nodes must be preserved, but the position of the Text Node, if it is present, among them is implementation-defined.

The effect of the above rules is that where a fixed or default value for an element is defined in the schema, and the element takes this default value, a text node will be created to contain the value, even though there are no character information items representing the value in the PSVI. The position of this text node relative to any comment or processing instruction children is implementation-dependent.

[Schema Part 1] also permits an element with mixed content to take a default or fixed value (which will always be a simple value), but at the time of this writing it is unclear how such a defaulted value is represented in the PSVI. Implementations therefore may represent such a default value by creating a text node, but are not required to do so.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

Default and fixed attributes provided by XML Schema processing are added to the [attributes] and are therefore included in the data model attributes of an element.

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property. Implementations that do not support Namespace Nodes may simply preserve the relevant bindings in this property.

Implementations may ignore namespace information items for namespaces which are not known to be used. A namespace is known to be used if:

  • It appears in the expanded QName of the node-name of the element.

  • It appears in the expanded QName of the node-name of any of the element's attributes.

  • It appears in the expanded QName of any values of type xs:QName that appear among the element's children or the typed values of its attributes.

Note: applications may rely on namespaces that are not known to be used, for example when QNames are used in content and that content does not have a type of xs:QName. Such applications may have difficulty processing data models where some namespaces have been ignored.

nilled

If the [validity] property exists on an information item and is "valid", and the [nil] property exists and is true, then the nilled property is "true". In all other cases, including all cases where schema validity assessment was not attempted or did not succeed, the nilled property is "false".

string-value

The string-value is calculated as follows:

  • If the element is empty: its string value is the zero length string.

  • If the element has a type of xs:untyped, a complex type with element-only content, or a complex type with mixed content: its string-value is the concatenation of the string-values of all its Text Node descendants in document order.

  • If the element has a simple type or a complex type with simple content: its string-value is the [schema normalized value] of the node.

If an implementation stores only the typed value of an element, it may use any valid lexical representation of the typed value for the string-value property.

typed-value

The typed-value is calculated as follows:

  • If the element is of type xs:untyped, its typed-value is its dm:string-value as an xs:untypedAtomic.

  • If the element has a complex type with empty content, its typed-value is the empty sequence.

  • If the element has a simple type or a complex type with simple content: its typed value is computed as described in 3.3.1.2 Typed Value Determination. The result is a sequence of zero or more atomic values. The relationship between the type-name, typed-value, and string-value of an element node is consistent with XML Schema validation.

    Note that in the case of dates and times, the timezone is preserved as described in 3.3.2 Dates and Times, and in the case of xs:QNames and xs:NOTATIONs, the prefix is preserved as described in 3.3.3 QNames and NOTATIONS.

  • If the element has a complex type with mixed content (including xs:anyType), its typed-value is its dm:string-value as an xs:untypedAtomic.

  • Otherwise, the element must be a complex type with element-only content. The typed-value of such an element is undefined. Attempting to access this property with the dm:typed-value accessor always raises an error.

is-id

If the element has a complex type with element-only content, the is-id property is false. Otherwise, if the typed-value of the element consists of exactly one atomic value and that value is of type xs:ID, or a type derived from xs:ID, the is-id property is true; otherwise it is false.

is-idrefs

If the element has a complex type with element-only content, the is-idrefs property is false. Otherwise, if any of the atomic values in the typed-value of the element is of type xs:IDREF or xs:IDREFS, or a type derived from one of those types, the is-idrefs property is true, otherwise it is false.

All other properties have values that are consistent with construction from an infoset.
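The typed-value rules given above for Element Nodes constructed from a PSVI amount to a dispatch on the kind of content. The following non-normative Python sketch assumes that the content kind has already been determined and that, for simple content, the atomic values have been computed as described in 3.3.1.2; the names `ContentKind`, `UntypedAtomic`, and `element_typed_value` are hypothetical.

```python
from enum import Enum, auto
from typing import List, Sequence

class ContentKind(Enum):
    UNTYPED = auto()        # element of type xs:untyped
    EMPTY = auto()          # complex type with empty content
    SIMPLE = auto()         # simple type, or complex type with simple content
    MIXED = auto()          # complex type with mixed content
    ELEMENT_ONLY = auto()   # complex type with element-only content

class UntypedAtomic(str):
    """Stand-in for an xs:untypedAtomic value."""

def element_typed_value(kind: ContentKind,
                        string_value: str,
                        simple_values: Sequence[object] = ()) -> List[object]:
    if kind in (ContentKind.UNTYPED, ContentKind.MIXED):
        return [UntypedAtomic(string_value)]
    if kind is ContentKind.EMPTY:
        return []                      # the empty sequence
    if kind is ContentKind.SIMPLE:
        return list(simple_values)     # zero or more atomic values
    # Element-only content: the typed-value is undefined.
    raise TypeError("typed-value is undefined for element-only content")
```

The exception type used for the element-only case is an arbitrary choice for this sketch; the error that a host language raises is outside the scope of the illustration.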

6.3 Attribute Nodes

6.3.4 Construction from a PSVI

The following Attribute Node properties are affected by PSVI properties.

string-value

If an implementation stores only the typed value of an attribute, it may use any valid lexical representation of the typed value for the string-value property.

type-name

The type-name is determined as described in 3.3.1.1 Element and Attribute Node Type Names.

typed-value

The typed-value is calculated as follows:

  • If the attribute is of type xs:untypedAtomic: its typed-value is its dm:string-value as an xs:untypedAtomic.

  • Otherwise, a sequence of zero or more atomic values as described in 3.3.1.2 Typed Value Determination. The relationship between the type-name, typed-value, and string-value of an attribute node is consistent with XML Schema validation.

is-id

If the attribute is named xml:id and its [attribute type] property does not have the value ID, then [xml:id] processing is performed. This ensures that the value has the type ID and that it is properly normalized. The is-id property is always true for attributes named xml:id.

If the type-name is xs:ID or a type derived from xs:ID, true, otherwise false.

is-idrefs

If any of the atomic values in the typed-value of the attribute is of type xs:IDREF or xs:IDREFS, or a type derived from one of those types, the is-idrefs property is true; otherwise it is false.

All other properties have values that are consistent with construction from an infoset.

Note: attributes from the XML Schema instance namespace, "http://www.w3.org/2001/XMLSchema-instance", (xsi:schemaLocation, xsi:type, etc.) appear as ordinary attributes in the data model.

6.4 Namespace Nodes

6.4.1 Overview

Each Namespace Node represents the binding of a namespace URI to a namespace prefix or to the default namespace. Implementations that do not use Namespace Nodes may represent the same information using the namespaces property of an element node. Namespaces have the following properties:

Namespace Nodes must satisfy the following constraints.

  1. If a Namespace Node N is among the namespaces of an element E, then the parent of N must be E.

  2. If a Namespace Node N has a parent element E, then N must be among the namespaces of E.

The data model permits Namespace Nodes without parents; see below.

In XPath 1.0, Namespace Nodes were directly accessible by applications, by means of the namespace axis. In XPath 2.0 the namespace axis is deprecated, and it is not available at all in XQuery 1.0. XPath 2.0 implementations are not required to expose the namespace axis, though they may do so if they wish to offer backwards compatibility.

The information held in namespace nodes is instead made available to applications using functions defined in [Functions and Operators]. Some properties of Namespace Nodes are not exposed by these functions: in particular, properties related to the identity of Namespace Nodes, their parentage, and their position in document order. Implementations that do not expose the namespace axis can therefore avoid the overhead of maintaining this information.

Implementations that expose the namespace axis must provide unique Namespace Nodes for each element. Each element has an associated set of Namespace Nodes, one for each distinct namespace prefix that is in scope for the element (including the xml prefix, which is implicitly declared by [Namespaces in XML]) and one for the default namespace if one is in scope for the element. The element is the parent of each of these Namespace Nodes; however, a Namespace Node is not a child of its parent element. In implementations that expose the namespace axis, elements never share namespace nodes.

Note:

In implementations that do not expose the namespace axis, there is no means by which the host language can tell if namespace nodes are shared or not and, in such circumstances, sharing namespace nodes may be a very reasonable implementation strategy.

6.5 Processing Instruction Nodes

6.6 Comment Nodes

6.7 Text Nodes

6.7.3 Construction from an Infoset

The character information items are required. A Text Node is constructed for each maximal sequence of character information items in document order.

The following infoset properties are required: [character code] and [parent].

The following infoset properties are optional: [element content whitespace].

A sequence of character information items is maximal if it satisfies the following constraints:

  1. All of the information items in the sequence have the same parent.

  2. The sequence consists of adjacent character information items uninterrupted by other types of information item.

  3. No other such sequence exists that contains any of the same character information items and is longer.
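The notion of a maximal sequence can be illustrated with a short, non-normative Python sketch. It assumes that the children of a single parent are given as a list of `(kind, payload)` pairs in document order, where character information items use the kind `"char"`; this representation and the function name `text_runs` are hypothetical.

```python
from itertools import groupby
from typing import List, Tuple

Item = Tuple[str, str]

def text_runs(children: List[Item]) -> List[Item]:
    """Merge each maximal run of adjacent character items into one text item."""
    result: List[Item] = []
    for is_char, group in groupby(children, key=lambda item: item[0] == "char"):
        if is_char:
            # One Text Node per maximal run of adjacent characters.
            result.append(("text", "".join(payload for _, payload in group)))
        else:
            # Other information items pass through unchanged.
            result.extend(group)
    return result
```

Because `groupby` only merges neighbouring items with the same key, a run of characters is broken by any intervening item of another kind, which matches the second constraint above.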

Text Node properties are derived from the infoset as follows:

content

A string comprised of characters that correspond to the [character code] properties of each of the character information items.

If the resulting Text Node consists entirely of whitespace and the [element content whitespace] property of each character information item used to construct this node is true, the content of the Text Node is the zero-length string. Text Nodes are only allowed to be empty if they have no parents; an empty Text Node will be discarded when its parent is constructed, if it has a parent.

The content of the Text Node is not necessarily normalized as described in the [Character Model]. It is the responsibility of data producers to provide normalized text, and the responsibility of applications to make sure that operations do not de-normalize text.

parent

The node corresponding to the value of the [parent] property.

7 Conformance

The data model is intended primarily as a component that can be used by other specifications. Therefore, the data model relies on specifications that use it (such as [XPath 2.0], [XSLT 2.0], and [XQuery]) to specify conformance criteria for the data model in their respective environments. Specifications that set conformance criteria for their use of the data model must not relax the constraints expressed in this specification.

Authors of conformance criteria for the use of the data model should pay particular attention to the following features of the data model:

  1. Support for the normative construction from an infoset described in 3.2 Construction from an Infoset.

  2. Support for the normative construction from a PSVI described in 3.3 Construction from a PSVI.

  3. Support for XML 1.0 and XML 1.1.

  4. How namespaces are supported, through nodes or through the alternative, implementation-dependent representation.

A XML Information Set Conformance

This specification conforms to the XML Information Set [Infoset]. The following information items must be exposed by the infoset producer to construct a data model unless they are explicitly identified as optional:

Other information items and properties made available by the Infoset processor are ignored. In addition to the properties above, the following PSVI properties are required on Element Information Items and Attribute Information Items if the data model is constructed from a PSVI:

B References

B.1 Normative References

Infoset
XML Information Set (Second Edition), John Cowan and Richard Tobin, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-infoset-20040204. The latest version is available at http://www.w3.org/TR/xml-infoset.
Namespaces in XML
Namespaces in XML, Tim Bray, Dave Hollander, and Andrew Layman, Editors. World Wide Web Consortium, 14 Jan 1999. This version is http://www.w3.org/TR/1999/REC-xml-names-19990114. The latest version is available at http://www.w3.org/TR/REC-xml-names.
Namespaces in XML 1.1
Namespaces in XML 1.1, Andrew Layman, Dave Hollander, Richard Tobin, and Tim Bray, Editors. World Wide Web Consortium, 04 Feb 2004. This version is http://www.w3.org/TR/2004/REC-xml-names11-20040204. The latest version is available at http://www.w3.org/TR/xml-names11/.
xml:id
xml:id Version 1.0, Jonathan Marsh, Daniel Veillard, and Norman Walsh, Editors. World Wide Web Consortium, 09 Nov 2004. This version is http://www.w3.org/TR/2004/WD-xml-id-20041109/. The latest version is available at http://www.w3.org/TR/xml-id/.
XPath 2.0
XML Path Language (XPath) 2.0, Don Chamberlin, Anders Berglund, Scott Boag, et al., Editors. World Wide Web Consortium, 3 Nov 2005. This version is http://www.w3.org/TR/2005/CR-xpath20-20051103/. The latest version is available at http://www.w3.org/TR/xpath20/.
Functions and Operators
XQuery 1.0 and XPath 2.0 Functions and Operators, Ashok Malhotra, Jim Melton, and Norman Walsh, Editors. World Wide Web Consortium, 3 Nov 2005. This version is http://www.w3.org/TR/2005/CR-xpath-functions-20051103/. The latest version is available at http://www.w3.org/TR/xpath-functions/.
Schema Part 1
XML Schema Part 1: Structures Second Edition, David Beech, Noah Mendelsohn, Murray Maloney, and Henry S. Thompson, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-1-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-1/.
Schema Part 2
XML Schema Part 2: Datatypes Second Edition, Paul V. Biron and Ashok Malhotra, Editors. World Wide Web Consortium, 28 Oct 2004. This version is http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/. The latest version is available at http://www.w3.org/TR/xmlschema-2/.
Serialization
XSLT 2.0 and XQuery 1.0 Serialization, Joanne Tong, Michael Kay, Norman Walsh, et al., Editors. World Wide Web Consortium, 3 Nov 2005. This version is http://www.w3.org/TR/2005/CR-xslt-xquery-serialization-20051103/. The latest version is available at http://www.w3.org/TR/xslt-xquery-serialization/.
Formal Semantics
XQuery 1.0 and XPath 2.0 Formal Semantics, Jérôme Siméon, Denise Draper, Peter Fankhauser, et al., Editors. World Wide Web Consortium, 3 Nov 2005. This version is http://www.w3.org/TR/2005/CR-xquery-semantics-20051103/. The latest version is available at http://www.w3.org/TR/xquery-semantics/.
RFC 2119
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner. Network Working Group, IETF, Mar 1997.
RFC 3986
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, and L. Masinter. Network Working Group, IETF, Jan 2005.
RFC 3987
Internationalized Resource Identifiers (IRIs), M. Duerst and M. Suignard. Network Working Group, IETF, Jan 2005.
Character Model
Character Model for the World Wide Web 1.0: Fundamentals, Misha Wolf, Tex Texin, Richard Ishida, et al., Editors. World Wide Web Consortium, 22 Nov 2004. This version is http://www.w3.org/TR/2004/PR-charmod-20041122/. The latest version is available at http://www.w3.org/TR/charmod/.

B.2 Other References

XML Query Data Model
XML Query Data Model, Mary Fernández and Jonathan Robie, Editors. World Wide Web Consortium, 15 Feb 2001.
XML Base
XML Base, Jonathan Marsh, Editor. World Wide Web Consortium, 27 Jun 2001. This version is http://www.w3.org/TR/2001/REC-xmlbase-20010627/. The latest version is available at http://www.w3.org/TR/xmlbase/.
XPath 1.0
XML Path Language (XPath) Version 1.0, James Clark and Steven DeRose, Editors. World Wide Web Consortium, 16 Nov 1999. This version is http://www.w3.org/TR/1999/REC-xpath-19991116. The latest version is available at http://www.w3.org/TR/xpath.
XPath 2.0 Requirements
XPath Requirements Version 2.0, Mary Fernández, K Karun, and Mark Scardina, Editors. World Wide Web Consortium, 3 Jun 2005. This version is http://www.w3.org/TR/2005/WD-xpath20req-20050603/. The latest version is available at http://www.w3.org/TR/xpath20req.
XSLT 2.0
XSL Transformations (XSLT) Version 2.0, Michael Kay, Editor. World Wide Web Consortium, 3 Nov 2005. This version is http://www.w3.org/TR/2005/CR-xslt20-20051103/. The latest version is available at http://www.w3.org/TR/xslt20/.
XML Query Working Group
XML Query Working Group, World Wide Web Consortium. Home page: http://www.w3.org/XML/Query
XSL Working Group
XSL Working Group, World Wide Web Consortium. Home page: http://www.w3.org/Style/XSL/
XQuery
XQuery 1.0: An XML Query Language, Don Chamberlin, Anders Berglund, Scott Boag, et al., Editors. World Wide Web Consortium, 3 Nov 2005. This version is http://www.w3.org/TR/2005/CR-xquery-20051103/. The latest version is available at http://www.w3.org/TR/xquery/.
XML Query Requirements
XML Query (XQuery) Requirements, Don Chamberlin, Peter Fankhauser, Massimo Marchiori, and Jonathan Robie, Editors. World Wide Web Consortium, 3 Jun 2005. This version is http://www.w3.org/TR/2005/WD-xquery-requirements-20050603/. The latest version is available at http://www.w3.org/TR/xquery-requirements/.
ISO 8601

C Schema for the Extended XS Namespace

The following schema defines the additional types in the xs: namespace identified by this document.

You can retrieve the normative schema for this namespace from http://www.w3.org/2006/xpath-datatypes.

<?xml version='1.0'?>

<!--
This is an XML Schema document for the XML Schema namespace,
http://www.w3.org/2001/XMLSchema, that has been extended to 
include definitions for the xs:dayTimeDuration and
xs:yearMonthDuration types.

The other xs: types defined in XDM are not described here because
xs:untyped and xs:anyAtomicType are special types that cannot be
properly defined using XML Schema itself and because xs:untypedAtomic
should not be used for validation, but only used for unvalidated
elements and attributes.
-->

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
	   targetNamespace="http://www.w3.org/2001/XMLSchema"
	   elementFormDefault="qualified"
	   xml:lang="en">

<xs:include schemaLocation="http://www.w3.org/2001/XMLSchema.xsd"/>

<xs:simpleType name='dayTimeDuration'>
  <xs:annotation>
    <xs:documentation
	source="http://www.w3.org/TR/xpath-datamodel#dayTimeDuration"/>
  </xs:annotation>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[^YM]*[DT].*"/>
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name='yearMonthDuration'>
  <xs:annotation>
    <xs:documentation
	source="http://www.w3.org/TR/xpath-datamodel#yearMonthDuration"/>
  </xs:annotation>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[^DT]*"/>
  </xs:restriction>
</xs:simpleType>

</xs:schema>
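
The two pattern facets in the schema above can be tried out with a short Python sketch. It is illustrative only: the function names are hypothetical, and the sketch checks nothing but the pattern facets, so it assumes the input is already a valid xs:duration lexical form. XML Schema patterns are implicitly anchored, which is why `re.fullmatch` is used.

```python
import re

# Pattern facets copied from the schema above.
DAY_TIME_PATTERN = re.compile(r"[^YM]*[DT].*")
YEAR_MONTH_PATTERN = re.compile(r"[^DT]*")


def matches_day_time(lexical: str) -> bool:
    """True if the string satisfies the xs:dayTimeDuration pattern facet."""
    return DAY_TIME_PATTERN.fullmatch(lexical) is not None


def matches_year_month(lexical: str) -> bool:
    """True if the string satisfies the xs:yearMonthDuration pattern facet."""
    return YEAR_MONTH_PATTERN.fullmatch(lexical) is not None
```

Under these patterns a duration such as P1Y2M3D, which mixes year-month and day-time components, satisfies neither facet.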

D Glossary (Non-Normative)

anonymous type name

An anonymous type name is an implementation dependent, unique type name provided by the processor for every anonymous type declared in the schemas available.

atomic type

An atomic type is a primitive simple type or a type derived by restriction from another atomic type.

atomic value

An atomic value is a value in the value space of an atomic type and is labeled with the name of that atomic type.

document

A tree whose root node is a Document Node is referred to as a document.

document order

A document order is defined among all the nodes accessible during a given query or transformation. Document order is a total ordering, although the relative order of some nodes is implementation-dependent. Informally, document order is the order in which nodes appear in the XML serialization of a document.
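
As a rough illustration of the informal description above, the following Python sketch walks a small tree in pre-order. The `Node` class and the `document_order` function are hypothetical names, not part of this specification. Placing namespace nodes and then attribute nodes between an element and its children is an assumption drawn from 2.4 Document Order; the relative order inside each of those groups is implementation-dependent, so the list order used here is just one stable choice.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical node record; only what the traversal needs."""
    kind: str
    label: str
    namespaces: List["Node"] = field(default_factory=list)
    attributes: List["Node"] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def document_order(node: Node) -> Iterator[Node]:
    """Yield the node, its namespace and attribute nodes, then its subtree."""
    yield node
    # Assumption: namespace nodes precede attribute nodes; the order within
    # each list is an arbitrary but stable implementation choice.
    yield from node.namespaces
    yield from node.attributes
    for child in node.children:
        yield from document_order(child)
```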

expanded-QName

An expanded-QName is a set of three values consisting of a possibly empty prefix, a possibly empty namespace URI, and a local name.
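
A minimal Python sketch of this triple follows. The class and function names are hypothetical, and empty strings stand in for an empty prefix or namespace URI. The comparison rule shown, which ignores the prefix, is an assumption based on how QName equality is defined elsewhere in this family of specifications rather than on this glossary entry.

```python
from typing import NamedTuple


class ExpandedQName(NamedTuple):
    """Hypothetical representation of the three values of an expanded-QName."""
    prefix: str          # possibly empty
    namespace_uri: str   # possibly empty
    local_name: str


def same_name(a: ExpandedQName, b: ExpandedQName) -> bool:
    """Compare namespace URI and local name only (assumed rule: prefix ignored)."""
    return (a.namespace_uri, a.local_name) == (b.namespace_uri, b.local_name)
```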

fragment

A tree whose root node is not a Document Node is referred to as a fragment.

implementation defined

Implementation-defined indicates an aspect that may differ between implementations, but must be specified by the implementor for each particular implementation.

implementation dependent

Implementation-dependent indicates an aspect that may differ between implementations, is not specified by this or any W3C specification, and is not required to be specified by the implementor for any particular implementation.

incompletely validated

An incompletely validated document is an XML document that has a corresponding schema but whose schema-validity assessment has resulted in one or more element or attribute information items being assigned values other than 'valid' for the [validity] property in the PSVI.

instance of the data model

Every instance of the data model is a sequence.

item

An item is either a node or an atomic value.

Node

There are seven kinds of Nodes in the data model: document, element, attribute, text, namespace, processing instruction, and comment.

primitive simple type

There are 24 primitive simple types: the 19 defined in Section 3.2 Primitive datatypes of [Schema Part 2] and xs:untyped, xs:untypedAtomic, xs:anyAtomicType, xs:dayTimeDuration, and xs:yearMonthDuration.

sequence

A sequence is an ordered collection of zero or more items.

stable

Document order is stable, which means that the relative order of two nodes will not change during the processing of a given query or transformation, even if this order is implementation-dependent.

E Example (Non-Normative)

The following XML document is used to illustrate the information contained in a data model:

The document is associated with the URI "http://www.example.com/catalog.xml", and is valid with respect to the following XML schema:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:cat="http://www.example.com/catalog"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           targetNamespace="http://www.example.com/catalog"
           elementFormDefault="qualified">

<xs:import namespace="http://www.w3.org/XML/1998/namespace"
           schemaLocation="http://www.w3.org/2001/xml.xsd" />

<xs:import namespace="http://www.w3.org/1999/xlink"
           schemaLocation="http://www.cs.rpi.edu/~puninj/XGMML/xlinks-2001.xsd" />

<xs:element name="catalog">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="cat:_item" maxOccurs="unbounded" />
    </xs:sequence>
    <xs:attribute name="version" type="xs:string" fixed="0.1" use="required" />
    <xs:attribute ref="xml:base" />
    <xs:attribute ref="xml:lang" />
  </xs:complexType>
</xs:element>

<xs:element name="_item" type="cat:itemType" abstract="true" />

<xs:complexType name="itemType">
  <xs:sequence>
    <xs:element name="title" type="xs:token" />
    <xs:element name="description" type="cat:description" nillable="true" />
    <xs:element name="price" type="cat:price" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute name="label" type="xs:token" />
  <xs:attribute name="code" type="xs:ID" use="required" />
  <xs:attributeGroup ref="xlink:simpleLink" />
</xs:complexType>

<xs:element name="tshirt" type="cat:tshirtType" substitutionGroup="cat:_item" />

<xs:complexType name="tshirtType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:attribute name="sizes" type="cat:clothesSizes" use="required" />
    </xs:extension>
  </xs:complexContent>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:simpleType name="clothesSizes">
  <xs:union memberTypes="cat:sizeList">
    <xs:simpleType>
      <xs:restriction base="xs:token">
        <xs:enumeration value="oneSize" />
      </xs:restriction>
    </xs:simpleType>
  </xs:union>
</xs:simpleType>

<xs:simpleType name="sizeList">
  <xs:restriction>
    <xs:simpleType>
      <xs:list itemType="cat:clothesSize" />
    </xs:simpleType>
    <xs:minLength value="1" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="clothesSize">
  <xs:union memberTypes="cat:numberedSize cat:categorySize" />
</xs:simpleType>

<xs:simpleType name="numberedSize">
  <xs:restriction base="xs:integer">
    <xs:enumeration value="4" />
    <xs:enumeration value="6" />
    <xs:enumeration value="8" />
    <xs:enumeration value="10" />
    <xs:enumeration value="12" />
    <xs:enumeration value="14" />
    <xs:enumeration value="16" />
    <xs:enumeration value="18" />
    <xs:enumeration value="20" />
    <xs:enumeration value="22" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="categorySize">
  <xs:restriction base="xs:token">
    <xs:enumeration value="XS" />
    <xs:enumeration value="S" />
    <xs:enumeration value="M" />
    <xs:enumeration value="L" />
    <xs:enumeration value="XL" />
    <xs:enumeration value="XXL" />
  </xs:restriction>
</xs:simpleType>

<xs:element name="album" type="cat:albumType" substitutionGroup="cat:_item" />

<xs:complexType name="albumType">
  <xs:complexContent>
    <xs:extension base="cat:itemType">
      <xs:sequence>
        <xs:element name="artist" type="xs:string" />
      </xs:sequence>
      <xs:attribute name="formats" type="cat:formatsType" use="required" />
    </xs:extension>
  </xs:complexContent>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:simpleType name="formatsType">
  <xs:list itemType="cat:formatType" />
</xs:simpleType>

<xs:simpleType name="formatType">
  <xs:restriction base="xs:token">
    <xs:enumeration value="CD" />
    <xs:enumeration value="MiniDisc" />
    <xs:enumeration value="tape" />
    <xs:enumeration value="vinyl" />
  </xs:restriction>
</xs:simpleType>

<xs:complexType name="description" mixed="true">
  <xs:sequence>
    <xs:any namespace="http://www.w3.org/1999/xhtml" processContents="lax"
            minOccurs="0" maxOccurs="unbounded" />
  </xs:sequence>
  <xs:attribute ref="xml:lang" />
</xs:complexType>

<xs:complexType name="price">
  <xs:simpleContent>
    <xs:extension base="cat:monetaryAmount">
      <xs:attribute name="currency" type="cat:currencyType" default="USD" />
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

<xs:simpleType name="currencyType">
  <xs:restriction base="xs:token">
    <xs:pattern value="[A-Z]{3}" />
  </xs:restriction>
</xs:simpleType>

<xs:simpleType name="monetaryAmount">
  <xs:restriction base="xs:decimal">
    <xs:fractionDigits value="3" />
    <xs:pattern value="\d+(\.\d{2,3})?" />
  </xs:restriction>
</xs:simpleType>

</xs:schema>

The schema is associated with the URI "http://www.example.com/dm-example.xsd".
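
The list and union types declared in the schema (cat:clothesSizes, cat:sizeList, cat:clothesSize) determine how a sizes attribute is split into atomic values. The Python sketch below is one hedged reading of those declarations, not a schema processor: the function names are hypothetical, union members are tried in declaration order, and the results are plain Python values without the type labels that the data model attaches.

```python
import re
from typing import List, Union

# Enumerations copied from cat:numberedSize and cat:categorySize.
NUMBERED_SIZES = {4, 6, 8, 10, 12, 14, 16, 18, 20, 22}
CATEGORY_SIZES = {"XS", "S", "M", "L", "XL", "XXL"}


def clothes_size(token: str) -> Union[int, str]:
    """cat:clothesSize: union of numberedSize, then categorySize."""
    if re.fullmatch(r"[+-]?[0-9]+", token) and int(token) in NUMBERED_SIZES:
        return int(token)
    if token in CATEGORY_SIZES:
        return token
    raise ValueError(f"not a cat:clothesSize: {token!r}")


def clothes_sizes(lexical: str) -> List[Union[int, str]]:
    """cat:clothesSizes: union of cat:sizeList, then the 'oneSize' token."""
    # Split on XML white space (space, tab, CR, LF), dropping empty pieces.
    tokens = [t for t in re.split(r"[ \t\r\n]+", lexical) if t]
    if tokens:  # cat:sizeList requires at least one item (minLength 1)
        try:
            return [clothes_size(t) for t in tokens]
        except ValueError:
            pass  # fall through to the second union member
    if tokens == ["oneSize"]:
        return ["oneSize"]
    raise ValueError(f"not a cat:clothesSizes: {lexical!r}")
```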

This example exposes the data model for a document that has an associated schema and has been validated successfully against it. In general, an XML Schema is not required, that is, the data model can represent a schemaless, well-formed XML document with the rules described in 2.6 Types.

The XML document is represented by the nodes described below. The value D1 represents a Document Node; the values E1, E2, etc. represent Element Nodes; the values A1, A2, etc. represent Attribute Nodes; the values N1, N2, etc. represent Namespace Nodes; the values P1, P2, etc. represent Processing Instruction Nodes; the values C1, C2, etc. represent Comment Nodes; the values T1, T2, etc. represent Text Nodes.

For brevity:

// Document node D1
dm:base-uri(D1) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(D1) = "document"
dm:string-value(D1) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  "
dm:typed-value(D1) = xs:untypedAtomic("  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  ")
dm:children(D1) = ([P1], [E1])
 
// Namespace node N1
dm:node-kind(N1) = "namespace"
dm:node-name(N1) = xs:QName("", "xml")
dm:string-value(N1) = "http://www.w3.org/XML/1998/namespace"
dm:typed-value(N1) = "http://www.w3.org/XML/1998/namespace"

// Namespace node N2
dm:node-kind(N2) = "namespace"
dm:node-name(N2) = ()
dm:string-value(N2) = "http://www.example.com/catalog"
dm:typed-value(N2) = "http://www.example.com/catalog"

// Namespace node N3
dm:node-kind(N3) = "namespace"
dm:node-name(N3) = xs:QName("", "html")
dm:string-value(N3) = "http://www.w3.org/1999/xhtml"
dm:typed-value(N3) = "http://www.w3.org/1999/xhtml"

// Namespace node N4
dm:node-kind(N4) = "namespace"
dm:node-name(N4) = xs:QName("", "xlink")
dm:string-value(N4) = "http://www.w3.org/1999/xlink"
dm:typed-value(N4) = "http://www.w3.org/1999/xlink"

// Namespace node N5
dm:node-kind(N5) = "namespace"
dm:node-name(N5) = xs:QName("", "xsi")
dm:string-value(N5) = "http://www.w3.org/2001/XMLSchema-instance"
dm:typed-value(N5) = "http://www.w3.org/2001/XMLSchema-instance"
 
// Processing Instruction node P1
dm:base-uri(P1) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(P1) = "processing-instruction"
dm:node-name(P1) = xs:QName("", "xml-stylesheet")
dm:string-value(P1) = "type="text/xsl"  href="dm-example.xsl""
dm:typed-value(P1) = "type="text/xsl"  href="dm-example.xsl""
dm:parent(P1) = ([D1])
 
// Element node E1
dm:base-uri(E1) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E1) = "element"
dm:node-name(E1) = xs:QName("http://www.example.com/catalog", "catalog")
dm:string-value(E1) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00    It's  Been  A  While    10.99    Staind  "
dm:typed-value(E1) = fn:error()
dm:type-name(E1) = anon:TYP000001
dm:is-id(E1) = false
dm:is-idrefs(E1) = false
dm:parent(E1) = ([D1])
dm:children(E1) = ([C1], [E2], [E7])
dm:attributes(E1) = ([A1], [A2], [A3])
dm:namespace-nodes(E1) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E1) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")
 
// Attribute node A1
dm:node-kind(A1) = "attribute"
dm:node-name(A1) = xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:schemaLocation")
dm:string-value(A1) = "http://www.example.com/catalog                                                            dm-example.xsd"
dm:typed-value(A1) = (xs:anyURI("http://www.example.com/catalog"), xs:anyURI("dm-example.xsd"))
dm:type-name(A1) = anon:TYP000002
dm:is-id(A1) = false
dm:is-idrefs(A1) = false
dm:parent(A1) = ([E1])

// Attribute node A2
dm:node-kind(A2) = "attribute"
dm:node-name(A2) = xs:QName("http://www.w3.org/XML/1998/namespace", "xml:lang")
dm:string-value(A2) = "en"
dm:typed-value(A2) = "en"
dm:type-name(A2) = xs:NMTOKEN
dm:is-id(A2) = false
dm:is-idrefs(A2) = false
dm:parent(A2) = ([E1])

// Attribute node A3
dm:node-kind(A3) = "attribute"
dm:node-name(A3) = xs:QName("", "version")
dm:string-value(A3) = "0.1"
dm:typed-value(A3) = "0.1"
dm:type-name(A3) = xs:string
dm:is-id(A3) = false
dm:is-idrefs(A3) = false
dm:parent(A3) = ([E1])
 
// Comment node C1
dm:base-uri(C1) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(C1) = "comment"
dm:string-value(C1) = "  This  example  is  for  data  model  illustration  only.\n          It  does  not  demonstrate  good  schema  design.  "
dm:typed-value(C1) = "  This  example  is  for  data  model  illustration  only.\n          It  does  not  demonstrate  good  schema  design.  "
dm:parent(C1) = ([E1])
 
// Element node E2
dm:base-uri(E2) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E2) = "element"
dm:node-name(E2) = xs:QName("http://www.example.com/catalog", "tshirt")
dm:string-value(E2) = "  Staind:  Been  Awhile  Tee  Black  (1-sided)  \n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n          25.00  "
dm:typed-value(E2) = fn:error()
dm:type-name(E2) = cat:tshirtType
dm:is-id(E2) = false
dm:is-idrefs(E2) = false
dm:parent(E2) = ([E1])
dm:children(E2) = ([E3], [E4], [E6])
dm:attributes(E2) = ([A4], [A5], [A6], [A7])
dm:namespace-nodes(E2) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E2) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")
 
// Attribute node A4
dm:node-kind(A4) = "attribute"
dm:node-name(A4) = xs:QName("", "code")
dm:string-value(A4) = "T1534017"
dm:typed-value(A4) = xs:ID("T1534017")
dm:type-name(A4) = xs:ID
dm:is-id(A4) = true
dm:is-idrefs(A4) = false
dm:parent(A4) = ([E2])

// Attribute node A5
dm:node-kind(A5) = "attribute"
dm:node-name(A5) = xs:QName("", "label")
dm:string-value(A5) = "Staind  :  Been  Awhile"
dm:typed-value(A5) = xs:token("Staind : Been Awhile")
dm:type-name(A5) = xs:token
dm:is-id(A5) = false
dm:is-idrefs(A5) = false
dm:parent(A5) = ([E2])

// Attribute node A6
dm:node-kind(A6) = "attribute"
dm:node-name(A6) = xs:QName("http://www.w3.org/1999/xlink", "xlink:href")
dm:string-value(A6) = "http://example.com/0,,1655091,00.html"
dm:typed-value(A6) = xs:anyURI("http://example.com/0,,1655091,00.html")
dm:type-name(A6) = xs:anyURI
dm:is-id(A6) = false
dm:is-idrefs(A6) = false
dm:parent(A6) = ([E2])

// Attribute node A7
dm:node-kind(A7) = "attribute"
dm:node-name(A7) = xs:QName("", "sizes")
dm:string-value(A7) = "M  L  XL"
dm:typed-value(A7) = (xs:token("M"), xs:token("L"), xs:token("XL"))
dm:type-name(A7) = cat:sizeList
dm:is-id(A7) = false
dm:is-idrefs(A7) = false
dm:parent(A7) = ([E2])
 
// Element node E3
dm:base-uri(E3) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E3) = "element"
dm:node-name(E3) = xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E3) = "Staind:  Been  Awhile  Tee  Black  (1-sided)"
dm:typed-value(E3) = xs:token("Staind: Been Awhile Tee Black (1-sided)")
dm:type-name(E3) = xs:token
dm:is-id(E3) = false
dm:is-idrefs(E3) = false
dm:parent(E3) = ([E2])
dm:children(E3) = ([T1])
dm:attributes(E3) = ()
dm:namespace-nodes(E3) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E3) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Text node T1
dm:base-uri(T1) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T1) = "text"
dm:string-value(T1) = "Staind:  Been  Awhile  Tee  Black  (1-sided)"
dm:typed-value(T1) = xs:untypedAtomic("Staind:  Been  Awhile  Tee  Black  (1-sided)")
dm:type-name(T1) = xs:untypedAtomic
dm:parent(T1) = ([E3])
 
// Element node E4
dm:base-uri(E4) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E4) = "element"
dm:node-name(E4) = xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E4) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(E4) = xs:untypedAtomic("\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        ")
dm:type-name(E4) = cat:description
dm:is-id(E4) = false
dm:is-idrefs(E4) = false
dm:parent(E4) = ([E2])
dm:children(E4) = ([E5])
dm:attributes(E4) = ()
dm:namespace-nodes(E4) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E4) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")
 
// Element node E5
dm:base-uri(E5) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E5) = "element"
dm:node-name(E5) = xs:QName("http://www.w3.org/1999/xhtml", "html:p")
dm:string-value(E5) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(E5) = xs:untypedAtomic("\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        ")
dm:type-name(E5) = xs:anyType
dm:is-id(E5) = false
dm:is-idrefs(E5) = false
dm:parent(E5) = ([E4])
dm:children(E5) = ([T2])
dm:attributes(E5) = ()
dm:namespace-nodes(E5) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E5) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Text node T2
dm:base-uri(T2) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T2) = "text"
dm:string-value(T2) = "\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        "
dm:typed-value(T2) = xs:untypedAtomic("\n            Lyrics  from  the  hit  song  'It's  Been  Awhile'\n            are  shown  in  white,  beneath  the  large\n            'Flock  &  Weld'  Staind  logo.\n        ")
dm:type-name(T2) = xs:untypedAtomic
dm:parent(T2) = ([E5])
 
// Element node E6
dm:base-uri(E6) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E6) = "element"
dm:node-name(E6) = xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E6) = "25.00"
// The typed-value is based on the content type of the complex type for the element
dm:typed-value(E6) = cat:monetaryAmount(25.0)
dm:type-name(E6) = cat:price
dm:is-id(E6) = false
dm:is-idrefs(E6) = false
dm:parent(E6) = ([E2])
dm:children(E6) = ([T3])
dm:attributes(E6) = ()
dm:namespace-nodes(E6) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E6) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Text node T3
dm:base-uri(T3) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T3) = "text"
dm:string-value(T3) = "25.00"
dm:typed-value(T3) = xs:untypedAtomic("25.00")
dm:type-name(T3) = xs:untypedAtomic
dm:parent(T3) = ([E6])
 
// Element node E7
dm:base-uri(E7) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E7) = "element"
dm:node-name(E7) = xs:QName("http://www.example.com/catalog", "album")
dm:string-value(E7) = "  It's  Been  A  While    10.99    Staind  "
dm:typed-value(E7) = fn:error()
dm:type-name(E7) = cat:albumType
dm:is-id(E7) = false
dm:is-idrefs(E7) = false
dm:parent(E7) = ([E1])
dm:children(E7) = ([E8], [E9], [E10], [E11])
dm:attributes(E7) = ([A8], [A9], [A10])
dm:namespace-nodes(E7) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E7) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")
 
// Attribute node A8
dm:node-kind(A8) = "attribute"
dm:node-name(A8) = xs:QName("", "code")
dm:string-value(A8) = "A1481344"
dm:typed-value(A8) = xs:ID("A1481344")
dm:type-name(A8) = xs:ID
dm:is-id(A8) = true
dm:is-idrefs(A8) = false
dm:parent(A8) = ([E7])

// Attribute node A9
dm:node-kind(A9) = "attribute"
dm:node-name(A9) = xs:QName("", "label")
dm:string-value(A9) = "Staind  :  Its  Been  A  While"
dm:typed-value(A9) = xs:token("Staind : Its Been A While")
dm:type-name(A9) = xs:token
dm:is-id(A9) = false
dm:is-idrefs(A9) = false
dm:parent(A9) = ([E7])

// Attribute node A10
dm:node-kind(A10) = "attribute"
dm:node-name(A10) = xs:QName("", "formats")
dm:string-value(A10) = "CD"
dm:typed-value(A10) = cat:formatType("CD")
dm:type-name(A10) = cat:formatType
dm:is-id(A10) = false
dm:is-idrefs(A10) = false
dm:parent(A10) = ([E7])
 
// Element node E8
dm:base-uri(E8) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E8) = "element"
dm:node-name(E8) = xs:QName("http://www.example.com/catalog", "title")
dm:string-value(E8) = "It's  Been  A  While"
dm:typed-value(E8) = xs:token("It's Been A While")
dm:type-name(E8) = xs:token
dm:is-id(E8) = false
dm:is-idrefs(E8) = false
dm:parent(E8) = ([E7])
dm:children(E8) = ([T4])
dm:attributes(E8) = ()
dm:namespace-nodes(E8) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E8) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Text node T4
dm:base-uri(T4) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T4) = "text"
dm:string-value(T4) = "It's  Been  A  While"
dm:typed-value(T4) = xs:untypedAtomic("It's  Been  A  While")
dm:type-name(T4) = xs:untypedAtomic
dm:parent(T4) = ([E8])
 
// Element node E9
dm:base-uri(E9) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E9) = "element"
dm:node-name(E9) = xs:QName("http://www.example.com/catalog", "description")
dm:string-value(E9) = ""
// xsi:nil is true so the typed value is the empty sequence
dm:typed-value(E9) = ()
dm:type-name(E9) = cat:description
dm:is-id(E9) = false
dm:is-idrefs(E9) = false
dm:parent(E9) = ([E7])
dm:children(E9) = ()
dm:attributes(E9) = ([A11])
dm:namespace-nodes(E9) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E9) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Attribute node A11
dm:node-kind(A11) = "attribute"
dm:node-name(A11) = xs:QName("http://www.w3.org/2001/XMLSchema-instance", "xsi:nil")
dm:string-value(A11) = "true"
dm:typed-value(A11) = xs:boolean("true")
dm:type-name(A11) = xs:boolean
dm:is-id(A11) = false
dm:is-idrefs(A11) = false
dm:parent(A11) = ([E9])
 
// Element node E10
dm:base-uri(E10) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E10) = "element"
dm:node-name(E10) = xs:QName("http://www.example.com/catalog", "price")
dm:string-value(E10) = "10.99"
dm:typed-value(E10) = cat:monetaryAmount(10.99)
dm:type-name(E10) = cat:price
dm:is-id(E10) = false
dm:is-idrefs(E10) = false
dm:parent(E10) = ([E7])
dm:children(E10) = ([T5])
dm:attributes(E10) = ([A12])
dm:namespace-nodes(E10) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E10) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Attribute node A12
dm:node-kind(A12) = "attribute"
dm:node-name(A12) = xs:QName("", "currency")
dm:string-value(A12) = "USD"
dm:typed-value(A12) = cat:currencyType("USD")
dm:type-name(A12) = cat:currencyType
dm:is-id(A12) = false
dm:is-idrefs(A12) = false
dm:parent(A12) = ([E10])

// Text node T5
dm:base-uri(T5) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T5) = "text"
dm:string-value(T5) = "10.99"
dm:typed-value(T5) = xs:untypedAtomic("10.99")
dm:type-name(T5) = xs:untypedAtomic
dm:parent(T5) = ([E10])
 
// Element node E11
dm:base-uri(E11) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(E11) = "element"
dm:node-name(E11) = xs:QName("http://www.example.com/catalog", "artist")
dm:string-value(E11) = "  Staind  "
dm:typed-value(E11) = " Staind "
dm:type-name(E11) = xs:string
dm:is-id(E11) = false
dm:is-idrefs(E11) = false
dm:parent(E11) = ([E7])
dm:children(E11) = ([T6])
dm:attributes(E11) = ()
dm:namespace-nodes(E11) = ([N1], [N2], [N3], [N4], [N5])
dm:namespace-bindings(E11) = ("xml", "http://www.w3.org/XML/1998/namespace", "", "http://www.example.com/catalog", "html", "http://www.w3.org/1999/xhtml", "xlink", "http://www.w3.org/1999/xlink", "xsi", "http://www.w3.org/2001/XMLSchema-instance")

// Text node T6
dm:base-uri(T6) = xs:anyURI("http://www.example.com/catalog.xml")
dm:node-kind(T6) = "text"
dm:string-value(T6) = "  Staind  "
dm:typed-value(T6) = xs:untypedAtomic("  Staind  ")
dm:type-name(T6) = xs:untypedAtomic
dm:parent(T6) = ([E11])
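
The dm:parent values in the listing are enough to rebuild the child relation, since dm:children is its inverse for nodes on the child axis. The Python sketch below does that for the document, element, text, comment, and processing instruction nodes. The `PARENT` table and `children_of` function are hypothetical names; the `None` entry for D1 is an assumption (the listing gives no dm:parent line for the Document Node), and the nodes are entered in the order in which they appear in the listing.

```python
from typing import Dict, List, Optional

# dm:parent values transcribed from the listing. Attribute and namespace
# nodes are left out because they are not children of their parent element.
PARENT: Dict[str, Optional[str]] = {
    "D1": None,  # assumption: the Document Node has no parent
    "P1": "D1", "E1": "D1",
    "C1": "E1", "E2": "E1",
    "E3": "E2", "T1": "E3",
    "E4": "E2", "E5": "E4", "T2": "E5",
    "E6": "E2", "T3": "E6",
    "E7": "E1",
    "E8": "E7", "T4": "E8",
    "E9": "E7",
    "E10": "E7", "T5": "E10",
    "E11": "E7", "T6": "E11",
}


def children_of(parent_of: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Invert a parent table into a children table, keeping insertion order."""
    children: Dict[str, List[str]] = {node: [] for node in parent_of}
    for node, parent in parent_of.items():
        if parent is not None:
            children[parent].append(node)
    return children


CHILDREN = children_of(PARENT)
```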
 

A graphical representation of the data model for the preceding example is shown below. Document order in this representation can be found by following the traditional left-to-right, depth-first traversal; however, because the image has been rotated for easier presentation, this appears to be bottom-to-top, depth-first order.

Figure: Graphic representation of the data model for the example.

F Implementation-Defined and Implementation-Dependent Items (Non-Normative)

F.1 Implementation-Defined Items

The following items are implementation-defined.

  1. Support for additional user-defined or implementation-defined types is implementation-defined. (See 2.6.1 Representation of Types)
  2. Some typed values in the data model are undefined. Attempting to access an undefined property is always an error. Behavior in these cases is implementation-defined and the host language is responsible for determining the result. (See 5 Accessors)

F.2 Implementation-Dependent Items

The following items are implementation-dependent.

  1. The relative order of Namespace Nodes is stable but implementation-dependent. (See 2.4 Document Order)
  2. The relative order of Attribute Nodes is stable but implementation-dependent. (See 2.4 Document Order)
  3. The relative order of distinct trees is stable but implementation-dependent. (See 2.4 Document Order)
  4. The names of anonymous types are implementation-dependent. (See 2.6.1 Representation of Types)
  5. The prefix associated with type names is implementation-dependent. (See 3.3.1.1 Element and Attribute Node Type Names)
  6. The representation of the set of prefix/URI pairs returned by the dm:namespace-bindings accessor is implementation-dependent. (See 5.7 namespace-bindings Accessor)
  7. The representation of namespaces, i.e. whether or not they are represented as nodes, is implementation-dependent. (See 6 Nodes)

G Change Log (Non-Normative)

This appendix details the changes made since the 4 April 2005 Last Call Working Draft.

G.1 Changes Since 3 November 2005

This section details the changes made since the 3 November 2005 Candidate Recommendation.

  • Fixed bug 3032: Setting the is-id and is-idref property on element nodes.

  • Fixed bug 2548: Change namespace bindings for xdt:* to xs:*

  • Fixed bug 2629: editorial rewording to avoid potentially ambiguous phrase “may not”.

  • Fixed bug 2706: Added the [validation attempted] and [nil] properties to Appendix A. Double checked other uses of info-item and infoset-property.

  • Fixed bug 2463: Added constraints on the string-value of Element Nodes and Document Nodes.

  • Fixed bug 2772: Removed spurious comma and unnecessary “respectively”.

  • Fixed bug 2630: Added the proposed text about fixed and default values.

G.2 Changes Since 15 September 2005

This section details the changes made since the 15 September 2005 Working Draft.

  • Added a suggested abbreviation for the Data Model: “XDM”.

  • Clarified how text nodes are constructed from an Infoset or PSVI with respect to white space.

  • Noted the white space collapsing required by data model construction as an issue of particular interest in the Status.

  • Added expansion of PSVI and pointer to definition in Schema Part 1.

G.3 Changes Since 4 April 2005

This section details the changes made since the 4 April 2005 Last Call Working Draft.

  • Removed references to leap seconds per WG decision to stop supporting them.

  • Fixed bug 1295 by changing the definition of xdt:anyAtomicType in the following way: [Definition: xdt:anyAtomicType is an atomic type that includes all atomic values (and no values that are not atomic).] Its base type is xs:anySimpleType from which all simple types, including atomic, list, and union types are derived. All primitive atomic types, such as xs:integer and xs:string, have xdt:anyAtomicType as their base type.

  • Fixed bug 1296 by changing the description of [declaration base URI] in the following way: Implementation defined. In many cases, the document-uri is the correct answer and implementations MUST use this value if they have no better information. Implementations that keep track of the original declaration base URI for entities should use that value.

  • Made it clear that types derived from xsd:ID are IDs for the purpose of is-id and that types derived from xsd:IDREF are IDREFs for the purpose of is-idref.

  • Fixed bug 1341 by adding the following text to the terminology section: Within this specification, the term URI refers to a Uniform Resource Identifier as defined in RFC 3986 and extended in RFC 3987 with the new name IRI. The term URI has been retained in preference to IRI to avoid introducing new names for concepts such as 'Base URI' that are defined or referenced across the whole family of XML specifications.

  • Fixed bug 1297 by clarifying that the result should be the empty sequence.

  • Fixed bug 1294 by rewording the paragraph in question as follows: The data model relies on the fact that an expanded-QName uniquely identifies every named type. Although it is possible for different schemas to define different types with the same expanded-QName, at most one of them can be used in any given validation episode. The data model cannot support environments where different types with the same expanded-QName are available.

  • Fixed bug 1293 by making the relevant editorial corrections.

  • Declined to fix bug 1342.

  • Fixed bug 1299 by making the obvious editorial change.

  • Fixed bug 1301 by making the obvious editorial change.

  • Changes to the XSLT enforce the constraint identified in bug 1234.

  • Fixed bug 1482 by making the editorial change suggested.

  • Fixed bug 1259 by making the necessary changes to the text. See also bug 1232.

  • Fixed bug 1409 by making the changes resolved at the Raleigh f2f meeting.

  • Fixed bug 1315 by replacing bullet #2 in the referenced section with the following: "If the element has a complex type with empty content, its typed-value is the empty sequence."

  • Fixed bug 1300 by removing the offending paragraph.

  • Fixed bug 1302 by changing the phrase to make it clear that it is a MUST.

  • Fixed bug 1231 by accepting the proposed changes to the pattern facets.

  • Added a note about how 24:00:00 is handled.

H Accessor Summary (Non-Normative)

This section summarizes the return values of each accessor by node type.

H.1 dm:attributes Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the attributes property. The order of Attribute Nodes is stable but implementation dependent.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.2 dm:base-uri Accessor

Document Nodes

Returns the value of the base-uri property if it exists and is not empty, otherwise returns the empty sequence.

Element Nodes

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the element has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns the empty sequence.

Attribute Nodes

If the attribute has a parent, returns the value of the dm:base-uri of its parent; otherwise it returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the value of the base-uri property if it exists and is not empty. Otherwise, if the processing instruction has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns the empty sequence.

Comment Nodes

If the comment has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns the empty sequence.

Text Nodes

If the Text Node has a parent, returns the value of the dm:base-uri of its parent; otherwise, returns the empty sequence.
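
The inheritance rules summarized above can be sketched in Python. This is only an illustration under stated assumptions: the `BaseUriNode` class and its field names are hypothetical and not part of the data model, and the empty sequence is modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseUriNode:
    """Hypothetical stand-in for a node; not defined by this specification."""
    kind: str                              # "document", "element", "text", ...
    base_uri: Optional[str] = None         # the node's own base-uri property
    parent: Optional["BaseUriNode"] = None


def dm_base_uri(node: BaseUriNode) -> Optional[str]:
    """Follow the H.2 summary; None models the empty sequence."""
    if node.kind == "namespace":
        return None
    if node.kind in ("document", "element", "processing-instruction"):
        if node.base_uri:                  # property exists and is not empty
            return node.base_uri
        if node.kind == "document":
            return None
    # Elements and processing instructions without their own value, and
    # attribute, comment, and text nodes, defer to the parent if there is one.
    return dm_base_uri(node.parent) if node.parent is not None else None
```

With a Document Node whose base-uri is set, a Text Node two levels below it reports the same URI, while a parentless Comment Node reports the empty sequence.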

H.3 dm:children Accessor

Document Nodes

Returns the value of the children property.

Element Nodes

Returns the value of the children property.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.4 dm:document-uri Accessor

Document Nodes

Returns the absolute URI of the resource from which the Document Node was constructed, or the empty sequence if no such absolute URI is available.

Element Nodes

Returns the empty sequence.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.5 dm:is-id Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the is-id property.

Attribute Nodes

Returns the value of the is-id property.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.6 dm:is-idrefs Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the is-idrefs property.

Attribute Nodes

Returns the value of the is-idrefs property.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.7 dm:namespace-bindings Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the namespaces property as a set of prefix/URI pairs.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.8 dm:namespace-nodes Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the namespaces property as a sequence of Namespace Nodes. The order of Namespace Nodes is stable but implementation dependent.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.9 dm:nilled Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the nilled property.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.10 dm:node-kind Accessor

Document Nodes

Returns “document”.

Element Nodes

Returns “element”.

Attribute Nodes

Returns “attribute”.

Namespace Nodes

Returns “namespace”.

Processing Instruction Nodes

Returns “processing-instruction”.

Comment Nodes

Returns “comment”.

Text Nodes

Returns “text”.

H.11 dm:node-name Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the node-name property.

Attribute Nodes

Returns the value of the node-name property.

Namespace Nodes

If the prefix is not empty, returns an xs:QName with the value of the prefix property in the local-name and an empty namespace name, otherwise returns the empty sequence.

Processing Instruction Nodes

Returns an xs:QName with the value of the target property in the local-name and an empty namespace URI and empty prefix.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.12 dm:parent Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the parent property.

Attribute Nodes

Returns the value of the parent property.

Namespace Nodes

Returns the value of the parent property.

Processing Instruction Nodes

Returns the value of the parent property.

Comment Nodes

Returns the value of the parent property.

Text Nodes

Returns the value of the parent property.

H.13 dm:string-value Accessor

Document Nodes

Returns the value of the string-value property.

Element Nodes

Returns the value of the string-value property.

Attribute Nodes

Returns the value of the string-value property.

Namespace Nodes

Returns the value of the uri property.

Processing Instruction Nodes

Returns the value of the content property.

Comment Nodes

Returns the value of the content property.

Text Nodes

Returns the value of the content property.

H.14 dm:type-name Accessor

Document Nodes

Returns the empty sequence.

Element Nodes

Returns the value of the type-name property.

Attribute Nodes

Returns the value of the type-name property.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns xs:untypedAtomic.

H.15 dm:typed-value Accessor

Document Nodes

Returns the value of the typed-value property.

Element Nodes

Returns the value of the typed-value property.

Attribute Nodes

Returns the value of the typed-value property.

Namespace Nodes

Returns the value of the uri property as an xs:string.

Processing Instruction Nodes

Returns the value of the content property as an xs:string.

Comment Nodes

Returns the value of the content property as an xs:string.

Text Nodes

Returns the value of the content property as an xs:untypedAtomic.

H.16 dm:unparsed-entity-public-id Accessor

Document Nodes

Returns the public identifier of the specified unparsed entity or the empty sequence if no such entity exists.

Element Nodes

Returns the empty sequence.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

H.17 dm:unparsed-entity-system-id Accessor

Document Nodes

Returns the system identifier of the specified unparsed entity or the empty sequence if no such entity exists.

Element Nodes

Returns the empty sequence.

Attribute Nodes

Returns the empty sequence.

Namespace Nodes

Returns the empty sequence.

Processing Instruction Nodes

Returns the empty sequence.

Comment Nodes

Returns the empty sequence.

Text Nodes

Returns the empty sequence.

I Infoset Construction Summary (Non-Normative)

This section summarizes data model construction from an Infoset for each kind of information item. General notes occur elsewhere.

I.1 Document Nodes Information Items

The document information item is required. A Document Node is constructed for each document information item.

The following infoset properties are required: [children] and [base URI].

The following infoset properties are optional: [unparsed entities].

Document Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, and comment found in the [children] property, a corresponding Element, Processing Instruction, or Comment Node is constructed and that sequence of nodes is used as the value of the children property.

If present among the [children], the document type declaration information item is ignored.

unparsed-entities

If the [unparsed entities] property is present and is not the empty set, the values of the unparsed entity information items must be used to support the dm:unparsed-entity-system-id and dm:unparsed-entity-public-id accessors.

The internal structure of the values of the unparsed-entities property is implementation defined.

string-value

The concatenation of the string-values of all its Text Node descendants in document order. If the document has no such descendants, the zero-length string.

typed-value

The dm:string-value of the node as an xs:untypedAtomic value.

document-uri

The document-uri property holds the absolute URI for the resource from which the document node was constructed, if one is available and can be made absolute. For example, if a collection of documents is returned by the fn:collection function, the document-uri property may serve to distinguish between them even though each has the same base-uri property.

If the document-uri is not the empty sequence, then the following constraint must hold: the node returned by evaluating fn:doc() with the document-uri as its argument must return the document node that provided the value of the document-uri property.

In other words, for any Document Node $arg, either fn:document-uri($arg) must return the empty sequence or fn:doc(fn:document-uri($arg)) must return $arg.

I.2 Element Nodes Information Items

The element information items are required. An Element Node is constructed for each element information item.

The following infoset properties are required: [namespace name], [local name], [children], [attributes], [in-scope namespaces], [base URI], and [parent].

Element Node properties are derived from the infoset as follows:

base-uri

The value of the [base URI] property.

node-name

An xs:QName constructed from the [prefix], [local name], and [namespace name] properties.

parent

The node that corresponds to the value of the [parent] property or the empty sequence if there is no parent.

type-name

All Element Nodes constructed from an infoset have the type xs:untyped.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

Default and fixed attributes provided by the DTD are added to the [attributes] and are therefore included in the data model attributes of an element.

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property. Implementations that do not support Namespace Nodes may simply preserve the relevant bindings in this property.

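Implementations may ignore namespace information items for namespaces which are not known to be used. A namespace is known to be used if:

  • It appears in the expanded QName of the node-name of the element.

  • It appears in the expanded QName of the node-name of any of the element's attributes.

  • It appears in the expanded QName of any values of type xs:QName that appear among the element's children or the typed values of its attributes.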

Note: applications may rely on namespaces that are not known to be used, for example when QNames are used in content and that content does not have a type of xs:QName. Such applications may have difficulty processing data models where some namespaces have been ignored.

nilled

All Element Nodes constructed from an infoset have a nilled property of "false".

string-value

The string-value is constructed from the character information item [children] of the element and all its descendants. The precise rules for selecting significant character information items and constructing characters from them are described in 6.7.3 Construction from an Infoset of 6.7 Text Nodes.

This process is equivalent to concatenating the dm:string-values of all of the Text Node descendants of the resulting Element Node.

If the element has no such descendants, the string-value is the empty string.
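
The equivalence with concatenating Text Node descendants can be sketched as follows. The `TreeNode` class is a hypothetical simplification introduced for this example; it models only the properties needed here and is not part of the data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TreeNode:
    """Hypothetical node: kind is "element", "text", "comment", and so on."""
    kind: str
    content: str = ""                                   # used by text nodes
    children: List["TreeNode"] = field(default_factory=list)


def element_string_value(node: TreeNode) -> str:
    """Concatenate the content of all Text Node descendants in document order."""
    if node.kind == "text":
        return node.content
    # Comments and processing instructions have no children here, so they
    # contribute the empty string.
    return "".join(element_string_value(child) for child in node.children)
```

An element with no Text Node descendants yields the empty string, matching the last rule above.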

typed-value

The string-value as an xs:untypedAtomic.

is-id

All Element Nodes constructed from an infoset have an is-id property of "false".

is-idrefs

All Element Nodes constructed from an infoset have an is-idrefs property of "false".

I.3 Attribute Nodes Information Items

The attribute information items are required. An Attribute Node is constructed for each attribute information item.

The following infoset properties are required: [namespace name], [local name], [normalized value], [attribute type], and [owner element].

Attribute Node properties are derived from the infoset as follows:

node-name

An xs:QName constructed from the [prefix], [local name], and [namespace name] properties.

parent

The Element Node that corresponds to the value of the [owner element] property or the empty sequence if there is no owner.

type-name

The value xs:untypedAtomic.

string-value

The [normalized value] of the attribute.

typed-value

The attribute’s typed-value is its dm:string-value as an xs:untypedAtomic.

is-id

If the attribute is named xml:id and its [attribute type] property does not have the value ID, then [xml:id] processing is performed. This will assure that the value does have the type ID and that it is properly normalized. The is-id is always true for attributes named xml:id.

If the [attribute type] property has the value ID, true, otherwise false.

is-idrefs

If the [attribute type] property has the value IDREF or IDREFS, true, otherwise false.
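
As a rough sketch, the is-id and is-idrefs rules for attributes constructed from an infoset reduce to two small predicates. The function and parameter names are assumptions made for this example, and the normalization performed by [xml:id] processing is not modeled.

```python
XML_NAMESPACE = "http://www.w3.org/XML/1998/namespace"


def attribute_is_id(namespace_name, local_name, attribute_type):
    """is-id for an attribute built from an infoset (hypothetical helper)."""
    # Attributes named xml:id are always treated as IDs.
    if namespace_name == XML_NAMESPACE and local_name == "id":
        return True
    return attribute_type == "ID"


def attribute_is_idrefs(attribute_type):
    """is-idrefs is true only for the [attribute type] values IDREF and IDREFS."""
    return attribute_type in ("IDREF", "IDREFS")
```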

I.4 Namespace Nodes Information Items

The namespace information items are required.

The following infoset properties are required: [prefix], [namespace name].

Namespace Node properties are derived from the infoset as follows:

prefix

The [prefix] property.

uri

The [namespace name] property.

parent

The element in whose [in-scope namespaces] property the namespace information item appears, if the implementation exposes any mechanism for accessing the dm:parent accessor of Namespace Nodes.

I.5 Processing Instruction Nodes Information Items

A Processing Instruction Node is constructed for each processing instruction information item that is not ignored.

The following infoset properties are required: [target], [content], [base URI], and [parent].

Processing Instruction Node properties are derived from the infoset as follows:

target

The value of the [target] property.

content

The value of the [content] property.

base-uri

The value of the [base URI] property.

parent

The node corresponding to the value of the [parent] property.

There are no Processing Instruction Nodes for processing instructions that are children of a document type declaration information item.

I.6 Comment Nodes Information Items

The comment information items are optional.

A Comment Node is constructed for each comment information item.

The following infoset properties are required: [content] and [parent].

Comment Node properties are derived from the infoset as follows:

content

The value of the [content] property.

parent

The node corresponding to the value of the [parent] property.

There are no Comment Nodes for comments that are children of a document type declaration information item.

I.7 Text Nodes Information Items

The character information items are required. A Text Node is constructed for each maximal sequence of character information items in document order.

The following infoset properties are required: [character code] and [parent].

The following infoset properties are optional: [element content whitespace].

A sequence of character information items is maximal if it satisfies the following constraints:

  1. All of the information items in the sequence have the same parent.

  2. The sequence consists of adjacent character information items uninterrupted by other types of information item.

  3. No other such sequence exists that contains any of the same character information items and is longer.
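
The three constraints above amount to grouping adjacent character information items within one parent's [children] list. A minimal sketch, assuming the children are encoded as hypothetical `(kind, value)` tuples:

```python
from itertools import groupby


def group_children(children):
    """Collapse each maximal run of adjacent character items into one text entry.

    `children` is the [children] list of a single parent, so every item
    already shares the same parent; the tuple encoding is an assumption.
    """
    result = []
    for is_char, run in groupby(children, key=lambda item: item[0] == "character"):
        if is_char:
            # An uninterrupted run of characters is one maximal sequence.
            result.append(("text", "".join(value for _, value in run)))
        else:
            result.extend(run)
    return result
```

Because `groupby` only merges consecutive items, a comment between two characters splits them into two separate text entries.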

Text Node properties are derived from the infoset as follows:

content

A string comprised of characters that correspond to the [character code] properties of each of the character information items.

If the resulting Text Node consists entirely of whitespace and the [element content whitespace] property of the character information items used to construct this node is true, the content of the Text Node is the zero-length string. Text Nodes are only allowed to be empty if they have no parents; an empty Text Node will be discarded when its parent is constructed, if it has a parent.

The content of the Text Node is not necessarily normalized as described in the [Character Model]. It is the responsibility of data producers to provide normalized text, and the responsibility of applications to make sure that operations do not de-normalize text.

parent

The node corresponding to the value of the [parent] property.

J PSVI Construction Summary (Non-Normative)

This section summarizes data model construction from a PSVI for each kind of information item. General notes occur elsewhere.

J.1 Document Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

J.2 Element Nodes Information Items

The following Element Node properties are affected by PSVI properties.

type-name

The type-name is determined as described in 3.3.1.1 Element and Attribute Node Type Names.

children

The sequence of nodes constructed from the information items found in the [children] property.

For each element, processing instruction, comment, and maximal sequence of adjacent character information items found in the [children] property, a corresponding Element, Processing Instruction, Comment, or Text Node is constructed and that sequence of nodes is used as the value of the children property.

For elements with schema simple types, or complex types with simple content, if the [schema normalized value] PSVI property exists, the processor may use a sequence of nodes containing the Processing Instruction and Comment Nodes corresponding to the processing instruction and comment information items found in the [children] property, plus an optional single Text Node whose string value is the [schema normalized value] for the children property. If the [schema normalized value] is the empty string, the Text Node must not be present, otherwise it must be present.

The relative order of Processing Instruction and Comment Nodes must be preserved, but the position of the Text Node, if it is present, among them is implementation defined.

The effect of the above rules is that where a fixed or default value for an element is defined in the schema, and the element takes this default value, a text node will be created to contain the value, even though there are no character information items representing the value in the PSVI. The position of this text node relative to any comment or processing instruction children is implementation-dependent.

[Schema Part 1] also permits an element with mixed content to take a default or fixed value (which will always be a simple value), but at the time of this writing it is unclear how such a defaulted value is represented in the PSVI. Implementations therefore may represent such a default value by creating a text node, but are not required to do so.

Because the data model requires that all general entities be expanded, there will never be unexpanded entity reference information item children.

attributes

A set of Attribute Nodes constructed from the attribute information items appearing in the [attributes] property. This includes all of the "special" attributes (xml:lang, xml:space, xsi:type, etc.) but does not include namespace declarations (because they are not attributes).

Default and fixed attributes provided by XML Schema processing are added to the [attributes] and are therefore included in the data model attributes of an element.

namespaces

A set of Namespace Nodes constructed from the namespace information items appearing in the [in-scope namespaces] property. Implementations that do not support Namespace Nodes may simply preserve the relevant bindings in this property.

Implementations may ignore namespace information items for namespaces which are not known to be used. A namespace is known to be used if:

  • It appears in the expanded QName of the node-name of the element.

  • It appears in the expanded QName of the node-name of any of the element's attributes.

  • It appears in the expanded QName of any values of type xs:QName that appear among the element's children or the typed values of its attributes.

Note: applications may rely on namespaces that are not known to be used, for example when QNames are used in content and that content does not have a type of xs:QName. Such applications may have difficulty processing data models where some namespaces have been ignored.
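
The "known to be used" test can be sketched as a filter over prefix/URI bindings. Pruning is optional for implementations; the function name and the way the inputs are passed are assumptions of this example.

```python
def prune_namespaces(in_scope, element_ns, attribute_ns, qname_value_ns):
    """Keep only bindings whose namespace URI is known to be used.

    in_scope:        dict mapping prefix -> namespace URI
    element_ns:      namespace URI of the element's node-name (or None)
    attribute_ns:    namespace URIs of the attributes' node-names
    qname_value_ns:  namespace URIs of xs:QName values among the element's
                     children or the typed values of its attributes
    """
    used = {element_ns, *attribute_ns, *qname_value_ns}
    used.discard(None)          # names in no namespace contribute nothing
    return {prefix: uri for prefix, uri in in_scope.items() if uri in used}
```

As the note above warns, a binding used only by QNames in untyped content would be dropped by this filter.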

nilled

If the [validity] property exists on an information item and is "valid", and the [nil] property exists and is true, then the nilled property is "true". In all other cases, including all cases where schema validity assessment was not attempted or did not succeed, the nilled property is "false".
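
The rule is a conjunction of two conditions, which a short sketch makes explicit. Here `None` stands for an absent PSVI property; the function name and parameter names are hypothetical.

```python
def nilled(validity=None, nil=None):
    """Return the nilled property; None models an absent PSVI property."""
    # Both conditions must hold: the item is valid and [nil] is true.
    return validity == "valid" and nil is True
```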

string-value

The string-value is calculated as follows:

  • If the element is empty: its string value is the zero length string.

  • If the element has a type of xs:untyped, a complex type with element-only content, or a complex type with mixed content: its string-value is the concatenation of the string-values of all its Text Node descendants in document order.

  • If the element has a simple type or a complex type with simple content: its string-value is the [schema normalized value] of the node.

If an implementation stores only the typed value of an element, it may use any valid lexical representation of the typed value for the string-value property.

typed-value

The typed-value is calculated as follows:

  • If the element is of type xs:untyped, its typed-value is its dm:string-value as an xs:untypedAtomic.

  • If the element has a complex type with empty content, its typed-value is the empty sequence.

  • If the element has a simple type or a complex type with simple content: its typed value is computed as described in 3.3.1.2 Typed Value Determination. The result is a sequence of zero or more atomic values. The relationship between the type-name, typed-value, and string-value of an element node is consistent with XML Schema validation.

    Note that in the case of dates and times, the timezone is preserved as described in 3.3.2 Dates and Times, and in the case of xs:QNames and xs:NOTATIONs, the prefix is preserved as described in 3.3.3 QNames and NOTATIONS.

  • If the element has a complex type with mixed content (including xs:anyType), its typed-value is its dm:string-value as an xs:untypedAtomic.

  • Otherwise, the element must be a complex type with element-only content. The typed-value of such an element is undefined. Attempting to access this property with the dm:typed-value accessor always raises an error.
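
The case analysis above can be sketched as a dispatch on the element's type and content kind. Everything named here is an assumption for illustration: the `content_kind` labels, the `(type, value)` tuple encoding of atomic values, and the caller-supplied `atomize` function that stands in for 3.3.1.2 Typed Value Determination.

```python
class UndefinedTypedValue(Exception):
    """Raised where the typed-value is undefined (element-only content)."""


def element_typed_value(type_name, content_kind, string_value, atomize=None):
    """Select the typed-value rule for an element.

    content_kind is one of "empty", "simple", "mixed", "element-only".
    """
    if type_name == "xs:untyped":
        return [("xs:untypedAtomic", string_value)]
    if content_kind == "empty":
        return []                                   # the empty sequence
    if content_kind == "simple":
        return atomize(string_value)                # zero or more atomic values
    if content_kind == "mixed":
        return [("xs:untypedAtomic", string_value)]
    raise UndefinedTypedValue("element-only content has no typed-value")
```

How the error surfaces to a user is left to the host language; raising an exception is just one way to model it.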

is-id

If the element has a complex type with element-only content, the is-id property is false. Otherwise, if the typed-value of the element consists of exactly one atomic value and that value is of type xs:ID, or a type derived from xs:ID, the is-id property is true; otherwise it is false.

is-idrefs

If the element has a complex type with element-only content, the is-idrefs property is false. Otherwise, if any of the atomic values in the typed-value of the element is of type xs:IDREF or xs:IDREFS, or a type derived from one of those types, the is-idrefs property is true, otherwise it is false.
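
A minimal sketch of the two element rules follows. It assumes each atomic value in the typed-value is represented only by a tuple of type names running from the value's own type up through its base types; that encoding, and the function names, are hypothetical.

```python
def element_is_id(content_kind, typed_value):
    """True when the typed-value is exactly one value of type xs:ID or a subtype."""
    if content_kind == "element-only":
        return False
    return len(typed_value) == 1 and "xs:ID" in typed_value[0]


def element_is_idrefs(content_kind, typed_value):
    """True when any atomic value is of type xs:IDREF, xs:IDREFS, or a subtype."""
    if content_kind == "element-only":
        return False
    return any("xs:IDREF" in chain or "xs:IDREFS" in chain for chain in typed_value)
```

Note the asymmetry: is-id requires exactly one atomic value, whereas is-idrefs is satisfied by any matching value in the sequence.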

All other properties have values that are consistent with construction from an infoset.

J.3 Attribute Nodes Information Items

The following Attribute Node properties are affected by PSVI properties.

string-value

If an implementation stores only the typed value of an attribute, it may use any valid lexical representation of the typed value for the string-value property.

type-name

The type-name is determined as described in 3.3.1.1 Element and Attribute Node Type Names.

typed-value

The typed-value is calculated as follows:

  • If the attribute is of type xs:untypedAtomic: its typed-value is its dm:string-value as an xs:untypedAtomic.

  • Otherwise, a sequence of zero or more atomic values as described in 3.3.1.2 Typed Value Determination. The relationship between the type-name, typed-value, and string-value of an attribute node is consistent with XML Schema validation.

is-id

If the attribute is named xml:id and its [attribute type] property does not have the value ID, then [xml:id] processing is performed. This will assure that the value does have the type ID and that it is properly normalized. The is-id is always true for attributes named xml:id.

If the type-name is xs:ID or a type derived from xs:ID, true, otherwise false.

is-idrefs

If any of the atomic values in the typed-value of the attribute is of type xs:IDREF or xs:IDREFS, or a type derived from one of those types, the is-idrefs property is true, otherwise it is false.

All other properties have values that are consistent with construction from an infoset.

Note: attributes from the XML Schema instance namespace, "http://www.w3.org/2001/XMLSchema-instance", (xsi:schemaLocation, xsi:type, etc.) appear as ordinary attributes in the data model.

J.4 Namespace Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

J.5 Processing Instruction Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

J.6 Comment Nodes Information Items

Construction from a PSVI is identical to construction from the Infoset.

J.7 Text Nodes Information Items

For Text Nodes constructed from the [schema normalized value] of elements, content contains the value of the [schema normalized value].

Otherwise, construction from a PSVI is the same as construction from the Infoset except for the content property. When constructing the content property, [element content whitespace] is not used to test if whitespace is collapsed. Instead, if the resulting Text Node consists entirely of whitespace and the character information items used to construct this node have a parent and that parent is an element and its {content type} is not “mixed”, then the content of the Text Node is the zero-length string.
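
The whitespace rule in the preceding paragraph can be sketched as a small function. This is a minimal illustration, not a definitive implementation: the function name, the parameter names, and the use of plain strings such as "element", "mixed" and "elementOnly" to stand for the parent kind and its {content type} are assumptions of the sketch.

```python
XML_WHITESPACE = {" ", "\t", "\r", "\n"}


def psvi_text_content(text, parent_kind, parent_content_type):
    """Return the content of a Text Node built from character items.

    parent_kind is None when the character information items have no
    parent; parent_content_type is only meaningful for element parents.
    """
    all_whitespace = all(ch in XML_WHITESPACE for ch in text)
    if (
        all_whitespace
        and parent_kind == "element"
        and parent_content_type != "mixed"
    ):
        # Whitespace-only text in non-mixed content becomes the
        # zero-length string.
        return ""
    return text
```

Under these assumptions, whitespace-only text inside an element whose content type is not mixed yields the zero-length string, while the same text inside mixed content is preserved unchanged.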

K Infoset Mapping Summary (Non-Normative)

This section summarizes the infoset mapping for each kind of node. General notes occur elsewhere.

K.1 Document Nodes Information Items

A Document Node maps to a document information item. The mapping fails and produces no value if the Document Node contains Text Node children that do not consist entirely of white space or if the Document Node contains more than one Element Node child.
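
The failure condition above might be checked as in the following sketch. The representation of the children as (kind, string value) pairs and the function name are hypothetical, chosen only to make the two conditions concrete.

```python
def document_node_is_mappable(children):
    """Return False if the mapping to a document information item fails.

    children is a list of (kind, string_value) pairs, for example
    ("element", "...") or ("text", "  ").
    """
    element_count = sum(1 for kind, _ in children if kind == "element")
    if element_count > 1:
        return False  # more than one Element Node child
    for kind, value in children:
        # A Text Node child must consist entirely of white space.
        if kind == "text" and value.strip(" \t\r\n") != "":
            return False
    return True
```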

The following properties are specified by this mapping:

[children]

A list of information items obtained by processing each of the dm:children in order and mapping each to the appropriate information item(s).

[document element]

The element information item that is among the [children].

[unparsed entities]

An unordered set of unparsed entity information items constructed from the unparsed-entities.

Each unparsed entity maps to an unparsed entity information item. The following properties are specified by this mapping:

[name]

The name of the entity.

[system identifier]

The system identifier of the entity.

[public identifier]

The public identifier of the entity.

[declaration base URI]

Implementation defined. In many cases, the document-uri is the correct answer and implementations must use this value if they have no better information. Implementations that keep track of the original [declaration base URI] for entities should use that value.

The following properties of the unparsed entity information item have no value: [notation name], [notation].

The following properties of the document information item have no value: [notations], [character encoding scheme], [standalone], [version], [all declarations processed].
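
As a rough sketch, the mapping of one unparsed entity to an unparsed entity information item, including the fallback for [declaration base URI], could look like this. The function and parameter names are hypothetical, a dictionary stands in for the information item, and None is used here to represent "no value".

```python
def unparsed_entity_to_info_item(name, system_id, public_id,
                                 document_uri, tracked_base_uri=None):
    """Build a dictionary standing in for an unparsed entity item.

    tracked_base_uri is the original declaration base URI, if the
    implementation kept track of it; otherwise document_uri is used.
    """
    if tracked_base_uri is not None:
        base_uri = tracked_base_uri
    else:
        base_uri = document_uri
    return {
        "name": name,
        "system identifier": system_id,
        "public identifier": public_id,
        "declaration base URI": base_uri,
        "notation name": None,  # no value
        "notation": None,       # no value
    }
```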

K.2 Element Nodes Information Items

An Element Node maps to an element information item.

The following properties are specified by this mapping:

[namespace name]

The namespace name of the value of dm:node-name.

[local name]

The local part of the value of dm:node-name.

[prefix]

The prefix associated with the value of dm:node-name.

[children]

A list of information items obtained by processing each of the dm:children in order and mapping each to the appropriate information item(s).

[attributes]

An unordered set of information items obtained by processing each of the dm:attributes and mapping each to the appropriate information item(s).

[in-scope namespaces]

An unordered set of namespace information items constructed from the namespaces.

Each in-scope namespace maps to a namespace information item. The following properties are specified by this mapping:

[prefix]

The prefix associated with the namespace.

[namespace name]

The URI associated with the namespace.

[base URI]

The value of dm:base-uri.

[parent]

The following property has no value: [namespace attributes].

K.3 Attribute Nodes Information Items

An Attribute Node maps to an attribute information item.

The following properties are specified by this mapping:

[namespace name]

The namespace name of the value of dm:node-name.

[local name]

The local part of the value of dm:node-name.

[prefix]

The prefix associated with the value of dm:node-name.

[normalized value]

The value of dm:string-value.

[owner element]

The following properties have no value: [specified], [attribute type], [references].
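
The Attribute Node mapping in this section can be illustrated with a short sketch. The QName tuple, the function name, and the use of a dictionary with None for "no value" are assumptions of the sketch; [owner element] is left out because this summary gives no rule for it.

```python
from collections import namedtuple

# Hypothetical stand-in for the value of dm:node-name.
QName = namedtuple("QName", ["namespace_name", "local_part", "prefix"])


def attribute_to_info_item(node_name, string_value):
    """Map dm:node-name and dm:string-value to attribute item properties."""
    return {
        "namespace name": node_name.namespace_name,
        "local name": node_name.local_part,
        "prefix": node_name.prefix,
        "normalized value": string_value,
        "specified": None,       # no value
        "attribute type": None,  # no value
        "references": None,      # no value
    }
```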

K.4 Namespace Nodes Information Items

A Namespace Node maps to a namespace information item.

The following properties are specified by this mapping:

[prefix]

The prefix associated with the namespace.

[namespace name]

The value of dm:string-value.

K.5 Processing Instruction Nodes Information Items

A Processing Instruction Node maps to a processing instruction information item.

The following properties are specified by this mapping:

[target]

The local part of the value of dm:node-name.

[content]

The value of dm:string-value.

[base URI]

The value of dm:base-uri.

[parent]
[notation]

no value.

K.6 Comment Nodes Information Items

A Comment Node maps to a comment information item.

The following properties are specified by this mapping:

[content]

The value of dm:string-value.

[parent]

K.7 Text Nodes Information Items

A Text Node maps to a sequence of character information items.

Each character of the dm:string-value of the node is converted into a character information item as specified by this mapping:

[character code]

The Unicode code point value of the character.

[parent]
[element content whitespace]

Unknown.

This sequence of characters constitutes the infoset mapping.
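
A minimal sketch of this mapping, assuming a dictionary stands in for each character information item and None stands in for "unknown", is shown below. The function name is hypothetical, and [parent] is omitted because this summary gives no rule for it.

```python
def text_node_to_character_items(string_value):
    """Map each character of dm:string-value to a character item."""
    return [
        {
            # Python strings iterate by Unicode code point, so ord()
            # yields the code point value of each character.
            "character code": ord(ch),
            "element content whitespace": None,  # unknown
        }
        for ch in string_value
    ]
```

Because iteration is by code point, a character outside the Basic Multilingual Plane produces a single item rather than a surrogate pair.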