<?xml version="1.0"?>
<!DOCTYPE spec SYSTEM "../schema/xsl-query.dtd" [

<!ENTITY Document SYSTEM "document.xml">
<!ENTITY Element  SYSTEM "element.xml">
<!ENTITY Attribute  SYSTEM "attribute.xml">
<!ENTITY Namespace  SYSTEM "namespace.xml">
<!ENTITY ProcessingInstruction  SYSTEM "processing-instruction.xml">
<!ENTITY Comment  SYSTEM "comment.xml">
<!ENTITY Text  SYSTEM "text.xml">

<!ENTITY documentNode "Document Node">
<!ENTITY elementNode "Element Node">
<!ENTITY attributeNode "Attribute Node">
<!ENTITY namespaceNode "Namespace Node">
<!ENTITY processingInstructionNode "Processing Instruction Node">
<!ENTITY commentNode "Comment Node">
<!ENTITY textNode "Text Node">

<!ENTITY dm-example.xml SYSTEM "build/dm-example.xml.cdata">
<!ENTITY dm-example.xsd SYSTEM "build/dm-example.xsd.cdata">
<!ENTITY dm-example.tbl SYSTEM "build/dm-example.tbl.xml">

<!ENTITY date.year "2004">
<!ENTITY date.month "October">
<!ENTITY date.MM "10">
<!ENTITY date.day "29">
<!ENTITY date.DD "&date.day;">
<!ENTITY doc.date "&date.year;&date.MM;&date.DD;">
<!ENTITY doc.prefix "WD-xpath-datamodel">
<!ENTITY url.group "http://www.w3.org/XML/Group/">
<!ENTITY url.group.ql "&url.group;xmlquery/">
<!ENTITY url.publoc "&url.group;&date.year;/&date.MM;/&doc.prefix;-&doc.date;.html">
<!ENTITY url.internal "http://www.w3.org/Style/XSL/Group/xpath2-tf/&doc.prefix;-&doc.date;.html">
<!ENTITY url.external "http://www.w3.org/TR/&date.year;/&doc.prefix;-&doc.date;/">
<!ENTITY url.this "&url.external;">
<!ENTITY aacute "&#225;">

<!ENTITY dm.prop.attributes   "<emph role='dm-node-property'>attributes</emph>">
<!ENTITY dm.prop.base-uri     "<emph role='dm-node-property'>base-uri</emph>">
<!ENTITY dm.prop.node-kind    "<emph role='dm-node-property'>node-kind</emph>">
<!ENTITY dm.prop.children     "<emph role='dm-node-property'>children</emph>">
<!ENTITY dm.prop.content      "<emph role='dm-node-property'>content</emph>">
<!ENTITY dm.prop.namespaces   "<emph role='dm-node-property'>namespaces</emph>">
<!ENTITY dm.prop.nilled	      "<emph role='dm-node-property'>nilled</emph>">
<!ENTITY dm.prop.node-name    "<emph role='dm-node-property'>node-name</emph>">
<!ENTITY dm.prop.parent	      "<emph role='dm-node-property'>parent</emph>">
<!ENTITY dm.prop.prefix	      "<emph role='dm-node-property'>prefix</emph>">
<!ENTITY dm.prop.string-value "<emph role='dm-node-property'>string-value</emph>">
<!ENTITY dm.prop.target	      "<emph role='dm-node-property'>target</emph>">
<!ENTITY dm.prop.type-name    "<emph role='dm-node-property'>type-name</emph>">
<!ENTITY dm.prop.uri          "<emph role='dm-node-property'>uri</emph>">
<!ENTITY dm.prop.typed-value  "<emph role='dm-node-property'>typed-value</emph>">
]>
<spec w3c-doctype="wd">
<header>
  <title>XQuery 1.0 and XPath 2.0 Data Model</title>
  <version/>
  <w3c-designation>&doc.prefix;-&doc.date;</w3c-designation>
  <w3c-doctype>W3C Working Draft</w3c-doctype>
  <pubdate>
    <day>&date.day;</day>
    <month>&date.month;</month>
    <year>&date.year;</year>
  </pubdate>
  <publoc>
     <loc href="&url.this;">&url.this;</loc>
  </publoc>
  <altlocs>
    <loc href="&url.this;data-model.xml">XML</loc>
  </altlocs>
  <latestloc>
    <loc href="http://www.w3.org/TR/xpath-datamodel/">http://www.w3.org/TR/xpath-datamodel/</loc>
  </latestloc>
  <prevlocs>
    <loc href="http://www.w3.org/TR/2004/WD-xpath-datamodel-20040723/">http://www.w3.org/TR/2004/WD-xpath-datamodel-20040723/</loc>
    <loc href="http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/">http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/</loc>
  </prevlocs>
  <authlist>
    <author>
      <name>Mary Fern&aacute;ndez (XML Query WG)</name>
      <affiliation>AT&amp;T Labs</affiliation>
      <email href="mailto:mff@research.att.com">mff@research.att.com</email>
    </author>
    <author>
      <name>Ashok Malhotra (XML Query and XSL WGs)</name>
      <affiliation>Oracle Corporation</affiliation>
      <email href="mailto:ashok.malhotra@alum.mit.edu">ashok.malhotra@alum.mit.edu</email>
    </author>
    <author>
      <name>Jonathan Marsh (XSL WG)</name>
      <affiliation>Microsoft</affiliation>
      <email href="mailto:jmarsh@microsoft.com">jmarsh@microsoft.com</email>
    </author>
    <author>
      <name>Marton Nagy (XML Query WG)</name>
      <affiliation>Science Applications International Corporation (SAIC)</affiliation>
      <email href="mailto:marton.nagy@saic.com">marton.nagy@saic.com</email>
    </author>
    <author>
      <name>Norman Walsh (XSL WG)</name>
      <affiliation>Sun Microsystems</affiliation>
      <email href="mailto:Norman.Walsh@Sun.COM">Norman.Walsh@Sun.COM</email>
    </author>
  </authlist>

<status>
<p><emph>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. A
list of current W3C publications and the latest revision of this
technical report can be found in the <loc
href="http://www.w3.org/TR/">W3C technical reports index</loc> at
http://www.w3.org/TR/.</emph></p>

<p>This is a Public Working Draft for review by W3C Members and
other interested parties.
Publication as a Working Draft does not imply endorsement by the W3C
Membership. This is a draft document and may be updated, replaced or
obsoleted by other documents at any time. It is inappropriate to cite
this document as other than work in progress.</p>

<p>The XQuery 1.0 and XPath 2.0 Data Model has been defined jointly by
the <loc href="http://www.w3.org/XML/Query">XML Query Working Group</loc>
and the
<loc href="http://www.w3.org/Style/XSL/">XSL Working Group</loc>
(both part of the
<loc href="http://www.w3.org/XML/Activity.html">XML Activity</loc>).
</p>

<p>This working draft includes a number of changes made in response to
comments received during the Last Call period that ended on Feb. 15,
2004. The working group is continuing to process these comments, and
additional changes are expected.</p>

<p>This document reflects decisions taken up to and including the
face-to-face meeting in Cambridge, MA during the week of June 21,
2004. These decisions are recorded in the Last Call
<loc href="http://www.w3.org/2004/10/data-model-issues.html">issues list</loc>
(http://www.w3.org/2004/10/data-model-issues.html).
However, some of these decisions may not yet be reflected in
this document.
</p>

<p>Public comments on this document and its open issues are invited.
Comments should be sent to the W3C mailing list
<loc href="mailto:public-qt-comments@w3.org">public-qt-comments@w3.org</loc>
(archived at
<loc href="http://lists.w3.org/Archives/Public/public-qt-comments/">http://lists.w3.org/Archives/Public/public-qt-comments/</loc>) with “[DM]”
at the beginning of the subject field.</p>

<p>The patent policy for this document is the <loc
href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February
2004 W3C Patent Policy</loc>. 
Patent disclosures relevant
to this specification may be found on the <loc
href="http://www.w3.org/2002/08/xmlquery-IPR-statements">XML Query
Working Group's patent disclosure page</loc> and the <loc
href="http://www.w3.org/Style/XSL/Disclosures">XSL Working Group's
patent disclosure page</loc>. An individual who has actual knowledge of
a patent which the individual believes contains Essential Claim(s) with
respect to this specification should disclose the information in
accordance with <loc
href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section
6 of the W3C Patent Policy</loc>.
</p>
</status>

<abstract>
<p>This document defines the W3C XQuery 1.0 and XPath 2.0 Data Model,
which is the data model of <bibref ref="xpath20"/>,
<bibref ref="xslt20"/>, and <bibref ref="xquery"/>, and any other
specifications that reference it. This data model is based on the
<bibref ref="xpath"/> data model and earlier work on an
<bibref ref="XQDM00"/>. This document is the result of joint
work by the <bibref ref="XSLWG"/> and the <bibref ref="XQWG"/>.</p>
</abstract>

<langusage>
<language id="en">English</language>
</langusage>

<revisiondesc>
<p>See the CVS changelog.</p>
</revisiondesc>
</header>

<body>

<div1 id="intro">
<head>Introduction</head>

<p>This document defines the XQuery 1.0 and XPath 2.0 Data Model,
which is the data model of <bibref ref="xpath20"/>, <bibref ref="xslt20"/> and
<bibref ref="xquery"/>.</p>

<p>The XQuery 1.0 and XPath 2.0 Data Model (henceforth "data model")
serves two purposes.
First, it defines the information contained in the input to an
XSLT or XQuery processor.  Second, it defines all permissible values of
expressions in the XSLT, XQuery, and XPath languages.  A
language is <emph>closed</emph> with respect to a data model if the value
of every expression in the language is guaranteed to be in the data model.
XSLT 2.0, XQuery 1.0, and XPath 2.0 are all closed with respect to
the data model.</p>

<p>The data model is based on the <bibref ref="xml-infoset"/>
(henceforth "Infoset"), but it requires the following new features to
meet the <bibref ref="xpath20req"/> and <bibref ref="xquery-requirements"/>:</p>

<ulist>
  <item>
    <p>Support for XML Schema types. The XML Schema recommendations
    define features, such as structures (<bibref ref="xmlschema-1"/>)
    and simple data types (<bibref ref="xmlschema-2"/>), that extend
    the XML Information Set with precise type information.</p>
  </item>
  <item>
    <p>Representation of collections of documents and of
    complex values. (<bibref ref="xquery-requirements"/>)</p>
  </item>
</ulist>

<p>As with the Infoset, the XQuery 1.0 and XPath 2.0 Data Model
specifies what information in the documents is accessible, but it does
not specify the programming-language interfaces or bindings used to
represent or access the data.</p>

<p>The data model can represent various
values, including not only the input and the output of a stylesheet or query but also all
values of expressions used during intermediate calculations.
Examples include the input document or document repository (represented
as a &documentNode; or a sequence of &documentNode;s), the result of a
path expression (represented as a sequence of nodes), the result of an
arithmetic or a logical expression (represented as an atomic value),
a sequence expression resulting in a sequence of items, etc.
</p>

<p>This document provides a precise definition of the properties of nodes
in the XQuery 1.0 and XPath 2.0 Data Model, how they are accessed, and how
they relate to values in the Infoset and PSVI.</p>

</div1>

<div1 id="concepts">
<head>Concepts</head>

<p>This section outlines a number of general concepts that apply throughout
this specification.</p>

<div2 id="terminology">
<head>Terminology</head>

<p>For a full glossary of terms, see <specref ref="glossary"/>.</p>

<p>In this specification the words <rfc2119>must</rfc2119>,
<rfc2119>must not</rfc2119>,
<rfc2119>should</rfc2119>,
<rfc2119>should not</rfc2119>,
<rfc2119>may</rfc2119> and
<rfc2119>recommended</rfc2119>
are to be interpreted as described in <bibref ref="RFC2119"/>.</p>

<p>This specification distinguishes between the data model as a general
concept and specific items (documents, elements, atomic values, etc.)
that are concrete examples of the data model by identifying all concrete
examples as <termref def="dt-instance">instances of the data model</termref>.
</p>

<p><termdef id="dt-instance" term="instance of the data model">Every
<term>instance of the data model</term> is a
<termref def="dt-sequence">sequence</termref>.</termdef>
</p>

<p><termdef id="dt-sequence" term="sequence">A <term>sequence</term>
is an ordered collection of zero or more <termref
def="dt-item">items</termref>.</termdef> A sequence cannot be a member
of a sequence. A single item appearing on its own is modeled as a
sequence containing one item. Sequences are defined in <specref
ref="sequences"/>.</p>

<p><termdef id="dt-item" term="item">An <term>item</term>
is either a
<termref def="dt-node">node</termref> or an
<termref def="dt-atomic-value">atomic value</termref>.</termdef>
</p>

<p>Every node is one of the seven kinds of nodes defined in <specref
ref="Node"/>. Nodes form a tree that consists of a root node plus
all the nodes that are reachable directly or indirectly from the root node
via the <function>children</function>,
<function>attributes</function>, and <function>namespaces</function>
accessors. Every node belongs to exactly one tree, and every tree has
exactly one root node.</p>

<p><termdef id="dt-document" term="document">A
tree whose root node is a &documentNode; is referred to as a
<term>document</term>.</termdef></p>

<p><termdef id="dt-fragment"
term="fragment">A tree whose root node is not a &documentNode; is
referred to as a <term>fragment</term>.</termdef></p>
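<note>
<p>The distinction between documents and fragments can be sketched in a
few lines of Python. The <code>Node</code> class, its <code>kind</code>
strings, and the <code>root</code> and <code>tree_kind</code> helpers are
hypothetical names introduced only for this illustration; they are not
accessors defined by this specification, and the sketch links nodes only
through a list of children.</p>
<eg><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical, minimal stand-in for a data model node."""
    kind: str                         # e.g. "document", "element", "text"
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def append(self, child: "Node") -> "Node":
        """Attach child to this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def root(node: Node) -> Node:
    """Follow parent links upward until the root of the tree is reached."""
    while node.parent is not None:
        node = node.parent
    return node


def tree_kind(node: Node) -> str:
    """Classify the tree that contains node by the kind of its root."""
    return "document" if root(node).kind == "document" else "fragment"


doc = Node("document")
para = doc.append(Node("element"))
text = para.append(Node("text"))

orphan = Node("element")              # an element with no parent
orphan_text = orphan.append(Node("text"))
```
]]></eg>
<p>In this sketch, <code>text</code> belongs to a document because its
root is a document node, while <code>orphan_text</code> belongs to a
fragment because its root is an element.</p>
</note>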

<p><termdef id="dt-atomic-value" term="atomic value">An
<term>atomic value</term> is a value in the value space
of an <termref def="dt-atomic-type">atomic type</termref> and is labeled with
the name of that atomic type.</termdef></p>

<p><termdef id="dt-atomic-type" term="atomic type">An <term>atomic type</term>
is a <termref def="dt-primitive-simple-type">primitive simple type</termref>
or a type derived by restriction from
another atomic type.</termdef>
(Types derived by list or union are not atomic.)
</p>

<p><termdef id="dt-primitive-simple-type" term="primitive simple type">There are 24
<term>primitive simple types</term>: the 19 defined in
<xspecref spec="XS2" ref="built-in-primitive-datatypes"/>
of <bibref ref="xmlschema-2"/> and
<code>xdt:anyAtomicType</code>, <code>xdt:untyped</code>,
<code>xdt:untypedAtomic</code>, <code>xdt:dayTimeDuration</code>,
and <code>xdt:yearMonthDuration</code></termdef>, defined in <specref ref="types"/>.</p>

<p>A type is represented in the data model by an
<termref def="dt-expanded-qname">expanded-QName</termref>.
</p>

<p><termdef id="dt-expanded-qname" term="expanded-QName">An <term>expanded-QName</term>
is a pair of values consisting of a possibly empty namespace URI and
a local name. They belong to the value space of the XML Schema type
<code>xs:QName</code>. References to <code>xs:QName</code> in this document
always
mean the value space, i.e. a namespace URI, local name pair (and not
the lexical space referring to constructs of the form
“<code>prefix:local-name</code>”).</termdef>
</p>
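<note>
<p>The difference between the lexical space and the value space of
<code>xs:QName</code> can be illustrated with a short Python sketch.
The <code>ExpandedQName</code> type and the <code>expand</code> helper
are hypothetical names used only here, and treating an unprefixed name
as having an empty namespace URI is a simplifying assumption of the
sketch.</p>
<eg><![CDATA[
```python
from typing import Dict, NamedTuple


class ExpandedQName(NamedTuple):
    namespace_uri: str                # may be the empty string
    local_name: str


def expand(lexical: str, bindings: Dict[str, str]) -> ExpandedQName:
    """Resolve a lexical 'prefix:local-name' against prefix bindings."""
    prefix, colon, local = lexical.partition(":")
    if not colon:
        # No prefix: this sketch simply uses an empty namespace URI.
        return ExpandedQName("", lexical)
    return ExpandedQName(bindings[prefix], local)


XS = "http://www.w3.org/2001/XMLSchema"
with_xs = expand("xs:integer", {"xs": XS})
with_xsd = expand("xsd:integer", {"xsd": XS})
```
]]></eg>
<p>Both results are the same pair of namespace URI and local name, even
though the two lexical forms use different prefixes.</p>
</note>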

<p><termdef id="dt-implementation-defined" term="implementation
defined"><term>Implementation-defined</term> indicates an aspect that
may differ between implementations, but must be specified by the
implementor for each particular implementation.</termdef></p>

<p><termdef id="dt-implementation-dependent" term="implementation
dependent"><term>Implementation-dependent</term> indicates an aspect
that may differ between implementations, is not specified by this or
any W3C specification, and is not required to be specified by the
implementor for any particular implementation.</termdef></p>

<p>In all cases where this specification leaves the behavior
implementation-defined or implementation-dependent, the implementation
has the option of providing mechanisms that allow the user to
influence the behavior.</p>

<p>This document normatively defines the XQuery 1.0 and XPath 2.0 Data
Model. In this document, examples and material labeled as "Note" are
provided for explanatory purposes and are not normative.</p>

</div2>

<div2 id="notation">
<head>Notation</head>

<p>In addition to prose, this specification defines a set of accessor functions to
explain the data model. The accessors are shown with the prefix
<emph>dm:</emph>. This prefix is always shown in italics to emphasize
that these functions are abstract; they exist to explain the interface
between the data model and specifications that rely on the data model:
they are not accessible directly from the host
language.</p>

<p>Several prefixes are used throughout this document for notational
convenience. The following bindings are assumed.</p>

<olist>
<item><p><code>xs:</code> bound to
<code>http://www.w3.org/2001/XMLSchema</code>
</p></item>
<item><p><code>xsi:</code> bound to
<code>http://www.w3.org/2001/XMLSchema-instance</code>
</p></item>
<item><p><code>xdt:</code> bound to
<code>http://www.w3.org/&date.year;/&date.MM;/xpath-datatypes</code>
</p></item>
<item><p><code>fn:</code> bound to
<code>http://www.w3.org/2004/10/xpath-functions</code>
</p></item>
</olist>

<p>In practice, any prefix that is bound to the appropriate URI may be used.</p>

<p>The signature of accessor functions is shown using the same style as
<bibref ref="xpath-functions"/>, described in
<xspecref spec="FO" ref="func-signatures"/>.</p>

<p>This document relies on the <bibref ref="xml-infoset"/> and PSVI. Information items
and properties are indicated by the styles <emph role="info-item">information
item</emph> and <emph role="infoset-property">infoset property</emph>, respectively.</p>

<p>Some aspects of type assignment rely on the ability to access properties of
the schema components. Such properties are indicated by the style
{component property}. This does not mean that a lightweight schema processor
cannot be used; it means only that the application must have some mechanism to
access the necessary properties.</p>

</div2>

<div2 id="node-identity">
<head>Node Identity</head>

<p>Each node has a unique identity. Every <termref
def="dt-node">node</termref> in an instance of the data model is unique: identical to
itself, and not identical to any other node. (<termref
def="dt-atomic-value">Atomic values</termref> do not have identity;
every instance of the value “5” as an integer is identical to every
other instance of the value “5” as an integer.)
</p>

<note>
<p>The concept of node identity should not be confused with the
concept of a unique ID, which is a unique name assigned to an element
by the author to represent references using ID/IDREF correlation.</p>
</note>
</div2>

<div2 id="document-order">
<head>Document Order</head>

<p><termdef id="dt-document-order" term="document order">A
<term>document order</term> is defined among all the nodes
accessible during a given query or transformation. Document order is a
total ordering, although the relative order of some nodes is
implementation-dependent. Informally, document order is the order in
which nodes appear in the XML serialization of a document.</termdef>
<termdef id="dt-stable" term="stable">Document order is
<term>stable</term>, which means that the relative order of two
nodes will not change during the processing of a given query or
transformation, even if this order is implementation-dependent.</termdef></p>

<p>Within a tree, document order satisfies the following constraints:</p>

<olist>
<item>
<p>The root node is the first node.
</p>
</item>

<item>
<p>Every node occurs before all of its children and descendants.</p>
</item>

<item>
<p>&namespaceNode;s immediately follow the &elementNode; with which
they are associated. The relative order of &namespaceNode;s is
stable but implementation-dependent.</p>
</item>

<item>
<p>&attributeNode;s immediately follow the &namespaceNode;s of the
element with which they are associated. If there are no
&namespaceNode;s associated with a given element, then the
&attributeNode;s associated with that element immediately
follow the element. The relative order of &attributeNode;s is
stable but implementation-dependent.</p>
</item>

<item>
<p>The relative order of siblings is the order in which they occur in
the &dm.prop.children; property of their parent node.</p>
</item>

<item>
<p>Children and descendants occur before following siblings.</p>
</item>
</olist>
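<note>
<p>The constraints above can be sketched as a traversal, here using
Python's standard <code>xml.etree.ElementTree</code> parser. The
<code>document_order</code> function is a hypothetical helper. The
sketch omits the &documentNode; and all &namespaceNode;s, because that
parser does not expose them, and it sorts attributes by name as one
possible stable choice for an order that is implementation-dependent.</p>
<eg><![CDATA[
```python
import xml.etree.ElementTree as ET
from typing import Iterator


def document_order(elem: ET.Element) -> Iterator[str]:
    """Yield labels for elem and the nodes below it in document order."""
    yield f"element {elem.tag}"
    # Attributes follow their element; sorting by name is one stable choice,
    # since the relative order of attributes is implementation-dependent.
    for name in sorted(elem.attrib):
        yield f"attribute {name}"
    if elem.text:
        yield f"text {elem.text}"
    for child in elem:
        # A child and all of its descendants come before following siblings.
        yield from document_order(child)
        if child.tail:
            yield f"text {child.tail}"


root = ET.fromstring('<a y="2" x="1"><b>hi</b><c/>bye</a>')
order = list(document_order(root))
```
]]></eg>
<p>For this input the element <code>a</code> comes first, followed by
its attributes, then <code>b</code> and its text, then <code>c</code>,
and finally the trailing text.</p>
</note>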

<p>The relative order of nodes in distinct trees is stable but
implementation-dependent, subject to the following constraint: If
any node in a given tree, <code>T1</code>, occurs before any node in a different
tree, <code>T2</code>, then all nodes in <code>T1</code> are before all nodes in
<code>T2</code>.</p>
</div2>

<div2 id="sequences">
<head>Sequences</head>

<p>An important characteristic of the data model is that there is no
distinction between an item (a node or an atomic value) and a
singleton sequence containing that item. An item is
equivalent to a singleton sequence containing that item and vice
versa.</p>

<p>A sequence may contain nodes, atomic values, or any mixture of
nodes and atomic values. When a node is added to a sequence its
identity remains the same. Consequently a node may occur in more than
one sequence and a sequence may contain duplicate items.</p>

<p>Sequences never contain other sequences; if sequences are combined,
the result is always a “flattened” sequence. In other words, appending
“(d e)” to “(a b c)” produces a sequence of length 5: “(a b c d e)”.
It <emph>does not</emph> produce a sequence of length 4: “(a b c (d e))”;
such a nested sequence never occurs.</p>
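<note>
<p>The flattening behaviour can be sketched in Python, assuming that a
tuple stands in for a sequence and any other object stands in for an
item. The <code>make_sequence</code> helper is a hypothetical name
introduced for this illustration only.</p>
<eg><![CDATA[
```python
from typing import Any, Tuple


def make_sequence(*operands: Any) -> Tuple[Any, ...]:
    """Combine operands into one flat sequence."""
    flat = []
    for operand in operands:
        if isinstance(operand, tuple):
            # A sequence operand contributes its items, never itself.
            flat.extend(make_sequence(*operand))
        else:
            # A single item behaves like a singleton sequence.
            flat.append(operand)
    return tuple(flat)


combined = make_sequence(("a", "b", "c"), ("d", "e"))
```
]]></eg>
<p>Here <code>combined</code> has five members, and a single item passed
on its own yields a sequence of length one.</p>
</note>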

<note>
<p>Sequences replace node-sets from XPath 1.0. In XPath 1.0, node-sets
do not contain duplicates. In generalizing node-sets to sequences in
XPath 2.0, duplicate removal is provided by functions on node
sequences.</p>
</note>
</div2>

<div2 id="types">
<head>Types</head>

<p>The data model supports strongly typed languages such as
<bibref ref="xpath20"/> and <bibref ref="xquery"/>
that have a type system based on <bibref ref="xmlschema-1"/>. The 
type system is formally defined in <bibref ref="xquery-semantics"/>.</p>

<p>Every <termref def="dt-item">item</termref> in the data model has both
a value and a type.
In addition to nodes, the data model can represent atomic values like
the number 5 or the string “Hello World”. For each of these
atomic values, the data model contains both the value of the item
(such as 5 or “Hello World”) and its type name (such as
<code>xs:integer</code> or <code>xs:string</code>).</p>

<div3 id="types-representation">
<head>Representation of Types</head>

<p>The data model uses
<termref def="dt-expanded-qname">expanded-QNames</termref> to
represent the names of schema types, which include the built-in
types defined by <bibref ref="xmlschema-2"/>, five additional types
defined by this specification, and may include other user- or
implementation-defined types.</p>

<p>For XML Schema types, the namespace name of the expanded-QName 
is the <emph role="infoset-property">target namespace</emph> property
of the type definition, and its local name is the <emph
role="infoset-property">name</emph> property of the type
definition.</p>

<p>The data model relies on the fact that an expanded-QName uniquely
identifies every named type. (Although it is possible for different
schemas to define different types with the same expanded-QName, at
most one of them can be used in any given validation episode.)
</p>

<p>For anonymous types, the processor <rfc2119>must</rfc2119> construct an
<termref def="dt-anonymous-type-name">anonymous type name</termref>
that is distinct from the name of every named type and the name of
every other anonymous type.
<termdef id="dt-anonymous-type-name" term="anonymous type name">An
<term>anonymous type name</term> is an implementation-defined,
unique type name provided by the processor for every anonymous
type declared in the schemas available in the static context.</termdef>
Anonymous type names
<rfc2119>must</rfc2119>
be globally unique across all anonymous types that are accessible to
the processor. In the formalism of this specification,
the anonymous type names are assumed to be <code>xs:QNames</code>, but in practice
implementations are not required to use <code>xs:QNames</code> to
represent the implementation-defined names of anonymous types.</p>

<p>The scope over which the names of anonymous types must be
meaningful and distinct depends on the processing context. In XSLT, it
is the duration of an entire transformation. In XQuery, it is the
duration of the evaluation of a top-level expression, i.e. an
expression not contained in any other expression.</p>

<p>The data model associates schema type information with &elementNode;s,
&attributeNode;s and atomic values. The item is guaranteed to be an
instance of that kind of item with the given schema type.
</p>

<p>The data model does not represent element or attribute declaration
schema components, but it supports various type-related operations.
The semantics of other operations, such as checking whether a particular
instance of an &elementNode; has a given schema type, are defined in
<bibref ref="xquery-semantics"/>.
</p>
</div3>

<div3 id="types-predefined">
<head>Predefined Types</head>

<p>In addition to the 19 types defined in
<xspecref spec="XS2" ref="built-in-primitive-datatypes"/>
of <bibref ref="xmlschema-2"/>, the data model defines five
additional types: <code>xdt:anyAtomicType</code>,
<code>xdt:untyped</code>, <code>xdt:untypedAtomic</code>,
<code>xdt:dayTimeDuration</code>, and
<code>xdt:yearMonthDuration</code>:</p>

<glist role="newTypes">
<gitem id="xdt-anyAtomicType">
<label><code>xdt:anyAtomicType</code></label>
<def>
<p>The abstract datatype <term>xdt:anyAtomicType</term> is a child of
<term>xs:anySimpleType</term> and is the base type for all the
primitive atomic types described in <bibref ref='xmlschema-2'/>. This
datatype cannot be used in <bibref ref='xmlschema-1'/> type declarations,
nor can it be used as a base for user-defined atomic types. It can be
used, as discussed in
<xspecref spec='XQ' ref="id-expressions-on-datatypes"/>,
to define a required type (for example in a function signature) to
indicate that any of the primitive atomic types or
<code>xdt:untypedAtomic</code> is acceptable.
</p>
</def>
</gitem>

<gitem id="xdt-untyped">
<label><code>xdt:untyped</code></label>
<def>
<p>The datatype <term>xdt:untyped</term> is a child of
<term>xs:anyType</term> and serves as a special type
annotation to indicate elements that have not been validated by
an XML Schema or a DTD. This type cannot be
used in <bibref ref='xmlschema-1'/> type declarations, nor can it be used
as a base for user-defined types.  It can be used, as discussed in
<xspecref spec='XQ' ref="id-expressions-on-datatypes"/>,
to define a required type (for example in a function signature) to indicate that
only an untyped value is acceptable.</p> 
</def>
</gitem>

<gitem id="xdt-untypedAtomic">
<label><code>xdt:untypedAtomic</code></label>
<def>
<p>The datatype <term>xdt:untypedAtomic</term> is a child of
<term>xdt:anyAtomicType</term> and serves as a special type
annotation to indicate atomic values that have not been validated by
an XML Schema or a DTD or have received an instance type annotation of
<code>xs:anySimpleType</code> in the PSVI. This datatype cannot be
used in <bibref ref='xmlschema-1'/> type declarations, nor can it be used
as a base for user-defined atomic types.
It can be used, as discussed in 
<xspecref spec='XQ' ref="id-expressions-on-datatypes"/>,
to define a required type (for example in a function signature) to
indicate that only an untyped atomic value is acceptable.</p>
</def>
</gitem>

<gitem id="xdt-dayTimeDuration">
<label><code>xdt:dayTimeDuration</code></label>
<def>
<p>The type <code>xdt:dayTimeDuration</code> is derived from
<code>xs:duration</code> by restricting its lexical representation to
contain only the days, hours, minutes and seconds components. The
value space of <code>xdt:dayTimeDuration</code> is the set of
fractional second values. The components of
<code>xdt:dayTimeDuration</code> correspond to the day, hour, minute
and second components defined in section 5.5.3.2 of
<bibref ref="ISO8601"/>, respectively. <code>xdt:dayTimeDuration</code> is
derived from <code>xs:duration</code> as follows:</p>

<eg><![CDATA[
<xs:simpleType name='dayTimeDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[\-]?P([0-9]+D(T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S
        |\.[0-9]+S)?|(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+
        (\.[0-9]*)?S|\.[0-9]+S)?|(\.[0-9]*)?S)|\.[0-9]+S))?
        |T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
        |(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
        |(\.[0-9]*)?S)|\.[0-9]+S))"/>
  </xs:restriction>
</xs:simpleType>]]></eg>

<p>To make the long pattern easier to read, it has been formatted on
six lines using additional new line and space characters in the
pattern string. These additional characters should not be interpreted
as part of the pattern.</p>
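<note>
<p>The following Python sketch shows one way to recover the usable
pattern from the formatted form above by removing the added white
space. It assumes that this particular pattern uses only constructs
that Python's <code>re</code> module interprets in the same way as XML
Schema regular expressions; <code>is_day_time_duration</code> is a
hypothetical helper name.</p>
<eg><![CDATA[
```python
import re

# The pattern exactly as formatted above, including the added line breaks
# and indentation that are not part of the pattern itself.
FORMATTED_PATTERN = r"""[\-]?P([0-9]+D(T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S
        |\.[0-9]+S)?|(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+
        (\.[0-9]*)?S|\.[0-9]+S)?|(\.[0-9]*)?S)|\.[0-9]+S))?
        |T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
        |(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
        |(\.[0-9]*)?S)|\.[0-9]+S))"""

# Remove every white space character before compiling.
DAY_TIME_DURATION = re.compile(re.sub(r"\s+", "", FORMATTED_PATTERN))


def is_day_time_duration(lexical: str) -> bool:
    """Check a string against the pattern; the whole string must match."""
    return DAY_TIME_DURATION.fullmatch(lexical) is not None
```
]]></eg>
<p>With this sketch, strings such as <code>P3DT10H30M</code> and
<code>PT1.5S</code> are accepted, while <code>P1Y</code> is rejected
because it contains a year component.</p>
</note>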

</def>
</gitem>

<gitem id="xdt-yearMonthDuration">
<label><code>xdt:yearMonthDuration</code></label>
<def>
<p>The type <code>xdt:yearMonthDuration</code> is derived from
<code>xs:duration</code> by restricting its lexical representation to
contain only the year and month components. The value space of
<code>xdt:yearMonthDuration</code> is the set of
<code>xs:integer</code> month values. The year and month components of
<code>xdt:yearMonthDuration</code> correspond to the Gregorian year
and month components defined in section 5.5.3.2 of
<bibref ref="ISO8601"/>, respectively.</p>

<p>The type <code>xdt:yearMonthDuration</code> is derived from
<code>xs:duration</code> as follows:</p>

<eg><![CDATA[<xs:simpleType name='yearMonthDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[\-]?P[0-9]+(Y([0-9]+M)?|M)"/>
  </xs:restriction>
</xs:simpleType>]]></eg>
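<note>
<p>Because the value space consists of integer month values, a lexical
form can be mapped to a single signed number of months. The Python
sketch below illustrates this under the assumption that Python's
<code>re</code> module treats the pattern above as XML Schema does;
<code>to_months</code> is a hypothetical helper, not a function defined
by this specification.</p>
<eg><![CDATA[
```python
import re

# Pattern copied from the schema definition above. XML Schema patterns are
# implicitly anchored, so fullmatch() is used rather than search().
YEAR_MONTH_DURATION = re.compile(r"[\-]?P[0-9]+(Y([0-9]+M)?|M)")


def to_months(lexical: str) -> int:
    """Map a lexical xdt:yearMonthDuration to a signed number of months."""
    if YEAR_MONTH_DURATION.fullmatch(lexical) is None:
        raise ValueError(f"not a valid yearMonthDuration: {lexical!r}")
    negative = lexical.startswith("-")
    body = lexical.lstrip("-")[1:]        # drop the optional sign and "P"
    if "Y" in body:
        years, _, rest = body.partition("Y")
        months = int(years) * 12 + (int(rest[:-1]) if rest else 0)
    else:
        months = int(body[:-1])           # months only, e.g. "14M"
    return -months if negative else months
```
]]></eg>
<p>In this sketch <code>P1Y2M</code> and <code>P14M</code> denote the
same value, since both correspond to fourteen months.</p>
</note>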
</def>
</gitem>
</glist>
</div3>

<div3 id="types-hierarchy">
<head>Type Hierarchy</head>

<p>The diagram below shows how the nodes,
<termref def="dt-primitive-simple-type">primitive
simple types</termref>, and user-defined types fit together into
a hierarchy.</p>

<p>The <code>xs:IDREFS</code>, <code>xs:NMTOKENS</code>, and
<code>xs:ENTITIES</code> types, and user-defined list and union
types, are special in that they are lists or unions
rather than true subtypes.</p>

<graphic source="type-hierarchy.png" alt="Type hierarchy graphic"/>
</div3>

<div3 id="AtomicValue">
<head>Atomic Values</head>

<p>An atomic value can be constructed from a lexical
representation. Given a string and an atomic type, the atomic value is
constructed in such a way as to be consistent with validation. If the
string does not represent a valid value of the type, an error is
raised. When <code>xdt:untypedAtomic</code> is specified as the type,
no validation takes place. The details of the construction are
described in <xspecref spec="FO" ref="constructor-functions"/>
and the related <xspecref spec="FO" ref="casting"/>
section of <bibref ref="xpath-functions"/>.
</p>
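<note>
<p>The construction described above can be sketched in Python for a
small subset of types. The <code>AtomicValue</code> class and the
<code>construct</code> function are hypothetical, only three type names
are handled, and the integer check is a simplified stand-in for full
schema validation.</p>
<eg><![CDATA[
```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomicValue:
    """Hypothetical pairing of a value with the name of its atomic type."""
    value: object
    type_name: str


def construct(lexical: str, type_name: str) -> AtomicValue:
    """Build an atomic value from a string, validating where required."""
    if type_name == "xdt:untypedAtomic":
        # No validation takes place for untyped atomic values.
        return AtomicValue(lexical, type_name)
    if type_name == "xs:string":
        return AtomicValue(lexical, type_name)
    if type_name == "xs:integer":
        text = lexical.strip()        # this sketch ignores surrounding space
        if re.fullmatch(r"[+-]?[0-9]+", text) is None:
            raise ValueError(f"{lexical!r} is not a valid {type_name}")
        return AtomicValue(int(text), type_name)
    raise NotImplementedError(f"type not covered by this sketch: {type_name}")


five = construct("5", "xs:integer")
raw = construct("five", "xdt:untypedAtomic")
```
]]></eg>
<p>Requesting <code>construct("five", "xs:integer")</code> raises an
error in this sketch, whereas the same string is accepted unchanged as
an <code>xdt:untypedAtomic</code> value.</p>
</note>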
</div3>

<div3 id="StringValue">
<head>String Values</head>

<p>A string value can be constructed from an atomic value.
Such a value is constructed by
converting the atomic value to its string representation as described
in <xspecref spec="FO" ref="casting"/>.
Using the canonical lexical representation for atomic values
may not always be compatible with XPath 1.0. These and other backwards
incompatibilities are described in
<xspecref spec="XP" ref="id-backwards-compatibility"/>.</p>
</div3>

</div2>
</div1>

<div1 id="construction">
<head>Data Model Construction</head>

<p>This section describes the constraints on instances of the data model.</p>

<p>The data model supports well-formed XML documents conforming to
<bibref ref="REC-xml-names"/> or <bibref ref="xml-names11"/>.
Documents that are not well-formed are,
by definition, not XML. XML documents that do not conform to
<bibref ref="REC-xml-names"/> or <bibref ref="xml-names11"/>
are not supported (nor are they supported by
<bibref ref="xml-infoset"/>).</p>

<p>In other words, the data model supports the following classes
of XML documents:</p>

<ulist>
  <item>
    <p>Well-formed documents conforming to <bibref ref="REC-xml-names"/> or
<bibref ref="xml-names11"/>,</p>
  </item>
  <item>
    <p>DTD-valid documents conforming to <bibref ref="REC-xml-names"/> or
<bibref ref="xml-names11"/>, and</p>
  </item>
  <item>
    <p>W3C XML Schema-validated documents.</p>
  </item>
</ulist>

<p>This document describes how to construct an instance of the data
model from an <bibref ref="xml-infoset"/> or a Post Schema Validation
Infoset (PSVI), the augmented infoset produced by an XML Schema
validation episode.</p>

<p>An instance of the data model can also be constructed directly through
application APIs, or from non-XML sources such as relational tables in
a database.</p>

<p>The data model supports some kinds of values that are not supported
by <bibref ref="xml-infoset"/>. Examples of these are
<termref def="dt-fragment">document fragments</termref>
and sequences of &documentNode;s.
The data model also supports values that are not nodes. Examples of
these are sequences of <termref def="dt-atomic-value">atomic values</termref>,
or sequences mixing nodes and atomic
values. These are necessary to be able to represent the results of
intermediate expressions in the data model during expression
processing.
</p>

<div2 id="const-other">
<head>Direct Construction</head>

<p>Although this document describes construction of an instance of the data model in
terms of infoset properties, an infoset is not an absolutely necessary
precondition for building an instance of the data model.</p>

<p>There are no constraints on how an instance of the data model may be
constructed directly, save that the resulting instance
<rfc2119>must</rfc2119> satisfy all of the constraints described in
this document.</p>

</div2>

<div2 id="const-infoset">
<head>Construction from an Infoset</head>

<p>An instance of the data model can be constructed from an <bibref ref="xml-infoset"/>
that satisfies the
following general constraints:</p>

<ulist>
<item><p>All general and external parsed entities must be fully expanded. The
Infoset must not contain any <emph role="info-item">unexpanded entity
reference information items</emph>.</p>
</item>
<item><p>The infoset <rfc2119>must</rfc2119> provide all of the properties identified as
<quote>required</quote> in this document.
The properties identified as <quote>optional</quote>
may be used, if they are present. All other properties are ignored.</p>
</item>
</ulist>

<p>An instance of the data model constructed from an information set
<rfc2119>must</rfc2119> be consistent with the description provided
for each node kind.</p>
</div2>

<div2 id="const-psvi">
<head>Construction from a PSVI</head>

<p>An instance of the data model can be constructed from a PSVI, whose
element and attribute information items have been strictly assessed,
laxly assessed, or have not been assessed. Constructing an instance of
the data model from a PSVI <rfc2119>must</rfc2119> be consistent with
the description provided in this section and with the description
provided for each node kind.</p>

<p>Data model construction requires that the PSVI provide unique names
for all anonymous schema types.</p>

<note>
<p><bibref ref="xmlschema-1"/> does not require all schema processors to
provide unique names for anonymous schema types. In order to build an
instance of the data model
from a PSVI produced by a processor that does not provide the names,
some post-processing will be required in order to assure that they are
all uniquely identified before construction begins.</p>
</note>

<p><termdef id="dt-incompletely-validated" term="incompletely validated">An
<term>incompletely validated</term> document is an XML document that has a
corresponding schema but whose schema-validity assessment has resulted
in one or more element or attribute information items being assigned
values other than 'valid' for the <emph role="infoset-property">validity</emph>
property in the PSVI.</termdef></p>

<p>The data model supports incompletely validated documents. Elements
and attributes that are not valid are treated as having unknown schema types.</p>

<p>The most significant difference between Infoset construction and PSVI
construction occurs in the area of schema type assignment. Other differences
can also arise from schema processing: default attribute and element values
may be provided, white space normalization of element content may occur, and the
user-supplied lexical form of elements and attributes with atomic schema types
may be lost.</p>

<div3 id="PSVI2Types">
<head>Mapping PSVI Additions to Type Names</head>

<p>A PSVI element or attribute information item may have a
<emph role="infoset-property">validity</emph> property.
The <emph role="infoset-property">validity</emph> property may be
<quote><emph>valid</emph></quote>, <quote><emph>invalid</emph></quote>,
or <quote><emph>notKnown</emph></quote>
and reflects the outcome of schema-validity assessment. In the data
model, precise schema type information is exposed for &elementNode;s and
&attributeNode;s that are <quote><emph>valid</emph></quote>. Nodes
that are not <quote><emph>valid</emph></quote> are treated as if they
were simply well-formed XML and only very general schema type
information is associated with them.
</p>

<div4 id="PSVI2NodeTypes">
<head>Element and Attribute Node Type Names</head>

<p>The precise definition of the schema type of an element or attribute
information item depends on the properties of the PSVI.
In the PSVI, <bibref ref='xmlschema-1'/>
only guarantees the existence of either the
<emph role="infoset-property">type definition</emph> property,
or the
<emph role="infoset-property">type definition namespace</emph>,
<emph role="infoset-property">type definition name</emph> and
<emph role="infoset-property">type definition anonymous</emph>
properties.
If the type definition refers to a union type, further properties are
defined that refer to the type definition
that actually validated the item's normalized value. These properties
are not used to determine the schema type of the node.
</p>

<p>If the <emph role="infoset-property">validity</emph> and
<emph role="infoset-property">validation attempted</emph> properties exist
and have the values <quote><emph>valid</emph></quote> and
<quote><emph>full</emph></quote>, respectively,
the schema type of an element or attribute information item is
represented by an <termref def="dt-expanded-qname">expanded-QName</termref>
whose namespace and local name correspond
to the first applicable items in the following list:
</p>

<ulist>
<item>
<p>If the <emph role="infoset-property">type definition</emph> property exists:</p>
  <ulist>
    <item><p>If the {name} property is not absent, the {target namespace} and {name}
properties of the <emph role="infoset-property">type definition</emph>
property;</p>
    </item>
    <item><p>Otherwise, the namespace and local name of the appropriate
<termref def="dt-anonymous-type-name">anonymous type name</termref>.</p>
    </item>
   </ulist>
</item>

<item>
<p>If <emph role="infoset-property">type definition anonymous</emph> exists:</p>
  <ulist>
    <item><p>If it is <emph>false</emph>:
the <emph role="infoset-property">type definition namespace</emph>
and the <emph role="infoset-property">type definition name</emph> properties;
</p></item>
    <item><p>Otherwise, the namespace and local name of the appropriate
<termref def="dt-anonymous-type-name">anonymous type name</termref>.</p>
    </item>
  </ulist>
</item>
</ulist>

<p>If the <emph role="infoset-property">validity</emph> property does
not exist or is not <quote><emph>valid</emph></quote>, or the
<emph role="infoset-property">validation attempted</emph> property does
not exist or is not <quote><emph>full</emph></quote>,
the schema type of an element is <code>xdt:untyped</code> and the type
of an attribute is <code>xdt:untypedAtomic</code>.</p>
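
<p>The selection rules above can be sketched in Python as follows. This is
an illustration only, not a normative algorithm: the information item is
modelled as a plain dictionary keyed by property name, <code>XDT</code> is a
placeholder for the <code>xdt</code> namespace, and the
<code>anonymous_name</code> callback is a hypothetical helper that supplies
the anonymous type name.</p>

<eg><![CDATA[
```python
# Illustrative only: a PSVI information item is modelled as a plain dict.
# The key names and the anonymous_name callback are assumptions of this sketch.
XDT = "xdt"  # placeholder standing in for the xdt namespace URI


def schema_type_name(item, kind, anonymous_name):
    """Return (namespace, local name) for an element or attribute item."""
    fully_valid = (item.get("validity") == "valid"
                   and item.get("validation attempted") == "full")
    if not fully_valid:
        # Not valid, or not fully assessed: only general type information.
        if kind == "element":
            return (XDT, "untyped")
        return (XDT, "untypedAtomic")

    type_def = item.get("type definition")
    if type_def is not None:
        if type_def.get("name") is not None:
            return (type_def.get("target namespace"), type_def["name"])
        return anonymous_name(item)

    if "type definition anonymous" in item:
        if item["type definition anonymous"] is False:
            return (item["type definition namespace"],
                    item["type definition name"])
        return anonymous_name(item)

    raise ValueError("PSVI item carries no type definition properties")
```
]]></eg>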
</div4>

<div4 id="TypedValueDetermination">
<head>Typed Value Determination</head>

<p>The typed value of &attributeNode;s and some &elementNode;s is a
sequence of atomic values. (Elements that have a complex type with
element-only or mixed content do not contain atomic values; such nodes
have no typed value and this section does not apply to them.) The
types of the items in the typed value of a node may not be the same as
the type of the node itself. This section describes how the typed
value of a node is derived from the properties of an information item
in a PSVI.</p>

<p>The types of the items in the typed value of a node are determined
by a recursive process called typed value determination. This process
begins with <code>T</code>, the schema type of the node itself, as
represented in the PSVI. The type <code>T</code> has a variety, which
is either atomic, union, or list. The typed value determination
process is defined as follows:</p>

<ulist>
<item>
<p>If the {variety} of <code>T</code> is atomic,</p>
  <ulist>
  <item>
  <p>If <code>T</code> is <code>xdt:untyped</code>, the typed value is an instance of
<code>xdt:untypedAtomic</code>.</p>
  </item>
  <item>
  <p>Otherwise, the typed value is an instance of <code>T</code>.</p>
  </item>
  </ulist>
</item>
<item>
<p>If the {variety} of <code>T</code> is union, then the type of the
typed value is determined by the type definition that actually
validated the content of the node, as follows:</p>

  <ulist>
  <item>
    <p>If <emph role="infoset-property">member type definition</emph> exists:
If the {name} property exists, the {target namespace} and {name} 
properties of the <emph role="infoset-property">member type definition</emph>;
otherwise, the appropriate anonymous type name.</p>
  </item>
  <item>
    <p>If <emph role="infoset-property">member type definition anonymous</emph> exists:
If it is false, the <emph role="infoset-property">member type definition namespace</emph>
and <emph role="infoset-property">member type definition name</emph> properties;
otherwise, the appropriate anonymous type name.</p>
  </item>
  </ulist>
<p>The resulting type is substituted for <code>T</code>, and the typed 
value determination process is invoked recursively.</p>
</item>

<item>
<p>If the {variety} of <code>T</code> is list, the
<emph role="infoset-property">schema normalized value</emph> of the 
node is considered to be a space-separated list of lexical forms, each of 
which has its own type. For each of these lexical forms, the type of the 
corresponding item is found in {item type definition}. This type is then 
substituted for <code>T</code>, and the typed value determination process is invoked 
recursively for each member of the list.</p>
</item>
</ulist>
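
<p>The recursion described above can be sketched in Python as follows. This
is a rough illustration under stated assumptions: schema types are modelled as
dictionaries with <code>name</code>, <code>variety</code>, and
<code>item type</code> keys, and <code>member_type_for</code> is a
hypothetical callback standing in for the PSVI properties that identify the
member type that validated a given lexical form.</p>

<eg><![CDATA[
```python
def typed_value_types(t, value, member_type_for):
    """Return a list of (lexical form, atomic type name) pairs.

    t               -- dict describing the type (assumed shape, see lead-in)
    value           -- the schema normalized value as a string
    member_type_for -- hypothetical callback: lexical form -> member type dict
    """
    variety = t["variety"]
    if variety == "atomic":
        if t["name"] == "xdt:untyped":
            return [(value, "xdt:untypedAtomic")]
        return [(value, t["name"])]
    if variety == "union":
        # Substitute the member type that actually validated the content,
        # then apply the process again.
        return typed_value_types(member_type_for(value), value, member_type_for)
    if variety == "list":
        pairs = []
        # Each space-separated lexical form is typed by the item type.
        for lexical in value.split():
            pairs.extend(
                typed_value_types(t["item type"], lexical, member_type_for))
        return pairs
    raise ValueError("unknown variety: %r" % variety)
```
]]></eg>

<p>Because every branch ends in the atomic case, the result is always a flat
sequence of atomic items, even for a list whose item type is a union.</p>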

<p>The typed value determination process is guaranteed to result in a 
sequence of atomic values, each having a well-defined atomic type. This 
sequence of atomic values, in turn, determines the string-value and 
typed-value properties of the node in the data model. However, 
implementations are allowed some flexibility in how these properties are 
stored. An implementation may choose to store the string value only and 
derive the typed value from it, or to store the typed value only and 
derive the string value from it, or to store both the string value and the 
typed value.</p>

<p>In order to permit these various implementation strategies, some 
variations in the string value of a node are defined as insignificant. 
Implementations that store only the typed value of a node are permitted to 
return a string value that is different from the original lexical form of 
the node content. For example, consider the following element:</p>

<eg>&lt;offset xsi:type="xs:integer"&gt;0030&lt;/offset&gt;</eg>

<p>Assuming that the node is valid, it has a typed value of 30 as an
<code>xs:integer</code>. An implementation may return either "30" or
"0030" as the string value of the node. Any string that is a valid
lexical representation of the typed value is acceptable. In this
specification, we express this rule by saying that the relationship
between the string value of a node and its typed value must be
"consistent with schema validation."</p>
</div4>

<div4 id="pattern-facets">
<head>Pattern Facets</head>

<p>Creating a subtype by restriction generally reduces the
<emph>value</emph> space of the original schema type. For example,
expressing a hat size as a restriction of decimal with a minimum value
of 6.5 and maximum value of 8.0 creates a schema type whose legal values are
only those in the range 6.5 to 8.0.</p>

<p>The pattern facet is different because it restricts the
<emph>lexical</emph> space of the schema type, not its value space.
Expressing a three-digit number as a restriction of integer with the
pattern facet “[0-9]{3}” creates a schema type whose legal values
are only those with a lexical form consisting of three digits.</p>

<p>The pattern facet is not reversible in practice; given an arbitrary
pattern, there is no practical way to determine how the lexical form of
a typed value must be constructed so that the result will satisfy that
pattern.</p>

<p>As a consequence, pattern facets are not respected when mapping to
an Infoset or during serialization,
and values in the data model that were originally valid with respect to
a schema that contains pattern-based restrictions may not be valid after
serialization.</p>
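
<p>The following Python sketch illustrates the effect with the three-digit
pattern used above. It assumes an implementation that stores only the typed
value and serializes the integer in its shortest decimal form; the variable
names are illustrative.</p>

<eg><![CDATA[
```python
import re

# The three-digit pattern facet from the text.
PATTERN = re.compile(r"[0-9]{3}")

original_lexical = "007"
originally_valid = PATTERN.fullmatch(original_lexical) is not None

# Assumption: the implementation keeps only the typed value (the integer 7).
typed_value = int(original_lexical)

# Serializing the stored value yields "7", which has lost the leading zeros.
serialized = str(typed_value)
still_valid = PATTERN.fullmatch(serialized) is not None

print(originally_valid, serialized, still_valid)
```
]]></eg>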
</div4>
</div3>

<div3 id="nilled">
<head>Mapping <att>xsi:nil</att> on &elementNode;s</head>

<p><bibref ref="xmlschema-1"/> introduced a mechanism for signaling
that an element should be accepted as valid when it has no content
despite a content type which does not require or even necessarily
allow empty content. That mechanism is the <att>xsi:nil</att> attribute.
</p>

<p>The data model exposes this special semantic in the &dm.prop.nilled; property.
(It also exposes the attribute, irrespective of whether or not schema
processing has been performed.)
</p>

<p>If the <emph role="infoset-property">validity</emph> property exists on
an information item and is <quote><emph>valid</emph></quote>, and
the <emph role="infoset-property">nil</emph> property exists and is true,
then the &dm.prop.nilled; property is <quote><emph>true</emph></quote>.
In all other cases, including all cases where schema validity assessment was
not attempted or did not succeed, the
&dm.prop.nilled; property is <quote><emph>false</emph></quote>.</p>
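
<p>As a minimal sketch, the rule can be written as a single Python predicate.
The dictionary representation of the information item and its key names are
assumptions made for illustration.</p>

<eg><![CDATA[
```python
def nilled(item):
    """Return the nilled property for a PSVI element item (modelled as a dict).

    True only when validity is 'valid' and the nil property exists and is true;
    False in every other case, including when no assessment took place.
    """
    return item.get("validity") == "valid" and item.get("nil") is True
```
]]></eg>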

</div3>

<div3 id="dates-and-times">
<head>Dates and Times</head>

<p>The date and time types require special attention. The following sections apply
to <code>xs:dateTime</code>, <code>xs:date</code>, and <code>xs:time</code> types
and types derived from them.</p>

<div4 id="storing-timezones">
<head>Storing <code>xs:dateTime</code>, <code>xs:date</code>, and <code>xs:time</code> Values in the Data Model</head>

<p><bibref ref="xmlschema-2"/> permits <code>xs:dateTime</code>,
<code>xs:date</code>, and <code>xs:time</code>
values both with and without timezones and therefore only specifies
a partial ordering among date and time values. In the data model,
it is necessary to preserve timezone information.</p>

<p>In order to achieve this goal, <code>xs:dateTime</code>,
<code>xs:date</code>, and <code>xs:time</code> values must be stored
with care. If the lexical representation of the value includes a timezone,
it is converted to UTC
as defined by <bibref ref="xmlschema-2"/> and the timezone in the
lexical representation is converted to a
<code>xdt:dayTimeDuration</code> value (as an offset from UTC). Implementations <rfc2119>must</rfc2119> keep
track of both these values for each <code>xs:dateTime</code>,
<code>xs:date</code>, and <code>xs:time</code> stored.</p>

<p>Lexical representations that do not have a timezone are assumed to be
in UTC for the purposes of normalization only. An empty sequence is used for their
timezone.</p>

<p>Thus, for the purpose of validation,
<quote>2003-01-02T11:30:00-05:00</quote> is converted to
<quote>2003-01-02T16:30:00Z</quote>, but in the data model it <rfc2119>must</rfc2119> be
stored as <quote>(2003-01-02T16:30:00Z, -PT5H0M)</quote>. The value
<quote>2003-01-16T16:30:00</quote> is stored as
<quote>(2003-01-16T16:30:00Z, ())</quote> because it has no timezone.
</p>
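
<p>A rough Python sketch of this storage scheme follows. It is an
approximation only: <code>datetime.fromisoformat</code> accepts a subset of
the <code>xs:dateTime</code> lexical space, a <code>timedelta</code> stands in
for the <code>xdt:dayTimeDuration</code> offset, and <code>None</code> stands
in for the empty sequence. The function name is illustrative.</p>

<eg><![CDATA[
```python
from datetime import datetime, timedelta, timezone


def store_datetime(lexical):
    """Return (value normalized to UTC, timezone offset or None)."""
    parsed = datetime.fromisoformat(lexical)
    if parsed.tzinfo is None:
        # No timezone in the lexical form: assume UTC for normalization
        # only, and record an "empty" timezone component.
        return parsed.replace(tzinfo=timezone.utc), None
    # Timezone present: keep the offset and normalize the value to UTC.
    offset = parsed.utcoffset()
    return parsed.astimezone(timezone.utc), offset


stored = store_datetime("2003-01-02T11:30:00-05:00")
print(stored[0].isoformat(), stored[1])
```
]]></eg>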
</div4>

<div4 id="retreiving-timezones">
<head>Retrieving the Typed Value of <code>xs:dateTime</code>, <code>xs:date</code>, and <code>xs:time</code> Values</head>

<p>For <code>xs:dateTime</code>, <code>xs:date</code> and
<code>xs:time</code>, the typed value is the atomic value
that is determined from its stored form as follows:</p>

<ulist>
<item><p>If the timezone component is not the empty sequence (the timezone
was specified), then the value
contains the time component, normalized to the timezone specified by
the timezone component, as well as the timezone component. The stored values
"(2003-01-02T16:30:00Z, -PT5H0M)" produce the value
"2003-01-02T11:30:00-05:00".</p></item>
<item><p>If the timezone component is the empty sequence (the timezone
<emph>was not</emph> specified), then the value contains the time
component without any indication of timezone. The stored values
"(2003-01-02T16:30:00Z, ())"  produce the value "2003-01-02T16:30:00".</p>
</item>
</ulist>
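
<p>The two cases above can be sketched in Python as follows. As before, this
is an approximation: the stored value is a UTC <code>datetime</code>, a
<code>timedelta</code> stands in for the timezone component, and
<code>None</code> stands in for the empty sequence. Note that Python writes a
zero offset as <code>+00:00</code> rather than <code>Z</code>.</p>

<eg><![CDATA[
```python
from datetime import datetime, timedelta, timezone


def retrieve_datetime(stored_utc, offset):
    """Rebuild the lexical form from the stored (UTC value, offset) pair."""
    if offset is None:
        # Timezone was not specified: drop the timezone indication.
        return stored_utc.replace(tzinfo=None).isoformat()
    # Timezone was specified: normalize to it and include it in the result.
    return stored_utc.astimezone(timezone(offset)).isoformat()


stored_value = datetime(2003, 1, 2, 16, 30, tzinfo=timezone.utc)
print(retrieve_datetime(stored_value, timedelta(hours=-5)))
print(retrieve_datetime(stored_value, None))
```
]]></eg>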
</div4>
</div3>

<div3 id="qnames-and-notations">
<head>QNames and NOTATIONS</head>

<p>The <code>QName</code> and <code>NOTATION</code> data types require
special attention. The following sections apply to
<code>xs:QName</code>, <code>xs:NOTATION</code>, and types derived
from them. These types are referred to collectively as “qualified
names”.</p>

<p>As defined in XML Schema, the lexical space for qualified names
includes a local name and an optional namespace prefix. The value
space for qualified names contains a local name and an optional
namespace URI. Therefore, it is not possible to derive a lexical value
from the typed value, or vice versa, without access to some context
that defines the namespace bindings.</p>

<p>When qualified names exist as values of nodes in a well-formed document,
it is always possible to determine such a namespace context. However,
the data model also allows qualified names to exist as freestanding
atomic values, or as the name or value of a parentless attribute node,
and in these cases no namespace context is available.</p>

<p>In this Data Model, therefore, the value space for qualified names
contains a local name, an optional namespace URI, and an optional
prefix. The prefix is used only when producing a lexical
representation of the value, that is, when casting the value to a
string. The prefix plays no part in other operations involving
qualified names: in particular, two qualified names are equal if their
local names and namespace URIs match, regardless of whether they have the
same prefix.</p>

<p>The following consistency constraints apply:</p>

<ulist>
<item>
<p>If the namespace URI of a qualified name is absent, then the prefix must
also be absent.</p>
</item>

<item>
<p>For every element node whose name has a prefix, the prefix must be one
that is bound to the namespace URI of the element name by one of the
namespace nodes of the element.</p>
</item>

<item>
<p>For every element node whose name has no prefix, the element must have a
namespace node that binds the empty prefix to the namespace URI of the
element name, or must have no namespace node that binds the empty prefix in
the case where the name of the element has no namespace URI.</p>
</item>

<item>
<p>For every attribute node whose name has a prefix, the attribute node must
either be parentless, or the prefix must be one that is bound to the
namespace URI of the attribute name by one of the namespace nodes of the
parent element.</p>
</item>

<item>
<p>For every qualified name that contains a prefix and that is included in
the typed value of an element node, or of an attribute node that has an
element node as its parent, the prefix must be one that is bound to the
namespace URI of the qualified name by one of the namespace nodes of that
element.</p>
</item>

<item>
<p>For every qualified name that contains a namespace URI and no prefix, and
that is included in the typed value of an element node, or of an attribute
node that has an element node as its parent, that element node must have a
namespace node that binds the empty prefix to that namespace URI.</p>
</item>

<item>
<p>For every qualified name that contains neither a namespace URI nor a
prefix, and that is included in the typed value of an element node, or of an
attribute node that has an element node as its parent, that element node
must have no namespace node that binds the empty prefix.</p>
</item>
</ulist>
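
<p>The three-part value space and the first consistency constraint can be
sketched with a small Python class. The class and attribute names are
illustrative and not part of the data model; the remaining constraints, which
depend on namespace nodes, are not modelled here.</p>

<eg><![CDATA[
```python
class QName:
    """A qualified name: local name, optional namespace URI, optional prefix."""

    def __init__(self, local, namespace=None, prefix=None):
        if namespace is None and prefix is not None:
            # If the namespace URI is absent, the prefix must also be absent.
            raise ValueError("a prefix requires a namespace URI")
        self.local = local
        self.namespace = namespace
        self.prefix = prefix

    def __eq__(self, other):
        if not isinstance(other, QName):
            return NotImplemented
        # The prefix plays no part in comparison.
        return (self.namespace, self.local) == (other.namespace, other.local)

    def __hash__(self):
        return hash((self.namespace, self.local))

    def __str__(self):
        # The prefix is used only when producing a lexical representation.
        if self.prefix:
            return f"{self.prefix}:{self.local}"
        return self.local


a = QName("item", "urn:example", "p")
b = QName("item", "urn:example", "q")
print(a == b, str(a), str(b))
```
]]></eg>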
</div3>
</div2>
</div1>

<div1 id="infoset-mapping">
<head>Infoset Mapping</head>

<p>This specification describes how to map each kind of node to the
corresponding information item. This mapping produces an Infoset; it
does not and cannot produce a PSVI. Validation must be used to obtain
a PSVI for a (portion of a) data model instance.
</p>

<p>An Infoset can also be constructed by serializing an instance of
the data model and parsing it. Serialization is governed by
<bibref ref="xslt-xquery-serialization"/>.</p>

</div1>

<div1 id="accessors">
<head>Accessors</head>

<p>A set of accessors is defined on <loc href="#Node">nodes</loc>
in the data model.
Some
accessors return a constant empty sequence on certain node kinds.
The
<function>unparsed-entity-system-id</function>,
<function>unparsed-entity-public-id</function>, and
<function>document-uri</function> accessors, which are only available
on &documentNode;s, are not included in this summary.</p>

<p>In order for processors to be able to operate on instances of the
data model, the model must expose the properties of the items it contains.
The data model does this by defining a family of accessor functions.
These are not functions in the literal sense; they are not available
for users or applications to call directly. Rather they are
descriptions of the information that an implementation of the data model
must expose to applications. Functions and operators available to end-users
are described in <bibref ref="xpath-functions"/>.</p>

<p>Some typed values in the data model are <emph>undefined</emph>.
Attempting to access an undefined property always raises an error.</p>

<div2 id="dm-base-uri">
<head><code>base-uri</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="base-uri" return-type="xs:anyURI" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>base-uri</function> accessor returns the base URI of a node
as a sequence containing zero or one URI reference. For more information
about base URIs, see <bibref ref="xmlbase"/>.</p>

<p>It is defined on
<loc href="#acc-summ-base-uri">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-node-kind">
<head><code>node-kind</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="node-kind" return-type="xs:string" returnEmptyOk="no">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>node-kind</function> accessor returns a string identifying the
kind of node. It will be one of the following, depending on the kind of
node: “document”, “element”, “attribute”, “namespace”, “processing-instruction”, “comment”,
or “text”.</p>

<p>It is defined on
<loc href="#acc-summ-node-kind">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-node-name">
<head><code>node-name</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="node-name" return-type="xs:QName" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>node-name</function> accessor returns the name of the node
as a sequence of zero or one <code>xs:QName</code>s. Note that the
QName value includes an optional prefix as described in
<specref ref="qnames-and-notations"/>.</p>

<p>It is defined on
<loc href="#acc-summ-node-name">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-parent">
<head><code>parent</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="parent" return-type="node()" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>parent</function> accessor returns the &dm.prop.parent; of a node
as a sequence containing zero or one nodes.</p>

<p>It is defined on
<loc href="#acc-summ-parent">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-string-value">
<head><code>string-value</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="string-value" return-type="xs:string" returnEmptyOk="no">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>string-value</function> accessor returns the string value
of a node.</p>

<p>It is defined on
<loc href="#acc-summ-string-value">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-typed-value">
<head><code>typed-value</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="typed-value" return-type="xdt:anyAtomicType" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>typed-value</function> accessor returns the
typed-value of the node as a sequence of zero or more atomic
values.</p>

<p>It is defined on
<loc href="#acc-summ-typed-value">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-type-name">
<head><code>type-name</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="type-name" return-type="xs:QName" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>type-name</function> accessor returns the name of the schema type
of a node as a sequence of zero or one <code>xs:QName</code>s.</p>

<p>It is defined on
<loc href="#acc-summ-type-name">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-children">
<head><code>children</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="children" return-type="node()" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>children</function> accessor returns the children of a node
as a sequence containing zero or more nodes.</p>

<p>It is defined on
<loc href="#acc-summ-children">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-attributes">
<head><code>attributes</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="attributes" return-type="attribute()" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>attributes</function> accessor returns the attributes of
a node as a sequence containing zero or more &attributeNode;s.
The order of &attributeNode;s is stable but implementation dependent.</p>


<p>It is defined on
<loc href="#acc-summ-attributes">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-namespace-nodes">
<head><code>namespace-nodes</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="namespace-nodes" return-type="node()" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>namespace-nodes</function> accessor returns the dynamic,
in-scope namespaces associated with a node as a sequence containing
zero or more &namespaceNode;s. The order of &namespaceNode;s is stable
but implementation dependent.</p>

<p>It is defined on
<loc href="#acc-summ-namespace-nodes">all seven</loc> node kinds.</p>

<p>Note: this accessor and the <code>namespace-bindings</code> accessor provide
two views of the same information. Implementations that do not need to expose
&namespaceNode;s might choose not to implement this accessor.</p>

</div2>

<div2 id="dm-namespace-bindings">
<head><code>namespace-bindings</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="namespace-bindings" return-type="xs:string" returnSeq="yes">
    <arg name="node" type="node()"/>
  </proto>
</example>

<p>The <function>namespace-bindings</function> accessor
returns the dynamic, in-scope namespaces associated with a node
as a set
of prefix/URI pairs. In the formalism of this specification, these
pairs are represented as a list of strings where each odd-numbered
list item is the prefix and the following even-numbered item is the
URI. In practice, implementations may choose a more efficient return
type.</p>

<p>The prefix for the default namespace is "".</p>

<p>The <function>namespace-bindings</function> accessor is defined on
<loc href="#acc-summ-namespace-bindings">all seven</loc> node kinds.</p>

<p>Note: this accessor and the <code>namespace-nodes</code> accessor provide
two views of the same information.</p>
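
<p>As an illustration of the flat list formalism, the following Python sketch
turns such a list into a prefix-to-URI mapping. The function name and the
sample URIs are assumptions; the mapping is one possible <quote>more
efficient return type</quote>, not a required one.</p>

<eg><![CDATA[
```python
def bindings_to_dict(flat):
    """Convert [prefix1, uri1, prefix2, uri2, ...] to {prefix: uri}."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of strings")
    # Items 1, 3, 5, ... (indexes 0, 2, 4, ...) are prefixes; the item
    # following each prefix is its namespace URI.
    return dict(zip(flat[0::2], flat[1::2]))


bindings = bindings_to_dict(
    ["", "urn:example:default", "xs", "http://www.w3.org/2001/XMLSchema"])
print(bindings[""])
```
]]></eg>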

</div2>

<div2 id="dm-nilled">
<head><code>nilled</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="nilled" return-type="xs:boolean" returnSeq="no"
	 returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>nilled</function> accessor returns true if the
node is <quote>nilled</quote>; see <specref ref="nilled"/>.</p>

<p>It is defined on
<loc href="#acc-summ-nilled">all seven</loc> node kinds.</p>
</div2>

<div2 id="dm-is-id">
<head><code>is-id</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="is-id" return-type="xs:boolean" returnEmptyOk="no">
    <arg name="node" type="element-node()"/>
  </proto>
</example>

<p>The <function>is-id</function> accessor returns true if the
node is an XML ID.</p>

<p>It is defined on <loc href="#ElementNode">Element</loc> and
<loc href="#AttributeNode">Attribute</loc> Nodes.</p>
</div2>

<div2 id="dm-is-idrefs">
<head><code>is-idrefs</code> Accessor</head>


<example role="signature">
  <proto class="dm" name="is-idrefs" return-type="xs:boolean" returnEmptyOk="no">
    <arg name="node" type="element-node()"/>
  </proto>
</example>

<p>The <function>is-idrefs</function> accessor returns true if the
node is an XML IDREF or IDREFS.</p>

<p>It is defined on <loc href="#ElementNode">Element</loc> and
<loc href="#AttributeNode">Attribute</loc> Nodes.</p>
</div2>
</div1>

<div1 id="Node">
<head>Nodes</head>

<p><termdef id="dt-node" term="Node">There are seven kinds of
<term>Nodes</term> in the data model:
<loc href="#DocumentNode">document</loc>,
<loc href="#ElementNode">element</loc>,
<loc href="#AttributeNode">attribute</loc>,
<loc href="#TextNode">text</loc>,
<loc href="#NamespaceNode">namespace</loc>,
<loc href="#ProcessingInstructionNode">processing instruction</loc>, and
<loc href="#CommentNode">comment</loc>.</termdef> Each kind of
node is described in the following sections.</p>

<p id="constraints-general">All nodes <rfc2119>must</rfc2119> satisfy
the following general constraints:</p>

<olist>
<item><p>Every node <rfc2119>must</rfc2119> have a unique identity,
distinct from all other nodes.
</p></item>
<item>
<p>The &dm.prop.children; property of a node <rfc2119>must not</rfc2119>
contain two consecutive &textNode;s.</p>
</item>
<item>
<p>The &dm.prop.children; property of a node <rfc2119>must not</rfc2119>
contain any empty &textNode;s.</p>
</item>
<item>
<p>The &dm.prop.children; and &dm.prop.attributes; properties of a node
<rfc2119>must not</rfc2119>
contain two nodes with the same identity.</p>
</item>
</olist>
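
<p>The constraints on the &dm.prop.children; and &dm.prop.attributes;
properties lend themselves to a simple check. The Python sketch below is
illustrative only: the <code>Node</code> class is a stand-in with just a kind
and a content string, and Python object identity is assumed to play the role
of node identity.</p>

<eg><![CDATA[
```python
class Node:
    """Minimal stand-in for a data model node (illustrative only)."""

    def __init__(self, kind, content=""):
        self.kind = kind
        self.content = content


def constraint_violations(children, attributes=()):
    """Return a list describing which general constraints are violated."""
    problems = []
    # No two consecutive text nodes among the children.
    for previous, current in zip(children, children[1:]):
        if previous.kind == "text" and current.kind == "text":
            problems.append("consecutive text nodes")
            break
    # No empty text nodes among the children.
    if any(n.kind == "text" and n.content == "" for n in children):
        problems.append("empty text node")
    # Neither property may contain the same node twice.
    for group in (children, attributes):
        if len({id(n) for n in group}) != len(group):
            problems.append("duplicate node identity")
            break
    return problems
```
]]></eg>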

&Document;
&Element;
&Attribute;
&Namespace;
&ProcessingInstruction;
&Comment;
&Text;
</div1>

<div1 id="conformance">
<head>Conformance</head>

<p>The data model is intended primarily as a component that can be
used by other specifications. Therefore, the data model relies on
specifications that use it (such as <bibref ref="xpath20"/>, <bibref
ref="xslt20"/>, and <bibref ref="xquery"/>) to specify conformance
criteria for the data model in their respective environments.
Specifications that set conformance criteria for their use of the data
model must not relax the constraints expressed in this
specification.</p>

<p>Authors of conformance criteria for the use of the data
model should pay particular attention to the following features of
the data model:</p>

<olist>
<item>
<p>Support for DTD processing (both validation and unparsed entities).
</p>
</item>
<item>
<p>Support for W3C XML Schema processing.
</p>
</item>
<item>
<p>Support for the normative construction from an infoset described in
<specref ref="const-infoset"/>.
</p>
</item>
<item>
<p>Support for the normative construction from a PSVI described in
<specref ref="const-psvi"/>.
</p>
</item>
<item>
<p>Support for XML 1.0 and XML 1.1.
</p>
</item>
</olist>

</div1>

</body>

<back>
<div1>
<head>XML Information Set Conformance</head>

<p>This specification conforms to the XML Information Set <bibref ref="xml-infoset"/>.
The following information items <rfc2119>must</rfc2119> be exposed
by the infoset producer to construct a data model unless they are explicitly
identified as optional:</p>

<ulist>
  <item><p>The <emph role="info-item">Document Information Item</emph> with
           <emph role="infoset-property">base URI</emph> and
           <emph role="infoset-property">children</emph> properties.</p></item>

  <item><p><emph role="info-item">Element Information Items</emph> with
           <emph role="infoset-property">base URI</emph>,
           <emph role="infoset-property">children</emph>,
           <emph role="infoset-property">attributes</emph>,
           <emph role="infoset-property">in-scope namespaces</emph>,
           <emph role="infoset-property">local name</emph>,
           <emph role="infoset-property">namespace name</emph>, and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Attribute Information Items</emph> with
           <emph role="infoset-property">namespace name</emph>,
           <emph role="infoset-property">local name</emph>,
           <emph role="infoset-property">normalized value</emph>,
           <emph role="infoset-property">attribute type</emph>, and
           <emph role="infoset-property">owner element</emph> properties.</p></item>

  <item><p><emph role="info-item">Character Information Items</emph> with
           <emph role="infoset-property">character code</emph> and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Processing Instruction Information Items</emph> with
           <emph role="infoset-property">base URI</emph>,
           <emph role="infoset-property">target</emph>,
           <emph role="infoset-property">content</emph> and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Comment Information Items</emph> with
           <emph role="infoset-property">content</emph> and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Namespace Information Items</emph> with
           <emph role="infoset-property">prefix</emph> (optional) and
           <emph role="infoset-property">namespace name</emph> properties.</p></item>
</ulist>

<p>Other information items and properties made available by the
Infoset processor are ignored.  In addition to the properties above,
the following properties are required from the PSVI if the data model is
constructed from a PSVI:</p>

<ulist>
  <item><p><emph role="infoset-property">validity</emph>,
  <emph role="infoset-property">type definition</emph>,
  <emph role="infoset-property">type definition namespace</emph>,
  <emph role="infoset-property">type definition name</emph>,
  <emph role="infoset-property">type definition anonymous</emph>,
  <emph role="infoset-property">member type definition</emph>,
  <emph role="infoset-property">member type definition namespace</emph>,
  <emph role="infoset-property">member type definition name</emph>,
  <emph role="infoset-property">member type definition anonymous</emph> and
  <emph role="infoset-property">schema normalized value</emph> properties on
  <emph role="info-item">Element Information Items</emph>.</p></item>

  <item><p><emph role="infoset-property">validity</emph>,
  <emph role="infoset-property">type definition</emph>,
  <emph role="infoset-property">type definition namespace</emph>,
  <emph role="infoset-property">type definition name</emph>,
  <emph role="infoset-property">type definition anonymous</emph>,
  <emph role="infoset-property">member type definition</emph>,
  <emph role="infoset-property">member type definition namespace</emph>,
  <emph role="infoset-property">member type definition name</emph>,
  <emph role="infoset-property">member type definition anonymous</emph> and
  <emph role="infoset-property">schema normalized value</emph> properties on
  <emph role="info-item">Attribute Information Items</emph>.</p></item>
</ulist>

</div1>

<div1>
<head>Error Summary</head>

<error-list>

<error class="TY" code="0001" label="undefined type" type="static">
<p>This error is raised whenever an accessor is called for a property that
is undefined.</p>
</error>
</error-list>
</div1>

<div1 id="references">
<head>References</head>

<div2 id="normative-references">
<head>Normative References</head>

<blist>

<!--FIXME: update ../etc/tr with the latest TR page info! -->

<bibl id="xml-infoset"     key="Infoset"/>
<bibl id="REC-xml-names"   key="Namespaces in XML"/>
<bibl id="xml-names11"     key="Namespaces in XML 1.1"/>

<bibl id="xpath20" key="XPath 2.0">
<titleref href="http://www.w3.org/TR/xpath20">XML Path Language
(XPath) 2.0</titleref>,
Mary F. Fernández, Michael Kay, Jonathan Robie, et al., Editors.
World Wide Web Consortium, 29 Oct 2004.
This version is http://www.w3.org/TR/2004/WD-xpath20-20041029.
The latest version is available at http://www.w3.org/TR/xpath20.</bibl>

<bibl id="xpath-functions" key="Functions and Operators">
<titleref href="http://www.w3.org/TR/xpath-functions">XQuery 1.0 and
XPath 2.0 Functions and Operators</titleref>,
Ashok Malhotra, Jim Melton, and Norman Walsh, Editors.
World Wide Web Consortium, 29 Oct 2004.
This version is http://www.w3.org/TR/2004/WD-xpath-functions-20041029/.
The latest version is available at http://www.w3.org/TR/xpath-functions/.</bibl>

<bibl id="xmlschema-1"     key="Schema Part 1"/>
<bibl id="xmlschema-2"     key="Schema Part 2"/>

<bibl id="xslt-xquery-serialization" key="Serialization">
<titleref href="http://www.w3.org/TR/xslt-xquery-serialization/">XSLT 2.0
and XQuery 1.0 Serialization</titleref>,
Michael Kay, Norman Walsh, and Henry Zongaro, Editors.
World Wide Web Consortium, 29 Oct 2004.
This version is http://www.w3.org/TR/2004/WD-xslt-xquery-serialization-20041029/.
The latest version is available at http://www.w3.org/TR/xslt-xquery-serialization/.
</bibl>

<bibl id="xquery-semantics" key="Formal Semantics"/>
<bibl id="RFC2119"         key="RFC 2119"/>
<bibl id="charmod"         key="Character Model"/>

</blist>
</div2>

<div2 id="informative-references">
<head>Other References</head>

<blist>

<bibl id="XQDM00"          key="XML Query Data Model">
<titleref href="http://www.w3.org/TR/2001/WD-query-datamodel-20010215/"
>XML Query Data Model</titleref>,
Mary Fernández and Jonathan Robie, Editors.
World Wide Web Consortium,
15 Feb 2001.
</bibl>

<bibl id="xmlbase"         key="XML Base"/>
<bibl id="xpath"           key="XPath 1.0"/>
<bibl id="xpath20req"      key="XPath 2.0 Requirements"/>
<bibl id="xslt20"          key="XSLT 2.0"/>

<bibl id="XQWG" key="XML Query Working Group">
<titleref href="http://www.w3.org/XML/Query"
>XML Query Working Group</titleref>,
World Wide Web Consortium.
Home page: http://www.w3.org/XML/Query
</bibl>

<bibl id="XSLWG" key="XSL Working Group">
<titleref href="http://www.w3.org/Style/XSL/"
>XSL Working Group</titleref>,
World Wide Web Consortium.
Home page: http://www.w3.org/Style/XSL/
</bibl>

<bibl id="xquery" key="XQuery">
<titleref href="http://www.w3.org/TR/xquery">XQuery 1.0:
An XML Query Language</titleref>,
Daniela Florescu, Jonathan Robie, Jérôme Siméon, et al., Editors.
World Wide Web Consortium, 29 Oct 2004.
This version is http://www.w3.org/TR/2004/WD-xquery-20041029/.
The latest version is available at http://www.w3.org/TR/xquery.</bibl>

<bibl id="xquery-requirements" key="XML Query Requirements"/>

<bibl id="ISO8601" key="ISO 8601">ISO (International Organization for Standardization).
<emph>Representations of dates and times, 2000-08-03.</emph>  
Available from: <loc href="http://www.iso.ch/">http://www.iso.ch/</loc>
</bibl>

</blist>
</div2>
</div1>

<inform-div1 id="glossary">
<head>Glossary</head>
<?glossary?>
</inform-div1>

<inform-div1 id="example">
<head>Example</head>

<p>The following XML document is used to illustrate the information
contained in a data model:</p>

<eg>&dm-example.xml;</eg>

<p>The document is associated with the URI
<quote>http://www.example.com/catalog.xml</quote>,
and is valid with respect to the following XML schema:</p>

<eg>&dm-example.xsd;</eg>

<p>This example exposes the data model for a document that has an associated
schema and has been validated successfully against it.
In general, an XML Schema is not required:
the data model can also represent a schemaless, well-formed XML
document using the rules described in <specref ref="types"/>.</p>

<p>The XML document is represented by the nodes described below.
The value <emph>D1</emph> represents a &documentNode;;
the values <emph>E1, E2, etc.</emph> represent &elementNode;s;
the values <emph>A1, A2, etc.</emph> represent &attributeNode;s;
the values <emph>N1, N2, etc.</emph> represent &namespaceNode;s;
the values <emph>P1, P2, etc.</emph> represent &processingInstructionNode;s;
the values <emph>T1, T2, etc.</emph> represent &textNode;s.</p>

<p>For brevity:</p>

<ulist>
<item><p>&textNode;s in the data model that contain only white space are not shown.</p>
</item>
<item><p>Literal strings are shown in quotes without the <code>xs:string()</code>
constructor.
</p></item>
<item><p>Literal decimals are shown without the <code>xs:decimal()</code>
constructor.
</p></item>
<item><p>Nodes are referred to using the syntax <code>[nodeID]</code>.
</p></item>
<item><p>xs:QNames are used with the following prefix bindings:</p>

<table border="0" summary="Namespace prefixes">
<tbody>
  <tr>
    <td>xs</td><td>http://www.w3.org/2001/XMLSchema</td>
  </tr>
  <tr>
    <td>xsi</td><td>http://www.w3.org/2001/XMLSchema-instance</td>
  </tr>
  <tr>
    <td>cat</td><td>http://www.example.com/catalog</td>
  </tr>
  <tr>
    <td>xlink</td><td>http://www.w3.org/1999/xlink</td>
  </tr>
  <tr>
    <td>html</td><td>http://www.w3.org/1999/xhtml</td>
  </tr>
  <tr>
    <td>anon</td><td>An implementation-dependent prefix associated with
<termref def="dt-anonymous-type-name">anonymous type names</termref></td>
  </tr>
</tbody>
</table>

</item>
<item><p>The abbreviation <quote><code>\n</code></quote> is used in string literals
to represent a newline character; this escape is not supported in XPath, but it makes
the presentation clearer.</p></item>
<item><p>Accessors that return the empty sequence have been omitted.</p>
</item>
<item><p>To simplify the presentation, the example assumes an implementation
that does not expose the namespace axis. Therefore,
&namespaceNode;s are shared across multiple elements.
See <specref ref="NamespaceNode"/>.</p>
</item>
</ulist>

&dm-example.tbl;

<p>A graphical representation of the data model for the preceding
example is shown below. Document order in this representation can be
found by following the traditional pre-order, left-to-right,
depth-first traversal; however, because the image has been rotated for
easier presentation, this appears as a pre-order, bottom-to-top,
depth-first traversal.</p>

<table border="0" cellspacing="0" summary="Graphic">
  <tbody>
    <tr>
      <td><graphic source="dm-example.png"
                   alt="Graphical depiction of the example data model."/>
      </td>
    </tr>
    <tr>
      <td>Graphic representation of the data model.
[<loc href="dm-example-large.png">large view</loc>,
<loc href="dm-example.svg">SVG</loc>]
      </td>
    </tr>
  </tbody>
</table>

</inform-div1>

<!-- removed reference to dm-issues-list.xml -->

</back>
</spec>
