<?xml version="1.0"?>
<!DOCTYPE spec SYSTEM "../schema/xsl-query.dtd" [

<!ENTITY Document SYSTEM "document.xml">
<!ENTITY Element  SYSTEM "element.xml">
<!ENTITY Attribute  SYSTEM "attribute.xml">
<!ENTITY Namespace  SYSTEM "namespace.xml">
<!ENTITY ProcessingInstruction  SYSTEM "processing-instruction.xml">
<!ENTITY Comment  SYSTEM "comment.xml">
<!ENTITY Text  SYSTEM "text.xml">

<!ENTITY documentNode "Document Node">
<!ENTITY elementNode "Element Node">
<!ENTITY attributeNode "Attribute Node">
<!ENTITY namespaceNode "Namespace Node">
<!ENTITY processingInstructionNode "Processing Instruction Node">
<!ENTITY commentNode "Comment Node">
<!ENTITY textNode "Text Node">

<!ENTITY dm-example.xml SYSTEM "build/dm-example.xml.cdata">
<!ENTITY dm-example.xsd SYSTEM "build/dm-example.xsd.cdata">
<!ENTITY dm-example.tbl SYSTEM "build/dm-example.tbl.xml">

<!ENTITY date.year "2004">
<!ENTITY date.month "July">
<!ENTITY date.MM "07">
<!ENTITY date.day "23">
<!ENTITY date.DD "&date.day;">
<!ENTITY doc.date "&date.year;&date.MM;&date.DD;">
<!ENTITY doc.prefix "WD-xpath-datamodel">
<!ENTITY url.group "http://www.w3.org/XML/Group/">
<!ENTITY url.group.ql "&url.group;xmlquery/">
<!ENTITY url.publoc "&url.group;&date.year;/&date.MM;/&doc.prefix;-&doc.date;.html">
<!ENTITY url.internal "http://www.w3.org/Style/XSL/Group/xpath2-tf/&doc.prefix;-&doc.date;.html">
<!ENTITY url.external "http://www.w3.org/TR/&date.year;/&doc.prefix;-&doc.date;/">
<!ENTITY url.this "&url.external;">
<!ENTITY aacute "&#225;">

<!ENTITY dm.prop.attributes   "<emph role='dm-node-property'>attributes</emph>">
<!ENTITY dm.prop.base-uri     "<emph role='dm-node-property'>base-uri</emph>">
<!ENTITY dm.prop.children     "<emph role='dm-node-property'>children</emph>">
<!ENTITY dm.prop.content      "<emph role='dm-node-property'>content</emph>">
<!ENTITY dm.prop.namespaces   "<emph role='dm-node-property'>namespaces</emph>">
<!ENTITY dm.prop.nilled       "<emph role='dm-node-property'>nilled</emph>">
<!ENTITY dm.prop.node-name    "<emph role='dm-node-property'>node-name</emph>">
<!ENTITY dm.prop.parent       "<emph role='dm-node-property'>parent</emph>">
<!ENTITY dm.prop.prefix       "<emph role='dm-node-property'>prefix</emph>">
<!ENTITY dm.prop.string-value "<emph role='dm-node-property'>string-value</emph>">
<!ENTITY dm.prop.target       "<emph role='dm-node-property'>target</emph>">
<!ENTITY dm.prop.type-name    "<emph role='dm-node-property'>type-name</emph>">
<!ENTITY dm.prop.uri          "<emph role='dm-node-property'>uri</emph>">
<!ENTITY dm.prop.typed-value  "<emph role='dm-node-property'>typed-value</emph>">
]>
<spec w3c-doctype="wd">
<header>
  <title>XQuery 1.0 and XPath 2.0 Data Model</title>
  <version/>
  <w3c-designation>&doc.prefix;-&doc.date;</w3c-designation>
  <w3c-doctype>W3C Working Draft</w3c-doctype>
  <pubdate>
    <day>&date.day;</day>
    <month>&date.month;</month>
    <year>&date.year;</year>
  </pubdate>
  <publoc>
     <loc href="&url.this;">&url.this;</loc>
  </publoc>
  <altlocs>
    <loc href="&url.this;data-model.xml">XML</loc>
  </altlocs>
  <latestloc>
    <loc href="http://www.w3.org/TR/xpath-datamodel/">http://www.w3.org/TR/xpath-datamodel/</loc>
  </latestloc>
  <prevlocs>
    <loc href="http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/">http://www.w3.org/TR/2003/WD-xpath-datamodel-20031112/</loc>
    <loc href="http://www.w3.org/TR/2003/WD-xpath-datamodel-20030502/">http://www.w3.org/TR/2003/WD-xpath-datamodel-20030502/</loc>
  </prevlocs>
  <authlist>
    <author>
      <name>Mary Fern&aacute;ndez (XML Query WG)</name>
      <affiliation>AT&amp;T Labs</affiliation>
      <email href="mailto:mff@research.att.com">mff@research.att.com</email>
    </author>
    <author>
      <name>Ashok Malhotra (XML Query and XSL WGs)</name>
      <affiliation>Microsoft</affiliation>
      <email href="mailto:ashokma@microsoft.com">ashokma@microsoft.com</email>
    </author>
    <author>
      <name>Jonathan Marsh (XSL WG)</name>
      <affiliation>Microsoft</affiliation>
      <email href="mailto:jmarsh@microsoft.com">jmarsh@microsoft.com</email>
    </author>
    <author>
      <name>Marton Nagy (XML Query WG)</name>
      <affiliation>Science Applications International Corporation (SAIC)</affiliation>
      <email href="mailto:marton.nagy@saic.com">marton.nagy@saic.com</email>
    </author>
    <author>
      <name>Norman Walsh (XSL WG)</name>
      <affiliation>Sun Microsystems</affiliation>
      <email href="mailto:Norman.Walsh@Sun.COM">Norman.Walsh@Sun.COM</email>
    </author>
  </authlist>

<status>
<p><emph>This section describes the status of this document at the time
of its publication. Other documents may supersede this document. A
list of current W3C publications and the latest revision of this
technical report can be found in the <loc
href="http://www.w3.org/TR/">W3C technical reports index</loc> at
http://www.w3.org/TR/.</emph></p>

<p>This is a Public Working Draft for review by W3C Members and
other interested parties.
Publication as a Working Draft does not imply endorsement by the W3C
Membership. This is a draft document and may be updated, replaced or
obsoleted by other documents at any time. It is inappropriate to cite
this document as other than work in progress.</p>

<p>The XQuery 1.0 and XPath 2.0 Data Model has been defined jointly by
the <loc href="http://www.w3.org/XML/Query">XML Query Working Group</loc>
and the
<loc href="http://www.w3.org/Style/XSL/">XSL Working Group</loc>
(both part of the
<loc href="http://www.w3.org/XML/Activity.html">XML Activity</loc>).
</p>

<p>This working draft includes a number of changes made in response to
comments received during the Last Call period that ended on Feb. 15,
2004. The working group is continuing to process these comments, and
additional changes are expected.</p>

<p>This document reflects decisions taken up to and including the
face-to-face meeting in Cambridge, MA during the week of June 21,
2004. These decisions are recorded in the Last Call
<loc href="http://www.w3.org/2004/07/data-model-issues.html">issues list</loc>
(http://www.w3.org/2004/07/data-model-issues.html).
However, some of these decisions may not yet have been incorporated into
this document.
</p>

<p>Public comments on this document and its open issues are invited.
Comments should be sent to the W3C mailing list
<loc href="mailto:public-qt-comments@w3.org">public-qt-comments@w3.org</loc>
(archived at
<loc href="http://lists.w3.org/Archives/Public/public-qt-comments/">http://lists.w3.org/Archives/Public/public-qt-comments/</loc>), with “[DM]”
at the beginning of the subject field.</p>

<p>The patent policy for this document is expected to become the <loc
href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5 February
2004 W3C Patent Policy</loc>, pending the Advisory Committee review of
the renewal of the XML Query Working Group. Patent disclosures relevant
to this specification may be found on the <loc
href="http://www.w3.org/2002/08/xmlquery-IPR-statements">XML Query
Working Group's patent disclosure page</loc> and the <loc
href="http://www.w3.org/Style/XSL/Disclosures">XSL Working Group's
patent disclosure page</loc>. An individual who has actual knowledge of
a patent which the individual believes contains Essential Claim(s) with
respect to this specification should disclose the information in
accordance with <loc
href="http://www.w3.org/Consortium/Patent-Policy-20040205/#sec-Disclosure">section
6 of the W3C Patent Policy</loc>.
</p>
</status>

<abstract>
<p>This document defines the W3C XQuery 1.0 and XPath 2.0 Data Model,
which is the data model of <bibref ref="xpath20"/>,
<bibref ref="xslt20"/>, and <bibref ref="xquery"/>, and any other
specifications that reference it. This data model is based on the
<bibref ref="xpath"/> data model and earlier work on an
<bibref ref="XQDM00"/>. This document is the result of joint
work by the <bibref ref="XSLWG"/> and the <bibref ref="XQWG"/>.</p>
</abstract>

<langusage>
<language id="en">English</language>
</langusage>

<revisiondesc>
<p>See the CVS changelog.</p>
</revisiondesc>
</header>

<body>

<div1 id="intro">
<head>Introduction</head>

<p>This document defines the XQuery 1.0 and XPath 2.0 Data Model,
which is the data model of <bibref ref="xpath20"/>, <bibref ref="xslt20"/> and
<bibref ref="xquery"/>.</p>

<p>The XQuery 1.0 and XPath 2.0 Data Model (henceforth "data model")
serves two purposes.
First, it defines the information contained in the input to an
XSLT or XQuery processor.  Second, it defines all permissible values of
expressions in the XSLT, XQuery, and XPath languages.  A
language is <emph>closed</emph> with respect to a data model if the value
of every expression in the language is guaranteed to be in the data model.
XSLT 2.0, XQuery 1.0, and XPath 2.0 are all closed with respect to
the data model.</p>

<p>The data model is based on the <bibref ref="xml-infoset"/>
(henceforth "Infoset"), but it requires the following new features to
meet the <bibref ref="xpath20req"/> and <bibref ref="xquery-requirements"/>:</p>

<ulist>
  <item>
    <p>Support for XML Schema types. The XML Schema recommendations
    define features, such as structures (<bibref ref="xmlschema-1"/>)
    and simple data types (<bibref ref="xmlschema-2"/>), that extend
    the XML Information Set with precise type information.</p>
  </item>
  <item>
    <p>Representation of collections of documents and of
    complex values (<bibref ref="xquery-requirements"/>).</p>
  </item>
</ulist>

<p>As with the Infoset, the XQuery 1.0 and XPath 2.0 Data Model
specifies what information in the documents is accessible, but it does
not specify the programming-language interfaces or bindings used to
represent or access the data.</p>

<p>The data model can represent not only the input and the output of a
stylesheet or query, but also the values of all expressions evaluated
during intermediate calculations.
Examples include the input document or document repository (represented
as a &documentNode; or a sequence of &documentNode;s), the result of a
path expression (represented as a sequence of nodes), the result of an
arithmetic or a logical expression (represented as an atomic value),
and the result of a sequence expression (represented as a sequence of items).
</p>

<p>This document provides a precise definition of the properties of nodes
in the XQuery 1.0 and XPath 2.0 Data Model, how they are accessed, and how
they relate to values in the Infoset and PSVI.</p>

</div1>

<div1 id="concepts">
<head>Concepts</head>

<p>This section outlines a number of general concepts that apply throughout
this specification.</p>

<div2 id="terminology">
<head>Terminology</head>

<p>For a full glossary of terms, see <specref ref="glossary"/>.</p>

<p>In this specification the words <rfc2119>must</rfc2119>,
<rfc2119>must not</rfc2119>,
<rfc2119>should</rfc2119>,
<rfc2119>should not</rfc2119>,
<rfc2119>may</rfc2119> and
<rfc2119>recommended</rfc2119>
are to be interpreted as described in <bibref ref="RFC2119"/>.</p>

<p>This specification distinguishes between the data model as a general
concept and specific items (documents, elements, atomic values, etc.)
that are concrete examples of the data model by identifying all concrete
examples as <termref def="dt-instance">instances of the data model</termref>.
</p>

<p><termdef id="dt-instance" term="instance of the data model">Every
<term>instance of the data model</term> is a
<termref def="dt-sequence">sequence</termref>.</termdef>
</p>

<p><termdef id="dt-sequence" term="sequence">A <term>sequence</term>
is an ordered collection of zero or more <termref
def="dt-item">items</termref>.</termdef> A sequence cannot be a member
of a sequence. A single item appearing on its own is modeled as a
sequence containing one item. Sequences are defined in <specref
ref="sequences"/>.</p>

<p><termdef id="dt-item" term="item">An <term>item</term>
is either a
<termref def="dt-node">node</termref> or an
<termref def="dt-atomic-value">atomic value</termref>.</termdef>
</p>

<p>Every node is one of the seven kinds of nodes defined in <specref
ref="Node"/>. Nodes form a tree that consists of a root node plus
all the nodes that are reachable directly or indirectly from the root node
via the <function>children</function>,
<function>attributes</function>, and <function>namespaces</function>
accessors. Every node belongs to exactly one tree, and every tree has
exactly one root node.</p>

<p><termdef id="dt-document" term="document">A
tree whose root node is a &documentNode; is referred to as a
<term>document</term>.</termdef></p>

<p><termdef id="dt-fragment"
term="fragment">A tree whose root node is not a &documentNode; is
referred to as a <term>fragment</term>.</termdef></p>

<p><termdef id="dt-atomic-value" term="atomic value">An
<term>atomic value</term> is a value in the value space
of an <termref def="dt-atomic-type">atomic type</termref> and is labeled with
the name of that atomic type.</termdef></p>

<p><termdef id="dt-atomic-type" term="atomic type">An <term>atomic type</term>
is a <termref def="dt-primitive-simple-type">primitive simple type</termref>
or a type derived by restriction from
another atomic type.</termdef>
(Types derived by list or union are not atomic.)
</p>
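
<p>As an informal illustration, the following schema fragment contrasts a
type that is atomic with one that is not. The type names
<code>percentage</code> and <code>percentageList</code> are hypothetical and
are not defined by this or any other specification.</p>

<eg><![CDATA[<!-- Atomic: derived by restriction from the atomic type xs:integer. -->
<xs:simpleType name="percentage">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="0"/>
    <xs:maxInclusive value="100"/>
  </xs:restriction>
</xs:simpleType>

<!-- Not atomic: derived by list, so each value is a list of atomic values. -->
<xs:simpleType name="percentageList">
  <xs:list itemType="percentage"/>
</xs:simpleType>]]></eg>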

<p><termdef id="dt-primitive-simple-type" term="primitive simple type">There are 24
<term>primitive simple types</term>: the 19 defined in
<xspecref spec="XS2" ref="built-in-primitive-datatypes"/>
of <bibref ref="xmlschema-2"/> and
<code>xdt:anyAtomicType</code>, <code>xdt:untyped</code>,
<code>xdt:untypedAtomic</code>, <code>xdt:dayTimeDuration</code>,
and <code>xdt:yearMonthDuration</code></termdef>, defined in <specref ref="types"/>.</p>

<p>A type is represented in the data model by an
<termref def="dt-expanded-qname">expanded-QName</termref>.
</p>

<p><termdef id="dt-expanded-qname" term="expanded-QName">An <term>expanded-QName</term>
is a pair of values consisting of a possibly empty namespace URI and
a local name. Expanded-QNames belong to the value space of the XML Schema type
<code>xs:QName</code>. References to <code>xs:QName</code> in this document
always
mean the value space, i.e. a namespace URI, local name pair (and not
the lexical space referring to constructs of the form
“<code>prefix:local-name</code>”).</termdef>
</p>
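
<p>The distinction between the value space and the lexical space can be
sketched with a small document. The element names and the namespace URI
<code>http://example.org/ns</code> are assumptions chosen only for
illustration.</p>

<eg><![CDATA[<doc xmlns:a="http://example.org/ns" xmlns:b="http://example.org/ns">
  <!-- Both of these elements have the expanded-QName
       (http://example.org/ns, item), although their prefixes differ. -->
  <a:item/>
  <b:item/>
  <!-- The namespace URI of this element's expanded-QName is empty. -->
  <item/>
</doc>]]></eg>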

<p><termdef id="dt-implementation-defined" term="implementation
defined"><term>Implementation-defined</term> indicates an aspect that
may differ between implementations, but must be specified by the
implementor for each particular implementation.</termdef></p>

<p><termdef id="dt-implementation-dependent" term="implementation
dependent"><term>Implementation-dependent</term> indicates an aspect
that may differ between implementations, is not specified by this or
any W3C specification, and is not required to be specified by the
implementor for any particular implementation.</termdef></p>

<p>In all cases where this specification leaves the behavior
implementation-defined or implementation-dependent, the implementation
has the option of providing mechanisms that allow the user to
influence the behavior.</p>

<p>This document normatively defines the XQuery 1.0 and XPath 2.0 Data
Model. In this document, examples and material labeled as "Note" are
provided for explanatory purposes and are not normative.</p>

</div2>

<div2 id="notation">
<head>Notation</head>

<p>In addition to prose, this specification defines a set of accessor functions to
explain the data model. The accessors are shown with the prefix
<emph>dm:</emph>. This prefix is always shown in italics to emphasize
that these functions are abstract; they exist to explain the interface
between the data model and specifications that rely on the data model:
they are not accessible directly from the host
language.</p>

<p>Several prefixes are used throughout this document for notational
convenience. The following bindings are assumed.</p>

<olist>
<item><p><code>xs:</code> bound to
<code>http://www.w3.org/2001/XMLSchema</code>
</p></item>
<item><p><code>xsi:</code> bound to
<code>http://www.w3.org/2001/XMLSchema-instance</code>
</p></item>
<item><p><code>xdt:</code> bound to
<code>http://www.w3.org/&date.year;/&date.MM;/xpath-datatypes</code>
</p></item>
<item><p><code>fn:</code> bound to
<code>http://www.w3.org/2004/07/xpath-functions</code>
</p></item>
</olist>

<p>In practice, any prefix that is bound to the appropriate URI may be used.</p>
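
<p>For example, a schema author might prefer the prefix <code>xsd:</code>.
In the following sketch, whose element name <code>title</code> is
hypothetical, <code>xsd:string</code> names the same type that this
document writes as <code>xs:string</code>, because both prefixes are bound
to the same namespace URI.</p>

<eg><![CDATA[<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="title" type="xsd:string"/>
</xsd:schema>]]></eg>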

<p>The signature of accessor functions is shown using the same style as
<bibref ref="xpath-functions"/>, described in
<xspecref spec="FO" ref="func-signatures"/>.</p>

<p>This document relies on the <bibref ref="xml-infoset"/> and PSVI. Information items
and properties are indicated by the styles <emph role="info-item">information
item</emph> and <emph role="infoset-property">infoset property</emph>, respectively.</p>

<p>Some aspects of type assignment rely on the ability to access properties of
the schema components. Such properties are indicated by the style
{component property}. Note that this does not mean a lightweight schema processor
cannot be used; it only means that the application must have some mechanism to
access the necessary properties.</p>

</div2>

<div2 id="node-identity">
<head>Node Identity</head>

<p>Each node has a unique identity. Every <termref
def="dt-node">node</termref> in an instance of the data model is unique: identical to
itself, and not identical to any other node. (<termref
def="dt-atomic-value">Atomic values</termref> do not have identity;
every instance of the value “5” as an integer is identical to every
other instance of the value “5” as an integer.)
</p>

<note>
<p>The concept of node identity should not be confused with the
concept of a unique ID, which is a unique name assigned to an element
by the author to represent references using ID/IDREF correlation.</p>
</note>
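
<p>The following fragment is an informal sketch of the difference; the
element and attribute names are hypothetical.</p>

<eg><![CDATA[<list>
  <!-- Two distinct nodes: equal in name and content, but not identical. -->
  <entry>tea</entry>
  <entry>tea</entry>
  <!-- An author-assigned name. It is an ID only if a DTD or schema declares
       the attribute to be of type ID, and it is unrelated to node identity. -->
  <entry id="e3">coffee</entry>
</list>]]></eg>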
</div2>

<div2 id="document-order">
<head>Document Order</head>

<p><termdef id="dt-document-order" term="document order">A
<term>document order</term> is defined among all the nodes accessible
during a given query or transformation. Document order is a total
ordering, although the relative order of some nodes is
implementation-dependent. Informally, document order corresponds to
the order in which the first character of the XML representation of
each node occurs in the XML representation of the document.</termdef>
<termdef id="dt-stable" term="stable">Document order is
<term>stable</term>, which means that the relative order of two
nodes will not change during the processing of a given query or
transformation, even if this order is implementation-dependent.</termdef></p>

<p>Within a tree, document order satisfies the following constraints:</p>

<olist>
<item><p>The root node is the first node.</p></item>

<item><p>The relative order of siblings is determined by their order in 
the XML representation of the tree. A node N1 occurs before a node N2 in 
document order if and only if the start of N1 occurs before the start of 
N2 in the XML representation.</p></item>

<item><p>&namespaceNode;s immediately follow the &elementNode; with which 
they are associated. The relative order of &namespaceNode;s is stable but 
implementation-dependent.</p></item>

<item><p>&attributeNode;s immediately follow the &namespaceNode;s of the 
element with which they are associated. The relative order of
&attributeNode;s is stable but implementation-dependent.</p></item>

<item><p>&elementNode;s occur before their children; children occur before 
following-siblings.</p></item>
</olist>

<p>The relative order of nodes in distinct trees is stable but 
implementation-dependent, subject to the following constraint: If any node 
in tree T1 is before any node in tree T2, then all nodes in tree T1 are 
before all nodes in tree T2.</p>
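
<p>As an informal illustration of the constraints that apply within a
tree, consider the following document. The element names, attribute names,
and namespace URI are hypothetical, and the document is written on a single
line so that it contains no whitespace-only text.</p>

<eg><![CDATA[<order xmlns:ex="http://example.org/ns" id="o1" date="2004-07-23"><item>tea</item><note>rush</note></order>]]></eg>

<p>Under those constraints, the nodes of this tree occur in the following
document order: the &documentNode;; the <code>order</code> element; the
&namespaceNode;s of <code>order</code>; its <code>id</code> and
<code>date</code> &attributeNode;s (in a stable but implementation-dependent
relative order); the <code>item</code> element, its &namespaceNode;s, and its
&textNode;; and finally the <code>note</code> element, its &namespaceNode;s,
and its &textNode;.</p>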
</div2>

<div2 id="sequences">
<head>Sequences</head>

<p>An important characteristic of the data model is that there is no
distinction between an item (a node or an atomic value) and a
singleton sequence containing that item. An item is
equivalent to a singleton sequence containing that item and vice
versa.</p>

<p>A sequence may contain nodes, atomic values, or any mixture of
nodes and atomic values. When a node is added to a sequence its
identity remains the same. Consequently a node may occur in more than
one sequence and a sequence may contain duplicate items.</p>

<p>Sequences never contain other sequences; if sequences are combined,
the result is always a “flattened” sequence. In other words, appending
“(d e)” to “(a b c)” produces a sequence of length 5: “(a b c d e)”.
It <emph>does not</emph> produce a sequence of length 4: “(a b c (d e))”;
such a nested sequence never occurs.</p>

<note>
<p>Sequences replace node-sets from XPath 1.0. In XPath 1.0, node-sets
do not contain duplicates. In generalizing node-sets to sequences in
XPath 2.0, duplicate removal is provided by functions on node
sequences.</p>
</note>
</div2>

<div2 id="types">
<head>Types</head>

<p>The data model supports strongly typed languages such as
<bibref ref="xpath20"/> and <bibref ref="xquery"/>
that have a type system based on <bibref ref="xmlschema-1"/>. The 
type system is formally defined in <bibref ref="xquery-semantics"/>.</p>

<p>Every <termref def="dt-item">item</termref> in the data model has both
a value and a type.
In addition to nodes, the data model can represent atomic values like
the number 5 or the string “Hello World.” For each of these
atomic values, the data model contains both the value of the item
(such as 5 or “Hello World”) and its type name (such as
<code>xs:integer</code> or <code>xs:string</code>).</p>

<div3 id="types-representation">
<head>Representation of Types</head>

<p>The data model uses
<termref def="dt-expanded-qname">expanded-QNames</termref> to
represent the names of schema types, which include the built-in
types defined by <bibref ref="xmlschema-2"/>, five additional types
defined by this specification, and may include other user- or
implementation-defined types.</p>

<p>For XML Schema types, the namespace name of the expanded-QName 
is the <emph role="infoset-property">target namespace</emph> property
of the type definition, and its local name is the <emph
role="infoset-property">name</emph> property of the type
definition.</p>
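
<p>For instance, given the following schema sketch, the type would be
represented by the expanded-QName whose namespace name is
<code>http://example.org/po</code> and whose local name is <code>SKU</code>.
Both the target namespace and the type name are hypothetical.</p>

<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/po">
  <xs:simpleType name="SKU">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{3}-[A-Z]{2}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>]]></eg>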

<p>The data model relies on the fact that an expanded-QName uniquely
identifies every named type. (Although it is possible for different
schemas to define different types with the same expanded-QName, at
most one of them can be used in any given validation episode.)
</p>

<p>For anonymous types, the processor <rfc2119>must</rfc2119> construct an
<termref def="dt-anonymous-type-name">anonymous type name</termref>
that is distinct from the name of every named type and the name of
every other anonymous type.
<termdef id="dt-anonymous-type-name" term="anonymous type name">An
<term>anonymous type name</term> is an implementation-defined,
unique type name provided by the processor for every anonymous
type declared in the schemas available in the static context.</termdef>
Anonymous type names
<rfc2119>must</rfc2119>
be globally unique across all anonymous types that are accessible to
the processor. In the formalism of this specification,
the anonymous type names are assumed to be <code>xs:QName</code>s, but in practice
implementations are not required to use <code>xs:QName</code>s to
represent the implementation-defined names of anonymous types.</p>

<p>The scope over which the names of anonymous types must be
meaningful and distinct depends on the processing context. In XSLT, it
is the duration of an entire transformation. In XQuery, it is the
duration of the evaluation of a top-level expression, i.e. an
expression not contained in any other expression.</p>
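
<p>The following schema sketch shows a declaration that gives rise to an
anonymous type. The element name <code>quantity</code> is hypothetical.
Because the simple type definition has no <code>name</code> attribute, the
processor has to supply an anonymous type name for it.</p>

<eg><![CDATA[<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="quantity">
    <!-- The type definition below is anonymous: it has no name attribute. -->
    <xs:simpleType>
      <xs:restriction base="xs:positiveInteger">
        <xs:maxExclusive value="100"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>]]></eg>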

<p>The data model associates schema type information with &elementNode;s,
&attributeNode;s and atomic values. The item is guaranteed to be an
instance of that kind of item with the given schema type.
</p>

<p>The data model does not represent element or attribute declaration
schema components, but it supports various type-related operations.
The semantics of other operations, for example checking whether a particular
instance of an &elementNode; has a given schema type, are defined in
<bibref ref="xquery-semantics"/>.
</p>
</div3>

<div3 id="types-predefined">
<head>Predefined Types</head>

<p>In addition to the 19 types defined in
<xspecref spec="XS2" ref="built-in-primitive-datatypes"/>
of <bibref ref="xmlschema-2"/>, the data model defines five
additional types: <code>xdt:anyAtomicType</code>,
<code>xdt:untyped</code>, <code>xdt:untypedAtomic</code>,
<code>xdt:dayTimeDuration</code>, and
<code>xdt:yearMonthDuration</code>:</p>

<glist role="newTypes">
<gitem id="xdt-anyAtomicType">
<label><code>xdt:anyAtomicType</code></label>
<def>
<p>The abstract datatype <term>xdt:anyAtomicType</term> is a child of
<term>xs:anySimpleType</term> and is the base type for all the
primitive atomic types described in <bibref ref='xmlschema-2'/>. This
datatype cannot be used in <bibref ref='xmlschema-1'/> type declarations,
nor can it be used as a base for user-defined atomic types. It can be
used, as discussed in
<xspecref spec='XQ' ref="id-expressions-on-datatypes"/>,
to define a required type (for example in a function signature) to
indicate that any of the primitive atomic types or
<code>xdt:untypedAtomic</code> is acceptable.
</p>
</def>
</gitem>

<gitem id="xdt-untyped">
<label><code>xdt:untyped</code></label>
<def>
<p>The datatype <term>xdt:untyped</term> is a child of
<term>xs:anyType</term> and serves as a special type
annotation to indicate types that have not been validated by
an XML Schema or a DTD. This type cannot be
used in <bibref ref='xmlschema-1'/> type declarations, nor can it be used
as a base for user-defined types.  It can be used, as discussed in
<xspecref spec='XQ' ref="id-expressions-on-datatypes"/>,
to define a required type (for example in a function signature) to indicate that
only an untyped value is acceptable.</p> 
</def>
</gitem>

<gitem id="xdt-untypedAtomic">
<label><code>xdt:untypedAtomic</code></label>
<def>
<p>The datatype <term>xdt:untypedAtomic</term> is a child of
<term>xdt:anyAtomicType</term> and serves as a special type
annotation to indicate atomic values that have not been validated by
an XML Schema or a DTD or have received an instance type annotation of
<code>xs:anySimpleType</code> in the PSVI. This datatype cannot be
used in <bibref ref='xmlschema-1'/> type declarations, nor can it be used
as a base for user-defined atomic types.
It can be used, as discussed in 
<xspecref spec='XQ' ref="id-expressions-on-datatypes"/>,
to define a required type (for example in a function signature) to
indicate that only an untyped atomic value is acceptable.</p>
</def>
</gitem>

<gitem id="xdt-dayTimeDuration">
<label><code>xdt:dayTimeDuration</code></label>
<def>
<p>The type <code>xdt:dayTimeDuration</code> is derived from
<code>xs:duration</code> by restricting its lexical representation to
contain only the days, hours, minutes and seconds components. The
value space of <code>xdt:dayTimeDuration</code> is the set of
fractional second values. The components of
<code>xdt:dayTimeDuration</code> correspond to the day, hour, minute
and second components defined in Section 5.5.3.2 of
<bibref ref="ISO8601"/>, respectively. <code>xdt:dayTimeDuration</code> is
derived from <code>xs:duration</code> as follows:</p>

<eg><![CDATA[
<xs:simpleType name='dayTimeDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[\-]?P([0-9]+D(T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S
                       |\.[0-9]+S)?|(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+
                       (\.[0-9]*)?S|\.[0-9]+S)?|(\.[0-9]*)?S)|\.[0-9]+S))?
                       |T([0-9]+(H([0-9]+(M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
                       |(\.[0-9]*)?S)|(\.[0-9]*)?S)?|M([0-9]+(\.[0-9]*)?S|\.[0-9]+S)?
                       |(\.[0-9]*)?S)|\.[0-9]+S))"/>
  </xs:restriction>
</xs:simpleType>]]></eg>

<p>To make the long pattern easier to read, it has been formatted on
six lines using additional new line and space characters in the
pattern string. These additional characters should not be interpreted
as part of the pattern.</p>
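
<p>As an informal illustration, assume a hypothetical element
<code>elapsed</code> declared with the type
<code>xdt:dayTimeDuration</code>. The following sketch shows content that
matches the pattern above and content that does not.</p>

<eg><![CDATA[<!-- Matches the pattern. -->
<elapsed>P3DT10H30M</elapsed>
<elapsed>P2D</elapsed>
<elapsed>PT15M</elapsed>
<elapsed>-PT1.5S</elapsed>

<!-- Does not match the pattern: year and month components are not allowed. -->
<elapsed>P1Y2M</elapsed>
<elapsed>P1Y2DT3H</elapsed>]]></eg>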

</def>
</gitem>

<gitem id="xdt-yearMonthDuration">
<label><code>xdt:yearMonthDuration</code></label>
<def>
<p>The type <code>xdt:yearMonthDuration</code> is derived from
<code>xs:duration</code> by restricting its lexical representation to
contain only the year and month components. The value space of
<code>xdt:yearMonthDuration</code> is the set of
<code>xs:integer</code> month values. The year and month components of
<code>xdt:yearMonthDuration</code> correspond to the Gregorian year
and month components defined in Section 5.5.3.2 of
<bibref ref="ISO8601"/>, respectively.</p>

<p>The type <code>xdt:yearMonthDuration</code> is derived from
<code>xs:duration</code> as follows:</p>

<eg><![CDATA[<xs:simpleType name='yearMonthDuration'>
  <xs:restriction base='xs:duration'>
    <xs:pattern value="[\-]?P[0-9]+(Y([0-9]+M)?|M)"/>
  </xs:restriction>
</xs:simpleType>]]></eg>
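
<p>As an informal illustration, assume a hypothetical element
<code>loanPeriod</code> declared with the type
<code>xdt:yearMonthDuration</code>. The following sketch shows content that
matches the pattern above and content that does not.</p>

<eg><![CDATA[<!-- Matches the pattern. -->
<loanPeriod>P1Y2M</loanPeriod>
<loanPeriod>P2Y</loanPeriod>
<loanPeriod>-P14M</loanPeriod>

<!-- Does not match the pattern: day and time components are not allowed. -->
<loanPeriod>P1Y2M3D</loanPeriod>
<loanPeriod>P1YT12H</loanPeriod>]]></eg>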
</def>
</gitem>
</glist>
</div3>

<div3 id="types-hierarchy">
<head>Type Hierarchy</head>

<p>The diagram below shows how the nodes,
<termref def="dt-primitive-simple-type">primitive
simple types</termref>, and user defined types fit together into
a hierarchy.</p>

<p>The <code>xs:IDREFS</code>, <code>xs:NMTOKENS</code>, and
<code>xs:ENTITIES</code> types, together with user-defined list and union
types, are special in that they are lists or unions
rather than true subtypes.</p>

<graphic source="type-hierarchy.png" alt="Type hierarchy graphic"/>
</div3>

<div3 id="AtomicValue">
<head>Atomic Values</head>

<p>An atomic value can be constructed from a lexical
representation. Given a string and an atomic type, the atomic value is
constructed in such a way as to be consistent with validation. If the
string does not represent a valid value of the type, an error is
raised. When <code>xdt:untypedAtomic</code> is specified as the type,
no validation takes place. The details of the construction are
described in <xspecref spec="FO" ref="constructor-functions"/>
and the related <xspecref spec="FO" ref="casting"/>
section of <bibref ref="xpath-functions"/>.
</p>
</div3>

<div3 id="StringValue">
<head>String Values</head>

<p>A string value can be constructed from an atomic value.
Such a value is constructed by
converting the atomic value to its string representation as described
in <xspecref spec="FO" ref="casting"/>.
Using the canonical lexical representation for atomic values
may not always be compatible with XPath 1.0. These and other backwards
incompatibilities are described in
<xspecref spec="XP" ref="id-backwards-compatibility"/>.</p>
</div3>

</div2>
</div1>

<div1 id="construction">
<head>Data Model Construction</head>

<p>This section describes the constraints on instances of the data model.</p>

<p>The data model supports well-formed XML documents conforming to
<bibref ref="REC-xml-names"/> or <bibref ref="xml-names11"/>.
Documents that are not well-formed are,
by definition, not XML. XML documents that do not conform to
<bibref ref="REC-xml-names"/> or <bibref ref="xml-names11"/>
are not supported (nor are they supported by
<bibref ref="xml-infoset"/>).</p>

<p>In other words, the data model supports the following classes
of XML documents:</p>

<ulist>
  <item>
    <p>Well-formed documents conforming to <bibref ref="REC-xml-names"/> or
<bibref ref="xml-names11"/>,</p>
  </item>
  <item>
    <p>DTD-valid documents conforming to <bibref ref="REC-xml-names"/> or
<bibref ref="xml-names11"/>, and</p>
  </item>
  <item>
    <p>W3C XML Schema-validated documents.</p>
  </item>
</ulist>

<p>This document describes how to construct an instance of the data
model from an <bibref ref="xml-infoset"/> or a Post Schema Validation
Infoset (PSVI), the augmented infoset produced by an XML Schema
validation episode.</p>

<p>An instance of the data model can also be constructed directly through
application APIs, or from non-XML sources such as relational tables in
a database.</p>

<p>The data model supports some kinds of values that are not supported
by <bibref ref="xml-infoset"/>. Examples of these are
<termref def="dt-fragment">document fragments</termref>
and sequences of &documentNode;s.
The data model also supports values that are not nodes. Examples of
these are sequences of <termref def="dt-atomic-value">atomic values</termref>,
or sequences mixing nodes and atomic
values. These are necessary to be able to represent the results of
intermediate expressions in the data model during expression
processing.
</p>

<div2 id="const-other">
<head>Direct Construction</head>

<p>Although this document describes construction of an instance of the data model in
terms of infoset properties, an infoset is not an absolutely necessary
precondition for building an instance of the data model.</p>

<p>There are no constraints on how an instance of the data model may be
constructed directly, save that the resulting instance
<rfc2119>must</rfc2119> satisfy all of the constraints described in
this document.</p>

</div2>

<div2 id="const-infoset">
<head>Construction from an Infoset</head>

<p>An instance of the data model can be constructed from an <bibref ref="xml-infoset"/>
that satisfies the
following general constraints:</p>

<ulist>
<item><p>All general and external parsed entities must be fully expanded. The
Infoset must not contain any <emph role="info-item">unexpanded entity
reference information items</emph>.</p>
</item>
<item><p>The infoset <rfc2119>must</rfc2119> provide all of the properties identified as
<quote>required</quote> in this document.
The properties identified as <quote>optional</quote>
may be used, if they are present. All other properties are ignored.</p>
</item>
</ulist>

<p>An instance of the data model constructed from an information set
<rfc2119>must</rfc2119> be consistent with the description provided
for each node kind.</p>
</div2>

<div2 id="const-psvi">
<head>Construction from a PSVI</head>

<p>An instance of the data model can be constructed from a PSVI, whose
element and attribute information items have been strictly assessed,
laxly assessed, or have not been assessed. Constructing an instance of
the data model from a PSVI <rfc2119>must</rfc2119> be consistent with
the description provided in this section and with the description
provided for each node kind.</p>

<p>Data model construction requires that the PSVI provide unique names
for all anonymous schema types.</p>

<note>
<p><bibref ref="xmlschema-1"/> does not require all schema processors to
provide unique names for anonymous schema types. In order to build an
instance of the data model
from a PSVI produced by a processor that does not provide the names,
some post-processing will be required in order to assure that they are
all uniquely identified before construction begins.</p>
</note>

<p><termdef id="dt-incompletely-validated" term="incompletely validated">An
<term>incompletely validated</term> document is an XML document that has a
corresponding schema but whose schema-validity assessment has resulted
in one or more element or attribute information items being assigned
values other than 'valid' for the <emph role="infoset-property">validity</emph>
property in the PSVI.</termdef></p>

<p>The data model supports incompletely validated documents. Elements
and attributes that are not valid are treated as having unknown schema types.</p>

<p>The most significant difference between Infoset construction and PSVI
construction occurs in the area of schema type assignment. Other differences
can also arise from schema processing: default attribute and element values
may be provided, white space normalization of element content may occur, and the
user-supplied lexical form of elements and attributes with atomic schema types
may be lost.</p>

<div3 id="PSVI2Types">
<head>Mapping PSVI Additions to Type Names</head>

<p>A PSVI element or attribute information item may have a
<emph role="infoset-property">validity</emph> property.
The <emph role="infoset-property">validity</emph> property may be
<quote><emph>valid</emph></quote>, <quote><emph>invalid</emph></quote>,
or <quote><emph>notKnown</emph></quote>
and reflects the outcome of schema-validity assessment. In the data
model, precise schema type information is exposed for &elementNode;s and
&attributeNode;s that are <quote><emph>valid</emph></quote>. Nodes
that are not <quote><emph>valid</emph></quote> are treated as if they
were simply well-formed XML and only very general schema type
information is associated with them.
</p>

<div4 id="PSVI2NodeTypes">
<head>Element and Attribute Node Type Names</head>

<p>The precise definition of the schema type of an element or attribute
information item depends on the properties of the PSVI.
In the PSVI, <bibref ref='xmlschema-1'/>
only guarantees the existence of either the
<emph role="infoset-property">type definition</emph> property,
or the
<emph role="infoset-property">type definition namespace</emph>,
<emph role="infoset-property">type definition name</emph> and
<emph role="infoset-property">type definition anonymous</emph>
properties.
If the type definition refers to a union type, further properties
are defined that refer to the type definition
that actually validated the item's normalized value. These properties
are not used to determine the schema type of the node.
</p>

<p>If the <emph role="infoset-property">validity</emph> and
<emph role="infoset-property">validation attempted</emph> properties exist
and have the values <quote><emph>valid</emph></quote> and
<quote><emph>full</emph></quote>, respectively,
the schema type of an element or attribute information item is
represented by an <termref def="dt-expanded-qname">expanded-QName</termref>
whose namespace and local name correspond
to the first applicable items in the following list:
</p>

<ulist>
<item>
<p>If the <emph role="infoset-property">type definition</emph> property exists:</p>
  <ulist>
    <item><p>If the {name} property is not absent, the {target namespace} and {name}
properties of the <emph role="infoset-property">type definition</emph>
property;</p>
    </item>
    <item><p>Otherwise, the namespace and local name of the appropriate
<termref def="dt-anonymous-type-name">anonymous type name</termref>.</p>
    </item>
   </ulist>
</item>

<item>
<p>If <emph role="infoset-property">type definition anonymous</emph> exists:</p>
  <ulist>
    <item><p>If it is <emph>false</emph>:
the <emph role="infoset-property">type definition namespace</emph>
and the <emph role="infoset-property">type definition name</emph> properties;
</p></item>
    <item><p>Otherwise, the namespace and local name of the appropriate
<termref def="dt-anonymous-type-name">anonymous type name</termref>.</p>
    </item>
  </ulist>
</item>
</ulist>

<p>If the <emph role="infoset-property">validity</emph> property does
not exist or is not <quote><emph>valid</emph></quote>, or the
<emph role="infoset-property">validation attempted</emph> property does
not exist or is not <quote><emph>full</emph></quote>,
the schema type of an element is <code>xdt:untyped</code> and the type
of an attribute is <code>xdt:untypedAtomic</code>.</p>
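
<p>The rules above can be sketched in Python as follows. This is an
illustration only, not a normative algorithm. It assumes that a PSVI
information item is modeled as a plain dictionary keyed by property name,
that a missing key means the property does not exist, and that a
hypothetical <code>anonymous_type_name</code> hook supplies the unique name
of an anonymous type. The <code>XDT_NS</code> constant is a placeholder for
the <code>xdt</code> namespace, not its actual URI.</p>

<eg><![CDATA[
```python
from typing import Optional, Tuple

# An expanded-QName modeled as (namespace, local name).
QName = Tuple[Optional[str], str]

# Placeholder standing in for the xdt namespace (assumption, not the real URI).
XDT_NS = "xdt"


def anonymous_type_name(item: dict) -> QName:
    """Hypothetical hook: look up the unique name given to an anonymous type."""
    return item["anonymous type name"]


def node_type_name(item: dict, kind: str) -> QName:
    """Return the schema type name of a PSVI item; kind is 'element' or 'attribute'."""
    fully_valid = (item.get("validity") == "valid"
                   and item.get("validation attempted") == "full")
    if not fully_valid:
        # Not valid, or not fully assessed: only general type information.
        local = "untyped" if kind == "element" else "untypedAtomic"
        return (XDT_NS, local)

    type_def = item.get("type definition")
    if type_def is not None:
        # First applicable case: the type definition component is available.
        if type_def.get("name") is not None:
            return (type_def.get("target namespace"), type_def["name"])
        return anonymous_type_name(item)

    if "type definition anonymous" in item:
        # Second case: only the namespace/name/anonymous properties exist.
        if item["type definition anonymous"] is False:
            return (item.get("type definition namespace"),
                    item["type definition name"])
        return anonymous_type_name(item)

    raise ValueError("PSVI item carries no type definition properties")
```
]]></eg>

<p>Under these assumptions, an element whose
<emph role="infoset-property">validity</emph> is <quote>invalid</quote>
yields the placeholder name for <code>xdt:untyped</code>, regardless of any
type definition properties it carries.</p>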
</div4>

<div4 id="PSVI2TypedValues">
<head>Atomic Value Type Names</head>

<p>The typed value of &attributeNode;s and some &elementNode;s is an
<termref def="dt-atomic-value">atomic value</termref>.
(Elements that have a complex type with element-only content
do not contain atomic values; such nodes have no typed value and this
section does not apply to them.)</p>

<p>The schema type of each item in the typed value of an &elementNode; or &attributeNode;
depends on the schema type of the node and may be further refined. The type must
be further refined when the &elementNode; or &attributeNode; has a list or union type.</p>

<p>If the schema type definition of a node refers to a union type, the PSVI
will contain properties that refer to the type definition
which actually validated the item's normalized value.
These properties are either the
<emph role="infoset-property">member type definition</emph>,
or the
<emph role="infoset-property">member type definition namespace</emph>,
<emph role="infoset-property">member type definition name</emph> and
<emph role="infoset-property">member type definition anonymous</emph>
properties.
If these are available, the schema type of the typed value of an
element or attribute will be the member type that actually
validated the schema normalized value.
</p>

<p>The schema type of the typed value is
represented by an <termref def="dt-expanded-qname">expanded-QName</termref>
whose namespace and local name correspond
to the first applicable items in the following list:
</p>

<ulist>
<item>
<p>If the schema type of the node (as defined in <specref ref="PSVI2NodeTypes"/>) is
<code>xdt:untyped</code> or <code>xdt:untypedAtomic</code>, the namespace
and local name of <code>xdt:untypedAtomic</code>.
</p>
</item>

<item>
<p>If <emph role="infoset-property">member type definition</emph> exists:</p>
  <ulist>
    <item><p>If the {name} property is not absent, the {target namespace} and {name}
properties of the <emph role="infoset-property">member type definition</emph>
property;</p>
    </item>
    <item><p>Otherwise, the namespace and local name of the appropriate
<termref def="dt-anonymous-type-name">anonymous type name</termref>.</p>
    </item>
  </ulist>
</item>

<item>
<p>If <emph role="infoset-property">member type definition anonymous</emph> exists:</p>
  <ulist>
    <item><p>If it is <emph>false</emph>:
the <emph role="infoset-property">member type definition namespace</emph>
and the <emph role="infoset-property">member type definition name</emph> properties;
</p></item>
    <item><p>Otherwise, the namespace and local name of the appropriate
<termref def="dt-anonymous-type-name">anonymous type name</termref>.</p>
    </item>
  </ulist>
</item>

<item>
<p>Otherwise, the namespace and local name of the node’s schema type.</p>
</item>
</ulist>

<p>The {variety} of the resulting type will be either <code>atomic</code>
or <code>list</code>. If the {variety} is <code>atomic</code>, that is the
type of the atomic value.</p>

<p>If the {variety} is <code>list</code>, each member of the list must
be examined to determine its <termref def="dt-atomic-value">atomic value</termref>.
The <emph role="infoset-property">schema normalized value</emph> of the
node is a space-separated list of lexical forms. These lexical forms
can be used to create a list of strings, where each string represents
one member of the list of atomic values.</p>

<p>The nominal type of each member of the list is identified by the
{item type definition} of the type identified above. For each string
in the list, the atomic value is determined as follows:</p>

<olist>
<item>
<p>If the {variety} of the nominal type is <code>atomic</code> and
the string is <code>castable</code> to that type, then the atomic
value is the result of casting the string to that type.</p>
</item>

<item>
<p>If the {variety} of the nominal type is <code>union</code>,
then each type listed in the {member type definitions}
must be considered in turn as a nominal type.</p>
</item>

<item>
<p>If the {variety} of the nominal type is <code>list</code>,
then the {item type definition} must be considered as the nominal type.
</p>
</item>
</olist>

<p>Note that this process is recursive: the member type of a union may
be a list which may have an item type that is a union. The process is
guaranteed to terminate because (1) it terminates immediately if the
initial {variety} is atomic and (2) the initial {variety} can only be
non-atomic if validation succeeded.</p>
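
<p>The recursive examination of list members can be sketched as follows.
This is a minimal illustration, not a normative algorithm. It assumes that
type definitions are modeled as dictionaries with a <code>variety</code>
key, that each atomic type carries a hypothetical <code>cast</code>
callable which raises <code>ValueError</code> when the string is not
castable, and that Python's <code>int</code> and <code>str</code> stand in
for real schema casts.</p>

<eg><![CDATA[
```python
from typing import Any, List, Optional, Tuple


def atomic_value(text: str, nominal: dict) -> Optional[Tuple[str, Any]]:
    """Return (type name, value) for text under a nominal type, or None."""
    variety = nominal["variety"]
    if variety == "atomic":
        try:
            return (nominal["name"], nominal["cast"](text))
        except ValueError:
            return None  # the string is not castable to this type
    if variety == "union":
        # Consider each member type in turn as the nominal type.
        for member in nominal["members"]:
            result = atomic_value(text, member)
            if result is not None:
                return result
        return None
    if variety == "list":
        # The item type definition becomes the nominal type.
        return atomic_value(text, nominal["item"])
    raise ValueError(f"unknown variety: {variety!r}")


def list_typed_value(schema_normalized_value: str,
                     list_type: dict) -> List[Optional[Tuple[str, Any]]]:
    """Split the space-separated lexical forms and type each member."""
    return [atomic_value(s, list_type["item"])
            for s in schema_normalized_value.split()]


# Stand-in type definitions (assumptions made for this illustration only).
INTEGER = {"variety": "atomic", "name": "xs:integer", "cast": int}
STRING = {"variety": "atomic", "name": "xs:string", "cast": str}
INT_OR_STRING = {"variety": "union", "members": [INTEGER, STRING]}
LIST_TYPE = {"variety": "list", "item": INT_OR_STRING}
```
]]></eg>

<p>With these stand-ins, the value <quote>1 two 3</quote> yields an integer,
a string, and an integer, because the union members are tried in order for
each member of the list.</p>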

</div4>

</div3>

<div3 id="nilled">
<head>Mapping <att>xsi:nil</att> on &elementNode;s</head>

<p><bibref ref="xmlschema-1"/> introduced a mechanism for signaling
that an element should be accepted as valid when it has no content
despite a content type which does not require or even necessarily
allow empty content. That mechanism is the <att>xsi:nil</att> attribute.
</p>

<p>The data model exposes this special semantic in the &dm.prop.nilled; property.
(It also exposes the attribute, irrespective of whether or not schema
processing has been performed.)
</p>

<p>If the <emph role="infoset-property">validity</emph> property exists on
an &elementNode; and is <quote><emph>valid</emph></quote>, and
the <emph role="infoset-property">nil</emph> property exists and is true,
then the &dm.prop.nilled; property is <quote><emph>true</emph></quote>.
In all other cases, including all cases where schema validity assessment was
not attempted or did not succeed, the
&dm.prop.nilled; property is <quote><emph>false</emph></quote>.</p>
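
<p>As a minimal sketch, assuming the PSVI element item is modeled as a
dictionary in which a missing key means the property does not exist, the
rule reduces to a single test:</p>

<eg><![CDATA[
```python
def nilled(item: dict) -> bool:
    """True only when the item is valid and its nil property is true."""
    return item.get("validity") == "valid" and item.get("nil") is True
```
]]></eg>

<p>An unassessed or invalid element therefore reports <code>False</code>
even if it carries an <att>xsi:nil</att> attribute.</p>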

</div3>

<div3 id="dates-and-times">
<head>Dates and Times</head>

<p>The date and time types require special attention. The following sections apply
to <code>xs:dateTime</code>, <code>xs:date</code>, and <code>xs:time</code> types
and types derived from them.</p>

<div4 id="storing-timezones">
<head>Storing <code>xs:dateTime</code>, <code>xs:date</code>, and <code>xs:time</code> Values in the Data Model</head>

<p><bibref ref="xmlschema-2"/> permits <code>xs:dateTime</code>,
<code>xs:date</code>, and <code>xs:time</code>
values both with and without timezones and therefore only specifies
a partial ordering between date and time values. In the data model,
it is necessary to preserve timezone information.</p>

<p>In order to achieve this goal, <code>xs:dateTime</code>,
<code>xs:date</code>, and <code>xs:time</code> values must be stored
with care. If the lexical representation of the value includes a timezone,
it is converted to UTC
as defined by <bibref ref="xmlschema-2"/> and the timezone in the
lexical representation is converted to an
<code>xdt:dayTimeDuration</code> value (as an offset from UTC). Implementations <rfc2119>must</rfc2119> keep
track of both these values for each <code>xs:dateTime</code>,
<code>xs:date</code>, and <code>xs:time</code> stored.</p>

<p>Lexical representations that do not have a timezone are assumed to be
in UTC for the purposes of normalization only. An empty sequence is used for their
timezone.</p>

<p>Thus, for the purpose of validation,
<quote>2003-01-02T11:30:00-05:00</quote> is converted to
<quote>2003-01-02T16:30:00Z</quote>, but in the data model it <rfc2119>must</rfc2119> be
stored as <quote>(2003-01-02T16:30:00Z, -PT5H0M)</quote>. The value
<quote>2003-01-16T16:30:00</quote> is stored as
<quote>(2003-01-16T16:30:00Z, ())</quote> because it has no timezone.
</p>
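
<p>The storage rule can be sketched as follows for
<code>xs:dateTime</code> only. This is an illustration under stated
assumptions: Python's <code>datetime.fromisoformat</code> stands in for
parsing the XML Schema lexical form, a <code>timedelta</code> stands in for
the <code>xdt:dayTimeDuration</code> offset, and <code>None</code> stands in
for the empty sequence. The function name is hypothetical.</p>

<eg><![CDATA[
```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def store_date_time(lexical: str) -> Tuple[datetime, Optional[timedelta]]:
    """Return the (UTC-normalized value, timezone offset) pair for a lexical form."""
    parsed = datetime.fromisoformat(lexical)
    if parsed.tzinfo is None:
        # No timezone: assume UTC for normalization only and record
        # the empty sequence (None) as the timezone component.
        return parsed.replace(tzinfo=timezone.utc), None
    # Timezone present: normalize to UTC and keep the original offset.
    return parsed.astimezone(timezone.utc), parsed.utcoffset()


utc_value, offset = store_date_time("2003-01-02T11:30:00-05:00")
```
]]></eg>

<p>Here <code>utc_value</code> is 16:30 UTC on 2003-01-02 and
<code>offset</code> is minus five hours, matching the stored pair in the
example above.</p>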
</div4>

<div4 id="retreiving-timezones">
<head>Retrieving the Typed Value of <code>xs:dateTime</code>, <code>xs:date</code>, and <code>xs:time</code> Values</head>

<p>For <code>xs:dateTime</code>, <code>xs:date</code> and
<code>xs:time</code>, the typed value is the atomic value
that is determined from its stored form as follows:</p>

<ulist>
<item><p>If the timezone component is not the empty sequence (the timezone
was specified), then the value
contains the time component, normalized to the timezone specified by
the timezone component, as well as the timezone component. The stored values
"(2003-01-02T16:30:00Z, -PT5H0M)" produce the value
"2003-01-02T11:30:00-05:00".</p></item>
<item><p>If the timezone component is the empty sequence (the timezone
<emph>was not</emph> specified), then the value is the time
component without any indication of timezone. The stored values
"(2003-01-02T16:30:00Z, ())" produce the value "2003-01-02T16:30:00".</p>
</item>
</ulist>
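
<p>The two cases above can be sketched as follows for
<code>xs:dateTime</code>. This is an illustration only: it assumes the
stored form is a pair of a UTC <code>datetime</code> and either a
<code>timedelta</code> offset or <code>None</code> (standing in for the
empty sequence), and the function name is hypothetical.</p>

<eg><![CDATA[
```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def typed_date_time(utc_value: datetime, tz: Optional[timedelta]) -> str:
    """Rebuild the lexical form of the typed value from its stored pair."""
    if tz is None:
        # Timezone was not specified: drop any indication of timezone.
        return utc_value.replace(tzinfo=None).isoformat()
    # Timezone was specified: normalize to that timezone and include it.
    # Note that isoformat() writes a zero offset as "+00:00", not "Z".
    return utc_value.astimezone(timezone(tz)).isoformat()


stored = datetime(2003, 1, 2, 16, 30, tzinfo=timezone.utc)
with_tz = typed_date_time(stored, timedelta(hours=-5))
without_tz = typed_date_time(stored, None)
```
]]></eg>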
</div4>
</div3>
</div2>

<div2 id="string-and-typed-value">
<head>String and Typed Values</head>

<p>The
<function>string-value</function> and <function>typed-value</function> of
&documentNode;s, &elementNode;s, and &attributeNode;s are defined in
terms of &dm.prop.string-value; and &dm.prop.typed-value; properties.
This specification describes how values are computed for those properties when
constructing an instance of the data model from an Infoset or a PSVI.
</p>

<p>This is a formalism used to simplify the explanations in this
specification; it is not a constraint on implementations. In practice,
implementations are free to adopt any strategy they wish, provided
that the results are indistinguishable in every significant respect
from the results that would be obtained by following precisely the algorithms
described in this specification.</p>

<p>In practice, some implementations, particularly those constructing
instances of the data model from sources other than Infosets or PSVIs, may have access to
only the string or typed values and not both.
In order to support these implementations, this specification explicitly
identifies some variations in the string value of a node as
<emph>insignificant</emph>. Implementations that do not have access
to, or cannot retain, the original string value of a node may reconstruct
it from the typed value. In this case, the string value may differ from
the string value that would be returned by an implementation that preserved
the original lexical form.</p>

<p>Consider the following node:</p>

<eg>&lt;offset xsi:type="xs:integer"&gt;0030&lt;/offset&gt;</eg>

<p>Assuming that the node is valid, it has a typed value of “30” as
an <code>xs:integer</code>. Some implementations will return “0030” as
the string value and some will return “30”. In this regard, any string
value that is a lexical representation of the typed value is
acceptable.</p>

<p>Applications that care about the original lexical forms must choose
an appropriate implementation.</p>
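
<p>For the &lt;offset&gt; example above, the acceptability rule can be
sketched as follows. This is an illustration for <code>xs:integer</code>
only. The function name is hypothetical, and the regular expression is an
assumed rendering of the <code>xs:integer</code> lexical space (an optional
sign followed by decimal digits).</p>

<eg><![CDATA[
```python
import re

# Assumed lexical space of xs:integer: optional sign, then decimal digits.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_acceptable_string_value(candidate: str, typed_value: int) -> bool:
    """True if candidate is a lexical representation of the typed value."""
    if INTEGER_LEXICAL.fullmatch(candidate) is None:
        return False
    return int(candidate) == typed_value
```
]]></eg>

<p>Both <quote>0030</quote> and <quote>30</quote> pass this check for the
typed value 30, so either may be returned as the string value.</p>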

<div3 id="consistent">
<head>Consistent with XML Schema Validation</head>

<p>Validity assessment only assures that the lexical forms present in
the Infoset or PSVI are within the lexical space of the required
schema type. So while such assessment can determine that “0030” is a
valid integer, it does not produce a typed integer value.</p>

<p>The data model contains not only the string value of each node but also
the typed value. In other words, the process of constructing an instance
of the data model
does, in principle, require that an implementation produce the typed integer value
“30” from the lexical value “0030”.</p>

<p>The phrase “derived from the string-value of the node and
its type in a way that is consistent with XML Schema validation” is used
to describe this
process. It is impossible to define precisely what this means as
implementations may choose different internal representations for typed values.
However, it is a constraint on the data model that string values and typed
values must be consistent. Although variations in string value are allowed, the
string value must always be a valid lexical representation of the typed value.</p>

<div4 id="pattern-facets">
<head>Pattern Facets</head>

<p>Creating a subtype by restriction generally reduces the
<emph>value</emph> space of the original schema type. For example,
expressing a hat size as a restriction of decimal with a minimum value
of 6.5 and maximum value of 8.0 creates a schema type whose legal values are
only those in the range 6.5 to 8.0.</p>

<p>The pattern facet is different because it restricts the
<emph>lexical</emph> space of the schema type, not its value space.
Expressing a three-digit number as a restriction of integer with the
pattern facet “[0-9]{3}” creates a schema type whose legal values
are only those with a lexical form consisting of three digits.</p>

<p>The pattern facet is not reversible in practice; given an arbitrary
pattern, there’s no practical way to determine how the lexical form of
a typed value must be constructed so that the result will satisfy that
pattern.</p>

<p>As a consequence, pattern facets are not respected during serialization
and values in the data model that were originally valid with respect to
a schema that contains pattern-based restrictions may not be valid after
serialization.</p>
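
<p>The following sketch illustrates the problem for the three-digit pattern
described above. It assumes an implementation that keeps only the typed
value and rebuilds the lexical form from it. Python's <code>int</code> and
<code>str</code> stand in for the schema cast and for serialization, and
the helper name is hypothetical.</p>

<eg><![CDATA[
```python
import re

# Schema patterns must match the whole lexical form, hence fullmatch().
THREE_DIGITS = re.compile(r"[0-9]{3}")


def satisfies_pattern(lexical: str) -> bool:
    """True if the lexical form satisfies the three-digit pattern facet."""
    return THREE_DIGITS.fullmatch(lexical) is not None


original = "007"              # valid against the pattern
typed = int(original)         # the typed value keeps no lexical detail
reserialized = str(typed)     # lexical form rebuilt from the typed value
still_valid = satisfies_pattern(reserialized)
```
]]></eg>

<p>The original form satisfies the pattern, but the form rebuilt from the
typed value has lost its leading zeros and no longer does.</p>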
</div4>

</div3>

<div3 id="undefined">
<head>Undefined Values</head>

<p>Some typed values in the data model are <emph>undefined</emph>.
Attempting to access an undefined property always raises an error.</p>
</div3>
</div2>
</div1>

<div1 id="data-model-serialization">
<head>Data Model Serialization</head>

<p>Serialization of an instance of the data model is governed by
<bibref ref="xslt-xquery-serialization"/>.</p>

</div1>

<div1 id="infoset-mapping">
<head>Infoset Mapping</head>

<p>This specification describes how to map each kind of node to the
corresponding information item. This mapping produces an Infoset; it
does not and cannot produce a PSVI. Validation must be used to obtain
a PSVI for a (portion of a) data model instance.
</p>

</div1>

<div1 id="accessors">
<head>Accessors</head>

<p>A set of accessors is defined on all seven kinds of nodes; see
<specref ref="Node"/>.
Some
accessors return a constant empty sequence on certain node kinds.
The
<function>unparsed-entity-system-id</function>,
<function>unparsed-entity-public-id</function>, and
<function>document-uri</function> accessors, which are only available
on &documentNode;s, and the
<function>in-scope-namespaces</function> accessor, which is only
available on &elementNode;s, are not included in this summary.</p>

<p>In order for processors to be able to operate on instances of the
data model, the model must expose the properties of the items it contains.
The data model does this by defining a family of accessor functions.
These are not functions in the literal sense; they are not available
for users or applications to call directly. Rather, they are
descriptions of the information that an implementation of the data model
must expose to applications. Functions and operators available to end-users
are described in <bibref ref="xpath-functions"/>.</p>

<div2 id="dm-base-uri">
<head><code>base-uri</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="base-uri" return-type="xs:anyURI" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>base-uri</function> accessor returns the base URI of a node
as a sequence containing zero or one URI reference. For more information
about base URIs, see <bibref ref="xmlbase"/>.</p>

<p>It is defined on
<loc href="#acc-summ-base-uri">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-node-name">
<head><code>node-name</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="node-name" return-type="xs:QName" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>node-name</function> accessor returns the name of the node
as a sequence of zero or one <code>xs:QName</code>s.</p>

<p>It is defined on
<loc href="#acc-summ-node-name">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-parent">
<head><code>parent</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="parent" return-type="node()" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>parent</function> accessor returns the parent of a node
as a sequence containing zero or one nodes.</p>

<p>It is defined on
<loc href="#acc-summ-parent">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-string-value">
<head><code>string-value</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="string-value" return-type="xs:string" returnEmptyOk="no">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>string-value</function> accessor returns the string value
of a node.</p>

<p>It is defined on
<loc href="#acc-summ-string-value">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-typed-value">
<head><code>typed-value</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="typed-value" return-type="xdt:anyAtomicType" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>typed-value</function> accessor returns the
typed-value of the node as a sequence of zero or more atomic
values.</p>

<p>It is defined on
<loc href="#acc-summ-typed-value">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-type-name">
<head><code>type-name</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="type-name" return-type="xs:QName" returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>type-name</function> accessor returns the name of the schema type
of a node as a sequence of zero or one <code>xs:QName</code>s.</p>

<p>It is defined on
<loc href="#acc-summ-type-name">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-children">
<head><code>children</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="children" return-type="node()" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>children</function> accessor returns the children of a node
as a sequence containing zero or more nodes.</p>

<p>It is defined on
<loc href="#acc-summ-children">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-attributes">
<head><code>attributes</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="attributes" return-type="attribute()" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>attributes</function> accessor returns the attributes of
a node as a sequence containing zero or more &attributeNode;s.
The order of &attributeNode;s is stable but implementation dependent.</p>


<p>It is defined on
<loc href="#acc-summ-attributes">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-namespaces">
<head><code>namespaces</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="namespaces" return-type="node()" returnSeq="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>namespaces</function> accessor returns the namespaces
associated with a node as a sequence containing
zero or more &namespaceNode;s.
The order of &namespaceNode;s is stable but implementation dependent.</p>

<p>It is defined on
<loc href="#acc-summ-namespaces">all seven</loc> node kinds.</p>

</div2>

<div2 id="dm-nilled">
<head><code>nilled</code> Accessor</head>

<example role="signature">
  <proto class="dm" name="nilled" return-type="xs:boolean" returnSeq="no"
	 returnEmptyOk="yes">
    <arg name="n" type="node()"/>
  </proto>
</example>

<p>The <function>nilled</function> accessor returns true if the
node is <quote>nilled</quote>; see <specref ref="nilled"/>.</p>

<p>It is defined on
<loc href="#acc-summ-nilled">all seven</loc> node kinds.</p>
</div2>
</div1>

<div1 id="Node">
<head>Nodes</head>

<p><termdef id="dt-node" term="Node">The seven distinct kinds of
<term>Node</term>:
<loc href="#DocumentNode">document</loc>,
<loc href="#ElementNode">element</loc>,
<loc href="#AttributeNode">attribute</loc>,
<loc href="#TextNode">text</loc>,
<loc href="#NamespaceNode">namespace</loc>,
<loc href="#ProcessingInstructionNode">processing instruction</loc>, and
<loc href="#CommentNode">comment</loc>,</termdef> are
defined in the following subsections.</p>

<p id="constraints-general">All nodes <rfc2119>must</rfc2119> satisfy
the following general constraints:</p>

<olist>
<item><p>Every node <rfc2119>must</rfc2119> have a unique identity,
distinct from all other nodes.
</p></item>
<item>
<p>The &dm.prop.children; property of a node <rfc2119>must not</rfc2119>
contain two consecutive &textNode;s.</p>
</item>
<item>
<p>The &dm.prop.children; property of a node <rfc2119>must not</rfc2119>
contain any empty &textNode;s.</p>
</item>
<item><p>The sequence of nodes in the &dm.prop.children;
property of a node is ordered and <rfc2119>must</rfc2119> be in document order.
</p></item>
<item>
<p>The &dm.prop.children; and &dm.prop.attributes; properties of a node
<rfc2119>must not</rfc2119>
contain two nodes with the same identity.</p>
</item>
</olist>
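
<p>Some of these constraints can be checked mechanically. The sketch below
is an illustration only. It assumes a hypothetical in-memory
<code>Node</code> class, uses Python object identity as a stand-in for node
identity, and covers the constraints on text nodes and on duplicate
identities. It does not check document order.</p>

<eg><![CDATA[
```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # eq=False keeps comparison identity-based
class Node:
    kind: str
    content: str = ""
    children: List["Node"] = field(default_factory=list)
    attributes: List["Node"] = field(default_factory=list)


def constraint_violations(node: Node) -> List[str]:
    """List the general constraints violated by one node's properties."""
    problems = []
    kids = node.children
    # No two consecutive text nodes among the children.
    if any(a.kind == "text" and b.kind == "text"
           for a, b in zip(kids, kids[1:])):
        problems.append("consecutive text nodes")
    # No empty text nodes among the children.
    if any(k.kind == "text" and k.content == "" for k in kids):
        problems.append("empty text node")
    # No node may appear twice in children and attributes combined.
    members = kids + node.attributes
    if len({id(m) for m in members}) != len(members):
        problems.append("duplicate node identity")
    return problems
```
]]></eg>

<p>An implementation that merges adjacent text and drops empty text nodes
during construction would never produce a node for which this check
reports a problem.</p>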

&Document;
&Element;
&Attribute;
&Namespace;
&ProcessingInstruction;
&Comment;
&Text;
</div1>

<div1 id="conformance">
<head>Conformance</head>

<p>The data model is intended primarily as a component that can be
used by other specifications. Therefore, the data model relies on
specifications that use it (such as <bibref ref="xpath20"/>, <bibref
ref="xslt20"/>, and <bibref ref="xquery"/>) to specify conformance
criteria for the data model in their respective environments.
Specifications that set conformance criteria for their use of the data
model must not relax the constraints expressed in this
specification.</p>

<p>Authors of conformance criteria for the use of the data
model should pay particular attention to the following features of
the data model:</p>

<olist>
<item>
<p>Support for DTD processing (both validation and unparsed entities).
</p>
</item>
<item>
<p>Support for W3C XML Schema processing.
</p>
</item>
<item>
<p>Support for the normative construction from an infoset described in
<specref ref="const-infoset"/>.
</p>
</item>
<item>
<p>Support for the normative construction from a PSVI described in
<specref ref="const-psvi"/>.
</p>
</item>
<item>
<p>Support for XML 1.0 and XML 1.1.
</p>
</item>
</olist>

</div1>

</body>

<back>
<div1>
<head>XML Information Set Conformance</head>

<p>This specification conforms to the XML Information Set <bibref ref="xml-infoset"/>.
The following information items <rfc2119>must</rfc2119> be exposed
by the infoset producer to construct a data model unless they are explicitly
identified as optional:</p>

<ulist>
  <item><p>The <emph role="info-item">Document Information Item</emph> with
           <emph role="infoset-property">base URI</emph> and
           <emph role="infoset-property">children</emph> properties.</p></item>

  <item><p><emph role="info-item">Element Information Items</emph> with
           <emph role="infoset-property">base URI</emph>,
           <emph role="infoset-property">children</emph>,
           <emph role="infoset-property">attributes</emph>,
           <emph role="infoset-property">in-scope namespaces</emph>,
           <emph role="infoset-property">local name</emph>,
           <emph role="infoset-property">namespace name</emph>,
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Attribute Information Items</emph> with
           <emph role="infoset-property">namespace name</emph>,
           <emph role="infoset-property">local name</emph>,
           <emph role="infoset-property">normalized value</emph>,
           <emph role="infoset-property">attribute type</emph>, and
           <emph role="infoset-property">owner element</emph> properties.</p></item>

  <item><p><emph role="info-item">Character Information Items</emph> with
           <emph role="infoset-property">character code</emph> and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Processing Instruction Information Items</emph> with
           <emph role="infoset-property">base URI</emph>,
           <emph role="infoset-property">target</emph>,
           <emph role="infoset-property">content</emph> and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Comment Information Items</emph> with
           <emph role="infoset-property">content</emph> and
           <emph role="infoset-property">parent</emph> properties.</p></item>

  <item><p><emph role="info-item">Namespace Information Items</emph> with
           <emph role="infoset-property">prefix</emph> (optional) and
           <emph role="infoset-property">namespace name</emph> properties.</p></item>
</ulist>

<p>Other information items and properties made available by the
Infoset processor are ignored.  In addition to the properties above,
the following properties are required from the PSVI if the data model is
constructed from a PSVI:</p>

<ulist>
  <item><p><emph role="infoset-property">validity</emph>,
  <emph role="infoset-property">type definition</emph>,
  <emph role="infoset-property">type definition namespace</emph>,
  <emph role="infoset-property">type definition name</emph>,
  <emph role="infoset-property">type definition anonymous</emph>,
  <emph role="infoset-property">member type definition</emph>,
  <emph role="infoset-property">member type definition namespace</emph>,
  <emph role="infoset-property">member type definition name</emph>,
  <emph role="infoset-property">member type definition anonymous</emph> and
  <emph role="infoset-property">schema normalized value</emph> properties on
  <emph role="info-item">Element Information Items</emph>.</p></item>

  <item><p><emph role="infoset-property">validity</emph>,
  <emph role="infoset-property">type definition</emph>,
  <emph role="infoset-property">type definition namespace</emph>,
  <emph role="infoset-property">type definition name</emph>,
  <emph role="infoset-property">type definition anonymous</emph>,
  <emph role="infoset-property">member type definition</emph>,
  <emph role="infoset-property">member type definition namespace</emph>,
  <emph role="infoset-property">member type definition name</emph>,
  <emph role="infoset-property">member type definition anonymous</emph> and
  <emph role="infoset-property">schema normalized value</emph> properties on
  <emph role="info-item">Attribute Information Items</emph>.</p></item>
</ulist>
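
<p>As an informal illustration, which adds no requirements, consider the
small fragment below. The element name, the attribute name, and the values
are hypothetical; only the property names are taken from the lists above.
The comment sketches the values an Infoset processor would be expected to
report for the element and its attribute.</p>

<eg><![CDATA[<cat:item xmlns:cat="http://www.example.com/catalog" code="A-10">Widget</cat:item>

<!-- Element information item for cat:item (hypothetical):
       local name            "item"
       namespace name        "http://www.example.com/catalog"
       attributes            the "code" attribute only; the xmlns:cat
                             declaration is not a member of this property
       in-scope namespaces   includes the binding for the prefix "cat"
       children              six character information items (W, i, d, g, e, t)

     Attribute information item for code (hypothetical):
       local name            "code"
       namespace name        no value (the attribute is unprefixed)
       normalized value      "A-10"
       owner element         the cat:item element information item -->]]></eg>

<p>If the same fragment were validated against a schema that, by assumption,
declares the <code>code</code> attribute as <code>xs:string</code>, the PSVI
would be expected to add properties such as
<emph role="infoset-property">validity</emph> (<quote>valid</quote>),
<emph role="infoset-property">type definition name</emph> (<quote>string</quote>), and
<emph role="infoset-property">type definition namespace</emph>
(<quote>http://www.w3.org/2001/XMLSchema</quote>) to that
<emph role="info-item">Attribute Information Item</emph>.</p>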

</div1>

<div1>
<head>Error Summary</head>

<error-list>

<error class="TY" code="0001" label="undefined type" type="static">
<p>This error is raised whenever an accessor is called for a property that
is undefined.</p>
</error>
</error-list>
</div1>

<div1 id="references">
<head>References</head>

<div2 id="normative-references">
<head>Normative References</head>

<blist>

<!--FIXME: update ../etc/tr with the latest TR page info! -->

<bibl id="xml-infoset"     key="Infoset"/>
<bibl id="REC-xml-names"   key="Namespaces in XML"/>
<bibl id="xml-names11"     key="Namespaces in XML 1.1"/>
<bibl id="xpath20"         key="XPath 2.0"/>
<bibl id="xpath-functions" key="Functions and Operators"/>
<bibl id="xmlschema-1"     key="Schema Part 1"/>
<bibl id="xmlschema-2"     key="Schema Part 2"/>
<bibl id="xslt-xquery-serialization" key="Serialization"/>
<bibl id="xquery-semantics" key="Formal Semantics"/>
<bibl id="RFC2119"         key="RFC 2119"/>
<bibl id="charmod"         key="Character Model"/>

</blist>
</div2>

<div2 id="informative-references">
<head>Other References</head>

<blist>

<bibl id="XQDM00"          key="XML Query Data Model">
<titleref href="http://www.w3.org/TR/2001/WD-query-datamodel-20010215/"
>XML Query Data Model</titleref>,
Mary Fernández and Jonathan Robie, Editors.
World Wide Web Consortium,
15 Feb 2001.
</bibl>

<bibl id="xmlbase"         key="XML Base"/>
<bibl id="xpath"           key="XPath 1.0"/>
<bibl id="xpath20req"      key="XPath 2.0 Requirements"/>
<bibl id="xslt20"          key="XSLT 2.0"/>

<bibl id="XQWG" key="XML Query Working Group">
<titleref href="http://www.w3.org/XML/Query"
>XML Query Working Group</titleref>,
World Wide Web Consortium.
Home page: http://www.w3.org/XML/Query
</bibl>

<bibl id="XSLWG" key="XSL Working Group">
<titleref href="http://www.w3.org/Style/XSL/"
>XSL Working Group</titleref>,
World Wide Web Consortium.
Home page: http://www.w3.org/Style/XSL/
</bibl>

<bibl id="xquery" key="XQuery"/>
<bibl id="xquery-requirements" key="XML Query Requirements"/>

<bibl id="ISO8601" key="ISO 8601">ISO (International Organization for Standardization).
<emph>Representations of dates and times, 2000-08-03.</emph>  
Available from: <loc href="http://www.iso.ch/">http://www.iso.ch/</loc>
</bibl>

</blist>
</div2>
</div1>

<inform-div1 id="glossary">
<head>Glossary</head>
<?glossary?>
</inform-div1>

<inform-div1 id="example">
<head>Example</head>

<ednote>
<edtext>This appendix does not exactly reflect the current state of
the documents. It will be updated before the next publication.</edtext>
</ednote>

<p>The following XML document is used to illustrate the information
contained in a data model:</p>

<eg>&dm-example.xml;</eg>

<p>The document is associated with the URI
<quote>http://www.example.com/catalog.xml</quote>,
and is valid with respect to the following XML schema:</p>

<eg>&dm-example.xsd;</eg>

<p>This example exposes the data model for a document that has an associated
schema and has been validated successfully against it.
In general, an XML Schema is not required; that is,
the data model can also represent a schemaless, well-formed XML
document using the rules described in <specref ref="types"/>.</p>

<p>The XML document is represented by the nodes described below.
The value <emph>D1</emph> represents a &documentNode;;
the values <emph>E1, E2, etc.</emph> represent &elementNode;s;
the values <emph>A1, A2, etc.</emph> represent &attributeNode;s;
the values <emph>N1, N2, etc.</emph> represent &namespaceNode;s;
the values <emph>P1, P2, etc.</emph> represent &processingInstructionNode;s;
and the values <emph>T1, T2, etc.</emph> represent &textNode;s.</p>

<p>For brevity:</p>

<ulist>
<item><p>&textNode;s in the data model that contain only white space are not shown.</p>
</item>
<item><p>Literal strings are shown in quotes without the <code>xs:string()</code>
constructor.
</p></item>
<item><p>Literal decimals are shown without the <code>xs:decimal()</code>
constructor.
</p></item>
<item><p>Nodes are referred to using the syntax <code>[nodeID]</code>.
</p></item>
<item><p>xs:QNames are used with the following prefixes:</p>

<table border="0" summary="Namespace prefixes">
<tbody>
  <tr>
    <td>xs</td><td>http://www.w3.org/2001/XMLSchema</td>
  </tr>
  <tr>
    <td>xsi</td><td>http://www.w3.org/2001/XMLSchema-instance</td>
  </tr>
  <tr>
    <td>cat</td><td>http://www.example.com/catalog</td>
  </tr>
  <tr>
    <td>xlink</td><td>http://www.w3.org/1999/xlink</td>
  </tr>
  <tr>
    <td>html</td><td>http://www.w3.org/1999/xhtml</td>
  </tr>
</tbody>
</table>

</item>
<item><p>The abbreviation <quote><code>\n</code></quote> is used in string literals
to represent a newline character. XPath string literals do not support this
escape, but it makes the presentation clearer.</p></item>
<item><p>Accessors that return the empty sequence have been omitted.</p>
</item>
<item><p>To simplify the presentation, this example assumes an implementation
that does not expose the namespace axis. Therefore,
&namespaceNode;s are shared across multiple elements.
See <specref ref="NamespaceNode"/>.</p>
</item>
</ulist>

&dm-example.tbl;

<p>A graphical representation of the data model for the preceding
example is shown below. Document order in this representation can be
found by following the traditional pre-order, left-to-right,
depth-first traversal, in which each node is visited before its children.
Because the image has been rotated for easier presentation, however,
the traversal appears to run bottom-to-top rather than left-to-right.</p>

<table border="0" cellspacing="0" summary="Graphic">
  <tbody>
    <tr>
      <td><graphic source="dm-example.png"
                   alt="Graphical depiction of the example data model."/>
      </td>
    </tr>
    <tr>
      <td>Graphical representation of the data model.
[<loc href="dm-example-large.png">large view</loc>,
<loc href="dm-example.svg">SVG</loc>]
      </td>
    </tr>
  </tbody>
</table>

</inform-div1>

<!-- removed reference to dm-issues-list.xml -->

</back>
</spec>
