<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type="text/xsl" href="xmlspec.xsl"?>
<!DOCTYPE spec PUBLIC "-//W3C//DTD Specification V2.1//EN" "xmlspec.dtd" [
<!ENTITY day "10">
<!ENTITY month "July">
<!ENTITY monthno "07">
<!ENTITY year "2002">
<!ENTITY trpath "http://www.w3.org/TR/">
<!ENTITY W3C-path "&trpath;&year;/">
<!ENTITY doctype "WD">
<!ENTITY isodate "&year;&monthno;&day;">
<!ENTITY schemename "xpointer">
<!ENTITY shortname "&schemename;">
<!ENTITY dirname "&W3C-path;&doctype;-xptr-&shortname;-&isodate;/">
<!ENTITY scheme-phrase "<code>&schemename;()</code> scheme">
<!ENTITY media-types "one of <code>text/xml</code>, 
<code>application/xml</code>, 
<code>text/xml-external-parsed-entity</code>,
or <code>application/xml-external-parsed-entity</code>">
<!ENTITY XLink "http://www.w3.org/TR/xlink">
<!ENTITY XML "http://www.w3.org/TR/REC-xml">
<!ENTITY abstract 'The XPointer <code>&schemename;()</code> scheme is intended to be used with the XPointer Framework <bibref ref="xptr-framework"/> to provide full XML addressing functionality.'>
<!ENTITY XMLNames "http://www.w3.org/TR/REC-xml-names">
<!ENTITY XPath "http://www.w3.org/TR/xpath">
<!ENTITY Issues "http://www.w3.org/XML/Group/1999/07/LinkingIssueList">
<!ENTITY RFC2396 "http://www.rfc-editor.org/rfc/rfc2396.txt">
<!ENTITY xptr-framework-file "&W3C-path;&doctype;-xptr-framework-&isodate;/">
]>
<spec w3c-doctype="wd">
<header>
<title>XPointer &schemename;() Scheme</title>
<w3c-designation>&doctype;-xptr-&shortname;-&isodate;</w3c-designation>
<w3c-doctype>W3C Working Draft</w3c-doctype>
<pubdate><day>&day;</day><month>&month;</month><year>&year;</year></pubdate>
<publoc><loc href="&dirname;">&dirname;</loc></publoc>
<latestloc><loc href="&trpath;xptr-&shortname;/">&trpath;xptr-&shortname;/</loc></latestloc>
<prevlocs><loc href="http://www.w3.org/TR/2001/CR-xptr-20010911/">http://www.w3.org/TR/2001/CR-xptr-20010911/</loc></prevlocs>
<otherlocs>This document is also available in the following non-normative format: <loc href="&dirname;&shortname;.xml">XML</loc> (<loc href="&dirname;xmlspec.dtd">DTD</loc>, <loc href="&dirname;xmlspec.xsl">XSL</loc>).
</otherlocs>
<authlist>
<author><name>Steven DeRose</name><affiliation>Brown University Scholarly
Technology Group</affiliation><email href="mailto:Steven_DeRose@Brown.edu">Steven_DeRose@Brown.edu</email>
</author>
<author><name>Eve Maler</name><affiliation>Sun Microsystems</affiliation>
<email href="mailto:eve.maler@sun.com">eve.maler@sun.com</email></author>
<author><name>Ron Daniel Jr.</name><affiliation>Interwoven</affiliation><email
href="mailto:rdaniel@interwoven.com">rdaniel@interwoven.com</email></author>
</authlist>
<abstract>
<p>&abstract;</p>
</abstract>
<status>
<p>This is a W3C Working Draft for review by W3C members and other interested
parties. It is a draft document and may be updated, replaced, or obsoleted
by other documents at any time. It is inappropriate to use W3C Working Drafts
as reference material or to cite them as other than <quote>work in progress.</quote> Comments
on this document should be sent to the public
mailing list <loc href="mailto:www-xml-linking-comments@w3.org">www-xml-linking-comments@w3.org</loc> (<loc
href="http://lists.w3.org/Archives/Public/www-xml-linking-comments/">archive</loc>).</p>
 <p>This document has been produced by the <loc href="http://www.w3.org/XML/Linking">W3C XML Linking Working
Group</loc> as part of the <loc href="http://www.w3.org/XML/Activity">XML
Activity</loc>.  The goals of this work are set out in the <loc href="http://www.w3.org/TR/NOTE-xptr-req">XPointer
Requirements</loc> document.</p>
 <p>There are patent disclosures and license commitments associated with
this working draft, which may be found on the <loc href="http://www.w3.org/2002/06/xptr_IPR_summary.html">XPointer IPR Statement</loc> page in conformance with <loc href="http://www.w3.org/Consortium/Process-20010719/#ipr">W3C policy</loc>.</p>
<p>This specification is being published as an interim Working Draft in order
to show how the &scheme-phrase; portion of the <loc href="http://www.w3.org/TR/2001/CR-xptr-20010911/">XPointer
Candidate Recommendation</loc> published on 11 September 2001 fits into the
current XPointer Framework. This draft does not include any work on disposition
of comments that were reported on the Candidate Recommendation draft.</p>
 <p>A list of current W3C Recommendations and other technical documents can be
found at <loc href="http://www.w3.org/TR/">http://www.w3.org/TR/</loc>.</p>
</status>
<langusage>
<language id="en">English</language>
<language id="ebnf">Extended Backus-Naur Form (formal grammar)</language>
</langusage>
<revisiondesc>
<slist>
<sitem>20020522: Initial draft.</sitem>
</slist>
</revisiondesc>
</header>
<body>
<div1>
<head>Introduction </head>
<p>&abstract; This scheme supports addressing into the internal structures
of XML documents and external parsed entities. It allows for examination of
a document's hierarchical structure and choice of its internal parts based
on various properties, such as element types, attribute values, character
content, and relative position. In particular, it provides for specific reference
to elements, character strings, and other XML information, whether or not
they bear an explicit ID attribute.</p>
<p>The &scheme-phrase; is built on top of the XML Path Language <bibref ref="XPath"/>,
which is an expression language underlying the XSL Transformations (XSLT)
language. The &scheme-phrase;'s extensions to XPath allow it to address points
and ranges as well as whole nodes, and to locate information by string matching.</p>
<p>The &scheme-phrase; does not cover addressing into the internal structures
of DTDs or the XML declaration.</p>
<div2>
<head>Origin and Goals </head>
<p>In addition to XPath, the following standards have been especially influential
in the development of this specification:</p>
<ulist>
<item><p><emph>HTML</emph> <bibref ref="html"/>: This system popularized an
important location specifier type, the URL (now URI).</p></item>
<item><p><emph>HyTime</emph> <bibref ref="iso10744"/>: This ISO standard defines
location specifier types for all kinds of data.</p></item>
<item><p><emph>Text Encoding Initiative Guidelines</emph> <bibref ref="tei"/>:
This application provides a formal syntax for <quote>extended pointers,</quote> locators
for structured markup that underlie the initial design for the &scheme-phrase;.</p>
</item>
</ulist>
<p>The addressing components of many other hypertext systems have also informed
the design of the &scheme-phrase;, especially <bibref ref="dexter"/>, <bibref
ref="ohs"/>, <bibref ref="fress"/>, <bibref ref="microcosm"/>, and <bibref
ref="intermedia"/>.</p>
<p>See the XPointer Requirements Document <bibref ref="xpreq"/> for a thorough
explanation of requirements for the design of the &scheme-phrase;.</p>
</div2>
<div2>
<head>Notation and Document Conventions </head>
<p><termdef id="dt-must" term="Must, May, etc.">The key words <term>must</term>, <term>must
not</term>, <term>required</term>, <term>shall</term>, <term>shall not</term>, <term>should</term>, <term>should
not</term>, <term>recommended</term>, <term>may</term>, and <term>optional</term> in
this specification are to be interpreted as described in <bibref ref="rfc2119"/>.</termdef></p>
<p>The terms pointer, pointer part, scheme, XPointer processor, application,
error, failure, and namespace binding context are used in this specification
as <xspecref href="&xptr-framework-file;#terminology">defined</xspecref> in
the XPointer Framework specification. Note that errors defined by this specification
are distinct from XPointer Framework errors.</p>
<p>The formal grammar for the &scheme-phrase; is given using simple Extended
Backus-Naur Form (EBNF) notation, as described in the XML Recommendation <bibref
ref="XML"/>.</p>
<p>The prototypes for &scheme-phrase; functions are given using the same notation
used in the <bibref ref="XPath"/> Recommendation.</p>
<p>This specification contains some explanatory text on the XPath language;
however, such text is non-normative. In the case of conflicts, <bibref ref="XPath"/> is
normative.</p>
</div2>
</div1>
<div1>
<head>Terms and Concepts </head>
<p>Some special terms are defined here in order to clarify their relationship
to similar terms used in the technologies on which the &scheme-phrase; is
based. Additional terms specific to the &scheme-phrase; are defined in the
flow of the text. Refer to <bibref ref="XPath"/>, <bibref ref="dom2"/>, <bibref
ref="Infoset"/>, and <bibref ref="rfc2396"/> for definitions of other technical
terms used in this specification.</p>
<glist>
<gitem><label>point</label>
<def>
<p>A position in XML information. This notion is defined fully later (see <termref
def="dt-point">point</termref>), and comes from the DOM Level 2 <bibref ref="dom2"/> specification's
notion of positions; this specification refers to DOM positions by the term <quote>point</quote> to
avoid confusion with XPath positions.</p>
</def></gitem>
<gitem><label>range</label>
<def>
<p>An identification of all the XML information between a pair of points.
This notion is defined fully later (see <termref def="dt-range">range</termref>),
and comes from the DOM Level 2 <bibref ref="dom2"/> specification.</p>
</def></gitem>
<gitem><label><termdef id="dt-location" term="Location">location</termdef></label>
<def>
<p>A generalization of XPath's <term>node</term> that includes points and
ranges in addition to nodes.</p>
</def></gitem>
<gitem><label><termdef id="dt-locset" term="Location 
set">location-set</termdef></label>
<def>
<p>An unordered list of locations, such as that produced by an &scheme-phrase; expression.
This corresponds to the <term>node-set</term> that is produced by XPath expressions,
except for the generalization to include points and ranges. Just as for a
node-set, a location-set is treated as having a specific order depending on
the axis that is operating on it. However, this ordering depends on the &scheme-phrase;'s
extended notion of document order as defined in <specref ref="document-order-sec"/>,
rather than XPath's original notion of document order.</p>
</def></gitem>
</glist>
</div1>
<div1 id="conformance">
<head>Conformance </head>
<p>This specification normatively depends on the XPointer Framework <bibref
ref="xptr-framework"/> specification and the XPath <bibref ref="XPath"/> Recommendation.
It also normatively depends on the XPointer <code>xmlns()</code> scheme specification <bibref
ref="xptr-xmlns"/>; XPointer processors claiming to conform to this specification
must also conform to the <code>xmlns()</code> specification.</p>
<p>Scheme data for the &scheme-phrase; conforms to this specification if it
does not cause an error as described in this specification.</p>
<p>Conforming XPointer processors claiming to support the &scheme-phrase; must
conform to the behavior defined in this specification and may conform to additional
XPointer scheme specifications.</p>
<p>Should the need arise to refer to the namespace for objects defined by this
specification, the normative namespace URI for the &scheme-phrase; is <code>http://www.w3.org/2001/05/XPointer</code>.</p>
</div1>
<div1 id="model">
<head>Language and Processing</head>
<p>XPath expressions work with a data set that is derived from the elements
and other markup constructs of an XML document. The &scheme-phrase; model
augments this data set. Both <code>&schemename;()</code> expressions and XPath
expressions operate by selecting portions of such data sets, often by their
structural relationship to other parts (for example, the parent of a node
with a certain ID value). Expressions in <code>&schemename;()</code> use iterative
selections, each operating on what is found by the prior one.</p>
<p>Selection of portions of the information hierarchy is done through axes,
predicates, and functions. An axis defines a sequence of candidates that might
be located; predicates then test for various criteria relative to such portions;
and functions generate new candidates or perform various other tasks. For
example, one can select certain elements from among the siblings of some previously
located element, based on whether those sibling elements have an attribute
with a certain value, or are of a certain type such as <quote>footnote</quote>;
or select the point location immediately preceding a certain <quote>para</quote>.</p>
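<note>
<p>As a non-normative sketch of the selections just described, assume a hypothetical
document containing an element whose ID-type attribute has the value <code>sec2</code>,
sibling elements that carry a <code>role</code> attribute, and <code>footnote</code> elements
(none of these names is defined by this specification). The first expression uses an axis
and a predicate to select siblings by attribute value; the second selects siblings by
element type.</p>
<eg>xpointer(id('sec2')/following-sibling::*[@role='summary'])
xpointer(id('sec2')/following-sibling::footnote)</eg>
</note>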
<div2>
<head>Syntax</head>
<p>This section describes the syntax and semantics of the &scheme-phrase; and
the behavior of XPointer processors with respect to this scheme.</p>
<p>The scheme name is <quote>&schemename;</quote>. The scheme data syntax
is as follows; if scheme data in a pointer part with the &scheme-phrase; does
not conform to the syntax defined in this section, it is an error and the
pointer part fails.</p>
<scrap>
<head>xpointer() Scheme Syntax</head>
<prod id="NT-xpointerschemedata">
<lhs>xpointerschemedata</lhs><rhs><xnt href="&XPath;#NT-Expr">Expr</xnt></rhs>
</prod>
</scrap>
<p><xnt href="&XPath;#NT-Expr">Expr</xnt> is as defined in the XPath Recommendation <bibref
ref="XPath"/>, with the extensions defined in this specification.</p>
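<note>
<p>For illustration only, assume a hypothetical document whose document element is
<code>book</code>. In the first pointer part below, the scheme name is <code>xpointer</code> and
the scheme data is the XPath expression <code>/book/chapter[2]/title</code>. The scheme data of
the second pointer part does not match Expr because its predicate is left unclosed, so that
pointer part is in error and fails.</p>
<eg>xpointer(/book/chapter[2]/title)
xpointer(/book/chapter[)</eg>
</note>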
</div2>
<div2 id="terminology">
<head>Additions to XPath Terms and Concepts </head>
<p>The &scheme-phrase; extends XPath by adding the following:</p>
<ulist>
<item><p>Two new location types, <code>point</code> and <code>range</code>,
corresponding to DOM positions and ranges, that can appear in location-set
results; also tests (akin to node tests) for these location types.</p></item>
<item><p>A generalization of the XPath concepts of nodes, node types, and
node-sets to the &scheme-phrase; concepts of <termref def="dt-location">locations</termref> (which
subsume nodes, <termref def="dt-point">points</termref>, and <termref def="dt-range">ranges</termref>),
and corresponding location types and location-sets.</p></item>
<item><p>Rules for establishing the XPath evaluation context.</p></item>
<item><p>The functions <function>string-range</function> and <function>range-to</function>,
which return the range location type for selections that are not single XML
nodes.</p></item>
<item><p>The functions <function>here</function> and <function>origin</function>,
to provide for addressing relative to the location of an &scheme-phrase; expression
itself, and to the point of origin for hypertext traversal when expressions
are used in that (very common) application domain.</p></item>
<item><p>The functions <function>start-point</function> and <function>end-point</function>,
to address the beginning and ending locations which bound another location
such as a node or range.</p></item>
<item><p>Allowance (as in <bibref ref="XSLT"/>) for the root node to have
multiple child elements, to allow expressions to address into arbitrary external
parsed entities as well as well-formed documents.</p></item>
<item><p>The functions <function>range</function> and <function>range-inside</function>,
to address the covering range of location sets.</p></item>
</ulist>
<p>XPath provides for locating any subset of the <emph>nodes</emph> in an
XML document or external parsed entity. XPath functionality, such as filtering
the output of an axis by a predicate, is generally defined in terms of operations on
nodes and node-sets.</p>
<p>The &scheme-phrase; has a requirement to identify document and entity portions
that are not nodes in this sense. One example of such a non-node region is
an arbitrary user selection indicated by a drag between two points in a document.
The two points might have different sets of ancestors in the hierarchy, or
the region might form only a small part of a node. For example, a range could
be a single letter or could extend from the middle of one paragraph to the
middle of the next, thus containing only part of the relevant paragraphs and
text nodes. Even though such locations are not nodes, the &scheme-phrase; needs
to be able to apply XPath operations to them as well as to nodes.</p>
<p>To accomplish this, the &scheme-phrase; defines <term>location</term> as
a generalization of XPath's <term>node</term>. Every location is either a <termref
def="dt-point">point</termref>, a <termref def="dt-range">range</termref>,
or an XPath node. Thus, the &scheme-phrase; also defines <term>location-set</term> as
a generalization of XPath's node-set. All locations generated by XPath constructs
are nodes; constructs defined by the &scheme-phrase; can also identify points
and ranges.</p>
<note>
<p>The order of characters displayed on a computer screen might not reflect
their order in the underlying XML document, for example, when a portion of
a right-to-left language such as Arabic is embedded in a left-to-right language
such as French. For expressions that identify ranges of strings, the document
order is used, not the display order. Thus, an expression for a single range
might be displayed non-contiguously, and conversely a user selection of an
apparent single range might correspond to multiple non-contiguous ranges in
the underlying document.</p>
</note>
</div2>
<div2 id="context">
<head>Evaluation Context Initialization </head>
<p>An &scheme-phrase; expression in a pointer part is evaluated to yield an
object of type location-set. This evaluation is carried out within a context
identical to the XPath evaluation context, except for the generalization of
nodes to locations. XPointer processors <termref def="dt-must">must</termref> initialize
the evaluation context as described in this section before evaluating an expression.</p>
<p>The evaluation context contains the following information:</p>
<ulist>
<item><p>A location (the <term>context location</term>), initialized to the
root node of an XML document or external parsed entity.</p></item>
<item><p>A non-zero context position, initialized to 1.</p></item>
<item><p>A non-zero context size, initialized to 1. (At the start, the only
location in the current location list is the context location.)</p></item>
<item><p>A set of variable bindings. No means for initializing these is defined
for XPointer processors. Thus, the set of variable bindings used when evaluating
an expression is empty, and use of a variable reference in an expression results
in failure of the pointer part.</p></item>
<item><p>A library of functions. Only functions defined in XPath or this specification
can be used in expressions. An expression that uses other functions results
in failure of the pointer part.</p></item>
<item><p>A namespace binding context consisting of the initial context defined
in the XPointer Framework specification and additional contributions made
by pointer parts having the <code>xmlns()</code> scheme to the left of the
current pointer part.</p></item>
<item><p>When applicable, properties for the locations that the <function>origin</function> and <function>here</function> functions
return.</p></item>
</ulist>
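<note>
<p>A non-normative sketch of how the evaluation context affects a pointer, assuming a
hypothetical namespace URI <code>http://example.org/ns</code> and hypothetical element names.
In the first pointer, the <code>xmlns()</code> part to the left contributes the binding for the
prefix <code>x</code> that the &scheme-phrase; part then uses. The second pointer part fails,
because it contains a variable reference and the set of variable bindings is empty.</p>
<eg>xmlns(x=http://example.org/ns) xpointer(//x:item[1])
xpointer(//para[$n])</eg>
</note>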
</div2>
<div2 id="datatypes">
<head>The <code>point</code> and <code>range</code> Location Types </head>
<p>To address non-node locations, the &scheme-phrase; defines two new location
types, <code>point</code> and <code>range</code>, that can appear in location-sets
and can be operated on by XPath node tests and predicates. Locations of the <code>point</code> and <code>range</code> type
represent positions and ranges as in DOM Level 2 <bibref ref="dom2"/>. This
section defines the <code>point</code> and <code>range</code> types and their
characteristics required for XPath interoperability.</p>
<note>
<p>Unlike DOM Level 2, which is based on UTF-16 units, XPath and the &scheme-phrase; are
based on UCS characters. So while the concepts of points and ranges are based
on the DOM 2 notions of positions and ranges, there are differences in detail.
For example, a sequence which in DOM counts as two characters might count
in the &scheme-phrase; as one character.</p>
</note>
<p>Points and ranges can be used as context locations in the &scheme-phrase;.
This allows the <code>[]</code> operator to be used to select from sets of
ranges. Also, a point as a context location, when followed by a <function>range-to</function> function,
selects a range.</p>
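<note>
<p>As a non-normative illustration, the expressions below assume hypothetical
<code>title</code> elements and ID values <code>a</code> and <code>b</code>, and rely on the
<function>string-range</function>, <function>start-point</function>, and
<function>range-to</function> functions defined elsewhere in this specification. The first
applies a predicate to a set of ranges and keeps only the second range; the second uses a
point as the context location of <function>range-to</function>.</p>
<eg>xpointer(string-range(//title,"XML")[2])
xpointer(start-point(id('a'))/range-to(id('b')))</eg>
</note>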
<div3>
<head>Definition of Point Location </head>
<p><termdef id="dt-point" term="Point">A location of type <term>point</term> is
defined by</termdef> <termdef id="dt-container-node-index" term="Container Node and Index">a
node, called the <term>container node</term>, and a non-negative integer,
called the <term>index</term>.</termdef> It can represent the location preceding
or following any individual character, or preceding or following any node
in the data set constructed from an XML document or external parsed entity.
Two points are identical if they have the same container node and index. </p>
<note>
<p> This specification does not constrain the implementation of points; applications
need not actually represent points using data structures consisting of a node
and an index.</p>
<p>Also note that, while some nodes containing points have explicit boundaries
(such as element start-tags and end-tags), the boundaries of text nodes are
implicit. Applications that present a graphical user interface for the selection
or rendering of points and ranges need to take into consideration the fact
that some seemingly identical points, such as the points just inside and just
outside the closing boundary of a text node inside an element, are in fact
distinguished.</p>
</note>
<p><termdef id="dt-node-point" term="Node point">When the container node of
a point is of a node type that can have child nodes (that is, when the container
node is an element node or a root node), then the index is an index into the
child nodes; such a point is called a <term>node-point</term>.</termdef> The
index of a node-point <termref def="dt-must">must</termref> be greater than
or equal to zero and less than or equal to the number of child nodes of the
container. An index of zero indicates the point before any child nodes, and
a non-zero index <var>n</var> indicates the point immediately after the <var>n</var>th
child node.</p>
<note>
<p>The zero-based counting of node-points differs from the one-based counting
of <function>string-range</function> and other XPath and &scheme-phrase; functions.</p>
</note>
<p><termdef id="dt-character-point" term="Character point">When the container
node of a point is of a node type that cannot have child nodes (i.e., text
nodes, comments, processing instructions, attribute nodes, and namespace nodes),
then the index is an index into the characters of the string-value of the
node; such a point is called a <term>character-point</term>.</termdef> The
index of a character-point <termref def="dt-must">must</termref> be greater
than or equal to zero and less than or equal to the length of the string-value
of the node. An index of zero indicates a point immediately before the first
character of the string-value, and a non-zero index <var>n</var> indicates
the point immediately after the <var>n</var>th character of the string-value.</p>
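<note>
<p>As a non-normative worked example, consider the hypothetical fragment below. The
<code>p</code> element has three child nodes: the text node <quote>Hi </quote> (three
characters, including the trailing space), the <code>em</code> element, and the text node
<quote>!</quote>. The table that follows lists some of the node-points and character-points
that result from the definitions above.</p>
<eg>&lt;p&gt;Hi &lt;em&gt;there&lt;/em&gt;!&lt;/p&gt;

container node     index   point identified
p                  0       before the text node "Hi "
p                  1       between the text node "Hi " and the em element
p                  2       between the em element and the text node "!"
p                  3       after the text node "!"
text node "Hi "    0       before "H"
text node "Hi "    1       between "H" and "i"
text node "Hi "    3       after the trailing space</eg>
</note>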
<p>A point location does not have an expanded-name.</p>
<p>The <xtermref href="&XPath;#dt-string-value">string-value</xtermref> of
a point location is empty.</p>
<p> The axes of a point location are defined as follows: </p>
<ulist>
<item><p>The <kw>child</kw>, <kw>descendant</kw>, <kw>preceding-sibling</kw>, <kw>following-sibling</kw>, <kw>preceding</kw>, <kw>following</kw>, <kw
>attribute</kw>, and <kw>namespace</kw> axes are empty.</p></item>
<item><p>The <kw>descendant-or-self</kw> axis contains the point itself.</p>
</item>
<item><p>The <kw>self</kw> axis contains the point itself.</p></item>
<item><p>The <kw>parent</kw> axis contains the point's container node.</p>
</item>
<item><p>The <kw>ancestor</kw> axis contains the point's container node and
its ancestors.</p></item>
<item><p>The <kw>ancestor-or-self</kw> axis contains the point itself, the
point's container node, and its ancestors.</p></item>
</ulist>
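<note>
<p>For illustration (non-normative), take the hypothetical fragment
<code>&lt;p&gt;Hi &lt;em&gt;there&lt;/em&gt;!&lt;/p&gt;</code> as the whole document, and consider
the character-point whose container node is the text node <quote>there</quote> and whose
index is 2. Applying the definitions above gives the following axis contents.</p>
<eg>self               the point
parent             the text node "there"
ancestor           the text node "there", em, p, the root node
ancestor-or-self   the point, the text node "there", em, p, the root node
descendant-or-self the point
child, following   (empty)</eg>
</note>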
</div3>
<div3>
<head>Definition of Range Location </head>
<p>A location of type <termdef id="dt-range" term="Range">range</termdef> is
defined by two points, a start point and an end point. A range represents
all of the XML structure and content between the start point and end point.
This is distinct from any list of nodes and/or characters, in part because
some nodes might be only partly included. The start point and end point of
a range <termref def="dt-must">must</termref> be in the same document or external
parsed entity. The start point <termref def="dt-must">must</termref> not appear
after the end point in document order (see <specref ref="document-order-sec"/>).</p>
<p><termdef id="dt-collapsed-range" term="Collapsed range">A range whose start
point and end point are equal is a <term>collapsed range.</term></termdef></p>
<p>If the container node of one point of a range is a node of a type other
than element, text, or root, the container node of the other point of the
range <termref def="dt-must">must</termref> be the same node. For example,
it is allowed to specify a range from the start of a processing instruction
to the end of an element, but not to specify a range from text inside a processing
instruction to text outside it.</p>
<p>A range location does not have an expanded-name.</p>
<p>The <xtermref href="&XPath;#dt-string-value">string-value</xtermref> of
a range location consists of the characters that are in text nodes and that
are between the start point and end point of the range.</p>
<p>The axes of a range location are identical to the axes of its start point.
For example, the <kw>parent</kw> axis of a range contains the parent of the
start point of the range.</p>
<note>
<p>The <function>start-point</function> and <function>end-point</function> functions
can be used to navigate with respect to the boundaries of a range location.</p>
</note>
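<note>
<p>As a non-normative worked example, consider the hypothetical fragment below and the
range whose start point and end point are the two character-points listed. The range
includes only part of the <code>em</code> element. Its string-value is made up of the
characters of the two text nodes that lie between the points, and its
<kw>parent</kw> axis contains the text node <quote>Hi </quote>, which is the container
node of its start point.</p>
<eg>&lt;p&gt;Hi &lt;em&gt;there&lt;/em&gt;!&lt;/p&gt;

start point    container node: text node "Hi "     index: 1
end point      container node: text node "there"   index: 2
string-value   "i th"</eg>
</note>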
</div3>
<div3>
<head>Covering Ranges for All Location Types </head>
<p><termdef id="dt-covering-range" term="Covering range">A <term>covering
range</term> is a range that wholly encompasses a location. For every location,
a covering range is defined as follows:</termdef></p>
<ulist>
<item><p>For a range location, the covering range is identical to the range.</p>
</item>
<item><p>For a point location, the start and end points of the covering range
are the point itself.</p></item>
<item><p>For an attribute or namespace location, the container node of the
start point and end point of the covering range is the attribute or namespace
location; the index of the start point of the covering range is 0; and the
index of the end point of the covering range is the length of the string-value
of the attribute or namespace location.</p></item>
<item><p>For the root location, the container node of the start point and
end point of the covering range is the root node; the index of the start point
of the covering range is 0; and the index of the end point of the covering
range is the number of children of the root location.</p></item>
<item><p>For any other kind of location, the container node of the start point
and end point of the covering range is the parent of the location; the index
of the start point of the covering range is the number of preceding sibling
nodes of the location; and the index of the end point is one greater than
the index of the start point.</p></item>
</ulist>
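<note>
<p>As a non-normative worked example, assume that the hypothetical fragment below is the
whole document, so that the root node has <code>p</code> as its only child. Applying the rules
above yields the covering ranges listed, each given as a container node with a start index
and an end index.</p>
<eg>&lt;p id="x1"&gt;Hi &lt;em&gt;there&lt;/em&gt;!&lt;/p&gt;

location            container node     start index   end index
root node           root node          0             1
p element           root node          0             1
text node "Hi "     p                  0             1
em element          p                  1             2
attribute id        attribute id       0             2</eg>
</note>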
</div3>
<div3>
<head>Tests for <code>point</code> and <code>range</code> Locations </head>
<p>The &scheme-phrase; extends the XPath production for <xnt href="&XPath;#NT-NodeType">NodeType</xnt> by
adding items for the <code>point</code> and <code>range</code> location types.
The production (number 38 in XPath) becomes as follows:</p>
<scrap>
<head>NodeType </head>
<prod id="NT-NodeType">
<lhs>NodeType</lhs><rhs>'comment'</rhs><rhs>| 'text'</rhs><rhs>| 'processing-instruction'</rhs>
<rhs>| 'node'</rhs><rhs>| 'point'</rhs><rhs>| 'range'</rhs>
</prod>
</scrap>
<p>This definition allows <xnt href="&XPath;#NT-NodeTest">NodeTest</xnt>s
to select locations of type <code>point</code> and <code>range</code> from
a location-set that might include locations of many types.</p>
</div3>
<div3 id="document-order-sec">
<head>Document Order</head>
<p>The &scheme-phrase; extends XPath's concept of <term>document order</term> to
cover point and range locations. The concept applies equally in external parsed
entities.</p>
<p>A point can be either a node-point or a character-point. Conceptually,
node-points label gaps between nodes, while character-points occur within
a node, between the node-points to the right and left of the node.</p>
<p>Thus, an element P has a node-point before it and after it. If the P element
contains sub-elements and/or text nodes, there are node-points for the gap
before the first child node, between each successive pair of child nodes,
and after the last child node; they are numbered in order from 0.</p>
<p>Within any text node, there are character-points preceding the first character
of the text, between each successive pair of characters, as well as after
the last character; they are numbered in order from 0.</p>
<p>For any point, there is an <termdef id="dt-immed-prec" term="immediately preceding node">immediately
preceding node</termdef> defined as follows (except that there is no point
defined preceding or following the root):</p>
<ulist>
<item><p>For a <termref def="dt-node-point">node-point</termref> with a non-zero
index <var>n</var>, the immediately preceding node is the <var>n</var>th child
of the node-point's container node.</p></item>
<item><p>For a node-point with a zero index, the immediately preceding node
is the <termref def="dt-container-node-index">container node</termref> unless
it has any attribute or namespace nodes. If the container node does have attribute
or namespace nodes, then the immediately preceding node is the last of those
attribute or namespace nodes (note that the order of attribute and namespace
nodes is implementation-dependent).</p></item>
<item><p>For a <termref def="dt-character-point">character-point</termref>,
the immediately preceding node is the container node of the character-point.</p>
</item>
</ulist>
<p>The following diagram illustrates the relation between <termref def="dt-container-node-index">container
nodes</termref>, <termref def="dt-node-point">node-points</termref> and <termref
def="dt-character-point">character-points</termref>.</p>
<graphic source="http://www.w3.org/TR/xptr/XPointer_diagram.png" alt="Tree-structured diagram of XML document fragment, illustrating character and node points"/>
<p>The document order of locations is specified here according to the location
types to be compared:</p>
<glist>
<gitem><label>Node and node</label>
<def>
<p>As defined by XPath.</p>
</def></gitem>
<gitem><label>Node and point</label>
<def>
<p>A node and point can never be equal in document order. A node is before
a point if the node is before or equal in document order to the immediately
preceding node of the point; otherwise, the node is after the point.</p>
</def></gitem>
<gitem><label>Node and range</label>
<def>
<p>A node and range can never be equal in document order. A node is before
a range if the node is before the start point of the range; otherwise the
node is after the range.</p>
</def></gitem>
<gitem><label>Point and point</label>
<def>
<p> Two points P1 and P2 are equal if their immediately preceding nodes are
equal and the indexes of the points are equal. P1 is before P2 if P1's immediately
preceding node is before P2's, or if their immediately preceding nodes are
equal and P1's index is less than P2's. Otherwise P1 is after P2.</p>
</def></gitem>
<gitem><label>Point and range</label>
<def>
<p> A point P is equal to a range R if R's start and end points are both equal
to P; otherwise P is before R if P is before or equal to the start point of
R; otherwise P is after R.</p>
</def></gitem>
<gitem><label>Range and range</label>
<def>
<p>Two ranges R1 and R2 are equal in document order if their start points
are equal and their end points are equal. R1 is before R2 if R1's start point
is before R2's start point or if R1's start point is equal to R2's and R1's
end point is before R2's; otherwise R1 is after R2.</p>
</def></gitem>
</glist>
<p>Note that one consequence of these rules is that a point can be treated
the same as the equivalent collapsed range.</p>
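<p>As a non-normative illustration of these rules, consider a hypothetical
text node <var>T</var> whose content is <quote>Hello</quote>, two character-points <var>A</var> and <var>B</var> inside
it, and a range <var>R</var> between them. These names are introduced only for this
sketch. Applying the comparisons above gives the following orderings:</p>
<eg>T   text node containing "Hello"
A   character-point, container node T, index 1
B   character-point, container node T, index 4
R   range with start point A and end point B

T before A   (T equals the immediately preceding node of A)
A before B   (same immediately preceding node; 1 is less than 4)
T before R   (T is before the start point of R)
A before R   (A equals the start point of R)
B after R    (B is neither equal to R nor before or equal to its start point)</eg>
<p>A collapsed range whose start and end points are both <var>A</var> compares
as equal to <var>A</var> itself.</p>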
</div3>
</div2>
<div2 id="xptr-functions">
<head>Functions Added by the &schemename;() Scheme</head>
<p>The &scheme-phrase; adds the following functions to those in XPath.</p>
<div3 id="rangeexprs">
<head><function>range-to</function> Function </head>
<proto name="range-to" return-type="location-set"><arg type="location-set"/>
</proto>
<p>For each location in the context, <function>range-to</function> returns
a range. The start point of the range is the start point of the context location
(as determined by the <function>start-point</function> function), and the
end point of the range is the end point (as determined by the <function>end-point</function> function)
of the location found by evaluating the <xnt href="&XPath;#NT-Expr">expression</xnt> argument
with respect to that context location.</p>
<p>The change made to the XPath syntax to support the <function>range-to</function> construct
corresponds to a single addition to the <xspecref href="http://www.w3.org/TR/xpath#section-Location-Steps">Step
production</xspecref> of the <bibref ref="XPath"/> specification. The original
production is as follows:</p>
<eg>[4] Step ::= AxisSpecifier NodeTest Predicate*
                | AbbreviatedStep</eg>
<p>The version in the &scheme-phrase; is as follows:</p>
<eg>[4xptr] Step ::= AxisSpecifier NodeTest Predicate*
                    | AbbreviatedStep
                    | 'range-to' '(' Expr ')' Predicate*</eg>
<p>This change is a single exception for the <function>range-to</function> function.
It is not a generic change and is not extensible to other functions. The modified
production expresses that a range computation must be made for each of the
locations in the current location list.</p>
<p>As an example of using the <function>range-to</function> function, the
following pointer part locates the range from the start point of the element
with ID <quote>chap1</quote> to the end point of the element with ID <quote>chap2</quote>.</p>
<eg>xpointer(id("chap1")/range-to(id("chap2")))</eg>
<p>As another example, imagine a document that uses empty elements (such as <code>&lt;REVST/></code> for
revision start and <code>&lt;REVEND/></code> for revision end) to mark the
boundaries of edits. The following pointer part would select, for each revision,
a range starting at the beginning of the <code>REVST</code> element and ending
at the end of the next <code>REVEND</code> element:</p>
<eg>xpointer(descendant::REVST/range-to(following::REVEND[1]))</eg>
</div3>
<div3 id="stringrange">
<head><function>string-range</function> Function </head>
<proto name="string-range" return-type="location-set"><arg type="location-set"/>
<arg type="string"/><arg type="number" occur="opt"/><arg type="number" occur="opt"/>
</proto>
<p>For each location in the <var>location-set</var> argument, <function>string-range</function> returns
a set of ranges determined by searching the <xtermref href="&XPath;#dt-string-value">string-value</xtermref> of
the location for substrings that match the <var>string</var> argument. An
empty string is defined to match before each character of the string-value
and after the final character. White space in a string is matched literally,
with no normalization except that provided by XML for line ends and attribute
values. Each non-overlapping match can contribute a range to the resulting
location set. </p>
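<p>As a sketch of the empty-string rule, assume a document whose only <code>title</code> element
has the two-character string-value <quote>V.</quote> (a hypothetical case chosen
for brevity). Under the definition above, the following expression would be
expected to find three matches (before <quote>V</quote>, before <quote>.</quote>,
and after <quote>.</quote>), each of which can contribute a collapsed range:</p>
<eg>string-range(//title,"")</eg>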
<p>The third argument gives the position of the first character to be in the
resulting range, relative to the start of the match. The default value is
1, which makes the range start immediately before the first character of the
matched string. The fourth argument gives the number of characters in the
range; the default is that the range extends to the end of the matched string.
Thus, both the start point and end point of each range returned by the <function>string-range</function> function
will be <termref def="dt-character-point">character points</termref>.</p>
<p>Element boundaries, as well as entire embedded nodes such as processing
instructions and comments, are ignored as specified by the definition of <xtermref
href="&XPath;#dt-string-value">string-value</xtermref> in <bibref ref="XPath"/>.</p>
<p>For any particular match, if the <var>string</var> argument is not found
in the string-value of the location, or if the third and fourth arguments indicate
a range that is wholly beyond the beginning or end of the document or entity,
then no range is added to the result for that match.</p>
<p>The start and end points of the range-locations in the returned location-set
will all be character points.</p>
<p>For example, the following expression returns a range that selects the
17th of those <quote>Thomas Pynchon</quote> strings appearing in a <code>title</code> element:</p>
<eg>string-range(//title,"Thomas Pynchon")[17]</eg>
<p>As another example, the following expression returns a collapsed range
whose points immediately precede the letter <quote>P</quote> (8 from the start
of the string) in the third of those <quote>Thomas Pynchon</quote> strings
appearing in a <code>P</code> element:</p>
<eg>string-range(//P,"Thomas Pynchon",8,0)[3]</eg>
<p>Alternatively this could be specified as follows:</p>
<eg>string-range(string-range(//P,"Thomas Pynchon")[3],"P",1,0)</eg>
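<p>The third and fourth arguments can also select a non-empty part of each
match. As an illustrative sketch in the same vein as the examples above, the
following expression would select only the seven characters <quote>Pynchon</quote> (starting
at position 8 of the match) in each <quote>Thomas Pynchon</quote> string appearing
in a <code>title</code> element:</p>
<eg>string-range(//title,"Thomas Pynchon",8,7)</eg>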
<p>String-values are <quote>views</quote> into only the string content of
a document or entity; they do not retain the structural context of any non-text
nodes interspersed with the text. Because the <function>string-range</function> function
operates on a string-value, markup that intervenes in the middle of a string
does not prevent a match. (Note that for this reason, a <function>string-range</function> match
is a range describing the relevant substring of the string-value, not necessarily
a contiguous string in a single text node in the document.) For example, if
the 17th occurrence of <quote>Thomas Pynchon</quote> had some inline markup
in it as follows, it would not change the string identified by the XPointer
processor:</p>
<eg>Thomas &lt;em>Pyn&lt;/em>chon</eg>
<p>The following expression selects the fifth of those exclamation marks appearing
in any text node in the document and the character immediately following that
exclamation mark:</p>
<eg>string-range(/,"!",1,2)[5]</eg>
<p>Although these examples locate ranges via text in the string-values of
elements, <function>string-range</function> is useful for locating ranges
that are wholly enclosed in other node types as well, such as attributes,
processing instructions, and comments.</p>
</div3>
<div3>
<head>Additional Range-Related Functions </head>
<p>The following functions are related to ranges.</p>
<div4>
<head><function>range</function> Function </head>
<proto name="range" return-type="location-set"><arg type="location-set"/>
</proto>
<p>The <function>range</function> function returns ranges covering the locations
in the argument location-set. For each location <var>x</var> in the argument
location-set, a range location representing the covering range of <var>x</var> is
added to the result location-set.</p>
</div4>
<div4>
<head><function>range-inside</function> Function </head>
<proto name="range-inside" return-type="location-set"><arg type="location-set"/>
</proto>
<p>The <function>range-inside</function> function returns locations covering
the contents of the locations in the argument location-set. For each location <var>x</var> in
the argument location-set, a location is added to the result location-set.
If <var>x</var> is a range location or a point, then <var>x</var> is added
to the result location-set. Otherwise <var>x</var> is used as the container
node of the start and end points of the range location to be added, which
is defined in this way: The index of the start point of the range is zero.
If the end point is a character point then its index is the length of the
string-value of <var>x</var>; otherwise its index is the number of children
of <var>x</var>.</p>
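<p>As a non-normative sketch of these rules, assume a document containing
the following hypothetical fragment, in which the <code>id</code> attribute
is declared to be of type ID:</p>
<eg>&lt;para id="p1">Some &lt;em>bold&lt;/em> claim&lt;/para></eg>
<p>The <el>para</el> element has three children (a text node, the <el>em</el> element,
and another text node), so the first expression below would yield a range
from index 0 to index 3 within <el>para</el>. The second is applied to the
first text node, whose string-value <quote>Some </quote> is five characters
long, so it would yield a range from index 0 to index 5 within that text node:</p>
<eg>range-inside(id("p1"))
range-inside(id("p1")/text()[1])</eg>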
</div4>
<div4>
<head><function>start-point</function> Function </head>
<proto name="start-point" return-type="location-set"><arg type="location-set"/>
</proto>
<p>For each location <var>x</var> in the argument location-set, <function>start-point</function> adds
a location of type point to the result location-set. That point represents
the start point of location <var>x</var> and is determined by the following
rules:</p>
<ulist>
<item><p>If <var>x</var> is of type point, the resulting point is <var>x</var>.</p>
</item>
<item><p>If <var>x</var> is of type range, the resulting point is the start
point of <var>x</var>.</p></item>
<item><p>If <var>x</var> is of type root, element, text, comment, or processing
instruction, the container node of the resulting point is <var>x</var> and
the index is 0.</p></item>
<item><p>If <var>x</var> is of type attribute or namespace, the pointer
part in which the function appears fails.</p></item>
</ulist>
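<p>For illustration, and reusing the element IDs from the <function>range-to</function> example,
the first expression below would return the start point of the range from <quote>chap1</quote> to <quote>chap2</quote>,
that is, a point whose container node is the <quote>chap1</quote> element
and whose index is 0. The second is a hypothetical case of the last rule:
assuming the document contains at least one <code>id</code> attribute, it
applies <function>start-point</function> to an attribute node, so the pointer
part containing it fails.</p>
<eg>start-point(id("chap1")/range-to(id("chap2")))
start-point(//@id)</eg>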
</div4>
<div4>
<head><function>end-point</function> Function </head>
<proto name="end-point" return-type="location-set"><arg type="location-set"/>
</proto>
<p>For each location <var>x</var> in the argument location-set, <function>end-point</function> adds
a location of type point to the result location-set. That point represents
the end point of location <var>x</var> and is determined by the following
rules:</p>
<ulist>
<item><p>If <var>x</var> is of type point, the resulting point is <var>x</var>.</p>
</item>
<item><p>If <var>x</var> is of type range, the resulting point is the end
point of <var>x</var>.</p></item>
<item><p>If <var>x</var> is of type root or element, the container node of
the resulting point is <var>x</var> and the index is the number of children
of <var>x</var>.</p></item>
<item><p>If <var>x</var> is of type text, comment, or processing instruction,
the container node of the resulting point is <var>x</var> and the index is
the length of the string-value of <var>x</var>.</p></item>
<item><p>If <var>x</var> is of type attribute or namespace, the pointer
part in which the function appears fails.</p></item>
</ulist>
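<p>As a sketch of how the index depends on the node type, assume the hypothetical
fragment below, with <code>id</code> declared as an attribute of type ID:</p>
<eg>&lt;item id="i1">&lt;!-- todo -->Revise &lt;term>scope&lt;/term>&lt;/item></eg>
<p>The <el>item</el> element has three children (a comment, a text node, and
the <el>term</el> element), and the string-value of the comment is the six-character
string <quote> todo </quote>. Under the rules above, the first expression
would return a point in <el>item</el> with index 3, and the second a point
in the comment node with index 6:</p>
<eg>end-point(id("i1"))
end-point(id("i1")/comment())</eg>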
</div4>
</div3>
<div3>
<head><function>here</function> Function </head>
<proto name="here" return-type="location-set"></proto>
<p>The <function>here</function> function is meaningful only when the expression
being interpreted occurs in an XML document or external parsed entity; otherwise
the pointer part in which the <function>here</function> function appears fails.
When in an XML context, the <function>here</function> function returns a location-set
with a single member. There are two possibilities for the location returned:</p>
<ulist>
<item><p>If the expression being evaluated appears in a text node inside an
element node, the location returned is the element node.</p></item>
<item><p>Otherwise, the location returned is the node that directly contains
the expression being evaluated.</p></item>
</ulist>
<p>In the following example, the <function>here</function> function appears
inside an expression that is in an attribute node. The expression as a whole,
then, returns the <el>slide</el> element just preceding the <el>slide</el> element
that most directly contains the attribute node in question.</p>
<eg>&lt;button
   xlink:type="simple"
   xlink:href="#xpointer(here()/ancestor::slide[1]/preceding::slide[1])">
Previous
&lt;/button></eg>
<note>
<p>The type of the node in which the <function>here</function> function appears
is likely to be <code>text</code>, <code>attribute</code>, or <code>processing-instruction</code>.
For an expression appearing in element content, the returned location has
a node type of <code>element</code> rather than <code>text</code>: although
the expression is in a text node, the first rule above returns the element
that contains that text node.</p>
</note>
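<p>In contrast to the attribute example above, consider a hypothetical vocabulary
in which a pointer is carried in element content. The <el>see</el> and <el>figure</el> element
names are assumptions made only for this sketch. Because the expression sits
in a text node inside the <el>see</el> element, <function>here</function> returns
the <el>see</el> element, and the expression as a whole would locate the first <el>figure</el> element
that follows it:</p>
<eg>&lt;see>xpointer(here()/following::figure[1])&lt;/see></eg>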
</div3>
<div3>
<head><function>origin</function> Function </head>
<proto name="origin" return-type="location-set"></proto>
<p>The <function>origin</function> function is meaningful only when the expression
is being processed in response to traversal of a link expressed in an XML
document. The <function>origin</function> function enables addressing relative
to third-party and inbound links such as those defined in the XLink Recommendation.
This allows expressions to address relative locations when links do not reside
directly at one of their endpoints. The function returns a location-set with
a single member, which locates the element from which a user or program initiated
traversal of the link. (See <bibref ref="XLink"/> for information about traversal.)</p>
<p>It is an error to use <function>origin</function> in the fragment identifier
portion of a URI reference where a URI is also provided and identifies a resource
different from the resource from which traversal was initiated, or in a situation
where traversal is not occurring.</p>
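<p>As a minimal sketch, the following pointer part would locate the first <el>footnote</el> element
that follows the element from which traversal was initiated. The <el>footnote</el> element
name is hypothetical, and the pointer part is assumed to be evaluated during
traversal of a link, against the same resource in which traversal began:</p>
<eg>xpointer(origin()/following::footnote[1])</eg>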
</div3>
</div2>
<div2>
<head>Root Node Children </head>
<p>The XML Recommendation requires well-formed documents to contain a single
element at the top level. Thus, the XPath data model of a well-formed document
will have a root node with a single child node of type element. In order to
address locations in arbitrary external parsed entities, along with well-formed
documents, the &scheme-phrase; extends the XPath data model to allow the root
node to have as children any sequence of nodes that would be possible for an
element node. This extension is identical to the one made by XSLT. Thus, the
root node may contain child nodes of type text, and any number of child nodes
of type element.</p>
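<p>For illustration, assume a hypothetical external parsed entity with the
following content, which has text and two elements at the top level:</p>
<eg>Introductory text
&lt;para>First&lt;/para>
&lt;para>Second&lt;/para></eg>
<p>With the extended data model, the root node has the leading text node and
both <el>para</el> elements among its children, so the pointer parts below
would locate the second <el>para</el> element and the leading text node respectively:</p>
<eg>xpointer(/para[2])
xpointer(/text()[1])</eg>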
</div2>
</div1>
</body><back>
<div1 id="references">
<head>References </head>
<div2>
<head>Normative References </head>
<blist>
<bibl id="dom2" href="http://www.w3.org/TR/DOM-Level-2-Core/" key="DOM2"><titleref>Document
Object Model (DOM) Level 2 Specification: Version 1.0.</titleref> World Wide
Web Consortium, 2000.</bibl>
<bibl id="rfc2119" href="http://www.ietf.org/rfc/rfc2119.txt" key="RFC 2119"><titleref>RFC
2119: Key words for use in RFCs to Indicate Requirement Levels</titleref>.
Internet Engineering Task Force, 1997.</bibl>
<bibl id="rfc2396" href="http://www.ietf.org/rfc/rfc2396.txt" key="RFC 2396"><titleref>RFC
2396: Uniform Resource Identifiers (URI): Generic Syntax</titleref>. Internet Engineering Task Force,
1998.</bibl>
<bibl id="XML" href="http://www.w3.org/TR/REC-xml" key="XML">Tim Bray, Jean
Paoli, C.M. Sperberg-McQueen, and Eve Maler, editors. <emph>Extensible Markup
Language (XML) 1.0 (Second Edition).</emph> World Wide Web Consortium, 2000.</bibl>
<bibl id="xptr-framework" href="&xptr-framework-file;" key="XPtrFrame">Paul
Grosso, Eve Maler, Jonathan Marsh, and Norman Walsh, editors. <titleref>XPointer
Framework</titleref>. World Wide Web Consortium, 2002.</bibl>
<bibl id="xptr-xmlns" href="&W3C-path;&doctype;-xptr-xmlns-&isodate;/" key="XPtr-xmlns">Steven DeRose, Eve
Maler, and Ron Daniel Jr., editors. <titleref>XPointer xmlns() Scheme Proposal</titleref>.
World Wide Web Consortium, 2001.</bibl>
<bibl id="unicode" href="http://www.unicode.org/unicode/standard/standard.html"
key="Unicode">The Unicode Consortium. <titleref>The Unicode Standard.</titleref></bibl>
<bibl id="XPath" href="http://www.w3.org/TR/xpath" key="XPath">James Clark
and Steve DeRose, editors. <titleref>XML Path Language (XPath)</titleref>.
World Wide Web Consortium, 1999.</bibl>
</blist></div2>
<div2>
<head>Non-Normative References </head>
<blist>
<bibl id="chum" key="CHUM">Steven J. DeRose and David G. Durand. 1995. <quote>The
TEI Hypertext Guidelines.</quote> In <titleref>Computers and the Humanities</titleref> 29(3).
Reprinted in <titleref>Text Encoding Initiative: Background and Context</titleref>,
ed. Nancy Ide and Jean Veronis, ISBN 0-7923-3704-2.</bibl>
<bibl id="dexter" key="Dexter">Halasz, Frank. 1994. <quote>The Dexter Hypertext
Reference Model.</quote> In <titleref>Communications of the Association for
Computing Machinery</titleref> 37 (2), February 1994: 30-39.</bibl>
<bibl id="fress" href="http://www.stg.brown.edu/~sjd/fress.html" key="FRESS">Steven
J. DeRose and Andries van Dam. 1999. <quote>Document structure in the FRESS
Hypertext System.</quote> <titleref>Markup Languages 1</titleref> (1) Winter.
Cambridge: MIT Press: 7-32.</bibl>
<bibl id="html" href="http://www.w3.org/TR/html4/" key="HTML"><titleref href="http://www.w3.org/TR/html4/">HTML
4.01 Specification</titleref>. World Wide Web Consortium, 1999.</bibl>
<bibl id="Infoset" href="http://www.w3.org/TR/xml-infoset/" key="Infoset">John
Cowan and Richard Tobin, editors. <titleref>XML Information Set</titleref>.
World Wide Web Consortium, 2001.</bibl>
<bibl id="intermedia" key="Intermedia">Yankelovich, Nicole, Bernard J. Haan,
Norman K. Meyrowitz, and Steven M. Drucker. 1988. <quote>Intermedia: The Concept
and the Construction of a Seamless Information Environment.</quote> <titleref>IEEE
Computer</titleref> 21 (January, 1988): 81-96.</bibl>
<bibl id="iso10744" href="http://www.ornl.gov/sgml/wg8/docs/n1920/html/n1920.html" key="ISO/IEC 10744"><titleref>ISO/IEC
10744-1992 (E). Information technology --Hypermedia/Time-based Structuring
Language (HyTime).</titleref> Geneva: International Organization for Standardization,
1992. <titleref>Extended Facilities Annex.</titleref> [Geneva]: International
Organization for Standardization, 1996.</bibl>
<bibl id="microcosm" key="MicroCosm">Hall, Wendy, Hugh Davis, and Gerard Hutchings.
1996. <titleref>Rethinking Hypermedia: The Microcosm Approach.</titleref> Boston:
Kluwer Academic Publishers. ISBN 0-7923-9679-0.</bibl>
<bibl id="ohs" href="http://aue.auc.dk/~kock/OHS-HT98/Papers/ossenbruggen.html"
key="OHS">van Ossenbruggen, Jacco, Anton Eliëns and Lloyd Rutledge. <quote>The
Role of XML in Open Hypermedia Systems.</quote> Position paper for the 4th
Workshop on Open Hypermedia Systems, ACM Hypertext '98.</bibl>
<bibl id="rlocs" href="http://www.cs.berkeley.edu/~phelps/Robust/papers/RobustHyperlinks.html"
key="RLocs">Thomas A Phelps and Robert Wilensky. <titleref>Robust Intra-document
Locations.</titleref> University of California, Berkeley.</bibl>
<bibl id="tei" href="http://www.tei-c.org/" key="TEI">C. M. Sperberg-McQueen
and Lou Burnard, editors. <titleref>Guidelines for Electronic Text Encoding
and Interchange</titleref>. Association for Computers and the Humanities (ACH),
Association for Computational Linguistics (ACL), and Association for Literary
and Linguistic Computing (ALLC). Chicago, Oxford: Text Encoding Initiative,
1994.</bibl>
<bibl id="XLink" href="http://www.w3.org/TR/xlink/" key="XLink">Steve DeRose,
Eve Maler, David Orchard, and Ben Trafford, editors. <titleref>XML Linking
Language (XLink)</titleref>. World Wide Web Consortium, 2001.</bibl>
<bibl id="xpreq" href="http://www.w3.org/TR/NOTE-xptr-req" key="XPREQ">Steve
DeRose, editor. <titleref>XML XPointer Language Requirements Version 1.0</titleref>.
World Wide Web Consortium, 1999.</bibl>
<bibl id="XSLT" href="http://www.w3.org/TR/xslt" key="XSLT">James Clark, editor. <titleref>XSL
Transformations (XSLT) Version 1.0</titleref>. World Wide Web Consortium,
1999.</bibl>
</blist></div2>
</div1>
<inform-div1 id="wgmembers">
<head>Working Group Members </head>
<p>The first working drafts of this specification were developed in the XML
Working Group, whose members are listed in <bibref ref="XML"/>. The work was
completed in the XML Linking Working Group, with the following members active
at the completion of this specification:</p>
<orglist>
<member><name>Peter Chen</name><affiliation>LSU, Bootstrap Alliance</affiliation>
</member>
<member><name>Ron Daniel</name><affiliation>Interwoven</affiliation><role>XPointer
co-editor</role></member>
<member><name>Steven DeRose</name><affiliation>invited expert</affiliation>
<role>XPointer co-editor</role></member>
<member><name>David Durand</name><affiliation>University of Southampton,
Dynamic Diagrams</affiliation></member>
<member><name>Masatomo Goto</name><affiliation>Fujitsu Laboratories</affiliation>
</member>
<member><name>Paul Grosso</name><affiliation>Arbortext</affiliation></member>
<member><name>Chris Maden</name><affiliation>Lexica</affiliation></member>
<member><name>Eve Maler</name><affiliation>Sun Microsystems</affiliation>
<role>past co-chair and co-editor</role></member>
<member><name>Jonathan Marsh</name><affiliation>Microsoft</affiliation></member>
<member><name>David Orchard</name><affiliation>Jamcracker</affiliation></member>
<member><name>Henry Thompson</name><affiliation>W3C and University of Edinburgh</affiliation>
<role>co-chair and W3C staff contact</role></member>
<member><name>Daniel Veillard</name><affiliation>invited expert</affiliation>
<role>co-chair</role></member>
</orglist>
<p>The editors wish to acknowledge substantial contributions from Tim Bray,
who previously served as co-editor and co-chair. We would also like to acknowledge
substantial contributions from James Clark, especially on the integration
with XPath. We would like to thank Gavin Nicol and Martin Dürst for help with
passages related to internationalization. Finally, we would like to thank
the XML Linking Interest Group and Working Group for their support and input.</p>
</inform-div1>
</back></spec>

