W3C

XML Pointer Language (XPointer)

W3C Working Draft 9 July 1999

This version:
http://www.w3.org/1999/07/WD-xptr-19990709
Previous version:
http://www.w3.org/TR/WD-xml-link-970731
Latest version:
http://www.w3.org/TR/WD-xptr
Editors:
Steve DeRose (Inso Corp. and Brown University) <Steven_DeRose@Brown.edu>
Ron Daniel Jr. (DATAFUSION, Inc.) <rdaniel@datafusion.net>

The editors wish to acknowledge substantial contributions by Tim Bray (of Textuality) and Eve Maler (of ArborText) who also served as co-editors on earlier Working Drafts.

Status of this document

The XML Linking Working Group, with this 1999 July 9 first XPointer working draft, invites comment on this specification. For background on this work, please see the XML Activity Statement.

The W3C Membership and other interested parties are invited to review the specification and report implementation experience. Please send comments to www-xml-linking-comments@w3.org (archive). While we welcome implementation experience reports, the XML Linking Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release.

For information about the XLink language with which XPointer is expected to be used, see http://www.w3.org/TR/WD-xlink.

See http://www.w3.org/TR/NOTE-xptr-req for the specific requirements that informed development of this specification.

Abstract

This document specifies a language that builds upon the XML Path Language (XPath), to support addressing into the internal structures of XML documents. In particular, it provides for specific reference to elements, character strings, selections, and other parts of XML documents, whether or not they bear an explicit ID attribute, using traversals of a document's structure and choice of parts based on their properties such as element types, attribute values, character content, and relative position, containment, and order.

XPointer defines the meaning of the "selector" or "fragment identifier" portion of URIs that locate resources of MIME media types "text/xml" and "application/xml".

Table of Contents

1. Introduction
    1.1 Language Design Goals
    1.2 Relationship to Other Documents
    1.3 Notation
2. XPointer Usage
3. The XPointer model and language
    3.1 Character sets and escaping
    3.2 Schemes
    3.3 XPointer errors
4. Summary of XPath
    4.1 XPath basics
    4.2 XPath axes
    4.3 XPath Relative Axes
    4.4 XPath node-tests
    4.5 XPath Predicates
        4.5.1 Introduction to Predicates
        4.5.2 Positional tests
        4.5.3 Local structure tests
    4.6 Examples of axis usage
5. XPointer extensions to XPath
    5.1 Evaluation context
        5.1.1 Initialization of the context node
        5.1.2 Initialization of the context node list
        5.1.3 Initialization of the variable bindings
        5.1.4 Initialization of the function library
        5.1.5 Initialization of the namespace declarations
    5.2 XPointer axes
        5.2.1 The range axis
        5.2.2 The string axis
    5.3 XPointer absolute location paths
        5.3.1 / (root)
        5.3.2 id
        5.3.3 here
        5.3.4 origin
    5.4 XPointer predicate functions
    5.5 Locations That Are Not Simply Nodes
        5.5.1 String axis semantics
        5.5.2 Range axis semantics
        5.5.3 Multiple spanning locations
    5.6 Link Persistence
6. Conformance

Appendices

A. Glossary
B. References
C. Working Group Members

1. Introduction

This section is normative. Some later sections are instead informative, and are provided to make this specification more understandable, for example by reducing the extent to which other documents must be obtained and fully understood as prerequisites.

This document specifies a language that supports addressing into the internal structures of XML documents. In particular, it provides for specific reference to elements, character strings, and other parts of XML documents, whether or not they bear an explicit ID attribute. XPointer uses a common expression language, XPath, developed in the XML Linking and XSL Working Groups, and extends it to allow its use for addressing ranges as well as nodes, for locating information by string matching, and for using addressing expressions in URIs as fragment identifiers. XPointer expressions locate information by navigating through a document's structure to select parts based on properties such as element types, attribute values, character content, and relative position and order.

This specification provides means for identifying locations in XML documents, independent of whether they are to serve as link destinations or any other application-specific purpose. The construction of hypertext links, which connect located information (that may include XPointer-specified locations in XML documents, as well as images, and many other media types), and provide descriptive information about those connections, is defined in a related specification, XLink.

1.1 Language Design Goals

Following is a brief summary of overall design principles governing XPointer. See the XPointer Requirements Document for more detailed and specific goals, background, and functional requirements.

  1. XPointers shall address into XML documents.
  2. XPointers shall be straightforwardly usable over the Internet.
  3. XPointers shall be straightforwardly usable in URIs.
  4. The XPointer design shall be prepared quickly.
  5. The XPointer design shall be formal and concise.
  6. The XPointer syntax shall be reasonably compact and human readable.
  7. XPointers shall be optimized for usability.
  8. XPointers must be feasible to implement.

1.2 Relationship to Other Documents

Three standards have been especially influential on the development of this specification:

The addressing components of many other hypertext systems have also informed the design of XPointer, especially Dexter, OHS, FRESS, MicroCosm, and InterMedia.

In addition, as this effort and the Extensible Style Language (XSL) evolved, both efforts found their languages converging on a common semantic model. While the most common uses expected in these two applications differ quite substantially, meetings between the working groups concluded that the full ranges of functionality needed by the two were far closer. Both groups had been discovering needs for features the other had, bringing the efforts closer. In short, these diverse application domains (stylesheets and hyperlink addressing) were emphasizing different parts of a common larger space. A common semantic model was therefore worked out in a summit meeting in San Jose in March of 1999, and many syntactic features were aligned as well.

The present draft reflects the common semantic model and terminology, and initiates changes to the XPointer syntax relative to prior Working Drafts of XPointer, to move forward this unification effort. In particular, many of the addressing mechanisms are now defined in a separate joint document, XPath. However, this draft does not complete the unification; certain syntactic issues remain to be worked out by the WGs, and further refinement of this draft is anticipated.

The XML 1.0 specification defines the class of information that XPointer is intended to provide addressing for.

The Namespaces in XML specification provides for prefixing XML names so as to disambiguate them; namespace-qualified names are supported by this specification.

The XML information set effort is developing a formal specification for the abstract information structure that XML documents express. If and when it is issued as a recommendation, it will constitute the normative specification of the structure on which XPointer operates (though XPointer may not utilize every aspect of it, it is unlikely to use any information not provided by the Information Set).

The XHTML effort is developing a DTD that harmonizes HTML with XML.

Uniform Resource Identifiers define the use of fragment identifiers, of which XPointers are one type (the type for XML).

The draft Character Model (a work in progress, not yet publicly available) is expected to provide details of character set usage, interpretation, and other factors relevant especially to the string axis.

1.3 Notation

The formal grammar for locators is given using a simple Extended Backus-Naur Form (EBNF) notation, as described in the XML specification. Explanatory text is included here regarding aspects of XPath, to make this specification more accessible; however, such text is not normative. Should there be conflicts, XPath (when issued) is the normative specification of these common constructs.

2. XPointer Usage

This section is normative.

The locator for a resource is typically provided by means of a Uniform Resource Identifier Reference, or URI Reference. XPointers can be used as fragment identifiers to specify a more precise sub-resource. Any fragment identifier that points into an XML resource must be an XPointer. However, for any locator in an XML resource that identifies a resource that is not an XML document (for example, an HTML or PDF document), XPointer does not constrain the syntax or semantics of the locator.

XPointers can be used with URI References other than those associated with hypertext links. For example, there is no rule against using them in system identifiers for XML external entities.

3. The XPointer model and language

This section is normative.

XPointers operate on the XML Information Set, a tree derived from the elements and other markup constructs of an XML document; precisely the same tree as used by XSL patterns and XPath expressions. XPointers operate by selecting particular parts of such trees, often by their structural relationship to other identified nodes (such as a nearby node bearing an ID). XPointers are conceptually iterative, because they can express multiple such selections, each operating on what is found by the prior one.

Selection of tree portions is done through axes and predicates. An axis defines a sequence of nodes or non-node data portions, as candidates that might be located; predicates then test information in the tree, relative to such portions. For example, one may select certain elements from among the siblings of some previously-located element, based on whether those sibling elements have an attribute with a certain value, or are of a certain type such as FOOTNOTE.

XPointer fragment identifiers allow two shorthand forms and one full form of addressing, summarized here and defined fully below. The first is a shorthand form mainly provided for HTML compatibility. The second is a shorthand form for locating elements by navigation from the root. The third provides access to all the capabilities of the XPointer and XPath specifications, and makes provision for future enhancements.

Formally,
XFragment

XFragment          ::= BareName
                   |  Tumbler
                   |  LocationSpec
LocationSpec       ::= ( Scheme '(' SchemeSpecificExpr ')' )+
BareName           ::= XML_Name
Tumbler            ::= ( '/' [0-9]+ )+
Scheme             ::= 'xptr'
                   |   XML_Name
SchemeSpecificExpr ::= GenlLocationPath (vc: when scheme is xptr)
                   |   XML_CDATA (vc: no unbalanced parentheses)
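
The three forms of XFragment can be told apart with a simple classifier. The following Python sketch is illustrative only and is not part of this specification: the function name classify_fragment is hypothetical, the XML_Name production is approximated by an ASCII-only pattern, and the check for balanced parentheses in a LocationSpec is omitted.

```python
import re

# Assumption: an ASCII-only stand-in for the XML Name production. The real
# production admits many more characters (including ':' and non-ASCII letters).
XML_NAME = r"[A-Za-z_][A-Za-z0-9._-]*"

BARE_NAME_RE = re.compile(rf"{XML_NAME}\Z")
TUMBLER_RE = re.compile(r"(?:/[0-9]+)+\Z")
# Only checks that the fragment starts with "Scheme(" and ends with ")".
# Balanced-parenthesis checking is deliberately left out of this sketch.
LOCATION_SPEC_RE = re.compile(rf"{XML_NAME}\(.*\)\Z", re.DOTALL)


def classify_fragment(fragment: str) -> str:
    """Return which XFragment alternative the fragment resembles."""
    if BARE_NAME_RE.match(fragment):
        return "BareName"
    if TUMBLER_RE.match(fragment):
        return "Tumbler"
    if LOCATION_SPEC_RE.match(fragment):
        return "LocationSpec"
    return "syntax error"
```

A string that matches none of the three alternatives is reported as a syntax error, in line with the error categories of section 3.3.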

XPointer extends the XPath syntax for locating data portions to enable addressing non-node locations (such as user selections), and to specify how to use such locators as XML fragment identifiers. However, it does not constrain what uses an application may make of such locations. In particular, implementation of traversal to a resource is not constrained by this specification: whether user "traversal" is the purpose of an XPointer at all, is application-dependent. A formatted-text browser traversal might scroll to and highlight the designated location; a structure-oriented graphical tree viewer or a document-relationship display might do traversal in quite a different way; and a search application, parser, archival system, or expert agent might use XPointers for other purposes entirely.

3.1 Character sets and escaping

Note: The character set assumed for XPointers is that of XML, namely Unicode. Although XPointers may be used in many contexts, a very common use of XPointers is expected to be as fragment identifiers in URIs that identify XML documents (see RFC 2396: Uniform Resource Identifiers (URI): Generic Syntax), which do not permit all Unicode characters. When used in URIs, XPointers may therefore need to encode some characters in special ways. This shall be done using the URI escaping mechanisms: the UTF-8 encoding is used, with any bytes of the result that are not allowed in URIs then expressed as %HH (where HH represents the two hexadecimal digits needed to represent the byte being encoded). Similarly, if XPointers appear in XML document content it is generally necessary to escape occurrences of the '<' and '&' characters as &lt; and &amp;.
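
The two escaping steps described in this note might be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative procedure: the function names and the URI_SAFE character set are hypothetical choices, and an application may need a different set of unescaped characters.

```python
from urllib.parse import quote

# Assumption: characters this sketch leaves unescaped in a URI. Letters,
# digits, and "_.-~" are never escaped by quote(); the rest is a guess at the
# punctuation an XPointer commonly needs.
URI_SAFE = "/()':@=*,!"


def escape_for_uri(xpointer: str) -> str:
    """UTF-8 encode, then write every disallowed byte as %HH."""
    return quote(xpointer, safe=URI_SAFE, encoding="utf-8")


def escape_for_xml_content(xpointer: str) -> str:
    """Escape '&' and '<' for use in XML document content."""
    # '&' must be handled first so the '&' of '&lt;' is not escaped again.
    return xpointer.replace("&", "&amp;").replace("<", "&lt;")
```

For instance, a quotation mark is not permitted in a URI and is written as %22, while a non-ASCII letter such as U+00E9 becomes the two escaped bytes %C3%A9.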

EDITOR'S NOTE:
Note that the character model draft says that XML-handling systems are required to treat any 'illegal' characters in URIs as if they had been %-escaped, whether or not they have been. In other words, the %-escaping is not really needed when XPointers are used in URIs in XML documents. However, if the URI is extracted from an XML document and then used, the escaping must be applied.

The end of a LocationSpec is signaled by the presence of the ')' character that is balanced with the starting '(' character in the LocationSpec. Any unbalanced parentheses in the SchemeSpecificExpr must therefore be escaped by preceding them with an occurrence of the circumflex '^' character. If the SchemeSpecificExpr contains literal occurrences of the circumflex, those must be escaped in the same manner (i.e. represented in the XPointer as "^^").
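
One way to apply the circumflex rule when generating an XPointer is sketched below in Python. The function name escape_scheme_expr is hypothetical, and the sketch assumes that "unbalanced" means a parenthesis with no matching partner when the expression is scanned from left to right.

```python
def escape_scheme_expr(expr: str) -> str:
    """Prefix unbalanced parentheses and literal circumflexes with '^'."""
    unmatched = set()   # indices of parentheses that have no partner
    open_stack = []     # indices of '(' still waiting for a ')'
    for index, char in enumerate(expr):
        if char == "(":
            open_stack.append(index)
        elif char == ")":
            if open_stack:
                open_stack.pop()       # balanced pair, leave both alone
            else:
                unmatched.add(index)   # ')' with no earlier '('
    unmatched.update(open_stack)       # '(' that was never closed

    escaped = []
    for index, char in enumerate(expr):
        if char == "^" or index in unmatched:
            escaped.append("^")
        escaped.append(char)
    return "".join(escaped)
```

Balanced parentheses, such as those of a function call inside the expression, pass through unchanged; only stray parentheses and circumflexes gain an escape.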

3.2 Schemes

In the full form, an initial Scheme identifies the particular notation used; the only particular scheme defined in this specification is "xptr" (as for XML in general, case is significant for scheme names). The explicit Scheme allows for possible future versions of the XPointer language to identify themselves (all scheme names beginning "xptr" are reserved by this specification). In the same way, it allows possible alternative addressing mechanisms to identify themselves, for example ones specialized for other media types such as SVG or SMIL. This specification defines two items related to the Scheme. The first is the general framework for detecting what scheme is in use and the ability to skip over ones that are not understood. The second is the behavior when the Scheme is xptr. Conformance to this specification requires supporting both the general framework for dealing with schemes, and the specific "xptr" scheme as defined in this specification.

More than one scheme, along with its arguments, may occur in a single fragment identifier. Because fragment identifiers are specific to the media type of the returned data, this allows an open syntax in which other specifications may choose to participate, under which users may include alternative fragment identifiers in case the media type of the resource may vary due to content negotiation. For example, a server might return a detailed text description when the user's client does not support graphics; the URI could include an XPointer fragment identifier as well as a media-type-specific identifier for the graphic, perhaps identifying coordinates of a graphical region.

Validity Constraint: When the Scheme is 'xptr', the SchemeSpecificExpr must match the production for GenlLocationPath.

EDITOR'S NOTE:

Multiple LocationSpecs are a temporary provision, pending final decision of the WG. When an XPointer contains multiple LocationSpecs, they are to be processed in order. The value of the XPointer as a whole is the value of the first LocationSpec that succeeds in locating a portion of the document. There are two main advantages to this. The first is to provide more reliable locators. As an example, id("foo") will not work if the receiver does not perform validation because the client will not be able to determine which attributes are of type ID. An XPointer like #xptr(id("foo"))xptr(/@name="foo") would work for both validating and non-validating clients. Multiple schemes can also be used to locate sections not easily addressed using only XPath expressions. For example, SMIL documents are XML, but it may be advantageous to locate parts of the SMIL document based on time of presentation. It would even be possible to extend the XPointer syntax to allow multiple LocationSpecs to operate in succession. For example, http://www.cwi.nl/SMIL/fiets.smil#xptr(/body/par[2]){then-apply}time(begin(3),end(6)) locates the second <par> in the <body>, then selects the material for seconds 3..6 of the SMIL presentation. The cost of all this power is, of course, implementation complexity. Feedback on the cost/benefit tradeoff is requested.
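
The processing order described in this note (try each LocationSpec in turn, skip schemes that are not understood, and take the first one that locates something) might be sketched in Python as follows. The names split_location_specs and resolve are hypothetical, as is the handler table keyed by scheme name; the {then-apply} chaining idea is not covered.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# Assumption: a handler takes the scheme-specific expression and returns the
# located items, or an empty sequence when nothing is located.
Handler = Callable[[str], Sequence[object]]


def split_location_specs(fragment: str) -> List[Tuple[str, str]]:
    """Split 'scheme(expr)scheme(expr)...' into (scheme, expr) pairs."""
    specs = []
    pos, end = 0, len(fragment)
    while pos < end:
        open_at = fragment.find("(", pos)
        if open_at == -1:
            raise ValueError("expected '(' after scheme name")
        scheme = fragment[pos:open_at]
        if not scheme:
            raise ValueError("missing scheme name")
        depth, cursor, expr = 1, open_at + 1, []
        while cursor < end:
            char = fragment[cursor]
            if char == "^" and cursor + 1 < end:
                expr.append(fragment[cursor + 1])  # escaped char, taken literally
                cursor += 2
                continue
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
                if depth == 0:
                    break                          # balancing ')' ends this spec
            expr.append(char)
            cursor += 1
        if depth != 0:
            raise ValueError("unbalanced parentheses")
        specs.append((scheme, "".join(expr)))
        pos = cursor + 1
    return specs


def resolve(fragment: str, handlers: Dict[str, Handler]) -> Optional[Sequence[object]]:
    """Return the result of the first LocationSpec that locates something."""
    for scheme, expr in split_location_specs(fragment):
        handler = handlers.get(scheme)
        if handler is None:
            continue                               # scheme not understood: skip it
        located = handler(expr)
        if located:
            return located
    return None
```

A None result corresponds to the case in which no LocationSpec succeeds; how an application reports that is outside the scope of this sketch.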

3.3 XPointer errors

There are several kinds of errors that can arise in using XPointers.

A string that does not match the syntax specifications given here is said to have a [Definition:] syntax error, and applications should not attempt to interpret it as an XPointer.

A syntactically correct XPointer may, when interpreted, find that it is given no resource to work from (for example, it may be appended to an incorrect URI), or may find itself operating on a resource that is not well-formed XML data. Such an XPointer is said to have a [Definition:] resource error.

A syntactically correct XPointer may, when interpreted, fail to find a sub-resource matching its description. Such an XPointer is said to have a [Definition:] sub-resource error. Note that XPath allows expressions that return empty nodesets as their results and does not regard that as an error. XPointers are not a general query mechanism; they are a specification of document locations. Therefore, an empty result is a sub-resource error.

This specification does not constrain how applications deal with these errors.
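
An application that chooses to surface these error kinds as exceptions might do so along the following lines. This Python sketch is one possibility only: the class and function names are hypothetical, syntax errors are not modeled, and the limited path language of Python's xml.etree.ElementTree stands in for a real XPointer processor.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


class XPointerError(Exception):
    """Base class for the error kinds named in this section."""


class ResourceError(XPointerError):
    """No resource was supplied, or it is not well-formed XML."""


class SubResourceError(XPointerError):
    """The pointer was evaluated but located nothing."""


def locate(xml_text: Optional[str], path: str) -> List[ET.Element]:
    """Evaluate an ElementTree path, mapping failures to the error kinds."""
    if xml_text is None:
        raise ResourceError("no resource to work from")
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ResourceError(f"resource is not well-formed XML: {exc}") from exc
    found = root.findall(path)
    if not found:
        # Unlike XPath, an empty result is treated as an error here.
        raise SubResourceError(f"nothing located by {path!r}")
    return found
```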

EDITOR'S NOTE:
The Working Group has determined that a way to discover a potential error that a sub-resource is not the same as it was when the link was created (such as via a checksum), is the domain of XLink rather than XPointer. Thus a mechanism for that will be considered as part of the XLink rather than XPointer effort. A failure of such a checksum would be an error at the XLink level, not the XPointer level.

4. Summary of XPath

This section is informative. It is intended for the reader's convenience in understanding XPointer. As noted above, the XPath specification, not the summary in this section, is the normative definition for XPath-specific constructs.

[Definition:] A full-form XPointer consists of the scheme identifier "xptr" with an argument that identifies a location by the methods defined here. This argument may but need not be a LocationPath as defined in XPath. The cases where it is not are defined in other, normative, sections of the XPointer specification.

4.1 XPath basics

An XPath LocationPath typically consists of a "/"-separated list of location steps of the form

axis-name :: node-test[predicate]*

For example, the following expression would locate all children of the context node that are of element type "List":

child::List

EDITOR'S NOTE:
The XPath draft uses '/' as the delimiter between location steps. It also defines '//' as an abbreviation for the descendant axis. Unfortunately, some servers appear to change occurrences of '//' to '/' in fragment identifiers. The XML Linking WG is currently investigating this to determine the severity of the problem.

An axis-name specifies a sequence of candidate locations, given certain bindings called a "context" and in particular a current location called the "context node". The context node is initially the document root, and more generally the results, in turn, of a prior location step. A context in XPointer is the same thing as a context in XPath, and is established as described below. The curly braces, a commonplace notation for sets, distinguish use of axes from use of functions, since the evaluation of arguments in the two cases differs.

Predicates are evaluated for each candidate location along the specified axis, and typically test the element type, attributes, positions, and/or other properties of nodes. Multiple predicates may be specified, and serve as successive filters.

For example, the following expression would locate all children of the context node that are of element type "para", and then, for each of those nodes, find the first following sibling that is of type "list":

child::para/following-sibling::list[position()=1]
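
The shape of a location step (axis name, node-test, and zero or more predicates) can be made concrete with a small parser. The following Python sketch is illustrative and makes simplifying assumptions: the names Step, parse_step, and parse_relative_path are hypothetical, predicates may not contain nested brackets or '/' characters, and only relative paths written in the unabbreviated axis::node-test form are handled.

```python
import re
from typing import List, NamedTuple


class Step(NamedTuple):
    axis: str
    node_test: str
    predicates: List[str]


# axis-name :: node-test [predicate]*
STEP_RE = re.compile(r"([A-Za-z-]+)::([^\[\]/]+)((?:\[[^\[\]]*\])*)\Z")


def parse_step(text: str) -> Step:
    """Split one location step into its axis, node-test, and predicates."""
    match = STEP_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a location step: {text!r}")
    axis, node_test, predicate_text = match.groups()
    predicates = re.findall(r"\[([^\[\]]*)\]", predicate_text)
    return Step(axis, node_test, predicates)


def parse_relative_path(path: str) -> List[Step]:
    """Split a '/'-separated relative path into its steps, left to right."""
    return [parse_step(part) for part in path.split("/")]
```

Applied to the expression above, this yields two steps: the child axis with node-test "para" and no predicates, followed by the following-sibling axis with node-test "list" and the single predicate position()=1.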

Each location step locates an ordered list of data portions (for all the XPath axes, but not for XPointer's additional "final" axes, each location is a node). Such a list is called a "context node list" (or simply "context" where unambiguous), and any following location steps are interpreted relative to the nodes in that context node list.

Location steps can be used in a sequence, interpreted from left to right. Such sequences are typically used to express a stepwise refinement of a location. For example, id("MYNOTE")/ancestor::SEC locates all SEC elements that contain the element whose ID attribute has value "MYNOTE", while id("MYNOTE")/ancestor::SEC[position()=1] locates the innermost such SEC (that is, the first one in the sequence located, not the first one in document order).

4.2 XPath axes

XPath's axes, summarized below, provide access to nodes that bear significant relationships to a starting node, for example as children, following nodes, attributes, etc.

Those axes that XPointer shares with XPath always locate entire nodes (such as elements, attributes, whole text portions uninterrupted by markup, etc). However, XPointer's range and string axes (defined in a later section) locate ranges that are not necessarily aligned on element boundaries. A common use case for this is when a user selects a range of text in a browser or word processor and then performs an operation on that range, such as annotating it. Because they are not in fact lists of nodes, but are ranges, location steps using these two axes may not be used as context node lists for further location steps.

XPath provides absolute addresses as well, using an initial "/" to locate the document root, and id() to locate the node with a given XML ID value. These are adopted by XPointer without change; XPointer also adds special absolute addresses as defined below, to make it more effective and general in a primary application domain: addressing to support hypertext linking.

4.3 XPath Relative Axes

The axes described in this section depend on the existence of a context node list, and locate other nodes relative to each of its nodes in turn. Very commonly, steps in an XPointer are for the purpose of navigation within the tree. The fundamental navigation directions inherent for ordered trees, therefore have corresponding relative axes.

The candidate nodes along a given axis are ordered by distance from the 'context node'. Thus, axes such as ancestor and preceding consider nodes in reverse document order. If no context node list is explicitly provided, the context is the root element of the containing resource.

[Definition:] Each of the relative axes locates a list of nodes that are candidates for membership in the resulting context node list. Actual results are selected from the candidates by predicate arguments as described below (if no predicate is specified, all candidates from the axis are selected). All relative axes accept the same form of predicates as arguments. The XPath relative axes are:

child
Locates direct child nodes of the context node. Unless restricted by a predicate, children of all types (element, pi, comment, and text) are located. Attributes are not considered children of the elements that bear them (for locating them, use the attribute axis).
descendant
Locates nodes appearing anywhere within the content of the context node. Unless restricted by a predicate, descendants of all types (element, pi, comment, and text) are located.
descendant-or-self
Identical to the descendant axis except that the context node itself is included as a candidate, preceding all descendants. Note: This is particularly useful when a precise location is specified by an ID, but there are constraints on the maximal unit to retrieve, such as always showing the SUMMARY whether the ID is on the summary or its containing SECTION.
parent
Locates the element nodes directly containing the context node.
EDITOR'S NOTE:

The XSL and XPointer working groups have not achieved consensus on whether an attribute node's bearing element is deemed its "parent". The capability of accessing the bearing node is a requirement in general, and there are particular cases of interest to XSL where it helps to be able to locate the parent for elements, and the bearer for attributes, in a unified operation. On the other hand, if bearers are parents, it means that X being a parent of Y does not imply that Y is a child of X; and means that X and Y having the same parent does not imply they are siblings, etc. One solution would be to introduce a separate "bearer::" axis for the particular relationship (which also has the advantage of being distinguishable for optimization), and if needed for XSL, a bearer-or-parent:: axis akin to the "-or-self" axes, to simplify an important special case.

ancestor
Locates element nodes containing the context node (since only elements properly have children as defined here). The first node in the list is the immediate parent of the context node, the last node in the list is root().
ancestor-or-self
Identical to the ancestor axis except that the context node itself is included as a candidate, preceding all ancestors. This axis is particularly useful when a precise location is specified by an ID, but there are constraints on the minimal unit to be treated as the final location, such as always locating the containing SECTION whether the ID is on the SECTION or on some particular descendant. For example,
id("ref37")/ancestor-or-self::SEC

would return the SEC which has the element with ID "ref37". "ref37" could be on a small leaf element, an intermediate container, or the SEC itself. Using the ancestor:: axis, in contrast, would fail or get a higher SEC if "ref37" were on a SEC element itself. This axis can also be used to get the effect of inherited attributes (such as the xml:lang attribute defined in XML 1.0), by locating the list of all ancestor elements; selecting the desired attribute for each of them (any not having it drop out); and then taking the last one. For example,

id("ref37")/ancestor::*/attribute::lang[position()=last()]

would do this.
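
The effect of an inherited attribute can also be shown procedurally. The Python sketch below walks from a node through its ancestors and returns the nearest enclosing xml:lang value, which is the last such attribute in document order. It is illustrative only: the function names are hypothetical, and Python's xml.etree.ElementTree is used as a stand-in for the tree on which XPointer operates.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# ElementTree reports xml:lang under the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def parent_map(root: ET.Element) -> Dict[ET.Element, ET.Element]:
    """Map every element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def inherited_lang(node: ET.Element,
                   parents: Dict[ET.Element, ET.Element]) -> Optional[str]:
    """Walk ancestor-or-self, nearest first, and return the first xml:lang."""
    current: Optional[ET.Element] = node
    while current is not None:
        if XML_LANG in current.attrib:
            return current.attrib[XML_LANG]
        current = parents.get(current)   # None once the root is passed
    return None
```

Starting the walk at the node itself corresponds to the ancestor-or-self axis, so an xml:lang attribute on the node overrides any value set further up.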

preceding-sibling
Locates sibling nodes (nodes that share the same parent as the context node) that appear before (preceding) the context node. The nodes are considered in reverse document order, so that the first node in the list is the immediately preceding sibling, and the last node in the list is the first child of the parent.
following-sibling
Locates sibling nodes (sharing their parent with the context node) that appear after (following) the context node. The nodes in the list appear in document order.
preceding
Locates nodes that begin before (preceding) the entire context node. The list is in reverse document order: the node closest to the context node first, root() last. Ancestors are included.
following
Locates nodes that begin after (following) the entire context node. The list is in document order: the first node in the list is the first node whose start-tag occurs after the context node's end-tag; no ancestors are included.
self
Locates (for each context node in the context node list), a singleton nodelist containing that same context node. Note: This is useful for applying multiple predicates to a single axis, particularly when predicates other than the first one must test a context node's position among all those context nodes that were selected by the prior predicates.
attribute
The attributes of the context node. If the context node is not of type element, the list is empty. The order of nodes on this axis is undefined. Typically, a single attribute will be selected by name.
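
The ordering rules for several of these axes can be illustrated with a few short functions. The Python sketch below is not normative and covers only element nodes, because Python's xml.etree.ElementTree (used here as a stand-in tree model) does not represent text, comments, or processing instructions as separate nodes by default. The function names are hypothetical, and the preceding, following, and attribute axes are omitted.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

Parents = Dict[ET.Element, ET.Element]


def build_parents(root: ET.Element) -> Parents:
    """Map every element to its parent (ElementTree keeps no parent links)."""
    return {child: parent for parent in root.iter() for child in parent}


def child(node: ET.Element) -> List[ET.Element]:
    return list(node)                      # document order


def descendant(node: ET.Element) -> List[ET.Element]:
    return list(node.iter())[1:]           # iter() yields the node itself first


def ancestor(node: ET.Element, parents: Parents) -> List[ET.Element]:
    found = []
    current = parents.get(node)
    while current is not None:             # nearest ancestor first, root last
        found.append(current)
        current = parents.get(current)
    return found


def ancestor_or_self(node: ET.Element, parents: Parents) -> List[ET.Element]:
    return [node] + ancestor(node, parents)


def _siblings_and_index(node: ET.Element, parents: Parents):
    siblings = list(parents[node])
    index = next(i for i, s in enumerate(siblings) if s is node)
    return siblings, index


def preceding_sibling(node: ET.Element, parents: Parents) -> List[ET.Element]:
    if node not in parents:
        return []                          # the root has no siblings
    siblings, index = _siblings_and_index(node, parents)
    return siblings[:index][::-1]          # reverse document order


def following_sibling(node: ET.Element, parents: Parents) -> List[ET.Element]:
    if node not in parents:
        return []
    siblings, index = _siblings_and_index(node, parents)
    return siblings[index + 1:]            # document order
```

In each function the candidates come back ordered by distance from the context node, so position 1 is always the nearest candidate along that axis.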

4.4 XPath node-tests

NodeTests, like predicates (see below), are applied to each node in the context node list. Exactly those for which the test returns 'true' remain in the list. XPath provides three forms of NodeTest.

First, QNames (qualified names) can be used to filter by attribute name or element type.


child::para
locates the <para> elements that are children of the current node.

child::x:para
locates children of the context node if the element type is "x:para",
where the NCName "x" expands to a URI using the namespace declarations
that are provided in the expression evaluation context.

Second, XPath provides the functions comment(), text(), and processing-instruction() for use as NodeTests. They return true if the context node is of the corresponding type.


child::text()
Locates the text nodes that are children of the context node.

/child::comment()
Locates any comments that are siblings of the document element.

/descendant::processing-instruction()
Locates any processing instructions in the entire document.

Third, a NodeTest can be the wildcard character '*'. When given by itself, it returns true for any node that is of the principal type of the axis. The principal type of the attribute:: axis is the attribute type; the principal type of the other axes is the element type.

The '*' can also be used as the LocalPart of a QName. This allows filtering that selects any element or attribute from a particular namespace.


id("intro")/child::*
Locates all the elements that are direct children of the
intro node. The elements are given in document order.

id("intro")//x:para
Locates only the <para> elements that come from the namespace currently
mapped to the NCName "x". The namespace declarations to use are supplied by
the expression evaluation context.

id("intro")//x:*
Locates any element that is a descendant of the intro node and
comes from the namespace currently mapped to the "x" NCName.
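
Name-based node tests compare expanded names, not prefixes. The Python sketch below illustrates this for element nodes only, using the "{uri}local" form in which Python's xml.etree.ElementTree stores qualified names. The function names expand and matches are hypothetical, and the namespaces argument plays the role of the namespace declarations in the expression evaluation context.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def expand(qname: str, namespaces: Dict[str, str]) -> str:
    """Turn 'x:para' into '{uri}para' using the context's declarations."""
    if ":" in qname:
        prefix, local = qname.split(":", 1)
        return "{%s}%s" % (namespaces[prefix], local)
    return qname


def matches(node_test: str, element: ET.Element,
            namespaces: Dict[str, str]) -> bool:
    """Apply a QName, 'prefix:*', or '*' node test to one element."""
    if node_test == "*":
        return True                                   # any element
    expanded = expand(node_test, namespaces)
    if expanded.endswith("}*"):
        # 'x:*' matches any element whose name is in that namespace.
        return element.tag.startswith(expanded[:-1])
    return element.tag == expanded
```

Because matching is done on the namespace URI, the prefix used in the expression need not be the prefix used in the document.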

4.5 XPath Predicates

All of XPath's relative axes operate using the same potential arguments, a node-test and zero or more predicates, used to filter the candidates along the given axis. XPointer's syntax and semantics for predicates include all of XPath's except for variable binding (there being no means to bind variables within an XPointer at this time). XPointer also adds a small number of predicates needed for its application domain. This section summarizes the predicate language of XPath and defines XPointer's additions.

4.5.1 Introduction to Predicates

What is located by a given location step depends on the axis and the predicate(s) -- the node-test may be considered a special-case of a predicate. The axis defines an ordered list of potential candidates, and the predicates (if any) select actual results from among the candidates. The effect is as described below (no particular processing procedure is mandatory or necessary; implementations may use any method that produces the same results as described here):

  1. The particular axis in use defines an ordered list of candidate nodes, such as all children of the context node, all following siblings, all substrings of the content, etc. When the context node list has more than one node, the axes are applied using each node in the context node list as the context node, and the results are unioned together.
  2. The node-test and predicates (if any) are then evaluated, with each candidate in turn serving as the context of evaluation, or "context node". This step eliminates all candidates for which the node-test and predicates do not evaluate to true. In effect, each candidate is tested against the constraints, and only those which fulfill them remain.
  3. The same process is repeated for each additional predicate, in order.

If no predicate or node-test is specified, all candidates become members of the resulting node list.
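
The filtering procedure above can be sketched in Python as follows. This is only a toy model under stated assumptions: candidates are (element type, label) tuples rather than real nodes, a predicate is a function of the candidate, its position, and the list size, and apply_step is a hypothetical helper name. Positions are recomputed before each predicate, which is one reading of step 3.

```python
def apply_step(candidates, node_test, predicates):
    """Filter an axis's ordered candidate list by a node-test and predicates."""
    # Step 2: candidates failing the node-test are eliminated.
    survivors = [c for c in candidates if node_test(c)]
    # Step 3: each further predicate filters the survivors of the previous
    # one, so position and size are recomputed every time.
    for predicate in predicates:
        size = len(survivors)
        survivors = [c for position, c in enumerate(survivors, start=1)
                     if predicate(c, position, size)]
    return survivors


# Step 1 (assumed): the child axis produced these candidates in document order.
children = [("title", "t"), ("para", "p1"), ("note", "n"),
            ("para", "p2"), ("para", "p3")]

is_para = lambda node: node[0] == "para"
is_last = lambda node, position, size: position == size

# Roughly child::para[position()=last()]
last_para = apply_step(children, is_para, [is_last])

# With a test that accepts everything and no predicates, all candidates remain.
everything = apply_step(children, lambda node: True, [])
```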

A node-test is a name or "*". A node test that is a QName tests whether the node is an element or attribute with the specified name. For example, attribute::href selects the href attribute of the context node; if the context node has no href attribute, it will select an empty set of nodes. "*" locates any element or attribute node regardless of name (but not other node types). Either a name or "*" may be qualified by a namespace, and if so, matching requires that the namespace prefixes resolve to the same namespace URI.

Several kinds of tests are provided for use within predicates, and are detailed in following sections. For ease of presentation, this section divides them into Positional tests and Local structure tests. The basic tests can be combined with Boolean, numeric, string and other operators, and parentheses; see XPath for the normative specifications of these features.

4.5.2 Positional tests

XPath provides several ways of testing the position of a node among a given list of elements. First, the XPath predicate function "position()" returns the integer position, counting from 1, of the context node within the context node list. For example,

child::*/child::*[position()=1]

locates the list of all first element children of all element children of the node(s) in the initial context node list.

Positions within the set of siblings in the actual document structure can be tested by generating that list. For example,

 following-sibling::item[position()=1] 

locates the first <item> element that is a sibling occurring after the context node.

The XPath predicate function "last()" returns the position of the last node in the context node list. For example,

 child::chapter[position()=last()] 

locates the last chapter child of the context node.

The XPath predicate function "count(LocationPath)" evaluates the embedded LocationPath, and returns the number of nodes identified. For example,

 child::customer[count(child::car) > 1]

locates all customer elements which have more than one car element child.
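
The three positional functions can be tried out informally with Python's xml.etree.ElementTree. That library accepts only abbreviated forms such as chapter[1] and chapter[last()] and has no count() function, so the count test is written out in Python. The sample document is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

DOC = """<book>
  <chapter n="1"/><chapter n="2"/><chapter n="3"/>
  <customer name="Ann"><car/><car/></customer>
  <customer name="Bob"><car/></customer>
</book>"""
book = ET.fromstring(DOC)

# child::chapter[position()=1] and child::chapter[position()=last()],
# in the abbreviated forms that ElementTree accepts.
first_chapter = book.find("chapter[1]")
last_chapter = book.find("chapter[last()]")

# child::customer[count(child::car) > 1], written out by hand.
multi_car = [c for c in book.findall("customer")
             if len(c.findall("car")) > 1]
```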

4.5.3 Local structure tests

As described above, tests can be made for the XML node type (such as element vs. processing instruction), for the particular element type name of an element, and for attribute-values. Selection using a named element type is strongly recommended, as it increases the chance that an XPointer will locate no data instead of the wrong data, if re-used after the resource changes.

Also, note that location paths can be embedded in other ones as predicates. Such a location path is evaluated relative to the current context node at that point. An embedded location path that returns a non-empty node list is considered true. This provides very effective ways of characterizing nodes in terms of their document contexts.

Certain predicates, such as tests of attribute values, can only be fulfilled by certain node types, such as elements. It is not an error to apply such predicates to other nodes; they simply will never come out true.
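
A location path embedded as a predicate can be illustrated with a short Python sketch. It relies on xml.etree.ElementTree, whose limited path language accepts a child element name as a predicate; the sample document is an assumption for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <section id="s1"><title>One</title></section>
  <section id="s2"/>
  <section id="s3"><note>n</note><title>Three</title></section>
</doc>"""
doc = ET.fromstring(DOC)

# Roughly child::section[child::title]: the embedded path child::title is
# evaluated with each section as context node; a non-empty result is true.
titled = doc.findall("section[title]")

# The same test spelled out: an empty list is false, a non-empty one true.
titled_explicit = [s for s in doc.findall("section") if s.findall("title")]
```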

4.6 Examples of axis usage

child::
id("intro")/child::* locates all the child elements of the intro node. That is in contrast to id("intro")/child::node() which locates all the nodes, of any type, that are children of the intro node. id("intro")/child::para locates all the element nodes whose element type (tag name) is para and that are direct children of the intro node. id("intro")/child::text() locates all the text nodes that are direct children of the intro node. XPath defines a number of abbreviations. Perhaps the most important of these allows the axis identifier and parentheses to be omitted from a Step. When omitted, the child:: axis identifier is inferred. Thus, id("intro")/para is equivalent to id("intro")/child::para.
descendant::
id("intro")/descendant::* locates all the descendant element nodes of the intro node. A more common use of the descendant axis is in AbsoluteLocationPaths. For example, descendant::para locates all the para elements contained in the context node. This is expected to be used frequently enough that XPath provides the string "//" as a very compact abbreviation for the descendant:: axis identifier. (Actually, "//" is an abbreviation for "/descendant-or-self::node()/".) The XPath LocationPath //para will also select all the para elements in the context node.
ancestor::
This axis is typically used to obtain the parent node of one identified by other means. For example id("intro")/ancestor::*[1] locates the parent element of the intro element.
attribute::
id("intro")/attribute::* locates the attributes of the intro element. If we assume that the attribute of type ID was also named ID, then id("intro")/attribute::ID would locate it.
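
The four axes above can be approximated in Python with xml.etree.ElementTree, as a rough sketch only. The sample document is an assumption, the lookup by an attribute named "id" merely stands in for id("intro"), and the parent map is needed because ElementTree keeps no parent links.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <chapter>
    <section id="intro" status="draft">lead text<para>one</para><note>n</note><para>two<emph>!</emph></para></section>
  </chapter>
</doc>"""
root = ET.fromstring(DOC)

# Stand-in for id("intro"); assumes the attribute named "id" is the ID.
intro = root.find(".//*[@id='intro']")

# child::* and child::para
child_elements = list(intro)
child_paras = intro.findall("para")

# descendant::* (Element.iter() also yields the element itself, so skip it)
descendants = [e for e in intro.iter() if e is not intro]

# ancestor::*[1]: look the parent up in a child-to-parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}
parent = parent_of[intro]

# attribute::*
attributes = dict(intro.attrib)
```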

5. XPointer extensions to XPath

This section is normative.

The extensions XPointer makes beyond XPath include:

5.1 Evaluation context

XPointer expressions are evaluated according to the same kind of context as described in the XPath specification. XSL provides the means of initializing such a context for use there. This section defines how the evaluation context must be initialized by XPointer implementations prior to evaluating an XPointer.

An XPath evaluation context contains:

5.1.1 Initialization of the context node

The context node is initialized to the root node of the XML document into which the XPointer is directed. When the XPointer is a fragment identifier of a URI, the document identified by the URI is that document; its root node is not its document element, but an abstract node that directly contains the document element (see the Glossary).

5.1.2 Initialization of the context node list

The context node list is initialized to a singleton list containing only the context node.

5.1.3 Initialization of the variable bindings

No means for initializing these is currently defined for XPointer implementations. Thus, the set of variable bindings used when evaluating an XPointer is empty. This means that the use of a variable reference in an XPointer must result in a 'definition error'.

5.1.4 Initialization of the function library

In addition to the functions required by the XPath specification, XPointer implementations must also provide implementations of all functions defined in the XPointer specification.

5.1.5 Initialization of the namespace declarations

If the XPointer occurs in an XML document, the namespace declarations in its evaluation context must be initialized to the namespace declarations currently in scope for the XPointer.

If the XPointer does not occur in an XML document, no namespace declarations shall be in its evaluation context.

EDITOR'S NOTE:

The editors realize that there are problems with initializing the XPointer namespace context from the document context of an XPointer. This notation is very clear for users and authors, and is trivial to implement in a client-side processor. However, it is not easily implementable when a server rather than the client is to interpret an XPointer, since the server would not have direct access to the document's namespace bindings. Possible solutions include requiring clients to translate element prefixes in XPointers before forwarding them to servers, or extending the XPointer grammar to incorporate namespace declarations. Feedback on this point is requested.

5.2 XPointer axes

A LocationPath as defined in XPath always identifies a complete node or set of complete nodes (or nothing) in an XML document. XPointer adds two relative axes: range:: and string::, as defined here. Both locate contexts that are not generally nodes, and so cannot be used as context node lists for further location steps. Such a location is called a "spanning" result, and an XPointer that produces one is called a "spanning XPointer".

5.2.1 The range axis

The range axis locates a sub-resource starting at the beginning of the data selected by its first argument and continuing through to the end of the data selected by its second argument. The second argument uses the first argument as its location source.

range:: may occur only via a new outermost production. It selects the range from the beginning of one location, to the end of another:

GenlLocationPath

    GenlLocationPath   ::= Range | LocationPath
    Range              ::= 'range' '::' LocationPath ',' LocationPath 

For example,
id("a23")/range::child[1],following-sibling[2]

is a spanning XPointer that selects the first through third children of the element with ID a23.

range:: cannot be used to produce a context node list used by other location steps. In particular, it cannot be used as an argument to another range::.
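
One way to picture a spanning result is as a pair of boundary locations rather than a node list. The Python sketch below models the example above (first through third children of the element with ID a23) under that assumption; the sample document and the tuple representation are illustrative choices, not something this specification prescribes.

```python
import xml.etree.ElementTree as ET

DOC = '<list id="a23"><item>1</item><item>2</item><item>3</item><item>4</item></list>'
a23 = ET.fromstring(DOC)

children = list(a23)

# First argument: the first child of the location source.
start = children[0]

# Second argument, evaluated with the first as its location source:
# the second following sibling of start.
siblings_after = children[children.index(start) + 1:]
end = siblings_after[1]

# The spanning location is kept as a (start, end) pair.
span = (start, end)

# For this simple sibling case, the covered children can be listed directly.
covered = children[children.index(start): children.index(end) + 1]
```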

5.2.2 The string axis

EDITOR'S NOTE:

This is likely to change, since XPath is likely to specify this functionality.

The string:: axis selects one or more strings or point positions in the location source, based on matching actual character content.
String-match term
[1] StringTerm ::= 'string' '::' SkipLit (',' Position (',' Length (',' predicate)?)?)?
[2] Position ::= ('+' | '-')? [1-9] Digit* | 'end'
[3] Length ::= [1-9] Digit*

Matches are found in document order and are non-overlapping.

SkipLit
Identifies the candidate string to be found within the location source. A null SkipLit string is considered to identify the position immediately preceding each character in the location source. For example, assuming that the element with ID x37 contains the character string "Thomas", the following XPointer locates the position before the third character ("o"):
id("x37")/string::3,""
Position
Specifies a character offset from the start of the candidate string(s) to the beginning of the desired final string match. The position number may not be zero; if omitted, it is assumed to be 1, the position immediately preceding the first character of the match. A positive position number counts in-order from the beginning of the specified string. A negative position number counts in reverse order, from the end of the string; for example, position -1 is the position immediately preceding the last character in the match. The reserved position value of end selects the position immediately following the last character of the match. Positions may not extend outside the location source; larger values in either direction are treated as locating the respective end of the location source.
Length
Specifies the number of characters to be selected. A length of zero or an omitted length locates a point preceding a character as indicated by Position; a length of one locates a single character. Length may not extend the match beyond the end of the location source; for any larger number the located string extends to the end of the location source.
predicate
Selects among the resulting list of non-overlapping occurrences of the specified string. Most typically, this would be done simply by choosing a numbered instance, such as [position()=1].

When the context node is a PI or comment node, string operates on the content of that node. However, the content of PIs and comments is not otherwise considered text content.

For example, the following XPointer selects the position immediately preceding the letter "P" (8 from the start of the string) in the third occurrence of the string "Thomas Pynchon":
string::3,"Thomas Pynchon",8 

The following XPointer selects the fifth exclamation mark and the character immediately following it:

string::5,'!',1,2

For purposes of string matching, the text of the element means all the character data in the element(s) in the current location source and descendant elements.

Markup characters are ignored. Thus a match may still occur when markup intervenes. For example,

string::1,'affect'

will match at a point in the document structure corresponding to XML input such as

<corr sic='e'>a</corr>ffect

Sequences of whitespace characters are treated as equivalent to a single space character. This is true both for the string to be found, and for the content being examined. Thus,
string::1,"hello   world"

(with multiple whitespace characters between the two words) will match at a location in content such as
<p>hello     worlds</p>

(with different whitespace characters between the words).

No case-folding or combining-character normalization is performed. Thus, there would be no match to "Thomas Pynchon" in the following example. The first seeming match differs in case, and the second by omission of the word-separating space:
thomas pynchon
<auth><first>Thomas</first><family>Pynchon</family></auth>
EDITOR'S NOTE:

The WG is considering adding case-insensitive string matching, since in some languages it is highly desirable, even though across languages the definition of such operations is complex. See the IETF internet draft for DASL (http://egg.microsoft.com/dasl/files/draft-dasl-requirements-00.txt) for more on this. Some of the concepts there require knowledge of particular XML schemas.
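
The matching rules of the string axis (markup ignored, whitespace runs treated as one space, no case folding, non-overlapping matches in document order) can be sketched in Python as below. This is a minimal sketch under stated assumptions: string_match is a hypothetical helper, the instance number is taken as the first argument as in the examples above, and the result is a pair of offsets into the whitespace-normalized text rather than a true document location.

```python
import re
import xml.etree.ElementTree as ET


def normalize(text):
    # Runs of whitespace are treated as a single space character.
    return re.sub(r"\s+", " ", text)


def string_match(element, instance, literal, position=1, length=0):
    """Return (start, end) offsets into the normalized text, or None.

    Offsets are 0-based and half-open; start == end denotes a point.
    """
    if position == 0:
        raise ValueError("the position number may not be zero")
    text = normalize("".join(element.itertext()))  # markup is ignored
    literal = normalize(literal)
    if literal == "":
        # A null string identifies the position before each character.
        starts = list(range(len(text)))
    else:
        # re.finditer yields non-overlapping matches in document order.
        starts = [m.start() for m in re.finditer(re.escape(literal), text)]
    if instance < 1 or instance > len(starts):
        return None
    match_start = starts[instance - 1]
    match_end = match_start + len(literal)
    if position == "end":
        begin = match_end
    elif position > 0:
        begin = match_start + position - 1
    else:
        begin = match_end + position  # -1 is just before the last character
    begin = max(0, min(begin, len(text)))  # clamp to the location source
    end = min(begin + length, len(text))
    return begin, end


doc = ET.fromstring(
    "<p>by <first>Thomas</first>\n <family>Pynchon</family>, 1973</p>")

# Point immediately preceding the "P" of the first "Thomas Pynchon".
before_p = string_match(doc, 1, "Thomas Pynchon", position=8)
```

A real implementation would also have to map such offsets back to positions in the document tree, which this sketch does not attempt.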

5.3 XPointer absolute location paths

The location mechanisms described in this section do not depend on the existence of a context node list. They can be used to establish an initial context node list or can serve as self-contained addresses.

5.3.1 / (root)

Locates the entire resource; not the document element, but the parent which is over it as well as any adjacent processing instructions and/or other nodes permitted by XML. If an XPointer has no leading absolute location step (that is, it consists only of a RelPath), it is equivalent to having a leading / since the context node in the expression evaluation context is initialized to the root.

EDITOR'S NOTE:

Access to the internal structure of DTDs and the XML declaration is unspecified at this time.

5.3.2 id

Locates the element in the containing resource with an attribute having a declared value (type) of ID and a value matching the given Name. XML defines such comparison to be case-sensitive, and restricts the range of characters allowed in names.

For example, the function id("a27") returns a node list containing the element of the resource which has an attribute declared to be of type ID whose value is a27.

Note: It is only possible to detect attributes that are of type ID if the document specifies a DTD or defines the attributes in the internal subset, and if the application supports such declarations. Applications that do not support the declaration of IDs cannot reliably interpret the id() function.

A similar but not identical alternative is to use descendant:: to find the attribute by name and value. Using name or value alone may not suffice, since different element types might declare like-named attributes only some of which are IDs, and since a given ID value may also appear as the value of many non-ID attributes (especially IDREFs) in a document.
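
The difference between an ID lookup and a search by attribute value can be seen in a small Python sketch. Since xml.etree.ElementTree does not expose declared attribute types, the set of ID-typed attribute names is simply assumed here; the sample document, ID_ATTRIBUTE_NAMES, and by_id are hypothetical names for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
  <sec id="a27">target</sec>
  <ref target="a27">pointer to the section</ref>
  <img id="a27-fig" alt="a27"/>
</doc>"""
doc = ET.fromstring(DOC)

# Assumption: only attributes named "id" are declared to be of type ID.
ID_ATTRIBUTE_NAMES = {"id"}


def by_id(root, value):
    """Elements bearing an (assumed) ID attribute with the given value."""
    return [e for e in root.iter()
            if any(e.get(name) == value for name in ID_ATTRIBUTE_NAMES)]


located = by_id(doc, "a27")

# Matching on the value alone also picks up the IDREF on <ref> and the
# unrelated alt attribute on <img>.
by_value_only = [e for e in doc.iter() if "a27" in e.attrib.values()]
```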

5.3.3 here

XPointer adds the here() function to enable locating the node for the element directly containing (as text) or bearing (as an attribute) the XPointer. This expression is an error if the XPointer does not appear in an XML document. It is especially useful for representing reusable relative links when the links reside directly at one of their endpoints, such as "the containing chapter".

5.3.4 origin

XPointer adds the origin() function to enable addressing relative to out-of-line links as defined in XLink. origin() thus provides a meaningful context node list for any following location steps only if the XPointer is being processed by application software in response to a user request for traversal, for example as defined in the XLink specification. In that case, origin() locates the sub-resource from which the user initiated traversal. This allows XPointers to be used in applications to express relative locations when links do not reside directly at one of their endpoints.

It is a resource error to use origin() in a locator where a URI is also provided and identifies a containing resource different from the resource from which traversal was initiated, or in a situation where user traversal is not occurring.

5.4 XPointer predicate functions

XPointer adds the predicate function unique(). unique() takes no arguments. It returns true if and only if the current context node list contains exactly one node. It has the same syntax as defined for other functions in XPath.

This is particularly useful in applications where a single node is to be located for automated processing, and an XPointer that locates zero or multiple nodes would be in error.

unique() may be considered shorthand for an XPath expression that counts the number of items in the context node list and compares it to 1; it is included because this is considered a very common need for XPointers, and because giving it a separate expression may simplify optimization of XPointer evaluation, such as by using methods understood from the treatment of singleton results in database systems.
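
In code, the test amounts to comparing the size of the context node list to 1, as the following Python sketch shows. The context node list is modelled as a plain list, and locate_single is a hypothetical helper showing how an application might reject results that are not singletons.

```python
def unique(context_node_list):
    """True if and only if the context node list holds exactly one node."""
    return len(context_node_list) == 1


def locate_single(nodes):
    # Hypothetical helper: refuse to proceed unless exactly one node
    # was located.
    if not unique(nodes):
        raise LookupError(f"expected exactly one node, found {len(nodes)}")
    return nodes[0]
```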

5.5 Locations That Are Not Simply Nodes

User selection in most UIs provides an everyday example of data portions that are seldom elements. The most ubiquitous link-creation interface is to let the user make a selection, and then create an XPointer that locates it (perhaps then using it in a hyperlink). To support even this simple case, XPointer must provide the ability to express non-node locations.

Therefore, XPointer adds mechanisms beyond those of XPath to locate data portions that are not nodes. Such locations are generally called "spanning locations", and are produced by range:: and string:: as defined above.

Spanning locations correspond closely to the DOM "range" construct, and can model user selection. Some details of their semantics are described here.

5.5.1 String axis semantics

The string axis generally locates only part of a node. It may happen that all the characters occur directly in the same element, but this need not be so (and even so, locating the range of all characters in a node is not quite the same thing as locating the node itself). The matched content may well have had markup intervening, and the result may include portions of multiple elements.

For example, a string that specified the 12 characters beginning at the c below would locate the entire text content of the EMPH element, plus the text region that follows the EMPH inside the P:
<P>Hello,
<EMPH>cruel</EMPH> world.</P> 

There is no way to represent such a data portion accurately as a single node, a well-formed XML document, or a sequence of nodes. Even if characters are considered individual nodes, the EMPH node itself remains an issue: it is only partly included in the located data portion. Because of this, located strings require a more complex representation, equivalent to DOM ranges.
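
The example can be checked with a few lines of Python. The sketch uses xml.etree.ElementTree, where the text following an element is stored as that element's tail; the variable names are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<P>Hello,\n<EMPH>cruel</EMPH> world.</P>")
emph = p.find("EMPH")

# All character data of P and its descendants, with markup ignored.
text = "".join(p.itertext())
start = text.index("c")
located = text[start:start + 12]

# The located characters come from two different places in the tree:
# all of EMPH's own text, plus the text that follows EMPH inside P.
from_emph = emph.text
after_emph = emph.tail
```

The located string is the concatenation of from_emph and after_emph, so it is neither the EMPH element alone nor the whole of P.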

5.5.2 Range axis semantics

The range axis may only be used as defined in the grammar, as the outermost axis, with two other location paths as arguments. It has similar characteristics to string, but specifies its bounds in a different way. For example, a range may select the first 3 sections of a chapter; and the following XPointer selects everything from the last P element in one section through the first P in another:

range::id("sec2.1")/P[last()],id("sec2.2")/P[1] 

Like string, range does not generally produce subtrees of XML documents. Also like string (but more obviously here), the results are not mere literal content data strings, since that would lose contextual information, such as all the markup, as well as making many instances of a single string (such as "the") indistinguishable.

Just as with string, implementations cannot represent a spanning location merely as single nodes or as well-formed XML documents; instead they must represent them by some other means that can express their greater generality, for example the DOM range construct.

5.5.3 Multiple spanning locations

It is possible to combine location steps in such a way that multiple spanning locations are involved, although this is expected to be a rare need; for example, the typical "select, create link" interface described above would not typically create such situations.

Creating a range from each member of a multiple location is, however, not complex. No special rules apply. The location steps that result in spanning locations are simply evaluated for each member of the context node list independently (as usual for any context node list with more than one member), and a list of ranges is located.

The more complex case arises when the first argument of range produces a context node list containing more than one node. For example, in a document that used empty revision-start and revision-end elements to mark the boundaries of edits (a debatable practice, but known to be done), this XPointer would locate them all:

range::descendant::REVST, following::REVEND[1] 

For every revision, this locates the range from a REVST to the first following REVEND element.
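
The pairing of each REVST with the first following REVEND can be sketched in Python as follows. The sample document is an assumption; because the marker elements are empty, "later in document order" coincides with the following axis here.

```python
import xml.etree.ElementTree as ET

DOC = """<doc><p>a<REVST n="1"/>b<REVEND n="1"/>c</p>
<p><REVST n="2"/>d</p><p>e<REVEND n="2"/>f<REVEND n="3"/></p></doc>"""
doc = ET.fromstring(DOC)

# Element.iter() visits elements in document order.
in_order = list(doc.iter())

ranges = []
for index, node in enumerate(in_order):
    if node.tag != "REVST":
        continue
    # following::REVEND[1]: the first REVEND after this REVST.
    end = next((n for n in in_order[index + 1:] if n.tag == "REVEND"), None)
    if end is not None:
        ranges.append((node, end))
```

Each member of the first argument contributes at most one range, so the result is a list of (start, end) pairs, one per revision.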

Multiple locations from a single member of the first argument are prohibited for the second argument of the range axis, on grounds of simplicity. So, for example, it is not permitted to create a list of many ranges starting at the same place, such as from one REVST to every following REVEND:

range::descendant::REVST[1], following::REVEND

5.6 Link Persistence

It is impossible to guarantee that links to target resources will never break. The resources could be deleted entirely, or changed enough that even the most robust link will break. Alternatively, the author could rewrite a resource to discuss another subject entirely, making all links irrelevant even if they refer to resources using IDs. However, under typical conditions, XPointers can be made reasonably robust.

The most robust locators are usually those which use only an ID, and this is the preferred locator when available. However, not all elements have IDs, and link creators often do not have enough control over a target resource to have an ID added to it. In such cases the preferred locator is one that points to a nearby element that does have an ID, and then locates the desired element relative to that. This form is relatively robust. That is, it has a good probability of withstanding editing. For example, no edit outside the path between the ID'd element and the result can harm the reference.

In addition, where relative axes are used, selection by named element type is preferred over selection without it, for two reasons:

6. Conformance

This section is normative.

A string conforms to XPointer if it adheres to the syntactic requirements imposed by this specification (including that part of XPath included by reference). Note that this does not require that the string, in association with a URI, actually point to a resource that exists at any given moment.

Application software conforms to XPointer if it interprets all XPointer-conforming strings according to the semantics prescribed by this specification. The syntax must be recognized and processed correctly regardless of whether full forms or abbreviated forms are used.

Applications must recognize and correctly interpret XPointers that appear as the fragment identifier of a URI that locates a resource of MIME media type text/xml or application/xml. Otherwise, application software is free to define its own requirements on where XPointer strings will be recognized. For example, an XML application program might choose to recognize XPointers only when they occur in locator attributes of XLink elements.
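
As an informal sketch of the first step an application might take, the following Python code separates the fragment identifier from the rest of a URI using the standard urllib.parse module. The function name split_xpointer and the example URI are assumptions, and the use of unquote reflects an assumed escaping convention that this section does not define.

```python
from urllib.parse import urlsplit, unquote


def split_xpointer(uri):
    """Return (resource URI without fragment, fragment identifier or None)."""
    parts = urlsplit(uri)
    resource = parts._replace(fragment="").geturl()
    # Assumption: percent-escapes in the fragment are undone before parsing.
    fragment = unquote(parts.fragment) if parts.fragment else None
    return resource, fragment


resource, pointer = split_xpointer(
    'http://example.org/doc.xml#id("intro")/child::para')
```

Whether the fragment is then interpreted as an XPointer depends on the media type of the retrieved resource, which this sketch does not check.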


Appendices

A. Glossary

This appendix is normative.

The following basic terms apply in this document.

[Definition:] axis
A reserved name that defines a sequence of data portions. Typically, an XPointer utilizes axes in combination with predicates to select particular data portions. Some axes define their sequence relative to a particular node, such as all children or all attributes of a particular node, the character ranges within a given subtree, etc. Other axes define a sequence that does not depend on context, such as the element with a particular ID, or the absolute root of the document.
[Definition:] Context node list
An ordered list of document nodes such as produced by all of the axes common to XPointer and XPath. A context node list can be the final result of evaluating an XPointer, or the reference point for interpreting following location steps within an XPointer. In the latter case, each member of the context node list is treated independently, as a separate context in which the following location step is evaluated.
[Definition:] element tree
An abstract representation of the relevant structure specified by the tags, attributes, and other markup constructs in an XML document. The abstract tree relevant to XPointer is being defined by the XML Information Set Working Group. It includes nodes representing elements, text, processing instructions, and other constructs. Some nodes bear additional information such as their element type name, an attribute list, etc.; however, such information items are not considered "children" of the nodes that bear them.
[Definition:] link
An explicit structure that connects and/or describes data objects or portions of data objects. This specification facilitates but does not define the representation of links; linking mechanisms for XML are defined in the XLink specification.
[Definition:] linking element
An XML element that asserts the existence of a link and describes its relevant characteristics.
[Definition:] locator
Data, typically associated with a link, that identifies a resource or a portion of one. For example, a URI. This specification is primarily concerned with defining the fragment identifier portion of URIs for use with XML documents, to specify particular sub-resources.
[Definition:] node
An entire information set object from an XML document, such as an element, attribute, processing instruction, comment, or text portion (the last extending all the way to the nearest adjacent non-text object at each end -- a smaller text portion is a range, not a node).
[Definition:] predicate
An expression that, given a sub-resource location, can be evaluated to produce a true or false result. XPointer uses predicates to select sub-resources (for example, elements) relative to other ones in documents.
[Definition:] resource
In the abstract sense, an addressable service or unit of information. Examples include files, images, documents, programs, and query results. Concretely, anything reachable by the use of a locator. Note that this term and its definition are taken from the basic specifications governing the World Wide Web.
[Definition:] root
The unique node in an element tree which has no ancestors. As in XSL, this is not the document element (such as <HTML>...</HTML>), but an abstract node which directly contains the document element. This is because there may be processing instructions and/or comments in XML that reside outside the document element, and properly speaking, the "root" should contain them as well as the document element. The "/" absolute location path (see section 5.3.1) gives direct access to this node.
[Definition:] singleton
A location that consists of a single, contiguous portion of a document. Some XPointers can locate multiple data portions, such as all the separate ITEMs in a LIST; when an XPointer instead locates only a single contiguous data portion, such as an element or a range of character data, that location is said to be a singleton.
[Definition:] sub-resource
A portion of a resource, pointed to as the precise destination of a link. Although the link ultimately points to the sub-resource, the containing resource or a larger portion of it may provide important context and may need to be displayed in some fashion along with the sub-resource. As one example, a link might specify that an entire document (or a section of one) be retrieved and displayed, but that the desired sub-resource is a particular P or NOTE that is to be highlighted, scrolled to, etc. Sub-resource is a similar concept to anchor, but is not typically represented by explicit markup (such as <A NAME="...">) within the resource.

B. References

This appendix is informative.

XLINK
Eve Maler and Steve DeRose, editors. XML Linking Language (XLink) V1.0. ArborText, Inso, and Brown University. Burlington, Seekonk, et al.: World Wide Web Consortium, 1998. (See http://www.w3.org/TR/WD-xlink.)
ISO/IEC 10744
ISO (International Organization for Standardization). ISO/IEC 10744-1992 (E). Information technology -- Hypermedia/Time-based Structuring Language (HyTime). [Geneva]: International Organization for Standardization, 1992. Extended Facilities Annex. [Geneva]: International Organization for Standardization, 1996. (See http://www.ornl.gov/sgml/wg8/docs/n1920/html/n1920.html.)
IETF RFC 1738
IETF (Internet Engineering Task Force). RFC 1738: Uniform Resource Locators. 1994. (See http://www.w3.org/Addressing/rfc1738.txt.)
IETF RFC 1808
IETF (Internet Engineering Task Force). RFC 1808: Relative Uniform Resource Locators. 1995. (See http://www.w3.org/Addressing/rfc1808.txt.)
TEI
C. M. Sperberg-McQueen and Lou Burnard, editors. Guidelines for Electronic Text Encoding and Interchange. Association for Computers and the Humanities (ACH), Association for Computational Linguistics (ACL), and Association for Literary and Linguistic Computing (ALLC). Chicago, Oxford: Text Encoding Initiative, 1994.
DOM
Document Object Model Specification. World Wide Web Consortium, 1997. (See http://www.w3.org/TR/WD-DOM.)
CHUM
Steven J. DeRose and David G. Durand. 1995. The TEI Hypertext Guidelines. In Computers and the Humanities 29(3). Reprinted in Text Encoding Initiative: Background and Context, ed. Nancy Ide and Jean Véronis, ISBN 0-7923-3704-2.

C. Working Group Members

This appendix is informative.

EDITOR'S NOTE:

The lists below are definitely incomplete. Please forward additions and corrections to the editors.

The first working drafts of this specification were developed in the XML Working Group, whose members are listed in the XML Recommendation. The work was passed on and completed in the XML Linking Working Group, with the following members at various times:

The semantics and functionality of XPath were developed along different paths by the XSL WG and the XML Linking WG. A summit meeting determined that unification was desirable, and work toward such unification has been carried out by a joint technical committee. The following people participated in the summit meeting and/or the subsequent technical unification work: Sharon Adler, Adam Bosworth, James Clark, Paul Cotton, Ron Daniel, Steve DeRose, Jonathan Marsh, Bill Smith, Ben Trafford.