31 October 2001

1. Document Object Model XPath

Editor:
Ray Whitmer, Netscape/AOL

1.1. Introduction

XPath 1.0 [XPath 1.0] is becoming an important part of a variety of specifications, including XForms, XPointer, XSL, and XML Query. It is also a clear advantage for user applications which use DOM to be able to use XPath expressions to locate nodes automatically and declaratively. But liveness issues have plagued each attempt to get a list of DOM nodes matching specific criteria, as would be expected for an XPath API. There have also traditionally been model mismatches between DOM and XPath. This proposal specifies new interfaces and approaches to resolving these issues.

1.2. Mapping DOM to XPath

This section considers the differences between the Document Object Model [DOM Level 3 Core] and the XPath 1.0 model [XPath 1.0].

1.2.1. Text Nodes

The XPath model relies on the XML Information Set [XML Information set] and represents Character Information Items in a single logical text node, where DOM may have multiple fragmented Text nodes due to CDATA sections, entity references, etc. Instead of returning multiple nodes where XPath sees a single logical text node, only the first non-empty DOM Text or CDATASection node of any logical XPath text will be returned in the node set. Applications using XPath in an environment with fragmented text nodes must manually gather the text of a single logical text node, possibly from multiple nodes, beginning with the first Text node or CDATASection node returned by the implementation.
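The manual gathering described above can be sketched in Python with the standard library's xml.dom.minidom. The helper name gather_logical_text is hypothetical and not part of this specification. The sketch assumes the node passed in is the first Text or CDATASection node of the logical text node, and for simplicity it does not look inside entity references.

```python
from xml.dom import Node
from xml.dom.minidom import getDOMImplementation

TEXT_TYPES = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def gather_logical_text(first):
    """Join the data of `first` and of the adjacent Text/CDATASection siblings after it."""
    parts = []
    node = first
    while node is not None and node.nodeType in TEXT_TYPES:
        parts.append(node.data)
        node = node.nextSibling
    return "".join(parts)


# Build an element whose content is split over three fragmented nodes:
# a Text node, a CDATASection node, and another Text node.
doc = getDOMImplementation().createDocument(None, "p", None)
p = doc.documentElement
p.appendChild(doc.createTextNode("Hello, "))
p.appendChild(doc.createCDATASection("<World>"))
p.appendChild(doc.createTextNode("!"))

logical_text = gather_logical_text(p.firstChild)
```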

Note: In an attempt to better implement the XML Information Set, DOM Level 3 Core [DOM Level 3 Core] adds the attribute wholeText on the Text interface for retrieving the whole text for logically-adjacent Text nodes and the method replaceWholeText for replacing those nodes.

1.2.2. Namespace Nodes

The XPath model expects namespace nodes for each in-scope namespace to be attached to each element. DOM and certain other W3C Information Set conformant implementations only maintain the declaration of namespaces instead of replicating them on each Element where they are in-scope. The DOM implementation of XPath returns a new node of type XPATH_NAMESPACE_NODE, defined in the XPathNamespace interface, to properly preserve identity and ordering. This node type is only visible using the XPath evaluation methods.

1.2.3. Document order

The document order of nodes in the DOM Core has been defined to be compatible with the XPath document order. The XPath DOM extends the document order of the DOM Core to include the XPathNamespace nodes. Element nodes occur before their children. The attribute nodes and namespace nodes of an element occur before the children of the element. The namespace nodes are defined to occur before the attribute nodes. The relative order of namespace nodes is implementation-dependent. The relative order of attribute nodes is implementation-dependent. The compareTreePosition method on the Node interface defined in the DOM Core must compare the XPathNamespace nodes using this extended document order if the XPath DOM module is supported.
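As an illustration only, the ordering rules above can be sketched over a toy tree. The Elem class and the document_order function are hypothetical names, not part of any DOM binding. The order in which namespaces and attributes are listed within their own group is simply the order in which they were stored, since the specification leaves it implementation-dependent.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Elem:
    """Toy element: in-scope namespace prefixes, attribute names, child elements."""
    name: str
    namespaces: List[str] = field(default_factory=list)
    attributes: List[str] = field(default_factory=list)
    children: List["Elem"] = field(default_factory=list)


def document_order(elem: Elem) -> Iterator[Tuple[str, str]]:
    """Yield (kind, name) pairs in the extended document order."""
    yield ("element", elem.name)           # the element comes first
    for prefix in elem.namespaces:         # then its namespace nodes
        yield ("namespace", prefix)
    for attr in elem.attributes:           # then its attribute nodes
        yield ("attribute", attr)
    for child in elem.children:            # then its children, recursively
        yield from document_order(child)


tree = Elem("a", namespaces=["x"], attributes=["id"],
            children=[Elem("b", namespaces=["x"])])
order = list(document_order(tree))
```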

1.3. Interfaces

A DOM application may use the hasFeature(feature, version) method of the DOMImplementation interface with parameter values "XPath" and "3.0" (respectively) to determine whether or not the XPath module is supported by the implementation. In order to fully support this module, an implementation must also support the "Core" feature defined in the DOM Level 3 Core specification [DOM Level 3 Core]. Please refer to additional information about conformance in the DOM Level 3 Core specification [DOM Level 3 Core].
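A feature test of this kind might look as follows in Python. The function supports_xpath_module and the StubImplementation class are hypothetical and exist only for illustration; hasFeature is the standard DOMImplementation method, shown here against xml.dom.minidom as one readily available implementation.

```python
from xml.dom.minidom import getDOMImplementation


def supports_xpath_module(implementation) -> bool:
    """Ask a DOMImplementation whether it advertises the XPath 3.0 feature."""
    return bool(implementation.hasFeature("XPath", "3.0"))


class StubImplementation:
    """Hypothetical implementation that advertises both Core and XPath."""

    def hasFeature(self, feature, version):
        return (feature.lower(), version) in {("core", "3.0"), ("xpath", "3.0")}


stub_supported = supports_xpath_module(StubImplementation())
# The answer for a real implementation depends on that implementation.
minidom_supported = supports_xpath_module(getDOMImplementation())
```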

Exception XPathException

A new exception has been created for exceptions specific to these XPath interfaces.


IDL Definition
exception XPathException {
  unsigned short   code;
};
// XPathExceptionCode
const unsigned short      INVALID_EXPRESSION_ERR         = 1;
const unsigned short      TYPE_ERR                       = 2;

Definition group XPathExceptionCode
Defined Constants
INVALID_EXPRESSION_ERR
If the expression is not a legal expression according to the rules of the specific XPathEvaluator. If the XPathEvaluator was obtained by casting the document, the expression must be XPath 1.0 with no special extension functions.
TYPE_ERR
If the expression cannot be converted to return the specified type.
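For readers working in Python, the exception and its codes could be mirrored roughly as below. This is a sketch and not a normative binding; the describe helper is a hypothetical convenience.

```python
# XPathExceptionCode values, as given in the IDL above.
INVALID_EXPRESSION_ERR = 1
TYPE_ERR = 2


class XPathException(Exception):
    """Carries the numeric code defined by XPathExceptionCode."""

    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


def describe(exc):
    """Map an XPathException code back to its constant name."""
    names = {INVALID_EXPRESSION_ERR: "INVALID_EXPRESSION_ERR", TYPE_ERR: "TYPE_ERR"}
    return names.get(exc.code, "unknown")


try:
    raise XPathException(TYPE_ERR, "result cannot be converted")
except XPathException as exc:
    caught = describe(exc)
```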
Interface XPathEvaluator

The evaluation of XPath expressions is provided by XPathEvaluator, which will provide evaluation of XPath 1.0 expressions with no specialized extension functions or variables. It is expected that the XPathEvaluator interface will be implemented on the same object which implements the Document interface in an implementation which supports the XPath DOM module. XPathEvaluator implementations may be available from other sources that may provide support for new versions of XPath or special extension functions or variables which are not defined in this specification.


IDL Definition
interface XPathEvaluator {
  XPathExpression    createExpression(in DOMString expression, 
                                      in XPathNSResolver resolver)
                                        raises(XPathException, 
                                               DOMException);
  XPathResult        createResult();
  XPathNSResolver    createNSResolver(in Node nodeResolver);
  XPathResult        evaluate(in DOMString expression, 
                              in Node contextNode, 
                              in XPathNSResolver resolver, 
                              in unsigned short type, 
                              in XPathResult result)
                                        raises(XPathException, 
                                               DOMException);
};

Methods
createExpression
Creates a parsed XPath expression with resolved namespaces. This is useful when an expression will be reused in an application since it makes it possible to compile the expression string into a more efficient internal form and preresolve all namespace prefixes which occur within the expression.
Parameters
expression of type DOMString
The XPath expression string to be parsed.
resolver of type XPathNSResolver
The resolver permits translation of prefixes within the XPath expression into appropriate namespace URIs. If this is specified as null, any namespace prefix within the expression will result in DOMException being thrown with the code NAMESPACE_ERR.
Return Value

XPathExpression

The compiled form of the XPath expression.

Exceptions

XPathException

INVALID_EXPRESSION_ERR: Raised if the expression is not legal according to the rules of the XPathEvaluator.

DOMException

NAMESPACE_ERR: Raised if the expression contains namespace prefixes which cannot be resolved by the specified XPathNSResolver.
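The pre-resolution of prefixes, including the null-resolver rule, can be sketched as follows. The function resolve_prefixes and its regular expression are assumptions made for illustration: the pattern is a deliberate simplification that does not skip string literals and is no substitute for a real XPath parser. NamespaceErr is the DOMException subclass for NAMESPACE_ERR provided by Python's xml.dom package.

```python
import re
from types import SimpleNamespace
from xml.dom import NAMESPACE_ERR, NamespaceErr

# A name followed by a single ':' (a prefix), but not by '::' (an axis).
_PREFIX = re.compile(r"(?<![:\w.-])([A-Za-z_][\w.-]*):(?!:)")


def resolve_prefixes(expression, resolver):
    """Return {prefix: namespaceURI} for every prefix used in `expression`."""
    bindings = {}
    for prefix in _PREFIX.findall(expression):
        uri = resolver.lookupNamespaceURI(prefix) if resolver is not None else None
        if uri is None:
            raise NamespaceErr("cannot resolve prefix %r" % prefix)
        bindings[prefix] = uri
    return bindings


# Any object with a lookupNamespaceURI method can act as the resolver.
resolver = SimpleNamespace(lookupNamespaceURI={"x": "urn:example:x"}.get)
bindings = resolve_prefixes("/x:doc/x:item", resolver)
no_prefixes = resolve_prefixes("child::para", None)

try:
    resolve_prefixes("/x:doc", None)   # null resolver, prefixed expression
    null_resolver_code = None
except NamespaceErr as exc:
    null_resolver_code = exc.code
```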

createNSResolver
Adapts any DOM node to resolve namespaces so that an XPath expression can be easily evaluated relative to the context of the node where it appeared within the document. This adapter works by calling the method lookupNamespaceURI on Node.
Parameters
nodeResolver of type Node
The node to be used as a context for namespace resolution.
Return Value

XPathNSResolver

XPathNSResolver which resolves namespaces with respect to the definitions in scope for a specified node.

No Exceptions
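One way such an adapter might behave is sketched below with xml.dom.minidom. The class NodeNSResolver is hypothetical. It walks from the given node up through its ancestors looking for an xmlns:prefix declaration, which is an assumption about the resolution strategy rather than the behavior of any particular implementation; the default namespace and the predefined xml prefix are not handled.

```python
from xml.dom import Node
from xml.dom.minidom import getDOMImplementation


class NodeNSResolver:
    """Resolve prefixes against the declarations in scope at a node."""

    def __init__(self, node):
        self._node = node

    def lookupNamespaceURI(self, prefix):
        node = self._node
        while node is not None:
            if node.nodeType == Node.ELEMENT_NODE and node.hasAttribute("xmlns:" + prefix):
                return node.getAttribute("xmlns:" + prefix)
            node = node.parentNode
        return None  # no declaration in scope


# A root element declaring prefix x, with a child that inherits it.
doc = getDOMImplementation().createDocument(None, "root", None)
root = doc.documentElement
root.setAttribute("xmlns:x", "urn:example:x")
child = root.appendChild(doc.createElement("child"))

resolver = NodeNSResolver(child)
```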
createResult
Creates an XPathResult object which may be passed as a parameter to the evaluation methods of this XPathEvaluator so that a new one is not created on each call to an evaluation method.
Return Value

XPathResult

An empty XPathResult with type ANY_TYPE.

No Parameters
No Exceptions
evaluate
Evaluates an XPath expression string and returns a result of the specified type if possible.
Parameters
expression of type DOMString
The XPath expression string to be parsed and evaluated.
contextNode of type Node
The context node for the evaluation of this XPath expression. If the XPathEvaluator was obtained by casting the Document then this must be owned by the same document and must be a Document, Element, Attribute, Text, CDATASection, Comment, ProcessingInstruction, or XPathNamespace node. If the context node is a Text or a CDATASection, then the context is interpreted as the whole logical text node as seen by XPath, unless the node is empty in which case it may not serve as the XPath context.
resolver of type XPathNSResolver
The resolver permits translation of prefixes within the XPath expression into appropriate namespace URIs. If this is specified as null, any namespace prefix within the expression will result in DOMException being thrown with the code NAMESPACE_ERR.
type of type unsigned short
If a specific type is specified, then the result will be coerced to return the specified type relying on XPath conversions and fail if the desired coercion is not possible. This must be one of the type codes of XPathResult.
result of type XPathResult
The result specifies a specific XPathResult to be reused and returned by this method. If this is specified as null, a new XPathResult will be constructed and returned. Any XPathResult which was not created by this XPathEvaluator may be ignored as though a null were passed as the parameter.
Return Value

XPathResult

The result of the evaluation of the XPath expression.

Exceptions

XPathException

INVALID_EXPRESSION_ERR: Raised if the expression is not legal according to the rules of the XPathEvaluator.

TYPE_ERR: Raised if the result cannot be converted to return the specified type.

DOMException

NAMESPACE_ERR: Raised if the expression contains namespace prefixes which cannot be resolved by the specified XPathNSResolver.

WRONG_DOCUMENT_ERR: The Node is from a document that is not supported by this XPathEvaluator.

NOT_SUPPORTED_ERR: The Node is not a type permitted as an XPath context node.
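The coercion requested through the type parameter relies on the XPath 1.0 conversion rules. The sketch below covers only the three primitive types and treats a request for a node set from a primitive value as TYPE_ERR. The function names are hypothetical, and the number-to-string case is simplified: Python may produce exponent notation for very large or very small values, which XPath does not use.

```python
import math
import re

ANY_TYPE, NUMBER_TYPE, STRING_TYPE, BOOLEAN_TYPE, NODE_SET_TYPE, SINGLE_NODE_TYPE = range(6)
TYPE_ERR = 2


class XPathException(Exception):
    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


_NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")  # XPath 1.0 Number, optional minus


def to_number(value):
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, float):
        return value
    text = value.strip(" \t\r\n")
    return float(text) if _NUMBER.fullmatch(text) else math.nan


def to_boolean(value):
    if isinstance(value, bool):
        return value
    if isinstance(value, float):
        return value != 0 and not math.isnan(value)
    return len(value) > 0


def to_string(value):
    if isinstance(value, str):
        return value
    if isinstance(value, bool):
        return "true" if value else "false"
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "Infinity" if value > 0 else "-Infinity"
    return str(int(value)) if value == int(value) else repr(value)


def coerce(value, requested_type):
    """Convert a bool, float, or str to the requested result type."""
    converters = {NUMBER_TYPE: to_number, STRING_TYPE: to_string, BOOLEAN_TYPE: to_boolean}
    if requested_type == ANY_TYPE:
        return value
    if requested_type in converters:
        return converters[requested_type](value)
    raise XPathException(TYPE_ERR, "primitive value cannot become a node set")


try:
    coerce("abc", NODE_SET_TYPE)
    node_set_code = None
except XPathException as exc:
    node_set_code = exc.code
```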

Interface XPathExpression

The XPathExpression interface represents a parsed and resolved XPath expression.


IDL Definition
interface XPathExpression {
  XPathResult        evaluate(in Node contextNode, 
                              in unsigned short type, 
                              in XPathResult result)
                                        raises(XPathException, 
                                               DOMException);
};

Methods
evaluate
Evaluates this XPath expression and returns a result.
Parameters
contextNode of type Node
The context node for the evaluation of this XPath expression.
If the XPathEvaluator was obtained by casting the Document then this must be owned by the same document and must be a Document, Element, Attribute, Text, CDATASection, Comment, ProcessingInstruction, or XPathNamespace node.
If the context node is a Text or a CDATASection, then the context is interpreted as the whole logical text node as seen by XPath, unless the node is empty in which case it may not serve as the XPath context.
type of type unsigned short
If a specific type is specified, then the result will be coerced to return the specified type relying on XPath conversions and fail if the desired coercion is not possible. This must be one of the type codes of XPathResult.
result of type XPathResult
The result specifies a specific XPathResult to be reused and returned by this method. If this is specified as null, a new XPathResult will be constructed and returned. Any XPathResult which was not created by this XPathEvaluator may be ignored as though a null were passed as the parameter.
Return Value

XPathResult

The result of the evaluation of the XPath expression.

Exceptions

XPathException

TYPE_ERR: Raised if the result cannot be converted to return the specified type.

DOMException

WRONG_DOCUMENT_ERR: The Node is from a document that is not supported by the XPathEvaluator that created this XPathExpression.

NOT_SUPPORTED_ERR: The Node is not a type permitted as an XPath context node.

Interface XPathNSResolver

The XPathNSResolver interface permits prefix strings in the expression to be properly bound to namespaceURI strings. XPathEvaluator can construct an implementation of XPathNSResolver from a node, or the interface may be implemented by any application.


IDL Definition
interface XPathNSResolver {
  DOMString          lookupNamespaceURI(in DOMString prefix);
};

Methods
lookupNamespaceURI
Look up the namespace URI associated with the given namespace prefix. The XPath evaluator must never call this with a null or empty argument, because the result of doing this is undefined.
Parameters
prefix of type DOMString
The prefix to look for.
Return Value

DOMString

Returns the associated namespace URI or null if none is found.

No Exceptions
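Since the interface may be implemented by any application, a resolver backed by a fixed table of bindings is perhaps the simplest case. The class MappingNSResolver is a hypothetical example and the URIs are placeholders.

```python
class MappingNSResolver:
    """Application-supplied resolver backed by a prefix-to-URI mapping."""

    def __init__(self, bindings):
        self._bindings = dict(bindings)

    def lookupNamespaceURI(self, prefix):
        # Returns None (null) when the prefix has no binding.
        return self._bindings.get(prefix)


resolver = MappingNSResolver({
    "doc": "urn:example:document",
    "meta": "urn:example:metadata",
})
```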
Interface XPathResult

The XPathResult interface represents the result of the evaluation of an XPath expression within the context of a particular node. Since evaluation of an XPath expression can result in various result types, this object makes it possible to discover and manipulate the type and value of the result.


IDL Definition
interface XPathResult {

  // XPathResultType
  const unsigned short      ANY_TYPE                       = 0;
  const unsigned short      NUMBER_TYPE                    = 1;
  const unsigned short      STRING_TYPE                    = 2;
  const unsigned short      BOOLEAN_TYPE                   = 3;
  const unsigned short      NODE_SET_TYPE                  = 4;
  const unsigned short      SINGLE_NODE_TYPE               = 5;

  readonly attribute unsigned short  resultType;
  readonly attribute double          numberValue;
                                        // raises(XPathException) on retrieval

  readonly attribute DOMString       stringValue;
                                        // raises(XPathException) on retrieval

  readonly attribute boolean         booleanValue;
                                        // raises(XPathException) on retrieval

  readonly attribute Node            singleNodeValue;
                                        // raises(XPathException) on retrieval

  XPathSetIterator   getSetIterator(in boolean ordered)
                                        raises(XPathException, 
                                               DOMException);
  XPathSetSnapshot   getSetSnapshot(in boolean ordered)
                                        raises(XPathException, 
                                               DOMException);
};

Definition group XPathResultType

An integer indicating what type of result this is.

Defined Constants
ANY_TYPE
This code does not represent a specific type. An evaluation of an XPath expression will never produce this type. If this type is requested, then the evaluation must return whatever type naturally results from evaluation of the expression.
BOOLEAN_TYPE
The result is a boolean as defined by XPath 1.0.
NODE_SET_TYPE
The result is a node set as defined by XPath 1.0.
NUMBER_TYPE
The result is a number as defined by XPath 1.0.
SINGLE_NODE_TYPE
The result is a single node, which may be any node of the node set defined by XPath 1.0, or null if the node set is empty. This is a convenience that permits optimization where the caller knows that no more than one such node exists because evaluation can stop after finding the one node of an expression that would otherwise return a node set (of type NODE_SET_TYPE).
Where it is possible that multiple nodes may exist and the first node in document order is required, a NODE_SET_TYPE should be processed using an ordered iterator, because there is no order guarantee for a single node.
STRING_TYPE
The result is a string as defined by XPath 1.0.
Attributes
booleanValue of type boolean, readonly
The value of this boolean result.
Exceptions on retrieval

XPathException

TYPE_ERR: raised if resultType is not BOOLEAN_TYPE.

numberValue of type double, readonly
The value of this number result.
Exceptions on retrieval

XPathException

TYPE_ERR: raised if resultType is not NUMBER_TYPE.

resultType of type unsigned short, readonly
A code representing the type of this result, as defined by the type constants.
singleNodeValue of type Node, readonly
The value of this single node result, which may be null. This result is not guaranteed to be the first node in document order where the expression evaluates to multiple nodes.
Exceptions on retrieval

XPathException

TYPE_ERR: raised if resultType is not SINGLE_NODE_TYPE.

stringValue of type DOMString, readonly
The value of this string result.
Exceptions on retrieval

XPathException

TYPE_ERR: raised if resultType is not STRING_TYPE.
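The type checks on the value attributes can be sketched as a small Python class. This is an illustrative model only: the constructor signature and the _require helper are assumptions, and the node set methods are omitted.

```python
ANY_TYPE, NUMBER_TYPE, STRING_TYPE, BOOLEAN_TYPE, NODE_SET_TYPE, SINGLE_NODE_TYPE = range(6)
TYPE_ERR = 2


class XPathException(Exception):
    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


class XPathResult:
    """Holds one value; each accessor checks resultType before returning it."""

    def __init__(self, result_type, value):
        self.resultType = result_type
        self._value = value

    def _require(self, expected):
        if self.resultType != expected:
            raise XPathException(TYPE_ERR, "result is not of the requested type")
        return self._value

    @property
    def numberValue(self):
        return self._require(NUMBER_TYPE)

    @property
    def stringValue(self):
        return self._require(STRING_TYPE)

    @property
    def booleanValue(self):
        return self._require(BOOLEAN_TYPE)

    @property
    def singleNodeValue(self):
        return self._require(SINGLE_NODE_TYPE)


result = XPathResult(NUMBER_TYPE, 3.0)
try:
    result.stringValue          # wrong accessor for a number result
    mismatch_code = None
except XPathException as exc:
    mismatch_code = exc.code
```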

Methods
getSetIterator
Creates an XPathSetIterator which may be used to iterate over the nodes of the set of this result.
Parameters
ordered of type boolean
If true, the set must be iterated in document order.
Return Value

XPathSetIterator

An XPathSetIterator which may be used to iterate the node set.

Exceptions

XPathException

TYPE_ERR: raised if resultType is not NODE_SET_TYPE.

DOMException

INVALID_STATE_ERR: The document has been mutated since the result was returned.

getSetSnapshot
Creates an XPathSetSnapshot which lists the nodes of the set of this result. Unlike an iterator, after the snapshot has been requested, document mutation does not invalidate it.
Parameters
ordered of type boolean
If true, the set must be listed in document order.
Return Value

XPathSetSnapshot

An XPathSetSnapshot which may be used to list the node set.

Exceptions

XPathException

TYPE_ERR: raised if resultType is not NODE_SET_TYPE.

DOMException

INVALID_STATE_ERR: The document has been mutated since the result was returned.
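The difference between the two access styles once they have been obtained can be modelled with a mutation counter. Everything in this sketch is hypothetical (ToyDocument, the version field, plain strings standing in for nodes); it only illustrates that an iterator fails after a mutation while a snapshot keeps its list. InvalidStateErr is the DOMException subclass for INVALID_STATE_ERR from Python's xml.dom package.

```python
from xml.dom import InvalidStateErr


class ToyDocument:
    """Stand-in for a document that counts its mutations."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.version = 0

    def remove(self, node):
        self.nodes.remove(node)
        self.version += 1


class SetIterator:
    def __init__(self, document, nodes):
        self._document = document
        self._version = document.version      # remembered at creation
        self._pending = iter(list(nodes))

    def nextNode(self):
        if self._document.version != self._version:
            raise InvalidStateErr("document mutated since the result was returned")
        return next(self._pending, None)


class SetSnapshot:
    def __init__(self, nodes):
        self._nodes = tuple(nodes)            # static copy of the node list

    @property
    def length(self):
        return len(self._nodes)

    def item(self, index):
        return self._nodes[index] if 0 <= index < len(self._nodes) else None


doc = ToyDocument(["a", "b", "c"])
iterator = SetIterator(doc, doc.nodes)
snapshot = SetSnapshot(doc.nodes)

first = iterator.nextNode()
doc.remove("b")                               # mutate after both were obtained
try:
    iterator.nextNode()
    iterator_invalidated = False
except InvalidStateErr:
    iterator_invalidated = True
```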

Interface XPathSetIterator

The XPathSetIterator interface iterates the node set resulting from evaluation of an XPath expression.


IDL Definition
interface XPathSetIterator {
  Node               nextNode()
                                        raises(DOMException);
};

Methods
nextNode
Returns the next node from the XPathResult node set. If there are no more nodes in the set to be returned by the iterator, this method returns null.
Return Value

Node

Returns the next node.

Exceptions

DOMException

INVALID_STATE_ERR: The document has been mutated since the node set result was returned.

No Parameters
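Because nextNode signals exhaustion by returning null, a caller loops until that sentinel appears. In Python this can be expressed with the two-argument form of the built-in iter. The helper iterate_nodes and the ListBackedIterator class are hypothetical; strings stand in for nodes.

```python
def iterate_nodes(set_iterator):
    """Return a Python iterator that calls nextNode() until it returns None."""
    return iter(set_iterator.nextNode, None)


class ListBackedIterator:
    """Minimal stand-in for an XPathSetIterator over a fixed list."""

    def __init__(self, nodes):
        self._pending = list(nodes)

    def nextNode(self):
        return self._pending.pop(0) if self._pending else None


names = list(iterate_nodes(ListBackedIterator(["title", "para", "para"])))
empty = list(iterate_nodes(ListBackedIterator([])))
```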
Interface XPathSetSnapshot

The XPathSetSnapshot interface lists the node set resulting from an evaluation of an XPath expression as a static list that is not invalidated or changed by document mutation.

The individual nodes of an XPathSetSnapshot may be manipulated in the hierarchy and these changes are seen immediately by users referencing the nodes through the snapshot.


IDL Definition
interface XPathSetSnapshot {
  Node               item(in unsigned long index);
  readonly attribute unsigned long   length;
};

Attributes
length of type unsigned long, readonly
The number of nodes in the list. The range of valid node indices is 0 to length-1 inclusive.
Methods
item
Returns the indexth item in the collection. If index is greater than or equal to the number of nodes in the list, this method returns null.
Parameters
index of type unsigned long
Index into the collection.
Return Value

Node

The node at the indexth position in the XPathSetSnapshot, or null if that is not a valid index.

No Exceptions
Interface XPathNamespace

The XPathNamespace interface is returned by XPathResult interfaces to represent the XPath namespace node type that DOM lacks. There is no public constructor for this node type. Attempts to place it into a hierarchy or a NamedNodeMap result in a DOMException with the code HIERARCHY_REQUEST_ERR. This node is read only, so methods or setting of attributes that would mutate the node result in a DOMException with the code NO_MODIFICATION_ALLOWED_ERR.

The core specification describes attributes of the Node interface that are different for different node types but does not describe XPATH_NAMESPACE_NODE, so here is a description of those attributes for this node type. All attributes of Node not described in this section have a null or false value.

ownerDocument matches the ownerDocument of the ownerElement even if the element is later adopted.

prefix is the prefix of the namespace represented by the node.

nodeName is the same as prefix.

nodeType is equal to XPATH_NAMESPACE_NODE.

namespaceURI is the namespace URI of the namespace represented by the node.

adoptNode, cloneNode, and importNode fail on this node type by raising a DOMException with the code NOT_SUPPORTED_ERR.


IDL Definition
interface XPathNamespace : Node {

  // XPathNodeType
  const unsigned short      XPATH_NAMESPACE_NODE           = 13;

  readonly attribute Element         ownerElement;
};

Definition group XPathNodeType

An integer indicating which type of node this is.

Note: There is currently only one type of node which is specific to XPath. The numbers in this list must not collide with the values assigned to core node types.

Defined Constants
XPATH_NAMESPACE_NODE
The node is a Namespace.
Attributes
ownerElement of type Element, readonly
The Element on which the namespace was in scope when it was requested. This does not change on a returned namespace node even if the document changes such that the namespace goes out of scope on that element and this node is no longer found there by XPath.
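The read-only behavior described for this node type might be modelled as below. The class is a hypothetical stand-in that does not derive from a real Node implementation; it shows only the attribute values listed above and the two failure modes, using the DOMException subclasses from Python's xml.dom package.

```python
from xml.dom import NO_MODIFICATION_ALLOWED_ERR, NOT_SUPPORTED_ERR
from xml.dom import NoModificationAllowedErr, NotSupportedErr

XPATH_NAMESPACE_NODE = 13


class XPathNamespace:
    """Read-only namespace node, as returned from an XPath evaluation."""

    nodeType = XPATH_NAMESPACE_NODE

    def __init__(self, prefix, namespace_uri, owner_element):
        # Bypass the read-only guard once, while constructing the node.
        object.__setattr__(self, "prefix", prefix)
        object.__setattr__(self, "namespaceURI", namespace_uri)
        object.__setattr__(self, "ownerElement", owner_element)

    @property
    def nodeName(self):
        return self.prefix                    # nodeName is the same as prefix

    def __setattr__(self, name, value):
        raise NoModificationAllowedErr("XPathNamespace nodes are read only")

    def cloneNode(self, deep):
        raise NotSupportedErr("XPathNamespace nodes cannot be cloned")


ns = XPathNamespace("x", "urn:example:x", owner_element=None)

try:
    ns.prefix = "y"
    write_code = None
except NoModificationAllowedErr as exc:
    write_code = exc.code

try:
    ns.cloneNode(True)
    clone_code = None
except NotSupportedErr as exc:
    clone_code = exc.code
```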