XML XPointer Requirements
Version 1.0

W3C Note 24-Feb-1999

This version: http://www.w3.org/TR/1999/NOTE-xptr-req-19990224
Latest version: http://www.w3.org/TR/NOTE-xptr-req
Editors: Steven J. DeRose (Inso Corp. & Brown Univ.) <Steven_DeRose@Brown.edu>.

Copyright ©1999 W3C (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

Status of this document

This is a W3C Note produced as a deliverable of the XML Linking WG according to its charter. A list of current W3C working drafts and notes can be found at http://www.w3.org/TR .

This document is a work in progress representing the current consensus of the W3C XML Linking Working Group. This version of the XML XPointer Requirements document has been approved by the XML Linking working group and the XML Plenary to be posted for review by W3C members and other interested parties. Publication as a Note does not imply endorsement by the W3C membership. Comments should be sent to www-xml-linking-comments@w3.org, which is an automatically and publicly archived email list.

This document is being processed according to the following review schedule:

Review Schedule
Process                  Closing date   Status               Contact
XML Linking WG signoff   1999/01/21     done                 XML Linking WG
XML Plenary signoff      1999/02/03     done                 bill.smith@Sun.COM, veillard@w3.org
Publish as W3C Note      1999/02/23     accepting comments   www-xml-linking-comments@w3.org
Checkpoint of comments   1999/03/23

Comments about this document should be submitted to the "contact" listed above for each process.

Many thanks to Tim Bray, James Clark, Mavis Cournane, David Durand, Peter Flynn, Paul Grosso, Chris Maden, Eve Maler, C. M. Sperberg-McQueen, and members of the WG and IG in general for numerous valuable suggestions and other improvements.


This document presents requirements for the XPointer language. XPointer provides ways to directly identify any node, data, or selection in any XML document by describing its structure and context. An identified data location is called a "target." The XPointer specification is particularly meant to enable hyperlinks to identify any such data, regardless of whether there is (or even could be) an ID on the target or not. The XPointer specification is now being developed in the XML-Linking Working Group, building on Working Drafts developed in the XML Working Group.

Because the XPointer language must refer to structural parts of XML documents, those structures must be explicit. Document structure specifications such as DOM and the XML Information Set may wish to consider the XPointer requirements in order to ensure interoperability when used with XPointer and XLink.

Related documents

XML Linking Working Group Page [member only], for general information about the activities of the WG.

XML Pointer Language (XPointer) Working Draft, prior WDs produced by the former XML Working Group, and now under the XML Linking WG. Provides a simple yet powerful mechanism for addressing data portions in XML documents. It is very closely based on a multiply-implemented and widely-used technology, extended pointers, defined in the Text Encoding Initiative Guidelines.

XPointer-Information Set Liaison Statement, produced by the XML Linking Working Group. This document enumerates perceived constraints that work on the XPointer specification has indicated may affect the XML Information Set Working Group, since it is those information structures that XPointer provides access to.

XLink Requirements, produced by the XML Linking Working Group. This document provides requirements governing the work of this WG on the XLink specification.

XML Linking Language (XLink) Working Draft, prior WDs produced by the former XML Working Group, and now under the XML Linking WG.

XML Linking Language (XLink) Design Principles, produced by the former XML Working Group, and now under the XML Linking WG. This document provides general design principles governing the work of this WG, involving both the XLink and XPointer specifications.

Specific (minimal) XPointer requirements

This section lays out specific minimal functional requirements for the XPointer specification that aim for an appropriate balance of completeness, expressiveness, extensibility, and simplicity. These requirements also apply to any higher-end location-specification language to the extent that it shares some of its functionality objectives with XPointer. Following the specific requirements is a section on background and rationale underlying them.

A: Completeness requirements

This section of the requirements involves the type and variety of data locations, or "targets", that an XPointer must be able to identify.

These requirements make frequent reference to XML information objects such as elements, attributes, PIs, and characters. The formal definition of these objects, their relationships such as ordering, containment, and attribution, and their precise correspondence to XML syntax constructs are the domain of the XML Information Set Working Group. For more detail on the relationship, see the XML Linking Working Group's Liaison Statement.

  1. XPointers must identify XML information objects, rather than necessarily their expressions in the raw XML syntax of a given file.

    For example, an XPointer can identify an element but not a tag, and (potentially) an attribute but not the equals sign or quotation marks used to express it.

  2. For any single element, character in content, or PI in an XML document, it must be possible to create at least one XPointer that specifically identifies it.

    This includes special processing instructions such as the XML declaration and the stylesheet-attachment PI.

  3. For any single attribute or character in an attribute value in an XML document, it must be possible to create at least one XPointer that specifically identifies it.

    [There is not consensus on whether identification of attributes is required. DOM does not presently provide a built-in way to get from an attribute to the element bearing it; on the other hand, RDF defines an attribute-based representation for which omitting attributes may pose problems. The WG seeks additional input on this issue.]

  4. For any contiguous selection such as could be clicked or dragged by a user in a typical view of the document, it must be possible to create at least one XPointer that specifically identifies it. This includes point targets such as typically selected by a mouse click, entire characters, and synchronous and asynchronous spans, as in typical word processors and browsers.

    This is necessary to model simple everyday user selection, and to support the first interface many applications want to build: the ability for the user to make a selection in the usual way and attach a link to or from it, an annotation or bookmark to it, insert it in a path, and so on. If users could only select or link whole elements, the intuitive "select/act" interface would no longer correspond to the system's actual behavior, and thus become very confusing.

    [Implementor note: Such a selection may include only part of various elements: this is most obvious with a selection that includes the end of one element and the start of the next, but is also true relative to the ancestors of any element. Such a selection may be considered to include information about attributes and boundary locations for elements that include some but not all of the selected range. In the example just given, there would thus be access to the attributes and boundary locations of both the elements that are partly included, even though neither is fully "in" the span; but not, say, of a containing DIV. See Brooks (1988) for an extensive analysis of the semantics and user interface requirements for selecting and editing in tree-structured documents.]

  5. For any point immediately preceding or following any character of content, and for any point immediately inside or outside the beginning or end of any element or PI, it must be possible to create at least one XPointer that specifically identifies it.

    This is required because it is a standard part of user selection semantics, and to provide an unambiguous way to express targets for myriad other application purposes: inserting or pasting, recording the scope of change or version information, and so on. For example, one may wish to specify the location in content immediately preceding a given PI or sub-element, not only the PI or sub-element itself (especially small elements such as italicized words), in order to unambiguously specify a cursor location, the end of a user selection, the destination of a link, etc.

  6. [There is not consensus on whether this is a short-term requirement.] The XPointer specification must provide a way to identify targets that include multiple, potentially discontiguous data portions.
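
Requirements 4 and 5 above can be pictured with a small data structure. The following Python sketch is only an illustration, not a proposed design: the names `Point`, `text_nodes`, and `selected_text` are hypothetical, and the sketch assumes that a point is modeled as a text node plus a character offset. It shows a selection that starts inside one element and ends inside the next, and a collapsed selection standing for a single point.

```python
from dataclasses import dataclass
from xml.dom import minidom


@dataclass(frozen=True)
class Point:
    """A location between characters: a text node plus a character offset."""
    node: minidom.Node
    offset: int


def text_nodes(node):
    """Yield the text nodes under `node` in document order."""
    if node.nodeType == node.TEXT_NODE:
        yield node
    for child in node.childNodes:
        yield from text_nodes(child)


def selected_text(root, start, end):
    """Return the characters between two points, crossing element boundaries."""
    pieces, inside = [], False
    for text in text_nodes(root):
        if text is start.node:
            inside = True
        if inside:
            lo = start.offset if text is start.node else 0
            hi = end.offset if text is end.node else len(text.data)
            pieces.append(text.data[lo:hi])
        if text is end.node:
            break
    return "".join(pieces)


doc = minidom.parseString("<p><i>Hello</i> <b>world</b></p>")
hello = doc.getElementsByTagName("i")[0].firstChild
world = doc.getElementsByTagName("b")[0].firstChild

# A span that includes the end of one element and the start of the next.
span = selected_text(doc, Point(hello, 3), Point(world, 2))
# A collapsed span: a single point, such as a cursor position.
caret = selected_text(doc, Point(hello, 3), Point(hello, 3))
```

Neither the I nor the B element is wholly inside the span, which is the situation the implementor note under requirement 4 describes.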

B: Expressiveness requirements

These requirements involve the type and variety of XPointers that can be used to identify a target, which is how the language achieves greater robustness, reusability, and clarity to humans (as described in more detail below). Fulfilling the completeness requirements above does not guarantee fulfilling the expressiveness requirements stated here. These are a different but equally important class of requirement. Indeed, it is easy to design a system that is merely target-rich, but it would be much more prone to breakage, and far less intuitive and readable to humans (even if it managed to have fewer constructs or be terser, such as the extreme case of a pair of byte offsets into XML source).

  1. The XPointer specification must utilize information structures users can be expected to perceive or understand in documents, such as elements, attributes, PIs, characters, and strings; and well-known relationships such as containment, siblings, and so on. The XPointer specification is not primarily concerned with machine-oriented concepts such as offsets, absolute nesting depth, and so on.

  2. The XPointer specification must provide for identifying targets of specified names and types, for example by XML IDs, XML PI targets, and element type names.

  3. The XPointer specification must (if XPointers to attributes are included) provide for identifying an attribute by name, given an element it is on.

  4. The XPointer specification must provide a way for specific XPointers to express that a singleton target is expected, for example when identifying an element by an XML ID attribute or other potential "key" construct.

  5. The XPointer specification must provide for identifying elements, PIs, characters, and (if supported) comments, by their ordered position in the document structure relative to other targets that fulfill specified conditions.

  6. The XPointer specification must provide ways to constrain targets by specifying conditions on them, such as element type, attribute values, and the presence of particular content strings.

    For example, identifying a SECTION that contains an ABSTRACT with TYPE=FULL, rather than merely an ABSTRACT; or being able to test the ABSTRACT's attribute but still identify the SECTION as the intended target. Or, constraining a target to be directly or indirectly within a broader target, or to precede or follow it among the children of a common containing element.

  7. The XPointer specification should make clear a way that it can be extended to support testing datatype-specific conditions when XML Datatypes are later available through the work of the XML Schemas Working Group.

    For example, once it is possible to know which attributes or content strings constitute integers, dates, or real numbers, it should be clear how to extend the language to accommodate appropriate comparisons within its conditional constructs.

  8. The XPointer specification must provide a way to specify what version of a target resource is intended to be identified.

  9. The XPointer specification must define how it works in relation to XML namespaces. Since an element or attribute name can be used as part of characterizing locations in an XPointer expression, the meaning of those names must be unambiguous.
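
The kind of condition described under requirements 5 and 6 above can be sketched in Python. This is an illustration only: it uses Python's `xml.etree.ElementTree` module rather than any XPointer syntax, and the sample document and its `ID` values are invented for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<DOC>
  <SECTION ID="s1"><ABSTRACT TYPE="SHORT">brief</ABSTRACT></SECTION>
  <SECTION ID="s2"><ABSTRACT TYPE="FULL">complete</ABSTRACT></SECTION>
  <SECTION ID="s3"><P>no abstract here</P></SECTION>
</DOC>
""")

# Requirement 6: test the ABSTRACT's attribute, but identify the SECTION.
full_sections = [
    section for section in doc.iter("SECTION")
    if any(a.get("TYPE") == "FULL" for a in section.findall("ABSTRACT"))
]
matched_ids = [section.get("ID") for section in full_sections]

# Requirement 5: ordered position among targets that fulfill a condition,
# here "the second SECTION that has an ABSTRACT child".
with_abstract = doc.findall("SECTION[ABSTRACT]")
second = with_abstract[1].get("ID")
```

In both cases the condition is tested on one object (the ABSTRACT) while the target remains another (the SECTION).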

C: Robustness requirements

  1. It must be possible, but not mandatory, to create XPointers that can be tested for whether they identify "the same" target when followed as they did when created.

    For example, this may be accomplished by providing a checksum of the destination data. This massively improves robustness because you can detect when a link has broken (although it cannot prevent link breakage from ever happening). [There is not consensus on whether this requirement should be addressed within XPointer or XLink].

  2. All XPointers must survive purely mechanical changes to the target resource.

    For example, changing between single and double quote characters around attributes or between CR, LF, and CRLF for line-ends; inserting extraneous whitespace inside tags (for example, between the element type name and the attributes, or around attribute equal-signs, etc); re-ordering attributes; swapping between CDATA marked sections and entities for escaping tag opens; or rearranging the division between entities.

    Such changes are not considered to change the logical information structure, and should not prevent an XPointer from being interpreted as pointing to the same data. However, any change that touches the structure may change the destination of the XPointer, and can be (see the preceding requirement) caught. Just what changes are considered mechanical is to be worked out in cooperation with the XML Information Set Working Group.

  3. XPointer must attempt to avoid dependencies on character set and internationalization issues, such as its definition of "character".

  4. The XPointer specification must enumerate any dependencies on the presence of a DTD or schema, and attempt to minimize such in order to facilitate interoperability of XPointers across DTD-supporting and non-DTD-supporting XML environments.

    For example, an XPointer that identifies a target in part via an attribute, may not be interpretable if the target is in a non-standalone XML document being processed by a non-DTD-supporting parser. This is inherent in XML, but needs to be made clear in the XPointer specification.

  5. The XPointer specification must be clear about what errors may arise, and what they mean. To that end, it should attempt to enumerate any dependencies on the XML Information Set.

    In particular, the XPointer specification must clearly define the meaning when a syntactically correct XPointer does not resolve to any data object. [There is no consensus as of this writing on whether that is necessarily an error]. There are security issues to consider in this, such as that distinguishing "data does not exist" from "data access not authorized" may itself compromise security.
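
Requirements 1 and 2 above can be illustrated together. The following Python sketch is one possible approach under stated assumptions, not a mechanism this document prescribes: it assumes the target is found through an attribute named `id`, it uses `xml.etree.ElementTree.canonicalize` (available from Python 3.8) as a stand-in for "the logical information structure", and the helper name `target_checksum` is hypothetical.

```python
import copy
import hashlib
import xml.etree.ElementTree as ET


def target_checksum(xml_text, target_id):
    """Hash the canonical form of the element whose 'id' attribute matches."""
    root = ET.fromstring(xml_text)
    target = next(e for e in root.iter() if e.get("id") == target_id)
    clone = copy.copy(target)  # shallow copy, so the parsed tree is untouched
    clone.tail = None          # text after the end tag is not part of the target
    canonical = ET.canonicalize(ET.tostring(clone, encoding="unicode"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = '<doc><note id="n1" type="full">a &lt; b</note> tail</doc>'
# Purely mechanical changes: quote style, attribute order, whitespace
# inside the tag, and a CDATA section in place of an entity reference.
mechanical = "<doc><note   type='full'  id = 'n1' ><![CDATA[a < b]]></note> tail</doc>"
# A change that touches the content of the target.
edited = '<doc><note id="n1" type="full">a &lt; c</note> tail</doc>'

survives_mechanical_change = (
    target_checksum(original, "n1") == target_checksum(mechanical, "n1")
)
detects_real_change = (
    target_checksum(original, "n1") != target_checksum(edited, "n1")
)
```

The checksum does not prevent the link from breaking; it only makes the breakage detectable, as noted under requirement 1.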

D: General user requirements

  1. XPointers must be reasonably user-readable and user-writable, particularly for cases likely to be commonly needed by beginning and intermediate users. Syntax must scale smoothly, not abruptly, with increasing complexity.

  2. The XPointer specification must provide for re-use of XPointers by other XPointers. [There is not consensus that this is a short-term requirement, though there is consensus on its desirability in principle.]

    For example, it should be possible to express a generic XPointer such as "the following sibling of type T", and then write other XPointers that re-use it, such as "the first ABSTRACT within whatever (that XPointer) points to". This is particularly useful to create generic pointers to relative rather than absolute targets, and to support functionality like the HTML BASE feature.
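
The re-use described in requirement 2 above can be sketched by treating each pointer as a function from a context node to a target. The Python below is an illustration only; the function names `following_sibling_of_type` and `first_within`, and the sample document, are hypothetical and do not reflect any XPointer syntax.

```python
import xml.etree.ElementTree as ET


def following_sibling_of_type(tag):
    """Build a generic pointer: 'the following sibling of type `tag`'."""
    def pointer(root, context):
        parents = {child: parent for parent in root.iter() for child in parent}
        parent = parents.get(context)
        if parent is None:
            return None
        siblings = list(parent)
        for sibling in siblings[siblings.index(context) + 1:]:
            if sibling.tag == tag:
                return sibling
        return None
    return pointer


def first_within(tag, base):
    """Build a pointer that re-uses `base`: 'the first `tag` within whatever base points to'."""
    def pointer(root, context):
        start = base(root, context)
        return None if start is None else start.find(f".//{tag}")
    return pointer


book = ET.fromstring(
    "<BOOK>"
    "<CHAP ID='c1'><P>one</P></CHAP>"
    "<NOTE/>"
    "<CHAP ID='c2'><SEC><ABSTRACT>two</ABSTRACT></SEC></CHAP>"
    "<CHAP ID='c3'><ABSTRACT>three</ABSTRACT></CHAP>"
    "</BOOK>"
)

next_chapter = following_sibling_of_type("CHAP")
next_abstract = first_within("ABSTRACT", next_chapter)

c1, c2 = book.findall("CHAP")[:2]
from_c1 = next_abstract(book, c1).text
from_c2 = next_abstract(book, c2).text
```

The same `next_abstract` pointer yields a different target from each starting chapter, which is what makes it re-usable across contexts.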

E: Mechanical and syntactic requirements

  1. XPointers must identify locations in data, not merely the data at those locations.

    A user who selects a word or string in a document and attaches a link to it via an XPointer, may be very unhappy if they follow the link and get just the phrase: they want the target data with access to its context. This is why the HTML mechanism of scrolling to a target element by appending "#" and its name to the document's URL is the way it is: literally returning just the anchor is usually not enough. From a location one can get the context data when needed; from data alone one cannot derive a unique location when needed.

    As an example, a DOM handle to an XML information set object such as an element, would be one way to identify the object in context; while generating a string of content and/or markup that contains no reference to the original context, would not.

  2. XPointer syntax must not require excessive escaping when XPointers are embedded in URLs and external identifiers. That is, XPointer must not go nuts with punctuation marks. This requirement must be balanced with the next one, of course.

    This is to preserve readability in raw form (which is of some importance, as in XML), and to reduce multi-escaping errors such as when pasting between external identifiers and URLs, passing URLs through scripts, typing URLs in raw, etc.

    Note: There are two kinds of escaping to be considered: XML escaping such as for ampersand, angle brackets, and quotes when XPointers appear within XML documents; and URL escaping such as %20 for space, when they appear in URLs.

  3. When embedded in URLs, XPointers must be correctly processed by current Web software such as servers and user agents.

    This obviously does not mean that existing software magically becomes XPointer-aware; merely that XPointer syntax must survive generic URL processing such as performed by servers.

  4. XPointers must use syntax in ways that are appropriate and familiar for the constructs we use them for.

    For example "+", if used, should mean addition, not variable substitution. For example, operations from languages that deal with XML-like data (ordered trees of repeatable typed objects, with document-scope unique names) would provide applicable syntax, while languages that deal with XML-unlike data (such as unordered sets of non-repeatable or untyped objects and names) would not.

  5. XPointer syntax must be readily generalizable to more powerful specifications needed in the future. This means we can't use up all the syntax now, or take up all the simple syntax for the one or two simplest cases at the cost of a massive increment of complexity to get any further.
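
The two kinds of escaping mentioned in the note under requirement 2 above can be seen side by side in a short Python sketch. The pointer string below is a made-up syntax chosen only because it contains troublesome characters; it is not a proposal for XPointer syntax.

```python
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape, unescape

# Hypothetical pointer text, for illustration only.
pointer = 'child(2,SEC).string(1,"R&D <draft>")'

# URL escaping: needed when the pointer appears in a URL.
url_form = quote(pointer, safe="(),.")

# XML escaping: needed when the pointer appears inside an XML attribute value.
xml_form = escape(pointer, {'"': "&quot;"})

# Each form can be reversed, but only with the matching unescaping step.
url_round_trip = unquote(url_form)
xml_round_trip = unescape(xml_form, {"&quot;": '"'})
```

Every character that needs escaping in either context makes the raw form harder to read and gives more room for multi-escaping errors, which is the motivation for keeping punctuation use modest.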

F: Non-requirements

This section collects potential requirements that have been considered and rejected, at least for the initial version of the XPointer specification.

  1. For any entire comment in an XML document, it must be possible to create at least one XPointer that specifically identifies it.

    Comments could be included in order to maintain the ability to identify any node in the information structure, assuming that XML Information Structure can include comments. There do not seem to be compelling use cases; critical information should not, in principle, be embedded in comments. Also, XML does not require that comments be passed back to applications at all, and so including them in XPointer would introduce ambiguity here.

    If comments were supported, the XPointer specification would have to define counting and perhaps other operations such that XPointers not directly pointing at comments would work regardless of whether the application had gotten comments from the XML parser or not.

Background and Rationale

This document defines specific requirements for the XPointer specification, beyond necessary but "apple pie" requirements such as "completeness", "clarity", or "brevity". These more detailed functional requirements are intended to facilitate decision-making during development of the Proposed Recommendation from the established Working Draft.

For XPointers to interoperate, they must be defined in terms of an explicit and consistent conceptual structure. For example, any construct involving child numbers breaks badly if one application counts PIs, comments, or some other objects as children, while another doesn't: an XPointer interpreted by the two would identify different places, which is unacceptable. Because a consistent structure is required for XPointers to interoperate, constraints on the conceptual information structure for XML are discussed in a related liaison paper.
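
The child-number problem can be reproduced in a few lines of Python. This sketch uses `xml.dom.minidom`, which keeps comments and PIs as child nodes, and simulates a second application that counts only elements; the sample document is invented for the example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<sec><!-- draft --><p>one</p><?app hint?><p>two</p></sec>"
)
sec = doc.documentElement

# Application A counts every node as a child: comment, p, PI, p.
all_children = list(sec.childNodes)
# Application B counts only elements as children: p, p.
element_children = [n for n in all_children if n.nodeType == n.ELEMENT_NODE]

# "The second child of sec" as each application would resolve it.
second_for_a = all_children[1].firstChild.data
second_for_b = element_children[1].firstChild.data
```

The same child number resolves to the paragraph "one" under the first counting rule and to "two" under the second, so the two applications would not interoperate.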

Kinds of pointing

A system for identifying data locations can be characterized along two basic axes: The set of targets it can express, versus the set of methods it uses to express them.

The first axis, the range of targets, involves the extent or range of things the system can identify at all: just documents as wholes, just elements, individual characters, strings, etc. URLs leading to HTML generally can identify whole documents (typically files), as well as named A elements that the author specifically provided; but not other elements, words, or any other kind of selection or structure in HTML (unless of course you program via CGI, server plug-ins, etc. -- which not surprisingly can do anything any computer program can do).

The second axis, the range of descriptions available for doing the pointing, involves how well a system can express the intent of the pointer as distinct from what it pointed to: for example, saying "the first footnote with author='Smith'" rather than "element 3735928559". Weakness of a system on this axis limits the means of expressing what you actually mean, as opposed to just what you got, even if the result happens to be identical. This would have the same problems as a human language that tried to remove all synonyms, or a mathematical model that tried to define "integer" by listing all integers. While a grand, even elegant idea at first glance, it is inadequate.

This distinction is the most crucial to any data identification specification. It is usually quite easy to achieve any desired level of target power, and extremely trivial to implement. For example, a pair of byte offsets into an HTML file has very high target power: it can point to any element, any tag, any attribute (name, value, or whole), any character, and any string regardless of how it crosses element boundaries. Generally, the target power is even too high, because most offset pairs point to ill-formed data chunks for which users would likely never need a pointer at all (such as "me='w" in "<p name='wow'>").

At the same time, a pair of byte offsets has almost no descriptive power: the offsets to a particular paragraph, footnote, citation, or other element instance, give no hint of what they are: they're just two numbers. There is nothing in such a system that makes such a reference any easier, more robust, more readable, or simpler to create than an absurd one (such as from the middle of the name of the TYPE attribute of some element to the middle of the 27th word of its 946th descendant). Low-description systems fail to reflect the important fact that some things are ubiquitous, coherent, and highly valuable, but others are bizarre, ill-formed, or nearly useless.
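
The byte-offset case is easy to try out. The Python sketch below is only an illustration; the sample source line and the offsets are chosen for the example.

```python
source = b"<p name='wow'>Some <i>italic</i> text</p>"

# Two offset pairs. Nothing in the numbers says which one is sensible.
absurd = source[5:10]      # a fragment of the start tag
sensible = source[19:32]   # a whole element

# A purely mechanical edit: one extra space inside the start tag.
edited = b"<p  name='wow'>Some <i>italic</i> text</p>"
after_edit = edited[19:32]
```

Here `absurd` is the ill-formed chunk `me='w` and `sensible` is the whole `<i>italic</i>` element, yet both are expressed as just two numbers. After the one-byte edit, the "sensible" pair no longer lines up with the element, and the pair itself gives no way to notice.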

Some other models have the opposite characteristics: IDs have fairly low target power (you can only link to things that have them, and only whole elements can have them), but very high descriptive power (you can say exactly what you mean, since IDs typically express a notion of objects that have true names). This is why they are so highly robust, but not very general.

These two axes are not necessarily a tradeoff: a system can also have neither or both. For example, numbering all the elements of a document in order and using that number, is low to moderate on both kinds of power (though it affords some very nice optimizations for implementors). And a system of structural identifiers that mirrors the structure of information directly can be quite high in both kinds of power.

Systems with high target power but low description power have other problems, such as compromising robustness: they break far more easily and, even worse, make it nearly impossible to detect failures. They also lack reusability: an identifier created for one context can generally be used in other contexts only if it is highly descriptive. A common SGML and XML example is a document available in multiple languages (or multiple drafts or editions). If you assign the same IDs to corresponding elements in each version, it is trivial to re-use links with any version, or even with several at once (say, for parallel text display or comparison); a system that can utilize ancestor, child, and sibling relationships can frequently get the same result even with few IDs around; a purely target-rich system cannot.

Many user comments on the existing WDs and implementations have advocated increasing both kinds of power, but especially description power: to describe the intended destination in more abstract, information-oriented or human-oriented terms, rather than only in terms of geometry or tree position. Taking best advantage of more descriptive pointing may require at least very slight knowledge of tagging practices, but description helps even without such knowledge; for almost any XML it is likely that pointing to an ID-less element by pointing to some nearby element with an ID, and stepwise from there to the final target element, is a good improvement over pointing merely via child-numbers all the way down from the document root element. In many XML applications schemas and use requirements -- such as a transaction record, a bibliography, an RDF file, etc. -- provide much more information that enables automatic construction of highly clear, descriptive pointers -- if the language allows them to exist at all. For example, a pointer to the last transaction in a set, or the one with the highest price, is clear and unambiguous. In general, finding an element with a given ID, or a given kind of element type or context, or elements with certain combinations of simple characteristics in terms of types, attributes, tree relationships to other characterized targets, etc. adds robustness.
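
The transaction example can be sketched in Python. This is an illustration under invented assumptions: the element name `TX`, the `PRICE` attribute, and the helper names are hypothetical, and the code stands in for what a descriptive pointer would express.

```python
import xml.etree.ElementTree as ET


def last_transaction(log):
    """Descriptive pointer: 'the last transaction in the set'."""
    return log.findall("TX")[-1]


def highest_priced(log):
    """Descriptive pointer: 'the transaction with the highest price'."""
    return max(log.findall("TX"), key=lambda tx: float(tx.get("PRICE")))


log = ET.fromstring(
    "<LOG>"
    "<TX ID='t1' PRICE='19.50'/>"
    "<TX ID='t2' PRICE='240.00'/>"
    "<TX ID='t3' PRICE='7.25'/>"
    "</LOG>"
)

last_before = last_transaction(log).get("ID")
highest_before = highest_priced(log).get("ID")

# The data changes: a new transaction is appended.
ET.SubElement(log, "TX", ID="t4", PRICE="3.00")

last_after = last_transaction(log).get("ID")
highest_after = highest_priced(log).get("ID")
```

After the new record is added, "the last transaction" correctly moves to it while "the highest price" stays where it was; each pointer keeps expressing what was meant rather than what happened to be at a fixed position.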

On minimalist pointing

Target-rich pointing may appear at first glance to be all that is needed for XPointer. This is probably because 'select/create-link' is the first interface some applications will implement, and because even completely non-descriptive pointing can trivially support that interface. However, there are many other applications for XPointer. If the criterion is merely ability to point somehow, that would logically lead to byte offsets as the simplest solution; but this is widely agreed to be absurd. While for one specific application scenario that might barely suffice, the situation is clearly more complex; it is analogous to some other familiar situations:

Such "completeness" arguments may be true as far as they go, but are not sufficient: much more is required for an adequate solution in all these cases. Descriptive power requirements matter because human usability requires other features such as readability, indirection, re-use, ability to change things without countless manual repairs, and generally better robustness in the face of change. Thus, more descriptive pointing is a vastly better solution.

Radically minimalist pointing has never been contemplated in XPointer, and should not be. It is extremely easy to implement, and attractive for that reason (although descriptive pointing has also been repeatedly shown not to be hard). But at the same time, minimalist pointing has a number of inherent limitations that taken together make descriptive pointing necessary:

  1. Minimalist pointers are not, in general, easily interpretable by humans. Large precise integers, or vectors of smaller ones, are just not as understandable to humans as "the 5th chapter past here" or "chapter 4 section 2 paragraph 5"; the latter are common, familiar notions.

  2. Minimalist pointing decreases the ability to achieve robustness. This is because with minimalist/procedural pointing, you can only refer to what you got, not to what you wanted. The two are equivalent so long as nothing changes; but as soon as something changes (as is frequent), minimalist pointers break far more readily than descriptive ones. There is a scale of robustness, and only by providing a range of descriptive pointing techniques can link authors (human or otherwise) take advantage of it.

  3. Minimalist pointing is less appropriate for dealing with dynamically-generated HTML or XML, such as database extractions or dynamically-assembled documents (an increasingly common scenario). This is because such information is likely to be changed in small-scale but widespread ways (replacing the stock ticker or visit counter field, etc), and such minor changes will commonly break minimalist pointers to unchanged surrounding data, but not break descriptive pointers. Also, such data typically has systematic regularities that make automated construction of highly descriptive pointers easy and highly robust -- but only if the pointing language lets such pointers exist.

  4. Minimalist pointers have virtually no potential for re-use. They cannot describe relative locations, such as "the next chapter" or "the chapter 5 milestone tags earlier", and so cannot be re-used in multiple contexts (see DeRose 1989 for more on link reuse). If pointers had too little descriptive power, even a trivial "next slide" link in presentations could not be made generic: there could only be an explicit, separate, tediously different "next slide" link on every slide, which again seems absurd. It is hard to imagine plausible situations where a non-descriptive link could be usefully re-used ("37 elements earlier"???).

  5. Minimalist pointers generally provide no selection of targets consisting of multiple locations, such as the set of all elements of a given type, the set of all characters within a given element, etc.

    This is a more severe problem than it seems, because it impacts any XPointers that involve stepwise specifications (sometimes called "location ladders"), even simple ones such as "the SEC that contains an ABSTRACT". If multiple locations are ruled out even for intermediate results in an addressing expression, then such pointers are ruled out even when they would end up at a single node. This is because the evaluator would find a lot of SEC elements first, and only then be able to go on to pick the one that is the final result. If the implementation must support multiple results in intermediate steps, the savings sometimes claimed for ruling them out largely disappears.

  6. Minimalist pointers are commonly limited to identifying a single whole element, a single point or character, or at best a string (offsets can "sort of" point to more, but really only point to byte ranges that may, sometimes, correspond to these units). Normal user selections cannot be modeled.
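
The point made under item 5 above, that a stepwise pointer needs multiple locations as an intermediate result even when it ends at a single node, can be sketched in Python. The step functions below are hypothetical and only illustrate the evaluation order for "the SEC that contains an ABSTRACT".

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<DOC>"
    "<SEC ID='a'><P/></SEC>"
    "<SEC ID='b'><ABSTRACT/></SEC>"
    "<SEC ID='c'><P/></SEC>"
    "</DOC>"
)


def step_descendants(nodes, tag):
    """First step: every descendant of the given nodes with this element type."""
    return [d for n in nodes for d in n.iter(tag) if d is not n]


def step_having_child(nodes, tag):
    """Second step: keep only the nodes that have a child of this element type."""
    return [n for n in nodes if n.find(tag) is not None]


candidates = step_descendants([doc], "SEC")          # intermediate result: several SECs
result = step_having_child(candidates, "ABSTRACT")   # final result: one SEC
result_ids = [n.get("ID") for n in result]
```

The final target is a single element, but the evaluator had to hold three candidates before it could pick one.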

The nature of the distinction

Many of these differences arise from a single underlying cause: namely, inadequate descriptive power. A non-descriptive or trivially descriptive pointer language might in theory be able to point to all the same objects as a descriptive language. However, linking to "the 3rd child of the 4th child of the root" does not mean the same thing as linking to "ID chap5" or "the element immediately preceding the ABSTRACT". They may happen to be the same thing on one day, for one version of one location in one dataset; but the meaning is not the same.

Note: This fundamental distinction is so important that it has names and entire literatures within many fields of study. Linguists call such cases de dicto/de re ambiguities; logicians and mathematicians call them intensional vs. extensional specification; computer scientists call them shallow and deep (or weak and strong) equality; markup theorists call them procedural and declarative markup. In all these fields, providing formal systems that support only one of the two cases is a classic error.

Robustness issues

Another example of the difference in power of descriptive over minimalist pointing involves robustness (that is, pointers that have a good chance of pointing to the same place even after the document has been edited in various ways). It has occasionally been suggested that, depending on usage scenarios, TREELOCs (which specify a node by giving the sequence of child-numbers to walk down to it) might be just as robust as IDs (unique names), or even more so. While this is true in theory, I find it unconvincing because:

  1. Although one can create such scenarios in theory, they are not at all typical of existing practice.

  2. IDs require a "positive option" (editing the ID) to break them, but TREELOCs break readily (even editing a far-distant part of the document) without the user having to do anything local or specific to break them.

  3. Most crucial: with IDs or other descriptive methods, authors can create a work practice under which they can edit without breaking links. Software can even help, for example by tracking deleted IDs so they are not accidentally re-used. With TREELOC there is no such possibility at all: you cannot manage an editing process so that a given node is always the third child, even after you inserted a child before it.

So although it is possible to fail under either approach if you make all the worst possible choices, that does not make the approaches equally robust. Only the ID approach makes it possible to protect yourself by making all the best choices.
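
As one hedged illustration of the software help mentioned in item 3, the Python sketch below tracks deleted IDs so that they cannot be handed out again. The class IdRegistry and its method names are hypothetical and are not drawn from any existing tool or from the XPointer drafts.

```python
class IdRegistry:
    """Hypothetical helper that never reissues an ID once it is retired."""

    def __init__(self):
        self._live = set()      # IDs currently present in the document
        self._retired = set()   # IDs that were deleted and must stay unused

    def is_available(self, id_value):
        """True if the ID has never been claimed."""
        return id_value not in self._live and id_value not in self._retired

    def claim(self, id_value):
        """Register a new ID; refuse live or previously deleted ones."""
        if not self.is_available(id_value):
            raise ValueError(f"ID {id_value!r} is in use or was retired")
        self._live.add(id_value)

    def retire(self, id_value):
        """Record that the element carrying this ID was deleted."""
        self._live.remove(id_value)
        self._retired.add(id_value)


registry = IdRegistry()
registry.claim("chap5")
registry.retire("chap5")   # the chapter was deleted
```

After the deletion, an attempt to claim "chap5" again raises ValueError, so an old link to that ID can fail visibly instead of silently resolving to an unrelated element. No comparable bookkeeping exists for child numbers.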


The many advantages of descriptive pointing are crucial for a scalable, generic pointing system. Descriptive pointing is crucial for all the same reasons that descriptive markup is crucial to documents, and that making links first-class objects is crucial to linking. It is also clearly feasible, as shown by multiple implementations of the prior WDs from the XML WG, and of TEI extended pointers. At the same time, in order to get the specification out in the time frame required, we wish to keep a bound on the size of the language, and not implement all possible constructs, tests, filters, and so on. XPointer thus seeks to provide a small but rich set of descriptive pointing mechanisms, such as walking around trees in terms of their fundamental relationships, without taking on the undue task of a full-fledged, multi-purpose tool to express every conceivable predicate. To do more would take too long; to do less would actually complicate and weaken applications, largely by limiting XPointer to pointers that are unclear to humans, less robust, and less re-usable.

Some of the features of descriptive pointing bear some similarity to querying in general, but that is because the term "querying" covers an awful lot of ground: Yu and Meng (1998) note that "the goal of query processing and optimization is to find user-desired data from an often very large database efficiently" -- where "user-desired" is arbitrarily broad; other definitions speak of selecting data that "fulfills arbitrary sets of constraints" or "has certain characteristics". These all cover a wide range of activities, whose requirements, priorities, and consequent design tradeoffs differ greatly. A search for an ID is a query (though a very simple one); and there are many user and developer requests for XPointer features that overlap with what one expects in a full-blown query language (the relevant issues already assigned numbers include 17-21, 26, 27, 44, and 46-49).

This similarity is inevitable because any language that selects things out of trees requires certain basic operations, such as genetic access to nodes of the tree; without such operations any language that deals with trees would be utterly crippled. However, XPointer has other requirements that are not shared by various other mechanisms that may arise for XML for other purposes. Among these are robustness, plus a quite different user perspective and priority: the purpose with XPointer is to point to a known data object (typically a single one or a well-defined group to be treated as if it were one), rather than to discover whether any data might be out there somewhere and how much.

By separating minimalist vs. descriptive pointing models and acknowledging our need for both, we can assign our existing XPointer issues more clearly into categories that we can deal with effectively. This two-level approach allows a natural beginning.


Abiteboul, Serge et al. 1997. "Querying Documents in Object Databases." In International Journal on Digital Libraries 1(1): 5-19.

André, Jacques, Richard Furuta, and Vincent Quint (eds). 1989. Structured Documents. Cambridge: Cambridge University Press. ISBN 0-521-36554-6.

Brooks, Kenneth P. 1988. "A Two-view Document Editor with User-definable Document Structure." Dissertation, Stanford University Department of Computer Science. Reprinted as Technical Report #33 by Digital Systems Research Center.

Burkowski, Forbes J. 1991. "An Algebra for Hierarchically Organized Text-Dominated Databases." Waterloo, Ontario, Canada: Department of Computer Science, University of Waterloo. Manuscript: Portions "appeared as part of a paper presented at RIAO '91: Intelligent Text and Image Handling, Barcelona, Spain, Apr. 1991."

Conklin, Jeff. 1987. "Hypertext: An Introduction and Survey." In IEEE Computer 20(9): 17-41.

DeRose, Steven J. 1989. "Expanding the Notion of Links." In Proceedings of Hypertext '89, Pittsburgh, PA. Baltimore, MD: Association for Computing Machinery Press.

DeRose, Steven J. and David G. Durand. 1995. "The TEI Hypertext Guidelines." In Text Encoding Initiative: Background and Context. Boston: Kluwer Academic Publishers. ISBN 0-7923-3689-5.

DeRose, Steven and Eve Maler (eds). 1998. "XML Linking Language (XLink)." World Wide Web Consortium Working Draft. March 1998.

DeRose, Steven and Eve Maler (eds). 1998. "XML Pointer Language (XPointer)." World Wide Web Consortium Working Draft. March 1998.

Kahn, Paul. 1989. "Webs, Trees, and Stacks: How Hypermedia System Design Affects Hypermedia Content." In Proceedings of the Third International Conference on Human-Computer Interaction, Boston, MA, September 18-22, 1989.

Liu, C. L. 1977. Elements of Discrete Mathematics. New York: McGraw-Hill. ISBN 0-07-038131-3.