This document is also available in the following non-normative format: XML (DTD, XSL).
Copyright © 2002 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.
This specification defines the XML Pointer Language (XPointer) Framework, an extensible system for XML addressing that underlies additional XPointer scheme specifications. The framework is intended to be used as a basis for fragment identifiers for any resource whose Internet media type is one of text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity. Other XML-based media types are also encouraged to use this framework in defining their own fragment identifier languages.
This is a Last Call W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress." Comments on this document should be sent no later than 31 July 2002 to the public mailing list www-xml-linking-comments@w3.org (archive).
This document has been produced by the W3C XML Linking Working Group as part of the XML Activity. The goals of this work are set out in the XPointer Requirements document.
There are patent disclosures and license commitments associated with this working draft, which may be found on the XPointer IPR Statement page in conformance with W3C policy.
Even though it has not been seen before in this form, this specification is being published as a Last Call Working Draft because it is essentially a subset of the previous specification. This specification contains only the "bare names" and scheme mechanism features that were in the XPointer Candidate Recommendation published on 11 September 2001. Note that the "bare names" functionality has been extended slightly to include a schema-based ID addressing option.
We are specifically seeking input on what the XML Linking WG should recommend as a minimum conformance level for the purposes of XPointer usage in fragment identifiers for any resource whose Internet media type is one of text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity. Actually specifying this level is an issue that will have to be taken up normatively in the successor to IETF RFC 3023 [RFC 3023].
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/
1 Introduction
1.1 Notation
1.2 Terminology
2 Conformance
3 Language and Processing
3.1 Syntax
3.2 Shorthand Pointer
3.3 Scheme-Based Pointer
3.4 Namespace Binding Context
A References
A.1 Normative References
A.2 Non-Normative References
This specification defines the XML Pointer Language (XPointer) Framework, an extensible system for XML addressing that underlies additional XPointer scheme specifications. The framework is intended to be used as a basis for fragment identifiers for any resource whose Internet media type is one of text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity. Other XML-based media types are also encouraged to use this framework in defining their own fragment identifier languages.
Many types of XML-processing applications need to address into the internal structures of XML-encoded resources using URI references, for example, the XML Linking Language [XLink], XML Inclusions [XInclude], the Resource Description Framework [RDF], and SOAP V1.2 [SOAP12]. This specification does not constrain the types of applications that utilize URI references to XML-encoded resources, nor does it constrain or dictate the behavior of those applications once they locate the desired information in those resources.
[Definition: The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [RFC 2119].]
The formal grammar for the XPointer Framework is given using simple Extended Backus-Naur Form (EBNF) notation, as described in the XML Recommendation [XML].
pointer
A string conforming to this specification. This specification defines the syntax and semantics of pointers.
pointer part
A portion of a pointer that provides a scheme name and some pointer data that conforms to the definition of that scheme.
scheme
A specialized pointer data format that has a name and is defined in a specification.
XPointer processor
A software component that identifies some subresource of an XML-encoded resource by applying a pointer to it. This specification defines the behavior of XPointer processors.
application
A software component that incorporates or uses an XPointer processor because it needs to access XML-encoded resources by means of URI references. The occurrence and usage of URI references, and the behavior to be applied to resources and subresources obtained by accessing those URI references, are governed by the definition of each application's corresponding data format (which could be XML-based or non-XML-based). For example, HTML [HTML] Web browsers and XInclude processors are applications that might use XPointer processors.
error
A violation of the rules of this specification; results are undefined. Specifications of XPointer schemes may define their own error conditions that have different consequences from XPointer Framework errors.
failure
The inability of an XPointer processor to identify a subresource within an XML-encoded resource. Note that while failure of a shorthand pointer (shorthand) causes an XPointer Framework error, failure of a pointer part does not.
namespace binding context
A list of zero or more pairs of XML Namespace-defined [XML-Names] namespace prefixes and their associated namespace URIs.
This specification defines a framework; it does not currently define a minimum conformance level for XPointer processors. Thus, the information in this section defines conformance requirements only for the framework portion of any minimum conformance level.
XPointer processors normatively depend on reversing any fragment identifier encoding and escaping done in conformance to [RFC 2396] (as updated by [RFC 2732]).
XPointer processors normatively depend on sufficient input about an XML resource to identify the following information items and properties, in the cases where they exist in the resource:
From the XML Information Set [Infoset]:
[document element] property
Note that if the XML resource is not a document but rather an external parsed entity, this property will not be reported. Rather, the information set is effectively extended to report the one or more top-level elements in the entity as ordered "root element" properties for the entity.
[attributes] property
[children] property
[attribute type] property
[normalized value] property
From the XML Schema post-schema validation information set (PSVI) [XMLSchema]:
[schema normalized value] property
Either:
[member type definition] property
[type definition] property
or:
[member type definition namespace] property
[member type definition name] property
[type definition namespace] property
[type definition name] property
Software components claiming to be XPointer processors must conform to this XPointer Framework specification and any other specifications that, together with this specification, define the minimum conformance level for XPointer, and may conform to additional XPointer scheme specifications. XPointer processors must document the additional scheme specifications to which they conform. Specifications that depend on XPointer processing should document the schemes they require and allow.
Conforming XPointer processors must report XPointer Framework errors to the application. Applications may terminate or recover from XPointer Framework errors in any fashion. XPointer processors should not report failure conditions that do not result in error conditions.
This section describes the XPointer Framework and the behavior of XPointer processors with respect to the framework.
An XPointer processor takes as input an XML-encoded resource and a fragment identifier taken from the URI reference that was used to access the resource, and produces as output either an identification of some subresource within that resource based on the pointer extracted from the fragment identifier, or one or more errors, or both. If the fragment identifier contains any characters that have been encoded or escaped to conform to [RFC 2396] (as updated by [RFC 2732]) requirements, the XPointer processor must reverse the encoding or escaping in order to interpret the pointer.
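As a rough illustration of the unescaping step, the following Python sketch extracts the fragment identifier from a URI reference and reverses percent-escaping with the standard library. The function name `pointer_from_uri_reference` and the example URI are hypothetical, and decoding the escaped octets as UTF-8 is an assumption of this sketch rather than a requirement stated here.

```python
from urllib.parse import unquote, urlsplit


def pointer_from_uri_reference(uri_reference: str) -> str:
    """Return the unescaped pointer carried in a URI reference's fragment."""
    fragment = urlsplit(uri_reference).fragment
    # Reverse %HH escaping; UTF-8 decoding is an assumption of this sketch.
    return unquote(fragment)


# Hypothetical URI reference with an escaped space between two pointer parts.
example = pointer_from_uri_reference("http://example.org/doc.xml#foo(a)%20bar(b)")
```

The resulting string is what the processor would then check against the pointer syntax.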
If a string used as a pointer does not adhere to the syntax defined in this section, it is an error.
[1] pointer     ::= shorthand | schemebased
[2] shorthand   ::= Name
[3] schemebased ::= ptrpart (S? ptrpart)*
[4] ptrpart     ::= scheme '(' schemedata ')'    [VC: Parenthesis escaping]
[5] scheme      ::= NCName
[6] schemedata  ::= Char*
Validity constraint: Parenthesis escaping
The end of a pointer part is signaled by the right parenthesis ")" character that is balanced with the left parenthesis "(" character that began the part. If either a left or a right parenthesis occurs without being balanced by its counterpart, it must be escaped with a circumflex (^) character preceding it. Any literal occurrences of the circumflex must be escaped with an additional circumflex (that is, ^^). Any other use of a circumflex is an error.
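The grammar and the parenthesis-escaping constraint can be sketched as a small hand-written parser. This is a minimal Python sketch under stated assumptions, not a conforming implementation: `parse_scheme_based` and `PointerSyntaxError` are hypothetical names, the `NCNAME` pattern is an ASCII-only approximation of the XML NCName production, and the sketch handles only scheme-based pointers (a shorthand pointer would be checked against Name separately).

```python
import re

# ASCII-only approximation of the XML NCName production (an assumption;
# the real production also admits many non-ASCII characters).
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")
XML_WHITESPACE = " \t\r\n"


class PointerSyntaxError(ValueError):
    """The string does not match the scheme-based pointer grammar."""


def parse_scheme_based(pointer):
    """Split a scheme-based pointer into (scheme, unescaped data) pairs."""
    parts = []
    pos, end = 0, len(pointer)
    while pos < end:
        match = NCNAME.match(pointer, pos)
        if match is None or pointer[match.end():match.end() + 1] != "(":
            raise PointerSyntaxError(f"expected scheme name and '(' at offset {pos}")
        pos = match.end() + 1
        depth, data = 1, []
        while depth > 0:
            if pos >= end:
                raise PointerSyntaxError("unbalanced '(' in pointer part")
            char = pointer[pos]
            if char == "^":
                # A circumflex may only escape '(', ')' or another '^'.
                escaped = pointer[pos + 1:pos + 2]
                if escaped == "" or escaped not in "()^":
                    raise PointerSyntaxError(f"bad circumflex at offset {pos}")
                data.append(escaped)
                pos += 2
                continue
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
            if depth > 0:
                # Keep everything except the parenthesis that closes the part.
                data.append(char)
            pos += 1
        parts.append((match.group(), "".join(data)))
        # Optional white space is allowed only between pointer parts.
        next_pos = pos
        while next_pos < end and pointer[next_pos] in XML_WHITESPACE:
            next_pos += 1
        if next_pos == end and next_pos != pos:
            raise PointerSyntaxError("trailing white space after last pointer part")
        pos = next_pos
    if not parts:
        raise PointerSyntaxError("empty pointer")
    return parts
```

For instance, the hypothetical pointer `foo(a(b)c) bar(x^)y^^)` would split into the part `foo` with data `a(b)c` and the part `bar` with data `x)y^`. Returning the data already unescaped is a design choice of the sketch; a processor could equally hand the raw data to the scheme.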
A shorthand pointer consists of an XML-defined Name alone. The Name identifies a single element in the XML resource by ID as follows, in terms of the XML resource's information set:
If the document has an XML Schema [XMLSchema] PSVI, and exposes the [type definition] and [member type definition] properties, then:
Return the first element information item in document order with an [attributes] collection containing an attribute information item, or whose [children] collection contains an element information item, which has a [schema normalized value] equal to the Name and which has a type definition (the [member type definition], if one is present, otherwise the [type definition]) which has either:
(a) a {target namespace} of "http://www.w3.org/2001/XMLSchema" and a {name} of "ID" (having the effect of identifying the element information item associated with an identifier that is directly assigned the XML Schema ID type), or
(b) a {base type definition} that directly or recursively satisfies (a) (having the effect of identifying the element information item associated with an identifier that is directly or indirectly derived from the XML Schema ID type);
or, if an element information item was not identified in the previous step because the document has an XML Schema PSVI but does not expose the [type definition] and [member type definition] properties, then:
Return the first element information item in document order with an [attributes] collection containing an attribute information item, or whose [children] collection contains an element information item, which has a [schema normalized value] equal to the Name and which has a [type definition namespace] of "http://www.w3.org/2001/XMLSchema" and a [type definition name] of "ID" or a [member type definition namespace] of "http://www.w3.org/2001/XMLSchema" and a [member type definition name] of "ID";
or, if an element information item was not identified in the previous step:
Return the first element information item in document order with an [attributes] collection containing an attribute information item with an [attribute type] of "ID" and a [normalized value] equal to the Name.
If no element is identified by this process, the Name fails and the pointer is in error.
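The last of the three steps above (the one that relies only on the [attribute type] property) might be sketched as follows. The `Element` and `Attribute` classes are a hypothetical in-memory stand-in for information items, not an API defined by this specification, and the PSVI-based steps are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Attribute:
    name: str
    attribute_type: str      # stands in for the [attribute type] property
    normalized_value: str    # stands in for the [normalized value] property


@dataclass
class Element:
    name: str
    attributes: List[Attribute] = field(default_factory=list)
    children: List["Element"] = field(default_factory=list)


def in_document_order(element: Element) -> Iterator[Element]:
    """Yield the element and its descendants in document order."""
    yield element
    for child in element.children:
        yield from in_document_order(child)


def resolve_shorthand(root: Element, name: str) -> Optional[Element]:
    """Return the first element with an ID-typed attribute equal to name."""
    for element in in_document_order(root):
        for attribute in element.attributes:
            if attribute.attribute_type == "ID" and attribute.normalized_value == name:
                return element
    # No element identified: the caller treats this as an error.
    return None
```

Note that an attribute merely named `id` does not match unless its type is ID, which is why missing DTD or schema information leaves the pointer without a target.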
Note:
A shorthand pointer provides, for resources with XML-based media types, a rough analog of HTML fragment identifier behavior. However, if ID typing information is not available because no DTD or schema information is available, the pointer will not identify any element. There are several ways to make element identification more reliable. For example, the creator of a resource can use an internal DTD subset to indicate the presence of ID-typed attributes, and the creator of a pointer can, instead of a shorthand pointer, use a scheme-based pointer and provide one or more schemes that address the desired element in other ways.
A scheme-based pointer consists of one or more pointer parts, optionally separated by XML-defined white space (S). Each part has a scheme name and contains, within parentheses, data (zero or more of the XML-defined Char) conforming to the named scheme. If the scheme data contains parentheses, they must be either balanced or escaped.
In the case of URI references referring to any resource whose Internet media type is one of text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity, this specification reserves all scheme names for definition in additional W3C XPointer scheme specifications. However, the scheme mechanism provides a general framework for extensibility; other XML-based media types are also encouraged to use this framework in defining their own fragment identifier languages.
When multiple pointer parts are provided, an XPointer processor must evaluate them in left-to-right order. If a part being evaluated fails because the XPointer processor does not support the scheme, because the scheme data is syntactically in error according to the specification governing that scheme, or because the scheme identifies no subresource, that part is consumed and the next, if any, is evaluated. The result of the first pointer part whose evaluation succeeds is reported by the XPointer processor as the subresource identified by the pointer as a whole, and evaluation stops. If all the parts fail, it is an error. If a scheme-based pointer has an error in its construction as a whole, evaluation stops and pointer parts are not consumed.
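The left-to-right evaluation rule can be sketched as a simple loop. In this Python sketch, the handler registry, the handler signature (return the subresource or `None`, raise `ValueError` for syntactically bad scheme data), and the names `evaluate_parts` and `XPointerFrameworkError` are all assumptions made for illustration.

```python
class XPointerFrameworkError(Exception):
    """Raised when every pointer part fails."""


def evaluate_parts(parts, handlers, resource):
    """Evaluate (scheme, data) pairs left to right; return the first success."""
    for scheme, data in parts:
        handler = handlers.get(scheme)
        if handler is None:
            continue  # unsupported scheme: the part fails and is consumed
        try:
            result = handler(resource, data)
        except ValueError:
            continue  # scheme data is in error according to its scheme
        if result is not None:
            return result  # first success wins; later parts are not evaluated
        # The scheme identified no subresource: fall through to the next part.
    raise XPointerFrameworkError("all pointer parts failed")
```

With hypothetical handlers registered for `foo` and `bar`, a pointer whose `foo` part identifies nothing would fall through to its `bar` part, and a failure of both would surface as the error.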
A scheme-based pointer might contain characters that are not allowed in a URI reference. For example, this XPointer Framework specification allows spaces between pointer parts, and individual XPointer scheme specifications might allow or require other characters that are disallowed in URI references. During creation of the pointer, it is typically not necessary to perform encoding or escaping on disallowed characters. However, by the time a pointer is fed as input (as a fragment identifier on a URI reference) into a URI resolver, any such characters must have been encoded and/or escaped as defined in [RFC 2396] (as updated by [RFC 2732]).
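As an illustration of the escaping step, the following Python sketch percent-escapes a pointer before it is attached to a URI reference as a fragment identifier. The function name `escape_pointer_for_uri` is hypothetical, and encoding non-ASCII characters as UTF-8 octets (the default of `urllib.parse.quote`) is an assumption of the sketch.

```python
from urllib.parse import quote


def escape_pointer_for_uri(pointer: str) -> str:
    """Percent-escape characters that are disallowed in a URI reference."""
    # Parentheses are unreserved "mark" characters in RFC 2396, so they are
    # left alone; spaces and circumflexes are escaped.
    return quote(pointer, safe="()")


# Hypothetical pointer containing a space and an escaped right parenthesis.
escaped = escape_pointer_for_uri("foo(a b) bar(x^)y)")
```

Here the space becomes `%20` and the circumflex becomes `%5E`, while the parentheses pass through unchanged.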
Scheme specifications may define ways to bind XML Namespaces [XML-Names] prefixes to namespace names for the purpose of interpreting element and attribute names' namespace prefixes appearing in pointer parts. These bindings contribute to a namespace binding context that applies to all pointer parts to the right of the pointer part making the binding, unless exceptions are explicitly made by the schemes in question. The documentation for any namespace-binding scheme must specify whether its bindings remain in effect for later pointer parts. The documentation for every scheme must specify whether it uses the namespace binding context.
The initial namespace binding context prior to evaluation of the first pointer part consists of a single entry: the xml prefix bound to the URI http://www.w3.org/XML/1998/namespace. Pointer parts must not attempt to redefine the xml prefix; any attempt to do so results in no change being made to the namespace binding context.
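A namespace binding context can be modeled as a mapping from prefixes to namespace URIs. The Python sketch below uses hypothetical function names and an example prefix chosen for illustration; how a scheme actually requests a binding is left to that scheme's specification.

```python
XML_NAMESPACE = "http://www.w3.org/XML/1998/namespace"


def initial_binding_context():
    """Context in effect before the first pointer part is evaluated."""
    return {"xml": XML_NAMESPACE}


def bind_prefix(context, prefix, namespace_uri):
    """Return the context that applies to pointer parts further right."""
    if prefix == "xml":
        # Attempts to redefine the xml prefix leave the context unchanged.
        return context
    updated = dict(context)
    updated[prefix] = namespace_uri
    return updated


# Hypothetical binding made by one pointer part for the parts after it.
context = bind_prefix(initial_binding_context(), "ex", "http://example.org/ns")
```

Returning a new mapping rather than mutating the old one keeps the context seen by earlier pointer parts intact, which matches the rule that bindings apply only to parts to the right.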