Copyright © 2000 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This document proposes a facility, similar to that of HTML BASE, for defining base URIs for parts of XML documents.
The XML Linking Working Group, with this 07 June 2000 XML Base second Last Call working draft, invites comment on this specification. Because of the number and nature of comments received, a second Last Call Working Draft is considered prudent. See the XML Base Disposition of Comments. The Last Call period begins 7 June and ends 28 June 2000.
The W3C Membership and other interested parties are invited to review the specification and report early implementation experience. Please send comments to www-xml-linking-comments@w3.org (archive).
While we welcome implementation experience reports, the XML Linking Working Group will not allow early implementation to constrain its ability to make changes to this specification prior to final release.
For background on this work, please see the XML Activity Statement.
A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR/.
The XML Linking Language [XLink] defines XML constructs to describe links between resources. One of the stated requirements on XLink is to support HTML [HTML 4.01] linking constructs in a generic way. The HTML BASE element is one such construct which the XLink Working Group has considered. BASE allows authors to explicitly specify a document's base URI for the purpose of resolving relative URIs in links to external images, applets, form-processing programs, style sheets, and so on.
This document describes a mechanism for providing base URI services to XLink, but as a modular specification so that other XML applications benefiting from additional control over relative URIs but not built upon XLink can also make use of it. The syntax consists of a single XML attribute named xml:base.
The deployment of XML Base is through normative references by new specifications, for example XLink and the XML Infoset. Applications and specifications built upon these technologies will natively support XML Base.
The key words must, must not, required, shall, shall not, should, should not, recommended, may, and optional in this specification are to be interpreted as described in [IETF RFC 2119].
The terms base URI and relative URI are used in this spec as they are defined in [IETF RFC 2396].
The attribute xml:base may be inserted in XML documents to specify a base URI other than the base URI of the document or external entity. The value of this attribute is interpreted as a URI Reference as defined in RFC 2396 [IETF RFC 2396], after processing according to Section 4.1.
In namespace-aware XML processors, the "xml" prefix is bound to the namespace name http://www.w3.org/XML/1998/namespace as described in Namespaces in XML [XML Names]. Note that xml:base can still be used by non-namespace-aware processors.
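As an illustration only, the following sketch shows how a namespace-aware parser exposes the attribute under the namespace name given above. It assumes Python's standard xml.etree.ElementTree module; the element name doc and the constant XML_NS are hypothetical and chosen for the example.

```python
import xml.etree.ElementTree as ET

# Namespace name bound to the "xml" prefix, as stated in the text above.
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical one-element document carrying an xml:base attribute.
elem = ET.fromstring('<doc xml:base="http://example.org/today/"/>')

# ElementTree reports namespaced attributes as "{namespace-name}local-name".
base = elem.get(f"{{{XML_NS}}}base")
print(base)
```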
An example of xml:base in a simple XHTML document follows.
<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xml:base="http://example.org/today/">
  <head>
    <title>Virtual Library</title>
  </head>
  <body>
    <p>See <a href="new.xml">what's new</a>!</p>
    <p>Check out the hot picks of the day!</p>
    <ol xml:base="/hotpicks/">
      <li><a href="pick1.xml">Hot Pick #1</a></li>
      <li><a href="pick2.xml">Hot Pick #2</a></li>
      <li><a href="pick3.xml">Hot Pick #3</a></li>
    </ol>
  </body>
</html>
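The URIs in this example resolve to full URIs as follows:
"what's new" resolves to the URI "http://example.org/today/new.xml"
"Hot Pick #1" resolves to the URI "http://example.org/hotpicks/pick1.xml"
"Hot Pick #2" resolves to the URI "http://example.org/hotpicks/pick2.xml"
"Hot Pick #3" resolves to the URI "http://example.org/hotpicks/pick3.xml"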
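The resolution of the URIs in the example can be reproduced with a short sketch. It assumes Python's urllib.parse.urljoin, which follows the later RFC 3986 resolution algorithm rather than RFC 2396; the two are expected to agree for simple references such as these. The variable names are hypothetical.

```python
from urllib.parse import urljoin

# Base URI given by xml:base on the html element.
document_base = "http://example.org/today/"

# The xml:base on the ol element is itself relative, so it is first
# resolved against the base inherited from the html element.
ol_base = urljoin(document_base, "/hotpicks/")

resolved = {
    "what's new": urljoin(document_base, "new.xml"),
    "Hot Pick #1": urljoin(ol_base, "pick1.xml"),
    "Hot Pick #2": urljoin(ol_base, "pick2.xml"),
    "Hot Pick #3": urljoin(ol_base, "pick3.xml"),
}

for label, uri in resolved.items():
    print(f"{label}: {uri}")
```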
Relative URIs appearing in an XML document are resolved against xml:base attributes as follows:
A relative URI appearing in text content is resolved against the base URI described by the xml:base attribute of the nearest ancestor element having an xml:base attribute.
A relative URI appearing in an attribute value is resolved against the base specified in the xml:base attribute appearing on the element owning the attribute, if one exists, otherwise the xml:base attribute of the nearest ancestor of the owning element having an xml:base attribute. Note that this applies to xml:base attributes themselves.
A relative URI appearing in the content of a processing instruction is resolved against the base URI described by the xml:base attribute of the nearest ancestor element having an xml:base attribute.
In each case, if no ancestor with an xml:base attribute is found, the base is determined by applying RFC 2396 [IETF RFC 2396], which is summarized as follows (highest priority to lowest):
The base URI is determined by the scope of xml:base attributes within the current XML entity, as described above.
The base URI is that of the encapsulating entity (message, document, or none).
The base URI is that of the URI used to retrieve the entity.
The base URI is defined by the context of the application.
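The resolution rules above can be sketched as a tree walk that carries the base URI in scope down to each element. This is a minimal sketch, not a conforming implementation: it assumes Python's xml.etree.ElementTree and urllib.parse.urljoin, it resolves only attributes named href, and it ignores text content, processing instructions, and entity boundaries. The helper collect_hrefs, the sample document, and RETRIEVAL_URI are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def collect_hrefs(elem, inherited_base, out):
    """Hypothetical helper: resolve href attributes against the base in scope."""
    own = elem.get(XML_BASE)
    # An xml:base value is itself resolved against the inherited base.
    base = urljoin(inherited_base, own) if own is not None else inherited_base
    href = elem.get("href")
    if href is not None:
        # The owning element's xml:base, if present, takes effect here.
        out.append(urljoin(base, href))
    for child in elem:
        collect_hrefs(child, base, out)
    return out


DOC = """<doc xml:base="http://example.org/today/">
  <a href="new.xml"/>
  <list xml:base="/hotpicks/">
    <a href="pick1.xml"/>
  </list>
</doc>"""

# Assumed URI used to retrieve the document; it applies only when no
# xml:base attribute is in scope.
RETRIEVAL_URI = "http://example.com/library/index.xml"

hrefs = collect_hrefs(ET.fromstring(DOC), RETRIEVAL_URI, [])
print(hrefs)
```

Passing the retrieval URI as the initial base models the lower-priority fallbacks in the list: a document with no xml:base attributes resolves its references against that URI.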
The scope of xml:base does not extend into external entities, but it does extend into internal entities.
NOTE: The presence of xml:base attributes might lead to unexpected results in the case where the attribute value is provided, not directly in the XML document entity, but via a default attribute declared in an external entity. Such declarations might not be read by software which is based on a non-validating XML processor. Many XML applications fail to require validating processors. For correct operation with such applications, xml:base values should be provided either directly or via default attributes declared in the internal subset of the DTD.
The set of characters allowed in xml:base attributes is the same as for XML, namely [Unicode]. However, some Unicode characters are disallowed from URI references, and thus processors must encode and escape these characters to obtain a valid URI reference from the attribute value.
The disallowed characters include all non-ASCII characters, plus the excluded characters listed in Section 2.4 of [IETF RFC 2396], except for the crosshatch (#) and percent sign (%) characters and the square bracket characters re-allowed in [IETF RFC 2732]. Disallowed characters must be escaped as follows:
Each disallowed character is converted to UTF-8 [IETF RFC 2279] as one or more bytes.
Any octets corresponding to a disallowed character are escaped with the URI escaping mechanism (that is, converted to %HH, where HH is the hexadecimal notation of the byte value).
The original character is replaced by the resulting character sequence.
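The escaping steps above can be sketched as follows. The function name escape_xml_base and the constant DISALLOWED_ASCII are hypothetical, and the set of excluded ASCII characters reflects one reading of RFC 2396 (control characters, space, and the delimiter and unwise characters, minus "#", "%", "[" and "]"); it should be checked against the RFCs before being relied upon.

```python
# Assumed set of excluded ASCII characters: space, delimiters and "unwise"
# characters from RFC 2396, without '#', '%', '[' and ']'.
DISALLOWED_ASCII = set('<>"{}|\\^` ')


def escape_xml_base(value: str) -> str:
    """Hypothetical helper: escape disallowed characters in an xml:base value."""
    out = []
    for ch in value:
        code = ord(ch)
        is_control = code <= 0x1F or code == 0x7F
        if code > 0x7F or is_control or ch in DISALLOWED_ASCII:
            # Step 1: convert the character to UTF-8 octets.
            # Step 2: escape each octet as %HH.
            # Step 3: the escaped sequence replaces the original character.
            out.append("".join(f"%{octet:02X}" for octet in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


print(escape_xml_base("http://example.org/caf\u00e9 menu/#top"))
```

Because "%" is left alone, a value that already contains escapes such as %7E passes through unchanged rather than being escaped a second time.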
XML 1.0 [XML] uses URI references in the system identifiers for external entities. Since these declarations appear outside of the document element (in an internal subset or external DTD), the scoping rules for xml:base prevent these URIs from being affected by the value of xml:base.
The XML Infoset [XML Infoset] defines the base URI property of element information items. The latest Infoset specification supports XML Base for purposes of determining the value of this property. Interfaces, applications, and specifications referencing this infoset property will support XML Base natively.
Namespaces in XML [XML Names] uses URI references, which as currently defined should not be resolved relative to the base URI defined by xml:base for the purposes of namespace identification. Higher level processes which dereference namespace URIs are not covered by the namespaces spec and might at their option specify that xml:base is honored for the purposes of fetching resources at those URIs.
The XLink [XLink] specification requires support for XML Base.
XHTML [XHTML] uses URI references beyond those expressible in XLink. These URI references might be resolved by an application relative to the base URI defined by XML Base. The XHTML specification might want to describe its level of support for XML Base.
XML Schema Part 2: Datatypes [XML Datatypes] defines a uriReference primitive datatype. The XML Datatypes specification might want to require that applications recognizing this datatype and resolving such URIs be aware of XML Base.