Default Prefix Declaration

Henry S. Thompson
13 Jan 2010

1. Disclaimer and backpointer

The ideas behind the proposal presented here are neither particularly new nor particularly mine. I've made the effort to write this down so anyone wishing to refer to ideas in this space can say "Something along the lines of [this posting]" rather than "Something, you know, like, uhm, what we talked about, prefix binding, media-type-based defaulting, that stuff".

This is an expanded version of a W3C QA Blog entry. It has no official standing, but is input into ongoing TAG discussion of the distributed extensibility issue.

2. Introduction

Criticism of XML namespaces as an appropriate mechanism for enabling distributed extensibility for the Web typically targets two issues:

  1. Syntactic complexity
  2. API complexity

Of these, the first is arguably the more significant, because the number of authors exceeds the number of developers by a large margin. Accordingly, this proposal attempts to address the first problem, by providing a defaulting mechanism for namespace prefix bindings which covers the 99% case.

3. Goal(s) of this exercise

There are two potential targets for the design presented here:

  1. A future revision of XML namespaces
  2. HTML5

The former is a long-term target at best. As well as the well-known commitment to stability on the part of the XML community, there is the additional low-level technical problem that as things currently stand there is no mechanism for specifying XML namespace version information in an XML document.

HTML5 is then the primary target. This proposal could be seen either as an addition or as an alternative to the Microsoft proposal.

4. The proposal

Binding
  Define a trivial XML language which provides a means to associate prefixes with namespace names (URIs);
Invoking from HTML
  Define a link relation dpd for use in the (X)HTML header;
Invoking from XML
  Define a processing instruction xml-dpd and/or an attribute xml:dpd for use at the top of XML documents;
Defaulting by Media Type
  Implement a registry which maps from media types to a published dpd file;
Semantics
  Define a precedence, which operates on a per-prefix basis, namely xmlns: >> explicit invocation >> application built-in default >> media-type-based default, and a semantics in terms of namespace information items or appropriate data-model equivalent on the document element.
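
The per-prefix precedence can be sketched as a lookup through layered tables. This is only an illustration in Python, not part of the proposal: the function name resolve_prefix, the layer variables and the example.org URI are all assumptions invented for the sketch, and it models bindings at the document element only (xmlns: declarations deeper in the tree would need per-subtree scoping).

```python
from typing import Mapping, Optional, Sequence


def resolve_prefix(prefix: str, layers: Sequence[Mapping[str, str]]) -> Optional[str]:
    """Return the namespace name bound to prefix by the first layer that binds it.

    layers must be ordered from highest to lowest precedence:
    xmlns: declarations, explicit invocation, application built-in
    default, media-type-based default.
    """
    for layer in layers:
        if prefix in layer:
            return layer[prefix]
    return None  # the prefix is unbound in every layer


# Hypothetical layer contents, for illustration only.
xmlns_declarations = {"svg": "http://example.org/local-svg"}
explicit_invocation = {"xf": "http://www.w3.org/2002/xforms"}
application_builtin = {}
media_type_default = {
    "svg": "http://www.w3.org/2000/svg",
    "ml": "http://www.w3.org/1998/Math/MathML",
}
layers = [xmlns_declarations, explicit_invocation,
          application_builtin, media_type_default]

print(resolve_prefix("svg", layers))  # the xmlns: layer wins for this prefix
print(resolve_prefix("ml", layers))   # falls through to the media-type default
```

Because the lookup is done prefix by prefix, a higher layer that binds only svg leaves the lower layers' bindings for every other prefix in force.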

5. Why prefixes?

XML namespaces provide two essentially distinct mechanisms for 'owning' names, that is, preventing what would otherwise be a name collision by associating names in some way with some additional distinguishing characteristic:

  1. By prefixing the name, and binding the prefix to a particular URI;
  2. By declaring that within a particular subtree, unprefixed names are associated with a particular URI.

In XML namespaces as they stand today, the association with a URI is done via a namespace declaration which takes the form of an attribute, and whose impact is scoped to the subtree rooted at the owner element of that attribute.
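
As a small illustration of the two mechanisms, the following Python sketch parses the same element once with a prefix binding and once with a default namespace declaration, and shows that a namespace-aware parser arrives at the same expanded name either way. The variable names are assumptions made for the example; the parser is the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

XFORMS = "http://www.w3.org/2002/xforms"

# Mechanism 1: prefix the name and bind the prefix to a URI.
prefixed = ET.fromstring(f'<xf:input xmlns:xf="{XFORMS}" ref="xyzzy"/>')

# Mechanism 2: declare that unprefixed names in this subtree belong to a URI.
defaulted = ET.fromstring(f'<input xmlns="{XFORMS}" ref="xyzzy"/>')

# ElementTree reports expanded names in the form "{namespace-uri}local-name".
print(prefixed.tag)
print(defaulted.tag)
print(prefixed.tag == defaulted.tag)
```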

Liam Quin has proposed an additional, out-of-band and defaultable, approach to the association for unprefixed names, using patterns to identify the subtrees where particular URIs apply. I've borrowed some of his ideas about how to connect documents to prefix binding definitions.

The approach presented here is similar-but-different, in that its primary goal is to enable out-of-band and defaultable associations of namespaces to names with prefixes, with whole-document scope. The advantages of focussing on prefixed names in this way are:

Provision is also made for optionally specifying a binding for the default namespace at the document element, primarily for the media type registry case, where it makes sense to associate a primary namespace with a media type.

6. Example

If this proposal were adopted, and a dpd document for use in HTML 4.01 or XHTML1:

<dpd ns="http://www.w3.org/1999/xhtml">
 <pd p="xf" ns="http://www.w3.org/2002/xforms"/>
 <pd p="svg" ns="http://www.w3.org/2000/svg"/>
 <pd p="ml" ns="http://www.w3.org/1998/Math/MathML"/>
</dpd>

was registered against the text/html media type, the following would result in a DOM with html and body elements in the XHTML namespace and an input element in the XForms namespace:

<html>
 <body>
  <xf:input ref="xyzzy">...</xf:input>
 </body>
</html>
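
To make the intended result concrete, here is a rough Python sketch of how a consumer might read the dpd document above and expand element names against it. The helper names load_dpd and expand_element_name are hypothetical, and the sketch assumes the dpd vocabulary is exactly as shown in the example (a dpd element with an optional ns attribute, containing pd elements with p and ns attributes).

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple

DPD_SOURCE = """\
<dpd ns="http://www.w3.org/1999/xhtml">
 <pd p="xf" ns="http://www.w3.org/2002/xforms"/>
 <pd p="svg" ns="http://www.w3.org/2000/svg"/>
 <pd p="ml" ns="http://www.w3.org/1998/Math/MathML"/>
</dpd>
"""


def load_dpd(source: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Return (default namespace, prefix-to-namespace map) from a dpd document."""
    root = ET.fromstring(source)
    default_ns = root.get("ns")  # optional binding for unprefixed element names
    prefixes = {pd.get("p"): pd.get("ns") for pd in root.findall("pd")}
    return default_ns, prefixes


def expand_element_name(qname: str, default_ns: Optional[str],
                        prefixes: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Split a qualified element name and look up its namespace name."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        return default_ns, qname   # unprefixed: use the default namespace
    return prefixes[prefix], local  # raises KeyError if the prefix is unbound


default_ns, prefixes = load_dpd(DPD_SOURCE)
for name in ("html", "body", "xf:input"):
    print(name, "->", expand_element_name(name, default_ns, prefixes))
```

The default namespace is applied here to element names only, in line with how unprefixed attributes are treated in XML namespaces.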

7. Open questions and problems

7.1. Not namespace-well-formed

Although the document in the preceding example is well-formed XML, it is not namespace-well-formed, and if served as application/xhtml+xml, will be rejected by existing browsers which process XML as such.
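
The rejection is easy to reproduce with any namespace-aware XML parser. The sketch below uses Python's xml.etree.ElementTree as a stand-in for a browser's XML parser (an assumption made purely for illustration); the helper name parses_as_namespaced_xml is hypothetical, and it reports False for any parse error, not only for unbound prefixes.

```python
import xml.etree.ElementTree as ET

DOC = """\
<html>
 <body>
  <xf:input ref="xyzzy">...</xf:input>
 </body>
</html>
"""


def parses_as_namespaced_xml(text: str) -> bool:
    """Return True if a namespace-aware parser accepts text, False on any parse error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The xf prefix is not declared anywhere in the document itself.
print(parses_as_namespaced_xml(DOC))

# Adding an explicit xmlns:xf declaration makes the same document acceptable.
declared = DOC.replace("<html>", '<html xmlns:xf="http://www.w3.org/2002/xforms">')
print(parses_as_namespaced_xml(declared))
```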

This problem is eliminated at least in principle if the proposal is only adopted for the HTML serialization of HTML5.

7.2. Transition points

At the moment the HTML5 specification lists svg and math explicitly as being allowed as certain kinds of content in HTML5 documents. It defines a category of foreign elements, namely "elements from the MathML namespace and the SVG namespace", and parsing works largely on the basis of a foreign flag.

It seems likely therefore that converting the specification to treat all non-HTML-namespace elements and attributes in the same way would not require a great deal of work, but see below under Error recovery and Case (in)sensitivity.

7.3. Error recovery

The above comments notwithstanding, some clauses in the current HTML5 spec (see for example the 'in-foreign-content' insertion mode) fall into the general category of error recovery and take action based on specific properties of the SVG and MathML document types. Such type-specific recovery strategies will not generalize to other XML vocabularies, so the amount of "error recovery" which can be supported for them would of necessity be very limited.

7.4. Case (in)sensitivity

The HTML serialization of HTML5 remains resolutely case-insensitive. The current specification goes to some considerable idiosyncratic trouble to repair the damage this causes to camel-case MathML and SVG names.

I don't know what, if anything, can be done about this; it may just be something we have to live with.
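
The kind of repair involved can be sketched as a fix-up table applied after case folding. The Python below is a simplified illustration, not the HTML5 algorithm itself: the function name adjust_svg_tag_name is hypothetical and the table holds only three entries of the sort the specification lists for SVG.

```python
# A few lower-case-to-camel-case corrections of the kind HTML5 applies to SVG
# element names; the real table in the specification is considerably longer.
SVG_CASE_FIXUPS = {
    "foreignobject": "foreignObject",
    "clippath": "clipPath",
    "lineargradient": "linearGradient",
}


def adjust_svg_tag_name(name: str) -> str:
    """Fold a tag name to lower case, then restore known camel-case spellings."""
    lowered = name.lower()
    return SVG_CASE_FIXUPS.get(lowered, lowered)


print(adjust_svg_tag_name("FOREIGNOBJECT"))
print(adjust_svg_tag_name("ClipPath"))
print(adjust_svg_tag_name("CIRCLE"))
```

A table like this has to be written per vocabulary, which is why the approach does not extend to arbitrary namespaces bound through a dpd file: a case-insensitive parser has no way to recover the intended spelling of a name it has never been told about.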

7.5. Static scoping

Dan Connolly has raised the issue that the existing namespace prefix binding mechanism is invulnerable to action at a distance (as long as DTD defaulting is not relied on), whereas this proposal is not. John Kemp proposed using the HTML link element to do the binding, which would address this objection. We would then have, for the earlier example:

<dpd ns="http://www.w3.org/1999/xhtml">
 <link id="xf" href="http://www.w3.org/2002/xforms"/>
 <link id="svg" href="http://www.w3.org/2000/svg"/>
 <link id="ml" href="http://www.w3.org/1998/Math/MathML"/>
</dpd>

where the link elements could also be in a document header, where they would take priority over any offboard bindings.
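
A rough Python sketch of that priority rule follows. It assumes the link-based syntax exactly as shown above (id carrying the prefix, href carrying the namespace name); the helper name link_bindings, the header fragment and its example.org URI are hypothetical and exist only to show an in-document binding overriding an offboard one.

```python
import xml.etree.ElementTree as ET
from typing import Dict

OFFBOARD = """\
<dpd ns="http://www.w3.org/1999/xhtml">
 <link id="xf" href="http://www.w3.org/2002/xforms"/>
 <link id="svg" href="http://www.w3.org/2000/svg"/>
 <link id="ml" href="http://www.w3.org/1998/Math/MathML"/>
</dpd>
"""

# Hypothetical document header that rebinds one prefix locally.
HEADER = """\
<head>
 <link id="svg" href="http://example.org/my-svg-profile"/>
</head>
"""


def link_bindings(source: str) -> Dict[str, str]:
    """Collect prefix-to-namespace bindings from link elements with id and href."""
    root = ET.fromstring(source)
    return {
        link.get("id"): link.get("href")
        for link in root.iter("link")
        if link.get("id") and link.get("href")
    }


offboard = link_bindings(OFFBOARD)
in_document = link_bindings(HEADER)

# Later entries win in a dict merge, so in-document bindings take priority.
effective = {**offboard, **in_document}
print(effective["svg"])
print(effective["xf"])
```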