Default Prefix Declaration
Henry S. Thompson
18 Nov 2009
Table of Contents
- 1. Disclaimer
- 2. Introduction
- 3. The proposal
- 4. Why prefixes?
- 5. Example
1. Disclaimer
The ideas behind the proposal presented here are neither particularly new nor particularly mine. I've made the effort to write this down so anyone wishing to refer to ideas in this space can say "Something along the lines of [this posting]" rather than "Something, you know, like, uhm, what we talked about, prefix binding, media-type-based defaulting, that stuff".
2. Introduction
Criticism of XML namespaces as an appropriate mechanism for enabling distributed extensibility for the Web typically targets two issues:
- Syntactic complexity
- API complexity
Of these, the first is arguably the more significant, because the number of authors exceeds the number of developers by a large margin. Accordingly, this proposal attempts to address the first problem, by providing a defaulting mechanism for namespace prefix bindings which covers the 99% case.
3. The proposal
- Binding
- Define a trivial XML language which provides a means to associate prefixes with namespace names (URIs);
- Invoking from HTML
- Define a link relation dpd for use in the (X)HTML header;
- Invoking from XML
- Define a processing instruction xml-dpd and/or an attribute xml:dpd for use at the top of XML documents;
- Defaulting by Media Type
- Implement a registry which maps from media types to a published dpd file;
- Semantics
- Define a precedence, which operates on a per-prefix basis, namely xmlns: >> explicit invocation >> application built-in default >> media-type-based default, and a semantics in terms of namespace information items or appropriate data-model equivalent on the document element.
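The per-prefix precedence can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name resolve_bindings, its parameter names, and the urn:example:local-svg URI are hypothetical, and each source of bindings is modelled as a plain dictionary from prefix to namespace name.

```python
from collections import ChainMap


def resolve_bindings(xmlns_decls, explicit_dpd, app_defaults, media_type_defaults):
    """Merge prefix bindings per prefix, highest-precedence source first.

    Hypothetical helper: the four arguments stand for the four sources named
    in the proposal, each given as a {prefix: namespace URI} dictionary.
    """
    # ChainMap looks each key up in the maps in order, so the first source
    # that binds a prefix wins and lower layers only fill in missing prefixes.
    return dict(ChainMap(xmlns_decls, explicit_dpd, app_defaults, media_type_defaults))


bindings = resolve_bindings(
    xmlns_decls={"svg": "urn:example:local-svg"},  # made-up local override
    explicit_dpd={"xf": "http://www.w3.org/2002/xforms"},
    app_defaults={},
    media_type_defaults={
        "svg": "http://www.w3.org/2000/svg",
        "ml": "http://www.w3.org/1998/Math/MathML",
    },
)
print(bindings["svg"])  # the xmlns: declaration overrides the media-type default
```

Because the lookup is per prefix, overriding svg in the document leaves the media-type default for ml untouched.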
4. Why prefixes?
XML namespaces provide two essentially distinct mechanisms for 'owning' names, that is, preventing what would otherwise be a name collision by associating names in some way with some additional distinguishing characteristic:
- By prefixing the name, and binding the prefix to a particular URI;
- By declaring that within a particular subtree, unprefixed names are associated with a particular URI.
In XML namespaces as they stand today, the association with a URI is done via a namespace declaration which takes the form of an attribute, and whose impact is scoped to the subtree rooted at the owner element of that attribute.
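As a small illustration of that subtree scoping, the following Python sketch parses a document in which the same prefix is bound to two different URIs in sibling subtrees. The element names and the urn:example: URIs are made up for the example.

```python
import xml.etree.ElementTree as ET

# The prefix "x" is declared twice, once on <a> and once on <b>; each
# declaration applies only within the subtree of the element that carries it.
doc = (
    "<root>"
    '<a xmlns:x="urn:example:one"><x:item/></a>'
    '<b xmlns:x="urn:example:two"><x:item/></b>'
    "</root>"
)

root = ET.fromstring(doc)
# ElementTree reports qualified names as "{namespace-uri}local-name".
item_tags = [el.tag for el in root.iter() if el.tag.endswith("}item")]
print(item_tags)  # ['{urn:example:one}item', '{urn:example:two}item']
```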
Liam Quin has proposed an additional, out-of-band and defaultable, approach to the association for unprefixed names, using patterns to identify the subtrees where particular URIs apply. I've borrowed some of his ideas about how to connect documents to prefix binding definitions.
The approach presented here is similar-but-different, in that its primary goal is to enable out-of-band and defaultable associations of namespaces to names with prefixes, with whole-document scope. The advantages of focussing on prefixed names in this way are:
- Ad-hoc extensibility mechanisms typically use prefixes. The HTML5 specification already has at least two of these: aria- and data-;
- Prefixed names are more robust in the face of arbitrary cut-and-paste operations;
- Authors are used to them: for example, XSLT stylesheets and W3C XML Schema documents almost always use explicit prefixes extensively;
- Prefix binding information can be very simple: just a set of pairs of prefix and URI.
Provision is also made for optionally specifying a binding for the default namespace at the document element, primarily for the media type registry case, where it makes sense to associate a primary namespace with a media type.
5. Example
If this proposal were adopted, and a dpd document for use in HTML 4.01 or XHTML1:
<dpd ns="http://www.w3.org/1999/xhtml">
 <pd p="xf" ns="http://www.w3.org/2002/xforms"/>
 <pd p="svg" ns="http://www.w3.org/2000/svg"/>
 <pd p="ml" ns="http://www.w3.org/1998/Math/MathML"/>
</dpd>
was registered against the text/html media type, the following would result in a DOM with html and body elements in the XHTML namespace and an input element in the XForms namespace:
<html>
 <body>
  <xf:input ref="xyzzy">...</xf:input>
 </body>
</html>
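To make the intended result concrete, here is a rough Python sketch that emulates it with today's tools: it reads the dpd document, turns its bindings into ordinary namespace declarations on the document element, and then parses the result. Nothing here is an existing implementation. The helper names read_dpd and apply_dpd are hypothetical, the text-level splice is deliberately naive, and a real processor would work on the data model and honour the precedence rules from section 3.

```python
import re
import xml.etree.ElementTree as ET

DPD = """<dpd ns="http://www.w3.org/1999/xhtml">
  <pd p="xf" ns="http://www.w3.org/2002/xforms"/>
  <pd p="svg" ns="http://www.w3.org/2000/svg"/>
  <pd p="ml" ns="http://www.w3.org/1998/Math/MathML"/>
</dpd>"""

DOC = '<html> <body> <xf:input ref="xyzzy">...</xf:input> </body> </html>'


def read_dpd(text):
    """Return (default namespace or None, {prefix: namespace URI})."""
    root = ET.fromstring(text)
    return root.get("ns"), {pd.get("p"): pd.get("ns") for pd in root.findall("pd")}


def apply_dpd(doc_text, default_ns, prefixes):
    """Hypothetical helper: add the defaults as xmlns attributes on the document element."""
    decls = "".join(f' xmlns:{p}="{uri}"' for p, uri in prefixes.items())
    if default_ns:
        decls = f' xmlns="{default_ns}"' + decls
    # Naive splice after the first element name. It ignores xmlns attributes
    # already present on the element, which ought to override these defaults.
    return re.sub(r"<([A-Za-z_][\w.\-]*)", lambda m: m.group(0) + decls, doc_text, count=1)


default_ns, prefixes = read_dpd(DPD)
root = ET.fromstring(apply_dpd(DOC, default_ns, prefixes))
tags = [el.tag for el in root.iter()]
print(tags)
```

The printed tags show html and body in the XHTML namespace and input in the XForms namespace, which is the outcome described above.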
I'd like to see a new version of XML where it's as simple as this:
Where the prefix is registered and the "::" denotes that the namespace is sticky.
Being registered "in the W3C namespace registry" would mean no need to use a processing instruction or attribute to get a DPD. Namespaces that are NOT registered could continue to use the old xmlns mechanism for namespacing, or perhaps this DPD stuff. Like content type registration, perhaps unregistered types could be allowed if prefixed with "x-".
The registry would only provide default values, so old content would not be broken in new implementations of this - the old content would be using the xmlns mechanism, and that would override the defaults in the W3C's registry.
Being sticky would mean that descendants would be in the same namespace without needing to prefix them too. If you didn't want stickiness, you could use single colon ":" as usual.
Note no need in this "next" version of XML to prefix the closing tag, since there's no real reason to require that.
Nice stuff, this should keep things clean and organized. Could we start using this mechanism immediately, or do XML parsers not understand it yet?
This is really good and helpful. It also helps us optimize the entire coding involved. Thanks a ton.
Lest there be any confusion, this is a proposal which has not been accepted or implemented anywhere.
Henry,
doesn't this proposal tie your dpd to a media type? If this is the case, are we addressing the problem of distributed extensibility at all?
My typical use case is to annotate a document with my own structured content. With this proposal I would have no way to do so without also creating a derivative media type, which is impractical for as long as media types cannot derive from other media types.
I just worry slightly about the dependence on media types, which tend to be a bit fragile - for example, I don't know of any popular operating system that stores the media type as a property of a file. Perhaps it would work well in conjunction with an xml:media-type="application/xslt+xml" attribute.
1) I'd concur with Dr. Kay's comment, and have a couple of my own. Has any thought been given to the implication of this type of mechanism with respect to validation? While I like both Liam's proposal and your approach, I've not seen much thought given to these proposals wrt XSD or DTDs, though given that there is no formal schema for HTML5 at this stage, this may be a moot point.
2) It seems to me there may be some potential to discuss the intersection points between namespaces, mime-types and bindings as a formal mechanism, perhaps along these lines. Most web clients perforce have a LUT of some sort that does mime-type to bindings; this proposal looks like it has the potential to unify these with namespaces, either via loaded plugins or via XBLs of some sort (see Google's work in this regard).
An earlier version of my proposal did use prefixes; I found, unfortunately, that existing Web browsers run in namespace-checking mode, and refuse to process an XML document that's not well-formed. I considered a mechanism to let people define a character other than colon (too complex!) and also using _ or - (not compatible). I want to make sure that existing XML tools can continue to process the documents at some level, even if they don't understand the new namespaces.
I felt it was important that a system to simplify namespaces did not require that people had a deep understanding of the subtleties of namespaces as they stand today in order to use simplified namespaces, so I wanted it to be possible for instance documents to be namespace-free, and also for the namespace definition file itself to be free of traditional namespace syntax, even though I was not (am not) proposing to remove the traditional namespace syntax.
Consider the HTML 5 treatment of a plain, unadorned svg element: it ends up in the SVG namespace automatically, introducing a new default namespace, but the HTML 5 people I've listened to so far are mostly adamant that they don't want prefixes or further declarations, just a plain svg element as if it were part of HTML. Maybe your proposal would convince them otherwise, though.
In the end I decided to go with (unobtrusive namespaces) but I think I didn't make it clear enough that I want to allow a mixture. One could have XSLT or XQuery (say) to populate instance documents with namespace prefixes, and a JavaScript implementation is possible (although it would potentially interfere with other scripts, as Simon Pieters pointed out to me); these things are not possible if you use colons, but if you use a hyphen it'd work I think, modulo some CSS complexities.
If you extend your proposal very slightly, to allow it to specify default namespaces, it would subsume mine altogether (and become closer to ISO DSRL).
If you allow a prefix other than colon (dot, say), you can maybe get to something rather like the pragmatic namespaces that Micah Dubinko proposed, and also that Tim Bray supported, with (e.g.) ch.ac.cern.html or com.netscape.script or org.w3c.svg (since elements would need to retain the domain name where they were first introduced).
Ongoing discussion about all this is very productive!
How about (instead of the 'pd' element) we use something like this:
Where you could then say, for example:
Which would then even allow local references (via #) to prefix declarations?
NOTE: in private discussion with Henry, he suggested that I use the 'id' attribute instead of 'name':
I definitely like that better than my original suggestion.
John - Interesting, I suppose in principle this means one could use HTML link elements in an HTML document in this way.
Then maybe one could have a way to say, "use the declarations in this other document" too.
There is still a problem that one can't use the colon in this way in much of today's software, making this (I think) an unacceptably disruptive change. But one could use the dot, as Micah has suggested in Pragmatic Namespaces.
Liam