Comments on Namespaces 1.1

from XML Schema Working Group

15 November 2002



The XML Schema WG congratulates the XML Core WG on the production of a last-call draft of Namespaces in XML 1.1. We apologize for the late transmission of these comments and hope they can nonetheless be acted upon.

As regards the modifications made in version 1.1 vis-à-vis version 1.0 of Namespaces in XML, we have no particular technical concerns. We are glad to see the definitions of the terms namespace-well-formed and namespace-valid.

We do have some concerns about some portions of the 1.0 specification which have proven harmful in practice but which have not been changed in the draft of 1.1. We urge the XML Core WG not to move this document further toward Recommendation status without fixing at least those problems listed below with the label 'SEVERE'. Items not so labeled are included in the hopes that the editor will find them useful, but they are not lie-down-in-the-road issues for us.

Within sections of the document, we indicate locations by paragraph (para) and sentence (sent) number; negative numbers are counted from the end of the section or paragraph.

There are several relatively major points we wish to raise, which are described in the following sections of this note. A final section contains a number of less important suggestions included for the convenience of the editors.

Some of the comments below were unanimously approved by the XML Schema Working Group; on some comments, however, as indicated below, there was dissent.

Universal names

[Note: one member of the XML Schema Working Group dissents from the inclusion of this comment on the grounds that it is unreasonable and unnecessary to ask the XML Core Working Group to rethink the motivation and explanation of XML Namespaces. Another member of the Working Group argues that the correction of errors is explicitly in scope for Namespaces 1.1, and that the comments here identify an error and are thus clearly in scope. The Working Group as a whole is not persuaded by either of these positions.]

sec 1 para 3, SEVERE:

These considerations require that document constructs should have universal names, whose scope extends beyond their containing document. This specification describes a mechanism, XML namespaces, which accomplishes this.

The phrase "universal names", in conjunction with similar phrases (e.g. "universally unique" in para 8), suggests to some readers names which are universally unambiguous. Given a "universal name", such readers expect to be able to identify, without further information, a single object denoted by the universal name.

Since (in the view of some observers) the specification could, in fact, if written differently, provide identifiers with such globally unique denotation, it is not immediately obvious to all readers that the interpretation just given of the phrase 'universal names' is erroneous. Since the spec does not, however, in fact provide globally unambiguous identifiers, it is unacceptable to describe it as if it did.

The misleading description in paragraph 3 cost the XML Schema Working Group a substantial amount of time; in the view of one of our chairs, we lost six months or more owing to misconceptions about the nature of the Namespace Recommendation caused more or less directly by this paragraph. We believe this paragraph should be deleted and replaced by one which accurately describes what the specification does. Possible replacement text:

These considerations require that document constructs should have names constructed so as to avoid name clashes between names assigned by different designers, specifications, or naming authorities. This specification describes a mechanism, XML namespaces, which accomplishes this.

Optionally add:

It should be noted that the namespace-qualified names described by this specification are not guaranteed to have globally unique denotations; because this specification does not constrain the construction or internal structure of namespaces, it is possible for the same qualified name to denote more than one object. (For example, in a namespace for a typical XML vocabulary, an element type and a global attribute may have the same qualified name.) Such names must be disambiguated by means not prescribed by this specification; in practice, they are often disambiguated by reference to the context in which they are used.
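The element/attribute case mentioned in the optional text can be illustrated with a minimal sketch. It assumes Python's standard xml.etree.ElementTree module, which reports names in the form {namespace-name}local-part; the namespace name http://example.org/vocab and the local name 'title' are invented for the illustration and come from no real vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: an element type and a global attribute
# that share the same namespace name and the same local part.
DOC = """<doc xmlns:v="http://example.org/vocab">
  <v:title v:title="short">Long form</v:title>
</doc>"""

root = ET.fromstring(DOC)
title_element = root[0]

element_name = title_element.tag                 # name of the element type
attribute_name = next(iter(title_element.attrib))  # name of the global attribute

# The two names are equal as strings, yet they denote different things;
# only the context of use (element vs. attribute) tells them apart.
print(element_name)
print(attribute_name)
print(element_name == attribute_name)
```

The comparison prints True: the qualified name alone does not determine which of the two objects is meant.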

para 8 sent -2 SEVERE:

... The combination of the universally managed IRI namespace and the document's own namespace produces identifiers that are universally unique. ...

The term 'universally unique' is undefined and misleading. It suggests to some readers that the identifiers so described will or must have universally unique denotations (see our note on para 3), but such universally unique denotation is neither guaranteed nor required by the mechanism defined in this spec.

It is true that a qualified name is universally unique in that it is necessarily distinct from any other qualified name. But this is true of unqualified names as well: the identifier p is necessarily distinct from any other identifier, i.e. from any identifier which is not p, and it is thus in that sense universally unique.

Neither qualified names nor unqualified names are guaranteed to have unique denotations; what is achieved in practice by the Namespaces specification is that different naming authorities can assign names without the risk of name clashes between names assigned by different authorities.

(We note in passing that the use of namespaces can only guarantee freedom from name collisions if different naming authorities can be relied on to choose different URIs to serve as namespace names; in practice, they do, but we note that nothing in the Namespaces in XML specification provides any guidance on the matter. If two different naming authorities were to attempt to define names for the namespace 'http://ecommerce.org/schema', for example, there is no guarantee that they would successfully avoid name collisions. But we are unable to identify any rule in the Namespaces specification which would make their practice non-conforming.)
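The parenthetical point can be sketched as follows. This is an illustration only: the three "authorities", their registries, and the descriptions are hypothetical, and names are modelled simply as (namespace name, local part) pairs.

```python
# Hypothetical model: each naming authority records the names it has
# assigned, keyed by (namespace name, local part).
SHARED = "http://ecommerce.org/schema"


def expanded(namespace_name, local_part):
    """Pair a namespace name with a local part."""
    return (namespace_name, local_part)


def clashes(registry_a, registry_b):
    """Return the names assigned by both registries."""
    return sorted(set(registry_a) & set(registry_b))


# Two authorities that both chose the same namespace name.
authority_a = {expanded(SHARED, "price"): "amount charged for an item"}
authority_b = {expanded(SHARED, "price"): "a price-list container"}

# A third authority that chose a different namespace name.
authority_c = {expanded("http://example.org/other", "price"): "a quoted rate"}

print(clashes(authority_a, authority_b))  # the shared name collides
print(clashes(authority_a, authority_c))  # no collision
```

Freedom from collision here comes entirely from the authorities having picked different namespace names, not from any property of the qualified names themselves.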

XML namespaces and conventional namespaces

sec 1 para 4 sent -2/-1 editorial:

... XML namespaces differ from the "namespaces" conventionally used in computing disciplines in that the XML version has internal structure and is not, mathematically speaking, a set. These issues are discussed in B The Internal Structure of XML Namespaces.

These last two sentences have proven more misleading than they seem to us to be worth. They are in any case false, since (1) XML namespaces need not have any particular structure and (2) the namespaces used in conventional programming languages (e.g. the variable and function-name namespaces of Algol 60, C, Lisp, Pascal, etc.) are not sets.

On (1): Nothing in this specification or in the XML 1.0 or 1.1 specifications constrains the internal organization of a namespace, requires those responsible for a namespace to follow any particular discipline, or makes any guarantees about the nature of namespaces. The description of 'namespace partitions' in Appendix B is not normative and does not provide an adequate account of the naming discipline of XML vocabularies defined by any known schema language. XML 1.0 DTDs have naming constraints not described there, and newer schema languages including XML Schema 1.0 do not place all elements into the same symbol space.

On (2): in many conventional programming languages, the namespace used for variables (for example) is not guaranteed to be a set. There may be arbitrarily many distinct variables with the same name in a program; in Algol and its descendants they are distinguished by their lexical scope, and in some other languages by their dynamic scope.
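Point (2) can be illustrated with a short sketch. Python is used here only as a convenient stand-in for a lexically scoped language; the variable and function names are arbitrary.

```python
# Three distinct variables, all named "x", coexist in one program.
# They are distinguished by lexical scope, not by their names.
x = "outermost x"


def outer():
    x = "outer x"  # a different variable from the outermost x

    def inner():
        x = "inner x"  # a third variable, again named x
        return x

    return x, inner()


values = (x,) + outer()
print(values)
```

The single identifier x denotes three different variables, so the variable names of this program do not form a set of uniquely denoting names.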

On the whole, we think it may be wisest simply to delete these two sentences. We would be happy if it were possible to replace them with a coherent account of the notion of 'namespace' as used in this specification and how it relates to other uses of the term. But we recognize that clear, coherent descriptions of namespaces as defined by this specification have proven remarkably difficult to construct or to elicit. The best we have managed ourselves is:

The namespaces described by this specification differ from the 'namespaces' conventionally used in computing disciplines in that this specification does not define any particular internal structure for namespaces, nor any rules for resolving name clashes.

Appendix B The Internal Structure of XML Namespaces

This appendix may have seemed a good idea when Namespaces 1.0 was issued; it has not worn well, and has caused more confusion than it has avoided. We believe sections B.1 and B.2 should either be rewritten or deleted. At the very least, the description of 'partitions' should be related explicitly to XML 1.0 and 1.1 DTDs, and it should be pointed out that other schema languages use different naming disciplines.

Namespace declarations as attributes or as pseudo-attributes

[Note: two members of the XML Schema Working Group dissent from the inclusion of this comment on the grounds that the layering of the information set above XML-plus-namespaces above XML-as-specified is sufficiently clear: the information set does not say that namespace declarations are not attributes, only that some attributes in the XML document are represented by attribute information items in the value of the [attributes] property, and other attributes in the XML document are represented by attribute information items in the value of the [namespace attributes] property. The Working Group as a whole felt that the distinction in the information set Recommendation and the wording in the Namespaces 1.1 draft, taken together, were confusing enough to merit the comment.]

sec 1 para 9 sent 3, SEVERE:

An attribute-based syntax described below is used to declare the association of the namespace prefix with an IRI reference; ...

This sentence describes the syntax of namespace declarations as "attribute-based", but this seems incompatible with the decision on this matter made by the XML Core Working Group in the development of the XML Information Set Recommendation. We recall that the Core WG decided then, over the protests of some commentators, that namespace declarations should not appear in the value of the [attributes] property. The XML Schema spec has followed the lead of the Infoset spec in this matter and we do not believe that the decision should be reversed or revisited; that means, however, that the text here should be revised to conform to it. The syntax of namespace declarations may be described as attribute-like, or namespace declarations may be described as pseudo-attributes, but they MUST NOT be defined here as attributes and in the infoset spec as non-attributes, without even a note making clear that they are syntactically attributes, but do not appear in the value of the [attributes] property in the information set.

sec 2 para 1 sent 1, SEVERE:

[Definition: A namespace is declared using a family of reserved attributes....]

According to the infoset spec, namespaces are NOT declared using a family of attributes, but using namespace declarations (see above on section 1, para 9). For 'attributes' perhaps read 'pseudo-attributes' or for 'using a family of reserved attributes' perhaps read 'using a special attribute-like syntax'.
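The practical effect of the two views can be seen in a small sketch. It assumes two modules from the Python standard library, xml.dom.minidom and xml.etree.ElementTree, chosen only because they happen to differ on this point; neither is claimed to be an implementation of the infoset. The attribute edi:taxClass and its value are invented for the example.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# One start-tag carrying a namespace declaration and an ordinary attribute.
DOC = '<x xmlns:edi="http://ecommerce.org/schema" edi:taxClass="exempt"/>'

# View 1: the declaration is exposed alongside the ordinary attributes.
dom_root = minidom.parseString(DOC).documentElement
declaration_is_attribute = dom_root.hasAttribute("xmlns:edi")

# View 2: the declaration is consumed by namespace processing and is
# absent from the element's attribute mapping.
et_root = ET.fromstring(DOC)
et_attribute_names = list(et_root.attrib)

print(declaration_is_attribute)
print(et_attribute_names)
```

The same document thus yields one attribute under one API and two under the other, which is the kind of divergence the wording of the specification ought to anticipate.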

Terminology

sec 1 para 8 editorial, serious:

Names from XML namespaces may appear as qualified names, which may contain a single colon separating the name into a namespace prefix and a local part. The prefix, which is mapped to an IRI reference, selects a namespace. The combination of the universally managed IRI namespace and the document's own namespace produces identifiers that are universally unique. Mechanisms are provided for prefix scoping and defaulting.

We have found it exceedingly helpful, both in the XML Schema specification and in the internal discussions of our Working Group, to have several different pairs of terms, which denote respectively:

Distinction (a) is conveyed by the terms QName and NCName, both in this spec and in ours. Distinction (b) we have often made by means of the terms 'namespace-qualified name' vs. 'unqualified name'. We denote distinction (c) with the terms 'prefixed name' and 'unprefixed name'. It would be useful, we believe, for all of those who must discuss, and even more importantly those who must teach others, the ins and outs of namespaces in XML, if the Namespaces in XML specification defined terms which conveyed distinctions (b) and (c). The failure of the 1.0 version to do so is an annoying but easily reparable flaw.
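That distinctions (b) and (c) do not coincide can be shown with a minimal sketch. It assumes Python's xml.dom.minidom, which keeps both the prefix and the namespace name of each element; the namespace names and element names are hypothetical, and the reading of 'namespace-qualified' as 'has a namespace name' is our own gloss.

```python
from xml.dom import minidom

# Hypothetical document: a default namespace, a prefixed namespace,
# and a child that undeclares the default namespace.
DOC = (
    '<a xmlns="http://example.org/default" xmlns:p="http://example.org/p">'
    '<p:b/><c xmlns=""/></a>'
)


def classify(element):
    """Report distinctions (b) and (c) for one element name."""
    return {
        "lexical": element.tagName,
        "prefixed": element.prefix is not None,
        "namespace_qualified": element.namespaceURI is not None,
    }


root = minidom.parseString(DOC).documentElement
rows = [classify(root)] + [classify(child) for child in root.childNodes]

for row in rows:
    print(row)
```

The element a is unprefixed yet namespace-qualified, p:b is both prefixed and namespace-qualified, and c is neither; without separate terms for (b) and (c) the first case is awkward to describe.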

Miscellaneous

Section 1: Motivation and Summary

para 1 sent 2 typo: for 'modularity;' read 'modularity:'

para 1 sent -1 editorial:

... One motivation for this is modularity; if such a markup vocabulary exists which is well-understood and for which there is useful software available, it is better to re-use this markup rather than re-invent it....

We believe this statement in its current form, expressed without qualification or exception, is too strong. It does not in fact mirror the views or practice of all experienced language designers. For "it is better" we suggest "it may be better" or "some designers will prefer".

para 2 sent 2 editorial: for "tags" we suggest "elements".

para 5 editorial:

[Definition: IRI references which identify namespaces are considered identical if and only if they are exactly the same character-for-character.] Case differences and escaping differences (including case differences in escape sequences) are therefore significant. ... Examples include IRI references which differ only in case or escaping , or which are in external entities which have different effective base URIs.

Delete 'therefore'; in 'escaping , ' delete excess blank before comma.
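The character-for-character rule quoted above can be illustrated with a short sketch, assuming Python's xml.etree.ElementTree, which carries namespace names through as plain strings. The four namespace names are invented; they differ only in case or in the case of an escape sequence.

```python
import xml.etree.ElementTree as ET

# Four namespace names that many URI-aware tools would treat as
# equivalent in pairs, but which differ as character strings.
DOC = (
    '<r xmlns:a="http://www.example.org/wine"'
    '   xmlns:b="http://www.Example.org/wine"'
    '   xmlns:c="http://www.example.org/%7Ewine"'
    '   xmlns:d="http://www.example.org/%7ewine">'
    '<a:x/><b:x/><c:x/><d:x/></r>'
)

root = ET.fromstring(DOC)
tags = [child.tag for child in root]
distinct_names = len(set(tags))

for tag in tags:
    print(tag)
print(distinct_names)
```

All four x elements end up with different names, since no case folding or escape normalization is applied when namespace names are compared.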

para 6 editorial, medium serious:

The empty string, though it is a legal IRI reference, cannot be used as a namespace name.

For 'cannot be' we think 'must not be' should probably be substituted. We think, that is, that this is a normative prohibition, not the statement of a fact which could be established as a logical consequence of rules elsewhere in the spec; we realize we may be wrong in this thought.

Section 4 Using Qualified Names

para 2 sent 1 (after production [11]) editorial, medium serious:

An example of a qualified name serving as an element type:

For "a qualified name serving as an element type" read "a qualified element type name" or "a qualified name serving to identify an element type". Element types are not identical to their names.

para 5 Namespace constraint: Prefix defined, sent -1.

Furthermore, the attribute value in the innermost such declaration must not be empty.

For "empty" (which is undefined and prone, as our experience amply shows, to confusion) we recommend substituting "the empty string". [If you accept our objections to saying that namespace declarations are attributes, then the term "attribute value" should also be changed to "value".]

Section 5.1 Namespace Scoping

para -1, example: the start-tag for the third 'x' element is not aligned properly; should be aligned with the illegal 'n1:a' element above it.

Section 5.2 Namespace Defaulting

para 2, example: We believe it would be desirable to use the actual XHTML namespace in this example. The same applies to all the other examples which use an imaginary HTML namespace; this includes the first example in section 5.1 and both of the last two examples in section 5.2.

para 3 sent -1, editorial:

This has the same effect, within the scope of the declaration, of there being no default namespace.

For "the same effect ... of there being" read "the effect ... of there being" or "the same effect ... as there being".
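The behaviour the sentence describes can be sketched as follows, assuming Python's xml.etree.ElementTree and a hypothetical namespace name http://example.org/ns. A default namespace declaration whose value is the empty string leaves the elements in its scope with no namespace name at all.

```python
import xml.etree.ElementTree as ET

# The inner element declares xmlns="" and so cancels the default
# namespace for itself and its descendants.
DOC = (
    '<outer xmlns="http://example.org/ns">'
    '<inner xmlns=""><leaf/></inner>'
    '</outer>'
)

root = ET.fromstring(DOC)
inner = root[0]
leaf = inner[0]

print(root.tag)   # carries the default namespace name
print(inner.tag)  # no namespace name
print(leaf.tag)   # no namespace name; the empty declaration is still in scope
```

Within the scope of the empty declaration, names are reported exactly as they would be if no default namespace had been declared.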

Section 7 Internationalized Resource Identifiers (IRIs)

para 2, ordered list, item 1, editorial: for 'bytes' read 'octets', to avoid confusion with the historic meaning of 'byte' as 'the number of bits needed to represent a single character'. (et passim)