The following is a Change Proposal for ISSUE-41, Distributed Extensibility.
This proposal was written by Rob Ennals, but contains ideas from many other people, including Tony Ross, Maciej Stachowiak, Henry S. Thompson, Sam Ruby, David Singer, and Ian Hickson.
This is a personal proposal, and not an official Intel position.
This is a rather long change proposal for what is, in effect, a relatively simple change. However, this topic has been the source of much argument, and it's important to give a proper hearing to the issues that have been discussed.
An extension spec can define attributes, but not element names. To prevent name clashes and draw attention to the fact that an extension is being used, such attributes must be of the form "-prefix-name". Where a spec would otherwise have defined a new element name, it should instead allow authors to write "<foo -prefix-name>" where "foo" is the most appropriate element currently in HTML (which may be just "span" or "meta"). When future versions of HTML add more element types, the recommended element type may change.
To prevent clashes between different prefixes, and to let people find out what each prefix means, an extension should register its prefix, together with the specification that defines its attributes, on a central wiki. If a spec has not yet registered a prefix, then it should only use prefixes of the form "x-prefix-name".
A document that uses such extensions is not valid HTML; however, it is valid "extended HTML". A validator may describe a document as being valid "extended HTML+X+Y" where X and Y are the extensions that the document uses.
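As a rough illustration of the summary above, the following Python sketch splits attribute names into prefix and name and builds the "extended HTML+X+Y" label. It is only a sketch under stated assumptions: we assume that an experimental attribute is written "-x-prefix-name", that prefixes contain no hyphen other than the one in "x-", and the prefixes "geo" and "x-acme" are hypothetical. It does not check anything else about the document.

```python
import re

# Assumption: an extension attribute is "-" + prefix + "-" + name, and a
# prefix contains no hyphen except for the leading "x-" of an experimental one.
EXTENSION_ATTR = re.compile(r"^-(x-[a-z0-9]+|[a-z0-9]+)-([a-z0-9-]+)$")


def split_extension_attribute(attr_name):
    """Return (prefix, name) for an extension attribute, or None otherwise."""
    match = EXTENSION_ATTR.match(attr_name.lower())
    if match is None:
        return None
    return match.group(1), match.group(2)


def extension_label(attr_names, registered_prefixes):
    """Build a label such as "extended HTML+X+Y" from the attributes in use."""
    used = set()
    for attr in attr_names:
        parts = split_extension_attribute(attr)
        if parts is None:
            continue  # ordinary attribute; normal HTML checking applies
        prefix, _name = parts
        if not prefix.startswith("x-") and prefix not in registered_prefixes:
            raise ValueError(f"unregistered prefix {prefix!r} should use the x- form")
        used.add(prefix)
    if not used:
        return "HTML"
    return "extended HTML+" + "+".join(sorted(used))


if __name__ == "__main__":
    # "geo" and "x-acme" are made-up prefixes used only for this example.
    print(extension_label(["class", "-geo-lat", "-x-acme-widget"], {"geo"}))
```

A real validator would also need to check each attribute against the extension's own specification; this sketch only derives the label.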
It is important that we allow independent implementers and standards bodies to develop well-specified vendor-neutral extensions for HTML5 in a way that does not cause problems for other extensions or for future extensions to HTML5. In particular:
- It is often useful for people to define extensions to HTML. These may be vendor-specific experiments for features that may eventually get folded into the main HTML spec. They may be enterprise-specific extensions that would be of little use for general HTML. They may be community-specific features that are useful to too few people to make sense as part of the core spec. They may be experiments by a relatively obscure group on top of a patched open source browser that turn out to work well and get adopted by a vendor.
- Irrespective of whether we think it is good for people to create extensions, it is inevitable that they will, and so it is useful for us to provide guidelines about how they can do so in a way that is minimally damaging.
- We should encourage writers of extensions to HTML to design their specs in such a way that they cause minimal harm to future versions of HTML. In particular, the authors of such specs should be encouraged to use prefixes to avoid name clashes, and define attributes rather than tag names, to allow graceful fall-back to HTML elements. As a way of giving cultural support for people who design their extensions properly, we allow documents that use such extensions to validate as "extended HTML".
- We should make a clear distinction between features that are part of HTML, and features that are part of an extension to HTML. Features that are part of HTML are assumed to be well specified, vendor-neutral, supported by the majority of browsers, and designed to integrate well with the rest of the spec. Once an extension specification develops to the level where it meets these criteria, it should itself become part of HTML and its features should no longer require a prefix. The presence of a prefix should be a signal to a developer that they are using a non-standard extension.
- However we also want to make it clear to authors when they are using non-standard features that may not be widely supported. We thus do not allow documents that use such extensions to be considered valid HTML. They may only be validated as "extended HTML".
- Experience has shown that problems arise when people design extensions that define new element names. Since an element can have only one tag name, it becomes impossible for an element to fall back to an appropriate HTML tag, and problems arise if the element name eventually becomes part of HTML. We thus do not allow extensions to define new tag names. Rather than allowing an author to write "<-prefix-name>" a spec should instead request that an author write "<tag -prefix-name>" where "tag" is whatever element type in the current HTML standard has the most appropriate semantics.
- In a previous draft of this proposal, we allowed extension specs to define new tag names, provided that they guaranteed that they would treat an element that had an attribute with the same name with equivalent semantics. This has been dropped, since we concluded that this adds too much complexity to the design of extensions, and such fall-back to attributes is likely to be poorly tested in practice.
- Using ":" as the prefix separator causes compatibility problems, and so "-" is used instead. If the namespace of a prefixed node is defined to depend on the innermost XMLNS declaration, then the parsed DOM is incompatible with the current HTML spec. If however the handling of namespaces is left as it is, then the DOM is incompatible with an XML parse of the same document. Moreover, it is confusing to use the same syntax for these fixed-meaning prefixes and XML's variable-meaning namespace-binding-dependent prefixes.
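The incompatibility described in the last point can be seen with a small Python sketch. It feeds the same markup to the standard library's tolerant HTML tokenizer and to its XML parser. Note the assumptions: `html.parser` is not a conforming HTML5 parser and is used here only because it keeps attribute names as written, and the prefix "my" and the URI "http://example.org/my" are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical markup: the prefix "my" and its namespace URI are made up.
MARKUP = '<p xmlns:my="http://example.org/my" my:note="hello"></p>'


class AttributeCollector(HTMLParser):
    """Records every attribute name seen on a start tag."""

    def __init__(self):
        super().__init__()
        self.attribute_names = []

    def handle_starttag(self, tag, attrs):
        self.attribute_names.extend(name for name, _value in attrs)


def html_attribute_names(markup):
    """Attribute names as an HTML-style tokenizer reports them."""
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.attribute_names


def xml_attribute_names(markup):
    """Attribute names as a namespace-aware XML parser reports them."""
    return sorted(ET.fromstring(markup).attrib)


if __name__ == "__main__":
    # HTML-style: the prefix stays part of the attribute name.
    print(html_attribute_names(MARKUP))
    # XML: the prefix is resolved to a namespace and the xmlns declaration
    # is no longer an ordinary attribute.
    print(xml_attribute_names(MARKUP))
```

The two parses disagree both on the attribute's name and on whether the xmlns declaration counts as an attribute at all.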
It is also important that whatever mechanism we provide for distributed extensibility is easy for authors to use. In particular:
- Authors have difficulty understanding the relationship between a prefix and a namespace and assume a prefix always means the same thing. We avoid this problem by using "-" separated prefixes that have a fixed meaning, rather than ":" separated prefixes in which the meaning depends on the namespace bindings currently in scope.
- It is important that the DOM representation of a document be identical between XML and HTML. The use of ":" separated prefixes, as used by XML namespaces, would cause problems, since HTML treats the prefix as being part of the name, and XML separates it off. To avoid this problem, we separate prefixes with "-". In this model, there is no namespace and the prefix is always part of the name.
- Authors make errors when they copy and paste code into a document in which a prefix is either not defined, or is defined to have a different meaning. The meaning of a prefixed node should not depend on its context.
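The copy-and-paste problem can be illustrated with a short Python sketch: the same XML fragment acquires a different expanded name depending on the namespace binding in scope, and fails to parse when no binding is in scope. The fragment, the prefix "my", and both URIs are hypothetical, and `fixed_attribute_name` is simply our reading of the "-prefix-name" form.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment that an author might paste into another document.
FRAGMENT = '<item my:rating="5"/>'


def expanded_attribute_names(fragment, prefix, namespace_uri):
    """Parse the fragment inside a wrapper that binds prefix to namespace_uri."""
    wrapped = f'<root xmlns:{prefix}="{namespace_uri}">{fragment}</root>'
    root = ET.fromstring(wrapped)
    return sorted(root[0].attrib)


def fixed_attribute_name(prefix, name):
    """A "-prefix-name" attribute is the same string in every context."""
    return f"-{prefix}-{name}"


if __name__ == "__main__":
    # Same text, two different meanings, depending on the surrounding document.
    print(expanded_attribute_names(FRAGMENT, "my", "http://example.org/reviews"))
    print(expanded_attribute_names(FRAGMENT, "my", "http://example.com/films"))
    # With no binding in scope, the fragment is not even well-formed.
    try:
        ET.fromstring(FRAGMENT)
    except ET.ParseError as error:
        print("parse error:", error)
    print(fixed_attribute_name("my", "rating"))
```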
These modifications are based on the W3C working draft as of March 4th, 2010: http://www.w3.org/TR/2010/WD-html5-20100304/
Section 2.2.2: Extensibility
Replace this text:
Vendor-specific proprietary extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
With this text:
Vendor-specific proprietary extensions to this specification are strongly discouraged, since they reduce interoperability and fragment the user base, allowing only users of specific user agents to access the content in question.
If a document is to be considered valid HTML then it must not use such extensions. A document may however be considered valid "extended HTML" if it only uses extensions that conform to the criteria below:
Replace this text:
If vendor-specific markup extensions are needed, they should be done using XML, with elements or attributes from custom namespaces. If such DOM extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
With this text:
If vendor-specific markup extensions are needed, they may be done in one of two ways. The preferred mechanism is that such extensions should be done in XML, with elements and attributes from custom namespaces.
Alternatively, an extension may define new attribute names (but not elements) that can be used in both XML and "extended HTML", provided that all such names are of the form "-prefix-name", where "prefix" is chosen with the aim of being unique.
To ensure that such prefixes are indeed unique, the authors of such an extension should register their prefix on the HTML WG wiki (url), together with a description of what the extension does, and a specification. Alternatively, an experimental spec may use a prefix of the form "x-prefix", where there is no guarantee that the prefix is unique.
If DOM extensions are needed, the members should also be prefixed with a string that is unique to the extension or prefix, to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
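The registration step in the replacement text above could be modelled as follows. This is a minimal in-memory sketch in Python, not a description of how the wiki would actually work: the class and method names are hypothetical, the prefix "geo" and its URL are made up, and refusing to register "x-" prefixes is our assumption, based on their being reserved for experimental specs that offer no uniqueness guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """One registry entry: a prefix, what the extension does, and its spec."""
    prefix: str
    description: str
    spec_url: str


class PrefixRegistry:
    """Hypothetical stand-in for the central wiki of registered prefixes."""

    def __init__(self):
        self._entries = {}

    def register(self, prefix, description, spec_url):
        # Assumption: "x-" prefixes are experimental and are never registered.
        if prefix.startswith("x-"):
            raise ValueError("experimental x- prefixes are not registered")
        if prefix in self._entries:
            raise ValueError(f"prefix {prefix!r} is already registered")
        entry = Registration(prefix, description, spec_url)
        self._entries[prefix] = entry
        return entry

    def lookup(self, prefix):
        """Return the registration for a prefix, or None if there is none."""
        return self._entries.get(prefix)

    def is_usable(self, prefix):
        """A prefix may be used if it is registered or is experimental."""
        return prefix.startswith("x-") or prefix in self._entries


if __name__ == "__main__":
    registry = PrefixRegistry()
    # "geo" and the URL below are invented for this example.
    registry.register("geo", "Geographic annotations", "http://example.org/geo-spec")
    print(registry.is_usable("geo"), registry.is_usable("x-acme"), registry.is_usable("maps"))
```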
- HTML5 supports distributed extensibility. Independent groups can create extensions to HTML5 without having to go through the HTML working group.
- Life is made easier for authors by using fixed, registered, prefixes, making the meaning of a node depend purely on its name, and making elements from unstable extensions visually distinct (with the "x-" prefix).
- By not using XML namespaces or ":" separated prefixes, this proposal avoids incompatibilities between the XML and HTML processing of such elements.
- By using "-" rather than ":" as our separator, we avoid confusion about what a prefix means. A prefix is just part of the name, and does not denote a namespace defined elsewhere.
- By disallowing the creation of new tag names, we prevent problems where someone defines a new tag "-my-foo", "foo" is later adopted into HTML, and an author wants to define a node that is both a "foo" and a "-my-foo". If "-my-foo" is an attribute, an author can simply write "<foo -my-foo>".
- This proposal is not compatible with XML namespaces. We now have two different prefixing mechanisms for names that work in different ways (simple string prefix vs namespace indirection).
- While existing XML specifications such as RDFa and XSLT can be used "as is" in the XML representation of HTML, they cannot be used "as is" in the HTML representation, unless their attributes are re-coded into the form "-prefix-name" and their element types are converted into attributes. If someone wants to use such an extension in an HTML (not XML) document, and wishes to use an existing tool that understands such a spec to process their file, then they will have to include an intermediate conversion step that translates names.
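The intermediate conversion step mentioned in the last point might look roughly like the Python sketch below, which renames namespaced XML attributes into the "-prefix-name" form. It covers attributes only; converting extension element types into attributes is left out. The function name, the namespace URI, and the prefix "rating" are all hypothetical.

```python
import xml.etree.ElementTree as ET


def to_extended_html_attributes(element, prefix_for_namespace):
    """Rename namespaced attributes to "-prefix-name", in place.

    prefix_for_namespace maps a namespace URI to the fixed prefix assumed to
    be registered for it. Attributes in unknown namespaces are left alone.
    """
    for node in element.iter():
        for qualified in list(node.attrib):
            if not qualified.startswith("{"):
                continue  # no namespace, so nothing to translate
            namespace_uri, local_name = qualified[1:].split("}", 1)
            prefix = prefix_for_namespace.get(namespace_uri)
            if prefix is None:
                continue
            node.attrib[f"-{prefix}-{local_name}"] = node.attrib.pop(qualified)
    return element


if __name__ == "__main__":
    # The namespace URI and the prefix "rating" are invented for this example.
    source = (
        '<div xmlns:r="http://example.org/rating">'
        '<span r:stars="4" class="a"/>'
        "</div>"
    )
    root = ET.fromstring(source)
    to_extended_html_attributes(root, {"http://example.org/rating": "rating"})
    print(root[0].attrib)
```

A tool that expects the original XML names would need the inverse mapping applied before it reads the document.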
Conformance Classes Changes
A new conformance class of "extended HTML" is defined that allows an HTML document to contain prefixed attributes defined by an extension specification.
- HTML5 Working Draft as of 4th March 2010: http://www.w3.org/TR/2010/WD-html5-20100304/