The following is a Change Proposal for ISSUE-41, Distributed Extensibility.
This proposal was written by Rob Ennals, but contains ideas from many other people, including Tony Ross, Maciej Stachowiak, Henry S. Thompson, Sam Ruby, David Singer, and Ian Hickson.
This is the original change proposal that I said I would write up formally. It closely matches the proposal that I had previously described informally.
An extension spec can define attributes and element names. To prevent name clashes and draw attention to the fact that an extension is being used, such names must be of the form "prefix:name". The author of an extension can register the extension, its namespace, and its prefix in a wiki maintained by the HTML WG. Alternatively, an author who has not registered a prefix may use a prefix that starts with "x-".
In HTML documents, the user agent should ignore the namespace associated with a prefixed name, and look only at the prefix when deciding what semantics to give to a name. For compatibility with XML, an HTML file may contain an XMLNS declaration for a prefix. If the prefix is a registered prefix, then the namespace must be the one that was registered. If the document contains more than one XMLNS declaration for the same prefix, then they must bind it to the same namespace. If no XMLNS declaration is given for a prefix, then prefixed names are in the null namespace, otherwise they are in the namespace given by the XMLNS declaration.
A document that uses such extensions is NOT valid HTML; however, it is valid "extended HTML". A validator may describe a document as being valid "extended HTML+X+Y" where X and Y are the extensions that the document uses.
An extension should only define a new element type if no HTML element type other than span would be a logical supertype. Otherwise it should define an attribute that can be added to existing HTML elements. Similarly, an extension that gives semantics to elements with tag X should also give the same semantics to a node that has boolean attribute X.
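The naming rule in this summary can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name classify_name, the registry contents, and the prefix "geo" are hypothetical, and the set REGISTERED_PREFIXES merely stands in for whatever the HTML WG wiki would list.

```python
# Hypothetical stand-in for the prefixes registered on the HTML WG wiki.
REGISTERED_PREFIXES = {"geo"}


def classify_name(name, registered=REGISTERED_PREFIXES):
    """Classify an element or attribute name under the proposed rule.

    Returns "html" for an unprefixed name, "registered" for a name whose
    prefix is in the registry, "experimental" for an "x-" prefix, and
    "unknown" for anything else.
    """
    prefix, sep, local = name.partition(":")
    if not sep:
        return "html"  # no colon: an ordinary HTML name
    if not prefix or not local:
        return "unknown"  # malformed, e.g. ":name" or "prefix:"
    if prefix in registered:
        return "registered"
    if prefix.startswith("x-") and len(prefix) > 2:
        return "experimental"
    return "unknown"
```

Under these assumptions, "geo:lat" would be classified as registered, "x-acme:widget" as experimental, and "acme:widget" as unknown, since "acme" is neither registered nor an "x-" prefix.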
It is important that we allow independent implementers and standards bodies to develop well-specified vendor-neutral extensions for HTML5 in a way that does not cause problems for other extensions or for future extensions to HTML5. In particular:
- It is often useful for people to define extensions to HTML. These may be vendor-specific experiments for features that may eventually get folded into the main HTML spec. They may be enterprise-specific extensions that would be of little use for general HTML. They may be community-specific features that are useful to too few people to make sense as part of the core spec. They may be experiments by a relatively obscure group on top of a patched open source browser that turn out to work well and get adopted by a vendor.
- Irrespective of whether we think it is good for people to create extensions, it is inevitable that they will, and so it is useful for us to provide guidelines about how they can do so in a way that is minimally damaging.
- We should encourage writers of extensions to HTML to design their specs in such a way that they cause minimal harm to future versions of HTML. In particular, the authors of such specs should be encouraged to use prefixes to avoid name clashes, and define attributes rather than tag names, to allow graceful fall-back to HTML elements. As a way of giving cultural support for people who design their extensions properly, we allow documents that use such extensions to validate as "extended HTML".
- We should make a clear distinction between features that are part of HTML, and features that are part of an extension to HTML. Features that are part of HTML are assumed to be well specified, vendor-neutral, supported by the majority of browsers, and designed to integrate well with the rest of the spec. Once an extension specification develops to the level where it meets these criteria, it should itself become part of HTML and its features should no longer require a prefix. The presence of a prefix should be a signal to a developer that they are using a non-standard extension.
- However, we also want to make it clear to authors when they are using non-standard features that may not be widely supported. We thus do not allow documents that use such extensions to be considered valid HTML. They may only be validated as "extended HTML".
- Experience has shown that problems arise when people design extensions that define new element names. Since an element can have only one tag name, it becomes impossible for an element to fall back to an appropriate HTML tag, and problems arise if the element name eventually becomes part of HTML. We thus strongly discourage the creation of new element types, and request that if an extension gives semantics to an element with name X then it should give the same semantics to an element with boolean attribute X.
It would also be useful for the extension mechanism to be compatible to some extent with XML. In particular:
- It is useful to be able to apply existing extension specifications, such as RDFa, without having to change the definition of such specs, or change the tool chain that processes DOM elements defined by such specs.
It is also important that whatever mechanism we provide for distributed extensibility is easy for authors to use. In particular:
- Authors have difficulty understanding the relationship between a prefix and a namespace and assume a prefix always means the same thing. Requiring that a prefix have a fixed meaning avoids this problem.
- Authors have difficulty understanding that the meaning of a tag is defined by a namespace in addition to the textual name of the node. To avoid this problem, we declare that the namespace of a prefixed node in HTML is always null, even when the document contains an XMLNS declaration for the prefix.
- Authors find APIs like createElementNS confusing. To avoid this problem, authors can create prefixed elements using createElement("prefix:name").
- Authors make errors when they copy and paste code into a document in which a prefix is either not defined, or is defined to have a different meaning. The meaning of a prefixed node should not depend on its context.
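As a rough illustration of the createElement point above, the sketch below uses Python's xml.dom.minidom as a stand-in for a browser DOM. It is not the HTML DOM, so it only shows the general shape of the behaviour: a node created from a prefixed name keeps the whole "prefix:name" string as its node name and has no namespace. The name "x-acme:widget" is hypothetical.

```python
from xml.dom.minidom import getDOMImplementation

# Build an empty document with an <html> root (no namespace, no doctype).
doc = getDOMImplementation().createDocument(None, "html", None)

# Create a prefixed element without any namespace-aware API.
node = doc.createElement("x-acme:widget")
doc.documentElement.appendChild(node)

print(node.nodeName)      # x-acme:widget
print(node.namespaceURI)  # None
```

An extension that follows the proposal would then decide what the node means from node.nodeName alone.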
These modifications are based on the W3C working draft as of March 4th, 2010: http://www.w3.org/TR/2010/WD-html5-20100304/
Section 2.2.2: Extensibility
Replace this text:
Vendor-specific proprietary extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
With this text:
Vendor-specific proprietary extensions to this specification are strongly discouraged, since they reduce interoperability and fragment the user base, allowing only users of specific user agents to access the content in question.
If a document is to be considered valid HTML then it must not use such extensions. A document may however be considered valid "extended HTML" if it only uses extensions that conform to the criteria below:
Replace this text:
If vendor-specific markup extensions are needed, they should be done using XML, with elements or attributes from custom namespaces. If such DOM extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
With this text:
If vendor-specific markup extensions are needed, they may be done in one of two ways. The preferred mechanism is to define such extensions in XML, with elements and attributes from custom namespaces.
Alternatively, if an extension registers the meaning of a prefix on the HTML WG Wiki (url), then names with that prefix may also be used in "extended HTML" documents. It is also permissible for a prefix of the form "x-prefix" to be used in an "extended HTML" document. In both cases, the meaning of such a name MUST depend entirely on the nodeName and not on any namespace that is associated with it.
An extension specification SHOULD NOT define a new element type to represent a subtype of an HTML element type (other than span), since this hides the HTML semantics of the element from user agents that do not understand the extension. In such cases, an extension specification should instead define an attribute that can be added to elements with the appropriate HTML element type. Similarly, if an extension gives meaning to a node with tag name "prefix:name" then it SHOULD give the same meaning to a node with boolean attribute "prefix:name".
If DOM extensions are needed, the members should also be prefixed with a string that is unique to the extension or prefix, to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
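The fall-back requirement in the proposed text (the same meaning for tag name "prefix:name" and for boolean attribute "prefix:name") could be honoured by an extension's processing code roughly as follows. This is a minimal sketch: the helper has_extension_feature and the name "x-acme:note" are hypothetical, and a node is modelled simply as a tag name plus a dict of attributes.

```python
def has_extension_feature(tag_name, attributes, feature):
    """Return True if a node carries the extension feature.

    The node qualifies either because its tag name is the feature name,
    or because it has an attribute with that name. The attribute is
    boolean, so only its presence matters, not its value.
    """
    return tag_name == feature or feature in attributes


# <x-acme:note>...</x-acme:note>
as_element = has_extension_feature("x-acme:note", {}, "x-acme:note")

# <p x-acme:note>...</p> keeps the HTML semantics of <p> as a fall-back.
as_attribute = has_extension_feature("p", {"x-acme:note": ""}, "x-acme:note")

# <p class="note">...</p> does not use the extension at all.
plain = has_extension_feature("p", {"class": "note"}, "x-acme:note")
```

Routing every check through one such helper is one way to keep the element form and the attribute form from drifting apart in an implementation.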
Section 2.8: Namespaces
Add text like the following:
In an HTML document, the namespace of a prefixed element is that specified by the innermost enclosing XMLNS declaration for that prefix (as in XML). If no XMLNS declaration gives a namespace to a prefix, then nodes with that prefix default to being in the null namespace. Note however that if a registered prefix is bound to a namespace then this MUST be the namespace registered in the HTML WG Wiki.
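The lookup described in this added text can be sketched as a walk from the node outwards through its ancestors. The function resolve_namespace, the "x-acme" prefix, and the example.com URLs are hypothetical. The document tree is reduced to a list of attribute dicts, ordered from the root element down to the node itself.

```python
def resolve_namespace(node_name, attribute_chain):
    """Resolve the namespace of a prefixed node name.

    attribute_chain holds one attribute dict per element, from the root
    element down to the node itself. The innermost xmlns:<prefix>
    declaration wins. None (the null namespace) is returned when no
    declaration for the prefix is in scope.
    """
    prefix, sep, _local = node_name.partition(":")
    if not sep:
        raise ValueError("this sketch only handles prefixed names")
    for attrs in reversed(attribute_chain):
        namespace = attrs.get("xmlns:" + prefix)
        if namespace is not None:
            return namespace
    return None


chain = [
    {"xmlns:x-acme": "http://example.com/outer"},  # root element
    {},                                            # intermediate element
    {"xmlns:x-acme": "http://example.com/inner"},  # the node itself
]
innermost = resolve_namespace("x-acme:widget", chain)
inherited = resolve_namespace("x-acme:widget", chain[:2])
undeclared = resolve_namespace("x-acme:widget", [{}, {}])
```

The additional requirement that a registered prefix be bound only to its registered namespace is a conformance check on the declarations, not part of this lookup.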
Section 3.2.3: Global Attributes
After the following paragraph:
In HTML documents, elements in the HTML namespace may have an xmlns attribute specified, if, and only if, it has the exact value "http://www.w3.org/1999/xhtml". This does not apply to XML documents.
Add a line like the following:
If an HTML document contains an attribute of the form "xmlns:<prefix>" where <prefix> is a registered prefix, then the value of the attribute MUST be the namespace that was registered for the prefix. If an HTML document contains an attribute of the form "xmlns:<prefix>" for an unregistered prefix (one that starts with "x-"), then the value of that attribute MUST be the same as the value of every other attribute with the same name in the same document.
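A conformance checker might enforce the two rules above along the following lines. This is a sketch under assumptions: check_xmlns_declarations, the registry dict, and the example prefixes and URLs are all hypothetical, and the declarations are assumed to have been collected already as (prefix, namespace) pairs in document order.

```python
def check_xmlns_declarations(declarations, registry):
    """Check xmlns:<prefix> declarations against the two proposed rules.

    declarations: list of (prefix, namespace) pairs in document order.
    registry: dict mapping each registered prefix to its namespace.
    Returns a list of human-readable error strings (empty if conforming).
    """
    errors = []
    first_binding = {}  # first namespace seen for each "x-" prefix
    for prefix, namespace in declarations:
        if prefix in registry:
            # Registered prefix: must match the registered namespace.
            if namespace != registry[prefix]:
                errors.append(
                    f"{prefix}: expected {registry[prefix]}, got {namespace}"
                )
        elif prefix.startswith("x-"):
            # Unregistered prefix: every declaration must agree.
            expected = first_binding.setdefault(prefix, namespace)
            if namespace != expected:
                errors.append(
                    f"{prefix}: bound to both {expected} and {namespace}"
                )
        # Other prefixes are outside the two rules and are skipped here.
    return errors


registry = {"geo": "http://example.org/geo"}
ok = check_xmlns_declarations(
    [("geo", "http://example.org/geo"), ("x-acme", "http://example.com/a"),
     ("x-acme", "http://example.com/a")],
    registry,
)
bad = check_xmlns_declarations(
    [("geo", "http://example.org/other"), ("x-acme", "http://example.com/a"),
     ("x-acme", "http://example.com/b")],
    registry,
)
```

Here ok is an empty list, while bad holds one error for the mismatched registered prefix and one for the inconsistent "x-acme" bindings.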
- HTML5 supports distributed extensibility. Independent groups can create extensions to HTML5 without having to go through the HTML working group.
- Life is made easier for authors by using fixed, registered, prefixes, making the meaning of a node depend purely on its name, and making elements from unstable extensions visually distinct (with the "x-" prefix).
- Extensions are encouraged to treat a node with an attribute "X" the same as they would a node with tag name "X".
- Conformance checkers make it clear to a user when they are using extensions.
- Any document that parses correctly in both XML and HTML is guaranteed to parse to the same DOM tree.
- Existing XML specs such as RDFa and XSLT can be used "as is" in an HTML document and the DOM can be correctly processed by existing XML tools.
- The meaning of ":" has changed. In XML the meaning of a prefixed name depends on the enclosing XMLNS declaration, while in HTML the meaning of a prefixed name is fixed.
- People are still permitted to define new element names, and although they are required to allow fall-back by giving the same semantics to a boolean attribute, there is a danger that this will be poorly implemented or poorly tested. It is arguably better to simply disallow an extension to define new element names.
Conformance Classes Changes
A new conformance class of "extended HTML" is defined, that allows an HTML document to contain prefixed attributes defined by an extension specification.
- HTML5 Working Draft as of 4th March 2010: http://www.w3.org/TR/2010/WD-html5-20100304/