The following is a Change Proposal for ISSUE-41, Distributed Extensibility.
This proposal was written by Rob Ennals, but contains ideas from many other people, including Tony Ross, Maciej Stachowiak, Henry S. Thompson, Sam Ruby, David Singer, and Ian Hickson.
This is currently my preferred resolution of ISSUE-41, ahead of my other two proposals.
I'm sure there are bugs/things that could be improved in this. This should be seen as a model of what a good solution might look like, rather than something set in stone.
- 1 Summary
- 2 Rationale
- 3 Details
- 4 Impact
- 5 References
Summary
An extension spec can integrate itself with HTML following the model taken previously by SVG and MathML. A spec defines a "root element" (e.g. svg or math) that content for the extension can be included under. This root element should have a default namespace xmlns declaration, giving the namespace for the extension. The root element MUST also have its @extension attribute set, to inform the user agent that this element should be treated as an extension root and that the default namespace it defines should be applied to the contained nodes. If the extension becomes part of a future version of HTML, then the expectation is that it will become legal to omit the xmlns declaration and @extension attribute.
Authors of extensions are strongly advised to communicate with the HTML WG to make sure their spec interacts well with HTML and does not have name clashes with other specs. To help them do this, extension authors are strongly advised to register the name of their root element in a central registry. They should ideally also register the names of all element types and attributes that they define. If a spec has registered its root element and a document omits the namespace declaration for a registered root element, then a user agent that supports that extension should behave as if the registered namespace had been set. Although such a document is still technically non-conforming, this allows a user agent to correctly handle documents that conform to a future version of HTML that has made this extension part of standard HTML.
An extension spec may also define attributes that can be attached to nodes from other namespaces (including HTML). All such attributes must be prefixed, and the prefix should be registered in the central registry. If a document uses such a prefix then it must bind the prefix to the correct namespace using an xmlns declaration.
A document that uses such extensions is not valid HTML; however, it is valid "extended HTML".
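To make the summary concrete, here is a minimal sketch of what such a document might look like, together with a small Python script that picks out the extension roots and prefix bindings. The `chart` element, the `meta` prefix, and the example.org namespaces are all invented for illustration; nothing here is a registered extension, and Python's `html.parser` is used only as a convenient tokenizer, not as a model of the HTML5 parsing algorithm.

```python
from html.parser import HTMLParser

# Hypothetical "extended HTML" fragment: one extension root and one
# prefixed attribute, both declared with the extension attribute.
SAMPLE = """
<p>Sales by quarter:</p>
<chart extension xmlns="http://example.org/chart">
  <series label="2009"></series>
</chart>
<p extension xmlns:meta="http://example.org/meta" meta:about="#q1">Q1 was strong.</p>
"""


class ExtensionFinder(HTMLParser):
    """Collects extension roots and prefix bindings from start tags."""

    def __init__(self):
        super().__init__()
        self.roots = []      # (element name, default namespace) pairs
        self.prefixes = {}   # prefix -> namespace

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "extension" not in attrs:
            return  # xmlns declarations are ignored without the extension attribute
        if "xmlns" in attrs:
            self.roots.append((tag, attrs["xmlns"]))
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value


finder = ExtensionFinder()
finder.feed(SAMPLE)
print(finder.roots)
print(finder.prefixes)
```

The `series` element carries no declaration of its own; under this proposal it would pick up the namespace of the enclosing `chart` root.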
Rationale
This proposal is designed to allow for the creation of several kinds of extension:
- "platform extensions" such as SVG and MathML that define new types of content that can be rendered in a browser. These extensions are expected to be vendor-neutral and have a specification. They may become part of HTML in the future. Although such extensions have their own specification, it is useful to provide guidelines for spec authors directing how such a spec should be integrated with HTML, and guidelines for authors of user-agents, advising them about what kinds of markup they should expect to see when a document is using an extension.
- "language extensions" such as RDFa that define new attributes on top of nodes from other namespaces. They need to have a way to avoid name clashes with attributes defined by other extensions.
- "vendor-specific experimental extensions" such as the experimental features that WebKit and Mozilla have created. There should be guidelines about how vendors should create such extensions in a non-destructive way.
In the design of this proposal, we are motivated by the following goals:
- There should be minimal changes to the parsing of existing documents. For this reason, an xmlns default namespace declaration on a root node is only obeyed by the parser if the root node contains the @extension attribute.
- It should be possible for a vendor-neutral extension to become part of HTML in the future without documents needing to support both the old and the new syntax. For this reason, we do not require that a name from an extension be prefixed, and make clear the expectation that it will become legal for a namespace declaration to be omitted from a root node once it becomes part of HTML.
- If a user agent supports an extension, and that extension subsequently becomes part of standard HTML, then the user agent should understand use of the extension when it is used in the new version of HTML, even if the new standard comes out after the user agent was released. For this reason, a user agent should infer the correct namespace for the root node of an extension that it understands - since this would be conforming in a future version of HTML where this feature had been made standard.
- It is useful for authors if a conformance checker warns them about things that are likely to have been unintentional mistakes, or things that might be undesirable for interoperability reasons. For this reason, a conformance checker should require that the default namespace be declared and the @extension attribute be present for any extension that is being used - even though a supporting user agent will infer this. The lack of a default namespace and @extension attribute may indicate that the user is not aware that they are using an extension that may not be present in all user agents.
- Specs like RDFa may need to add attributes to elements from other namespaces. Such attributes need to take care not to clash with each other. For this reason, we require that all such attributes contain a prefix indicating what extension they are part of.
- Experience has shown that users find prefixes confusing and don't understand the relationship between a prefix and a namespace. We thus require that a prefix always map to the same namespace, and that this mapping be registered.
Details
These modifications are based on the W3C working draft as of March 4th, 2010: http://www.w3.org/TR/2010/WD-html5-20100304/
I've probably missed some bits, but hopefully this is enough to get the gist of it.
Section 2.2.2: Extensibility
Add the following text:
If a document uses such an extension then it is not valid HTML. A document may however be considered "valid extended HTML" if it only uses extensions that conform to the criteria below:
An extension specification may define one or more "root elements" that can contain elements designed by the extension specification. Such root elements MUST be registered in the HTML WG registry, together with their namespace. Such a root element MUST have an xmlns declaration giving a default namespace, which must be the same as the namespace registered in the registry, and it MUST have the @extension attribute set, to indicate that the elements inside have semantics defined by the extension.
If a user agent that does not support the extension encounters an unknown element with an @extension attribute and a default namespace, then the user agent MUST apply that namespace to all element names inside the root element, unless this namespace is overridden by an intermediate root element.
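The namespace rule above can be sketched as a simple tree walk. This is only an illustration under stated assumptions: the `Element` class, the function names, and the example.org namespaces are hypothetical, and the real tree-construction algorithm has further rules for foreign content that are ignored here.

```python
from dataclasses import dataclass, field
from typing import Optional

HTML_NS = "http://www.w3.org/1999/xhtml"


@dataclass
class Element:
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    namespace: Optional[str] = None  # filled in by assign_namespaces


def is_extension_root(el: Element) -> bool:
    # An extension root carries both the extension attribute and a default xmlns.
    return "extension" in el.attrs and "xmlns" in el.attrs


def assign_namespaces(el: Element, inherited: str = HTML_NS) -> None:
    # An extension root sets the namespace for itself and its descendants;
    # a nested ("intermediate") root overrides it for its own subtree.
    current = el.attrs["xmlns"] if is_extension_root(el) else inherited
    el.namespace = current
    for child in el.children:
        assign_namespaces(child, current)


doc = Element("div", children=[
    Element("chart", {"extension": "", "xmlns": "http://example.org/chart"}, [
        Element("series"),
        Element("note", {"extension": "", "xmlns": "http://example.org/note"}, [
            Element("para"),
        ]),
    ]),
    Element("p"),
])
assign_namespaces(doc)
```

Here `series` inherits the `chart` namespace, `para` takes the namespace of the nested `note` root, and the trailing `p` stays in the HTML namespace.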
An extension specification may define attributes that can be applied to elements from other namespaces. Such attributes must be prefixed by a prefix that is unique to that extension. Such a prefix MUST be registered in the HTML WG registry, together with its namespace. If a document uses such a prefix, it MUST contain an xmlns declaration binding the prefix to its registered namespace, and the element with the xmlns declaration MUST have the @extension attribute set.
In the future, if an extension becomes part of HTML, it is expected that the future version of the HTML standard will remove the requirement for the root element to declare its default namespace, or for there to be a namespace declaration for any prefixes.
If a user agent supports an extension, and encounters a root element for that extension that does not have an xmlns declaration or an @extension attribute, then the user agent should behave as if both were present. Similarly, if a user agent supports an extension and encounters a registered prefix that does not have an xmlns declaration in the document, then it should act as if the prefix had been declared. These rules allow the user agent to behave correctly if the extension becomes part of standard HTML.
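One possible reading of this inference rule, as a sketch: the registry contents, function names, and the `supported` sets below are assumptions made up for the example, since the proposal does not define a registry format or API.

```python
# Hypothetical registry contents; the real registry would be kept by the HTML WG.
ROOT_REGISTRY = {"chart": "http://example.org/chart"}
PREFIX_REGISTRY = {"meta": "http://example.org/meta"}


def effective_root_namespace(name, attrs, supported_roots):
    """Namespace a user agent would use for a candidate root element, or None."""
    if "extension" in attrs and "xmlns" in attrs:
        return attrs["xmlns"]            # explicitly declared
    if name in supported_roots and name in ROOT_REGISTRY:
        return ROOT_REGISTRY[name]       # behave as if both were present
    return None                          # ordinary unknown element


def effective_prefix_namespace(prefix, declared, supported_prefixes):
    """Namespace for an attribute prefix, given the document's declared bindings."""
    if prefix in declared:
        return declared[prefix]          # bound by an xmlns declaration
    if prefix in supported_prefixes and prefix in PREFIX_REGISTRY:
        return PREFIX_REGISTRY[prefix]   # act as if the prefix had been declared
    return None
```

A user agent that does not support `chart` gets `None` for an undeclared `<chart>`, which is the divergence between supporting and non-supporting user agents that the proposal acknowledges.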
Section 3.2.3: Global Attributes
After the following paragraph:
In HTML documents, elements in the HTML namespace may have an xmlns attribute specified, if, and only if, it has the exact value "http://www.w3.org/1999/xhtml". This does not apply to XML documents.
Add a line like the following:
If an HTML document contains an attribute of the form "xmlns:<prefix>" where <prefix> is a registered prefix, then the value of the attribute MUST be the namespace that was registered for the prefix. If an unknown element contains an "xmlns" attribute, then its value must be the same namespace that the element is associated with in the HTML WG extension root element registry.
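A conformance checker might enforce these two requirements roughly as follows. This is a sketch only: the registries, the tiny set of "known" HTML elements, and the error wording are placeholders invented for the example.

```python
# Hypothetical registry contents and a tiny stand-in for the HTML element list.
ROOT_REGISTRY = {"chart": "http://example.org/chart"}
PREFIX_REGISTRY = {"meta": "http://example.org/meta"}
KNOWN_HTML_ELEMENTS = {"html", "body", "div", "p", "span"}


def check_xmlns_attributes(name, attrs):
    """Return a list of conformance errors for one element's xmlns attributes."""
    errors = []
    for attr, value in attrs.items():
        if attr.startswith("xmlns:"):
            prefix = attr[len("xmlns:"):]
            expected = PREFIX_REGISTRY.get(prefix)
            # Only registered prefixes are constrained by this rule.
            if expected is not None and value != expected:
                errors.append(f"prefix '{prefix}' must be bound to {expected}")
        elif attr == "xmlns" and name not in KNOWN_HTML_ELEMENTS:
            expected = ROOT_REGISTRY.get(name)
            if expected is None:
                errors.append(f"element '{name}' is not a registered extension root")
            elif value != expected:
                errors.append(f"element '{name}' must declare namespace {expected}")
    return errors
```

The existing rule for elements in the HTML namespace (the quoted paragraph above) is deliberately left out of this sketch.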
Section: The insertion mode
After the following bullet:
If node is an element from the MathML namespace or the SVG namespace, then set the foreign flag to true.
insert the following bullet:
If node has the "extension" attribute set and has an "xmlns" attribute, then set the foreign flag to true.
Section: Creating and inserting elements
Replace the following:
If the newly created element has an xmlns attribute in the XMLNS namespace whose value is not exactly the same as the element's namespace, that is a parse error.
with the following:
If the newly created element has an xmlns attribute in the XMLNS namespace whose value is not exactly the same as the element's namespace, that is a parse error, unless the element has an "extension" attribute set. If the element has an "extension" attribute set, then this is an "extension prefix binding". The parser should record the binding between the prefix and the namespace, and use this to parse attributes later in the document with that prefix.
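The bookkeeping for "extension prefix bindings" could look something like the sketch below. It assumes that bindings are written as `xmlns:<prefix>` attributes on an element with the extension attribute set, as in the Extensibility text; the `BindingTable` class and its method names are hypothetical, and scoping is simplified to document-wide.

```python
class BindingTable:
    """Records prefix bindings and resolves prefixed attribute names."""

    def __init__(self):
        self._bindings = {}  # prefix -> namespace

    def record_from_element(self, attrs):
        # Bindings are only honoured when the element has the extension attribute.
        if "extension" not in attrs:
            return
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self._bindings[name[len("xmlns:"):]] = value

    def resolve_attribute(self, attr_name):
        """Return (namespace, local name); namespace is None if no binding applies."""
        prefix, sep, local = attr_name.partition(":")
        if sep and prefix in self._bindings:
            return self._bindings[prefix], local
        return None, attr_name


table = BindingTable()
table.record_from_element({"extension": "", "xmlns:meta": "http://example.org/meta"})
print(table.resolve_attribute("meta:about"))
print(table.resolve_attribute("class"))
```

An attribute whose prefix has no recorded binding is left as an ordinary attribute with the colon as part of its name, which matches how existing documents are parsed today.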
Section: The "in body" insertion mode
After the bullet called "A start tag whose tag name is 'svg'", insert the following:
A start tag that has the "extension" attribute set: Insert a foreign element for the token, in the namespace specified by the element's "xmlns" attribute.
If the token has its self-closing flag set, pop the current node off the stack of open elements and acknowledge the token's self-closing flag.
Otherwise, if the insertion mode is not already "in foreign content", let the secondary insertion mode be the current insertion mode, and then switch the insertion mode to "in foreign content".
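The three steps above can be sketched as a small state update. The `Parser` class below is a hypothetical stand-in that models only the insertion mode, the secondary insertion mode, and the stack of open elements; it is not the full tree-construction algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class Parser:
    insertion_mode: str = "in body"
    secondary_insertion_mode: str = ""
    open_elements: list = field(default_factory=list)  # (name, namespace) pairs

    def start_tag_in_body(self, name, attrs, self_closing=False):
        """Handle a start tag with the extension attribute; return False otherwise."""
        if "extension" not in attrs:
            return False  # left to the ordinary "in body" rules
        # Insert a foreign element in the namespace given by the xmlns attribute.
        self.open_elements.append((name, attrs.get("xmlns")))
        if self_closing:
            # Pop the current node; the self-closing flag is acknowledged.
            self.open_elements.pop()
        elif self.insertion_mode != "in foreign content":
            self.secondary_insertion_mode = self.insertion_mode
            self.insertion_mode = "in foreign content"
        return True


parser = Parser()
parser.start_tag_in_body("chart", {"extension": "", "xmlns": "http://example.org/chart"})
print(parser.insertion_mode, parser.open_elements)
```

This mirrors the existing handling of the "svg" and "math" start tags, with the namespace taken from the element rather than fixed by the specification.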
Section 8.1.2: Elements
Replace the following:
Foreign elements: Elements from the MathML namespace and the SVG namespace.
with the following:
Foreign elements: Elements from the MathML namespace, the SVG namespace, or an extension namespace.
Remove the block quote that starts with "The HTML syntax does not support namespace declarations".
Section: Attributes
No other namespaced attribute can be expressed in the HTML syntax.
When an element has an attribute with a prefix specified by an "extension prefix binding", the namespace of the attribute is that specified by the extension prefix binding.
Impact
Positive Effects
- Future extensions can be added to HTML in the same way that SVG and MathML were integrated in the past
- Provides guidelines for spec authors to help them write specs that work well with HTML
- Helps user agents know what to expect from documents that use unknown extensions, and how to handle such documents gracefully
- Extensions can be merged into a future version of HTML without the need for documents to continue to support a previous syntax
- Existing XML markup can be pasted directly into HTML, without having to worry about prefixes
- A document can use extensions and parse correctly in both HTML and XML
- Since "xmlns" only has meaning when the extension attribute is set, parsing behaviour is unchanged for pre-existing documents
- Copy/paste problems with prefixes go away, since prefixed attributes have a fixed meaning.
Negative Effects
- If a document omits the namespace declaration for an extension root node, then the namespace of descendant elements will differ between user agents that support the extension and those that do not. The supporting group includes user agents that support a future version of HTML that includes the extension. This is also the case for SVG and MathML in HTML5.
Conformance Classes Changes
A new conformance class of "extended HTML" is defined, that allows an HTML document to contain prefixed attributes defined by an extension specification.
References
- HTML5 Working Draft as of 4th March 2010: http://www.w3.org/TR/2010/WD-html5-20100304/