ChangeProposals/html:xmlns

From HTML WG Wiki
Revision as of 02:28, 9 August 2010 by Lsilli (Talk | contribs)


The following is a Change Proposal for ISSUE-41, Distributed Extensibility.

Author: Leif Halvard Silli. Rob Ennals’ extensibility proposals were a starting point and inspiration.

Last edition: 9th of August 2010. (Simplified and generalized the proposal somewhat.)

Summary

  1. The HTML5 specification has already adopted what this change proposal will refer to as HTML5 standard namespaces.
  2. In addition, this change proposal introduces underscore namespace prefixes as a way for authors to declare XML namespaces. It does not permit default namespace declarations, except when the element belongs to an HTML5 standard namespace.
  3. Furthermore, HTML5 already allows namespaces inside the <script> element, but this change proposal suggests one small extension of the use of namespaces in that element.

Rationale

The goal is to implement XML namespaces as a way to a) independently extend the HTML syntax, b) link the extensions to a namespace, c) allow HTML authors to reuse the semantics of other namespaces within HTML documents, d) allow namespaces outside text/html land to reuse these namespaces (even if the extension operates in the null namespace inside text/html), and e) offer a solution for vendor prefixes.

The restrictions on the syntax and on how XML namespaces are implemented in text/html are meant to deal with the problems that existing "talisman" namespaces pose: if they suddenly started to be treated as namespaces, it could cause lots of harmful, unintended effects. This disaster must be prevented.

XML permits a prefix to start with the character _. By requiring prefixes to begin with the character _, we eliminate the problems that could arise if all the xmlns talisman prefixes (or Microsoft-based namespaces) became active as real namespaces. The reasons why _ protects against talismans are:

  1. prefixes beginning with _ are (probably) very seldom found in the wild.
  2. tests show that Internet Explorer versions 6 to 8 do not support namespace prefixes beginning with an underscore _.
  3. if an HTML element name begins with the character _, then it is not interpreted as an element. Thus, all HTML user agents start from scratch when it comes to _prefix.

The permission to use prefixed namespaces on the script element seems logical, as the content of <script> is not, by default, interpreted as HTML elements anyhow. (E.g. consider the coding practice of AmpleSDK - http://www.amplesdk.com/examples/chart/bar/.)

Details

  • General – outside <script>:
    1. HTML5 standard namespaces are used by those elements, attributes or prefixes whose namespace is declared by the use of a particular root element. Examples in HTML5: HTML, SVG, MathML, XLink, XML, XMLNS. HTML5 standard namespaces are typically default namespaces. However, the XLink namespace is an example of an HTML5 standard namespace linked to a prefix. It is permitted to declare HTML5 standard namespaces; however, doing so has no effect in text/html documents.
    2. Underscore prefixes may be declared via xmlns or left undeclared. If a prefix is not declared via xmlns, then its namespace is simply the HTML namespace. If it is declared via xmlns, then it can be any namespace. Undeclared underscore prefixes are permitted, provided that a) the prefix has been registered and b) the prefix is linked to the HTML namespace. The purpose of the permission to use registered underscore prefixes is to allow such prefixes to be used as vendor prefixes and for similar use cases. If an underscore-prefixed namespace becomes popular, then there is the option of extending the HTML5 specification (or another relevant specification) so that the underscore-prefixed namespace can be used like an HTML5 standard namespace. That is: one does not need to use the prefix anymore.
    3. Namespace prefixes which do not start with the underscore character are not interpreted as namespace prefixes in text/html. Thus, for example, <my:name> is not interpreted as an element whose name is name but as an element whose name is my:name, whereas <_my:name> is interpreted as an element whose name is name.
    4. Namespace prefixes not beginning with the underscore character MAY be permitted in 'text/html', provided that a) they are declared, b) text/html user agents are not expected to support such namespaces, and c) they are used only for attributes and attribute contents. Example: RDFa.
  • The script element
    1. The <script> element is not subject to the above restrictions, since HTML5 conformance checkers do not validate the content of the script element. However, this proposal suggests one extension, namely that prefixed namespaces of any kind may be declared inside the <script> start tag.
  • Prefix and namespace registry:
    1. Prefix registry: The purpose of the prefix registry is to allow vendors and other interested parties to register a prefix rather than a namespace. All registered prefixes must belong to the HTML namespace.
    2. Namespace registry: This registry (which could potentially be the HTML5 specification itself) lists the HTML5 standard namespaces.
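
The name-handling rules in items 2 and 3 of the general list could be sketched roughly as follows. This is only an illustration in Python, not part of the proposal: the function names split_name and resolve_namespace, the REGISTERED_PREFIXES set and the _vendor prefix in it are hypothetical, and the sketch assumes element names only and a flat mapping of declared prefixes rather than real scoping.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical registry content; the proposal does not name any registered prefix.
REGISTERED_PREFIXES = {"_vendor"}


def split_name(qname):
    """Split a text/html element name into (prefix, local name).

    Only prefixes starting with "_" are treated as namespace prefixes;
    any other colon stays part of the name, as for <my:name>.
    """
    prefix, sep, local = qname.partition(":")
    if sep and prefix.startswith("_") and local:
        return prefix, local
    return None, qname


def resolve_namespace(qname, declared):
    """Return (namespace, local name) for an element name.

    `declared` maps underscore prefixes to the URIs given in xmlns:_prefix
    attributes. An undeclared prefix falls back to the HTML namespace, which
    the proposal permits only for registered prefixes.
    """
    prefix, local = split_name(qname)
    if prefix is None:
        return HTML_NS, local
    if prefix in declared:
        return declared[prefix], local
    if prefix in REGISTERED_PREFIXES:
        return HTML_NS, local
    raise ValueError(f"undeclared and unregistered prefix: {prefix}")
```

Under these assumptions, resolve_namespace("_foo:bar", {"_foo": "http://foo.example.org"}) yields the declared URI with local name bar, while my:name is returned unchanged as a single name in the HTML namespace.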

Impact

Positive Effects

  • The meaning of ":" is not changed. This proposal differs from Rob Ennals’ proposal in that the namespace becomes the important point.
  • HTML5 supports distributed extensibility. Independent groups can create extensions to HTML5 without having to go through the HTML working group.
  • If a namespace URI is (re)registered with a default namespace URI status, life is potentially made easier for authors as they can drop using the prefix and drop declaring the namespace URI.
  • Conformance checkers should make it clear to users when they are using extensions, but should not flag extensions negatively unless they are used incorrectly. Validators are required to check that registered vendor prefixes with a default namespace URI are used without namespace declaration and namespace prefix.
  • Any document that parses correctly in both XML and HTML is guaranteed to parse to the same DOM tree. (I hope. I think this depends on how one sees it.)
  • Existing XML specs such as RDFa and XSLT can be used "as is" in an HTML document and the DOM can be correctly processed by existing XML tools. (I know too little about XSLT to know if this is true.)


Negative Effects

  • There will be a transition period before namespaces in text/html are supported.
  • This change proposal requires that one uses HTML5-compatible prefixes, that is: prefixes must begin with the underscore character.

Conformance Classes Changes

  1. A conforming HTML5 client is expected to support underscore namespace prefixes.
  2. Validators are expected to flag undeclared prefixes as errors, unless the prefix is registered. Declared namespaces trigger only a warning.
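
The validator behaviour in item 2 could be sketched like this. It is an assumption-laden illustration, not proposal text: the function name check_prefix and the three result strings are hypothetical, and the sketch assumes that a prefix that is both declared and registered is treated like any other declared prefix.

```python
def check_prefix(prefix, declared, registered):
    """Classify one underscore prefix as a validator might.

    Returns "warning" for a prefix declared via xmlns, "ok" for a
    registered prefix used without a declaration, and "error" for a
    prefix that is neither declared nor registered.
    """
    if prefix in declared:
        return "warning"
    if prefix in registered:
        return "ok"
    return "error"
```

For example, check_prefix("_foo", {"_foo": "http://foo.example.org"}, set()) gives "warning", whereas the same prefix with no declaration and an empty registry gives "error".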

Section 2.2.2: Extensibility

Under this section, Rob Ennals mentioned concrete changes to the HTML5 draft. I have noted whether I agree or suggest something else:

I agree with replacing this:

Vendor-specific proprietary extensions to this specification are strongly discouraged. Documents must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.

But I am not sure if validators should give any warning, as Rob explained in his replacement text:

Vendor-specific proprietary extensions to this specification are strongly discouraged, since doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question. If a document is to be considered valid HTML then it must not use such extensions. A document may however be considered valid "extended HTML" if it only uses extensions that conform to the criteria below:

I agree with replacing this text:

If vendor-specific markup extensions are needed, they should be done using XML, with elements or attributes from custom namespaces. If such DOM extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.

But to me, the word “vendor” is a bit unclear. If “vendors” is synonymous with e.g. WebKit, Opera etc., then I think that such prefixes are much less needed in XHTML than in text/html in the first place, as XHTML seems to be more generic than text/html is. At any rate, I think I would strike the word ‘vendor-specific’:

If vendor-specific markup extensions are needed, they may be done in one of two ways. The preferred mechanism is that such extensions should be done in XML, with elements and attributes from custom namespaces.

I suggest replacing this text of Rob’s:

Alternatively, if an extension registers the meaning of a prefix on the HTML WG Wiki (url), then names with that prefix may also be used in "extended HTML" documents. It is also permissible for a prefix of the form "x-prefix" to be used in an "extended HTML" document. In both cases, the meaning of such a name MUST depend entirely on the nodeName and not on any namespace that is associated with it.

With something like this:

One may use namespaces according to the rules described above. If a foreign namespace is permitted in text/html, then its namespace may be declared both as default and linked to a prefix, e.g. <svg:svg xmlns:svg="http://www.w3.org/2000/svg">. It is also permissible for a prefix of the form _x: to be used. The meaning of such a name does not need to depend only on the nodeName. One may instead rely on the namespace that is associated with it. For vendor prefixes (a class of prefixes which have a default namespace URI), the intention is precisely that different parsers treat the extensions differently w.r.t. namespaces vs. nodeName.

My proposal does not allow elements, so I changed this a little bit:

The namespace prefix syntax of HTML5 does not permit an extension specification to define new element types, e.g. to represent a subtype of an HTML element type (other than span), since this hides the HTML semantics of the element from user agents that do not understand the extension. In such cases, an extension specification should instead define an attribute that can be added to elements with the appropriate HTML element type. Similarly, if an extension gives meaning to a node with tag name "prefix:name" then it SHOULD give the same meaning to a node with boolean attribute "prefix:name".


My proposal operates with proper namespaces, thus I think that this part can be stricken - the DOM extension that is needed is a way to go from _name="" to _name: to xmlns:_name.

If DOM extensions are needed, the members should also be prefixed with a string that is unique to the extension or prefix, to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.


Section 2.8: Namespaces

I agree with Rob’s text here:

In an HTML document, the namespace of a prefixed element is that specified by the innermost enclosing XMLNS declaration for that prefix (as in XML). If no XMLNS declaration gives a namespace to a prefix, then nodes with that prefix default to being in the null namespace. Note however that if a registered prefix is bound to a namespace then this MUST be the namespace registered in the HTML WG Wiki.
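
The "innermost enclosing declaration" lookup described in this text could be sketched as below. This is a minimal illustration under stated assumptions: the function name namespace_of and the representation of scopes as a list of dictionaries (one per ancestor element, outermost first) are hypothetical and not taken from either proposal.

```python
def namespace_of(prefix, scopes):
    """Look up `prefix` in a list of xmlns scopes, outermost first.

    Each scope is a dict built from the xmlns:prefix attributes of one
    ancestor element. Returns None (standing in for the null namespace)
    when no enclosing element declares the prefix.
    """
    # Walk from the innermost element outwards; the first match wins.
    for scope in reversed(scopes):
        if prefix in scope:
            return scope[prefix]
    return None
```

With scopes such as [{"_foo": "http://foo.example.org"}, {}], a lookup of _foo finds the outer declaration, and a lookup of an undeclared prefix returns None.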

Section 3.2.3: Global Attributes

I agree that after the following paragraph, some text is needed:

In HTML documents, elements in the HTML namespace may have an xmlns attribute specified, if, and only if, it has the exact value "http://www.w3.org/1999/xhtml". This does not apply to XML documents.

However, I think this part of Rob’s proposal …

If an HTML document contains an attribute of the form "xmlns:<prefix>" where <prefix> is a registered prefix, then the value of the attribute MUST be the namespace that was registered for the prefix. If an HTML document contains an attribute of the form "xmlns:<prefix>" for an unregistered prefix (starts with "x-") then the value of that attribute MUST be the same as any other attributes with the same name in the same document.

… should be replaced with something like this:

The registry has two levels, and it doesn't matter whether the namespace URI is registered or not, as long as it is only on the first level. But if it is at level 2 (a level 2 namespace URI being a prefix with a default namespace URI), and an HTML document contains an attribute of the form "xmlns:<prefix>" where <prefix> is a registered level 2 prefix, then the value of the attribute MUST be the namespace that was registered for the prefix, and its namespace should not be declared. The exception is vendor prefixes (which are a subset of the prefixes with a default namespace URI), where only the vendor is required to support the default namespace URI.

Risks

Authors may needlessly use a namespace other than the HTML namespace. This proposal, however, allows prefixes which do belong to the HTML namespace, and thereby puts mild pressure on authors to use the HTML namespace instead of another namespace.

References

Example 1 - non-default namespaces

<html xmlns:_foo="http://foo.example.org">
<style> *[*|class=bar]{color:red}</style>
<div _foo:class="bar" >xyz</div>

Example 2 - an _attribute=""

<html xmlns:_foo="http://foo.example.org">
<style> *[*|_foo=bar]{color:red}</style>
<div _foo:_foo=bar >xyz</div>