Bugzilla – Bug 20767
Restrict "Encoding declaration state" to only media types with provided charset parameter
Last modified: 2013-02-22 22:57:52 UTC
Currently, the "Encoding declaration state" <http://www.w3.org/TR/html5/document-metadata.html#attr-meta-http-equiv-content-type> applies to every meta/@http-equiv='Content-Type'. However, only some of these values carry character encoding information, namely those that include a 'charset' parameter.
It should not be an error to provide a meta element with @http-equiv='Content-Type' whose content attribute either does not specify a character encoding or indicates a media type other than "text/html". Such an element does not constitute a character encoding declaration and has no semantics in HTML5. The HTML parser <http://www.w3.org/TR/html5/syntax.html#determining-the-character-encoding> ignores any meta/@http-equiv="Content-Type" whose content attribute does not contain a "charset" media type parameter. For this reason, such an element should NOT be considered to be in the "Encoding declaration state" and thus should not trigger "Encoding declaration state" errors.
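To make the proposed distinction concrete, here is a minimal Python sketch of the check I have in mind. It is a simplified approximation, not the spec's actual extraction algorithm, and the names extract_charset and is_encoding_declaration are hypothetical, chosen only for illustration.

```python
import re
from typing import Optional

# Simplified approximation of "find a charset parameter in a content value".
# This is NOT the exact algorithm from the HTML5 spec; it only illustrates
# the distinction proposed in this bug.
_CHARSET_RE = re.compile(
    r"""charset\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s;"']+))""",
    re.IGNORECASE,
)


def extract_charset(content: str) -> Optional[str]:
    """Return the charset label found in a content attribute value, or None."""
    match = _CHARSET_RE.search(content)
    if match is None:
        return None
    # Exactly one of the three alternatives (double-quoted, single-quoted,
    # unquoted) participates in a successful match.
    value = next(g for g in match.groups() if g is not None)
    # An empty quoted value carries no encoding information.
    return value or None


def is_encoding_declaration(http_equiv: str, content: str) -> bool:
    """True only if the meta element would declare a character encoding."""
    return (
        http_equiv.strip().lower() == "content-type"
        and extract_charset(content) is not None
    )
```

Under this sketch, only elements for which is_encoding_declaration returns True would be subject to the "Encoding declaration state" conformance rules; a value such as "text/hml" with no charset parameter would be left alone.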
I suggest that the normative text specifying conformance errors for the existence of one or more meta/@http-equiv="Content-Type" without "text/html" or "charset" be removed from both the HTML and XML serializations of HTML5. Instead, these elements should be advisory metadata only, to be used by validators and consumers as they see fit. Continuing to treat these elements as conformance errors and to prohibit their use in HTML5 is pointless and goes against the liberal spirit of HTML5.
This correction is motivated by the realization that a single representation may have many equivalent interpretations. In the future, someone may invent and promulgate "text/hml" for a Happy Markup Language which is perfectly compatible with "text/html" but prohibits any use of closing tags. Or perhaps someone will create "text/ohml" for an Obfuscated Hypertext Markup Language which is perfectly compatible with "text/html" but prohibits extraneous whitespace, quotation marks, and optional end tags.
<meta http-equiv="Content-Type" content="text/hml">
<meta http-equiv="Content-Type" content="text/ohml">
I believe that HTML5 should provide for leniency in these cases and give authors a natural and straightforward place to declare this kind of metadata.