Bug 20767 - Restrict "Encoding declaration state" to only media types with provided charset parameter
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: All All
Importance: P3 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: HTML WG Bugzilla archive list
Depends on:
Reported: 2013-01-25 03:27 UTC by kosmo.zb
Modified: 2015-06-17 03:40 UTC
Description kosmo.zb 2013-01-25 03:27:10 UTC
Currently, "Encoding declaration state" <http://www.w3.org/TR/html5/document-metadata.html#attr-meta-http-equiv-content-type> applies to every meta/@http-equiv='Content-Type'. However, only some media type values carry information about character encoding, namely those with a 'charset' parameter.

It should not be an error to provide a meta element with @http-equiv='Content-Type' whose content attribute either does not specify a character encoding or indicates a media type other than "text/html". Such an element does not constitute a character encoding declaration and has no semantics in HTML5. The HTML parser <http://www.w3.org/TR/html5/syntax.html#determining-the-character-encoding> ignores any meta/@http-equiv="Content-Type" whose content attribute does not contain a "charset" media type parameter. For this reason, an element like this should NOT be considered to be in the "Encoding declaration state" and thus should not trigger "Encoding declaration state" errors.
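To make the distinction concrete, here is a minimal Python sketch of the check being described: a meta/@http-equiv="Content-Type" only counts as an encoding declaration if its content attribute yields a "charset" value. The function names extract_charset and is_encoding_declaration are hypothetical, and the regular expression is a simplified approximation of the parser's behavior, not the spec's exact extraction algorithm.

```python
import re
from typing import Optional

# Simplified pattern (an assumption, not the spec's algorithm): "charset",
# optional whitespace, "=", optional whitespace, then a double-quoted,
# single-quoted, or unquoted value.
_CHARSET_RE = re.compile(
    r"""charset\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s;"']+))""",
    re.IGNORECASE,
)


def extract_charset(content: str) -> Optional[str]:
    """Return the charset value found in a content attribute, or None."""
    match = _CHARSET_RE.search(content)
    if match is None:
        return None
    # Exactly one of the three alternatives participated in the match.
    value = next(g for g in match.groups() if g is not None)
    # Treat an empty quoted value as "no charset".
    return value or None


def is_encoding_declaration(http_equiv: str, content: str) -> bool:
    """True only when the element actually declares a character encoding."""
    if http_equiv.strip().lower() != "content-type":
        return False
    return extract_charset(content) is not None
```

Under this sketch, content="text/html; charset=utf-8" is an encoding declaration, while a value with no "charset" parameter is not, and would therefore fall outside the "Encoding declaration state" checks proposed above.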

I suggest that the normative text making it a conformance error to have one or more meta/@http-equiv="Content-Type" elements without "text/html" or "charset" be removed from both the HTML and XML serializations of HTML5. Instead, these elements should be advisory metadata only, to be used by validators and consumers as they see fit. Continuing to treat these elements as conformance errors and to prohibit their use in HTML5 is pointless and goes against the liberal spirit of HTML5.

This correction is motivated by the realization that there may exist many equivalent interpretations for a single representation. In the future, someone may invent and promulgate "text/hml" for a Happy Markup Language which is perfectly compatible with "text/html" but prohibits any use of closing tags. Or perhaps someone will create "text/ohml" for an Obfuscated Hypertext Markup Language which is perfectly compatible with "text/html" but prohibits extraneous whitespace, quotation marks, and optional ending tag delimiters.

<meta http-equiv="Content-Type" content="text/hml">
<meta http-equiv="Content-Type" content="text/ohml">

I believe that HTML5 should provide for leniency in these cases and give authors a natural and straightforward place to declare this kind of metadata.
Comment 1 Michael[tm] Smith 2015-06-17 03:40:36 UTC
No other comments have been posted to this bug since it was raised 2+ years ago.

It would be more fruitful to take this up for discussion in another forum first, then come back and re-open this bug with a pointer to that discussion, if any indications of support emerge from it.