This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 18338 - Registries (IANA): text/html MIME type definition should require that charset="" value be valid and correct
Status: RESOLVED WORKSFORME
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other All
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard: registry
Reported: 2012-07-20 00:14 UTC by contributor
Modified: 2017-07-21 11:12 UTC

Description contributor 2012-07-20 00:14:33 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html
Multipage: http://www.whatwg.org/C#charset
Complete: http://www.whatwg.org/c#charset

Comment:
Include a document-conformance requirement that a Content-Type header with a
charset parameter carry valid encoding information

Posted from: 1.72.6.183 by mike@w3.org
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.15 Safari/537.1
Comment 1 Michael[tm] Smith 2012-07-20 00:26:04 UTC
What I mean here is, if a document's Content-Type header does not have a charset parameter at all, but it does specify an encoding using a meta element in the document itself, then that's OK.

But if a document's Content-Type header has a charset parameter but the value of that parameter is malformed such that a browser will end up ignoring the value, then that should be a document-conformance error.

The page http://greenbytes.de/tech/tc/httpcontenttype/#l-charset-parsing has examples of some Content-Type headers with malformed charset parameters, along with test results for various browsers. Some examples:

  - Content-Type: text/plain; charset = UTF-8  (whitespace around the "=" sign)
  - Content-Type: text/plain; charset='UTF-8' (single-quoted encoding name)

I recently added some code to the validator that will cause it to report errors for cases such as those, so it would be helpful to have a clearly stated explicit requirement in the spec to go along with that.
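
For illustration only, the two malformed cases listed above could be detected along the following lines. This is a minimal sketch and not the validator's actual code: the function name charset_param_problem, the returned messages, and the accepted character class are all assumptions made for the example, and the splitting on ";" ignores quoted strings that contain semicolons.

```python
import re

# Accepted value shapes: a bare name or a double-quoted name. The character
# class is an assumption for this sketch, not the full HTTP token grammar.
_VALUE = re.compile(r'"[A-Za-z0-9._:+-]+"|[A-Za-z0-9._:+-]+')


def charset_param_problem(content_type):
    """Return None if the charset parameter is absent or well-formed,
    otherwise a short description of the problem (hypothetical messages)."""
    # Everything after the first ";" is the parameter list. Splitting on ";"
    # is naive and would mishandle a quoted string containing ";".
    _, _, params = content_type.partition(";")
    for raw in params.split(";"):
        name, _, value = raw.strip().partition("=")
        if name.strip().lower() != "charset":
            continue
        # Whitespace on either side of "=" is the first malformed case.
        if name != name.rstrip() or value != value.lstrip():
            return "whitespace around '='"
        # A single-quoted encoding name is the second malformed case.
        if value.startswith("'"):
            return "single-quoted value"
        if not _VALUE.fullmatch(value):
            return "malformed value"
        return None
    # No charset parameter at all is fine for this check.
    return None
```

With this sketch, "text/plain; charset = UTF-8" and "text/plain; charset='UTF-8'" are both reported, while "text/html; charset=utf-8" and a bare "text/html" pass.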
Comment 2 Ian 'Hixie' Hickson 2012-08-25 16:47:39 UTC
Isn't that an HTTP-level conformance error? Why does it need to be a document-level conformance error?
Comment 3 Ian 'Hixie' Hickson 2012-10-31 22:27:01 UTC
hsivonen: MikeSmith tells me he'd like your input on this.
Comment 4 Henri Sivonen 2012-11-01 13:33:44 UTC
text/html owns its charset parameter. So I think it would be appropriate to say that, if the charset parameter is present, its value must be a label from the Encoding Standard and must be the label of the encoding actually used. That should make bogus values an error.
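
As a rough sketch of what such a requirement would check, assuming the parameter value has already been extracted from the header: the LABELS table below is only a small subset of the labels the Encoding Standard defines, and the function name charset_value_problem, its actual_encoding argument (the encoding the caller has determined the document really uses), and the messages are hypothetical.

```python
# Partial label table: a few of the labels the Encoding Standard maps to
# UTF-8 and windows-1252, included only for illustration.
LABELS = {
    "utf-8": "UTF-8",
    "utf8": "UTF-8",
    "unicode-1-1-utf-8": "UTF-8",
    "windows-1252": "windows-1252",
    "iso-8859-1": "windows-1252",
    "latin1": "windows-1252",
    "us-ascii": "windows-1252",
}


def charset_value_problem(value, actual_encoding):
    """Return None if value is a known label for actual_encoding,
    otherwise a short description of the problem (hypothetical messages)."""
    # Labels are matched ASCII case-insensitively, so a non-ASCII value
    # can never be a label.
    encoding = LABELS.get(value.lower()) if value.isascii() else None
    if encoding is None:
        return "not an encoding label"
    if encoding != actual_encoding:
        return "label does not match the encoding actually used"
    return None
```

Under these assumptions, "bogus" is rejected outright, and "latin1" is rejected for a document that is actually UTF-8 but accepted for one that is actually windows-1252.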
Comment 5 Ian 'Hixie' Hickson 2012-11-01 22:32:33 UTC
Aah, defining it as part of text/html makes sense, yeah, dunno why I missed that.

Ok, will deal with this in January along with the rest of the MIME registry stuff.
Comment 6 Anne 2017-07-21 11:12:36 UTC
This seems to have gotten fixed along the way.