<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>18338</bug_id>
          
          <creation_ts>2012-07-20 00:14:33 +0000</creation_ts>
          <short_desc>Registries (IANA): text/html MIME type definition should require that charset=&quot;&quot; value be valid and correct</short_desc>
          <delta_ts>2017-07-21 11:12:36 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>HTML</component>
          <version>unspecified</version>
          <rep_platform>Other</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>WORKSFORME</resolution>
          
          
          <bug_file_loc>http://www.whatwg.org/specs/web-apps/current-work/#charset</bug_file_loc>
          <status_whiteboard>registry</status_whiteboard>
          <keywords></keywords>
          <priority>P3</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter>contributor</reporter>
          <assigned_to name="Ian &apos;Hixie&apos; Hickson">ian</assigned_to>
          <cc>annevk</cc>
    
    <cc>hsivonen</cc>
    
    <cc>ian</cc>
    
    <cc>mike</cc>
          
          <qa_contact>contributor</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>71192</commentid>
    <comment_count>0</comment_count>
    <who name="">contributor</who>
    <bug_when>2012-07-20 00:14:33 +0000</bug_when>
    <thetext>Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/semantics.html
Multipage: http://www.whatwg.org/C#charset
Complete: http://www.whatwg.org/c#charset

Comment:
Add a document-conformance requirement that Content-Type headers carrying a
charset parameter specify valid encoding information

Posted from: 1.72.6.183 by mike@w3.org</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>71193</commentid>
    <comment_count>1</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2012-07-20 00:26:04 +0000</bug_when>
    <thetext>To clarify: if a document&apos;s Content-Type header has no charset parameter at all, but the document itself declares an encoding with a meta element, that&apos;s OK.

However, if a document&apos;s Content-Type header has a charset parameter whose value is malformed in a way that causes browsers to ignore it, that should be a document-conformance error.

The page http://greenbytes.de/tech/tc/httpcontenttype/#l-charset-parsing has examples of some Content-Type headers with malformed charset parameters, along with test results for various browsers. Some examples:

  - Content-Type: text/plain; charset = UTF-8 (whitespace around the &quot;=&quot; sign)
  - Content-Type: text/plain; charset=&apos;UTF-8&apos; (single-quoted encoding name)
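
As a rough illustration only (this is not the code actually used in the validator), a check for the two malformed cases above might look like the following Python sketch. The function name and the problem messages are hypothetical, and the parsing is deliberately naive: it splits on ";" and does not handle quoted strings that contain ";".

```python
# Hypothetical helper, not the validator's actual implementation.
# It only looks for the two malformed patterns listed above.
def charset_param_problems(content_type):
    """Return a list of problems found in the charset parameter."""
    problems = []
    # Parameters follow the media type, separated by ";".
    for param in content_type.split(";")[1:]:
        name, sep, value = param.partition("=")
        if name.strip().lower() != "charset" or not sep:
            continue
        # Whitespace on either side of "=" (e.g. "charset = UTF-8").
        if name != name.rstrip() or value != value.lstrip():
            problems.append("whitespace around '='")
        value = value.strip()
        # A single-quoted value (e.g. charset='UTF-8').
        if value.startswith("'") or value.endswith("'"):
            problems.append("single-quoted encoding name")
    return problems
```

Under these assumptions, each of the two example headers above yields one reported problem, and "text/plain; charset=UTF-8" yields none.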

I recently added code to the validator that reports errors for cases like these, so it would help to have an explicit requirement in the spec to back that up.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>72772</commentid>
    <comment_count>2</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-08-25 16:47:39 +0000</bug_when>
    <thetext>Isn&apos;t that an HTTP-level conformance error? Why does it need to be a document-level conformance error?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>77598</commentid>
    <comment_count>3</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-10-31 22:27:01 +0000</bug_when>
    <thetext>hsivonen: MikeSmith tells me he&apos;d like your input on this.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>77656</commentid>
    <comment_count>4</comment_count>
    <who name="Henri Sivonen">hsivonen</who>
    <bug_when>2012-11-01 13:33:44 +0000</bug_when>
    <thetext>text/html owns its charset parameter. So I think it would be appropriate to say that, if the charset parameter is present, its value must be a label from the Encoding Standard and must be the label of the encoding actually used. That should make bogus values an error.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>77716</commentid>
    <comment_count>5</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2012-11-01 22:32:33 +0000</bug_when>
    <thetext>Aah, defining it as part of text/html makes sense, yeah, dunno why I missed that.

Ok, will deal with this in January along with the rest of the MIME registry stuff.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>128760</commentid>
    <comment_count>6</comment_count>
    <who name="Anne">annevk</who>
    <bug_when>2017-07-21 11:12:36 +0000</bug_when>
    <thetext>This seems to have gotten fixed along the way.</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>