<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>18135</bug_id>
          
          <creation_ts>2012-07-18 17:24:10 +0000</creation_ts>
          <short_desc>multipart/form-data: field name encoding is not specified; browsers do incompatible things</short_desc>
          <delta_ts>2016-04-19 22:39:40 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>HTML WG</product>
          <component>HTML5 spec</component>
          <version>unspecified</version>
          <rep_platform>Other</rep_platform>
          <op_sys>other</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>MOVED</resolution>
          
          
          <bug_file_loc>http://www.whatwg.org/specs/web-apps/current-work/#multipart-form-data</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P1</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter>contributor</reporter>
          <assigned_to name="Robin Berjon">robin</assigned_to>
          <cc>ej</cc>
    
    <cc>mike</cc>
    
    <cc>public-html-admin</cc>
    
    <cc>public-html-wg-issue-tracking</cc>
    
    <cc>robin</cc>
    
    <cc>travil</cc>
          
          <qa_contact name="HTML WG Bugzilla archive list">public-html-bugzilla</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>70737</commentid>
    <comment_count>0</comment_count>
    <who name="">contributor</who>
    <bug_when>2012-07-18 17:24:10 +0000</bug_when>
    <thetext>This was cloned from bug 16909 as part of operation convergence.
Originally filed: 2012-05-02 20:09:00 +0000

================================================================================
 #0   contributor@whatwg.org                          2012-05-02 20:09:21 +0000 
--------------------------------------------------------------------------------
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html
Multipage: http://www.whatwg.org/C#multipart-form-data
Complete: http://www.whatwg.org/c#multipart-form-data

Comment:
The specification is unclear about how field names should be encoded. In
particular, what should be done if they include special characters (e.g.
quotes, newlines, Unicode)? I started a mailing list thread on this
issue...

Posted from: 74.66.64.60
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.36 Safari/536.5
================================================================================
 #1   Evan Jones                                      2012-05-02 20:10:52 +0000 
--------------------------------------------------------------------------------
The specification is unclear about how field names should be encoded. In particular, what should be done if they include special characters (e.g. quotes, newlines, Unicode)?
================================================================================
 #2   Evan Jones                                      2012-05-02 20:41:21 +0000 
--------------------------------------------------------------------------------
Argh; whoops. Sorry for the bugzilla spam. I didn&apos;t realize that the &quot;comment&quot; thingy just filed a bugzilla bug.

HTML5 states: &quot;Encode the (now mutated) form data set using the rules described by RFC 2388&quot;. However, it then modifies the rules:

&quot;The parts of the generated multipart/form-data resource that correspond to non-file fields must not have a Content-Type header specified. Their names and values must be encoded using the character encoding selected above (field names in particular do not get converted to a 7-bit safe encoding as suggested in RFC 2388).&quot;

http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#multipart-form-data

So the problem is: what are we supposed to do with field names? In particular, what if they contain &quot;special&quot; MIME characters (e.g. \r\n newlines, backslashes, double quotes, or semicolons)? Different browsers do different things, so server code currently has to detect the browser to do the right thing.


Example: &lt;input name=&apos;bàz%22\&quot;\&apos; value=&quot;foo&quot;&gt;

Firefox 13b: Content-Disposition: form-data; name=&quot;bàz%22\\&quot;\&quot;
Webkit nightly: Content-Disposition: form-data; name=&quot;bàz%22\%22\&quot;

Firefox backslash-escapes double quotes, but it fails to escape backslashes. This means its header fails to parse according to the MIME specification (it sort of decodes as bàz%22\ with an extra trailing \&quot;).

Webkit %-escapes the double quotes, but does not %-escape the percent. Thus the header above could have come from either name=&apos;bàz&quot;\&quot;\&apos; or the desired name, and the server cannot tell which. Webkit has a bug open on this issue, asking for specification guidance: https://bugs.webkit.org/show_bug.cgi?id=62107
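
To make the two behaviours concrete, here is a rough Python sketch. The helper names firefox_style and webkit_style are made up for illustration and only mimic the header values quoted above; they are not taken from either browser's source.

```python
# Hypothetical helpers that mimic the two behaviours described above.
# They are an illustration only, not code from either browser.

def firefox_style(name):
    # Backslash-escape double quotes only; backslashes are left alone.
    return name.replace('"', '\\"')

def webkit_style(name):
    # Percent-escape double quotes only; a literal % is left alone.
    return name.replace('"', '%22')

desired = 'bàz%22\\"\\'   # the name from the example: bàz%22\"\
other = 'bàz"\\"\\'       # a different name: bàz"\"\

print('Firefox-like: name="' + firefox_style(desired) + '"')
print('Webkit-like:  name="' + webkit_style(desired) + '"')

# Two different field names give the same Webkit-style value.
print(webkit_style(desired) == webkit_style(other))
```

Both names produce the same Webkit-style value, so a server that sees it cannot recover which name the form actually used.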


HTML5 should specify exactly how field names are encoded. Some potential solutions:

1) Bless Firefox&apos;s backslash quoting rules (they are very weird but I think they are unambiguous?). This means Webkit POSTs will be decoded to the wrong field names, and POSTs to older servers may parse incorrectly if the name includes a \ (but that must already happen for Firefox?).

2) Bless Webkit&apos;s percent escaping rules (ideally also escaping %). Servers that strictly parse this format will fail to parse Firefox POSTs if the name includes a \, and will 

3) Adopt RFC 6266&apos;s approach of having two name parameters when there are special characters: one with the existing escaping, and one with an unambiguously escaped version. Ideally, existing servers will parse the first name and not break unless the form value contains a special character. As servers are upgraded, they will be able to unambiguously parse the new header. See: http://tools.ietf.org/html/rfc6266
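
As a rough sketch of what option 2 could look like with % escaped as well: the Python below uses made-up names (ESCAPES, escape_name, unescape_name) and an assumed set of escaped characters (percent, double quote, CR, LF). It is one possible scheme, not what any browser or specification does today.

```python
# Hypothetical escaping scheme: option 2 with % escaped as well.
# The set of escaped characters is an assumption for illustration.
ESCAPES = [('%', '%25'), ('"', '%22'), ('\r', '%0D'), ('\n', '%0A')]

def escape_name(name):
    # % is escaped first so the escapes added afterwards stay unambiguous.
    for raw, escaped in ESCAPES:
        name = name.replace(raw, escaped)
    return name

def unescape_name(value):
    # Undo the escapes in reverse order, so %25 is decoded last.
    for raw, escaped in reversed(ESCAPES):
        value = value.replace(escaped, raw)
    return value

for name in ['bàz%22\\"\\', 'bàz"\\"\\']:
    header = 'Content-Disposition: form-data; name="' + escape_name(name) + '"'
    print(header)
    # Each name survives a round trip through the escaping.
    assert unescape_name(escape_name(name)) == name
```

With % escaped, the two example names no longer collide, and a server can decode the value without knowing which browser sent it.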


Aside: The *same* issue happens for uploaded file names. I started a mailing list thread to attempt to collect more information about this: http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-May/035610.html
================================================================================</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>121039</commentid>
    <comment_count>1</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2015-06-16 10:16:07 +0000</bug_when>
    <thetext>Making this a higher priority to actively seek more feedback on it from implementers and webdevs.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>125927</commentid>
    <comment_count>2</comment_count>
    <who name="Travis Leithead [MSFT]">travil</who>
    <bug_when>2016-04-19 22:39:40 +0000</bug_when>
    <thetext>HTML5.1 Bugzilla Bug Triage: Moved

Moved the summary and tracking of followup on this issue to GitHub:
https://github.com/w3c/html/issues/222

If this resolution is not satisfactory, please copy the relevant bug details/proposal into a new issue at the W3C HTML5 Issue tracker: https://github.com/w3c/html/issues/new where it will be re-triaged. Thanks!</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>