Bug 16302 - possible unnecessary insertion of U+0020 SPACE in mime type expression
Status: RESOLVED NEEDSINFO
Product: WHATWG
Classification: Unclassified
Component: Unwelcome
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: Unsorted
Assigned To: Michael[tm] Smith
sideshowbarker+unwelcome
http://dvcs.w3.org/hg/xhr/raw-file/8d...
Reported: 2012-03-10 04:41 UTC by Glenn Adams
Modified: 2012-10-30 17:24 UTC (History)
Description Glenn Adams 2012-03-10 04:41:16 UTC
In section 4.7.6, step 3, under "If data is a FormData", the text specifies:

"Let mime type be the concatenation of "multipart/form-data;", a U+0020 SPACE character, "boundary=", and ..."

The text ", a U+0020 SPACE character" is unnecessary and undesirable, unless it is here for some unstated compatibility reason.

HTML5 references HTTP (RFC2616) for the definition of a MIME type, which defines it as follows:

media-type = type "/" subtype *( ";" parameter )
parameter  = attribute "=" value
attribute  = token
value      = token | quoted-string

and which does not show the insertion of SPACE after ";"
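
As an illustration of that literal reading, here is a minimal Python sketch that turns the four productions above into a regular expression with no whitespace allowed anywhere. The names `TOKEN`, `QUOTED`, `PARAM`, `MEDIA_TYPE_LITERAL`, and `matches_literal` are made up for this example, and the quoted-string pattern is a simplification. It deliberately ignores any whitespace rules defined elsewhere in RFC 2616, so it shows only what the productions say on their face.

```python
import re

# token: one or more characters that are neither control characters nor separators.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
# quoted-string (simplified): anything between double quotes, with backslash escapes.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# parameter = attribute "=" value
PARAM = rf"{TOKEN}=(?:{TOKEN}|{QUOTED})"
# media-type = type "/" subtype *( ";" parameter ), read literally: no whitespace.
MEDIA_TYPE_LITERAL = re.compile(rf"{TOKEN}/{TOKEN}(?:;{PARAM})*")


def matches_literal(value: str) -> bool:
    """Return True if value matches the productions with no whitespace at all."""
    return MEDIA_TYPE_LITERAL.fullmatch(value) is not None


print(matches_literal("multipart/form-data;boundary=abc"))
print(matches_literal("multipart/form-data; boundary=abc"))
```

Under this literal reading the form without a space matches and the form with a space does not.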

However, I would note that the examples of multipart/form-data in RFC2046 (MIME Part Two) 5.1.1 [1] and HTML4 [2] do show SPACE after ";". So there may be a hidden compatibility argument operating here. If that is the case, then please add a note explaining it with an informative reference to another document justifying/defining this usage.

[1] http://tools.ietf.org/html/rfc2616#section-3.7
[2] http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.2
Comment 1 Julian Reschke 2012-03-10 10:01:06 UTC
RFC 2616's ABNF has "implied Linear Whitespace", see <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1>.

Compare the media type definition with: <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-18.html#media.types>
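
To make the effect of implied linear whitespace concrete, here is a rough Python sketch of a recipient that tolerates optional whitespace around ";". The function name `parse_media_type` is hypothetical, and the parsing is simplified: it does not handle a ";" inside a quoted-string or backslash escapes. It is not any particular browser's or server's parser.

```python
def parse_media_type(value: str) -> tuple[str, dict[str, str]]:
    """Split a media type into (type/subtype, parameters).

    Whitespace around ';' is ignored, so both serializations parse alike.
    Simplification: ';' inside quoted strings is not supported.
    """
    essence, *raw_params = value.split(";")
    params: dict[str, str] = {}
    for raw in raw_params:
        raw = raw.strip()
        if not raw:
            continue  # tolerate a trailing ';'
        name, _, val = raw.partition("=")
        params[name.lower()] = val.strip('"')
    return essence.strip().lower(), params


with_space = parse_media_type("multipart/form-data; boundary=abc")
without_space = parse_media_type("multipart/form-data;boundary=abc")
print(with_space == without_space)
```

A recipient written this way sees the same type and the same boundary parameter whether or not the sender emitted the space.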
Comment 2 Glenn Adams 2012-03-10 19:16:02 UTC
(In reply to comment #1)
> RFC 2616's ABNF has "implied Linear Whitespace", see
> <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1>.
> 
> Compare the media type definition with:
> <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-18.html#media.types>

thanks for reminding me of that; i had looked for this but didn't find it;

in any case, I wonder why it is necessary to require the use of *optional* whitespace in this case where it is not specified in the two cases immediately above the one cited here:

(1) Let mime type be "application/xml" or "text/html" if Document is an HTML document, followed by ";charset=", followed by encoding.

(2) Let mime type be "text/plain;charset=UTF-8".

on the surface, this behavior appears to be inconsistent;
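
A small Python sketch of the three cases side by side may help show the difference. The function `mime_type_for` and its parameters are invented for illustration, and since the quoted spec text elides where the boundary value comes from, it is passed in here as a plain hypothetical argument.

```python
def mime_type_for(kind: str, encoding: str = "UTF-8",
                  is_html: bool = False, boundary: str = "") -> str:
    """Build the mime type string for each case as quoted in this bug."""
    if kind == "document":
        # Case (1): no space after ';'
        base = "text/html" if is_html else "application/xml"
        return base + ";charset=" + encoding
    if kind == "string":
        # Case (2): no space after ';'
        return "text/plain;charset=UTF-8"
    if kind == "formdata":
        # FormData case: an explicit U+0020 SPACE after ';'
        # (boundary is a hypothetical stand-in for the elided value)
        return "multipart/form-data;" + " " + "boundary=" + boundary
    raise ValueError(f"unknown kind: {kind}")


for kind in ("document", "string", "formdata"):
    print(mime_type_for(kind, boundary="abc"))
```

Only the FormData branch produces a space after the semicolon.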

further, since this is effectively defining a serialization rule, i wonder how this relates to the "canonical form" of an "Internet media type"; unfortunately, the language about canonicalization in RFC2616 [1] and httpbis [2] does not address this point adequately (nor does it address the case of multiple same-named parameters);

[1] http://greenbytes.de/tech/webdav/rfc2616.html#canonicalization.and.text.defaults
[2] http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-18.html#canonicalization.and.text.defaults
Comment 3 Julian Reschke 2012-03-10 19:31:42 UTC
(In reply to comment #2)
> (In reply to comment #1)
> > RFC 2616's ABNF has "implied Linear Whitespace", see
> > <http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.2.1>.
> > 
> > Compare the media type definition with:
> > <http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-18.html#media.types>
> 
> thanks for reminding me of that; i had looked for this but didn't find it;
> 
> in any case, I wonder why it is necessary to require the use of *optional*
> whitespace in this case where it is not specified in the two cases immediately
> above the one cited here:

It's not. It's a case of over-specification.

> (1) Let mime type be "application/xml" or "text/html" if Document is an HTML
> document, followed by ";charset=", followed by encoding.
> 
> (2) Let mime type be "text/plain;charset=UTF-8".
> 
> on the surface, this behavior appears to be inconsistent;
> 
> further, since this is effectively defining a serialization rule, i wonder how
> this relates to the "canonical form" of an "Internet media type";

That is about a canonical form of the *payload*, not the Content-Type header field.

> unfortunately, the language about canonicalization in RFC2616 [1] and httpbis
> [2] does not address this point adequately (nor does it address the case of
> multiple same-named parameters);
> 
> [1]
> http://greenbytes.de/tech/webdav/rfc2616.html#canonicalization.and.text.defaults
> [2]
> http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p3-payload-18.html#canonicalization.and.text.defaults
Comment 4 Glenn Adams 2012-03-10 19:48:46 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > further, since this is effectively defining a serialization rule, i wonder how
> > this relates to the "canonical form" of an "Internet media type";
> 
> That is about a canonical form of the *payload*, not the Content-Type header
> field.

good point; we apparently need a specification of the canonical form of an Internet media type to refer to in order to obtain a consistent serialization; perhaps someone should raise this point in the httpbis process (I am not participating in that process)
Comment 5 Julian Reschke 2012-03-10 20:05:17 UTC
(In reply to comment #4)
> ...
> good point; we apparently need a specification of the canonical form of an
> Internet media type to refer to in order to obtain a consistent serialization;

Why do we need a consistent serialization?

> perhaps someone should raise this point in the httpbis process (I am not
> participating in that process)

Just send email to the HTTPbis mailing list; that is all that is needed to participate.
Comment 6 Glenn Adams 2012-03-10 20:35:13 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > ...
> > good point; we apparently need a specification of the canonical form of an
> > Internet media type to refer to in order to obtain a consistent serialization;
> 
> Why do we need a consistent serialization?

it sounds like you are asking why consistency is useful... isn't that the point of having standards? or is the point to enshrine inconsistency? i prefer the former, and i would guess that is why many of us are here, yes?
Comment 7 Julian Reschke 2012-03-10 21:36:18 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > ...
> > > good point; we apparently need a specification of the canonical form of an
> > > Internet media type to refer to in order to obtain a consistent serialization;
> > 
> > Why do we need a consistent serialization?
> 
> it sounds like you are asking why consistency is useful... isn't that the point
> of having standards? or is the point to enshrine inconsistency? i prefer the
> former, and i would guess that is why many of us are here, yes?

Right now we have no canonical forms for any headers. Yes, having them might be useful in some cases (like signing messages), but defining them is a lot of work, and AFAIK nobody has done it yet.

The fact that each new header field would need to define its canonical form individually doesn't help.

In the IETF, work happens if people volunteer and write a spec that others can review. I don't think anybody has done that yet, but maybe it's something you want to try?
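
To illustrate why a canonical form could matter for something like message signing, here is a purely hypothetical Python sketch. No such canonical form is defined by any specification mentioned in this bug. The rules used here (lowercase type and parameter names, no whitespace, parameters sorted by name) and the names `canonical_content_type` and `digest` are assumptions made only for this example.

```python
import hashlib


def canonical_content_type(value: str) -> str:
    """Hypothetical canonical form: lowercase type/subtype and parameter
    names, no whitespace around ';', parameters sorted by name.
    Simplification: quoted strings containing ';' are not handled."""
    essence, *raw_params = value.split(";")
    params = []
    for raw in raw_params:
        name, sep, val = raw.strip().partition("=")
        if sep:
            params.append((name.lower(), val))
    params.sort()
    parts = [essence.strip().lower()] + [f"{n}={v}" for n, v in params]
    return ";".join(parts)


def digest(value: str) -> str:
    """Hash the canonical form, as a signing scheme might."""
    return hashlib.sha256(canonical_content_type(value).encode("ascii")).hexdigest()


a = "multipart/form-data; boundary=abc"
b = "Multipart/Form-Data;boundary=abc"
print(canonical_content_type(a))
print(digest(a) == digest(b))
```

Without a canonicalization step of some kind, the two spellings would hash differently even though recipients treat them as equivalent. The sketch also leaves open what to do with multiple same-named parameters, which is one of the points any real definition would have to settle.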
Comment 8 Anne 2012-03-26 17:26:45 UTC
Sounds like this is not my problem for now.
Comment 9 Michael[tm] Smith 2012-10-06 03:55:30 UTC
Not sure what action if any I need to take on this. Re-assign to one of the W3C XHR editors?