This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12819 - Fully define application/x-www-form-urlencoded
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL: http://www.whatwg.org/specs/web-apps/...
Reported: 2011-05-30 13:07 UTC by contributor
Modified: 2011-08-15 03:46 UTC

Description contributor 2011-05-30 13:07:40 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/
Section: http://www.whatwg.org/specs/web-apps/current-work/#url-encoded-form-data

Comment:
How do you decode this format on the server? There seems to be no definition
of the format, apart from the definition of how to encode it. Expecting every
implementer to reverse this algorithm seems prone to mistakes.

Posted from: 109.246.246.173
User agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Ubuntu/10.10 Chromium/13.0.772.0 Chrome/13.0.772.0 Safari/535.1
Comment 1 Michael[tm] Smith 2011-08-04 05:06:23 UTC
mass-moved component to LC1
Comment 2 Ian 'Hixie' Hickson 2011-08-12 20:49:08 UTC
Wow, looks like nobody's ever registered application/x-www-form-urlencoded, HTML4 doesn't define how to parse it, and there's no other documentation worth anything on it either.

Ok I guess we should register the type in the IANA considerations section, and then in 4.10.22.5 URL-encoded form data add a paragraph and list at the end saying how to decode it. Should probably mention _charset_ there too. While I'm at it maybe also add a similar section for multipart/form-data (saying to see the RFC), and for text/plain (saying it's ambiguous and can't be parsed).

So the parsing rules here should be:

 - cut on &s => list of name-value pairs
 - cut name-value pairs on =s limit 1 => names, values
 - replace +s in names, values with 0x20
 - expand %xxs to corresponding bytes
 - look for _charset_ name, treat value as encoding if found; otherwise use the encoding determined by magic
 - decode names, values per that encoding
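
For illustration, the steps listed above can be sketched in Python roughly as follows. This is a minimal sketch, not the wording that ended up in the spec. The function names, the UTF-8 fallback standing in for "the encoding determined by magic", the skipping of empty segments, and the replacement of undecodable bytes are all assumptions made for the example.

```python
import codecs
import re

_PERCENT = re.compile(rb"%([0-9A-Fa-f]{2})")


def _unescape(raw: bytes) -> bytes:
    # Replace '+' with 0x20 first, then expand %xx escapes in a single pass,
    # so an escaped plus sign ("%2B") survives as a literal '+'.
    raw = raw.replace(b"+", b" ")
    return _PERCENT.sub(lambda m: bytes([int(m.group(1), 16)]), raw)


def parse_urlencoded(payload: bytes, default_encoding: str = "utf-8"):
    """Return a list of (name, value) string pairs from a urlencoded payload."""
    pairs = []
    for chunk in payload.split(b"&"):         # cut on '&'
        if not chunk:
            continue                          # assumption: ignore empty segments
        name, _, value = chunk.partition(b"=")  # cut on the first '=' only
        pairs.append((_unescape(name), _unescape(value)))

    # Look for a _charset_ field; fall back to default_encoding (an assumption
    # standing in for whatever out-of-band detection the server uses).
    encoding = default_encoding
    for name, value in pairs:
        if name == b"_charset_":
            try:
                candidate = value.decode("ascii")
                codecs.lookup(candidate)      # raises LookupError if unknown
                encoding = candidate
            except (UnicodeDecodeError, LookupError):
                pass
            break

    # Decode names and values with the chosen encoding.
    return [
        (n.decode(encoding, errors="replace"), v.decode(encoding, errors="replace"))
        for n, v in pairs
    ]


print(parse_urlencoded(b"q=a+b%21&x=1=2"))
# [('q', 'a b!'), ('x', '1=2')]
```

Doing the percent-expansion on bytes and decoding to text only at the end matters here, because a multi-byte character arrives as several separate %xx escapes.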

Might want to mention the isindex exception? Maybe not.
Comment 3 Geoffrey Sneddon 2011-08-12 20:57:33 UTC
Probably should document the isindex exception. What happens about non-ASCII bytes in the form submission?
Comment 4 Ian 'Hixie' Hickson 2011-08-12 21:10:29 UTC
The isindex exception is just that, if you're expecting isindex input, you skip the two "cut" steps and process the whole input as a single "value". Not sure how I'll phrase that exactly.
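
As a rough illustration of that exception, here is a minimal Python sketch: the payload is not split on & or =, and the whole input is unescaped and decoded as one value. The function name and the UTF-8 default are assumptions for the example, not anything taken from the spec.

```python
from urllib.parse import unquote_to_bytes


def parse_isindex(payload: bytes, encoding: str = "utf-8") -> str:
    """Treat the entire payload as a single value (hypothetical helper)."""
    # No cutting on '&' or '=': only the '+' replacement and %xx expansion apply.
    raw = unquote_to_bytes(payload.replace(b"+", b" "))
    return raw.decode(encoding, errors="replace")


print(parse_isindex(b"a=b&c+d%21"))
# a=b&c d!
```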
Comment 5 Anne 2011-08-14 11:09:51 UTC
Not really useful as far as I can tell, but for reference:

http://lists.w3.org/Archives/Public/www-archive/2006Sep/thread.html#msg30
http://tools.ietf.org/html/draft-hoehrmann-urlencoded (last updated September 2010)
Comment 6 Ian 'Hixie' Hickson 2011-08-15 03:45:27 UTC
Yeah there doesn't seem to be any implementor interest around that format, so probably best to ignore it.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given below
Rationale: Concurred with reporter's comments.
Comment 7 contributor 2011-08-15 03:46:14 UTC
Checked in as WHATWG revision r6450.
Check-in comment: Define how to parse the various form submission formats. Register the legacy one. Some editorial tweaks for consistency.
http://html5.org/tools/web-apps-tracker?from=6449&to=6450