Bug 19978
Summary: The decoding algorithm uses ampersand as the separator. But HTML4 recommended semicolon. http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.2.2 Therefore both semicolon and ampersand should be the separator.
Created: 2012-11-16 07:28:59 +0000
Last modified: 2014-01-15 14:03:53 +0000
Classification: Unclassified
Product: WHATWG
Component: URL
Version: unspecified
Platform: Other
OS: other
Status: RESOLVED WONTFIX
URL: http://www.whatwg.org/specs/web-apps/current-work/#url-encoded-form-data
Priority: P3
Severity: normal
Target milestone: Unsorted
People: contributor (reporter), annevk, ian, mike, naruse, sideshowbarker+urlspec

Comment 0 (id 78380), contributor, 2012-11-16 07:28:59 +0000
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html
Multipage: http://www.whatwg.org/C#url-encoded-form-data
Complete: http://www.whatwg.org/c#url-encoded-form-data
Comment: The decoding algorithm uses ampersand as the separator. But HTML4 recommended semicolon. http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.2.2 Therefore both semicolon and ampersand should be the separator.
Posted from: 218.45.212.2
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11

Comment 1 (id 78741), annevk, 2012-11-24 16:44:22 +0000
That's not compatible with implementations.

Comment 2 (id 78815), naruse, 2012-11-26 05:55:52 +0000
(In reply to comment #1)
> That's not compatible with implementations.
My point is about decoding, not encoding, so this should increase compatibility.

Comment 3 (id 78820), annevk, 2012-11-26 10:29:14 +0000
Compatibility with which encoding implementations? Or do decoding implementations typically implement this? (Though if you cannot get it generated that seems pointless.)

Comment 4 (id 78824), naruse, 2012-11-26 13:27:59 +0000
(In reply to comment #3)
> Compatibility with which encoding implementations? Or do decoding
> implementations typically implement this? (Though if you cannot get it
> generated that seems pointless.)
For example, Perl's CGI.pm emits a semicolon-separated query string:

    $ perl -e'use CGI;$q=CGI->new;$q->param(foo => "bar");$q->param(hoge => "fuga");print $q->query_string()'
    foo=bar;hoge=fuga

Ruby's cgi.rb emits an &-separated string, but can parse a ;-separated one.

Comment 5 (id 78825), annevk, 2012-11-26 13:35:09 +0000
I guess we might want to consider it a bit more then. In any event, I want to use this algorithm for the URLQuery API and there I definitely do not want ; to count as a separator. "?na;me=value&name;2=othervalue" should just be split on & and then =.

Comment 6 (id 78843), naruse, 2012-11-26 18:15:27 +0000
Is there a web browser that doesn't escape ; in a name when submitting a form?

Comment 7 (id 78846), annevk, 2012-11-26 18:27:57 +0000
I'm not sure, but you can get such URLs by manipulating via JavaScript or simply with <a>.

Comment 8 (id 78867), naruse, 2012-11-26 21:14:42 +0000
What creates such URLs?

When it is via JavaScript, there is no function that generates a query string directly from a form, such as a form.queryString(). If the script uses escape(), it escapes ; and &. If it uses encodeURI(), it escapes neither ; nor &, but that is incorrect usage. If it uses encodeURIComponent(), it escapes ; and &. So there's no problem.

When it is a simple a href, it comes either from a library or is written by hand.

If it comes from a library:
Perl's CGI.pm decodes with ; and & as separators, and encodes with ; by default.
Python's cgi.py decodes with ; and & as separators, and urllib.py encodes with &.
Ruby's cgi.rb decodes with ; and & as separators, and uri.rb encodes with &.
PHP decodes with & as a separator (this can be changed with arg_separator.input), and http_build_query encodes with &.
All of them encode both & and ; in keys. So there's no problem.

If it is written by hand, there are many possibilities. It may use an odd separator like !, ,, |, $, and so on. Of course that includes a query string like "?na;me=value&name;2=othervalue".

Therefore I think splitting on [&;] is a reasonable de facto standard. The current decoding algorithm breaks CGI.pm, the majority case, in order to defend rare edge cases.
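
The two positions in comments 5 and 8 can be compared with a small sketch. This is not the specification's decoding algorithm, only a simplified splitter written for illustration; the function name `parse_query` and its `separators` parameter are hypothetical. It is applied to the CGI.pm output from comment 4 and the query string from comment 5.

```python
import re
from urllib.parse import unquote_plus


def parse_query(query, separators="&"):
    """Split a query string into (name, value) pairs.

    `separators` is a string of single-character separators, e.g. "&" or "&;".
    Simplified illustration only, not the algorithm from the specification.
    """
    # Build a character class such as [&] or [&;] from the separators.
    pattern = "[" + re.escape(separators) + "]"
    pairs = []
    for field in re.split(pattern, query):
        if not field:
            continue  # ignore empty fields, e.g. from "a=1&&b=2"
        # Split each field on the first "=" only.
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


cgi_pm_output = "foo=bar;hoge=fuga"            # emitted by CGI.pm (comment 4)
handwritten = "na;me=value&name;2=othervalue"  # example from comment 5

print(parse_query(cgi_pm_output, "&"))
# [('foo', 'bar;hoge=fuga')]
print(parse_query(cgi_pm_output, "&;"))
# [('foo', 'bar'), ('hoge', 'fuga')]
print(parse_query(handwritten, "&"))
# [('na;me', 'value'), ('name;2', 'othervalue')]
print(parse_query(handwritten, "&;"))
# [('na', ''), ('me', 'value'), ('name', ''), ('2', 'othervalue')]
```

Splitting on & alone misreads the CGI.pm output, while splitting on [&;] misreads the hand-written query string, so neither choice handles both inputs as their authors intended.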

Comment 9 (id 80627), ian, 2012-12-27 00:14:46 +0000
This is a URL spec bug now, right?

Comment 10 (id 80628), ian, 2012-12-27 00:15:20 +0000
(assuming it's a bug at all, I mean; I personally think it should be WONTFIXed as I see no value in using semicolons as well, and supporting multiple syntaxes is a recipe for security bugs, typically)

Comment 11 (id 80636), annevk, 2012-12-27 12:40:37 +0000
It's URL once HTML starts making that dependency. Agreed about WONTFIX. Seems better if everyone aims to converge to a single format.

Comment 12 (id 80637), naruse, 2012-12-27 15:01:52 +0000
I'm OK with WONTFIX if you decide on it with an understanding of the facts I showed above.

Comment 13 (id 80646), ian, 2012-12-27 21:34:41 +0000
Ok, Anne, your call. (Assume HTML defers to URL for this stuff.)

Comment 14 (id 98500), annevk, 2014-01-15 14:03:53 +0000
Current libraries on the server also have other quirks, as illustrated in bug 24222. They are free to implement other things, I think. The specification just defines parsing for the format produced by the web platform.
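
The closing point, that a parser only has to handle what a well-behaved serializer produces, can be sketched as follows. Python's standard `urllib.parse.urlencode` is used here as a stand-in for such a serializer (an assumption made for illustration, not something the thread prescribes), and the helper name `decode` is hypothetical.

```python
import re
from urllib.parse import urlencode, unquote_plus

# Names and values that contain both candidate separators.
fields = [("na;me", "value"), ("name;2", "other&value")]

# urlencode joins pairs with "&" and percent-encodes ";" and "&" inside them.
encoded = urlencode(fields)
print(encoded)
# na%3Bme=value&name%3B2=other%26value


def decode(query, separator_pattern):
    """Split on the given regex, then on the first "=", then percent-decode."""
    pairs = []
    for field in re.split(separator_pattern, query):
        if field:
            name, _, value = field.partition("=")
            pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


# Because no raw ";" survives encoding, both splitting rules agree here.
print(decode(encoded, "&") == fields)      # True
print(decode(encoded, "[&;]") == fields)   # True
```

For output like this, accepting ; as an extra separator changes nothing; it only matters for input that was produced with ; as the separator or that contains an unescaped ;.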