This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 23821 - The WebSocket constructor resolves the URL using UTF-8, but .url appears to use the document's encoding as URL character encoding.
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-11-14 04:35 UTC by contributor
Modified: 2013-11-22 07:44 UTC

Description contributor 2013-11-14 04:35:39 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/
Multipage: http://www.whatwg.org/C#dom-websocket-url
Complete: http://www.whatwg.org/c#dom-websocket-url
Referrer: 

Comment:
The WebSocket constructor resolves the URL using UTF-8, but .url appears to
use the document's encoding as URL character encoding.

Posted from: 59.37.57.226
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.26 Safari/537.36 OPR/18.0.1284.11 (Edition Next)
Comment 1 contributor 2013-11-21 22:57:57 UTC
Checked in as WHATWG revision r8305.
Check-in comment: WebSocket.url should be consistent with how the URL is used in the first place.
http://html5.org/tools/web-apps-tracker?from=8304&to=8305
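
As a rough illustration of what resolving with UTF-8 means for a non-ASCII query, the sketch below uses the standard `URL` class as a stand-in for the constructor's URL parsing. That substitution is an assumption made for illustration: `URL` always encodes with UTF-8, and the snippet neither opens a WebSocket nor reads `.url`. The host and query are taken from the test case in comment 3.

```javascript
// Stand-in for the parsing step only; no connection is opened.
const parsed = new URL("ws://example.invalid/?\u00e5");

// U+00E5 is percent-encoded from its UTF-8 bytes (0xC3 0xA5).
console.log(parsed.href);
```

Under the revised text, `.url` would be expected to report the same serialization that the constructor used, regardless of the document's encoding.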
Comment 2 Ian 'Hixie' Hickson 2013-11-21 23:29:45 UTC
Note that no browser actually does this per the new spec, currently.
Firefox just passes the URL through unmodified (!).
Safari and Chrome use the doc encoding for the query component and UTF-8 for the path, converting U+263A in the Win1252 query component into %26%239786%3B. I don't see anything in the URL spec that comes close to this (I don't even see how the path gets %-encoded, actually).
Comment 3 Simon Pieters 2013-11-22 07:44:25 UTC
(In reply to Ian 'Hixie' Hickson from comment #2)
> Note that no browser actually does this per the new spec, currently.

Presto does, it seems. So does my copy of Safari (6.0.4).

data:text/html;charset=windows-1252,%3C!DOCTYPE%20html%3E%0A%3Cscript%3Evar%20s%3D%20new%20WebSocket('ws%3A%2F%2Fexample.invalid%2F%3F%5Cu00e5')%3B%20alert(s.url)%20%3C%2Fscript%3E
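
For readability, the test case can be unpacked with a few lines of JavaScript. This is only a convenience sketch; the variable names are made up here, and the data: URL is copied verbatim from the line above.

```javascript
// The test case from this comment, copied verbatim.
const testCase =
  "data:text/html;charset=windows-1252,%3C!DOCTYPE%20html%3E%0A%3Cscript%3Evar%20s%3D%20new%20WebSocket('ws%3A%2F%2Fexample.invalid%2F%3F%5Cu00e5')%3B%20alert(s.url)%20%3C%2Fscript%3E";

// Split the media-type header from the payload at the first comma.
const comma = testCase.indexOf(",");
const header = testCase.slice(0, comma);
const payload = decodeURIComponent(testCase.slice(comma + 1));

console.log(header);  // declares the document encoding (windows-1252)
console.log(payload); // the HTML page that constructs the WebSocket
```

The decoded page is a windows-1252 document whose script passes a URL with U+00E5 in the query to the constructor (written as a `\u00e5` escape in the script source) and then alerts `s.url`.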

> Firefox just passes the URL through unmodified (!).
> Safari and Chrome use the doc encoding for the query component and UTF-8 for
> the path, converting U+263A in the Win1252 query component into
> %26%239786%3B. I don't see anything in the URL spec that comes close to this

It looks like the <form> error handling mode:

"Otherwise, emit the result of running utf-8 encode on U+0026, U+0023, followed by the shortest sequence of ASCII digits representing c in base ten, followed by U+003B."
http://encoding.spec.whatwg.org/#error-handling-mode
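
A minimal sketch of how that error mode would yield the string reported in comment 2, assuming a windows-1252 document. The function names are hypothetical, and the rule "leave ASCII letters and digits alone, percent-encode every other byte" is a simplification chosen for illustration, not the URL spec's actual encode set.

```javascript
// Code points that windows-1252 assigns to bytes 0x80..0x9F, in byte order.
const HIGH_RANGE = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Return the windows-1252 byte for a code point, or null if it has none.
function windows1252Byte(cp) {
  if (cp < 0x80) return cp;
  const i = HIGH_RANGE.indexOf(cp);
  if (i !== -1) return 0x80 + i;
  if (cp >= 0xa0 && cp <= 0xff) return cp;
  return null;
}

// Encode a query string with the HTML error mode, then percent-encode.
function encodeQueryHtmlMode(text) {
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const byte = windows1252Byte(cp);
    // Unencodable code point: emit "&#", its decimal digits, then ";".
    const bytes = byte !== null
      ? [byte]
      : Array.from(`&#${cp};`, (c) => c.charCodeAt(0));
    for (const b of bytes) {
      const c = String.fromCharCode(b);
      out += /[A-Za-z0-9]/.test(c)
        ? c
        : "%" + b.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(encodeQueryHtmlMode("\u263a")); // %26%239786%3B
console.log(encodeQueryHtmlMode("\u00e5")); // %E5
```

U+263A has no windows-1252 byte, so it becomes the numeric character reference `&#9786;` before percent-encoding, whereas U+00E5 maps to the single byte 0xE5.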

> (I don't even see how the path gets %-encoded, actually).

Do you mean in the spec?