This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Specification: https://html.spec.whatwg.org/multipage/comms.html
Multipage: https://html.spec.whatwg.org/multipage/#dom-websocket
Complete: https://html.spec.whatwg.org/#dom-websocket
Referrer: https://html.spec.whatwg.org/multipage/

Comment: Use USVString instead of DOMString for url argument and send() method (removes lone surrogates)

Posted from: 46.127.136.57 by annevk@annevk.nl
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:35.0) Gecko/20100101 Firefox/35.0
Why the URL argument?
The url parser deals with scalar values only.
So this applies to anywhere and everywhere that I get a URL and pass it to the URL parser?
Yes.
So how does that work with, say, content attributes and reflecting IDL attributes?
I suppose those still need http://heycam.github.io/webidl/#dfn-obtain-unicode. (It seems browsers, however, have some kind of IDL extension named [Reflect] that would better allow for this kind of type sharing. Hopefully we can do something like that at some point.)
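For readers unfamiliar with the conversion being discussed, here is a minimal sketch of turning a DOMString (a sequence of UTF-16 code units) into a string of scalar values by replacing each lone surrogate with U+FFFD. The function name `toUSVString` is hypothetical and chosen for illustration; the loop is intended to follow the spirit of the Web IDL steps linked above, not to reproduce their text.

```typescript
// Hypothetical helper (name is an assumption, not a spec or browser API).
// Replaces every unpaired surrogate code unit with U+FFFD and leaves
// well-formed surrogate pairs untouched.
function toUSVString(input: string): string {
  let result = "";
  for (let i = 0; i < input.length; i++) {
    const c = input.charCodeAt(i);
    if (c < 0xd800 || c > 0xdfff) {
      // Not a surrogate: copy as-is.
      result += String.fromCharCode(c);
    } else if (c >= 0xdc00) {
      // Trailing surrogate with no preceding lead surrogate.
      result += "\uFFFD";
    } else if (i + 1 < input.length) {
      const d = input.charCodeAt(i + 1);
      if (d >= 0xdc00 && d <= 0xdfff) {
        // Well-formed pair: keep both code units and skip the second.
        result += String.fromCharCode(c, d);
        i++;
      } else {
        // Lead surrogate followed by something else.
        result += "\uFFFD";
      }
    } else {
      // Lead surrogate at the very end of the string.
      result += "\uFFFD";
    }
  }
  return result;
}
```

With this sketch, a string such as "a\uD800b" would come out as "a\uFFFDb", while a valid pair such as "\uD83D\uDE00" would pass through unchanged.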
I don't understand. Why can't URL just take care of this when I hand in a DOMString?
For the same reason e.g. the network doesn't take care of it? It's a different system and expects different values.
I think this should happen at the URL level. Otherwise we have to have prose all over the place doing conversions back and forth. There's really no need for it when it could be a single sentence in the URL spec that does the conversion. (I'm not really even clear on why it needs to be any prose at all. You just treat the bytes differently.)
The current setup works well for APIs. It seems that the only thing it does not work well for is reflected attributes, which is also a one-sentence fix.
I'm very confused. The content and IDL attributes here are just regular DOMString attributes, they're not anything special until they are later parsed as URLs. Are you going to change e.g. getAttribute() to return a USVString?
I mean that for the case of reflected attributes you would have to invoke the conversion yourself before handing it to the URL parser. I hope that at some point we can define reflected attributes as such, which is already the case in Chromium as I understand it: [Reflect=URL] attribute USVString href;
Oh, you're talking just about how these attributes resolve themselves? Not about how the URL is actually used? I really don't understand the problem here. I pass DOMString strings to the URL parser all the time (e.g. whenever I take a content attribute and resolve it to get an absolute URL). Why would reflecting attributes be any different?
The problem is that the URL parser does not take a DOMString.
That is indeed the problem. I'm saying that instead of everyone having to convert their strings to Unicode before ever interacting with the URL spec, the URL spec should just act like everyone else and take the same kind of string as all the other APIs. If it needs to then act on them as if they're Unicode and not UTF-16, then that's fine, but that's an internal concern, not something you should expose in your API.
I disagree, but I'm happy to add something like a "DOMString-accepting URL parser" hook.
Why don't you agree? Why not just have the current hook, but just make it so it accepts both?
So I was convinced by Simon that this was a better strategy as it keeps the URL parser surrogate free. Reversing that would be somewhat painful, but is definitely doable of course.
You can still keep the parser surrogate free. I'm just saying that whatever prose you would have me put at all the call sites, you would just put at the top of whatever algorithm I'm calling.
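The arrangement described in this comment can be sketched as a thin entry point that performs the conversion once, in front of a parser that itself only ever sees scalar values. Everything here is hypothetical: `parseScalarValuesOnly` is a toy stand-in for the URL parser (it only splits off a scheme), and `parseFromDOMString` and `toScalarValues` are illustrative names, not anything defined by the URL or HTML specifications.

```typescript
interface ParsedUrl {
  scheme: string;
  rest: string;
}

// Hypothetical conversion step: lone surrogates become U+FFFD.
function toScalarValues(input: string): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const c = input.charCodeAt(i);
    const isHigh = c >= 0xd800 && c <= 0xdbff;
    const isLow = c >= 0xdc00 && c <= 0xdfff;
    const next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0;
    if (isHigh && next >= 0xdc00 && next <= 0xdfff) {
      out += String.fromCharCode(c, next); // keep a well-formed pair
      i++;
    } else {
      out += isHigh || isLow ? "\uFFFD" : String.fromCharCode(c);
    }
  }
  return out;
}

// Toy stand-in for a surrogate-free parser. It rejects any input that
// still contains a lone surrogate, so the invariant is visible.
function parseScalarValuesOnly(input: string): ParsedUrl {
  if (toScalarValues(input) !== input) {
    throw new Error("lone surrogate reached the parser");
  }
  const colon = input.indexOf(":");
  if (colon < 0) {
    throw new Error("missing scheme");
  }
  return { scheme: input.slice(0, colon), rest: input.slice(colon + 1) };
}

// The single entry point that call sites would use: the conversion
// lives here, once, instead of being repeated at every call site.
function parseFromDOMString(input: string): ParsedUrl {
  return parseScalarValuesOnly(toScalarValues(input));
}
```

Under these assumptions, call sites hand over whatever string they have, and the inner parser remains surrogate-free because nothing reaches it without passing through the conversion.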
That was my suggestion in comment 16...
Right, but you said you disagreed. I'm trying to figure out why you disagree.
I would prefer addressing this per comment 6, but I'm okay with addressing this per comment 16 for now (until something like IDL [Reflect] becomes feasible which seems like a better solution).
I don't understand how comment 6 would help. Most of the call sites for this aren't the one "reflecting IDL attribute" call site. Can you elaborate? Why is having this in multiple call sites, and having additional IDL syntax, better than just having one line of prose in the URL spec?
This seems similar to the various algorithms in css-syntax that need to be invoked with different kind of inputs from different places (a token stream or a string). http://dev.w3.org/csswg/css-syntax/#parser-entry-points I agree with Hixie that it seems nicer to normalize the input on your end instead of having all other specs convert to the input you want. It centralizes the conversion so it is done consistently, and it is less prose for other specs.
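The entry-point pattern referred to here can be illustrated with a small sketch in which one algorithm accepts either a raw string or an already-tokenized list and normalizes the input itself. The `Token` type, the whitespace-only `tokenize` function, and `parseList` are all hypothetical and far simpler than anything in css-syntax; they only show where the normalization lives.

```typescript
type Token = { kind: "ident"; value: string };

// Hypothetical tokenizer: splits on whitespace only, purely for illustration.
function tokenize(source: string): Token[] {
  return source
    .split(/\s+/)
    .filter((part) => part.length > 0)
    .map((part): Token => ({ kind: "ident", value: part }));
}

// Normalization happens inside the entry point, so callers may pass
// either form without converting first.
function normalizeInput(input: string | Token[]): Token[] {
  return typeof input === "string" ? tokenize(input) : input;
}

// Toy "parser": returns the values of the tokens it was given.
function parseList(input: string | Token[]): string[] {
  return normalizeInput(input).map((token) => token.value);
}
```

Both `parseList("a b")` and `parseList(tokenize("a b"))` go through the same normalization step, which is the consistency argument made in the comment above.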
https://github.com/whatwg/html/pull/840