This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
http://url.spec.whatwg.org/#host-parsing [[ Let host be the result of running utf-8's decoder on the percent decoding of input. ]] Percent decoding is defined for "a string using code points in the range U+0000 to U+007F", but there is no guarantee that input contains only such code points at that point in host parsing. What happens to non-ASCII code points?
How do you get non-ASCII there?
new URL('http://☃/')
So one option would be to percent escape during the host name state. But that is not the only entry point to the host parser. Hmm.
The host parser should UTF8-percent-encode, just before it percent-decodes in step 3.
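A rough sketch of that ordering, for illustration only. The function names (utf8PercentEncodeNonAscii, percentDecode, decodeHost) are made up here and are not taken from the spec text, and the sketch covers only the encode-then-decode step, not the rest of host parsing (IDNA, IPv4/IPv6, forbidden code points). It assumes an environment with the TextEncoder and TextDecoder globals.

```javascript
// Hypothetical helper: percent-encode the UTF-8 bytes of every non-ASCII
// code point, leaving code points U+0000 to U+007F untouched.
function utf8PercentEncodeNonAscii(input) {
  const encoder = new TextEncoder();
  let output = "";
  for (const ch of input) { // for...of iterates by code point
    if (ch.codePointAt(0) <= 0x7f) {
      output += ch;
    } else {
      for (const byte of encoder.encode(ch)) {
        output += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
      }
    }
  }
  return output;
}

// Hypothetical helper: percent-decode an ASCII-only string into bytes.
// A "%" that is not followed by two hex digits is kept as a literal byte.
function percentDecode(asciiInput) {
  const bytes = [];
  for (let i = 0; i < asciiInput.length; i++) {
    const hex = asciiInput.slice(i + 1, i + 3);
    if (asciiInput[i] === "%" && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      bytes.push(parseInt(hex, 16));
      i += 2;
    } else {
      bytes.push(asciiInput.charCodeAt(i));
    }
  }
  return Uint8Array.from(bytes);
}

// Encode first, so that percent decoding only ever sees ASCII input,
// then run the UTF-8 decoder on the resulting bytes.
function decodeHost(input) {
  const ascii = utf8PercentEncodeNonAscii(input);
  return new TextDecoder("utf-8").decode(percentDecode(ascii));
}

console.log(utf8PercentEncodeNonAscii("☃")); // %E2%98%83
console.log(decodeHost("☃") === decodeHost("%E2%98%83")); // true
```

With this ordering a raw ☃ and an already-escaped %E2%98%83 end up as the same host string, and percent decoding never has to deal with code points above U+007F.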
https://github.com/whatwg/url/commit/3cfaa1779bfb9a3ba2b907a5802ca0251ca9a7e6