This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 26134 - The href attribute on a and area elements must have a value that is a "valid URL potentially surrounded by spaces" or
Status: RESOLVED WONTFIX
Alias: None
Product: WHATWG
Classification: Unclassified
Component: URL
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Anne
QA Contact: sideshowbarker+urlspec
URL: http://www.whatwg.org/specs/web-apps/...
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-06-18 10:41 UTC by contributor
Modified: 2014-06-25 20:09 UTC
CC List: 3 users

See Also:


Description contributor 2014-06-18 10:41:32 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/links.html
Multipage: http://www.whatwg.org/C#links-created-by-a-and-area-elements
Complete: http://www.whatwg.org/c#links-created-by-a-and-area-elements
Referrer: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html

Comment:
The href attribute on a and area elements must have a value that  is a "valid
URL potentially surrounded by spaces" or 
"valid URL potentially containing spaces" if attribute value is in
double-quoted or single-quoted state.

A string is a "valid URL potentially containing spaces" if, after escaping
Unicode Character 'SPACE' (U+0020) in it, it is a valid URL.

Posted from: 46.119.142.166
User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0
Comment 1 andrij 2014-06-18 11:05:49 UTC
The URL code points (http://url.spec.whatwg.org/#url-code-points) do NOT INCLUDE Unicode code point U+0020.
But U+0020 is very often used in names and SHOULD BE ALLOWED in the href of 'a' elements when the attribute value is in the double-quoted or single-quoted state.
If the href attribute of an 'a' element has a double-quoted or single-quoted value that contains spaces, the HTML validator reports an error.
Most user agents currently handle such values without problems.
Comment 2 Ian 'Hixie' Hickson 2014-06-18 21:13:22 UTC
I'm confused. What's the change you want made to the spec? Do you have a test showing how browsers handle the case you want allowed?
Comment 3 andrij 2014-06-19 06:32:58 UTC
The change is:
'The href attribute on 'a' and 'area' elements must have a value that  is a "valid
URL potentially surrounded by spaces" or 
"valid URL potentially containing spaces" if attribute value is in
double-quoted or single-quoted state.
A string is a "valid URL potentially containing spaces" if, after escaping
Unicode Character 'SPACE' (U+0020) in it, it is a valid URL.'

I have a live example.
Go to the page http://proridne.com/
Near the bottom of the page there is a link to the page 'Етюд для бандури № 18'
In the HTML it is:
<li><a href="/content/книги/Олег Курінний/Етюди для бандури/Етюд для бандури № 18.html" title="Етюд 18 — Олег Курінний">Етюд для бандури № 18</a></li>

If you click on it, the GET request that Firefox sends is:

curl 'http://proridne.com/content/%D0%BA%D0%BD%D0%B8%D0%B3%D0%B8/%D0%9E%D0%BB%D0%B5%D0%B3%20%D0%9A%D1%83%D1%80%D1%96%D0%BD%D0%BD%D0%B8%D0%B9/%D0%95%D1%82%D1%8E%D0%B4%D0%B8%20%D0%B4%D0%BB%D1%8F%20%D0%B1%D0%B0%D0%BD%D0%B4%D1%83%D1%80%D0%B8/%D0%95%D1%82%D1%8E%D0%B4%20%D0%B4%D0%BB%D1%8F%20%D0%B1%D0%B0%D0%BD%D0%B4%D1%83%D1%80%D0%B8%20%E2%84%96%2018.html' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: uk,en-us;q=0.8,en;q=0.5,ru;q=0.3' -H 'Connection: keep-alive' -H 'Cookie: _ga=GA1.2.211979494.1395091749; _ym_visorc_197771=w' -H 'DNT: 1' -H 'Host: proridne.com' -H 'Referer: http://proridne.com/' -H 'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0'

As you can see, spaces (U+0020) are replaced by %20, and the non-Latin characters are escaped as well.
Other modern browsers behave the same way.
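
The escaping seen in the request above can be reproduced with a minimal Python sketch. It assumes UTF-8 percent-encoding of every character outside the unreserved set, with "/" kept as the path separator. urllib.parse.quote is used here only as an approximation of what the browser does; it is not the URL Standard's parser.

```python
from urllib.parse import quote

# Path exactly as written in the href attribute of the example link.
raw_path = "/content/книги/Олег Курінний/Етюди для бандури/Етюд для бандури № 18.html"

# Percent-encode the UTF-8 bytes of each character, keeping "/" unescaped
# so that the path segments stay separated.
encoded_path = quote(raw_path, safe="/")

print(encoded_path)
```

Every space in the attribute value comes out as %20, and each Cyrillic letter becomes a pair of percent-encoded bytes.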
Comment 4 Ian 'Hixie' Hickson 2014-06-20 21:29:53 UTC
I still don't understand. Why do you think what the spec says disagrees with what browsers do? As far as I can tell, the spec requires that this:

   <base href="http://example.com">
   <a href=" A B "> ... </a>

...be resolved to:

   http://example.com/A%20B

...which is what browsers do.

Note that this has nothing to do with the text that says "The href attribute on a and area elements must have a value that is a valid URL potentially surrounded by spaces", which just means that you can put spaces before and after your URL in the attribute. It doesn't say anything about what browsers do with spaces in the attribute, especially not what they do with spaces in the middle of the attribute.
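
As a rough illustration of the resolution described above, here is a minimal Python sketch. The function name resolve_href and the constant HTML_SPACE_CHARACTERS are hypothetical, and the standard library's quote and urljoin only approximate the URL Standard's parser: leading and trailing whitespace is stripped, the remaining spaces are percent-encoded, and the result is resolved against the base URL.

```python
from urllib.parse import quote, urljoin

# Assumption: the characters HTML treats as whitespace around an attribute value.
HTML_SPACE_CHARACTERS = " \t\n\f\r"

def resolve_href(base: str, href: str) -> str:
    """Approximate how a browser turns an href value into an absolute URL."""
    # Step 1: drop whitespace before and after the URL.
    stripped = href.strip(HTML_SPACE_CHARACTERS)
    # Step 2: percent-encode what is left (including inner spaces), while
    # leaving URL delimiters and existing %XX escapes untouched.
    escaped = quote(stripped, safe="/:?#[]@!$&'()*+,;=%")
    # Step 3: resolve the reference against the base URL.
    return urljoin(base, escaped)

print(resolve_href("http://example.com", " A B "))
```

With the base and href from the example above, this sketch produces http://example.com/A%20B.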
Comment 5 andrij 2014-06-21 14:02:40 UTC
<a href=" A B "> ... </a> does NOT contain a VALID URL, because Unicode Character 'SPACE' (U+0020) is not one of the URL code points http://url.spec.whatwg.org/#url-code-points

The HTML validator will report an error for it.

See this example:
http://validator.w3.org/check?uri=http%3A%2F%2Fvolyn.su%2Fexample.html&charset=%28detect+automatically%29&doctype=Inline&ss=1&outline=1&group=0&verbose=1&user-agent=W3C_Validator%2F1.3+http%3A%2F%2Fvalidator.w3.org%2Fservices

If you use a link with a space in it, your link will not be correct (NOT A VALID URL).
Your link can be correct (VALID URL) only if you replace the spaces with '%20'.

<a href=" A B "> ... </a> NOT VALID URL
<a href=" A%20B "> ... </a> VALID URL

If the spec said:

'The href attribute on 'a' and 'area' elements must have a value that  is a "valid URL potentially surrounded by spaces" or "valid URL potentially containing spaces" if attribute value is in double-quoted or single-quoted state.
A string is a "valid URL potentially containing spaces" if, after escaping
Unicode Character 'SPACE' (U+0020) in it, it is a valid URL.'

then <a href=" A B "> ... </a> would be a VALID URL
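
The difference between the current rule and the proposed one can be sketched in Python. This is only an approximation under stated assumptions: looks_like_valid_url, passes_current_rule and passes_proposed_rule are hypothetical names, and the check merely tests whether every character is a URL code point, "#", or part of a %XX escape. It does not implement the full validity rules of the URL spec.

```python
import re

# Assumption: the characters HTML treats as whitespace around an attribute value.
HTML_SPACE_CHARACTERS = " \t\n\f\r"

# Approximation: ASCII URL code points plus "#", %XX escapes,
# and any code point from U+00A0 upwards. U+0020 is not included.
_URL_LIKE = re.compile(
    r"(?:[A-Za-z0-9!$&'()*+,\-./:;=?@_~#]|%[0-9A-Fa-f]{2}|[\u00A0-\U0010FFFD])*"
)

def looks_like_valid_url(text: str) -> bool:
    """Very rough stand-in for the URL spec's notion of a valid URL."""
    return _URL_LIKE.fullmatch(text) is not None

def passes_current_rule(value: str) -> bool:
    """Valid URL potentially surrounded by spaces (current HTML wording)."""
    return looks_like_valid_url(value.strip(HTML_SPACE_CHARACTERS))

def passes_proposed_rule(value: str) -> bool:
    """Valid URL potentially containing spaces (the wording proposed here)."""
    stripped = value.strip(HTML_SPACE_CHARACTERS)
    return looks_like_valid_url(stripped.replace(" ", "%20"))

for value in (" A B ", " A%20B "):
    print(repr(value), passes_current_rule(value), passes_proposed_rule(value))
```

Under these assumptions " A B " fails the current rule but passes the proposed one, while " A%20B " passes both.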
Comment 6 Ian 'Hixie' Hickson 2014-06-23 20:39:55 UTC
There's nothing the HTML spec can say which would make "A B" a valid URL; the URL spec is what defines what's a valid URL, and it says spaces aren't valid.

   http://url.spec.whatwg.org/

Is what you're asking for to change the URL syntax to allow spaces in them?
Comment 7 andrij 2014-06-24 04:57:44 UTC
http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html#syntax-attribute-value

The attribute value must not contain a space if you use the unquoted attribute value syntax.
But a space is allowed if you use the single-quoted or double-quoted attribute value syntax.

Prohibiting spaces in the href attribute value is appropriate for the unquoted attribute value syntax, but not for the single-quoted or double-quoted attribute value syntax.

http://url.spec.whatwg.org/ does not allow spaces in URLs.
But my examples show that addresses with spaces work fine in modern browsers, even though they contradict the specifications.

Therefore, if the HTML spec says "The href attribute on a and area elements must have a value that is a valid URL potentially surrounded by spaces", then it could additionally say "The href attribute on a and area elements in double-quoted or single-quoted state must have a value that is a valid URL potentially containing spaces (a valid URL potentially containing spaces is a string that, after escaping the spaces in it, is a valid URL)"
Then the HTML code would be valid.
And people could use natural addresses such as "A B C" instead of "A%20B%20C"
Comment 8 Ian 'Hixie' Hickson 2014-06-25 15:34:17 UTC
The text you quote from the HTML spec is about the general attribute syntax, it's subsumed by the URL syntax rules. (You can have spaces in quoted attributes only if the content of the attribute's specific syntax allows it as well.)

Anyway it sounds like what you're saying is that you want URLs to be allowed to contain spaces, which isn't up to HTML. Reassigning to the URL spec.
Comment 9 Anne 2014-06-25 17:29:12 UTC
The URL syntax is intentionally constrained so as to make URLs more portable without having to adjust them for different environments.
Comment 10 andrij 2014-06-25 19:51:50 UTC
Only the HTML specification, not the URL spec, can say what the value of the href attribute should be.
And only the HTML specification can require a user agent to convert (escape) the spaces in the href attribute value so that the value becomes a valid URL.
Comment 11 Anne 2014-06-25 19:56:18 UTC
Ian already indicated HTML defers to URL on this topic.
Comment 12 andrij 2014-06-25 20:09:03 UTC
"A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid URL."
"When a user agent is to strip leading and trailing whitespace from a string, the user agent must remove all space characters that are at the start or end of the string."  - This is NOT what the URL spec says. It is what the HTML spec says.

An HTML attribute value need not be a URL itself; it may become a valid URL only after whitespace is stripped. That rule does not come from the URL spec.

So a user agent could not only strip whitespace but also escape spaces, and only the HTML spec can say so, not the URL spec.

feel the difference