This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 6534 - validator throws error on valid html
Summary: validator throws error on valid html
Status: RESOLVED INVALID
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: HEAD
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
URL: http://validator.w3.org/check
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-06 03:43 UTC by Austin Guthals
Modified: 2009-02-07 00:31 UTC

Description Austin Guthals 2009-02-06 03:43:07 UTC
I am using the latest version of the validator.

The problem occurs when I have a link that passes parameters to a server page.

For example consider the following:

______________________________________________________________

<a href="Test.aspx?param1=0&param2=131231234123&param3=fsdf221"><img src="AutoImage.aspx?a1=0&ss2=131231234123&h1=fsdf221" /></a>

______________________________________________________________

The above is perfectly valid HTML. According to the W3C standard at http://www.w3.org/TR/html4/interact/forms.html#successful-controls:

  "The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'"

The validator throws an error.

The validator wants the '&' in the href and src to be encoded as &amp;.

This is not correct, because the '&' is used by the server to separate the named parameters; any value containing an '&' should be encoded with the proper ASCII value %26, not &amp;.

This error is all wrong. If you use &amp; in your URL links, all browsers will pass the &amp; to the server, and the server is expecting '&' to separate named parameters.
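
The two encodings mentioned in this report operate at different layers, and a short sketch may make the distinction concrete. The Python below is only an illustration and is not part of the original report: the parameter values are hypothetical, and Test.aspx is borrowed from the example above. Percent-encoding (%26) protects a literal ampersand inside a value, while &amp; escapes the delimiter when the finished URL is written into an HTML attribute.

```python
from html import escape, unescape
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical parameters; the first value contains a literal ampersand.
params = {"param1": "a&b", "param2": "131231234123"}

# URL layer: urlencode percent-encodes the '&' inside the value as %26
# and joins the name/value pairs with a bare '&'.
url = "Test.aspx?" + urlencode(params)

# HTML layer: the delimiter '&' is written as &amp; inside the attribute.
href_attribute = escape(url, quote=True)

# An HTML parser reverses the HTML layer before the URL is requested.
requested_url = unescape(href_attribute)

# The server then splits on '&' and percent-decodes each value.
received = parse_qs(urlsplit(requested_url).query)

print(href_attribute)
print(received)
```

In this sketch the server still sees a bare '&' between the pairs; the &amp; form exists only in the HTML source.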
Comment 1 Olivier Thereaux 2009-02-06 15:06:34 UTC
Hello,

This is a FAQ:
http://validator.w3.org/docs/help.html#faq-ampersand
Comment 2 Austin Guthals 2009-02-06 18:35:22 UTC
I read the FAQ.

The FAQ is not correct either.

<a href="foo.cgi?chapter=1&section=2&copy=3&lang=en">Test</a>

is perfectly valid because &copy will not get translated to the copyright sign in the browser's URL. You only need to encode ampersands if they will be displayed in the content of your page.

Here is how it would work:

<a href="foo.cgi?chapter=1&section=2&copy=3&lang=en">foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en</a>

Here the characters that will be painted in the content of the page must be encoded; however, the URL does not need to be encoded.

If I am mistaken, please give me an example of this link 


<a href="foo.cgi?chapter=1&section=2&copy=3&lang=en">foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en</a>

that validates with your checker and that works with a web server that is expecting '&' to delimit named parameters.
Comment 3 Olivier Thereaux 2009-02-06 20:00:41 UTC
(In reply to comment #2)
> The FAQ is not correct either.
> <a href="foo.cgi?chapter=1&section=2&copy=3&lang=en">Test</a>
> is perfectly valid 

Saying it is valid doesn't make it so. Ampersands are a bother for anyone writing HTML. We all deal with it.

> however the url does not need to be encoded.

Yes, it does. Look for "ampersand url" in your favorite search engine and you'll get the explanation in thousands of ways.

> If I am mistaken, please give me an example[...]

 <a href="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en">...
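
The decoding step can be checked outside a browser. The sketch below is an illustration, not something posted in the thread: it uses Python's standard html.parser module as a stand-in for a browser's HTML parser, and HrefCollector is a hypothetical helper name. It feeds in the escaped link and prints the URL and query parameters that result.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qs


class HrefCollector(HTMLParser):
    """Collects the decoded href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs holds (name, value) pairs; character references such as
        # &amp; in the value have already been decoded by the parser.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


markup = '<a href="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en">Test</a>'

collector = HrefCollector()
collector.feed(markup)
collector.close()

decoded_href = collector.hrefs[0]
query_parameters = parse_qs(urlsplit(decoded_href).query)

print(decoded_href)      # foo.cgi?chapter=1&section=2&copy=3&lang=en
print(query_parameters)
```

The &amp; form appears only in the markup; the URL handed to the network layer contains plain '&' delimiters.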
Comment 4 Austin Guthals 2009-02-06 21:58:16 UTC
After further testing, I have found that encoding ampersands in the href attribute of an a element works correctly in most browsers; however, other elements and attributes do not work so well.

<a href="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en">...</a>

will properly decode &amp; to '&' in the URL.

However, the following example does not get decoded correctly with IE7 or Firefox:

<img src="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en"></img>

The above example validates in your checker; however, no modern browser properly translates the &amp; into '&', so the following gets sent to the server:

foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en

Since servers are looking for &, you end up with the following parameters:

chapter=1
amp;section=2
amp;copy=3
amp;lang=en
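
The split listed above can be reproduced with a few lines of Python. This is a minimal sketch assuming a naive server-side splitter; split_query is a hypothetical helper, not the code of any particular web server. It shows what a server would see if an undecoded query string ever reached it, next to the decoded form.

```python
def split_query(query):
    """Split a raw query string on '&' and then on the first '='."""
    pairs = []
    for field in query.split("&"):
        name, _, value = field.partition("=")
        pairs.append((name, value))
    return pairs


# Query string as it would look if &amp; were never decoded.
undecoded = "chapter=1&amp;section=2&amp;copy=3&amp;lang=en"

# Query string after an HTML parser has decoded &amp; to '&'.
decoded = "chapter=1&section=2&copy=3&lang=en"

print(split_query(undecoded))
print(split_query(decoded))
```

Only the undecoded form yields names prefixed with "amp;", so seeing such names on the server suggests that something between the markup and the request failed to decode, or re-encoded, the attribute value.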

As you can see, this is incorrect. Your validator should not throw errors if there is an '&' in the path of an image, because browsers will send the URL verbatim to the server; there is no decoding going on.

I can create a sample web page to demonstrate if you do not understand what is going on.

Basically, I am streaming images from the server, and the W3C validator is throwing bogus errors for which there are no workarounds or fixes.
Comment 5 Olivier Thereaux 2009-02-06 22:44:17 UTC
(In reply to comment #4)
> <img src="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en"></img>

Have you tried <img src="" ... />?
That's the actual XHTML syntax.



> Basically, I am streaming images from the server, and the W3C validator is
> throwing bogus errors for which there are no workarounds or fixes.

An old favorite quote of mine from the validator's mailing list is:
"Nothing wrong with the validator here, it just knows HTML better than you do."
The quote is not meant to be patronizing, but whether we like it or not, these errors are not "bogus".

I am not sure why your script/browser is not working in this particular instance, but I can assure you that when you write HTML, ampersands have to be escaped. Always.
Comment 6 Austin Guthals 2009-02-07 00:10:51 UTC
After performing even more tests, I have found that the documentation on your site is correct. When I use simple, pure HTML cases, I can demonstrate the proper functionality.

The reason I was having trouble was because I used a server-based control that rendered the HTML for me, and this control was encoding my URL in a way that I was not aware of. Here is what it was doing:

 <img src="foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en"/>

was actually getting rendered as


<img src="foo.cgi?chapter=1&amp%3Bsection=2&amp%3Bcopy=3&amp%3Blang=en">
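
The thread does not say how the control produced this output, so the following Python is only a guess at the mechanism, offered as a sketch. It assumes the control percent-encoded the already-escaped attribute value while treating '?', '=' and '&' as safe, which would turn the ';' of each &amp; into %3B. The second half shows one possible way to avoid the double encoding; it is not necessarily how the issue was resolved here.

```python
from html import escape
from urllib.parse import quote

# Attribute value as written by the page author (already HTML-escaped).
authored_src = "foo.cgi?chapter=1&amp;section=2&amp;copy=3&amp;lang=en"

# Assumed behavior of the control: percent-encode everything except
# characters it considers safe, which here leaves '?', '=' and '&' alone
# but turns ';' into %3B.
rendered_src = quote(authored_src, safe="?=&")

# One possible way to avoid the double encoding: hand over the raw URL
# and apply HTML escaping exactly once, where the attribute is written.
raw_url = "foo.cgi?chapter=1&section=2&copy=3&lang=en"
escaped_once = escape(raw_url, quote=True)

print(rendered_src)
print(escaped_once)
```

Under these assumptions, rendered_src matches the markup shown above, and escaped_once is the form the validator accepts.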


Sorry for troubling you.
Comment 7 Olivier Thereaux 2009-02-07 00:31:36 UTC
(In reply to comment #6)
> The reason I was having trouble was because I used a server-based control that
> rendered the HTML for me, and this control was encoding my URL in a way that I
> was not aware of.

How did you fix this? That might be of interest to other ASP (.NET?) users running into the issue.