This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 922 - Wrong content-type detected with broken HTTP server
Summary: Wrong content-type detected with broken HTTP server
Status: RESOLVED INVALID
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.6.7
Hardware: PC other
Importance: P5 normal
Target Milestone: 1.0
Assignee: Olivier Thereaux
QA Contact: qa-dev tracking
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-10-20 01:17 UTC by Nuno
Modified: 2004-12-02 14:56 UTC

Description Nuno 2004-10-20 01:17:58 UTC
I used the w3c validator before to validate my php pages, and they validated
correctly, so I added the w3c and css validator logos (css validated ok as
well). Clicking on the w3c html validator logo on my php validated page, sent me
to the w3c site with a no errors found validation result.

But now, yesterday and today I am trying to validate my php pages again, just by
clicking the w3c logo, or by entering the url on the w3c site, and I can't get
the result as before. I get a yellow message starting with: "Sorry, I am unable
to validate this document because its content type is <my site's url>, which is
not currently supported by this service." I tried the absolute url (without file
names, only directories), and the url with the index.php file name.

Why? Aren't php pages supported anymore? I just want to validate the html, just
like before, and before it worked all right. Does w3c validator have a new
version that doesn't support php?
Comment 1 Olivier Thereaux 2004-10-20 02:43:14 UTC
Are you sure the message stated "... because its content type is <your site's url>" ?

Your Web server sends a "content-type" header specifying what kind of document it is serving. For html 
(whether the html is static or generated by php), that header should have a value of either 
text/html or application/xhtml+xml, and in any case it should *never* have the url of a site in there.

Please clarify by copying the full message, giving the url of the site, or both, so that we can diagnose 
further. My preliminary diagnosis is that there is an error on your side (possibly a bad 
header() call in your php).
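
As a rough illustration of the check described in this comment, the sketch below takes a Content-Type header value, drops any parameters such as charset, and tests the remaining media type against the two types named above. The names SUPPORTED, media_type and is_supported are hypothetical, and this is not the validator's own code.

```python
# Hypothetical helper names; only the two media types come from the comment above.
SUPPORTED = {"text/html", "application/xhtml+xml"}


def media_type(content_type_value):
    """Return the lowercased media type from a Content-Type header value."""
    # Parameters such as charset follow the first ';' and are ignored here.
    return content_type_value.split(";", 1)[0].strip().lower()


def is_supported(content_type_value):
    """Return True if the header value names one of the supported types."""
    return media_type(content_type_value) in SUPPORTED
```

With these definitions, is_supported("text/html; charset=ISO-8859-1") is True, while a value that is really a URL, such as the placeholder "http://example.org/index.php", is rejected.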
Comment 2 Nuno 2004-10-20 11:58:05 UTC
Here is a line that is in all my php files:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

So, you can see that the content type is specified as text/html. And, of course,
this line doesn't have the url of a site in it, I try to follow all the w3c
recommendations as much as possible.

The error being on my side is out of the question (and I currently don't have
any header() calls in my php code to send the content type), because my php
pages validated well before (some weeks ago), and I didn't change or update my
site too much. Only in the previous couple of days, I couldn't validate as
before due to the error I get. Could it be because my free web hosting service
recently changed all users' data to a new, bigger, better, faster and more
reliable server? I'm using http://www.100webspace.com as web hosting for my php
sites.

Now I'll try to give you all the information I can. The full error message, with
a yellow background, that I get is:

"Sorry, I am unable to validate this document because its content type is
http://nunoanjos.freeprohost.com/index.php?pagina=1, which is not currently
supported by this service.

The Content-Type field is sent by your web server (or web browser if you use the
file upload interface) and depends on its configuration. Commonly, web servers
will have a mapping of filename extensions (such as ".html") to MIME
Content-Type values (such as text/html).

That you recieved this message can mean that your server is not configured
correctly, that your file does not have the correct filename extension, or that
you are attempting to validate a file type that we do not support yet. In the
latter case you should let us know that you need us to support that content type
(please include all relevant details, including the URL to the standards
document defining the content type) using the instructions on the Feedback Page."

I get this message with index.php?pagina=1 in the address, just index.php, or no
file name. Before, when it worked all right, I could get a normal validation
without errors with any address: index.php, file name with a query string, or
just the ...prohost.com/ (directory).

My site's address is http://nunoanjos.freeprohost.com, you can check any page
source. Previously, maybe a few weeks ago (back then, in the old and overloaded
server), all my site's php pages validated well, as I told you. Maybe it's their
new server that isn't properly configured to let php files be validated. If
this is the case, I can use their support service (which seems very good) to ask
them about it.
Comment 3 Björn 2004-10-20 12:31:39 UTC
FWIW, http://nunoanjos.freeprohost.com/index.php?pagina=1 responds with only 
the entity body, no response line, no headers...
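
One way to spot the situation described in this comment is to look at the first line of the raw bytes the server returned and see whether it has the shape of an HTTP status line. The sketch below assumes the raw response is already available as bytes; has_status_line is a hypothetical helper, not part of the validator.

```python
def has_status_line(raw):
    """Return True if the raw response bytes begin with an HTTP status line."""
    first_line = raw.split(b"\n", 1)[0].rstrip(b"\r")
    parts = first_line.split(None, 2)
    # A status line looks like: HTTP/1.1 200 OK
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and len(parts[1]) == 3
        and parts[1].isdigit()
    )
```

A response that starts directly with markup fails this test, which gives a client a chance to report "no response line, no headers" instead of guessing at a content type.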
Comment 4 Nuno 2004-10-20 12:48:50 UTC
Yes, Björn, it seems to respond without sending the content type header to the
browser, but it shouldn't, I have a meta tag to choose the content type inside
the <head></head> tags of every page. Can anyone suggest a possible cause and a
way to fix it? Thanks.
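
A meta element is part of the entity body, so it cannot stand in for a Content-Type header that the server never sent. The sketch below reads both values from a raw response to show that they are independent: the first comes from the header block, the second from the markup. The names MetaContentType and declared_types are hypothetical, and the input is assumed to be a well-formed response held in a string.

```python
from html.parser import HTMLParser


class MetaContentType(HTMLParser):
    """Collect the CONTENT of a <meta http-equiv="Content-Type"> element."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        attr = dict(attrs)
        if tag == "meta" and (attr.get("http-equiv") or "").lower() == "content-type":
            self.value = attr.get("content")


def declared_types(raw_response):
    """Return (header_value, meta_value) for a raw HTTP response string."""
    head, _, body = raw_response.partition("\r\n\r\n")
    header_value = None
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "content-type":
            header_value = value.strip()
    parser = MetaContentType()
    parser.feed(body)
    return header_value, parser.value
```

For a response whose headers lack Content-Type but whose body contains the meta line quoted in comment 2, the header value comes back as None even though the meta value is present.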
Comment 5 Nuno 2004-10-20 20:42:40 UTC
Thanks Olivier and Björn for your help. Now, some hours later, the validation is
working well again.

The server is new, and they must be testing and changing it, that's why I got
the error. It wasn't related to W3C and its validators at all.

I'm marking this bug as resolved and invalid, since it is not a W3C bug, but my
web hosting service's bug.
Comment 6 Olivier Thereaux 2004-10-21 03:24:40 UTC
I am tempted to re-open the issue, if only to understand what happened in the validator's code that 
would result in such a weird content-type detection. 

Granted, the HTTP server on the other hand was/is not playing by the book, and the markup validator is 
not an HTTP validator, but the error message is, at best, misleading.
Comment 7 Nuno 2004-10-21 11:35:56 UTC
Since you are re-opening the bug, and I am also curious about this issue, I can
provide you with more (and new) information.

By the time I started this bug, 2004-10-20 01:17, I was getting the error
message with a yellow background. From that message, the possible explanation is:

"That you recieved this message can mean that your server is not configured
correctly, that your file does not have the correct filename extension, or that
you are attempting to validate a file type that we do not support yet."

From these 3 options, I think the last one (file type not yet supported) was
never valid for this case, so that leaves the first 2 possible.

And the last option was never possible, because I validated PHP pages before,
and it worked ok. What is validated is the HTML code generated by the PHP
scripts, along with the static HTML code.

This also explains the second option, the file not having a correct filename
extension. I validated PHP files before, so this is not possible.

Therefore, I think that only "your server is not configured correctly" could be
the correct explanation.

Between 2004-10-20 12:48 and 2004-10-20 20:42, I tried to validate the site
again a few times. And, once, I got a new error. I didn't copy the text, or take
a screenshot for example, but from what I can remember, it looked more or less
like this:

Error 500: Bad chunk-size in HTTP response: <line number (I don't remember which)>

Some hours later, shortly before my 2004-10-20 20:42 conclusion, I tried the
validation again, and, to my surprise, the validation is normal again, and all
my pages validate without errors (meaning W3C compliance)...

Some days ago, I couldn't even access my pages; there is news on my web
hosting provider's site, http://www.100webspace.com/, about "DDoS attacks towards
everywebhost.com and mybesthost.com". The server is new, and they must be
configuring it. So, this could be the most likely cause of the bug first reported.
Thank you, Olivier, for your interest.
Comment 8 Nuno 2004-10-21 11:55:12 UTC
Now that I think about it more, how can this be possible? At all times, the site
was loaded and shown correctly by a browser. When I viewed any page's source, the
HTML code was complete and seemed to be interpreted well by the browser, and I think the
<head></head> section was processed as well, because the <title></title> tags
took effect (the title was shown at the top of the browser window).

And still, at this same time, the W3C validator gave me two different kinds of
errors, being unable to validate my site. That is what I don't understand. If
the browser was getting the full HTML code and processing it correctly,
shouldn't the W3C validator do the same? I don't know exactly how the validator
works, but I guess it behaves like the browser, parsing line by line, and trying
to find known tags to check if they comply.

If the W3C validator received the same HTML code as the browser, how can the
browser execute it well and the W3C be unable to parse it? Maybe, with the
information given by me, the W3C validator authors could understand better than
I why this happened.
Comment 9 Nuno 2004-10-21 11:58:58 UTC
Sorry, when I mentioned this error "Error 500: Bad chunk-size in HTTP response:
<line number (I don't remember which)>", there was also something about EOF (end
of file), maybe EOF not expected or something like that.
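
An error of this kind is what a client typically raises when a response claims chunked transfer encoding but the body does not follow the chunk format. The sketch below is a minimal chunked-body decoder that fails on a bad chunk size or an unexpected end of data. It is an illustration only, not the code of the HTTP library the validator uses, and decode_chunked and its error messages are hypothetical.

```python
import re

HEX = re.compile(rb"[0-9A-Fa-f]+")


def decode_chunked(data):
    """Decode a chunked message body, raising ValueError on malformed input."""
    body = b""
    pos = 0
    while True:
        eol = data.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("unexpected EOF while reading chunk size")
        # Chunk extensions after ';' are ignored.
        size_field = data[pos:eol].split(b";", 1)[0].strip()
        if not HEX.fullmatch(size_field):
            raise ValueError("bad chunk size: %r" % size_field)
        size = int(size_field, 16)
        pos = eol + 2
        if size == 0:
            return body  # last chunk; trailers, if any, are ignored here
        chunk = data[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("unexpected EOF inside chunk")
        body += chunk
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

A server that sends plain markup where a hexadecimal chunk size is expected trips the "bad chunk size" branch, and a connection that is cut short trips one of the EOF branches.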
Comment 10 Olivier Thereaux 2004-11-16 01:18:34 UTC
Probably won't make it for 0.7.0, targeting 1.0, and making low priority (edge case).
Comment 11 Takuya Murata 2004-11-21 05:52:03 UTC
I thought this might help. The other day, I was writing my very simple http server and I got the same 
problem as described here. The weird thing was my browser has no problem showing a page from my 
server but the validator says the content type is my url and is not valid just like what the poster says.

Anyway, after spending hours, it turned out the validator sends an HTTP header with Connection: close, 
and that triggers my server to close the connection prematurely (even without sending anything). This 
is a bug in my server, but in many cases HTTP clients expect the server to keep the connection open, so this 
was a kind of blind spot for me.

True, the error message needs to be more helpful for broken servers.
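
The server-side mistake described in this comment can be avoided by treating Connection: close as "close after the response has been sent" and never as "close now". The sketch below builds a complete response first and returns a flag telling the caller whether to close the socket afterwards. handle_request, the fixed response body, and the overall structure are assumptions made for illustration.

```python
def handle_request(raw_request):
    """Return (response_bytes, close_after_send) for a raw HTTP/1.1 request."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    headers = {}
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    # Connection may carry several comma-separated tokens, e.g. "TE, close".
    tokens = [t.strip().lower() for t in headers.get("connection", "").split(",")]
    close_after_send = "close" in tokens

    body = b"<!DOCTYPE html><title>ok</title>"
    response = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
        + (b"Connection: close\r\n" if close_after_send else b"")
        + b"\r\n"
        + body
    )
    # The caller writes `response` in full first, and only then closes the
    # socket if close_after_send is True.
    return response, close_after_send
```

Keeping the close decision separate from building the response makes it hard to drop the connection before the status line and headers have gone out.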
Comment 12 Terje Bless 2004-11-28 16:13:51 UTC
This is caused by parsing logic blowing up somewhere in LWP and propagating
up to our code. I'd suggest closing this as INVALID as it isn't really our
bug (neither cause nor symptom is in our code).

Possibly this could be dealt with in our code iff we add some level of HTTP
checking, but meanwhile it's not really something we want to deal with.


Reassigning to Olivier; I'm not touching this one so I'll leave it to you
to make a call on what to do with the Bug.
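
The "some level of HTTP checking" mentioned in this comment could be as small as confirming that the reported content type has the type/subtype shape of a media type before quoting it back to the user. The sketch below is one possible form of such a check and not a description of the validator's code. describe_content_type and the wording of its messages are hypothetical.

```python
import re

# RFC 7230 "token" characters, used for both the type and the subtype.
TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
MEDIA_TYPE = re.compile(r"%s/%s" % (TOKEN, TOKEN))


def describe_content_type(value):
    """Return a user-facing message for an unsupported Content-Type value."""
    if value is None or not value.strip():
        return "The server did not send a Content-Type header."
    essence = value.split(";", 1)[0].strip()
    if not MEDIA_TYPE.fullmatch(essence):
        # Not a media type at all, so the response itself is suspect.
        return ("The server response could not be parsed as HTTP "
                "(got %r where a media type was expected)." % essence)
    return "Content type %s is not currently supported." % essence.lower()
```

Given the URL from the original report as input, this takes the "could not be parsed" branch, because ":" is not allowed in a media type token.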
Comment 13 Björn 2004-11-28 16:20:42 UTC
+1 for closing as invalid.
Comment 14 Olivier Thereaux 2004-12-02 14:56:04 UTC
Closing, will see if I can report this to the LWP bug database.