This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
I used the W3C validator before to validate my PHP pages, and they validated correctly, so I added the W3C HTML and CSS validator logos (the CSS validated OK as well). Clicking the W3C HTML validator logo on my validated PHP page sent me to the W3C site with a "no errors found" validation result. But yesterday and today I have been trying to validate my PHP pages again, either by clicking the W3C logo or by entering the URL on the W3C site, and I can't get the same result as before. I get a yellow message starting with: "Sorry, I am unable to validate this document because its content type is <my site's url>, which is not currently supported by this service." I tried the absolute URL (without file names, only directories) and the URL with the index.php file name. Why? Aren't PHP pages supported anymore? I just want to validate the HTML, just like before, when it worked all right. Does the W3C validator have a new version that doesn't support PHP?
Are you sure the message stated "... because its content type is <your site's url>"? Your Web server sends a "Content-Type" header specifying what kind of document it is serving. In the case of HTML (whether static or generated by PHP), it should be either text/html or application/xhtml+xml, and in any case it should *never* contain the URL of a site. Please clarify by copying the full message or giving the URL of the site (or both), so that we can diagnose further. My preliminary diagnosis is that there is an error on your side (possibly a bad header() call in your PHP).
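To illustrate the check described above, here is a minimal Python sketch of how a client might decide whether a Content-Type value is acceptable. It is not the validator's actual code: the names `SUPPORTED_TYPES`, `media_type`, and `is_supported` are hypothetical, and the set lists only the two types named in this comment.

```python
# Hypothetical sketch, not the validator's implementation.
# Assumption: only the two media types named in this thread are accepted.
SUPPORTED_TYPES = {"text/html", "application/xhtml+xml"}


def media_type(content_type_header):
    """Return the bare media type from a Content-Type value, lowercased.

    Parameters such as "; charset=ISO-8859-1" are dropped.
    """
    return content_type_header.split(";", 1)[0].strip().lower()


def is_supported(content_type_header):
    """True if the header value names one of the assumed supported types."""
    return media_type(content_type_header) in SUPPORTED_TYPES
```

With this sketch, `is_supported("text/html; charset=ISO-8859-1")` is true, while a value that is really a URL, such as the one in the reported message, is rejected.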
Here is a line that is in all my PHP files: <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> So, you can see that the content type is specified as text/html. And, of course, this line doesn't have the URL of a site in it; I try to follow all the W3C recommendations as much as possible. The error being on my side is out of the question (and I currently don't have any header() calls in my PHP code to send the content type), because my PHP pages validated well before (some weeks ago), and I haven't changed or updated my site much. Only in the last couple of days have I been unable to validate as before, due to the error I get. Could it be because my free web hosting service recently moved all users' data to a new, bigger, better, faster and more reliable server? I'm using http://www.100webspace.com as web hosting for my PHP sites. Now I'll try to give you all the information I can. The full error message, with a yellow background, that I get is: "Sorry, I am unable to validate this document because its content type is http://nunoanjos.freeprohost.com/index.php?pagina=1, which is not currently supported by this service. The Content-Type field is sent by your web server (or web browser if you use the file upload interface) and depends on its configuration. Commonly, web servers will have a mapping of filename extensions (such as ".html") to MIME Content-Type values (such as text/html). That you recieved this message can mean that your server is not configured correctly, that your file does not have the correct filename extension, or that you are attempting to validate a file type that we do not support yet. In the latter case you should let us know that you need us to support that content type (please include all relevant details, including the URL to the standards document defining the content type) using the instructions on the Feedback Page." I get this message with index.php?pagina=1 in the address, just index.php, or no file name.
Before, when it worked all right, I could get a normal validation without errors with any address: index.php, a file name with a query string, or just the ...prohost.com/ (directory). My site's address is http://nunoanjos.freeprohost.com; you can check any page source. Previously, maybe a few weeks ago (back then, on the old, overloaded server), all my site's PHP pages validated well, as I told you. Maybe it's their new server that isn't properly configured to let PHP files be validated. If this is the case, I can use their support service (which seems very good) to ask them about it.
FWIW, http://nunoanjos.freeprohost.com/index.php?pagina=1 responds with only the entity body, no response line, no headers...
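A rough Python sketch of what "no response line, no headers" means on the wire. The helper name `split_response` is made up for illustration, and any sample bytes fed to it are invented, not captured from the site. It also suggests why the <META HTTP-EQUIV> tag mentioned earlier does not help here: that tag sits inside the entity body, not in the HTTP headers.

```python
def split_response(raw):
    """Split raw response bytes into (status_line, headers, body).

    Hypothetical helper for illustration. If the data does not begin with
    an HTTP status line, status_line is None, headers is empty, and
    everything is treated as entity body.
    """
    if not raw.startswith(b"HTTP/"):
        return None, {}, raw

    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    status_line = lines[0].decode("iso-8859-1")

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.decode("iso-8859-1").strip().lower()] = (
            value.decode("iso-8859-1").strip()
        )
    return status_line, headers, body
```

For a well-formed response the Content-Type is found under `headers["content-type"]`. For a response that starts directly with markup, there is no header dictionary to look in at all, whatever the markup says about itself.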
Yes, Björn, it seems to respond without sending the content type header to the browser, but it shouldn't, I have a meta tag to choose the content type inside the <head></head> tags of every page. Can anyone suggest a possible cause and a way to fix it? Thanks.
Thanks Olivier and Björn for your help. Now, some hours later, the validation is working well again. The server is new, and they must be testing and changing it, that's why I got the error. It wasn't related to W3C and its validators at all. I'm marking this bug as resolved and invalid, since it is not a W3C bug, but my web hosting service's bug.
I am tempted to re-open the issue, if only to understand what happened in the validator's code that would result in such a weird content-type detection. Granted, the HTTP server on the other hand was/is not playing by the book, and the markup validator is not an HTTP validator, but the error message is, at best, misleading.
Since you are re-opening the bug, and I am also curious about this issue, I can provide you with more (and new) information. When I opened this bug, 2004-10-20 01:17, I was getting the error message with a yellow background. From that message, the possible explanation is: "That you recieved this message can mean that your server is not configured correctly, that your file does not have the correct filename extension, or that you are attempting to validate a file type that we do not support yet." Of these 3 options, I think the last one (file type not yet supported) was never valid for this case, because I validated PHP pages before, and it worked OK. What is validated is the HTML code generated by the PHP scripts, along with the static HTML code. The same reasoning rules out the second option, the file not having a correct filename extension: I validated PHP files before, so this is not possible. Therefore, I think that only "your server is not configured correctly" could be the correct explanation. Between 2004-10-20 12:48 and 2004-10-20 20:42, I tried to validate the site again a few times. And, once, I got a new error. I didn't copy the text or take a screenshot, but from what I can remember, it looked more or less like this: Error 500: Bad chunk-size in HTTP response: <line number (I don't remember which)> Some hours later, shortly before my 2004-10-20 20:42 conclusion, I tried the validation again and, to my surprise, the validation was normal again, and all my pages validated without errors (meaning W3C compliance)... Some days ago, I couldn't even access my pages; there is news on my web hosting provider's site, http://www.100webspace.com/, about "DDoS attacks towards everywebhost.com and mybesthost.com". The server is new, and they must be configuring it. So this is the most likely cause of the bug first reported. Thank you, Olivier, for your interest.
Now that I think more about it, how can it be possible? At all times, the site was loaded and correctly shown by a browser. When I viewed any page source, the HTML code was complete and seemed well interpreted by the browser, and I think the <head></head> section was processed as well, because the <title></title> tags took effect (the title was shown at the top of the browser window). And still, at this same time, the W3C validator gave me two different kinds of errors, being unable to validate my site. That is what I don't understand. If the browser was getting the full HTML code and processing it correctly, shouldn't the W3C validator do the same? I don't know exactly how the validator works, but I guess it behaves like the browser, parsing line by line, and trying to find known tags to check if they comply. If the W3C validator received the same HTML code as the browser, how can the browser render it well and the W3C validator be unable to parse it? Maybe, with the information given by me, the W3C validator authors could understand better than I why this happened.
Sorry, when I mentioned this error "Error 500: Bad chunk-size in HTTP response: <line number (I don't remember which)>", there was also something about EOF (end of file), maybe EOF not expected or something like that.
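One plausible way a "bad chunk-size" style error can arise is a client trying to read a body as HTTP/1.1 chunked transfer coding when the bytes do not follow that framing. The Python sketch below is only an illustration of that idea: it is not LWP's code, `decode_chunked` is a hypothetical name, and it ignores trailers.

```python
import string


def decode_chunked(body):
    """Decode a chunked body (hypothetical sketch; trailers are ignored).

    Raises ValueError if a chunk-size line is not hexadecimal or if the
    data ends before the framing says it should.
    """
    out = b""
    pos = 0
    while True:
        eol = body.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("unexpected end of data while reading chunk size")

        # A chunk-size line may carry extensions after ';', which we drop.
        size_text = body[pos:eol].split(b";", 1)[0].strip().decode("ascii", "replace")
        if not size_text or any(c not in string.hexdigits for c in size_text):
            raise ValueError("bad chunk size: %r" % size_text)
        size = int(size_text, 16)

        pos = eol + 2
        if size == 0:
            return out  # the zero-size chunk ends the body
        if pos + size > len(body):
            raise ValueError("unexpected end of data inside chunk")
        out += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
```

Fed properly framed data, the function returns the joined chunks. Fed plain markup where a chunk-size line is expected, it fails on the first line, and truncated data produces the end-of-data error instead.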
Probably won't make it for 0.7.0, targeting 1.0, and making low priority (edge case).
I thought this might help. The other day, I was writing my very simple HTTP server and I got the same problem as described here. The weird thing was that my browser had no problem showing a page from my server, but the validator said the content type was my URL and was not valid, just as the poster describes. Anyway, after spending hours on it, it turned out that the validator sends an HTTP header with Connection: close, and that triggered my server to close the connection prematurely (even without sending anything). This is a bug in my server, but in many cases HTTP clients expect the server to keep the connection open, so this was a kind of blind spot for me. True, the error message needs to be more helpful for broken servers.
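For anyone wanting to test their own server against this case, here is a small Python sketch that builds a request carrying Connection: close and reads back whatever the server sends. The function names `build_request` and `fetch_raw` are invented for this example, and the request is deliberately minimal, so it does not reproduce the validator's real headers.

```python
import socket


def build_request(host, path="/", close=True):
    """Build a minimal HTTP/1.1 GET request as raw bytes (illustrative only)."""
    lines = ["GET %s HTTP/1.1" % path, "Host: %s" % host]
    if close:
        # The header that, per the comment above, tripped up the server.
        lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def fetch_raw(host, port, request, timeout=5.0):
    """Send the request and return every byte received before the server closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

A server with the bug described above would return an empty byte string for the `close=True` request. With `close=False` the server may hold the connection open, so `fetch_raw` can run into the timeout instead of returning.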
This is caused by parsing logic blowing up somewhere in LWP and propagating up to our code. I'd suggest closing this as INVALID as it isn't really our bug (neither cause nor symptom is in our code). Possibly this could be dealt with in our code iff we add some level of HTTP checking, but meanwhile it's not really something we want to deal with. Reassigning to Olivier; I'm not touching this one so I'll leave it to you to make a call on what to do with the Bug.
+1 for closing as invalid.
Closing, will see if I can report this to the LWP bug database.