This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 785 - validator does not supply reasonable Accept header by default
Summary: validator does not supply reasonable Accept header by default
Status: RESOLVED WONTFIX
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.7.0
Hardware: Other
OS: other
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
URL:
Whiteboard:
Keywords:
Duplicates: 5330 5432 5970 9416
Depends on:
Blocks:
 
Reported: 2004-06-02 15:32 UTC by John Belmonte
Modified: 2011-08-23 21:34 UTC
CC List: 9 users

See Also:


Attachments
Forwards Accept, Accept-Charset and Accept-Language headers in a referer request (4.90 KB, patch)
2008-06-15 07:25 UTC, Etienne Miret
Forwards Accept and Accept-Language headers in a referer request (4.15 KB, patch)
2008-06-15 07:29 UTC, Etienne Miret

Description John Belmonte 2004-06-02 15:32:36 UTC
By default, the validator should request resources as if it were an HTML 
browser.  That means placing weight on HTML types such as text/html and 
application/xhtml+xml in the Accept header. 
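
For comparison, a standards-conforming browser typically advertises its preferences along these lines (an illustrative example; the exact types and q-values vary by browser):

  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8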
 
Bug #18 is not a solution to this problem.  Most users don't use the advanced
validator, and most users expect the validator to retrieve documents just like
their standards-conforming browser.  (Yes, IE is not in this group.)
Comment 1 Terje Bless 2004-09-01 10:38:23 UTC
Setting blocker on Bug #18 -- resolving that is necessary for resolving this bug
-- and accepting it.
Comment 2 Derek Young 2006-05-25 22:04:22 UTC
Bug #18 has a working solution, of sorts, that can handle this.
Comment 3 Olivier Thereaux 2007-02-22 06:25:30 UTC
Terje,

I disagree that this bug depends on Bug #18, or that solving Bug #18 would solve this. While Bug #18 might be interesting to implement, there will still be a need for a default. 

At the moment, the default Accept-* headers are whatever is sent by default for:
new HTTP::Request(GET => $uri);

Two possibilities:
- forward those of the requesting UA
- hardcode a default for all the media types supported by the validator

My preference goes to the latter.
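
For illustration, a rough sketch of the latter option, assuming the validator keeps using LWP (the hardcoded media type list below is abridged and hypothetical, not the validator's actual list):

  use LWP::UserAgent;
  use HTTP::Request;

  # Hypothetical default; a real list would need to cover every media type
  # the validator (and the validators it hands documents off to) accepts.
  my $default_accept =
      'text/html, application/xhtml+xml, application/xml;q=0.9, */*;q=0.1';

  my $uri = 'http://example.org/';   # placeholder
  my $ua  = LWP::UserAgent->new;
  my $req = HTTP::Request->new(GET => $uri);
  $req->header(Accept => $default_accept);
  my $res = $ua->request($req);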

Thoughts?

(P.S. Terje, if you don't think you'll be working on this, please reassign it to me. Thanks.)
Comment 4 Terje Bless 2007-02-22 07:06:24 UTC
Resolving Bug #18 would add some of the code required for this bug. You may prefer to think of the dependency being the other way around, but either way these two bugs are related.

I don't believe the Validator should send anything in its default Accept headers. It should probably offer an option to proxy the client browser's Accept headers as well as the ability to set it with more specificity.

This may mean that this bug should be closed WONTFIX (it's asking for a default value) and Bug #18 is where this actually gets implemented.
Comment 5 Sam Ruby 2007-04-19 01:57:12 UTC
I don't think that forwarding on the Accept headers is a good idea.  Having Opera users reporting that a page is valid when IE users see errors (or vice versa) would be very confusing.

That being said, the beta validator does have specific MIME types it supports (try validating a page served as text/plain), and it should provide this information to the server.  Or at least provide an option to provide the information.

Many people conditionally serve XHTML with the application/xhtml+xml mime type as there are browsers (most notably Lynx and IE) which do not support this MIME type.
Comment 6 Olivier Thereaux 2007-04-19 10:27:16 UTC
(In reply to comment #5)
> That being said, the beta validator does have specific MIME types it supports
> (try validating a page served as text/plain),

Indeed it does. There are also a few media types the markup validator cannot handle directly but can "pass" to other validators (at least text/css; and if it does not do so already, I think it should pass Atom and so on to the feed validator).
 
> Many people conditionally serve XHTML with the application/xhtml+xml mime type
> as there are browsers (most notably Lynx and IE) which do not support this MIME
> type.

Do you know whether most of such techniques have specific URIs for the  application/xhtml+xml and text/html representation?

Comment 7 Sam Ruby 2007-04-19 12:11:08 UTC
(In reply to comment #6)
> (In reply to comment #5)
> 
> > Many people conditionally serve XHTML with the application/xhtml+xml mime type
> > as there are browsers (most notably Lynx and IE) which do not support this MIME
> > type.
> 
> Do you know whether most of such techniques have specific URIs for the 
> application/xhtml+xml and text/html representation?

Most of the ones that I'm aware of use a variation of  http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html, so no.

Including me.  http://validator-test.w3.org/check?uri=http%3A%2F%2Fwww.intertwingly.net%2Fblog%2F&charset=&doctype=&ss=1&group=0

Note: I'm *not* expecting the validator at this point to support HTML5.  I was, however, expecting it to support XML.

Comment 8 Olivier Thereaux 2007-04-19 12:18:02 UTC
(In reply to comment #7)
> > Do you know whether most of such techniques have specific URIs for the 
> > application/xhtml+xml and text/html representation?
> 
> Most of the ones that I'm aware of use a variation of 
> http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html, so no.

Thanks, that satisfies my curiosity.
 
> Note: I'm *not* expecting the validator at this point to support HTML5.  I was,
> however, expecting it to support XML.

It supports documents written in xml-based languages, provided they are properly served as XML and that they declare something to validate against...
Comment 9 PatomaS 2007-04-20 09:09:51 UTC
Hi 

(In reply to comment #8)
> (In reply to comment #7)
> ...
> 
> > Note: I'm *not* expecting the validator at this point to support HTML5.  I was,
> > however, expecting it to support XML.
> 
> It supports documents written in xml-based languages, provided they are
> properly served as XML and that they declare something to validate against...
> 

How is it possible to validate a properly served document if the validator does not have such an option available?

Or do you expect that all pages made with XHTML should be served as application/xhtml+xml without detecting whether the browser can manage such a thing?

If this is what you propose as a solution, it is a very bad idea: first, because it obviously makes a lot of sites worse than invisible for the majority of users, given the already mentioned and well-known behaviours of Explorer; second, because it contradicts the accessibility guidelines and their core essence, making sites usable and accessible for everyone, or for as many people as possible.

This is something that has to be solved properly, since it is a bug, it has been around for quite some time, and from my point of view it is a major flaw for the validator and its team.

You have done very nice work, a piece of art, with this tool, but this kind of spot is still something to work on.

Bye, and keep up the good work.
Comment 10 Olivier Thereaux 2007-04-20 09:15:22 UTC
(In reply to comment #9)
> Or do you expect that all pages made with XHTML should be served as
> application/xhtml+xml without detecting whether the browser can manage
> such a thing?

No, XHTML 1.0 documents served as text/html validate just fine. try it :)

Comment 11 PatomaS 2007-04-20 14:47:07 UTC
Hi people

(In reply to comment #10)
> (In reply to comment #9)
> > Or do you expect that all pages made with XHTML should be served as
> > application/xhtml+xml without detecting whether the browser can manage
> > such a thing?
> 
> No, XHTML 1.0 documents served as text/html validate just fine. try it :)
> 

Well, I am not sure how to interpret that answer; maybe it is a joke or maybe it is serious... Just in case, I will not feel bad about it...

But...

As we can read in the XHTML Media Types document from 1 August 2002 (http://www.w3.org/TR/xhtml-media-types/), the W3C states this:
"This document summarizes the best current practice for using various Internet media types for serving various XHTML Family documents. In summary, 'application/xhtml+xml' SHOULD be used for XHTML Family documents, and the use of 'text/html' SHOULD be limited to HTML-compatible XHTML 1.0 documents. 'application/xml' and 'text/xml' MAY also be used, but whenever appropriate, 'application/xhtml+xml' SHOULD be used rather than those generic XML media types."

So, it means, among other interpretations, that the validator should offer a fair option to validate a document with the application/xhtml+xml media type through some kind of navigation option on the page; or shouldn't there be some kind of reminder that a document served as text/html is not correct application/xhtml+xml?

And of course that only applies to XHTML 1.0; if you are validating an XHTML 1.1 document, there is no possible interpretation that allows text/html. Except for the fact that the documents should be served that way for Explorer.

So I think the point is still valid, that is, the request filed here.

Bye people
Comment 12 David Dorward 2007-04-24 11:16:36 UTC
> So, it means, among other interpretations, that the validator should offer

I can't see anything there which describes what CLIENTS should do. Only servers.

> shouldn't there be some kind of reminder that a document served as
> text/html is not correct application/xhtml+xml?

I've just tested an XHTML 1.0 document with the validator. It gave no complaints with text/html or application/xhtml+xml

It does complain when an XHTML 1.1 document is served as text/html, but the documentation is pretty clear when it says that you SHOULD NOT do that.

If the problem is caused by you detecting that a client doesn't support XHTML and then serving XHTML 1.1 as text/html despite the specification, then don't do that. If you are doing that, then it is highly unlikely that you are getting any of the possible benefits of client side XHTML, so you might as well stick to HTML. Even if you continue using XHTML then it's relatively trivial to use XSLT to output HTML 4.01 or XHTML 1.0 from an XHTML 1.1 document (and since the extra features added by XHTML aren't available to text/html clients, this is unlikely to cause problems).

> And of course that only applies to XHTML 1.0; if you are validating an
> XHTML 1.1 document, there is no possible interpretation that allows text/html.
> Except for the fact that the documents should be served that way for
> Explorer.

Needing to support clients that do not support a standard is usually a good reason to use a different standard. It isn't usually a good reason to violate the specification.
Comment 13 PatomaS 2007-04-24 12:19:54 UTC
Hi

(In reply to comment #12)
> > So, it means, among other interpretations, that the validator should offer
> 
> I can't see anything there which describes what CLIENTS should do. Only
> servers.

     Well, I have to say that this idea is not clear to me, so maybe you can explain yourself a bit.

> > shouldn't there be some kind of reminder that a document served as
> > text/html is not correct application/xhtml+xml?
> 
> I've just tested an XHTML 1.0 document with the validator. It gave no
> complaints with text/html or application/xhtml+xml

     This is just arguing, since I already mentioned that option a few lines below.

> It does complain when an XHTML 1.1 document is served as text/html, but the
> documentation is pretty clear when it says that you SHOULD NOT do that.
> 
> If the problem is caused by you detecting that a client doesn't support XHTML
> and then serving XHTML 1.1 as text/html despite the specification, then don't
> do that. If you are doing that, then it is highly unlikely that you are getting
> any of the possible benefits of client side XHTML, so you might as well stick
> to HTML. Even if you continue using XHTML then it's relatively trivial to use
> XSLT to output HTML 4.01 or XHTML 1.0 from an XHTML 1.1 document (and since the
> extra features added by XHTML aren't available to text/html clients, this is
> unlikely to cause problems).

Well, if I serve the document in a wrong way, then it is my fault. But if you have ever made an application, you should know that there has to be some kind of default state or information, so my default state is to serve the same XHTML 1.1 document without the <?xml version="1.0" encoding="utf-8"?> line, served with the text/html header. It is true that that combination is not valid, but I still have a valid document to serve, at least to the browsers that support it properly.

So my question/proposal is that the validator should offer some option to send it the right document with the right header. There are plenty of ways for the validator application to do that; a simple one is to send a header. Another simple one is to offer that emulation in a menu, combo box, or whatever control you choose.
 
> > And of course that only applies to XHTML 1.0; if you are validating an
> > XHTML 1.1 document, there is no possible interpretation that allows text/html.
> > Except for the fact that the documents should be served that way for
> > Explorer.
> 
> Needing to support clients that do not support a standard is usually a good
> reason to use a different standard. It isn't usually a good reason to violate
> the specification.
> 

But it seems that I, and all the people who think the same way, are not going to get any kind of positive answer, so you can close this bug. Maybe one day someone will open a new one and will get a different answer.

Bye
Comment 14 PatomaS 2007-04-25 01:52:48 UTC
Hi

I was kind of upset in the last message, excuse me for that.

I understand the point, even though it is expressed in a way that makes me think you have never made a website using serious programming, or had to deal with strange behaviours in browsers and servers due to this default state.

So if we are strict about the rule, the only way is to send the right header and send a different document to browsers that do not support the right one. Well. OK.

But this is not as simple as parsing it with XSLT.

And even in that situation, if I have such detection, every time I try to validate the document here I will validate the default state, that is, the application/xhtml+xml and XHTML 1.1 document. But I cannot validate the, let's say, text/html and HTML 4.01 option, because of the lack of headers from the validator. The only way to validate that option is to request it in a non-compliant browser, then copy the content and paste it into that option of the validator.

So I still think it is fair to expect the validator to give equal chances to validate both options in the same way, not one document in one way and the other document in another.

And this is simple enough to implement that I think it should be considered.

Bye :)
Comment 15 Sierk Bornemann 2007-04-26 14:13:11 UTC
Is there any good reason to *not* send an Accept header at all, or to send an empty Accept header?
I presume the validator is a normal requesting client from the server's point of view, so the validator should provide a reasonable Accept header.

Please have a look at http://validator-test.w3.org/check?uri=http%3A%2F%2Fsierkbornemann.de%2F&charset=&doctype=&group=0

By default, I serve .html with the MIME type "text/html". If a client that accepts "application/xhtml+xml" makes a request, an Apache rewrite rule changes the MIME type to "application/xhtml+xml", as for instance http://www.xml.com/pub/a/2003/03/19/dive-into-xml.html proposes. This solution works very well, but not in validator 0.8 beta, which throws out a warning about a wrongly served media type. That warning is fully understandable given that the validator does indeed seem to receive "text/html" instead of "application/xhtml+xml". This problem would not exist if the validator provided a reasonable Accept header! If it provided one and indicated what MIME types it accepts, then the web server's rules would have a chance to match. At the moment, there is no chance for such rules to match because of the lack of information from the validator. Currently the validator throws out a warning about a problem or misconfiguration which does not necessarily apply, and which would not occur if the validator were a little more server-friendly and provided a proper, talkative Accept header.
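
For reference, the rewrite rule in question has roughly this shape (an illustrative sketch only, adapted from the dive-into-xml approach; the exact conditions vary from site to site):

  RewriteEngine On
  # If the client explicitly accepts application/xhtml+xml (and does not
  # exclude it with q=0), serve .html files with that MIME type instead.
  RewriteCond %{HTTP_ACCEPT} application/xhtml\+xml
  RewriteCond %{HTTP_ACCEPT} !application/xhtml\+xml\s*;\s*q=0
  RewriteCond %{REQUEST_URI} \.html$
  RewriteRule .* - [T=application/xhtml+xml]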

My solution of conditionally rewriting the MIME type on the web server is a compromise so as not to leave Internet Explorer out of the playground. If IE understood "application/xhtml+xml", I would have far fewer sleepless nights. In that case, I could serve "application/xhtml+xml" by default for any XHTML document, as the spec defines/recommends. But that is fiction so far, so little workarounds have to be done.

I *want* to use XHTML, not least to promote it. I intentionally *want* to use XHTML 1.1, not least to promote it and to provide it to web browsers that are capable of doing their work correctly and fulfilling the standards.
And I want my documents to be parsed as XML, not as SGML, as far as possible.
Browsers like Internet Explorer, which don't work correctly and don't keep up with the standards, have a bad standing (at least, and especially, in my eyes), and I am *not* willing to accommodate (or, going further, foster) that bad standing any longer. The browser vendors, especially Microsoft, *have to do* their homework and deliver a good and reliable piece of software. Is that the intention of W3C QA? Not giving XHTML 1.1 a real chance because of one single web browser out there that can't really deal with it?

The content on my website http://sierkbornemann.de/ semantically fits XHTML 1.1. So why shouldn't I use an XHTML 1.1 DTD and the appropriate MIME type "application/xhtml+xml" for clients that are capable of it?
If every web browser but one (IE) can be served fully compliant XHTML 1.1, including the correct MIME type, then I want to do that. If this one browser (IE) can't be served the correct MIME type, or can only be served with minimal flaws, then that is my risk to take.

The validator should behave like a client that is fully capable of these standards, and it should advertise that by presenting a reasonable, talkative Accept header to the web servers out there.
Comment 16 Olivier Thereaux 2007-09-26 13:52:42 UTC
(In reply to comment #15)
> Is there any good reason to *not* send an Accept header at all, or to send
> an empty Accept header?

Probably not. Now, where is the patch that will cover all the media types the validator accepts? I am changing the assignment of this bug to make it very clear that I am not working on it; others should feel free to work on a patch.
Comment 17 Dean Edridge 2007-09-27 02:22:06 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Or do you expect that all pages made with XHTML should be served as
> > application/xhtml+xml without detecting whether the browser can manage
> > such a thing?
> 
> No, XHTML 1.0 documents served as text/html validate just fine. try it :)
> 
Arrr... no they don't, actually. This causes an enormous amount of problems. XHTML SHOULD not be served as text/html and the validator is wrong to do so, and misleads people when it validates such a page. Sooner or later people will have to be told the truth. It's going to be an interesting time when people start changing from XHTML 1.0 (sent as text/html) to HTML5 (sent as text/html). Then people will find out once and for all what XHTML really is :)
Comment 18 Olivier Thereaux 2007-09-27 02:57:58 UTC
(In reply to comment #17)

> XHTML SHOULD not be served as text/html and the validator is wrong to do so,
> and misleads people when it validates such a page. 

Reference please? All I have is http://www.w3.org/TR/2002/REC-xhtml1-20020801/#media
[[ XHTML Documents which follow the guidelines set forth in Appendix C, "HTML Compatibility Guidelines" may be labeled with the Internet Media Type "text/html" [RFC2854], as they are compatible with most HTML browsers. ]]

And how is this on topic?
Comment 19 Dean Edridge 2007-09-27 04:00:50 UTC
(In reply to comment #18)
> (In reply to comment #17)
> 
> > XHTML SHOULD not be served as text/html and the validator is wrong to do so,
> > and misleads people when it validates such a page. 
> 
> Reference please? All I have is
> http://www.w3.org/TR/2002/REC-xhtml1-20020801/#media
> [[ XHTML Documents which follow the guidelines set forth in Appendix C, "HTML
> Compatibility Guidelines" may be labeled with the Internet Media Type
> "text/html" [RFC2854], as they are compatible with most HTML browsers. ]]
> 
> And how is this on topic?
> 

Ref: http://webkit.org/blog/?p=68

Well if the W3C would stop misleading everyone with the appendix C circus and start helping people to use XHTML properly we wouldn't be having this conversation and this bug would not exist. The fact is you can't use XHTML on the web today without using server-side content negotiation.

I'm trying to help the W3C promote and encourage the proper use of XHTML on the web; please be more open to input from the public. The fact that this error in the validator has been around for 5 years makes me wonder if the W3C has a problem listening to the advice of industry experts.


Thanks,
Dean  
Comment 20 Olivier Thereaux 2007-09-27 10:27:54 UTC
(In reply to comment #19)

> Well if the W3C would stop misleading everyone with the appendix C circus and
> start helping people to use XHTML properly we wouldn't be having this
> conversation and this bug would not exist. The fact is you can't use XHTML on
> the web today without using server-side content negotiation.

Your intensity about the issue is appreciated, and I hope you are as adamant in lobbying the browser vendors that do not properly support XHTML.

I am not a fan of appendix C, actually, but I am working on a tool that is supposed to respect and enforce standardized rules. 

This is why the validator is not complaining about XHTML 1.0 served as text/html (we do have a bug about making the validator complain if a text/html XHTML 1.0 document does not respect said appC). 

This is why the validator is sending a warning (not an error...) when XHTML 1.1 is served as text/html. Whether a broken browser is a good reason to do this, whether it is a good idea to serve XHTML 1.1 as text/html to IE, search engines and other agents, whether this actually helps or hinders the progress of XHTML on the web, is off-topic for the validator.
Comment 21 Olivier Thereaux 2007-09-27 10:33:15 UTC
Since:
* there has been a lot of discussion on this, but no patch
* there is no way in HTTP to claim acceptance of e.g application/*+xml
* there is a fix on its way for Bug 18, which would allow the precise setting of the accept header, as well as accept-language (something much more flexible and useful IMHO)

I am moving to closing this as WONTFIX.
Comment 22 Olivier Thereaux 2007-12-25 23:32:59 UTC
*** Bug 5330 has been marked as a duplicate of this bug. ***
Comment 23 Olivier Thereaux 2008-01-28 13:56:49 UTC
*** Bug 5432 has been marked as a duplicate of this bug. ***
Comment 24 Olivier Thereaux 2008-04-29 12:01:10 UTC
(In reply to comment #21)
> * there has been a lot of discussion on this, but no patch
> * there is no way in HTTP to claim acceptance of e.g application/*+xml
> * there is a fix on its way for Bug 18, which would allow the precise setting
> of the accept header, as well as accept-language (something much more flexible
> and useful IMHO)
> 
> I am moving to closing this as WONTFIX.

To make things clear, this bug may be reopened if anyone can submit a functional patch.
Comment 25 Etienne Miret 2008-06-15 07:25:07 UTC
Created attachment 555 [details]
Forwards Accept, Accept-Charset and Accept-Language headers in a referer request

This patch makes use of the already available "accept", "accept-language" and "accept-charset" parameters and populates them with the values provided by the client *in case of a referer request*. It also makes sure those values are kept across revalidation. This makes the URIs very long. Sorry.

The headers sent by the client are copied verbatim; that means the validator will send Accept and Accept-Charset headers with types and charsets it doesn't support. This is the desired behaviour, since a check/referer link on a, say, PDF document should trigger a "This document type cannot be validated" error even if an HTML/XHTML variant is available.

No Accept-Encoding header is sent, because if the validator gets an encoding it doesn't know about, it tries to validate the encoded document. This is different from charset and content type, where the validator displays an appropriate error message whenever it gets one it doesn't know about.

Besides Accept-Encoding, a server may do content negotiation on any HTTP header, notably User-Agent, and even on other information (such as the IP address). Hence there is not, and cannot be, any guarantee that the validated document is the one the user was actually viewing, although this patch will help.
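
Schematically, the forwarding boils down to something like the following (a simplified sketch only, not the attached patch; it assumes a CGI.pm front end and skips parameter handling and revalidation links):

  use CGI;
  use HTTP::Request;

  my $q   = CGI->new;
  my $uri = 'http://example.org/';      # placeholder for the referring document
  my $req = HTTP::Request->new(GET => $uri);

  # Copy selected Accept-* headers from the incoming request, verbatim.
  for my $h ('Accept', 'Accept-Language') {
      my $value = $q->http($h);         # reads HTTP_ACCEPT / HTTP_ACCEPT_LANGUAGE
      $req->header($h => $value) if defined $value;
  }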
Comment 26 Etienne Miret 2008-06-15 07:29:31 UTC
Created attachment 556 [details]
Forwards Accept and Accept-Language headers in a referer request

This patch is basically identical to the previous one, except that it won't forward the Accept-Charset header. Olivier Thereaux asked for it, but I'm not sure it is better.
Comment 27 Olivier Thereaux 2008-08-14 12:50:40 UTC
*** Bug 5970 has been marked as a duplicate of this bug. ***
Comment 30 Ville Skyttä 2011-08-23 21:00:28 UTC
*** Bug 9416 has been marked as a duplicate of this bug. ***
Comment 31 John A. Bilicki III 2011-08-23 21:09:47 UTC
Does the validator send an Accept header? If so, why is the application/xhtml+xml MIME type not simply added?

Since it is merely a statement that the user agent supports the MIME type, and it's clear that the validator does support application/xhtml+xml, what exactly is keeping that string from simply being added to the header?
Comment 32 Etienne Miret 2011-08-23 21:34:12 UTC
(In reply to comment #31)
> Does the validator send an Accept header?
No, it doesn't.