This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Version 0.6.5 of the W3C Markup Validation Service reports that documents with an HTML 4.01 Strict DOCTYPE declaration are HTML 4.01 Transitional documents:

'The uploaded file was checked and found to be valid HTML 4.01 Transitional. This means that the resource in question identified itself as "HTML 4.01 Transitional" and that we successfully performed a formal validation using an SGML or XML Parser (depending on the markup language used).'

I have tried different source spacing (lots of line breaks or none at all; a single line for the DOCTYPE declaration or multiple lines), but always with the same results. Here is a test case (in its liberally spaced form for easy reading):

-----[ Begin Clip ]-----
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Strict//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<TITLE>Test Case</TITLE>
</HEAD>
<BODY>
<DIV>
<H1>Test Case</H1>
<P>Despite the HTML 4.01 Strict DOCTYPE declaration, the W3C validator reports that this document is "HTML 4.01 Transitional":</P>
<P>'The uploaded file was checked and found to be valid HTML 4.01 Transitional. This means that the resource in question identified itself as "HTML 4.01 Transitional" and that we successfully performed a formal validation using an SGML or XML Parser (depending on the markup language used).'</P>
<P>I have tried different source spacing—lots of line breaks or none at all; a single line for the DOCTYPE declaration or multiple lines—but always with the same results.</P>
</DIV>
</BODY>
</HTML>
-----[ End Clip ]-----
Created attachment 362 [details] A Well-Spaced Test Case This is the same test case as within the body of my report (except that it lacks the hard line breaks that were introduced by Bugzilla), but in a separate file for your convenience.
This is caused by the validator's "fall back to HTML 4.01 Transitional if no known public identifier is found" behaviour. What triggers it in your document is that you've specified:

  "-//W3C//DTD HTML 4.01 Strict//EN"

...but there's no such public identifier. For HTML 4.01 Strict it should be:

  "-//W3C//DTD HTML 4.01//EN"

Nevertheless, the validator's output for these cases is not acceptable; I hope we can get this fixed soon. BTW, specifying just about any doctype, as long as the root element is "HTML", will produce the same results, for example:

  <!DOCTYPE HTML PUBLIC "-//foo//DTD bar//EN" "quux.dtd">
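To make the fallback concrete, here is a minimal Python sketch of the behaviour as described in this comment. It is not the validator's actual code (which relies on SGML catalogs and OpenSP); the function name `resolve_doctype`, the lookup table, and the "unknown" result for non-HTML root elements are all assumptions made for illustration, and the table lists only the three HTML 4.01 public identifiers.

```python
import re

# Illustrative subset of known public identifiers (an assumption for this
# sketch; the real validator resolves these through SGML catalogs).
KNOWN_PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "HTML 4.01 Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "HTML 4.01 Frameset",
}

FALLBACK = "HTML 4.01 Transitional"

# Captures the root element name and the public identifier.
DOCTYPE_RE = re.compile(r'<!DOCTYPE\s+(\w+)\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE)


def resolve_doctype(document: str) -> str:
    """Return the document type a catch-all HTML fallback would report."""
    match = DOCTYPE_RE.search(document)
    if match is None:
        # No DOCTYPE at all: the "friendly" fallback applies.
        return FALLBACK
    root, public_id = match.group(1), match.group(2)
    if public_id in KNOWN_PUBLIC_IDS:
        return KNOWN_PUBLIC_IDS[public_id]
    if root.upper() == "HTML":
        # Unknown public identifier with an HTML root element:
        # silently treated as Transitional, which is the bug reported here.
        return FALLBACK
    return "unknown"
```

Under these assumptions, both the mistyped "-//W3C//DTD HTML 4.01 Strict//EN" and the nonsense "-//foo//DTD bar//EN" resolve to HTML 4.01 Transitional, while "-//W3C//DTD HTML 4.01//EN" resolves to HTML 4.01 Strict.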
I am not sure how I got into the habit of using a non-standard DTD declaration, but I am confirming the assertion made in Comment #2 with reference URLs:

http://www.w3.org/QA/2002/04/valid-dtd-list.html
http://www.w3.org/QA/Tips/Doctype

It should be noted, however, that the same non-standard DTD declaration does appear in what looks like old versions of the validator's source code archived on the W3C Web site:

http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.13
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.14
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.15
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.16
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.17

Naturally, it also appears in some discussion on the validator list.
Argh! And if the page happens to be valid HTML 4.01 Transitional you won't even get as much as a warning about this, much less an error. OpenSP isn't even outputting anything we could key off of. Anyone have suggestions?
*** Bug 454 has been marked as a duplicate of this bug. ***
One suggestion off the cuff, untested: remove the "HTML" ~ "HTML 4.01 Transitional" fallback from the catalog(s) (as well as other similar ones from other catalogs, if there are any), and insert the doctype ourselves if none is found, akin to how it's currently done in doctype override.
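A rough Python sketch of the "insert the doctype ourselves if none is found" half of this suggestion follows. It is untested against the validator and purely illustrative: `ensure_doctype` and `DEFAULT_DOCTYPE` are hypothetical names, and the naive regular expression would also match a DOCTYPE that appears inside a comment.

```python
import re
from typing import Tuple

# Default declaration to prepend when a document has none (HTML 4.01
# Transitional, matching the fallback discussed in this bug).
DEFAULT_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# Naive check for any DOCTYPE declaration; a real implementation would
# need a proper parser to skip comments and marked sections.
HAS_DOCTYPE_RE = re.compile(r"<!DOCTYPE\b", re.IGNORECASE)


def ensure_doctype(document: str) -> Tuple[str, bool]:
    """Return (document, inserted), prepending a default DOCTYPE if missing."""
    if HAS_DOCTYPE_RE.search(document):
        # A declaration is present, even if its public identifier is
        # unknown: leave it alone so the parser can report it.
        return document, False
    return DEFAULT_DOCTYPE + "\n" + document, True
```

The point of the design is that only documents with no DOCTYPE at all get the default; a document with an unrecognised public identifier is passed through unchanged, so with the catalog fallback removed it would no longer be silently validated as Transitional.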
I found that idea impractical to implement last time I looked at it (for several reasons). If you can come up with code for this I'd be very happy. In either case, this is too disruptive for 0.6.6 IMO; setting target 0.7.
What about just removing the catch-all-HTML catalog fallback for 0.6.6? Currently the problems it causes outweigh its benefits, and removing the fallback should not be too disruptive.
Two points Re: Terje's #4: (1) I'm confused that it's giving preference to a catalogue default over a valid SYSTEM identifier in a case where the latter is provided. But that's really a side-issue: if we fix it, the problem just reappears when someone mistypes the SYSTEM id as well. (2) I think we could get OpenSP to emit a useful message by hacking the local defaulted DTD to emit a custom version string we can test against.
Retargeting for 0.6.7 and changing Status to Accepted.
Seems like Terje's tweak with "OVERRIDE NO" in sgml.soc fixed this. One interesting consequence is that documents with no doctype are now said to be "NOT VALID ". Which is quite true, they're not valid anything, they don't claim to be anything. I'm sure some people would consider it a heresy that a validation claim does not express against what. Other than that, a run through /dev/tests/ did not show anything abnormal.
Sort of fixed, but the "friendliness", i.e. the fallback from no doctype to HTML 4.01 Transitional, seems to be gone, or at least is partially broken: http://qa-dev.w3.org/wmvs/0.6/check?uri=http%3A%2F%2Fkoti.welho.com%2Fvskytta%2Fbug705-2.html
False alarm! I repeat, false alarm! :-) The change appeared to "fix" it because it introduced a bug that disabled the fallback Doctype completely. With that gone, the behaviour described in this bug remains.
The bug is apparently fixed in current HEAD. Close?