This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 705 - The Markup Validation Service detects HTML 4.01 Strict as HTML 4.01 Transitional.
Status: RESOLVED FIXED
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.6.5
Hardware: All All
Importance: P1 major
Target Milestone: 0.6.7
Assignee: Terje Bless
QA Contact: qa-dev tracking
URL: http://validator.w3.org/check
Whiteboard:
Keywords:
Duplicates: 454
Depends on:
Blocks:
 
Reported: 2004-05-07 19:59 UTC by Brian Sexton
Modified: 2005-02-05 04:45 UTC
CC List: 2 users

See Also:


Attachments
A Well-Spaced Test Case (933 bytes, text/html)
2004-05-08 14:27 UTC, Brian Sexton

Description Brian Sexton 2004-05-07 19:59:12 UTC
Version 0.6.5 of the W3C Markup Validation Service reports that documents with 
an HTML 4.01 Strict DOCTYPE declaration are HTML 4.01 Transitional documents:

'The uploaded file was checked and found to be valid HTML 4.01 Transitional. 
This means that the resource in question identified itself as "HTML 4.01 
Transitional" and that we successfully performed a formal validation using an 
SGML or XML Parser (depending on the markup language used).'

I have tried different source spacing (lots of line breaks or none at all; a 
single line for the DOCTYPE declaration or multiple lines), but always with the 
same results.  Here is a test case (in its liberally spaced form for easy 
reading):

-----[ Begin Clip ]-----

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 
Strict//EN" "http://www.w3.org/TR/html4/strict.dtd">


<HTML>


<HEAD>

  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">

  <TITLE>Test Case</TITLE>

</HEAD>


<BODY>


<DIV>


<H1>Test Case</H1>


<P>Despite the HTML 4.01 Strict DOCTYPE declaration, the W3C validator reports 
that this document is &quot;HTML 4.01 Transitional&quot;:</P>

<P>'The uploaded file was checked and found to be valid HTML 4.01 
Transitional. This means that the resource in question identified itself as 
&quot;HTML 4.01 Transitional&quot; and that we successfully performed a formal 
validation using an SGML or XML Parser (depending on the markup language 
used).'</P>

<P>I have tried different source spacing&mdash;lots of line breaks or none at 
all; a single line for the DOCTYPE declaration or multiple lines&mdash;but 
always with the same results.</P>


</DIV>


</BODY>


</HTML>

-----[ End Clip ]-----
Comment 1 Brian Sexton 2004-05-08 14:27:41 UTC
Created attachment 362
A Well-Spaced Test Case

This is the same test case as within the body of my report (except that it
lacks the hard line breaks that were introduced by Bugzilla), but in a separate
file for your convenience.
Comment 2 Ville Skyttä 2004-05-15 06:47:17 UTC
This is caused by validator's "fall back to HTML 4.01 Transitional if no known
public identifier is found" behaviour.

What triggers it in your document is that you've specified:

   "-//W3C//DTD HTML 4.01 Strict//EN"

...but there's no such public identifier.  It should be:

   "-//W3C//DTD HTML 4.01//EN"

...for HTML 4.01 Strict.

Nevertheless, the validator's output for these cases is not acceptable; I hope
we can get this fixed soon.  BTW, specifying just about anything as long as the
root element is "HTML" will produce the same results, for example this:

  <!DOCTYPE HTML PUBLIC "-//foo//DTD bar//EN" "quux.dtd">
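
The fallback described in this comment can be sketched roughly as follows. This
is an illustrative Python sketch only, not the validator's actual code (which
is a Perl CGI script driving OpenSP and its SGML catalogs); the function name,
the regular expression, and the short table of known identifiers are
assumptions made for the example.

```python
import re

# A small, illustrative subset of public identifiers (assumed for this
# sketch; the real validator consults its SGML catalogs instead).
KNOWN_PUBLIC_IDS = {
    "-//W3C//DTD HTML 4.01//EN": "HTML 4.01 Strict",
    "-//W3C//DTD HTML 4.01 Transitional//EN": "HTML 4.01 Transitional",
    "-//W3C//DTD HTML 4.01 Frameset//EN": "HTML 4.01 Frameset",
}

FALLBACK = "HTML 4.01 Transitional"

# Matches <!DOCTYPE name PUBLIC "public id" ...>; whitespace (including
# line breaks) inside the quoted identifier is captured as-is.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(\w+)\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE
)


def detect_doctype(source):
    """Return (doctype_name, used_fallback) for an HTML document."""
    match = DOCTYPE_RE.search(source)
    if match:
        # Collapse runs of whitespace so a wrapped identifier still compares.
        public_id = " ".join(match.group(2).split())
        if public_id in KNOWN_PUBLIC_IDS:
            return KNOWN_PUBLIC_IDS[public_id], False
    # Unknown or missing public identifier: fall back to Transitional.
    return FALLBACK, True
```

In this sketch, both the reported "Strict//EN" identifier and the
"-//foo//DTD bar//EN" example come back as the Transitional fallback with the
second value set to True, which is the kind of signal a warning could be keyed
on.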
Comment 3 Brian Sexton 2004-05-15 16:39:08 UTC
I am not sure how I got into the habit of using a non-standard DTD declaration,
but I am confirming the assertion made in Comment #2 with reference URLs:

http://www.w3.org/QA/2002/04/valid-dtd-list.html

http://www.w3.org/QA/Tips/Doctype

It should be noted, however, that the same non-standard DTD declaration does
appear in what looks like old versions of the validator's source code archived
on the W3C Web site:

http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.13
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.14
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.15
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.16
http://dev.w3.org/cvsweb/validator/httpd/cgi-bin/check?rev=1.200.2.17

Naturally, it also appears in some discussion on the validator list.
Comment 4 Terje Bless 2004-05-16 03:45:41 UTC
Argh! And if the page happens to be valid HTML 4.01 Transitional you won't even
get as much as a warning about this, much less an error. OpenSP isn't even
outputting anything we could key off of. Anyone have suggestions?
Comment 5 Terje Bless 2004-05-17 16:45:16 UTC
*** Bug 454 has been marked as a duplicate of this bug. ***
Comment 6 Ville Skyttä 2004-05-17 18:15:39 UTC
One suggestion off the cuff, untested: remove the "HTML" ~ "HTML 4.01
Transitional" fallback from the catalog(s) (as well as other similar ones from
other catalogs, if there are any), and insert the doctype ourselves if none is
found, akin to how it's currently done in doctype override.
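
As a rough illustration of the "insert the doctype ourselves" idea, here is a
minimal Python sketch. It is hypothetical: the validator itself is written in
Perl, the names `ensure_doctype` and `DEFAULT_DOCTYPE` are invented for this
example, and the check is deliberately naive (it would be fooled by a DOCTYPE
that appears only inside a comment).

```python
import re

# Assumed default declaration for this sketch: HTML 4.01 Transitional.
DEFAULT_DOCTYPE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">'
)

# Naive test for the presence of any DOCTYPE declaration.
HAS_DOCTYPE_RE = re.compile(r"<!DOCTYPE\b", re.IGNORECASE)


def ensure_doctype(source):
    """Return (document, inserted); prepend the default DOCTYPE if none exists."""
    if HAS_DOCTYPE_RE.search(source):
        # A declaration is present (known or not): leave the document alone.
        return source, False
    return DEFAULT_DOCTYPE + "\n" + source, True
```

Because the caller learns whether a declaration was inserted, it could report
the fallback explicitly, while a document that declares an unknown public
identifier is passed through unchanged and can fail on its own terms.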
Comment 7 Terje Bless 2004-05-17 19:13:54 UTC
I found that idea impractical to implement last time I looked at it (for several reasons). If you can come 
up with code for this I'd be very happy. In either case, this is too disruptive for 0.6.6 IMO; setting target 
0.7.
Comment 8 Ville Skyttä 2004-05-18 02:27:48 UTC
What about just removing the catch-all-HTML catalog fallback for 0.6.6?
Currently the problems it causes outweigh its benefits, and removing the
fallback should not be too disruptive.
Comment 9 niq 2004-05-19 03:19:46 UTC
Two points Re: Terje's #4:

(1) I'm confused that it's giving preference to a catalogue default over a valid
SYSTEM identifier in a case where the latter is provided.  But that's really a
side-issue: if we fix it, the problem just reappears when someone mistypes the
SYSTEM id as well.

(2) I think we could get OpenSP to emit a useful message by hacking the local
defaulted DTD to emit a custom version string we can test against.
Comment 10 Terje Bless 2004-05-20 08:14:58 UTC
Retargeting for 0.6.7 and changing Status->Accepted.
Comment 11 Olivier Thereaux 2004-05-20 21:42:44 UTC
Seems like Terje's tweak with "OVERRIDE NO" in sgml.soc fixed this. 

One interesting consequence is that documents with no doctype are now said to be "NOT VALID  ", 
which is quite true: they're not valid anything, since they don't claim to be anything. I'm sure some people 
would consider it a heresy that a validation claim does not say what it was validated against. 

Other than that, a run through /dev/tests/ did not show anything abnormal. 
Comment 12 Ville Skyttä 2004-05-21 03:16:37 UTC
Sort of fixed, but the "friendliness", i.e. the fallback from no doctype to HTML 4.01
Transitional, seems to be gone, or at least is partially broken:
http://qa-dev.w3.org/wmvs/0.6/check?uri=http%3A%2F%2Fkoti.welho.com%2Fvskytta%2Fbug705-2.html
Comment 13 Terje Bless 2004-05-21 11:31:19 UTC
False alarm! I repeat, false alarm! :-)

The change appeared to "fix" it because it introduced a bug that disabled the
fallback Doctype completely. With that gone, the behaviour described in this bug
remains.
Comment 14 Olivier Thereaux 2004-09-06 06:24:02 UTC
The bug is apparently fixed in current HEAD. Close?