This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 14 - XHTML Detection is over-eager.
Summary: XHTML Detection is over-eager.
Status: NEW
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.6.0b1
Hardware: All All
Importance: P2 normal
Target Milestone: 1.0
Assignee: Olivier Thereaux
QA Contact: qa-dev tracking
URL: http://www.damowmow.com/playground/ht...
Whiteboard:
Keywords:
Depends on: 24 739 1500
Blocks:
Reported: 2002-10-25 02:04 UTC by Terje Bless
Modified: 2008-12-01 03:03 UTC (History)
CC List: 2 users

See Also:


Attachments
Hello india (deleted), 2008-03-07 05:35 UTC, venki
Hello india (deleted), 2008-03-07 05:37 UTC, venki
Hello india (deleted), 2008-03-07 05:37 UTC, venki

Description Terje Bless 2002-10-25 02:04:14 UTC
Reported by Ian Hickson:

The following document:

   http://www.damowmow.com/playground/html-not-xml-2.html

...is a valid HTML 4.01 document. However, with the new validator, I get
the following error message:

| This Page Is NOT Valid XHTML 1.0 Strict!
|
| Below are the results of attempting to parse this document with an SGML
| parser.
|
| 1. Line 2, column 7: S separator in comment declaration
|
| <!-- -- -->
|        ^

This is probably a bug in the XHTML detection code.

Furthermore, when I force it to be handled as HTML 4.01, it still gets
autodetected as XHTML.
Comment 1 Terje Bless 2002-10-27 11:40:00 UTC
This case is pathological, and it is compounded by the differing comment syntax
in SGML and XML. AFAICT, the root cause is that HTML::Parser doesn't understand
SGML comment syntax and so detects the XHTML DOCTYPE, forcing the Validator into
XML mode. Any fix for this needs to begin by fixing HTML::Parser's comment
parser; then we can see what this leaves us with in "check". This probably also
means we'll have to fix Bug #24 first.
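
To illustrate the comment-syntax difference, here is a minimal Python sketch. It
is not HTML::Parser's code or API, and SAMPLE is a hypothetical document in the
spirit of the test case (the actual contents of the reported URL are not
reproduced in this bug). In SGML, a comment declaration is "<!" followed by any
number of "--...--" comments, optionally separated by whitespace, then ">". A
scanner that instead ends a comment at the first "-->" can expose a DOCTYPE that
an SGML parser would consider commented out.

```python
import re

# Hypothetical sample: under SGML rules, lines 2-4 form ONE comment declaration.
SAMPLE = (
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN">\n'
    '<!-- -- -->\n'
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN">\n'
    '<!-- -- -->\n'
    '<title>test</title><p>text</p>\n'
)

def xml_comment_end(text, start):
    """XML-style: the comment opened at `start` ends at the first '-->'."""
    end = text.find("-->", start + 4)
    return -1 if end == -1 else end + 3

def sgml_comment_decl_end(text, start):
    """SGML-style: '<!' then (comment | whitespace)* then '>'.

    `start` must point at '<!--'. Returns the index just past the closing
    '>', or -1 if the declaration is unterminated or malformed.
    """
    pos = start + 2  # skip '<!'
    while True:
        if text.startswith("--", pos):
            close = text.find("--", pos + 2)  # end of this '--...--' comment
            if close == -1:
                return -1
            pos = close + 2
        elif pos < len(text) and text[pos] in " \t\r\n":
            pos += 1  # S separator between comments
        elif pos < len(text) and text[pos] == ">":
            return pos + 1
        else:
            return -1

def strip_comments(text, comment_end):
    """Remove comments, using `comment_end` to decide where each one stops."""
    out, pos = [], 0
    while True:
        start = text.find("<!--", pos)
        if start == -1:
            out.append(text[pos:])
            break
        out.append(text[pos:start])
        end = comment_end(text, start)
        if end == -1:  # unterminated: treat the rest as comment
            break
        pos = end
    return "".join(out)

def looks_like_xhtml(text, comment_end):
    """True if an XHTML DOCTYPE is visible outside comments."""
    visible = strip_comments(text, comment_end)
    doctypes = re.findall(r"<!DOCTYPE[^>]*>", visible, flags=re.I)
    return any("XHTML" in d for d in doctypes)

print(looks_like_xhtml(SAMPLE, xml_comment_end))        # naive scanner
print(looks_like_xhtml(SAMPLE, sgml_comment_decl_end))  # SGML-aware scanner
```

With the "-->" rule the XHTML DOCTYPE on line 3 is visible; with the SGML rule
it is swallowed by the comment declaration that starts on line 2.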

Setting blocker on Bug #24 and target to 0.7.0 to revisit the issue then.
Comment 2 Olivier Thereaux 2005-05-12 01:19:47 UTC
I doubt we'll get around to fixing this bug for 0.7.0.
Terje, what do you think?
Comment 3 Ian 'Hixie' Hickson 2005-05-12 02:03:56 UTC
According to the HTML WG, a UA is non-compliant if it handles an XHTML document
sent as text/html as XHTML; such a UA must apparently handle the document as
HTML regardless of what it looks like.

# [...] documents served as text/html should be treated as HTML and not as XHTML.
 -- http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html

But I don't know if they meant to include validators in that statement.
Comment 4 Olivier Thereaux 2005-08-05 00:45:20 UTC
(In reply to comment #2)
> I doubt we'll get around to fixing this bug for 0.7.0. 

Indeed.
Comment 5 murali 2006-01-06 08:33:55 UTC

*** This bug has been marked as a duplicate of 12 ***
Comment 6 Olivier Thereaux 2006-08-30 02:26:05 UTC
Adding a dependency on Bug #1500 too, as the switch to XML mode may end up being decided by the media type rather than by the DOCTYPE.
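
As a rough illustration only, a media-type-driven switch could look like the
Python sketch below. The function name, the mode names, and the set of media
types are assumptions made for this example; they are not taken from "check" or
from whatever Bug #1500 ends up deciding.

```python
# Hypothetical set of media types that would select XML mode.
XML_MEDIA_TYPES = {"application/xhtml+xml", "application/xml", "text/xml"}

def parse_mode(content_type):
    """Pick a parse mode from a Content-Type header value, ignoring the DOCTYPE."""
    # Drop parameters such as '; charset=utf-8' and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in XML_MEDIA_TYPES or media_type.endswith("+xml"):
        return "xml"
    return "sgml"

print(parse_mode("text/html; charset=utf-8"))
print(parse_mode("application/xhtml+xml"))
```

Under this scheme a document served as text/html would be parsed in SGML mode
whatever DOCTYPE it appears to carry.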
Comment 7 Olivier Thereaux 2007-05-31 01:04:01 UTC
(In reply to comment #3)
> According to the HTML WG, a UA is non-compliant if it handles an XHTML document
> sent as text/html as XHTML; such a UA must apparently handle the document as
> HTML regardless of what it looks like.

According to the WG, XHTML is always XML, and should be validated as such.

http://lists.w3.org/Archives/Public/www-validator/2007Apr/0175.html

> But I don't know if they meant to include validators in that statement.

Apparently not.
Comment 8 venki 2008-03-07 05:35:38 UTC
Created attachment 524
Hello india

bug QA
Comment 9 venki 2008-03-07 05:37:11 UTC
Created attachment 525
Hello india

bug QA
Comment 10 venki 2008-03-07 05:37:39 UTC
Created attachment 526
Hello india

bug QA