This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3626 - XHTML detection relies only on namespace declaration
Summary: XHTML detection relies only on namespace declaration
Status: NEW
Alias: None
Product: CSSValidator
Classification: Unclassified
Component: XHTML1.0
Version: CSS Validator
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
Reported: 2006-08-25 18:48 UTC by Christoph Schneegans
Modified: 2009-04-04 14:21 UTC
CC List: 0 users

Description Christoph Schneegans 2006-08-25 18:48:22 UTC
In order to extract CSS rules and declarations from HTML/XHTML documents, the 
CSS Validator needs to parse these documents. Therefore, it needs to determine 
whether a document is HTML or XHTML.

The method currently in use is not very sophisticated: the CSS Validator only 
looks for an XHTML namespace declaration. In particular, it ignores XML 
declarations and XHTML document type declarations. However, the presence of 
either declaration is a very reliable indicator of XHTML, so when one is 
present the CSS Validator can safely parse the document as XHTML.
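
As an illustration of the check proposed above, here is a minimal sketch in
Python. It is not the CSS Validator's code (which is written in Java); the
function names and regular expressions are hypothetical, and the sketch
assumes the document text has already been decoded to a string.

```python
import re

# Hypothetical patterns for illustration only; not taken from the validator.
# An XML declaration must be the very first thing in the document.
XML_DECL = re.compile(r'<\?xml\s')
# A document type declaration whose public identifier names an XHTML DTD.
XHTML_DOCTYPE = re.compile(r'<!DOCTYPE\s+html\s+PUBLIC\s+["\']-//W3C//DTD XHTML ')
# The XHTML namespace declared as the default namespace.
XHTML_NS = re.compile(r'xmlns\s*=\s*["\']http://www\.w3\.org/1999/xhtml["\']')


def xhtml_signals(text):
    """Return the set of XHTML indicators found in the document text."""
    signals = set()
    if XML_DECL.match(text):
        signals.add("xml-declaration")
    if XHTML_DOCTYPE.search(text):
        signals.add("xhtml-doctype")
    if XHTML_NS.search(text):
        signals.add("xhtml-namespace")
    return signals


def looks_like_xhtml(text):
    """Treat the document as XHTML if any one indicator is present."""
    return bool(xhtml_signals(text))
```

With this sketch, a document that carries an XML declaration or an XHTML
document type declaration is routed to the XML parser even if it has no
namespace declaration.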

<http://jigsaw.w3.org/css-validator/validator?uri=http://schneegans.de/temp/no-xmlns.html>
shows that no XML parser is used; otherwise, the well-formedness violation in 
that document would be detected.
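
To show what detecting a well-formedness violation looks like in practice,
here is a small sketch using Python's standard expat bindings. The helper
name is hypothetical, and the sample markup in the usage note is made up for
illustration; it is not the content of the test document linked above.

```python
import xml.parsers.expat


def well_formedness_error(text):
    """Return None if text is well-formed XML, otherwise expat's error message."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final piece of input.
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None
```

Passing markup such as '<html><body><br></body></html>' returns a
"mismatched tag" message because the br element is never closed, while an
HTML parser would accept the same input silently.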

Furthermore, the CSS Validator fails to detect a namespace declaration when it 
is preceded by too many characters. Again,
<http://jigsaw.w3.org/css-validator/validator?uri=http://schneegans.de/temp/late-xmlns.html>
shows that no XML parser is used.
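
One way to avoid a dependence on how many characters precede the namespace
declaration is to skip the whole prolog, whatever its length, and inspect the
root element's start tag. The following Python sketch does that under some
simplifying assumptions: the function name is hypothetical, a leading byte
order mark has already been removed, and the DOCTYPE has no internal subset.

```python
import re

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Items that may precede the root element: whitespace, the XML declaration or
# other processing instructions, comments, and a DOCTYPE without internal subset.
PROLOG_ITEM = re.compile(r'\s+|<\?.*?\?>|<!--.*?-->|<!DOCTYPE[^>\[]*>', re.DOTALL)
# A start tag: element name followed by quoted attributes.
START_TAG = re.compile(
    r'<([A-Za-z_][\w.:-]*)((?:\s+[\w.:-]+\s*=\s*(?:"[^"]*"|\'[^\']*\'))*)\s*/?>')
ATTR = re.compile(r'([\w.:-]+)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')')


def root_default_namespace(text):
    """Return the default namespace declared on the root element, or None."""
    pos = 0
    # Consume prolog items one by one; there is no limit on their total length.
    while True:
        item = PROLOG_ITEM.match(text, pos)
        if not item:
            break
        pos = item.end()
    tag = START_TAG.match(text, pos)
    if not tag:
        return None
    for name, double_quoted, single_quoted in ATTR.findall(tag.group(2)):
        if name == "xmlns":
            return double_quoted or single_quoted
    return None
```

A document is then treated as XHTML when root_default_namespace(text) equals
XHTML_NS, even if a very long comment precedes the root element.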

Parsing XHTML documents as HTML may produce spurious errors. For example,
<http://schneegans.de/temp/space-preserve-no-namespace.html> is valid XHTML and 
conforms to the Appendix C guidelines, yet
<http://jigsaw.w3.org/css-validator/validator?uri=http://schneegans.de/temp/space-preserve-no-namespace.html>
complains about the "xml:space" attribute.
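
For comparison, a namespace-aware XML parser accepts "xml:space" without any
declaration, because the "xml" prefix is bound by definition to
http://www.w3.org/XML/1998/namespace. A short Python sketch follows; the
helper name and the sample element are illustrative and are not taken from
the test document.

```python
import xml.etree.ElementTree as ET

# The namespace that the reserved "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def xml_space(element):
    """Return the value of the element's xml:space attribute, or None."""
    return element.get("{%s}space" % XML_NS)


# Illustrative input: the attribute parses cleanly with an XML parser.
pre = ET.fromstring('<pre xml:space="preserve">  two  spaces  </pre>')
```

Here xml_space(pre) yields "preserve", so an XML parser has no reason to
reject the attribute.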
Comment 1 Olivier Thereaux 2006-08-28 02:07:49 UTC
Thank you for adding the issue to the bug tracking system.

For reference, this was also discussed on the mailing list:
http://lists.w3.org/Archives/Public/www-validator-css/2006Aug/thread.html#msg15
Comment 2 Olivier Thereaux 2007-07-17 07:22:32 UTC
I think this bug is moot now that we have replaced our dual parser with TagSoup.
We need to test and close.