This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Reported by Klaus Johannes Rusch: The beta validator at http://validator.w3.org:8001/ indicates the following document is not valid: <!DOCTYPE html SYSTEM "http://www.ibm.com/data/dtd/v11/ibmxhtml1-transitional.dtd"> [...] Any ideas why this is failing? The validator seems not to allow XHTML notation with custom DTDs; validating the same document against the W3C XHTML DTD works.
When a document with a custom DTD (not a "Well Known" FPI) is served as text/html it will always be treated as SGML and not XML. This is hard to avoid, at least in the current code. To enable custom XML DTDs you need to use an XML Content-Type. Setting target to 0.7.0 to investigate this again at that time.
As per the further comments from Klaus Johannes Rusch, perhaps we should add a "Force XML Mode" setting to allow the "We must have text/html for the older browsers" crowd to use custom XML DTDs.
Retarget for 1.0. Won't make the cut for 0.7.
The following page sets Content-type to application/xml, and uses a custom DTD, but still doesn't validate. http://validator.w3.org/check?uri=http%3A%2F%2Fdev.silverstripe.com%2Fplay%2Fcustom-dtd.php An alternative validator succeeds in validating it: http://www.htmlhelp.com/cgi-bin/validate.cgi?url=http%3A%2F%2Fdev.silverstripe.com%2Fplay%2Fcustom-dtd.html&xml=yes
It doesn't work with text/xml either :P
This looks very similar to the problem described in Bug #1500, where it is argued that the sgml/xml mode switch should be based on content-type, before pre-parsing, and not based on pre-parsing and detection of a known (or unknown in this case) doctype. Setting dependency accordingly.
There were some problems in the logic of the parse mode detection, which indeed did not properly take into account the media type in choosing a parse mode for documents with unknown document types (including custom DTDs). Fixed now in CVS: http://lists.w3.org/Archives/Public/www-validator-cvs/2007Mar/0092.html
Rewriting the parse mode selection routine entirely fixes this. http://lists.w3.org/Archives/Public/www-validator-cvs/2007Mar/0095.html has the CVS diff for the refactoring.

The logic goes:
* if neither content type nor doctype is helpful => throw warning, use SGML as fallback
* in case of an unknown doctype but useful mime type (generally, an XML mime type) => follow the mime type
* in case of an ambiguous mime type (text/html) but well-known doctype (any HTML served as text/html...) => follow the doctype
* if neither is ambiguous, but they collide => throw warning, follow the mime type

This was tested successfully with documents in the catalogue, documents outside the catalogue (custom DTDs), and documents served with the wrong mime type (XHTML 1.1 as text/html, HTML 4.01 as application/xhtml+xml, etc.), see: http://qa-dev.w3.org/wmvs/HEAD/dev/tests/

Considering this fixed. If reopening, please provide clear test cases, thank you.
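
For readers who want the four rules above in executable form, here is a minimal Python sketch. It is not the validator's actual code (which is in the CVS diff linked above); the names `select_parse_mode`, `mode_from_mime`, `KNOWN_DOCTYPES` and `XML_MIME_TYPES`, the warning texts, and the tiny doctype table are all assumptions made for illustration. It also assumes that text/html is the only "ambiguous" media type and that any other non-XML type is simply unhelpful.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Mode(Enum):
    SGML = "SGML"
    XML = "XML"


# Hypothetical list of media types treated as unambiguous XML.
XML_MIME_TYPES = {"application/xml", "text/xml", "application/xhtml+xml"}

# Hypothetical stand-in for the validator's catalogue of well-known FPIs.
KNOWN_DOCTYPES = {
    "-//W3C//DTD XHTML 1.0 Transitional//EN": Mode.XML,
    "-//W3C//DTD XHTML 1.1//EN": Mode.XML,
    "-//W3C//DTD HTML 4.01//EN": Mode.SGML,
}


def mode_from_mime(content_type: Optional[str]) -> Optional[Mode]:
    """Return XML for XML media types, None when the type does not decide."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8".
    mime = content_type.split(";", 1)[0].strip().lower()
    if mime in XML_MIME_TYPES or mime.endswith("+xml"):
        return Mode.XML
    return None  # text/html (ambiguous) and anything else (unhelpful)


def select_parse_mode(
    content_type: Optional[str], fpi: Optional[str]
) -> Tuple[Mode, List[str]]:
    """Pick a parse mode from the media type and the doctype's FPI, if any."""
    warnings: List[str] = []
    mime_mode = mode_from_mime(content_type)
    doctype_mode = KNOWN_DOCTYPES.get(fpi) if fpi else None

    if mime_mode is None and doctype_mode is None:
        # Rule 1: nothing helpful, warn and fall back to SGML.
        warnings.append("cannot determine parse mode, falling back to SGML")
        return Mode.SGML, warnings
    if doctype_mode is None:
        # Rule 2: unknown doctype (e.g. custom DTD), useful media type.
        return mime_mode, warnings
    if mime_mode is None:
        # Rule 3: ambiguous media type, well-known doctype.
        return doctype_mode, warnings
    if mime_mode != doctype_mode:
        # Rule 4: both are decisive but disagree, warn and follow the media type.
        warnings.append("media type and doctype disagree, following media type")
    return mime_mode, warnings
```

Under these assumptions, a custom-DTD document (no known FPI) served as application/xml selects XML mode, while the same document served as text/html falls back to SGML with a warning, which matches the behaviour discussed earlier in this thread.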