This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This document is valid per SP 1.3.4, but the online validator complains it can't detect an encoding. I suspect the charset sniffer is not getting past the PIs at the beginning -- if I remove them it's happy. [Note I have no idea what version I'm using -- neither the form nor the result page gives a version number that I can see -- sorry if I'm missing something obvious.]
That URL is 404 Compliant.
Sorry, my screw-up, it's in place now
I think this is a case of "Don't Do That Then". The charset sniffer is groping around for a <meta> charset because there isn't one in the Content-Type (Strike #1), and fails to find it because at this stage we're using a non-SGML parser (Perl's HTML::Parser) which can't handle weird constructs prior to the information it's after (Strike #2). The only way setting encoding info inside the document can ever work is if you take extreme care to make the bytes up to that point be easily parsed. This includes avoiding any non-vanilla constructs and making sure whatever encoding it's in looks identical to US-ASCII up to that point. I don't think we're going to be able to fix this without some fairly elaborate digging inside HTML::Parser's guts, or by using OpenSP and doing a two-pass parse. Given the overhead and the low gain, I'm not sure it's worth it for this bug alone (but it's another point in favour of doing a two-pass parse). Resolving as "LATER", and setting Target to 1.0 (aka. "Once upon a time..."). Thanks for the catch, Henry!
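For illustration only, here is a rough Python sketch of the kind of byte-level pre-scan described above. It is not the validator's code (which uses Perl's HTML::Parser); the function name `sniff_meta_charset` and the 4096-byte `limit` are made up for this example. It assumes the document's encoding looks identical to US-ASCII up to the <meta> element, and that processing instructions end at the first ">", so a PI with ">" inside a quoted value would trip it up.

```python
import re

# Hypothetical helper, not the validator's actual implementation.
# Processing instructions (<? ... >) and comments (<!-- ... -->) are
# dropped first so that nothing inside them is mistaken for a real <meta>.
_SKIPPABLE = re.compile(rb"<\?.*?>|<!--.*?-->", re.DOTALL)

# Matches both <meta charset=...> and the http-equiv form, where the
# charset sits inside the content attribute value.
_META_CHARSET = re.compile(
    rb"""<meta\b[^>]*?\bcharset\s*=\s*["']?([A-Za-z0-9._:-]+)""",
    re.IGNORECASE,
)


def sniff_meta_charset(data: bytes, limit: int = 4096):
    """Return the lowercased charset named in a <meta> element, or None.

    Only the first `limit` bytes are examined, and they are treated as
    ASCII-compatible (an assumption of this sketch).
    """
    head = _SKIPPABLE.sub(b"", data[:limit])
    match = _META_CHARSET.search(head)
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

A document that opens with a couple of PIs, like the one in this report, would be passed in as raw bytes, e.g. `sniff_meta_charset(open(path, "rb").read())`, where `path` is whatever local copy you have. A real fix would still need a proper SGML-aware first pass, as noted above; a regex pre-scan only covers the easy cases.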