This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 20974 - Comments including meta tag
Summary: Comments including meta tag
Status: NEW
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: HEAD
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
Depends on:
Reported: 2013-02-12 11:45 UTC by iaack
Modified: 2013-02-13 18:13 UTC

Description iaack 2013-02-12 11:45:06 UTC
The validator fails to handle comments that include meta tags, like this:

<!DOCTYPE html>

 <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">

After validation, this text is changed to:

<!DOCTYPE html>

 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><!-- <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS"> -->

The error message is as follows:

Consecutive hyphens did not terminate a comment. -- is not permitted inside a comment, but e.g. - - is.
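
The rule quoted in the error message can be illustrated with a small check. This is a minimal Python sketch, not the validator's own code: the function name find_stray_double_hyphens is hypothetical, and the sample input assumes the rewritten markup ended up nested inside an existing comment, as the bug title suggests.

```python
def find_stray_double_hyphens(html):
    """Return offsets of '--' sequences that sit inside a comment
    without terminating it (i.e. not the '--' of the closing '-->')."""
    offsets = []
    pos = 0
    while True:
        start = html.find("<!--", pos)
        if start == -1:
            break
        body_start = start + 4
        # The comment ends at the first '-->' after its opening delimiter.
        end = html.find("-->", body_start)
        if end == -1:
            end = len(html)
        # Every '--' before that terminator is a stray one.
        i = html.find("--", body_start)
        while i != -1 and i < end:
            offsets.append(i)
            i = html.find("--", i + 1)
        pos = end + 3
    return offsets


# Assumed shape of the rewritten markup: a comment opened inside a comment.
nested = "<!-- a <!-- b --> -->"
print(find_stray_double_hyphens(nested))

# A well-formed comment has no stray '--'.
print(find_stray_double_hyphens("<!-- a -->"))
```

In the nested sample, the inner "<!--" supplies the two consecutive hyphens that do not terminate the outer comment, which matches the wording of the message above.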
Comment 1 Michael[tm] Smith 2013-02-12 13:46:59 UTC
Thanks for reporting this. It seems to be a bug in the legacy validator code, because you're using to check your document. I recommend that you don't use that but instead use

The code for the legacy validator at is not code I work on, but as far as I can see it appears to be doing some preprocessing on the input it sends to the HTML5 backend. The way to avoid that broken preprocessing is to instead use the UI at which sends the input to the same HTML5 backend but without doing any preprocessing.

I'll try to get this bug in the legacy validator fixed but I think it's likely I'll do that simply by having the request redirected to so that the preprocessing gets bypassed completely.
Comment 2 Michael[tm] Smith 2013-02-13 08:02:18 UTC
Ville, I can reproduce this bug with the doctype set to HTML4, so this isn't a bug in the HTML5 backend or in the REST API for the HTML5 validator; rather, it seems to be a bug in some preprocessing step in the validator's Perl code.
Comment 3 Ville Skyttä 2013-02-13 18:13:41 UTC
That's right, I took a brief look at it too. The culprit is override_charset() which does a simple text replacement without any "intelligence" whatsoever and thus has no idea about the context it is working in. I'm afraid it'll take more than a few lines of code to fix this.
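
To make the difference concrete, here is a minimal sketch in Python (the validator itself is Perl, and its actual override_charset() code is not shown in this report). All names below (META_RE, naive_override, comment_aware_override) are hypothetical. The replacement format, a new meta element followed by the old one wrapped in a comment, is taken from the output in the description. The comment-aware variant only skips existing comments; it still ignores other contexts such as script content or different attribute orders, so it is an illustration of the idea rather than a proposed fix.

```python
import re

# Hypothetical pattern for the meta element shown in the report.
META_RE = re.compile(
    r'<meta\s+http-equiv="Content-Type"\s+'
    r'content="text/html;\s*charset=[^"]*"\s*/?>',
    re.IGNORECASE,
)
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
NEW_META = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">'


def _replacement(match):
    # New declaration, followed by the old one preserved in a comment.
    return NEW_META + "<!-- " + match.group(0) + " -->"


def naive_override(html):
    """Plain text replacement with no knowledge of context."""
    return META_RE.sub(_replacement, html)


def comment_aware_override(html):
    """Apply the same replacement, but leave existing comments untouched."""
    out = []
    pos = 0
    for comment in COMMENT_RE.finditer(html):
        # Rewrite only the text between comments.
        out.append(META_RE.sub(_replacement, html[pos:comment.start()]))
        out.append(comment.group(0))
        pos = comment.end()
    out.append(META_RE.sub(_replacement, html[pos:]))
    return "".join(out)


commented = ('<!-- <meta http-equiv="Content-Type" '
             'content="text/html; charset=Shift_JIS"> -->')
print(naive_override(commented))          # opens a comment inside a comment
print(comment_aware_override(commented))  # input is returned unchanged
```

Even this small amount of context tracking is more than a one-line substitution, which is consistent with the estimate that a real fix would take more than a few lines of code.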