This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 20974 - Comments including meta tag
Summary: Comments including meta tag
Status: NEW
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: HEAD
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: qa-dev tracking
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-02-12 11:45 UTC by iaack
Modified: 2013-02-13 18:13 UTC

Description iaack 2013-02-12 11:45:06 UTC
The validator fails to handle comments that contain meta tags, like this:

<!DOCTYPE html>
<title></title>

<!--
 <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
-->

After validation, this text is changed to:

<!DOCTYPE html>
<title></title>

<!--
 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><!-- <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS"> -->
-->

The error message is as follows:

Consecutive hyphens did not terminate a comment. -- is not permitted inside a comment, but e.g. - - is.
Comment 1 Michael[tm] Smith 2013-02-12 13:46:59 UTC
Thanks for reporting this. It seems to be a bug in the legacy validator code, which is what you get when you use http://validator.w3.org/ to check your document. I recommend that you use http://validator.w3.org/nu/ instead.

The code for the legacy validator at http://validator.w3.org/ is not code I work on, but as far as I can see it does some preprocessing on the input it sends to the HTML5 backend. The way to avoid that broken preprocessing is to use the UI at http://validator.w3.org/nu/ instead, which sends the input to the same HTML5 backend without any preprocessing.

I'll try to get this bug in the legacy validator fixed but I think it's likely I'll do that simply by having the request redirected to http://validator.w3.org/nu/ so that the preprocessing gets bypassed completely.
Comment 2 Michael[tm] Smith 2013-02-13 08:02:18 UTC
Ville, I can reproduce this bug with the doctype set to HTML4, so this isn't a bug in the HTML5 backend or in the REST API for the HTML5 validator. Rather, it seems to be a bug in some preprocessing step in the validator's Perl code.
Comment 3 Ville Skyttä 2013-02-13 18:13:41 UTC
That's right, I took a brief look at it too. The culprit is override_charset(), which does a simple text replacement without any "intelligence" whatsoever and thus has no idea about the context it is working in. I'm afraid it'll take more than a few lines of code to fix this.
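
To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the validator's actual Perl code: the names naive_override, comment_aware_override, META_CHARSET, COMMENT and REPORTED_DOC are hypothetical, and the replacement format is only inferred from the output shown in the description. The first function rewrites every matching meta tag with no regard for context, which reproduces the nested comment from the report. The second skips text inside comments, which is one possible mitigation but not a complete fix, since it still relies on a regular expression rather than a real parser.

```python
import re

# Hypothetical pattern for the meta tag seen in the report.
META_CHARSET = re.compile(
    r'<meta\s+http-equiv="Content-Type"\s+content="text/html;\s*charset=[^"]*"\s*>',
    re.IGNORECASE,
)
# Simplistic comment matcher; a real fix would need a proper tokenizer.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

# The document from the bug description.
REPORTED_DOC = (
    "<!DOCTYPE html>\n"
    "<title></title>\n"
    "\n"
    "<!--\n"
    ' <meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">\n'
    "-->\n"
)


def naive_override(html, charset):
    """Context-blind rewrite: every matching meta tag is replaced by a new
    tag followed by the old tag wrapped in a comment, even when the match
    already sits inside a comment."""
    def repl(match):
        new_meta = (
            '<meta http-equiv="Content-Type" '
            f'content="text/html; charset={charset}">'
        )
        return f"{new_meta}<!-- {match.group(0)} -->"

    return META_CHARSET.sub(repl, html)


def comment_aware_override(html, charset):
    """Apply the same rewrite only to text outside <!-- ... --> comments."""
    parts = []
    pos = 0
    for match in COMMENT.finditer(html):
        # Rewrite the text before this comment, then copy the comment as-is.
        parts.append(naive_override(html[pos:match.start()], charset))
        parts.append(match.group(0))
        pos = match.end()
    # Rewrite whatever follows the last comment.
    parts.append(naive_override(html[pos:], charset))
    return "".join(parts)
```

Calling naive_override(REPORTED_DOC, "UTF-8") opens a second comment inside the existing one, which is what triggers the "Consecutive hyphens did not terminate a comment" error, while comment_aware_override(REPORTED_DOC, "UTF-8") leaves the reported document unchanged. Script blocks, CDATA sections and malformed comments are not handled here, which is consistent with the point that a proper fix needs more than a few lines.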