This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 26347 - Suppress "Internal encoding declaration ... disagrees with the actual encoding of the document (UTF-8)" error when user uses "Check by text input"
Status: RESOLVED WONTFIX
Alias: None
Product: HTML Checker
Classification: Unclassified
Component: General
Version: unspecified
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael[tm] Smith
QA Contact: qa-dev tracking
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-15 23:55 UTC by Takeshi Kurosawa
Modified: 2018-03-30 08:05 UTC
CC List: 2 users

See Also:
mike: needinfo-

Description Takeshi Kurosawa 2014-07-15 23:55:55 UTC
The Nu Markup Checker reports the error "Internal encoding declaration shift_jis disagrees with the actual encoding of the document (UTF-8)" under the following conditions:

1. Select "Check by text input"
2. Paste HTML that has a non-UTF-8 encoding declaration.
3. Click "Check"

I think this error is not reasonable, because:

- The Nu Markup Checker only accepts UTF-8 input in this mode, and the user cannot change the "actual encoding of the document".
- The Nu Markup Checker reports only a warning (Legacy encoding shift_jis used. Documents should use UTF-8.) for the same document when the user uses "Check by file upload".

The W3C Markup Validator reports only an "info" message under the same conditions, which I think is reasonable.
http://validator.w3.org/

> Using Direct Input mode: UTF-8 character encoding assumed
> Unlike the “by URI” and “by File Upload” modes, the “Direct Input” mode of the validator provides validated content in the form of characters pasted or typed in the validator's form field. This will automatically make the data UTF-8, and therefore the validator does not need to determine the character encoding of your document, and will ignore any charset information specified.

To recap, I think the Nu Markup Checker should report a warning (as it does for "file upload") or an info message (as the W3C Markup Validator does).

Sample HTML:

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="shift_jis">
<title>Shift_JIS</title>
</head>
<body>
<p>Shift_JIS</p>
</body>
</html>
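
For illustration only, the behaviour requested above could be sketched roughly as follows in Python. This is not the Nu Markup Checker's actual code: the function names, the mode strings ("text-input", "file-upload"), and the info message wording are all hypothetical, and a simple regular expression stands in for the HTML specification's real encoding-detection steps.

```python
import re

# Hypothetical sketch; not the checker's implementation.
# Simplified match for <meta charset=...>; real detection is more involved.
META_CHARSET = re.compile(
    r'<meta\s+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

SAMPLE = """<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="shift_jis">
<title>Shift_JIS</title>
</head>
<body>
<p>Shift_JIS</p>
</body>
</html>"""


def declared_encoding(html):
    """Return the lowercased charset of the first <meta charset>, or None."""
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


def check_declaration(html, mode):
    """Return (severity, message) for a non-UTF-8 declaration, else None.

    mode is an assumed label: "text-input" or "file-upload".
    """
    declared = declared_encoding(html)
    if declared is None or declared in ("utf-8", "utf8"):
        return None
    if mode == "text-input":
        # Pasted text is already UTF-8, so the declaration is only noted.
        return ("info", "Text input is treated as UTF-8; ignoring declared "
                        "encoding " + declared + ".")
    return ("warning", "Legacy encoding " + declared + " used. "
                       "Documents should use UTF-8.")


print(check_declaration(SAMPLE, "text-input"))
print(check_declaration(SAMPLE, "file-upload"))
```

With the sample above, the sketch yields an "info" result for text input and a "warning" result for file upload, instead of an error in the text-input case.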
Comment 1 Takeshi Kurosawa 2018-03-30 08:05:55 UTC
The spec has changed to disallow encodings other than UTF-8, so reporting an error instead of a warning is reasonable.

https://github.com/validator/validator/commit/261121e8a675d6e39ba09da402b20a375bbc44a2

I am closing this bug.