This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 1920 - Validator fails because of symbol not found in windows-1251 character set
Summary: Validator fails because of symbol not found in windows-1251 character set
Status: RESOLVED DUPLICATE of bug 1833
Alias: None
Product: Validator
Classification: Unclassified
Component: check
Version: 0.7.0
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Terje Bless
QA Contact: qa-dev tracking
Reported: 2005-08-31 11:28 UTC by Maxim Maximov
Modified: 2005-10-18 07:31 UTC

Description Maxim Maximov 2005-08-31 11:28:21 UTC
http://validator.w3.org fails on this HTML:

====
<?xml version="1.0" encoding="windows-1251"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ru" lang="ru">
<head><title>test</title>
<body>

&#1048;

</body>
</html>
====

with this error:

====
Result:  	 Failed validation,
File:	upload://Form Submission
Encoding:	windows-1251
Doctype:	

Sorry, I am unable to validate this document because on line 7 it contained one
or more bytes that I cannot interpret as windows-1251 (in other words, the bytes
found are not valid values in the specified Character Encoding). Please check
both the content of the file and the character encoding indication.
====

However, the symbol on line 7 is the Russian capital letter I (И). This is a
perfectly valid, common character. Maybe you have a wrong charset definition?
In windows-1251 this symbol has code 200 (decimal, 0xC8). Here
(http://dll.botik.ru/educ/clerk/Library/Method/kod-tabl.ru.html) you can get a
clue what this symbol looks like. There's an image under the CP1251 heading that
shows the Russian capital I above code 200.

BTW, most other symbols are OK; however, I haven't checked them all.
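
The claim about code 200 can be checked independently of the validator. The following is only a minimal sketch in Python, not the validator's actual code: `undecodable_lines` is a hypothetical helper that mimics the kind of per-line check the error message describes, and `doc` is a shortened stand-in for the uploaded file.

```python
import unicodedata


def undecodable_lines(data: bytes, encoding: str) -> list:
    """Return the 1-based numbers of lines that cannot be decoded in `encoding`.

    Hypothetical helper for illustration; the validator's real check may differ.
    """
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode(encoding)
        except UnicodeDecodeError:
            bad.append(number)
    return bad


# Byte 200 (0xC8) decoded as windows-1251.
ch = bytes([200]).decode("windows-1251")
print(ch, unicodedata.name(ch), ord(ch))

# Shortened stand-in for the uploaded document, with the raw byte on line 3.
doc = b"<body>\n\n\xc8\n\n</body>\n"
print(undecodable_lines(doc, "windows-1251"))
```

With Python's bundled windows-1251 codec, byte 200 decodes to the Cyrillic capital letter I and the stand-in document yields no undecodable lines, which is consistent with the report that the byte itself is valid in this encoding.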
Comment 1 Maxim Maximov 2005-08-31 11:29:51 UTC
Bugzilla changed this symbol into an HTML entity. Please keep that in mind.
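
The difference matters at the byte level. As a rough illustration in Python (the variable names are only for this sketch), the numeric character reference shown in the description is plain ASCII, whereas the raw character it stands for is the single byte 0xC8 in windows-1251, so only the raw form exercises the encoding check:

```python
import html

reference = "&#1048;"                 # what the tracker displays
raw_char = html.unescape(reference)   # the character the reporter actually submitted

print(raw_char)
print(reference.encode("cp1251"))     # ASCII-only bytes
print(raw_char.encode("cp1251"))      # one non-ASCII byte
```

To reproduce the original failure, the test file would therefore need the raw byte in place of the entity.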
Comment 2 Olivier Thereaux 2005-10-18 07:31:04 UTC
This was the same problem with direct input validation as Bug 1833, which was fixed with the most recent release.

*** This bug has been marked as a duplicate of 1833 ***