This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 4547 - Correct line for non-UTF-8 characters not flagged in 0.8.0 beta 1
Status: RESOLVED FIXED
Alias: None
Product: Validator
Classification: Unclassified
Component: Parser
Version: 0.8.0b1
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: ---
Assignee: Terje Bless
QA Contact: qa-dev tracking
Reported: 2007-05-08 17:32 UTC by d4d8n7m02
Modified: 2007-05-20 23:19 UTC

Attachments
Test case for bug (9.30 KB, text/html)
2007-05-08 21:40 UTC, d4d8n7m02

Description d4d8n7m02 2007-05-08 17:32:25 UTC
The current live version gives me this error on a document I uploaded:

Sorry, I am unable to validate this document because on line 222 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.

But the new version at http://validator-test.w3.org/ says:

Sorry, I am unable to validate this document because on line 0 it contained one or more bytes that I cannot interpret as utf-8 (in other words, the bytes found are not valid values in the specified Character Encoding). Please check both the content of the file and the character encoding indication.

The difference in line numbers indicates a problem, but in addition to that I don't see what character is off.  I looked at line 222 in Notepad++ with "show all characters" mode and I didn't see any characters that shouldn't be there.

Is &nbsp; not allowed in an anchor element's contents?

Line 222 is:

                            <a id="link1" class="Edit" title="Edit Edit Something" href="#">Edit&nbsp;Something</a>

doctype and namespace:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
Comment 1 Olivier Thereaux 2007-05-08 21:33:21 UTC
(In reply to comment #0)
> The current live version gives me this error on a document I uploaded:
> 
> Sorry, I am unable to validate this document because on line 222

> But the new version at http://validator-test.w3.org/ says:
> 
> Sorry, I am unable to validate this document because on line 0 

That looks like a bug indeed. Do you have the URI of the document you were testing?

>  I looked at line 222 in Notepad++ with "show
> all characters" mode and I didn't see any characters that shouldn't be there.

I think the problematic character is between "Edit" and "something".
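
For readers who want to locate such a byte themselves, here is a minimal Python sketch (not the validator's own code). It assumes the document is available as raw bytes, and the function name `find_invalid_utf8_lines` and the sample input are hypothetical. It decodes the input line by line and reports the 1-based line number and byte offset of the first byte on each line that is not valid UTF-8.

```python
def find_invalid_utf8_lines(data: bytes):
    """Return (line_number, byte_offset, offending_byte) for each line
    that cannot be decoded as UTF-8. Only the first bad byte per line
    is reported."""
    problems = []
    # Splitting on b"\n" is safe: 0x0A never occurs inside a multi-byte
    # UTF-8 sequence.
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            problems.append((line_number, exc.start, line[exc.start]))
    return problems


# Hypothetical sample: a raw 0xA0 byte (a Latin-1 non-breaking space)
# between "Edit" and "Something" on the second line.
sample = b'<p>ok</p>\n<a href="#">Edit\xa0Something</a>\n'

for line_number, offset, byte in find_invalid_utf8_lines(sample):
    print(f"line {line_number}, byte offset {offset}: 0x{byte:02x}")
```

A raw non-breaking space is easy to miss in an editor because it renders like an ordinary space, which may be why nothing looked wrong on line 222.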
Comment 2 d4d8n7m02 2007-05-08 21:36:59 UTC
It's not on the web anywhere, but I can upload it as an attachment if you'd like.  So are entities not allowed in element contents such as anchors?
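
On the entity question, a short Python sketch may help separate the two cases. It assumes the offending byte is a raw non-breaking space saved in a single-byte encoding such as Latin-1, which the thread does not confirm. The variable names and the helper `is_valid_utf8` are illustrative only.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The entity reference is plain ASCII, so it is valid in UTF-8.
entity_form = "Edit&nbsp;Something".encode("ascii")

# A literal U+00A0 saved as Latin-1 becomes the single byte 0xA0,
# which is not a valid UTF-8 sequence on its own.
latin1_form = "Edit\u00a0Something".encode("latin-1")

# The same character saved as UTF-8 becomes the two bytes 0xC2 0xA0.
utf8_form = "Edit\u00a0Something".encode("utf-8")

for name, data in [("entity", entity_form),
                   ("latin-1", latin1_form),
                   ("utf-8", utf8_form)]:
    print(name, data, is_valid_utf8(data))
```

Under that assumption, the `&nbsp;` entity itself is not the problem. The error would come from a literal non-breaking space stored in an encoding other than the declared one.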

Comment 3 d4d8n7m02 2007-05-08 21:40:34 UTC
Created attachment 469
Test case for bug
Comment 4 Olivier Thereaux 2007-05-20 23:19:10 UTC
Fixed now in CVS, testable at:
http://qa-dev.w3.org/wmvs/HEAD/check?uri=http%3A%2F%2Fwww.w3.org%2FBugs%2FPublic%2Fattachment.cgi%3Fid%3D469

thanks a lot for your report!