Quoted strings in "encoding sniffing algorithm" "get an attribute"

It's not uncommon to see pages with no whitespace between attributes:

   <meta http-equiv="Content-Type"content="text/html; charset=windows-1252">

The encoding sniffing algorithm fails to detect the charset here. "Get 
an attribute" gets the 'http-equiv' attribute and stops with 'position' 
still pointing at the second '"', i.e. the closing quote of the value. 
The case "If the attribute's name is neither "charset" nor "content", 
then return to step 2 in these inner steps" applies, so it gets another 
attribute starting from that same 'position'. The closing quote is 
treated as the first character of the next name, giving '"content', 
which is wrong.

"Get an attribute" should be changed to increment 'position' before 
returning after a quoted string.
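
To illustrate, here is a rough Python sketch of the problem and the 
proposed fix. It is a simplified stand-in for "get an attribute", not 
the spec's exact steps: the names get_attribute, attribute_names and 
the advance_past_quote flag are made up for this example, and it works 
on a str rather than a byte stream.

```python
WHITESPACE = " \t\n\f\r"

def get_attribute(data, pos, advance_past_quote):
    """Simplified sketch. Returns (name, value, new_pos), or
    (None, None, pos) when no attribute is left in the tag."""
    n = len(data)
    # Skip whitespace and slashes before the attribute name.
    while pos < n and data[pos] in WHITESPACE + "/":
        pos += 1
    if pos >= n or data[pos] == ">":
        return None, None, pos
    # Collect the attribute name up to whitespace, '=', '/' or '>'.
    start = pos
    while pos < n and data[pos] not in WHITESPACE + "=/>":
        pos += 1
    name = data[start:pos].lower()
    while pos < n and data[pos] in WHITESPACE:
        pos += 1
    if pos >= n or data[pos] != "=":
        return name, "", pos  # attribute without a value
    pos += 1  # skip '='
    while pos < n and data[pos] in WHITESPACE:
        pos += 1
    if pos < n and data[pos] in "\"'":
        quote = data[pos]
        end = data.find(quote, pos + 1)
        if end == -1:
            return name, data[pos + 1:], n  # unterminated quoted value
        value = data[pos + 1:end]
        # The reported bug: 'position' is left ON the closing quote.
        # The proposed fix: move one past it before returning.
        pos = end + 1 if advance_past_quote else end
        return name, value, pos
    # Unquoted value: runs up to whitespace or '>'.
    start = pos
    while pos < n and data[pos] not in WHITESPACE + ">":
        pos += 1
    return name, data[start:pos], pos

def attribute_names(tag, advance_past_quote):
    """Collect every attribute name seen in a '<meta ...>' tag."""
    pos = len("<meta")
    names = []
    while True:
        name, _value, pos = get_attribute(tag, pos, advance_past_quote)
        if name is None:
            return names
        names.append(name)

tag = '<meta http-equiv="Content-Type"content="text/html; charset=windows-1252">'
print(attribute_names(tag, advance_past_quote=False))
print(attribute_names(tag, advance_past_quote=True))
```

Without the increment the sketch reports the names 'http-equiv', 
'"content' and '"', so no "content" attribute is ever seen. With the 
increment it reports 'http-equiv' and 'content'.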

-- 
Philip Taylor
pjt47@cam.ac.uk

Received on Friday, 7 March 2008 22:54:11 UTC