Character encoding errors (detailed review of parsing algorithm)

(This is part of my detailed review of the parsing algorithm.)

The spec says:
> Bytes or sequences of bytes in the original byte stream that could  
> not be converted to Unicode characters must be converted to U+FFFD  
> REPLACEMENT CHARACTER code points.

The spec should probably say explicitly that such byte sequences are  
parse errors.
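
To illustrate the suggestion, here is a minimal Python sketch, not anything taken from the spec, of a decoder that emits U+FFFD for each undecodable byte sequence and also records that sequence as a parse error. The function name decode_with_parse_errors and the handler name "report-and-replace" are hypothetical. The sketch assumes that a (byte offset, offending bytes) pair is enough to describe a parse error.

```python
import codecs


def decode_with_parse_errors(data: bytes, encoding: str = "utf-8"):
    """Decode data, replacing undecodable sequences with U+FFFD.

    Returns (text, errors), where errors is a list of
    (byte_offset, offending_bytes) pairs, one per replaced sequence.
    The name and return shape are assumptions made for illustration.
    """
    errors = []

    def handler(exc):
        # Only decoding failures are handled here; re-raise anything else.
        if not isinstance(exc, UnicodeDecodeError):
            raise exc
        # Record the sequence as a parse error instead of dropping it silently.
        errors.append((exc.start, exc.object[exc.start:exc.end]))
        # Emit U+FFFD and resume decoding just past the bad sequence.
        return ("\ufffd", exc.end)

    # "report-and-replace" is a hypothetical handler name. Re-registering it
    # on every call keeps the sketch short but is not thread-safe.
    codecs.register_error("report-and-replace", handler)
    text = data.decode(encoding, errors="report-and-replace")
    return text, errors


if __name__ == "__main__":
    text, errors = decode_with_parse_errors(b"a\xffb")
    print(repr(text), errors)
```

A parser built this way could report every entry in errors through its usual parse-error channel, so that a conformance checker would flag the bad bytes while the decoded text still contains U+FFFD, as the quoted paragraph requires.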

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Wednesday, 18 July 2007 09:02:50 UTC