This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.
http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenizing-character-references

“Otherwise, return a character token for the Unicode character whose code point is that number. Additionally, if the number is in the range 0x0001 to 0x0008, 0x000E to 0x001F, 0x007F to 0x009F, 0xFDD0 to 0xFDEF, or is one of 0x000B, 0xFFFE, 0xFFFF, 0x1FFFE, 0x1FFFF, 0x2FFFE, 0x2FFFF, 0x3FFFE, 0x3FFFF, 0x4FFFE, 0x4FFFF, 0x5FFFE, 0x5FFFF, 0x6FFFE, 0x6FFFF, 0x7FFFE, 0x7FFFF, 0x8FFFE, 0x8FFFF, 0x9FFFE, 0x9FFFF, 0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE, 0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or 0x10FFFF, then this is a parse error.”

This includes the following code points, which map to their matching Unicode symbol but were already listed in the table above (http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#table-charref-overrides):

* U+0081
* U+000D
* U+008D
* U+008F
* U+0090
* U+009D

I guess these can be removed from the table.
(Found via <https://github.com/mathiasbynens/he/compare/8bd18e6cdf4071f04c0ed9583f2d96b500db1da3...842c259d3923bf8ddd7a7fe79f77527cb81bbeb7#diff-5>.)
Actually, it didn't include U+000D; I had to add that in, which made the list not match the previous list. But it's still a good change. Thanks.
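As a rough illustration of the point about U+000D, the quoted parse-error condition can be written as a small membership check and run against the six code points from the report. This is only a sketch: the names `PARSE_ERROR_RANGES`, `is_parse_error_code_point`, and `CANDIDATES` are made up for this example, and the ranges are copied from the spec text quoted in the report, not from any parser implementation.

```python
# Inclusive ranges copied from the quoted spec text.
# All names in this sketch are hypothetical, chosen for illustration.
PARSE_ERROR_RANGES = [
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xFDD0, 0xFDEF),
]


def is_parse_error_code_point(cp: int) -> bool:
    """Return True if cp falls under the quoted parse-error condition."""
    if any(lo <= cp <= hi for lo, hi in PARSE_ERROR_RANGES):
        return True
    if cp == 0x000B:
        return True
    # The remaining listed values (0xFFFE, 0xFFFF, 0x1FFFE, ..., 0x10FFFF)
    # are the last two code points of each of the 17 planes.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFF) in (0xFFFE, 0xFFFF)


# The six code points listed in the original report.
CANDIDATES = [0x0081, 0x000D, 0x008D, 0x008F, 0x0090, 0x009D]

covered = [cp for cp in CANDIDATES if is_parse_error_code_point(cp)]
not_covered = [cp for cp in CANDIDATES if not is_parse_error_code_point(cp)]

for cp in CANDIDATES:
    status = "covered" if cp in covered else "not covered"
    print(f"U+{cp:04X}: {status}")
```

Under these assumptions, U+000D is the only one of the six that the quoted condition does not cover, since it sits between U+000B and the range starting at 0x000E.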
Checked in as WHATWG revision r8128.
Check-in comment: Hide redundant rows since otherwise it'd be two parse errors for no reason.
http://html5.org/tools/web-apps-tracker?from=8127&to=8128