22873 – Tokenizing character references: remove redundant code points

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED FIXED

Alias:	None

Product:	WHATWG
Classification:	Unclassified
Component:	HTML
Version:	unspecified
Hardware:	PC All

Importance:	P2 trivial
Target Milestone:	Unsorted
Assignee:	Ian 'Hixie' Hickson
QA Contact:	contributor

Reported:	2013-08-04 09:04 UTC by Mathias Bynens
Modified:	2013-08-05 18:15 UTC
CC List:	3 users

Description Mathias Bynens 2013-08-04 09:04:48 UTC

http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#tokenizing-character-references

“Otherwise, return a character token for the Unicode character whose code point is that number. Additionally, if the number is in the range 0x0001 to 0x0008, 0x000E to 0x001F, 0x007F to 0x009F, 0xFDD0 to 0xFDEF, or is one of 0x000B, 0xFFFE, 0xFFFF, 0x1FFFE, 0x1FFFF, 0x2FFFE, 0x2FFFF, 0x3FFFE, 0x3FFFF, 0x4FFFE, 0x4FFFF, 0x5FFFE, 0x5FFFF, 0x6FFFE, 0x6FFFF, 0x7FFFE, 0x7FFFF, 0x8FFFE, 0x8FFFF, 0x9FFFE, 0x9FFFF, 0xAFFFE, 0xAFFFF, 0xBFFFE, 0xBFFFF, 0xCFFFE, 0xCFFFF, 0xDFFFE, 0xDFFFF, 0xEFFFE, 0xEFFFF, 0xFFFFE, 0xFFFFF, 0x10FFFE, or 0x10FFFF, then this is a parse error.”
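A rough Python sketch of the quoted clause, for reference. It assumes the ranges exactly as quoted above and models only the "Additionally, ..." check, not the earlier steps of the algorithm (the override table, surrogates, numbers above 0x10FFFF). The names `PARSE_ERROR_RANGES`, `PARSE_ERROR_SINGLES` and `is_parse_error_code_point` are made up for illustration and do not come from the spec.

```python
# Inclusive ranges, copied from the quoted spec text.
PARSE_ERROR_RANGES = [
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xFDD0, 0xFDEF),
]

# Individually listed code points: U+000B plus the last two code points
# (xFFFE and xFFFF) of each of the 17 Unicode planes.
PARSE_ERROR_SINGLES = {0x000B} | {
    (plane << 16) | low
    for plane in range(0x11)
    for low in (0xFFFE, 0xFFFF)
}


def is_parse_error_code_point(number: int) -> bool:
    """Return True if the quoted clause flags this number as a parse error."""
    if number in PARSE_ERROR_SINGLES:
        return True
    return any(lo <= number <= hi for lo, hi in PARSE_ERROR_RANGES)
```

With these definitions, `is_parse_error_code_point(0x81)` is true, while `is_parse_error_code_point(0x0D)` is false because U+000D sits between the first two ranges.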

This list includes the following code points, which are already listed in the table above (http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#table-charref-overrides), where each maps to the Unicode character with the same code point:

* U+0081
* U+000D
* U+008D
* U+008F
* U+0090
* U+009D

I guess these can be removed from the table.

Comment 1 Mathias Bynens 2013-08-04 09:06:15 UTC

(Found via <https://github.com/mathiasbynens/he/compare/8bd18e6cdf4071f04c0ed9583f2d96b500db1da3...842c259d3923bf8ddd7a7fe79f77527cb81bbeb7#diff-5>.)

Comment 2 Ian 'Hixie' Hickson 2013-08-05 18:15:27 UTC

Actually, it didn't include U+000D; I had to add that in, which made the list not match the previous list. But it's still a good change. Thanks.
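The U+000D point can be checked mechanically against the list quoted in the description. A small Python sketch, assuming the ranges as quoted there; the names `QUOTED_RANGES`, `REPORTED` and `in_quoted_list` are illustrative only.

```python
# Inclusive ranges from the spec text quoted in the description.
QUOTED_RANGES = [
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xFDD0, 0xFDEF),
]

# The six code points named in the description.
REPORTED = [0x0081, 0x000D, 0x008D, 0x008F, 0x0090, 0x009D]


def in_quoted_list(cp: int) -> bool:
    """True if cp is in a quoted range or is one of the quoted single values."""
    if cp == 0x000B:
        return True
    # The low 16 bits are FFFE or FFFF: the last two code points of a plane.
    if (cp & 0xFFFE) == 0xFFFE:
        return True
    return any(lo <= cp <= hi for lo, hi in QUOTED_RANGES)


not_covered = [f"U+{cp:04X}" for cp in REPORTED if not in_quoted_list(cp)]
print(not_covered)  # prints ['U+000D']
```

Only U+000D falls outside the quoted list; the other five are inside the 0x007F to 0x009F range.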

Comment 3 contributor 2013-08-05 18:15:37 UTC

Checked in as WHATWG revision r8128.
Check-in comment: Hide redundant rows since otherwise it'd be two parse errors for no reason.
http://html5.org/tools/web-apps-tracker?from=8127&to=8128