This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 17490 - `entities.json` uses invalid syntax and has incorrect content
Summary: `entities.json` uses invalid syntax and has incorrect content
Status: RESOLVED FIXED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL: http://www.whatwg.org/specs/web-apps/...
Reported: 2012-06-14 19:29 UTC by contributor
Modified: 2012-09-19 22:41 UTC
CC List: 3 users

Attachments
Valid, working version (52.17 KB, application/octet-stream)
2012-06-14 19:30 UTC, Mathias Bynens

Description contributor 2012-06-14 19:29:07 UTC
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/named-character-references.html
Multipage: http://www.whatwg.org/C#named-character-references
Complete: http://www.whatwg.org/c#named-character-references

Comment:
`entities.json` uses invalid syntax and has incorrect content

Posted from: 78.20.165.163 by mathias@qiwi.be
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1173.0 Safari/537.1
Comment 1 Mathias Bynens 2012-06-14 19:30:30 UTC
Created attachment 1144
Valid, working version

Based on http://mathias.html5.org/tests/html/named-character-references/data.json
Comment 2 Mathias Bynens 2012-06-15 07:27:19 UTC
http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json 
currently has the following format:

    {
      "&AElig": { "codepoints": [0x000C6], "characters": "\u00C6" },
      …
    }

However, hexadecimal integer literals (although valid in JavaScript) aren’t
allowed in JSON.

The easiest solution would be to use the numerical value in decimal notation
instead, e.g. `198` instead of `0x000C6`.

Another solution would be to make the `codepoints` property an array of strings
instead of hexadecimal integers.

(You can check for JSON conformance using a tool like http://jsonlint.com/.)
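The same check can be done locally. What follows is a minimal sketch, assuming Python's standard `json` module as the validator; the helper name `is_valid_json` and the one-entry sample strings are illustrative only and are not taken from the actual file.

```python
import json

# One-entry samples modelled on the format quoted above (illustrative only).
hex_version = '{"&AElig": {"codepoints": [0x000C6], "characters": "\\u00C6"}}'
decimal_version = '{"&AElig": {"codepoints": [198], "characters": "\\u00C6"}}'
string_version = '{"&AElig": {"codepoints": ["0x000C6"], "characters": "\\u00C6"}}'


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# The hexadecimal literal is rejected; the decimal and string forms parse.
for label, text in [("hex", hex_version),
                    ("decimal", decimal_version),
                    ("string", string_version)]:
    print(label, is_valid_json(text))
```

Both proposed alternatives parse, but only the decimal form keeps `codepoints` as an array of numbers; with the string form, consumers would have to convert each value themselves.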
Comment 3 Mathias Bynens 2012-06-15 07:41:41 UTC
Possible fix for `entity-processor-json.py`:

Replace:

    codes = '0x' + value[1:6] + ', 0x' + value[7:]

With:

    codes = str(int(value[1:6], 16)) + ', ' + str(int(value[7:], 16))

And replace:

    codes = '0x' + value[1:]

With:

    codes = str(int(value[1:], 16))
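For illustration, the two replacements can be combined into one function. This is a sketch under stated assumptions, not the actual script: the function name `codes_as_decimal`, the length test used to tell one code point from two, and the sample `value` strings are all hypothetical. The only thing taken from the snippets above is the slicing (one prefix character at index 0, five hex digits, one separator character at index 6, then the remaining hex digits).

```python
import json


def codes_as_decimal(value):
    """Build the `codes` string with decimal numbers (hypothetical wrapper).

    Assumption: `value` is one prefix character followed by hex digits,
    optionally followed by one separator character and a second hex group.
    """
    if len(value) > 6:
        # Two code points: hex digits at [1:6] and from index 7 onward.
        return str(int(value[1:6], 16)) + ', ' + str(int(value[7:], 16))
    # Single code point: everything after the prefix character.
    return str(int(value[1:], 16))


# Hypothetical inputs; the real input format of the script is not shown here.
single = codes_as_decimal('x000C6')
double = codes_as_decimal('x0223Ex00333')

# Wrapped in brackets, the result is a valid JSON array of numbers.
print(json.loads('[' + single + ']'))
print(json.loads('[' + double + ']'))
```

Because `int(..., 16)` parses the hex digits and `str()` writes them back in decimal, the output no longer contains `0x` literals and the leading zeros disappear as well.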
Comment 4 Mathias Bynens 2012-06-16 07:14:21 UTC
Heads up: both http://www.whatwg.org/specs/web-apps/current-work/entities.json and http://www.whatwg.org/specs/web-apps/current-work/multipage/entities.json still show the old, invalid version.
Comment 5 contributor 2012-07-18 17:46:25 UTC
This bug was cloned to create bug 18232 as part of operation convergence.
Comment 6 Ian 'Hixie' Hickson 2012-07-20 02:28:31 UTC
Odd.
Comment 7 Ian 'Hixie' Hickson 2012-09-19 22:41:11 UTC
Oh, I see, it's the hardcoded legacy values that were still wrong. Fixed.