© Leif Halvard Silli - 2010.02.09, updated 2010.02.10. New update: 2010.03.22.
Test of NCR support in text/HTML UAs.
This page tests the following aspects of NCR support:
- 1) Semicolon
- Question: Does the UA always/sometimes require that the NCR is terminated with a semicolon?
- 2) Text
- For rendered text, does the intended character render? Does a replacement character show instead? Does the NCR, or bits of it, show instead?
- 3) Attributes
- Can the NCR encoding be used inside attributes, and do user agents recognize its meaning there? Is the attribute support identical to the text support?
- 4) Length
- How long can the NCR be? (How many superfluous zeros can there be in front of the actual character number before the NCR eventually stops working?)
- 5) Hexadecimal vs. decimal NCR
- Are there any differences between the support for hexadecimal NCRs and decimal NCRs?
- 6) Specifics about NCR termination (not yet tested)
- What I intend to test is whether NCRs work differently depending on various things, such as the presence or lack of whitespace after the Unicode number. And do NCRs work the same way inside all attributes? I suspect they don’t, but I don’t know yet. This is relevant to test both for NCRs that end with a semicolon and (of course) especially for those that do not.
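Why termination is worth testing can be sketched with Python's standard `html.unescape`, which follows the HTML5 character reference rules. This is an illustration only: Python is not one of the UAs tested on this page, and the sample word "über" is an assumption chosen for the example. Without a semicolon, a hexadecimal NCR can swallow following letters that happen to be hex digits.

```python
import html

# "über" written with an NCR for ü (U+00FC, decimal 252).
# The sample word is hypothetical; it is not taken from the test tables.
with_semicolon = html.unescape("&#xFC;ber")

# No semicolon: "b" and "e" are valid hex digits, so the decoder reads
# the number FCbe instead of FC, and only "r" is left as plain text.
hex_no_semicolon = html.unescape("&#xFCber")

# No semicolon, decimal: "b" is not a decimal digit, so the NCR ends at 252.
dec_no_semicolon = html.unescape("&#252ber")

# Whitespace after the number also ends the NCR.
hex_then_space = html.unescape("&#xFC ber")

for label, value in [
    ("hex with semicolon", with_semicolon),
    ("hex without semicolon", hex_no_semicolon),
    ("dec without semicolon", dec_no_semicolon),
    ("hex followed by space", hex_then_space),
]:
    # ascii() makes unexpected code points visible as escapes.
    print(f"{label}: {ascii(value)}")
```

A browser may resolve these cases differently from this decoder, which is exactly what test 6 is meant to find out.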
Description
The two tables below are the tests. They should be read as follows:
- Structure: Each cell in the columns titled Hex – ü(;) and Dec – ü(;) of each of the two tables contains the letter ü encoded as an NCR. The specifics of how it is encoded can be read in the table or seen in the source code. In addition, the class attribute of the same cell contains the same letter ü, encoded in exactly the same way as in the cell. This allows us to test NCR support in text and attributes simultaneously.
- Semicolon: In Table 1 all NCRs are terminated with semicolon. In Table 2 none of the NCRs have semicolon.
- Text: In each cell of the column Hex – ü(;) and column Dec – ü(;), check that the letter ü is readable.
- Attributes: A green background indicates that the NCR in use is supported at the attribute level (specifically in the class attribute, which is used as a selector. NB! I do not test CSS escaping!)
- Length: For each row a zero is added to the hex NCR and to the dec NCR – one more zero than on the preceding row. This allows us to see when the NCR(s) eventually break.
- Hexadecimal vs. decimal: Having two columns, one with hexadecimal NCRs and another with decimal NCRs, allows us to see whether the tested user agents behave differently with respect to one or the other kind of NCR.
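The construction described above can be sketched in Python. The helper `ncr` and the class `CellReader` are hypothetical names introduced for this illustration, and writing the hex digits as uppercase `FC` is an assumption (the page source may use lowercase). Python's `html` module stands in for a UA here; it is not one of the tested browsers.

```python
import html
from html.parser import HTMLParser


def ncr(zeros: int, hexadecimal: bool, semicolon: bool) -> str:
    """Build an NCR for ü (U+00FC) with `zeros` superfluous leading zeros."""
    prefix = "&#x" if hexadecimal else "&#"
    digits = "FC" if hexadecimal else "252"
    return prefix + "0" * zeros + digits + (";" if semicolon else "")


class CellReader(HTMLParser):
    """Collect the decoded class attribute and the decoded text of <td> cells."""

    def __init__(self):
        super().__init__()
        self.classes = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.classes.append(dict(attrs).get("class"))

    def handle_data(self, data):
        self.text.append(data)


# A Table 1 style cell: the same NCR in the class attribute and in the text.
cell_ncr = ncr(zeros=3, hexadecimal=True, semicolon=True)
reader = CellReader()
reader.feed(f'<td class="{cell_ncr}">{cell_ncr}</td>')
reader.close()
print(cell_ncr, reader.classes, reader.text)

# Every variant used in the two tables (0 to 13 zeros, hex and dec,
# with and without semicolon) decodes to ü in this reference decoder.
all_ok = all(
    html.unescape(ncr(z, h, s)) == "ü"
    for z in range(14)
    for h in (True, False)
    for s in (True, False)
)
print(all_ok)
```

A decoder that accepts every variant corresponds to the "100%" entries in the results table; the browsers that cap the number of zeros fail somewhere along the `range(14)` loop.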
Tests
Table 1. Escapes with semicolon.
| # (number of superfluous zeros inside the NCR) | Illustrated | Hex – ü (with semicolon) | Dec – ü (with semicolon) |
|---|---|---|---|
| 0 | – | ü | ü |
| 1 | 0 | ü | ü |
| 2 | 00 | ü | ü |
| 3 | 000 | ü | ü |
| 4 | 0000 | ü | ü |
| 5 | 00000 | ü | ü |
| 6 | 000000 | ü | ü |
| 7 | 0000000 | ü | ü |
| 8 | 00000000 | ü | ü |
| 9 | 000000000 | ü | ü |
| 10 | 0000000000 | ü | ü |
| 11 | 00000000000 | ü | ü |
| 12 | 000000000000 | ü | ü |
| 13 | 0000000000000 | ü | ü |
Table 2. Escapes without semicolon
| # (number of superfluous zeros inside the NCR) | Illustrated | Hex – ü (without semicolon) | Dec – ü (without semicolon) |
|---|---|---|---|
| 0 | – | ü | ü |
| 1 | 0 | ü | ü |
| 2 | 00 | ü | ü |
| 3 | 000 | ü | ü |
| 4 | 0000 | ü | ü |
| 5 | 00000 | ü | ü |
| 6 | 000000 | ü | ü |
| 7 | 0000000 | ü | ü |
| 8 | 00000000 | ü | ü |
| 9 | 000000000 | ü | ü |
| 10 | 0000000000 | ü | ü |
| 11 | 00000000000 | ü | ü |
| 12 | 000000000000 | ü | ü |
| 13 | 0000000000000 | ü | ü |
Results
Table of test results
| UA | 5) Hex vs Dec | 1) Semicolon: with | 1) Semicolon: without | 4) Length: 2) Text | 4) Length: 3) Attributes | Comments |
|---|---|---|---|---|---|---|
| Firefox | Hex | 100% | 100% | 100% | 100% | |
| | Dec | 100% | 100% | 100% | 100% | |
| Opera | Hex | 100% | 100% | 100% | 100% | |
| | Dec | 100% | 100% | 100% | 100% | |
| Mac IE5 | Hex | 100% | 100% | 100% | 100% | |
| | Dec | 100% | 100% | 100% | 100% | |
| Lynx | Hex | 100% | 100% | 100% | untested | |
| | Dec | 100% | 100% | 100% | 100% | |
| Lobo | Hex | 100% | nil | 100% | 100% | Lobo is a Java-based browser |
| | Dec | 100% | nil | 100% | 100% | |
| IE 6, 7, 8 | Hex | 100% | 50% | nil | max 4 zeros | |
| | Dec | 100% | 100% | max 4 zeros | max 4 zeros | |
| Webkit | Hex | 100% | 100% | max 6 zeros | max 6 zeros | |
| | Dec | 100% | 100% | max 5 zeros | max 5 zeros | |
| Konqueror | Hex | 100% | 100% | max 6 zeros | max 6 zeros | |
| | Dec | 100% | 100% | max 5 zeros | max 5 zeros | |
Conclusions
Caution: termination testing is not yet done.
The common UAs
«Common UAs» here means these browsers and browser families: IE, Mozilla, Webkit, Konqueror, Opera and Chrome (Chrome is assumed to behave like Webkit).
Semicolon
- Of the common UAs, only IE had a particular problem with NCRs without semicolon termination.
- But the IE issue is only related to hexadecimal NCRs, when used inside text. There were no problems when used inside attributes.
Length
- Firefox and Opera: No limitation on the length (the number of zeros) was found in this test.
- IE: supports 4 zeros. But without a semicolon, text written with hex NCRs gets no support, regardless of whether the NCR contains any superfluous zeros at all. For decimal NCRs, there are no such discrepancies within IE.
- Webkit and Konqueror: similar to IE, but support 6 zeros in hexadecimal NCRs and 5 zeros for decimal NCRs.
Observations
IE aligns NCR and CSS escape length
IE: It is interesting to note that the maximum number of alphanumeric characters that IE supports in a hexadecimal NCR is 6 (plus the "&#x" at the start and the ";" at the end). 6 is also the limit on the length of CSS escapes. Perhaps IE thus made the same calculation about how long an escape needs to be as the authors of CSS 2.1 did.
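The figure of 6 is the 4 superfluous zeros plus the two hex digits of FC. The same arithmetic can be applied to the other zero limits in the results table, as in the sketch below. It assumes that each limit is really a cap on the total number of digits rather than on the zeros as such, which a test with a single character (ü) cannot confirm. The dictionary names are hypothetical.

```python
# Zero limits as reported in the results table.
max_zeros = {
    "IE 6, 7, 8": {"hex": 4, "dec": 4},
    "Webkit": {"hex": 6, "dec": 5},
    "Konqueror": {"hex": 6, "dec": 5},
}

# ü is U+00FC: two hex digits ("FC"), three decimal digits ("252").
significant_digits = {
    "hex": len(format(0xFC, "X")),
    "dec": len(str(0xFC)),
}

# Total digits in the longest working NCR = superfluous zeros + real digits.
max_digits = {
    ua: {kind: zeros + significant_digits[kind] for kind, zeros in limits.items()}
    for ua, limits in max_zeros.items()
}

for ua, digits in max_digits.items():
    print(f"{ua}: hex {digits['hex']} digits, dec {digits['dec']} digits")
```

Under that assumption, the Webkit and Konqueror limits for hex and dec come out as the same total digit count, while the IE limits do not.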
Lynx
- Semicolon: Lynx has 100% support for NCRs that lack the semicolon.
- Length: Lynx has 100% support (equal to Firefox/Opera) for long NCRs, regardless of decimal or hexadecimal.
Lobo
- Semicolon: Lobo has zero support for NCRs without semicolon.
- Length: Lobo has no limitations on the length, and has full support both within attributes and text, for both decimal and hexadecimal NCRs.