This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
This was cloned from bug 14430 as part of operation convergence.
Originally filed: 2011-10-11 16:12:00 +0000
Original reporter: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>

================================================================================
#0 Leif Halvard Silli 2011-10-11 16:12:19 +0000
--------------------------------------------------------------------------------
* The 'Glyph' column for ⟩ contains 〉 whereas the 'Character(s)' column says that ⟩ has been redefined since HTML4 and is now pointing to ⟩/⟩

And ditto:

* The 'Glyph' column for ⟨ contains 〈 whereas the 'Character(s)' column says that ⟨ has been redefined since HTML4 and is now pointing to ⟨/⟨

Or to quote the source code in the spec:

<tr id="entity-rang"> <td> <code title="">rang;</code> </td> <td> U+027E9 </td> <td> <span class="glyph" title="">〉</span> </td> </tr>
<tr id="entity-lang"> <td> <code title="">lang;</code> </td> <td> U+027E8 </td> <td> <span class="glyph" title="">〈</span> </td> </tr>

Btw, you might also want to see bug 14429

================================================================================
#1 Ian 'Hixie' Hickson 2011-10-21 22:53:03 +0000
--------------------------------------------------------------------------------
That's a bug in the preprocessor's parser, I think.

================================================================================
#2 Ian 'Hixie' Hickson 2012-01-27 18:26:52 +0000
--------------------------------------------------------------------------------
Is this still broken?

================================================================================
#3 Leif Halvard Silli 2012-03-01 16:28:16 +0000
--------------------------------------------------------------------------------
Yes, it is still broken. Tested today.

================================================================================
So in .../index, the character is confusingly just output as "⟩", but in multipage/named*.html, it's output as 0xe2 0x8c 0xaa, which is U+232A (9002). Both are suboptimal. Any tools people know the cause or what I should do here?
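The byte sequence quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the names `observed` and `expected` are made up here and are not part of the spec tooling. It decodes the three bytes and compares the result with the code point listed in the 'Character(s)' column.

```python
import unicodedata

# Bytes reported in multipage/named*.html for the rang; glyph.
observed = bytes([0xE2, 0x8C, 0xAA]).decode("utf-8")

# Code point that the 'Character(s)' column lists for rang; (U+027E9).
expected = "\u27e9"

for label, ch in (("observed", observed), ("expected", expected)):
    # Print the code point in hex and decimal, plus its Unicode name.
    print(label, f"U+{ord(ch):04X}", ord(ch), unicodedata.name(ch))

# The two characters differ, which is the mismatch described in this bug.
print("match:", observed == expected)
```

The decoded character is U+232A (decimal 9002), not U+27E9, so the bytes in the multipage output do not match the table's own 'Character(s)' column.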
From what I remember this was because of Python / lxml and we had not figured out a way of solving it other than hacking the output. (This is a duplicate of a bug from David Carlisle.)
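As background only: the thread does not pin down the exact cause, and whether lxml or the preprocessor consulted an HTML4-era entity table is an assumption, not something confirmed here. Python's standard library does expose both the HTML 4.01 and the HTML5 definitions of these two names, which makes the redefinition mentioned in the original report easy to see:

```python
import html
from html.entities import name2codepoint

for name in ("lang", "rang"):
    # HTML 4.01 entity table: maps the bare name to a code point.
    html4 = name2codepoint[name]
    # HTML5 named character reference, resolved by html.unescape().
    html5 = ord(html.unescape(f"&{name};"))
    print(f"&{name};  HTML4: U+{html4:04X}  HTML5: U+{html5:04X}")
```

A tool that resolves `&rang;` through the older table would produce U+232A, the character seen in the multipage output, while the HTML5 table yields U+27E9.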