This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
When I parse the expression '\u0045' I see the following:
|START [1:0 - 1:8]
| QueryList [1:0 - 1:8]
| Module [1:0 - 1:8]
| MainModule [1:0 - 1:8]
| Prolog [1:0 - 0:0]
| QueryBody [1:0 - 1:8]
| Expr [1:0 - 1:8]
| StringLiteral 'E' [2:1 - 1:8]
I expected:
| StringLiteral '\u0045' [2:1 - 1:8]
The conversion of Unicode escapes (\uHHHH) is happening in code that JavaCC generates. The odd thing is that we're explicitly telling JavaCC (via the JAVA_UNICODE_ESCAPE option) to generate that code, and we've done so since 2005/03/25. It looks like this came about from confusing JAVA_UNICODE_ESCAPE with the UNICODE_INPUT option (which controls whether the input stream object reads "Unicode files" or ASCII files).
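To illustrate the effect described above, here is a minimal sketch of the kind of pre-processing that Java-style Unicode escape handling performs on the input before the tokenizer sees it. This is not the code JavaCC generates; the class name UnicodeEscapeDemo and the method decodeUnicodeEscapes are hypothetical, and the sketch ignores details of real Java escape handling such as runs of backslashes and repeated 'u' characters.

```java
public class UnicodeEscapeDemo {

    // Hypothetical helper, not JavaCC's generated code: replaces each
    // backslash-u escape (backslash, 'u', four hex digits) with the
    // character it names, before any tokenizing would take place.
    public static String decodeUnicodeEscapes(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\' && i + 6 <= input.length()
                    && input.charAt(i + 1) == 'u'
                    && isHex(input, i + 2, i + 6)) {
                // Convert the four hex digits to a single character.
                out.append((char) Integer.parseInt(input.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                // Anything else is copied through unchanged.
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // True if every character of s in [from, to) is a hexadecimal digit.
    private static boolean isHex(String s, int from, int to) {
        for (int k = from; k < to; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The query text from the report: a quote, a backslash-u escape, a quote.
        String source = "'\\u0045'";
        System.out.println(decodeUnicodeEscapes(source)); // prints 'E'
    }
}
```

With a step like this in front of the lexer, the string literal token can only ever contain E, which matches the StringLiteral 'E' node in the reported parse tree.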
Unfortunately, simply disabling the option causes some tests to fail, so this will take some investigation.
This issue came to my attention due to test cases in our QT3 test suite. I've reported the bug there as Bug #14328.
*** Bug 29568 has been marked as a duplicate of this bug. ***
I tried disabling the JAVA_UNICODE_ESCAPE option again today, and (to my surprise) the test failures mentioned in comment #2 no longer occurred. I'm surprised because the version of JavaCC I'm using hasn't changed since I wrote comment #2. However, the versions of Java and Ant that I'm using *have* changed, so perhaps the bug lay somewhere in there. Anyhow, the applets now deliver the result that Andrew originally expected, so I'm marking this bug resolved-fixed.