This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 9291 - wrong XQueryX tests - double UTF-8 encoding
Summary: wrong XQueryX tests - double UTF-8 encoding
Status: RESOLVED FIXED
Alias: None
Product: XML Query Test Suite
Classification: Unclassified
Component: XML Query Test Suite
Version: 1.0.2
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Andrew Eisenberg
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL: http://zorba-xquery.com
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-22 15:40 UTC by Daniel Turcanu
Modified: 2010-05-07 14:45 UTC

Description Daniel Turcanu 2010-03-22 15:40:04 UTC
The XQueryX tests that contain Unicode characters have those characters encoded as UTF-8 twice. That is, a non-ASCII character that should occupy 2 bytes in UTF-8 ends up occupying 4 bytes.
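
For illustration only, here is a minimal Python sketch of how this kind of double encoding typically arises. It assumes the UTF-8 bytes were misread as Latin-1 and then encoded as UTF-8 again; the report does not say which tool or step caused it, and the helper name `double_encode` is hypothetical.

```python
def double_encode(text: str) -> bytes:
    """Encode text as UTF-8, misread the bytes as Latin-1, then encode again.

    Assumption: this Latin-1 round trip is one common cause of double
    encoding; the actual cause in the test suite is not stated in the report.
    """
    once = text.encode("utf-8")
    return once.decode("latin-1").encode("utf-8")


ch = "\u00e9"  # é, a non-ASCII character
once = ch.encode("utf-8")
twice = double_encode(ch)

# Print the byte sequences as hex so the two forms can be compared.
print(once.hex(" ").upper())   # correctly encoded form
print(twice.hex(" ").upper())  # double-encoded form
```

Under that assumption, the single character becomes the two bytes C3 A9 when encoded once, and the four bytes C3 83 C2 A9 when encoded twice.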

The failing tests are:
XQueryX/EncodeURIfunc/K-EncodeURIfunc-4
XQueryX/EscapeHTMLURIFunc/K-EscapeHTMLURIFunc-5
XQueryX/Functions/AllStringFunc/AssDisassStringFunc/StringToCodepointFunc/fn-string-to-codepoints1args-4
XQueryX/Functions/AllStringFunc/EscapingFuncs/EncodeURIfunc/fn-encode-for-uri1args-2
XQueryX/Functions/AllStringFunc/EscapingFuncs/EscapeHTMLURIFunc/fn-escape-html-uri1args-2
XQueryX/Functions/AllStringFunc/EscapingFuncs/IRIToURIfunc/fn-iri-to-uri1args-2
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-12
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-19
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-20
XQueryX/StringToCodepointFunc/K-StringToCodepointFunc-21

The testing was performed using Zorba XQuery 1.1.
Comment 1 Andrew Eisenberg 2010-03-24 20:57:42 UTC
Sorry, but I am not seeing the problem that you describe.

I looked at the first test case you listed, K-EncodeURIfunc-4. The XQuery contains encode-for-uri("~bébé") ... I see the string literal as bytes 7E 62 C3 A9 62 C3 A9. The XQueryX that is generated is:

              <xqx:functionCallExpr>
                <xqx:functionName>encode-for-uri</xqx:functionName>
                <xqx:arguments>
                  <xqx:stringConstantExpr>
                    <xqx:value>~b&#233;b&#233;</xqx:value>
                  </xqx:stringConstantExpr>
                </xqx:arguments>
              </xqx:functionCallExpr>

The Unicode characters that occupy two bytes in UTF-8 are being replaced by numeric character references (charRefs) in the XQueryX that is generated.
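
As a rough check of the point above, the following Python sketch parses just the `xqx:value` element and confirms that the character references resolve to the same bytes as the XQuery string literal. The standalone fragment and the namespace URI `http://www.w3.org/2005/XQueryX` are assumptions added so that the snippet parses on its own; they are not copied from the test file.

```python
import xml.etree.ElementTree as ET

# Standalone fragment for illustration; the namespace declaration is an
# assumption added here so the xqx prefix resolves.
fragment = (
    '<xqx:value xmlns:xqx="http://www.w3.org/2005/XQueryX">'
    "~b&#233;b&#233;"
    "</xqx:value>"
)

# The XML parser resolves &#233; to the character U+00E9.
value_text = ET.fromstring(fragment).text

# Encode the resolved string once as UTF-8 and show the bytes as hex.
utf8_hex = value_text.encode("utf-8").hex(" ").upper()
print(utf8_hex)
```

If the output matches 7E 62 C3 A9 62 C3 A9, the XQueryX carries the same string as the XQuery source, with no double encoding.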

Comment 2 Andrew Eisenberg 2010-05-03 22:04:12 UTC
Daniel, if I don't receive any further information from you, then I will have to close this bug report without making any changes.
Comment 3 Daniel Turcanu 2010-05-07 14:45:28 UTC
Ok, I just checked them all and they work fine. They must have been fixed in the latest XQTS.