6998 – Test orderBy35 / values larger than 1.0E6

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Status:	RESOLVED INVALID

Alias:	None

Product:	XML Query Test Suite
Classification:	Unclassified
Component:	XML Query Test Suite
Version:	1.0.2
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	Andrew Eisenberg
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2009-06-05 18:34 UTC by Bogdan Butnaru
Modified:	2010-02-24 21:04 UTC
CC List:	2 users

Description Bogdan Butnaru 2009-06-05 18:34:33 UTC

Hello! I suppose this was discussed before, but I can't find any explicit decision. (By the way, I'm basing this on version 1.0.2 of the TS, I haven't checked if the CVS version does things differently, yet.)

Test orderBy35 outputs several xs:float values representing powers of ten from zero to 17. The expected result is “1.0E17 1.0E16 1.0E15 1.0E14 1.0E13 1.0E12 1.0E11 1.0E10 1.0E9 1.0E8 1.0E7 1.0E6 100000 10000 1000 100 10 1 0”.

However, for values over 1E6 the specifications allow several different representations. (Java, for instance, converts 1.0E17f to string “9.9999998E16”.)
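As an illustration of the point above, here is a minimal Java sketch. It assumes that Java's Float.parseFloat is an acceptable stand-in for casting a string to xs:float; the class and method names are made up for this example. It checks that the string expected by the test and the string Java has traditionally produced denote the same float, and it shows the exact value of that float. It deliberately avoids Float.toString, whose output may differ between Java versions.

```java
import java.math.BigDecimal;

class FloatRepresentations {
    /** True when both decimal strings parse to the same single-precision value. */
    static boolean sameFloat(String a, String b) {
        return Float.compare(Float.parseFloat(a), Float.parseFloat(b)) == 0;
    }

    /** Exact decimal expansion of the float nearest to the given decimal string. */
    static String exactValue(String lexical) {
        return new BigDecimal(Float.parseFloat(lexical)).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(sameFloat("1.0E17", "9.9999998E16")); // true
        System.out.println(exactValue("1.0E17"));                // 99999998430674944
    }
}
```

Both strings round-trip to the same value, so a comparison on strings alone cannot tell which of them a conforming processor should emit.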

The FO specs mention this, and the test suite documentation also acknowledges that for tests that involve operations on floats/doubles and converting those results to strings, even as one explicit value is given, other values may also be acceptable.

In view of the last paragraph, I'm not sure if this would be regarded as a bug.

However, at least for this case, the test could be changed to avoid the problem. For instance, the test as written could remove all values above 1E6; separate tests could then check that, e.g.:
* “xs:float(1000000000) eq xs:float("1.0E9")”,
* “xs:string(xs:float("1E9"))” matches a specific regex, and that
* “xs:float("1.0E17") lt xs:float("1.00000007E17") and xs:float("1.0E17") gt xs:float("9.999999E16")”.

(The numbers in the last expression correspond to what Java's StrictMath says are the next highest and next lowest float values around 1.0E17; I assume it's correct about that...)
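The neighbouring values can be double-checked with a short Java sketch. It assumes Java 8 or later (for Math.nextDown on float) and, again, uses Float.parseFloat as a stand-in for xs:float casting; the class name is hypothetical.

```java
import java.math.BigDecimal;

class FloatNeighbours {
    /** Exact decimal expansion of a float, with no rounding. */
    static String exact(float f) {
        return new BigDecimal(f).toPlainString();
    }

    public static void main(String[] args) {
        float f = 1.0E17f;
        System.out.println(exact(Math.nextDown(f))); // 99999989840740352
        System.out.println(exact(f));                // 99999998430674944
        System.out.println(exact(Math.nextUp(f)));   // 100000007020609536
        // The two literals used in the proposed test parse to exactly these neighbours.
        System.out.println(Float.parseFloat("9.999999E16") == Math.nextDown(f));  // true
        System.out.println(Float.parseFloat("1.00000007E17") == Math.nextUp(f));  // true
    }
}
```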

Something similar could be done for most tests that involve conversion between float and string values (this probably applies to double, as well). It seems verbose, but other than changing the FO spec to create a unique canonical representation* for float values (which is not realistic, except maybe for 1.1) I don't see any other solution.

(*: For instance, the canonical representation of a float value could be defined as “the shortest string that xs:float would convert to that value; if there are several, the first as ordered by the codepoint collation”. Though that may be computationally intensive, I haven't checked.)
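A rough Java sketch of the "shortest string" idea follows. It is only an approximation of the definition suggested above: it tries just the correctly rounded candidate at each length, so it may occasionally miss a shorter string that is not the nearest one, and it ignores the codepoint tie-break. The class name is hypothetical and Float.parseFloat stands in for xs:float casting.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

class ShortestRoundTrip {
    /**
     * Returns the exact value of f rounded to the fewest significant digits
     * (among the nearest-rounded candidates) that still parse back to f.
     * Assumes f is finite.
     */
    static BigDecimal shortest(float f) {
        BigDecimal exact = new BigDecimal(f);
        for (int digits = 1; digits <= 9; digits++) {
            BigDecimal candidate = exact.round(new MathContext(digits, RoundingMode.HALF_EVEN));
            if (Float.parseFloat(candidate.toString()) == f) {
                return candidate;
            }
        }
        // Not expected to be reached: 9 significant digits identify any float.
        return exact;
    }

    public static void main(String[] args) {
        System.out.println(shortest(1.0E17f).precision()); // 1
        System.out.println(shortest(Math.nextUp(1.0E17f)));
    }
}
```

At most nine parse operations are needed per value, so at least this approximation is not computationally intensive.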

Comment 1 Michael Kay 2009-06-05 19:41:18 UTC

Actually, there's nothing special about float values above 1E6. The problem applies to all floating-point values. For example, the query float('1.0') can legitimately produce the output 1.0000000000000000001.

We've generally been pragmatic about this in the test suite. Sometimes we've added alternative allowed results, sometimes we've changed the test to do something like round-half-to-even() to reduce the number of digits, and the fallback is to say simply that you can claim a pass if your results are correct even if they differ from the published results.

Early on I argued we should have a "wrapped" format for results that would allow values to be labelled with their type allowing a better canonical comparison, but that's a lot of complexity to introduce, so we'll just have to continue handling things on a case-by-case basis.

Comment 2 Bogdan Butnaru 2009-06-06 01:10:57 UTC

(In reply to comment #1)
> Actually, there's nothing special about float values above 1E6. The problem
> applies to all floating-point values. For example, the query float('1.0') can
> legitimately produce the output 1.0000000000000000001.

I'm not sure about this one. Of course, “xs:float(REPR)” _can_ produce other representations than REPR. However, unless I'm misreading something, xs:string(xs:float('1.0')) can only produce the string "1.0E0":

a) First the string '1.0' is converted to a value in the float value space. Neither [1] nor [2] explicitly say how to do that, but presumably [3] applies, which means that the xs:float value is exactly 1 × 2^0 (i.e., 1). 

b) Converting this value to a string is done according to [4]; the constraint under the first sub-bullet applies, since the absolute value is between one millionth (inclusive) and one million (exclusive). Thus, the value is converted to an xs:decimal and the resulting xs:decimal is converted to an xs:string according to the rules above, “as though using an implementation of xs:decimal that imposes no limits on the totalDigits or fractionDigits facets”.

c) According to [5], “[i]f ST is xs:float or xs:double, then TV is the xs:decimal value, within the set of xs:decimal values that the implementation is capable of representing, that is numerically closest to SV. If two values are equally close, then the one that is closest to zero is chosen.” In this case, the numerically closest xs:decimal value is “1 × 10^0”. Incidentally, this value must be exactly represented by all minimally conforming processors (totalDigits is 1); however, even if that were not true, [4] says to do it as though there were no limit on totalDigits or fractionDigits.

d) According to [1], this (unlimited-precision) decimal is transformed to a string according to a canonical representation (xs:integer if applicable, xs:decimal otherwise), which is by definition unique.

[1] http://www.w3.org/TR/2007/REC-xpath-functions-20070123/#casting-from-strings
[2] http://www.w3.org/TR/2007/REC-xpath-functions-20070123/#casting-to-float
[3] http://www.w3.org/TR/xmlschema-2/#float
[4] http://www.w3.org/TR/2007/REC-xpath-functions-20070123/#casting-to-string
[5] http://www.w3.org/TR/2007/REC-xpath-functions-20070123/#casting-to-decimal

In fact, unless there's something wrong in my analysis above, all float and double values between 1.0E-6 and 1.0E6 have exactly one possible representation when converted to an xs:string.

* * *

However, all the above—and especially the “unlimited precision” part—would imply that, e.g., xs:float('0.1') can only be represented as "0.100000001490116119384765625". 

That doesn't seem to be what the specs intend (for double, the number of digits can be 55). Also, a lot of the tests would be wrong.
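The exact expansions mentioned here can be reproduced with Java's BigDecimal, which converts a binary floating-point value to decimal without rounding. This is a sketch with a hypothetical class name; it assumes Java's parsing matches xs:float and xs:double casting for these inputs.

```java
import java.math.BigDecimal;

class ExactDecimalExpansion {
    /** Exact decimal value of the float nearest to the given string. */
    static BigDecimal ofFloat(String lexical) {
        return new BigDecimal(Float.parseFloat(lexical));
    }

    /** Exact decimal value of the double nearest to the given string. */
    static BigDecimal ofDouble(String lexical) {
        return new BigDecimal(Double.parseDouble(lexical));
    }

    public static void main(String[] args) {
        System.out.println(ofFloat("1.0").toPlainString()); // 1
        System.out.println(ofFloat("0.1").toPlainString()); // 0.100000001490116119384765625
        System.out.println(ofDouble("0.1").scale());        // 55 digits after the decimal point
    }
}
```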

What am I missing?

(Until now I've been doing the float-to-string conversion via Java's functions, which seems to make most tests happy. However, using the strict procedure above I get different results on most tests.)

Comment 3 Bogdan Butnaru 2009-06-06 01:16:14 UTC

(In reply to comment #2)
> (Until now I've been doing the float-to-string conversion via Java's functions,
> which seems to make most tests happy. However, using the strict
> procedure above I get different results on most tests.)

Actually, for things like orderBy35 I've had to change to a method that returns the shortest string that gives the original value when converted to float. Which is not exactly what Java's Float.toString(float) does.

But the analysis of the specs I give above left me a bit confused.

Comment 4 Michael Kay 2009-06-06 15:18:49 UTC

In my implementation I gave up on the Java code for float->string a while ago and wrote my own, based on published algorithms.

You seem to be correct that the rules in the spec are stricter for values in the range (1e-6, 1e6) than for values outside that range. My memory was that we had relaxed the rules at a fairly late stage of development, but it seems this was only for the values outside this range. Compare the CR spec (http://www.w3.org/TR/2005/CR-xpath-functions-20051103/#casting-to-string) with the final version.

I seem to remember experimenting with a "shortest string" approach at some stage. It's not quite the same as the "nearest value" rule required for the 1e-6-to-1e6 case, or by the old CR rules.

Comment 5 Bogdan Butnaru 2009-06-06 22:51:54 UTC

(In reply to comment #4)
> You seem to be correct that the rules in the spec are stricter for values in
> the range (1e-6, 1e6) than for values outside that range.

Should I compile a list of tests that give different results than what the spec suggests?

Comment 6 Andrew Eisenberg 2009-06-22 20:35:14 UTC

I support the response Mike provided in comment #1 with respect to pragmatism:

> We've generally been pragmatic about this in the test suite.
> Sometimes we've added alternative allowed results, sometimes
> we've changed the test to do something like round-half-to-even()
> to reduce the number of digits, and the fallback is to say
> simply that you can claim a pass if your results are correct
> even if they differ from the published results.


If you'd like us to add additional expected results to the test suite, then for each test case please supply us with a) test case name, b) current expected result, and c) new expected result.

Comment 7 Josh Spiegel 2009-07-07 20:35:44 UTC

I am having the same issues as Bogdan.

It seems that for a given value, there is one closest float value that approximates it. For example, take 1.0E17 mentioned by Bogdan above. The closest float value to this is the one with exponent 183 and mantissa 11641532 (i.e. 11641532 * 2 ^ (183-150) == 99,999,998,430,674,944). Given that XQuery implementations must conform to IEEE 754, all implementations should have this same internal representation for 1.0E17.
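The decomposition above can be verified in Java by reading the IEEE 754 bit fields directly. This is a minimal sketch with hypothetical names; it handles only normal floats whose biased exponent is at least 150 (that is, integer-valued floats of this magnitude).

```java
import java.math.BigInteger;

class FloatBits {
    /** Biased 8-bit exponent field of a float. */
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    /** 24-bit significand including the implicit leading 1 (normal numbers only). */
    static int significand(float f) {
        return (Float.floatToIntBits(f) & 0x7FFFFF) | 0x800000;
    }

    /** significand * 2^(biasedExponent - 150); assumes biasedExponent >= 150 and f > 0. */
    static BigInteger exactInteger(float f) {
        return BigInteger.valueOf(significand(f)).shiftLeft(biasedExponent(f) - 150);
    }

    public static void main(String[] args) {
        float f = Float.parseFloat("1.0E17");
        System.out.println(biasedExponent(f)); // 183
        System.out.println(significand(f));    // 11641532
        System.out.println(exactInteger(f));   // 99999998430674944
    }
}
```

The constant 150 is the exponent bias (127) plus the 23 fraction bits, which is why the significand can be treated as an integer.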

As mentioned above, the F&O spec does mention that when converting a float value to a string that implementations may have different representations. I assume this is referring to the fact that when you round a value to a certain precision, you must choose one of the rounding methods (i.e. nearest value) from the IEEE 754 specification. The F&O spec also throws in the phrase "inter alia" ("Among other things" - sec 17.1.2) in the context of reasons for different string representations. I am not sure what this is referring to... Possibly differences that arise during floating point arithmetic?

Anyway, the spec clearly implies that some kind of limited precision is expected when converting a float value to a string. However, I cannot find any specification of what the precision should be. Did I miss it?

The XQTS test suite results seem inconsistent to me. Many tests (about 72) like casthcds17 seem to assume that the precision should be at least 7. However, in the orderBy tests (e.g. orderBy35) it seems the precision is less than 7. For example, 1.0E17 (100000000000000000) is represented as the string "1.0E17" in the reference result. In order to get this string representation for the float value, a precision of 6 or less would have had to be used. With a precision of 7 it would have been "9.9999998E16" (see the second paragraph of this comment). Possibly the XQTS assumes that the precision is lowered as much as possible without destroying the string->float mapping back to the original value.

Java's Float.toString works by picking a precision that is as small as possible without making adjacent float values in the value space appear the same. Continuing the running example, Java would pick a precision of "7" in this case because with a precision of "6", 99,999,998,430,674,944 and its next biggest value would both appear as 1.0E17 (try adding 1 to the mantissa to verify this). See http://java.sun.com/javase/6/docs/api/java/lang/Float.html#toString(float)
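The effect of the chosen precision can be sketched with BigDecimal rather than Float.toString itself. One assumption to flag: "precision" above appears to count digits after the decimal point of the mantissa, so a precision of 7 corresponds to 8 significant digits in the code below. Class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.MathContext;

class SignificantDigits {
    /** Exact value of f rounded (half up) to the given number of significant digits. */
    static BigDecimal rounded(float f, int significantDigits) {
        return new BigDecimal(f).round(new MathContext(significantDigits));
    }

    /** True when a and b still differ after rounding to the given number of digits. */
    static boolean distinguishable(float a, float b, int significantDigits) {
        return rounded(a, significantDigits).compareTo(rounded(b, significantDigits)) != 0;
    }

    public static void main(String[] args) {
        float f = 1.0E17f;
        float next = Math.nextUp(f);
        System.out.println(distinguishable(f, next, 7)); // false: both round to 1.000000E17
        System.out.println(distinguishable(f, next, 8)); // true: 9.9999998E16 vs 1.0000001E17
    }
}
```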

Our implementation starts with Java's Float.toString() in the case that the value is not in the range (1e-6, 1e6). We then use a precision of 7 and round to the nearest value. This lets us pass the vast majority of XQTS test cases where this could be an issue. However, the following orderBy tests fail due to the seemingly varying precision issue I mentioned above:

orderBy25,35,45,55
orderbylocal-25,35,45,55
orderbywithout-14,21,30,37

Simply adding more reference results may not scale unless the float values outside of the range (1e-6, 1e6) that are output are carefully selected. For the test that outputs 1.0E17 (99,999,998,430,674,944) you would need more than 9 reference results to handle all cases (more than 9 due to the valid rounding alternatives in IEEE 754).

We will report these failures as successes anyway so we don't feel it is urgent that this be addressed. However, it seems it should be fixed somehow eventually to save future implementers from unnecessary grief.

Full disclosure: I have only recently begun to study floating point in order to understand this issue. I apologize in advance if I have overlooked or misunderstood something.

Comment 8 Michael Kay 2009-07-08 09:06:37 UTC

Firstly, the spec. After a great deal of deliberation, we decided that for float->string conversion, it should be legal to output any value that round-trips back to the original float. (Subject to the other rules e.g. that whole numbers are output without a decimal point.) Initially we used the XPath 1.0 rule that you should output the nearest number that is representable in decimal, but for various reasons this appeared unworkable. IIRC we also tried to mandate the shortest string that round-tripped back to the original float, but that had problems too.

So as far as testing is concerned, any test whose result is floating point (or that contains the serialization of a value that includes floating point) has an infinite number of correct answers. This is clearly unsatisfactory, but it's hard to find a good way forward. Pragmatically, it can be useful when regular comparison of results fails, to try again parsing both results as float and outputting a comment if this succeeds, to aid manual inspection.
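The retry-as-float idea could look roughly like the Java sketch below. This is not part of any XQTS harness; the class, enum, and method names are invented for illustration, results are assumed to be whitespace-separated atomic values, and Float.parseFloat is only an approximation of xs:float's lexical rules (it accepts forms such as a trailing "f" that xs:float does not).

```java
class LenientResultComparison {
    enum Outcome { IDENTICAL, EQUAL_AS_FLOATS, DIFFERENT }

    /** Compares two serialized results, falling back to float comparison per token. */
    static Outcome compare(String expected, String actual) {
        if (expected.equals(actual)) {
            return Outcome.IDENTICAL;
        }
        String[] e = expected.trim().split("\\s+");
        String[] a = actual.trim().split("\\s+");
        if (e.length != a.length) {
            return Outcome.DIFFERENT;
        }
        for (int i = 0; i < e.length; i++) {
            if (e[i].equals(a[i])) {
                continue; // identical tokens need no numeric check
            }
            try {
                if (Float.compare(Float.parseFloat(e[i]), Float.parseFloat(a[i])) != 0) {
                    return Outcome.DIFFERENT;
                }
            } catch (NumberFormatException ex) {
                return Outcome.DIFFERENT; // differing tokens that are not both numeric
            }
        }
        return Outcome.EQUAL_AS_FLOATS;
    }

    public static void main(String[] args) {
        System.out.println(compare("1.0E17 100000 0", "9.9999998E16 100000 0")); // EQUAL_AS_FLOATS
    }
}
```

An EQUAL_AS_FLOATS outcome is meant only as a hint for manual inspection, since a numeric comparison also accepts lexical forms the spec does not permit, such as "10000000.0" where "1.0E7" is required.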

At one stage I remember proposing that test results should be in a "wrapped" (self-describing) format for example <result type="xs:double">1.0e7</result>. Introducing this now would be a lot of work - and it still wouldn't solve the problem entirely, because comparison by converting back to double would give false positives, e.g. it would accept 10000000.0 as a valid result.

Pragmatically, the best way forward is probably to refine individual tests that cause problems for example by using round-half-to-even to reduce the precision.

Comment 9 Josh Spiegel 2009-07-08 15:33:51 UTC

I suggest creating documentation for this in XQTS and also having the relevant tests reference this documentation in the comment.  This would have helped me a lot initially.  Thanks.

Comment 10 Bogdan Butnaru 2009-07-09 00:09:05 UTC

(In reply to comment #8)
> So as far as testing is concerned, any test whose result is floating point (or
> that contains the serialization of a value that includes floating point) has an
> infinite number of correct answers. This is clearly unsatisfactory, but it's
> hard to find a good way forward. Pragmatically, it can be useful when regular
> comparison of results fails, to try again parsing both results as float and
> outputting a comment if this succeeds, to aid manual inspection.

I haven't thought this through very thoroughly, but I believe that for most float numbers there should be a (relatively simple) way to determine that a string value is a correct representation of that float number, using only string operations:

AFAIK, for most if not all float numbers, and with the specific restrictions in the spec regarding the position of the decimal separator, all valid mantissas that can represent it form a set of strings that is infinite but is bounded lexicographically at both ends by a well-defined pair of strings. I _think_ that the exponent can't vary, so in principle it should be possible to check the string values using only string operations.

(This excludes values in the 1E-6 – 1E+6 range, where I believe the spec allows a single representation, and of course NaN and infinities.)
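To make the idea of bounds concrete, here is a hedged Java sketch that computes them numerically rather than with string operations: every decimal strictly between the midpoints to the two neighbouring floats parses back to the same float. Whether the midpoints themselves qualify depends on the rounding rule, so the sketch leaves them out. Names are hypothetical, and the input is assumed to be finite and smaller in magnitude than the largest float.

```java
import java.math.BigDecimal;

class RoundTripBounds {
    /** Midpoints between f and its lower and upper neighbours, as exact decimals. */
    static BigDecimal[] bounds(float f) {
        BigDecimal value = new BigDecimal(f);
        BigDecimal two = BigDecimal.valueOf(2);
        // Sums of binary fractions halve exactly, so divide() cannot fail here.
        BigDecimal lower = value.add(new BigDecimal(Math.nextDown(f))).divide(two);
        BigDecimal upper = value.add(new BigDecimal(Math.nextUp(f))).divide(two);
        return new BigDecimal[] { lower, upper };
    }

    /** True when the decimal string lies strictly between the two midpoints of f. */
    static boolean strictlyInside(String lexical, float f) {
        BigDecimal v = new BigDecimal(lexical);
        BigDecimal[] b = bounds(f);
        return v.compareTo(b[0]) > 0 && v.compareTo(b[1]) < 0;
    }

    public static void main(String[] args) {
        BigDecimal[] b = bounds(1.0E17f);
        System.out.println(b[0].toPlainString()); // 99999994135707648
        System.out.println(b[1].toPlainString()); // 100000002725642240
        System.out.println(strictlyInside("9.9999998E16", 1.0E17f)); // true
        System.out.println(strictlyInside("9.99999E16", 1.0E17f));   // false
    }
}
```

A tool along these lines could emit the two bounds per expected value, leaving the test harness to do only a comparison.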

I intend to investigate this further and propose concrete changes. This will probably mean using the exact value for the restricted millionth–million range, writing a tool that generates the extremes for any value outside that range, and changes to affected tests. I'm in the middle of something else right now, so that is not likely to come before the end of summer. However, if there is a specific reason to hurry (e.g., an imminent new release of the test suite) and nobody else offers, let me know, I'll probably be able to take a couple of days to work it through.

Comment 11 Andrew Eisenberg 2010-02-24 20:58:10 UTC

(In reply to comment #9)
> I suggest creating documentation for this in XQTS and also having the relevant
> tests reference this documentation in the comment.  This would have helped me a
> lot initially.  Thanks.
> 

The Guidelines for Running the XML Query Test Suite do address this issue, saying:

"Many tests involve operations on floats/doubles and converting those results to strings. Even as one explicit value is given, the task force realizes that other values may also be acceptable. In such cases submitters are encouraged to submit values that may differ. The task force will eventually determine if such values are within the acceptable range."

It didn't occur to us to reference this comment in the relevant test cases.

Comment 12 Andrew Eisenberg 2010-02-24 21:04:01 UTC

Bogdan, in comment #6 I asked if you'd like additional expected results added for some of our test cases. I haven't heard back from you, and so I'm marking this bug report as INVALID.

Please close this bug report if you agree with this resolution, or reopen it if you'd like further action or discussion. I understand that you may wish to reopen this in the future, when your schedule allows you to work on this topic.