Implications of integer/float non-disjointness

Jos reminded me that I had raised some concerns about RIF adopting the 
OWL2 approach to decimal/float disjointness but had not yet come back 
with a concrete expression of what the issues might be.  As input for 
the F2F (at which I'll only be virtually, and partially, present) here's 
an attempt to frame the issue.

** Background

XML Schema [1] states:

   "the value spaces of all primitive datatypes are disjoint (they do 
not share any values)"

which is how RIF DTB is currently defined. OWL 2 has chosen to 
reinterpret the datatypes so that the value spaces for xsd:float, 
xsd:double and xsd:decimal (and thus integer etc) are not disjoint. They 
did check with the XML Schema working group, who indicated that was 
acceptable.

Should RIF adopt this for compatibility with OWL 2? This is part (but 
just part) of Issue-81.

** What's the problem with this?

The issue, as I see it, is that we (unlike OWL) have to deal with builtins.

To take an example, consider func:numeric-add. At present this
defers to XPath's op:numeric-add, so the addition of two floats is a
float and the value is thus different (due to rounding) from what it 
would be if you treated the arguments as doubles or as decimals. This is 
OK because in DTB the domains are disjoint, so we know what domain the 
arguments are in. Even though the builtins only see the value, not the 
lexical form, they can still tell the difference between floats, 
doubles and decimals, and presumably the type promotion and type return 
rules from XPath F&O [2] apply as normal.
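
To make the rounding difference concrete, here is a small Python sketch
(not part of DTB or XPath F&O) that adds the same two lexical values as
single-precision floats, as doubles and as decimals. The helper names
are made up for illustration, and single precision is emulated by
round-tripping through struct, since Python's own float is a double.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def add_as_float(a: str, b: str) -> float:
    # Hypothetical stand-in for xsd:float + xsd:float: the operands and
    # the result are all rounded to single precision.
    return to_float32(to_float32(float(a)) + to_float32(float(b)))


def add_as_double(a: str, b: str) -> float:
    # Hypothetical stand-in for xsd:double + xsd:double.
    return float(a) + float(b)


def add_as_decimal(a: str, b: str) -> Decimal:
    # Hypothetical stand-in for xsd:decimal + xsd:decimal (exact here).
    return Decimal(a) + Decimal(b)


if __name__ == "__main__":
    for add in (add_as_float, add_as_double, add_as_decimal):
        print(add.__name__, repr(add("1", "0.0000000001")))
```

With these inputs the single-precision sum collapses back to 1, while
the double and decimal sums do not, so the answer depends on knowing
which datatype the arguments belong to.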

If the domains are not disjoint then how can we specify what value and 
type each builtin should return in each circumstance?

** Specific example to illustrate this

At the moment:

RS1: [3]
   ex:a = "1"^^xsd:double
   ex:p(?x) :- ?x = External(func:numeric-add(ex:a,
                                  "0.0000000001"^^xsd:double))
entails:
   ex:p("1.0000000001"^^xsd:double)

but:

RS2:
   ex:a = "10000000"^^xsd:double
   ex:p(?x) :- ?x = External(func:numeric-add(ex:a,
                                  "0.0000000001"^^xsd:double))
entails:
   ex:p("10000000"^^xsd:double)

What's more, the XPath type promotion rules mean that:

RS3:
   ex:a = "10000000"^^xsd:double
   ex:p(?x) :- ?x = External(func:numeric-add(ex:a,
                                  "0.0000000001"^^xsd:decimal))
also entails:
   ex:p("10000000"^^xsd:double)


However, if we change the value spaces to not be disjoint then the
op:numeric-add sees a real value in each case and so presumably RS2 and 
RS3 would both entail:

   ex:p("10000000.0000000001"^^xsd:decimal) [4]
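
The arithmetic behind RS1 to RS3 can be checked with a short Python
sketch. This is only an illustration under assumptions: Python floats
stand in for xsd:double, decimal.Decimal stands in for xsd:decimal, and
the variable names are mine. Note that the decimal sum adds the lexical
values as decimals; it does not model the exact real value of the
double nearest to 0.0000000001.

```python
from decimal import Decimal

SMALL = "0.0000000001"

# RS1: double + double, the small term survives the addition.
rs1 = float("1") + float(SMALL)

# RS2: double + double, the small term is lost to rounding.
rs2 = float("10000000") + float(SMALL)

# RS3: the decimal argument is promoted to double before the addition,
# as in the XPath promotion rules, so the result matches RS2.
rs3 = float("10000000") + float(Decimal(SMALL))

# Sum of the same lexical values taken as decimals.
as_decimal = Decimal("10000000") + Decimal(SMALL)

# Size of the integer mantissa of 10000000.0000000001 (see [4]).
mantissa_bits = (10**17 + 1).bit_length()

if __name__ == "__main__":
    print("RS1 keeps the small term:", rs1 > 1.0)
    print("RS2 equals 10000000:", rs2 == 10000000.0)
    print("RS3 equals 10000000:", rs3 == 10000000.0)
    print("decimal sum:", as_decimal)
    print("mantissa bits:", mantissa_bits)
```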

** So what?

I see several problems with this:

(1) It's an implementation burden. Implementers would not be able to use 
normal floating point operations when working over doubles. They would 
either be forced to use decimals all the time, or would have to use 
doubles only in cases where the result of the builtin would be safely 
representable in a double. At each invocation they would need to check 
the range of the arguments to decide if an overflow or rounding would 
result and promote to decimals if necessary.
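
As a rough sketch of what that per-invocation check might look like,
the Python below adds two doubles and falls back to a decimal whenever
the hardware result is not exact. Everything here is an assumption for
illustration: the function name is hypothetical, inputs are assumed to
be finite, and Decimal with a generous precision stands in for an
arbitrary-precision xsd:decimal.

```python
import math
from decimal import Decimal, localcontext
from fractions import Fraction


def add_with_promotion(a: float, b: float):
    """Add two finite doubles, promoting to Decimal if the sum would round."""
    fast = a + b  # ordinary floating point addition
    # Fraction(x) converts a finite float exactly, so this compares the
    # rounded result against the exact rational sum of the two values.
    if math.isfinite(fast) and Fraction(fast) == Fraction(a) + Fraction(b):
        return fast
    with localcontext() as ctx:
        # Assumed to be ample: the exact sum of two finite doubles needs
        # fewer significant digits than this.
        ctx.prec = 2000
        # Decimal(float) is an exact conversion, so the sum is exact too.
        return Decimal(a) + Decimal(b)


if __name__ == "__main__":
    print(type(add_with_promotion(0.5, 0.25)).__name__)
    print(type(add_with_promotion(10000000.0, 0.0000000001)).__name__)
```

Even in this toy form, every addition pays for an exact comparison
before the cheap result can be trusted.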

(2) The behaviour differs from XPath F&O as well as common programming
expectations and, on the face of it, seems to defeat the whole point of
supporting xsd:double in the first place.

(3) There would be a lot of rewriting of DTB needed.



Does this define the issue appropriately? Any flaws in the above?

Dave

[1] http://www.w3.org/TR/xmlschema-2/ section 4.2.1. The corresponding 
statement in XML Schema 1.1 reads "For purposes of this specification, 
the value spaces of primitive datatypes are disjoint, even in cases 
where the abstractions they represent might be thought of as having 
values in common" but we adopted the 1.0 spec.

[2] http://www.w3.org/TR/xpath20/#promotion and 
http://www.w3.org/TR/xpath-functions/#numeric-types

[3] The "ex:a = " part of these examples is not necessary but it helps
to underline the fact that the builtin only sees the value not the
lexical form.

[4] Note I've chosen the numbers so that the mantissa
100000000000000001 fits within the 64 bits which minimal xsd:decimal
implementations have to support, while not fitting in the 53 bits 
available within xsd:double. Thus each of the arguments is a precise 
value (the fraction is a precise binary fraction and not itself subject 
to rounding) and the result as a decimal is correct but the result as a 
double is subject to (well-defined) rounding.

Received on Thursday, 8 January 2009 17:03:24 UTC