3248 – Numeric precision

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Summary: Numeric precision

Status:	CLOSED FIXED

Alias:	None

Product:	XML Schema
Classification:	Unclassified
Component:	Datatypes: XSD Part 2
Version:	1.1 only
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	---
Assignee:	C. M. Sperberg-McQueen
QA Contact:	XML Schema comments list

URL:
Whiteboard:	cluster: numbers
Keywords:	editorial, resolved

Depends on:
Blocks:

Reported:	2006-05-09 10:50 UTC by Michael Kay
Modified:	2009-04-21 23:13 UTC
CC List:	2 users

See Also:

Attachments

Description Michael Kay 2006-05-09 10:50:26 UTC

QT approved comment

In 3.3.4, it's not clear what the sentence "Precision is sometimes given in absolute, sometimes in relative terms" means. Is it referring to the way precision is defined in this specification, or is this background information about the big bad world? I'm also confused by the definition of "arithmetic precision". It has just been stated that precisionDecimal "closely corresponds" to a floating point decimal datatype. So what is meant by "[digits] to the right of the decimal point"? Is this decimal point a fixed one or a floating one? To clarify, perhaps you could state what the arithmetic precision of 1E6 and of 1E-6 is. The definition of the minScale and maxScale facets seems to suggest that a floating decimal point is intended; but in that case, do 1E-1 and .1E0 have the same scale or different scales?

Comment 1 Dave Peterson 2006-05-20 02:10:26 UTC

(In reply to comment #0)

> In 3.3.4, it's not clear what the sentence "Precision is sometimes given in
> absolute, sometimes in relative terms" means. Is it referring to the way
> precision is defined in this specification, or is this background information
> about the big bad world? I'm also confused by the definition of "arithmetic
> precision". It has just been stated that precisionDecimal "closely corresponds"
> to a floating point decimal datatype. So what is meant by "[digits] to the
> right of the decimal point"? Is this decimal point a fixed one or a floating
> one?

There are (at least) two ways of expressing "precision":  absolute or arithmetic
precision (exemplified by "+/- n") and relative or geometric precision (exemplified
by "+/-n%").  Certain selected values and precisions can have both the numerical
value and the precision specified with a single numeral (avoiding "+/-").  The
simple technique is to identify the arithmetic precision by the count of digits
to the right of the decimal point, modifying as appropriate when scientific
notation is used.  A so-called "fixed-point" datatype has a value space of
precision-bearing numbers all having the same arithmetic precision.  It is not
feasible to use a single numeral format to describe numbers having the same
geometric precision, but a related concept which might be called "floating-
point" precision can; it is exemplified by limits (either upper bound or fixed)
on the total number of "significant" digits, regardless of the location of the
decimal point in the numeral.  This "floating-point" precision is arithmetic in
the small and geometric in the large.  So-called "floating-point" datatypes fix or
limit the floating-point precision for all or most of their values with
"numerical values" (thus excluding NaN and infinities).  The precisionDecimal
datatype does not limit either the arithmetic or floating-point precision of
the precision-bearing values therein, so the question of floating-point or
fixed-point has no bearing.

The minScale and maxScale facets can be used to limit the arithmetic precision;
making them have the same value would yield a "fixed-point" datatype.  The
totalDigits facet can be used to limit the floating-point precision, and if used
in conjunction with minScale and maxScale can approximate a "floating-point"
datatype analogous to float or double with base-10 rather than binary bases.
(Although it would not allow for rounding lexical representations whose exact
value is not in the value space, which is allowed for float and double.  This
appears less important for base-10-based floating-point datatypes than it
is for binary ones.)
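
The distinction drawn above between arithmetic and "floating-point" precision can be sketched with Python's decimal module, which also retains trailing zeros. This is only an illustration under stated assumptions: Python's Decimal is not the precisionDecimal datatype, the helper names (arithmetic_precision, significant_digits, within_scale) are hypothetical, and the scale check assumes that minScale and maxScale bound the arithmetic precision inclusively.

```python
from decimal import Decimal


def arithmetic_precision(d: Decimal) -> int:
    """Count of digits to the right of the decimal point.

    Negative when trailing integer digits are not significant (e.g. 3.0e2).
    Finite values only; NaN and the infinities carry no such precision.
    """
    if not d.is_finite():
        raise ValueError("no arithmetic precision for NaN or infinities")
    return -d.as_tuple().exponent


def significant_digits(d: Decimal) -> int:
    """Total number of stored coefficient digits ("floating-point" precision)."""
    if not d.is_finite():
        raise ValueError("no digit count for NaN or infinities")
    return len(d.as_tuple().digits)


def within_scale(d: Decimal, min_scale: int, max_scale: int) -> bool:
    """Assumed reading of minScale/maxScale: inclusive bounds on the precision."""
    return min_scale <= arithmetic_precision(d) <= max_scale


# Setting both bounds to 2 behaves like a "fixed-point" type with two decimals.
print(within_scale(Decimal("12.50"), 2, 2))
print(within_scale(Decimal("12.5"), 2, 2))
```

Under these assumptions, fixing both bounds to the same value admits only values written with that many fractional digits, while a cap on significant_digits would constrain the digit count wherever the decimal point falls.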

Comment 2 Michael Kay 2006-05-22 12:50:04 UTC

Thanks for the tutorial: but the question is, are you proposing to change the existing text, and if so how?

(If I ask a question like "So what is meant by "[digits] to the
right of the decimal point"?" it's not so much because I want to know the answer, as that I want an assurance that the spec will provide the answer when I read the next draft.)

Comment 3 David Ezell 2007-06-26 15:33:24 UTC

WG suggestion from the June 2007 f2f meeting:

We accept this issue, and expect that we will make the spec crisper
with regard to what's required to implement, and possibly move supporting material to either an appendix or a note.

Comment 4 David Ezell 2007-06-26 15:34:08 UTC

WG suggestion from the June 2007 f2f meeting:

We accept this issue, and expect that we will make the spec crisper
with regard to what's required to implement, and possibly move supporting material to either an appendix or a note.

Comment 5 Dave Peterson 2007-08-27 01:27:43 UTC

This bug hereby subsumes that part of 3250 which relates to precision requiring better description.

Comment 6 C. M. Sperberg-McQueen 2007-09-14 16:19:08 UTC

A wording proposal from Dave Peterson is at 
http://lists.w3.org/Archives/Member/w3c-xml-schema-wg/2007Sep/att-0006/precisionDecimal__.htm

Comment 7 Mary Holstege 2007-09-14 16:33:39 UTC

I think all of appendix D is a bit much: all the discussion of what we aren't
doing, while interesting, is confusing to me.  I believe the only clarification
we need is to strike the "sometimes relative and sometimes absolute" sentence,
and add the clarification offered in the final two paragraphs of the proposed
D2.2 to the existing definition.

Comment 8 Michael Kay 2008-09-05 20:47:14 UTC

I was asked to review where we are on this.

I think I originally had two main criticisms of the spec as it was. One was that it was hard to distinguish general remarks about the world in which we live from specific statements about the semantics of the data type being specified. The other was that some of the detailed information about the data type was missing. (In retrospect, I don't now know whether it really was missing, or whether I had simply failed to find it among all the noise.)

Like Mary, I don't think the essay in Appendix D of the proposal referenced in comment #6 belongs in the spec (however interesting reading it is). We don't have similar essays about the Gregorian calendar or about floating point arithmetic. We shouldn't try to justify the choices we have made or tell readers what other alternatives we could have chosen. What we should try to do is to explain as clearly as we can and as unambiguously as we can how this datatype actually behaves.

I would like to suggest the following changes relative to the Last Call draft.

(1) In 3.3.4, replace the second paragraph "Precision is sometimes given..." by:

"Informally, the precision of the value is the number of decimal digits after the decimal point. The numbers 2 and 2.00, although numerically equal, are considered to have different precision (0 and 2 respectively). The precision of a value depends on the way it is written, and is derived from the lexical representation using rules described in [3.3.4.2 Lexical Mapping]. The precision of a value is retained as part of the value space, but plays no part in any operations defined in this specification other than the identity relationship: specifically, it does not affect equality or ordering comparisons. It might play a role in arithmetic operations, but that is outside the scope of this specification."

(2) In 3.3.4.2 Lexical Mapping, at the end of the current text, add:

"For example, given the lexical representations in the first column of the following table, the corresponding values and canonical representations are as shown in the adjacent columns:


Lexical          |  Value                               | Canonical
Representation   |  Numerical value | Precision | Sign  | Representation
                 |                  |           |       |
3                |  3               | 0         | +     | 3
3.00             |  3               | 2         | +     | 3.00
300              |  300             | 0         | +     | 3e2
3.0e2            |  300             | 1         | +     | 3.0e2
...."

(please correct if necessary, and add entries that illustrate the range of possibilities...)
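
The behaviour described in proposed change (1), where precision is retained in the value but affects only identity and not equality or ordering, can be sketched with Python's decimal module as a stand-in. This is an assumption-laden illustration, not the precisionDecimal datatype itself; the function name identical is hypothetical.

```python
from decimal import Decimal


def identical(a: Decimal, b: Decimal) -> bool:
    """Identity in the sense sketched here: same sign, digits, and exponent."""
    return a.as_tuple() == b.as_tuple()


two = Decimal("2")
two_point_zero_zero = Decimal("2.00")

# Equality and ordering look only at the numerical value.
print(two == two_point_zero_zero)
print(two < two_point_zero_zero)

# Identity also takes the retained precision into account.
print(identical(two, two_point_zero_zero))
```

The design choice mirrored here is that 2 and 2.00 compare as equal and neither orders before the other, yet they remain distinguishable values.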

Comment 9 Dave Peterson 2008-09-06 02:27:46 UTC

(In reply to comment #8)

> (2) In 3.3.4.2 Lexical Mapping, at the end of the current text, add:
> 
> "For example, given the lexical representations in the first column of the
> following table, the corresponding values and canonical representations are as
> shown in the adjacent columns:
> 
> 
> Lexical          |  Value                               | Canonical
> Representation   |  Numerical value | Precision | Sign  | Representation
>                  |                  |           |       |
> 3                |  3               | 0         | +     | 3
> 3.00             |  3               | 2         | +     | 3.00
> 300              |  300             | 0         | +     | 3e2
> 3.0e2            |  300             | 1         | +     | 3.0e2
> ...."
> 
> (please correct if necessary, and add entries that illustrate the range of
> possibilities...)

1.  The lexical and canonical representations are strings, so will need to be (single) quoted.

2.  Re '300':  Looking at the algorithm in the canonical representation function,
      1.  Let nV be the ·numericalValue· of pD.  [nV=300]
          Let aP be the ·arithmeticPrecision· of pD.  [aP=0]
      2.  If pD is one of NaN, INF, or -INF, then return ·specialRepCanonicalMap·(nV).  [N/A]
      3.  Otherwise, if nV is an integer and aP is zero and 1E-6 ≤ nV ≤ 1E6, then return ·noDecimalPtCanonicalMap·(nV).  [Return '300']
      [remainder of the algorithm does not apply]

Therefore, the table above should have '300', not '3e2', for the canonical representation.

3.  Re '3.0e2':  Looking at the algorithm in the lexical representation function,
      Return
          -1 x ·noDecimalMap·(E)   when C is a noDecimalPtNumeral, [N/A] and
          ·decimalPtPrecision·(C) - ·noDecimalMap·(E)   otherwise. [1 - 2, i.e., -1]

Therefore the table above should have -1 as the arithmetic precision of '3.0e2'.

4.  I think you need at least one pair of examples with the same value and the same precision (and hence the same canonical representation), but different lexical representations.  E.g. '3.0e2' and '30e1' (and '0.30e3', for that matter).  That example is also special because there is no lexical representation not involving scientific notation (true of all values with negative arithmetic precision).
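
The precision rule quoted in item 3 can be sketched as follows. This is a minimal illustration, not the spec's own mapping functions: the function name arithmetic_precision and the regular expression are hypothetical, a numeral without an exponent is assumed to have exponent 0, and the special values NaN, INF and -INF are simply rejected.

```python
import re

# Optional sign, coefficient C (with or without a decimal point), optional exponent E.
_NUMERAL = re.compile(
    r"^[+-]?(?P<int>[0-9]*)(?:\.(?P<frac>[0-9]*))?(?:[eE](?P<exp>[+-]?[0-9]+))?$"
)


def arithmetic_precision(lexical: str) -> int:
    """Arithmetic precision of a numeral, following the rule quoted in item 3."""
    m = _NUMERAL.match(lexical)
    if m is None or not (m.group("int") or m.group("frac")):
        raise ValueError(f"not a plain numeral: {lexical!r}")
    exponent = int(m.group("exp") or 0)  # assumed 0 when no exponent is written
    frac = m.group("frac")
    if frac is None:
        # C has no decimal point: -1 x E
        return -1 * exponent
    # C has a decimal point: digits after the point, minus E
    return len(frac) - exponent


for numeral in ["300", "3.0e2", "30e1", "0.30e3", "1E6", "1E-6", "1E-1", ".1E0"]:
    print(numeral, arithmetic_precision(numeral))
```

Under this reading, '3.0e2', '30e1' and '0.30e3' all get precision -1, and the numerals from the original report ('1E6', '1E-6', '1E-1', '.1E0') get -6, 6, 1 and 1 respectively, so '1E-1' and '.1E0' would have the same scale.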

Comment 10 Dave Peterson 2009-03-21 02:01:10 UTC

The WG approved the fix in http://www.w3.org/XML/Group/2004/06/xmlschema-2/datatypes.dp090225.html .  The fix is awaiting republishing of a status quo document.

Comment 11 C. M. Sperberg-McQueen 2009-04-21 17:51:29 UTC

The wording proposal mentioned in comment 10 has been integrated
into the status quo, so I'm marking this issue RESOLVED / FIXED.

Michael, if you would do the honors, please?  Thank you.