This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 8503 - [FO 1.1] grouping separators in format-integer
Summary: [FO 1.1] grouping separators in format-integer
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Functions and Operators 3.0
Version: Working drafts
Hardware: PC Windows NT
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-12-16 00:16 UTC by Michael Kay
Modified: 2010-02-05 00:55 UTC

Description Michael Kay 2009-12-16 00:16:32 UTC
The new function format-integer attempts to extract part of the functionality of xsl:number, for use within format-dateTime. At the same time it tries to align where possible with format-number.

The way grouping separators are handled is inspired by format-number, but there is an omission: to get output in the form 1,234 it is necessary to use a picture such as 0,000 or 9,999; but this will result in the number 42 being output as 0,042. We should therefore allow a "#" in the picture to denote a character position that is omitted if it would otherwise be an insignificant leading zero.
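
To make the omission concrete, here is a minimal Python sketch of a picture that contains only mandatory digits and separators. The function name format_mandatory_only is hypothetical, the sketch assumes a non-negative value and ASCII digits, and it illustrates only the zero-padding behaviour; it is not the spec algorithm.

```python
def format_mandatory_only(value: int, picture: str) -> str:
    """Emulate a picture made only of mandatory digits and separators.

    Hypothetical helper: assumes value >= 0 and ASCII digits in the picture.
    """
    # Record each separator by the number of digit positions to its right.
    separators = {}
    digits_seen = 0
    for ch in reversed(picture):
        if ch.isdigit():
            digits_seen += 1
        else:
            separators[digits_seen] = ch
    # Every digit in the picture is mandatory, so pad with leading zeros.
    digits = str(value).zfill(digits_seen)
    out = []
    for i, d in enumerate(reversed(digits)):
        if i in separators:
            out.append(separators[i])
        out.append(d)
    return "".join(reversed(out))


print(format_mandatory_only(1234, "0,000"))  # 1,234
print(format_mandatory_only(42, "0,000"))    # 0,042
```

With no way to mark a digit position as optional, the separator position can only be expressed by adding mandatory digits, and those force the leading zeros.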
Comment 1 Michael Kay 2010-01-13 11:07:14 UTC
To achieve this effect, I propose to replace the two paragraphs following "The primary format token is one of the following" by:

<new>
A decimal-digit-pattern made up of optional-digit-signs, mandatory-digit-signs, and grouping-separator-signs.

* an optional-digit-sign is the character "#".

* a mandatory-digit-sign is a Unicode character in category Nd. All mandatory-digit-signs within the format token must be from the same digit family, where a digit family is a sequence of ten consecutive Unicode characters in category Nd, having digit values 0 through 9. Within the format token, these digits are interchangeable: a three-digit number may thus be indicated equivalently by 000, 001, or 999.

* a grouping-separator-sign is a non-alphanumeric character, that is, a character whose Unicode category is other than Nd, Nl, No, Lu, Ll, Lt, Lm or Lo.

There must be at least one mandatory-digit-sign. There may be zero or more optional-digit-signs, and (if present) these must precede all mandatory-digit-signs. There may be zero or more grouping-separator-signs. A grouping-separator-sign must not appear at the start or end of the decimal-digit-pattern, nor adjacent to another grouping-separator-sign.

The corresponding output format is a decimal number, using this digit family, with at least as many digits as there are mandatory-digit-signs in the format token. Thus, a format token 1 generates the sequence 0 1 2 ... 10 11 12 ..., and a format token 01 (or equivalently, 00 or 99) generates the sequence 00 01 02 ... 09 10 11 12 ... 99 100 101. A format token of &#x661; (Arabic-Indic digit one) generates the sequence &#1633; then &#1634; then &#1635; ...

The grouping-separator-signs are handled as follows. The position of grouping separators within the format token, counting backwards from the last digit, indicates the position of grouping separators to appear within the formatted number, and the character used as the grouping-separator-sign within the format token indicates the character to be used as the corresponding grouping separator in the formatted number. If grouping-separator-signs appear at regular intervals within the format token, that is, if the same grouping separator appears at positions forming a sequence N, 2N, 3N, ... for some integer value N (including the case where there is only one number in the list), then the sequence is extrapolated to the left, so grouping separators will be used in the formatted number at every multiple of N. For example, if the format token is 0'000 then the number one million will be formatted as 1'000'000, while the number fifteen will be formatted as 0'015.

The only purpose of optional-digit-signs is to mark the position of grouping-separator-signs. For example, if the format token is #'##0 then the number one million will be formatted as 1'000'000, while the number fifteen will be formatted as 15. A grouping separator is included in the formatted number only if there is a digit to its left, which will only be the case if either (a) the number is large enough to require that digit, or (b) the number of mandatory-digit-signs in the format token requires insignificant leading zeros to be present.

NOTE: numbers will never be truncated. Given the decimal-digit-pattern 01, the number three hundred will be output as 300, despite the absence of any optional-digit-sign.
</new>
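
As a sanity check on the proposed rules, the following Python sketch parses a decimal-digit-pattern and formats a non-negative integer with it. It is one possible reading of the text above, not a reference implementation: the names parse_picture and format_integer are hypothetical, negative numbers are out of scope, and Python's unicodedata module is assumed to be an adequate source of Unicode categories and digit values.

```python
import unicodedata

# Categories that disqualify a character from being a grouping-separator-sign.
ALNUM = {"Nd", "Nl", "No", "Lu", "Ll", "Lt", "Lm", "Lo"}


def parse_picture(picture):
    """Return (zero_codepoint, mandatory_count, separators) or raise ValueError.

    separators maps the number of digit positions to the right of a
    separator to the separator character.
    """
    kinds = []
    for ch in picture:
        if ch == "#":
            kinds.append("opt")
        elif unicodedata.category(ch) == "Nd":
            kinds.append("man")
        elif unicodedata.category(ch) not in ALNUM:
            kinds.append("sep")
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if "man" not in kinds:
        raise ValueError("need at least one mandatory-digit-sign")
    if kinds[0] == "sep" or kinds[-1] == "sep":
        raise ValueError("separator at start or end")
    if any(a == b == "sep" for a, b in zip(kinds, kinds[1:])):
        raise ValueError("adjacent separators")
    if "opt" in kinds[kinds.index("man"):]:
        raise ValueError("optional digit after mandatory digit")
    # A digit family is identified here by the code point of its zero.
    zeros = {ord(c) - unicodedata.digit(c)
             for c, k in zip(picture, kinds) if k == "man"}
    if len(zeros) != 1:
        raise ValueError("mixed digit families")
    separators, to_right = {}, 0
    for ch, kind in zip(reversed(picture), reversed(kinds)):
        if kind == "sep":
            separators[to_right] = ch
        else:
            to_right += 1
    return zeros.pop(), kinds.count("man"), separators


def format_integer(value, picture):
    """Format a non-negative integer according to the picture (sketch only)."""
    zero, min_digits, separators = parse_picture(picture)
    # Numbers are never truncated; they are only padded up to the minimum.
    digits = str(value).zfill(min_digits)
    positions = sorted(separators)
    step = positions[0] if positions else 0
    # Regular: one separator character at positions N, 2N, 3N, ...
    regular = (
        bool(positions)
        and len(set(separators.values())) == 1
        and positions == [step * (i + 1) for i in range(len(positions))]
    )
    out = []
    for i, d in enumerate(reversed(digits)):
        if regular:
            sep = separators[step] if i > 0 and i % step == 0 else None
        else:
            sep = separators.get(i)
        # Digit i exists, so a separator emitted here has a digit to its left.
        if sep is not None:
            out.append(sep)
        out.append(chr(zero + int(d)))
    return "".join(reversed(out))


print(format_integer(1000000, "0'000"))  # 1'000'000
print(format_integer(15, "0'000"))       # 0'015
print(format_integer(15, "#'##0"))       # 15
print(format_integer(300, "01"))         # 300
```

In this reading, optional-digit-signs contribute nothing to the minimum width and matter only because they shift the recorded separator positions, which matches the statement that their only purpose is to mark where the grouping-separator-signs go.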

NOTE ALSO (SEPARATE BUG): in the definition of format-number(), optional-digit-sign is described thus: "A single character, which must be defined in Unicode as a digit with the value zero". Delete "with the value zero", as we have removed this constraint.
Comment 2 Jim Melton 2010-02-05 00:55:17 UTC
In the Joint teleconference of the XSL WG and the XML Query WG on 2010-01-19, minuted at http://lists.w3.org/Archives/Member/w3c-xsl-query/2010Jan/0081.html (member-only link), the proposal in comment 1 was accepted. As a result, I am marking this bug RESOLVED/FIXED.

Because you were present when the solution was adopted (and are the author of the solution), I am also marking the bug CLOSED.