21599 – format-number() integer-part grouping overly prescriptive

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Functions and Operators 3.0
Version:	Candidate Recommendation
Hardware:	All All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Michael Kay
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2013-04-06 00:39 UTC by Paul J. Lucas
Modified:	2013-04-23 17:06 UTC

Description Paul J. Lucas 2013-04-06 00:39:38 UTC

XQuery Functions and Operators 3.0 [F&O], section 4.7.4, says in part:

> The integer-part-grouping-positions is a sequence of integers representing the positions of grouping separators within the integer part of the sub-picture. For each grouping-separator-sign that appears within the integer part of the sub-picture, this sequence contains an integer that is equal to the total number of optional-digit-sign and decimal-digit-family characters that appear within the integer part of the sub-picture and to the right of the grouping-separator-sign.

Compare with section 4.6.1 for format-integer():

> The position of grouping separators within the format token, counting backwards from the last digit, indicates the position of grouping separators to appear within the formatted number, and the character used as the grouping-separator-sign within the format token indicates the character to be used as the corresponding grouping separator in the formatted number.

4.7.4 seems overly prescriptive, in particular for an implementation to keep track of a sequence of integers whose value is "the total number of optional-digit-sign and decimal-digit-family characters that appear within the integer part of the sub-picture and to the right of the grouping-separator-sign."

Such a sequence also seems unnecessary since I have an implementation of format-integer() that determines no such sequence yet still manages to format integers correctly.

One would expect the specification for format-integer() and for the integer part of format-number() to be the same. If, however, the specification for format-number() really needs to be different (and really needs to determine the aforementioned grouping sequence), it would be nice if the specification said WHY it has to be different (and determine the sequence) in a Note.
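For reference, the sequence that 4.7.4 describes can be sketched in a few lines of Python. This is only an illustration, assuming the default decimal format ("," as grouping-separator-sign, "#" as optional-digit-sign, "0" as zero digit); the function name is hypothetical and is not taken from the specification.

```python
def integer_part_grouping_positions(integer_part, grouping_sep=",",
                                    optional_digit="#", zero_digit="0"):
    """Return, for each grouping separator in the integer part of a
    sub-picture, the number of digit signs to its right (ascending)."""
    # Assumed decimal-digit-family: the zero digit and the nine code
    # points that follow it, plus the optional-digit-sign.
    digit_signs = {chr(ord(zero_digit) + i) for i in range(10)}
    digit_signs.add(optional_digit)

    positions = []
    digits_to_right = 0
    # Scan right to left so the running count is "digits to the right".
    for ch in reversed(integer_part):
        if ch in digit_signs:
            digits_to_right += 1
        elif ch == grouping_sep:
            positions.append(digits_to_right)
    return positions


print(integer_part_grouping_positions("#,##0"))      # one separator
print(integer_part_grouping_positions("#,##,##0"))   # two separators
```

The first call yields a single position (3) and the second yields two (3 and 5); a picture with no separators yields an empty sequence.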

Comment 1 Michael Kay 2013-04-06 09:24:19 UTC

The two specifications have a different history and we've done a fair bit to converge them. I think the main differences remaining are that format-number controls the choice of grouping-separator externally, which means the same character is always used at different positions, whereas format-integer allows different grouping separators in different positions, e.g. "(999)999-9999". (Though I see that there's an opportunity to make this more clear!)

When you say it's "overly prescriptive" am I right in thinking you feel the rules could be described in simpler (or less procedural) language, or are you really asking us to leave things more to the discretion of the implementation?

Clearly an implementation that can achieve the same effect with a simpler algorithm is perfectly at liberty to do so.

If you think the language can be simplified without introducing ambiguities, I'm happy to hear your concrete suggestions.

As for converging the language of the two specs, it would be nice if it can be done, but I'd say it's risky at this stage - experience shows that it's very hard to get these things right, and if commonality of text is the only benefit, I don't think it's worth the risk.

Comment 2 Paul J. Lucas 2013-04-06 14:57:31 UTC

I'm asking that things be left more to the discretion of the implementation.  There's simply no need for the kind of implementation details that are currently there in the specification.  As long as the implementation yields the same result, who cares how it's implemented?

I think the language can be simplified (made less prescriptive) by using very similar language to that of format integer, to wit:

The position of grouping-separator-signs within the integer part of the sub-picture, counting backwards from the last digit, indicates the position of grouping-separator-signs to appear within the integer part of the formatted number. If grouping-separator-signs appear at regular intervals within the integer part of the sub-picture, that is, if grouping-separator-signs appear at positions forming a sequence N, 2N, 3N, ... for some integer value N (including the case where there is only one number in the list), then the sequence is extrapolated to the left, so grouping-separator-signs will be used in the integer part of the formatted number at every multiple of N.
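The rule in the proposed wording above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: positions are counted from the right-hand end of the integer part, "," is the separator, and both function names are hypothetical. It does not attempt to handle percent or per-mille signs.

```python
def regular_interval(positions):
    """Return N if the positions are exactly N, 2N, 3N, ...; else None."""
    if not positions:
        return None
    ordered = sorted(positions)
    n = ordered[0]
    if n <= 0:
        return None
    if all(p == n * (i + 1) for i, p in enumerate(ordered)):
        return n
    return None


def group_integer_digits(digits, positions, sep=","):
    """Insert separators into a string of integer digits.

    Regular positions are extrapolated to the left; irregular
    positions are used exactly as given.
    """
    n = regular_interval(positions)
    out = []
    for i, ch in enumerate(reversed(digits)):
        # i is the number of digits already emitted to the right.
        if i > 0:
            if n is not None and i % n == 0:
                out.append(sep)
            elif n is None and i in positions:
                out.append(sep)
        out.append(ch)
    return "".join(reversed(out))


print(group_integer_digits("1234567", [3]))     # regular: extrapolated
print(group_integer_digits("1234567", [3, 5]))  # irregular: as given
```

With positions [3] the interval is regular, so separators recur every three digits; with [3, 5] the positions are not multiples of a single N, so only those two separators are inserted.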

BTW: I probably also should have included in this bug that the fractional part is likewise overly prescriptive.  I would suggest the wording for the fractional part be changed to:

The position of grouping-separator-signs within the fractional part of the sub-picture, counting forwards from the first digit, indicates the position of grouping-separator-signs to appear within the fractional part of the formatted number.

BTW: since the spec does NOT mention anything about "regular intervals" N, 2N, 3N, ..., for the fractional part, I assume that means that the sequence is NOT extrapolated as far as necessary to accommodate the largest possible fractional number -- or is it supposed to be?  If it is supposed to be, then it MUST say so.

If there is no extrapolation to be done for the fractional part, then I'd highly recommend that a Note be added to the spec saying so (in the same spirit as the existing note saying that there is no maximum integer part size).  I can file a separate bug for this if you like.

Comment 3 Michael Kay 2013-04-06 19:32:08 UTC

When the specification describes things by means of an algorithm, there is no requirement for an implementation to use the same algorithm; it can use any algorithm it likes that delivers the same result.

The format-number text is the result of a great deal of review and feedback during the course of many drafts of the XSLT 2.0 specification, resulting from experience with XSLT 1.0 where different implementors interpreted the specification differently. If it seems to spell things out in too much detail, that's a reaction against the 1.0 spec which left things far too vague. I'm very reluctant to make any changes motivated only by making it simpler or less algorithmic, because there's too much risk of introducing an accidental incompatibility or ambiguity.

The grouping positions on the fractional side do not need to be extrapolated because there will never be more digits in the result than there are in the picture.
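The fractional side can be sketched in Python for comparison. This is an illustration only, assuming positions are counted forward from the first fractional digit and "," is the separator; the function name is hypothetical. Separators are placed only at the listed positions, with no extrapolation.

```python
def group_fraction_digits(digits, positions, sep=","):
    """Insert separators into a string of fractional digits.

    Each position is the number of fractional digits to the left of
    a separator; positions are used exactly as given.
    """
    wanted = set(positions)
    out = []
    for i, ch in enumerate(digits):
        # i is the number of digits already emitted to the left.
        if i > 0 and i in wanted:
            out.append(sep)
        out.append(ch)
    return "".join(out)


print(group_fraction_digits("123456", [3]))
print(group_fraction_digits("12", [3]))
```

Because the result never has more fractional digits than the picture, every separator that can appear is already listed explicitly, which is why no N, 2N, 3N rule is needed here.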

Comment 4 Paul J. Lucas 2013-04-07 15:36:13 UTC

> I'm very reluctant to make any changes ...

Do you really mean intransigent?

> there's too much risk of introducing an accidental incompatibility or ambiguity.

My concrete suggestion (which you requested) is pretty much a copy-paste of the format-integer wording with only the necessary substitutions, e.g., "format token" became "integer part of the sub-picture."  If the wording for format-integer() is presumed to be unambiguous, then my suggestion should also presumably be unambiguous.

As for incompatibility, well, I don't see any since the result is the same.

Comment 5 Michael Kay 2013-04-07 20:17:45 UTC

Yes, I'm being intransigent. We're at Candidate Recommendation stage, and at that stage I think one needs to be very conservative about making changes.

For example, I'm not convinced your proposed text works when there is a percent or per-mille sign. But that's not really the point; I want to reject the change because the existing text isn't broken and any change is risky, not because I can see actual problems with it.

Comment 6 Liam R E Quin 2013-04-07 21:19:23 UTC

We do have to be very careful in making changes to Functions and Operators 3 while it's a Candidate Rec. There's also room to consider such changes for 3.1.  

(it's part of my job to be the voice saying "no changes in 3.0" sort of like "we're in release freeze" :-) but I don't want to say that without saying how improvements could be made in the next cycle; 3.1 work has already started)

Comment 7 Paul J. Lucas 2013-04-09 03:00:59 UTC

If you're being intransigent, why did you even bother to ask for my concrete suggestion if you knew in advance you'd ignore it?

Comment 8 Michael Kay 2013-04-23 14:12:02 UTC

My recommendation to the Working Group is to close this as WONTFIX. I think the changes being proposed are (a) editorial, and (b) risky, in the sense that it's difficult to convince oneself that the change is purely editorial.

Comment 9 Liam R E Quin 2013-04-23 17:06:48 UTC

The joint Working Groups (XSLT and XQuery) agreed not to make a change here for 3.0 - the existing text has been worked out carefully over a long time.

We did agree to add a comment about extrapolating fractional digits - thank you for the suggestion.

Closing as WONTFIX for F&O 3.0.