This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 10387 - [FO11]: format-integer format token to disallow digits besides 0 or 1
Status: RESOLVED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Functions and Operators 3.0
Version: Working drafts
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
Reported: 2010-08-17 17:39 UTC by Brett Zamir
Modified: 2010-08-31 21:59 UTC

Description Brett Zamir 2010-08-17 17:39:30 UTC
Hi,

In the docs under section 4.5.1, it says, "The primary format token is one of the following:...Any sequence of Unicode digits drawn from the same digit family, where a digit family is a sequence of ten consecutive Unicode characters in category Nd, having digit values 0 through 9."

While the example makes clear how the different language equivalents might interpret 0 and 1, the text does not make clear how other digits are to be interpreted. I might guess they would be treated like 1, though in that case I see no reason to allow other digits, as it could be confusing if they all behave like 1. Thank you...
Comment 1 Brett Zamir 2010-08-17 18:07:05 UTC
I am aware of the statement which follows, "Any other format token indicates a numbering sequence in which that token represents the number 1 (one)" (a statement which I think is itself a bit confusing in referring to "number 1" instead of "format token 1", though that is cleared up in the next sentence), but since the first clause explicitly deals with "[a]ny sequence of Unicode digits drawn from the same digit family", I think that first clause should either limit itself to equivalent digits for 0 and 1 (so the "Any other format token..." statement can take effect), or else explicitly indicate how the tokens 2-9 are to be handled.
Comment 2 Michael Kay 2010-08-17 18:41:17 UTC
Thanks for the comment. format-integer combines elements of format-number and xsl:number; format-number uses "000" for a three-digit output field, while xsl:number uses "001". We wanted to eliminate this difference. xsl:number also allows formats such as "999" (remember COBOL?) but their meaning is essentially implementor-defined. So we thought it made sense to allow a three-digit output to be indicated by any sequence of three digits; that is, in the format picture all digits are interchangeable. I'll see if this can be explained more clearly.
Comment 3 Brett Zamir 2010-08-18 04:32:50 UTC
Ok, thanks. If the picture needs to be confined to 3 digits at most, I think that also should be specified. If not, though, it could be useful as an indefinitely sized fixed width filler: e.g., 0000123.
Comment 4 Michael Kay 2010-08-18 07:54:25 UTC
I was only using three digits as an example. You can use "000" if you want at least three digits, "000000" if you want at least 6, and so on.

I'll have another read of the text to see if it can be clarified. It was extracted almost verbatim from the specification of xsl:number in the XSLT specification - perhaps moving the text has lost some of the context and more introductory explanation is needed.
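
Illustrative sketch (not part of the original comment): the minimum-width rule described here can be pictured in Python as below. The function names `min_width` and `format_with_picture` are hypothetical, and the sketch assumes a non-negative integer, a picture made only of digit characters, and ASCII output digits; it ignores the other features of format-integer.

```python
import unicodedata


def min_width(picture: str) -> int:
    """Count the digit characters (Unicode category Nd) in the picture.

    Which digits appear is irrelevant; only how many there are.
    """
    return sum(1 for ch in picture if unicodedata.category(ch) == "Nd")


def format_with_picture(value: int, picture: str) -> str:
    """Left-pad a non-negative integer with zeros to the picture's width.

    Values that already have more digits than the picture are not truncated.
    """
    return str(value).zfill(min_width(picture))


if __name__ == "__main__":
    for picture in ("000", "001", "999", "0000000"):
        print(picture, format_with_picture(123, picture))
```

Under these assumptions the pictures "000", "001" and "999" all behave identically, and a seven-digit picture gives the fixed-width filler suggested in comment 3.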
Comment 5 Michael Kay 2010-08-20 18:39:45 UTC
The text for this function has in fact been significantly improved since the Dec 2009 working draft.

The paragraph cited in comment #0 now reads: "a mandatory-digit-sign is a Unicode character in category Nd. All mandatory-digit-signs within the format token must be from the same digit family, where a digit family is a sequence of ten consecutive Unicode characters in category Nd, having digit values 0 through 9. Within the format token, these digits are interchangeable: a three-digit number may thus be indicated equivalently by 000, 001, or 999." - which seems to address the point you are making.

I'm not sure I understand the problem you are describing in comment #1. This text has been part of the xsl:number specification for many years. By saying "Any other format token", it means an alphanumeric token not listed above, for example the Greek letter alpha, and indicates that if the implementation supports it, this format token may be used to denote a numbering sequence such as α, β, γ, ... The first rule (or two rules, in the Dec 2009 version) covers format tokens containing Unicode digits; the subsequent rules therefore are only concerned with tokens made up of non-digits.
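
Illustrative sketch (not part of the original comment): the "same digit family" condition quoted above could be checked roughly as follows in Python. The helper names `digit_family_base` and `is_same_family` are hypothetical. The sketch relies on the definition quoted in this comment, that a family is ten consecutive category Nd characters with values 0 through 9, so a digit's family can be identified by the code point of its zero.

```python
import unicodedata


def digit_family_base(ch: str) -> int:
    """Return the code point of the zero digit in ch's digit family."""
    if unicodedata.category(ch) != "Nd":
        raise ValueError(f"{ch!r} is not a category Nd digit")
    # Families are ten consecutive code points, so subtracting the
    # digit value from the code point lands on the family's zero.
    return ord(ch) - unicodedata.digit(ch)


def is_same_family(token: str) -> bool:
    """True if the token is non-empty and all its digits share one family."""
    bases = {digit_family_base(ch) for ch in token}
    return len(bases) == 1


if __name__ == "__main__":
    print(is_same_family("001"))            # ASCII digits only
    print(is_same_family("\u0660\u0661"))   # Arabic-Indic digits only
    print(is_same_family("0\u0661"))        # mixed families
```

A token such as the Greek letter alpha is not in category Nd, so `digit_family_base` rejects it; in the terms used above, it would fall under the "Any other format token" rule instead.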
Comment 6 Michael Kay 2010-08-31 21:59:10 UTC
The WG decided to close this as resolved/fixed on the basis that we have made changes to the specification that address some of your concerns, and have responded to the other concerns to explain why the spec is as it is. We would be grateful if you take a look at the revised text when it is next published, and please feel free to reopen the bug if you feel there are outstanding concerns.