3326 – Ambiguity regarding non-alphanumeric format tokens, xsl:number

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	XSLT 2.0
Version:	Candidate Recommendation
Hardware:	PC Windows 2000

Importance:	P2 normal
Target Milestone:	---
Assignee:	Michael Kay
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2006-06-12 21:41 UTC by David Marston
Modified:	2006-06-21 11:49 UTC
CC List:	0 users

Description David Marston 2006-06-12 21:41:23 UTC

The 4th paragraph of 12.3 indicates that xsl:number could make assumptions about format tokens and separator tokens. This gives rise to some format possibilities lacking both:
format=""
format="()" and similar.

The 3rd paragraph must make an explicit statement about the latter. (The 4th tells us enough to know that the former defaults to "1.1.1.1.1.1" of sufficient length to handle the longest sequence.) When the format attribute consists of only non-alphanumeric characters, is that sequence:
just the prefix?
just the suffix?
an error?
Being an error is plausible, since the string is NOT any of:
before a format token
after a format token
between two format tokens
Being only a prefix is mildly useful for symbols like #, but not so remarkably convenient (compared to "#1") that we should encourage users to think that way. Being only a suffix seems less useful. Being both prefix and suffix is useless and also violates the string-is-only-used-once verbiage.
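
To make the candidate readings concrete, here is a minimal Python sketch (not part of the original report) of what each non-error interpretation would emit for a format string that contains no format token. The function name `render_candidates` is hypothetical, and the number is assumed to be rendered in the default decimal style.

```python
def render_candidates(value, token):
    """Show the output of each proposed reading of a format string
    that consists only of non-alphanumeric characters.

    Assumption: with no format token present, the number itself is
    rendered as plain decimal (the default "1" format token).
    """
    number = str(value)
    return {
        "prefix only": token + number,
        "suffix only": number + token,
        "prefix and suffix": token + number + token,
    }


if __name__ == "__main__":
    for reading, output in render_candidates(99, "()").items():
        print(f"{reading}: {output}")
```

The fourth option discussed above, raising an error, would simply produce no output for such a format string.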

Therefore, I suggest that adding a new error message, something like "format string has no alphanumeric fields", is the best resolution. (I recognize that any alphanumeric suffices, even if it is one not supported by the implementation. Unsupported alphanumerics revert to being placeholders for 1, not being prefix/separator/suffix.)

As an aside: I am unhappy that the 3rd paragraph does not come right out and say that the empty sequence is formatted as the prefix and suffix adjacent. In 1.0, this was an area of ambiguity.

Comment 1 Michael Kay 2006-06-12 22:16:53 UTC

I agree, there's a gap here.

XSLT 1.0 said: "If the first token is a non-alphanumeric token, then the constructed string will start with that token; if the last token is non-alphanumeric token, then the constructed string will end with that token." Saxon took that literally, so that <xsl:number value="99" format="()"/> gives you "()99()"; and I've retained that interpretation in the 2.0 processor, though there's nothing in the 2.0 spec to justify it (as you say, it's not exactly useful to anyone, which is probably why no-one has complained).

I don't favour making this an error. It's a bit arbitrary, but I think my preferred option would be to say that if there's only one token and it's non-alphanumeric then we treat it as the prefix.

Regarding the "aside", I think that the only possible interpretation of the spec is that an empty sequence is formatted as the concatenation of the prefix and the suffix. I don't think we need to treat this as a special case, since it falls out from the general rules.

Comment 2 David Marston 2006-06-14 19:12:59 UTC

>...if there's only one token and it's NOT alphanumeric then we treat it as the prefix.

That would be reasonable.

Continuing the aside: I agree that the verbiage "the prefix always appears exactly once" and its parallel for the suffix greatly reduces the ambiguity from 1.0. The core of the ambiguity is whether an empty sequence (or "empty list", as the 1.0 Erratum E23 put it) suppresses all rendition; i.e., whether prefix and suffix only occur as a byproduct of having a number sequence/list to render. So it could be called an order-of-operations issue. For 2.0, it would be helpful to have a Note like this:
NOTE: The only way that xsl:number will not produce a text node is when the sequence of numbers is empty and there is no prefix and there is no suffix.

Comment 3 Michael Kay 2006-06-15 20:05:15 UTC

Given the absence of use cases for preferring one interpretation of this construct over any other, the WG's preference was to retain the interpretation given by a strict reading of the XSLT 1.0 specification, namely that the string should be used as both the prefix and the suffix.

To implement this decision, after the sentence "Each non-alphanumeric token is either a prefix, a separator, or a suffix.", the following will be added: "If there is a non-alphanumeric token but no format token, then the single non-alphanumeric token is used as both the prefix and the suffix."
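
The agreed rule can be sketched in Python as follows. This is an illustrative model, not text from the specification or code from any processor: the names `split_format` and `format_numbers` are hypothetical, every format token is assumed to behave like "1" (plain decimal), and "alphanumeric" is approximated as any Unicode L* or N* general category.

```python
import itertools
import unicodedata


def is_alphanumeric(ch):
    # Assumption: alphanumeric means Unicode categories Nd, Nl, No,
    # Lu, Ll, Lt, Lm, Lo, i.e. every category starting with "L" or "N".
    return unicodedata.category(ch)[0] in "LN"


def split_format(fmt):
    """Split a format string into (prefix, format_tokens, separators, suffix)."""
    # Tokens are maximal runs of alphanumeric or non-alphanumeric characters.
    runs = [(alnum, "".join(chars))
            for alnum, chars in itertools.groupby(fmt, key=is_alphanumeric)]
    format_tokens = [text for alnum, text in runs if alnum]
    if not format_tokens:
        # Resolution of this bug: a lone non-alphanumeric token is used
        # as both the prefix and the suffix. An empty format has neither.
        lone = runs[0][1] if runs else ""
        return lone, [], [], lone
    prefix = "" if runs[0][0] else runs[0][1]
    suffix = "" if runs[-1][0] else runs[-1][1]
    start = 1 if prefix else 0
    end = len(runs) - 1 if suffix else len(runs)
    separators = [text for alnum, text in runs[start:end] if not alnum]
    return prefix, format_tokens, separators, suffix


def format_numbers(numbers, fmt):
    """Render a sequence of integers; all format tokens are treated as "1"."""
    prefix, _tokens, separators, suffix = split_format(fmt)
    parts = []
    for i, n in enumerate(numbers):
        if i > 0:
            # Reuse the last separator when there are more numbers than
            # separators; fall back to "." when there are none.
            parts.append(separators[min(i - 1, len(separators) - 1)]
                         if separators else ".")
        parts.append(str(n))
    return prefix + "".join(parts) + suffix


if __name__ == "__main__":
    print(format_numbers([99], "()"))          # ()99()
    print(format_numbers([1, 2, 3], "(1.1)"))  # (1.2.3)
    print(format_numbers([], "(1)"))           # ()
```

Under this model an empty sequence yields the prefix followed directly by the suffix, which matches the reading given in Comment 1.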

David, thanks for raising this, and could you please indicate whether this is an acceptable resolution.

Michael Kay
for the XSL WG

Comment 4 David Marston 2006-06-15 21:03:28 UTC

In general, attempting a strict reading of anything about xsl:number in the 1.0 spec is begging to find gaps, but I admit that there is no gap here. I don't think xsl:number deserves to have preservation of 1.0 behavior as the top priority, but that's the WG's call. (If preservation of 1.0 were not the top priority, I would have the error in this situation, allow generated 0 values for level=multiple at least, etc., etc.)

So I say this resolution is "grudgingly acceptable" based on the premise that it's acceptable for inadequate format strings to produce non-useful results. Take the strange-looking result as a gentle error message.