1523 – [FO] fn:subsequence and negative start position

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Summary: [FO] fn:subsequence and negative start position

Status:	CLOSED FIXED

Alias:	None

Product:	XPath / XQuery / XSLT
Classification:	Unclassified
Component:	Functions and Operators 1.0
Version:	Last Call drafts
Hardware:	PC Windows 2000

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ashok Malhotra
QA Contact:	Mailing list for public feedback on specs from XSL and XML Query WGs

Reported:	2005-07-07 20:28 UTC by Andreas Behm
Modified:	2005-09-29 11:31 UTC
CC List:	0 users

Description Andreas Behm 2005-07-07 20:28:33 UTC

The XTTF does not agree on the result of the following query:

fn:subsequence((1,2,3,4), -2, 4)

The result is determined by the following two statements:

"returns the items in $sourceString whose position $p obeys:
fn:round($startingLoc) <= $p < fn:round($startingLoc) + fn:round($length)"

which means -2 <= $p < 2

"If $startingLoc is zero or negative, the subsequence includes items from
the beginning of the $sourceSeq."

but it is not clear in which order the rules are applied to identify
valid positions. We found two possible interpretations:

1. apply the second rule first and define the valid positions as
1 <= $p < 5 such that the result is "1 2 3 4", or
2. define the valid positions first as -2 <= $p < 2 and start with position 1,
such that the result is "1".
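
The two readings can be made concrete with a small Python model. This is an illustrative sketch only, not spec text: the function names are hypothetical, and both arguments are assumed to be integers already, so rounding is left out.

```python
def clamp_first(seq, start, length):
    # Interpretation 1: a zero or negative start is moved to position 1 first,
    # and only then is the length applied.
    s = max(start, 1)
    return [item for p, item in enumerate(seq, start=1) if s <= p < s + length]

def range_first(seq, start, length):
    # Interpretation 2: the valid positions are start <= p < start + length,
    # and positions that do not exist in the sequence simply select nothing.
    return [item for p, item in enumerate(seq, start=1)
            if start <= p < start + length]

print(clamp_first([1, 2, 3, 4], -2, 4))  # [1, 2, 3, 4]
print(range_first([1, 2, 3, 4], -2, 4))  # [1]
```

The only difference is whether the upper bound is computed from the original start (-2 + 4 = 2) or from the clamped start (1 + 4 = 5).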

We have not reached agreement in the XQuery Test Task Force and ask the
Working Group for clarification.

On behalf of XTTF
Andreas Behm

Comment 1 Michael Rys 2005-07-07 20:39:21 UTC

Normally the rules are interpreted in order as they appear in the
document. So this should return "1".

Best regards
Michael

Comment 2 Ashok Malhotra 2005-07-08 12:51:31 UTC

Section 1.4 says "For most functions there is an initial paragraph describing
what the function does followed by semantic rules. These rules are meant to be
followed in the order that they appear in this document."

Thus, as Michael Rys has pointed out, the result should be "1".

I'm closing this issue.

Comment 3 Michael Dyck 2005-07-08 19:28:51 UTC

That seems to imply that the "$startingLoc is zero or negative" rule would never
be applied.

Comment 4 Michael Kay 2005-07-10 22:10:30 UTC

I think this is a case (there are many others in F+O) where the spec is trying
to be too helpful, and where it doesn't distinguish clearly enough between rules
and helpful remarks about the consequences of the rules. I think the sentence
"If $startingLoc is zero or negative, the subsequence includes items from the
beginning of the $sourceSeq." falls into the "helpful remark" category. It's an
incomplete statement. Both the suggested answers (1,2,3,4) and (1) in fact have
the property that the result includes items from the start of the input
sequence, so this sentence doesn't help to decide which of these answers is
right. There are other cases (for example subsequence((1,2), -10, 0)) where the
sentence is false - in this case the result doesn't include any items from the
start of the input sequence.
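
Both observations can be checked with a short Python sketch. The helper `selected_positions` is hypothetical; it applies only the position inequality, with integer arguments assumed so that no rounding is needed.

```python
def selected_positions(size, start, length):
    # Positions p in 1..size that satisfy start <= p < start + length.
    return [p for p in range(1, size + 1) if start <= p < start + length]

# Positions selected by the two answers proposed in the description.
candidates = ([1, 2, 3, 4], [1])

# Both candidates contain position 1, so "includes items from the beginning"
# cannot tell them apart.
print(all(1 in c for c in candidates))   # True

# subsequence((1,2), -10, 0): a negative start, yet nothing is selected.
print(selected_positions(2, -10, 0))     # []
```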

Personally, I would structure the specification of this function as:

<quote>
SUMMARY: Returns the contiguous sequence of items in the value of $sourceSeq
beginning at the position indicated by the value of $startingLoc and continuing
for the number of items indicated by the value of $length.

RULES:

In the two-argument case, returns the items in $sourceSeq whose position $p obeys:

fn:round($startingLoc) le $p 

In the three-argument case, returns the items in $sourceSeq whose position $p obeys:

fn:round($startingLoc) le $p and $p lt fn:round($startingLoc) + fn:round($length) 


NOTES

If $sourceSeq is the empty sequence, the empty sequence is returned.

If $startingLoc is zero or negative, the subsequence includes items from the
beginning of the $sourceSeq.

If $length is not specified, the subsequence includes items to the end of
$sourceSeq.

If $length is greater than the number of items in the value of $sourceSeq
following $startingLoc, the subsequence includes items to the end of $sourceSeq.

The first item of a sequence is located at position 1, not position 0.

For detailed type semantics, see Section 7.2.11 The fn:subsequence function [FS]
</quote>

Specifying it this way deals unambiguously and economically with awkward corner
cases such as a position or length of infinity or NaN.
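
A rough Python model of the proposed rules shows how those corner cases fall out of the inequalities alone. It is a sketch under stated assumptions: `xp_round` and `subsequence` are hypothetical names, `xp_round` models fn:round on doubles as "nearest integer, ties toward positive infinity", and Python floats stand in for xs:double.

```python
import math

def xp_round(x):
    # Assumed model of fn:round: NaN and infinities pass through unchanged,
    # otherwise round to the nearest integer with ties toward +infinity.
    if math.isnan(x) or math.isinf(x):
        return x
    f = math.floor(x)
    return float(f + 1) if x - f >= 0.5 else float(f)

def subsequence(seq, start, length=None):
    s = xp_round(float(start))
    if length is None:
        # Two-argument case: fn:round($startingLoc) le $p
        return [item for p, item in enumerate(seq, start=1) if s <= p]
    # Three-argument case: the upper bound is exclusive.
    end = s + xp_round(float(length))
    return [item for p, item in enumerate(seq, start=1) if s <= p < end]

inf, nan = float("inf"), float("nan")
print(subsequence([1, 2, 3, 4], -2, 4))      # [1]
print(subsequence([1, 2, 3, 4], 2, inf))     # [2, 3, 4]
print(subsequence([1, 2, 3, 4], nan, 2))     # []
print(subsequence([1, 2, 3, 4], -inf, inf))  # []
```

The last two results are empty because every comparison against NaN is false; in the final case the upper bound -inf + inf is itself NaN.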

(Note also, the current text has an incorrect reference to $sourceString)

Comment 5 Michael Kay 2005-07-10 22:23:51 UTC

On rereading this text, it occurs to me that there's nothing in either the
old or the new version that states unequivocally that the order of items in the
result is the same as the order in the input (which I believe is true even in
unordered mode). This can be fixed by changing the rules to:

RULES

In the two-argument case, returns 

$sourceSeq[fn:round($startingLoc) le position()] 

In the three-argument case, returns 

$sourceSeq[fn:round($startingLoc) le position() and position() lt
fn:round($startingLoc) + fn:round($length)]

Since we get many questions about it, it might also be worth adding a

NOTE

The reason the function accepts arguments of type xs:double is that many
computations on untyped data return an xs:double result; and the reason for the
rounding rules is to compensate for any imprecision in these floating-point
computations.
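
The effect of the rounding can be illustrated in Python, again only as a sketch. The names are hypothetical, the literal 3.0000000000000004 stands in for a start position that some floating-point computation has left slightly above 3, and `xp_round` is an assumed model of fn:round.

```python
import math

def xp_round(x):
    # Assumed model of fn:round: nearest integer, ties toward +infinity.
    if math.isnan(x) or math.isinf(x):
        return x
    f = math.floor(x)
    return float(f + 1) if x - f >= 0.5 else float(f)

def from_position(seq, start, rounded=True):
    # Two-argument predicate, with or without rounding the start position.
    s = xp_round(start) if rounded else start
    return [item for p, item in enumerate(seq, start=1) if s <= p]

start = 3.0000000000000004  # hypothetical result of an imprecise computation

print(from_position(["a", "b", "c", "d"], start))                 # ['c', 'd']
print(from_position(["a", "b", "c", "d"], start, rounded=False))  # ['d']
```

Without the rounding, the tiny excess over 3 would silently drop the item at position 3.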

Comment 6 Ashok Malhotra 2005-07-26 22:05:17 UTC

The WGs decided on July 22, 2005 to improve the description of this function
according to comments #4 and #5 by Michael Kay below.