This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 20573 - Random generator in XPath
Summary: Random generator in XPath
Status: CLOSED FIXED
Alias: None
Product: XPath / XQuery / XSLT
Classification: Unclassified
Component: Functions and Operators 3.1
Version: Working drafts
Hardware: PC Windows NT
Importance: P2 enhancement
Target Milestone: ---
Assignee: Michael Kay
QA Contact: Mailing list for public feedback on specs from XSL and XML Query WGs
 
Reported: 2013-01-05 21:34 UTC by Jakub Maly
Modified: 2014-09-15 09:26 UTC

Description Jakub Maly 2013-01-05 21:34:11 UTC
A random number generator is available in almost every programming language, so I was surprised that XPath does not have one.

For example:
fn:random() as xs:double
returns a random number from the interval [0,1]
fn:random($l as xs:double, $u as xs:double) as xs:double
returns a random number from the interval [$l,$u]
Comment 1 Michael Kay 2013-01-05 22:34:57 UTC
The main reason for the omission, I think, is the difficulty of doing it with deterministic functional semantics. 

See also bug #13494 and bug #13747.

The EXSLT library attempts to tackle the requirement with a deterministic function that generates a sequence of numbers that is a pseudo-random function of a supplied seed:

http://www.exslt.org/random/functions/random-sequence/index.html
Comment 2 Michael Kay 2013-01-08 16:22:39 UTC
Moving this to the 3.1 category as it's too late to be considered for 3.0.
Comment 3 Michael Kay 2014-04-29 17:42:22 UTC
The WG reviewed this on 2014-04-29 and there was sentiment in favour of finding a solution, provided the resulting function was purely deterministic and did not rely on hidden state. The EXSLT proposal was examined; it was recognized that having a fixed-length sequence of random numbers to play with created usability problems. On the other hand, a mechanism that only generates a single random number from a seed has the difficulty (or at least danger) of going into a closed loop.

Michael Sperberg-McQueen suggested a function random-number-generator(seed) that returns a composite value (array or map) containing (a) the next random number in the sequence, and (b) a function to step the generator along. We would need to construct some examples to see how usable this is.
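To make the suggested shape concrete, here is a minimal sketch in XQuery 3.1 syntax. It is not the WG's design: local:generator is a hypothetical function, and the linear congruential constants are illustrative choices, assumed only for this example. It expects a non-negative integer seed.

```xquery
(: Hypothetical sketch: a linear congruential generator packaged as a map.
   The constants are illustrative and not taken from any specification. :)
declare function local:generator($state as xs:integer) as map(xs:string, item()) {
  let $new := (1103515245 * $state + 12345) mod 2147483648
  return map {
    "number": $new div 2147483648e0,
    "next": function() { local:generator($new) }
  }
};

let $g1 := local:generator(2014)
let $g2 := $g1?next()
return ($g1?number, $g2?number)
```

No state is hidden or mutated here: each generator is an ordinary value, and stepping it along produces a new value rather than changing the old one.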
Comment 4 Michael Kay 2014-09-09 20:51:16 UTC
After considerable discussion in the WG, a proposal has been drafted and included in the F+O 3.1 specification, and was today accepted as status-quo text (with an invitation to WG members to review and comment).

Jakub, protocol dictates that it's your privilege to mark the bug as closed when you are satisfied that your comment has been addressed. So here is the spec of the proposed function for your review:

4.9.1 fn:random-number-generator

Summary
Returns a random number generator, which can be used to generate sequences of random numbers.

Signatures
fn:random-number-generator() as map(xs:string, item())
fn:random-number-generator($seed as xs:anyAtomicType) as map(xs:string, item())

Rules
The function returns a random number generator. A random number generator is represented as a map containing three entries. The keys of each entry are strings:

The entry with key "number" holds a random number; it is an xs:double greater than or equal to zero (0.0e0), and less than one (1.0e0).

The entry with key "next" is a zero-arity function that can be called to return another random number generator.

The entry with key "permute" is a function with arity 1 (one), which takes an arbitrary sequence as its argument, and returns a random permutation of that sequence.
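As a rough illustration of the three entries, the following sketch reads each of them from one generator. The seed 42 is an arbitrary value chosen for the example, and the actual numbers returned depend on the implementation.

```xquery
let $g := fn:random-number-generator(42)   (: 42 is an arbitrary seed :)
return (
  $g?number,                   (: an xs:double in the range 0 to 1 :)
  $g?number,                   (: the same value again, since $g never changes :)
  $g?next()?number,            (: the number held by the successor generator :)
  $g?permute(("a", "b", "c"))  (: the three strings in some order :)
)
```

Because a generator is an immutable map, reading its number entry twice gives the same value; a further number is obtained only by calling next and reading from the generator it returns.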

Calling the fn:random-number-generator function with no arguments is equivalent to calling the single-argument form of the function with an implementation-dependent seed.

If a $seed is supplied, it may be an atomic value of any type.

Both forms of the function are ·deterministic·: calling the function twice with the same arguments, within a single ·execution scope·, produces the same results.
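The determinism rule can be checked with a small sketch like the one below, which uses an arbitrary string seed chosen for illustration. Under the rule just stated, both comparisons should yield true on a conforming processor.

```xquery
let $a := fn:random-number-generator("seed")
let $b := fn:random-number-generator("seed")
return (
  $a?number eq $b?number,
  $a?next()?number eq $b?next()?number
)
```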

The value of the number entry should be such that all eligible xs:double values are equally likely to be chosen.
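The two-argument form in the original request can be layered on top of the number entry. The helper below is a hypothetical sketch, not part of the proposal: local:random-between is an assumed name, and the result is nominally in the half-open range from $l up to but excluding $u, rather than the closed interval originally requested.

```xquery
(: Hypothetical helper: scales the generator's number from the range
   0 to 1 onto the range $l to $u. :)
declare function local:random-between(
  $g as map(xs:string, item()),
  $l as xs:double,
  $u as xs:double
) as xs:double {
  $l + $g?number * ($u - $l)
};

local:random-between(fn:random-number-generator(), 5, 10)
```

Passing the generator in as an argument keeps the helper deterministic: the caller decides which generator, and therefore which number, is used.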

The function returned in the permute entry should be such that all permutations of the supplied sequence are equally likely to be chosen.

The map returned by the random-number-generator function may contain additional entries beyond those specified here, but it must match the type map(xs:string, item()). The meaning of any additional entries is ·implementation-defined·. To avoid conflict with any future version of this specification, the keys of any such entries should start with an underscore character.

Notes
It is not meaningful to ask whether the functions held in the next and permute entries resulting from two separate calls with the same seed are "the same function", but the functions must be equivalent in the sense that calling them produces the same sequence of random numbers.

The repeatability of the results of function calls in different execution scopes is outside the scope of this specification. It is recommended that when the same seed is provided explicitly, the same random number sequence should be delivered even in different execution scopes; while if no seed is provided, the processor should choose a seed that is likely to be different from one execution scope to another. (The same effect can be achieved explicitly by using fn:current-dateTime() as a seed.)

The specification does not place strong conformance requirements on the actual randomness of the result; this is left to the implementation. It is desirable, for example, when generating a sequence of random numbers that the sequence should not get into a repeating loop; but the specification does not attempt to dictate this.

Examples
The following example returns a random permutation of the integers in the range 1 to 100:

fn:random-number-generator()?permute(1 to 100)

The following example returns a 10% sample of the items in an input sequence $seq, chosen at random:

fn:random-number-generator()?permute($seq)[position() = 1 to (count($seq) idiv 10)]

The following code defines a function that can be called to produce a random sequence of xs:double values in the range zero to one, of specified length:

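declare namespace r = "http://example.com/random";  (: hypothetical URI; the draft leaves the r prefix unbound :)

(: Public entry point: starts from a generator with an implementation-dependent seed. :)
declare %public function r:random-sequence($length as xs:integer) as xs:double* {
  r:random-sequence($length, fn:random-number-generator())
};
(: Worker: emits the current number, then recurses on the next generator. :)
declare %private function r:random-sequence($length as xs:integer, $G as map(xs:string, item())) as xs:double* {
  if ($length eq 0)
  then ()
  else ($G?number, r:random-sequence($length - 1, $G?next()))
};
r:random-sequence(200)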
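
The recursive function above can also be written as a fold. The following is a sketch only, assuming a processor that supports fn:fold-left and XQuery 3.1 map constructors; the seed 7, the length 5, and the map keys "gen" and "out" are arbitrary choices made for this example.

```xquery
(: Thread the generator through a fold: each step emits the current
   number and replaces the generator with its successor. :)
let $start := map { "gen": fn:random-number-generator(7), "out": () }
let $end := fn:fold-left(1 to 5, $start, function($acc, $i) {
  map {
    "gen": $acc?gen?next(),
    "out": ($acc?out, $acc?gen?number)
  }
})
return $end?out
```

The accumulator carries both the numbers collected so far and the generator to use next, so the generator is advanced exactly once per item.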