
# NAG Library Chapter Introduction: G05 – Random Number Generators

## 1  Scope of the Chapter

This chapter is concerned with the generation of sequences of independent pseudorandom and quasi-random numbers from various distributions, and models.

## 2  Background to the Problems

### 2.1  Pseudorandom Numbers

Pseudorandom number generators (PRNGs) can be split into base generators and distributional generators. Within the context of this document a base generator is defined as a PRNG that produces a sequence (or stream) of variates (or values) uniformly distributed over the interval $(0,1)$. Depending on the algorithm being considered, this interval may be open, closed or half-closed. A distributional generator is a routine that takes variates generated from a base generator and transforms them into variates from a specified distribution, for example a uniform, Gaussian (Normal) or gamma distribution.
The period (or cycle length) of a base generator is defined as the maximum number of values that can be generated before the sequence starts to repeat. The initial state of the base generator is often called the seed.
There are six base generators currently available in the NAG Library: a basic linear congruential generator (LCG), referred to as the NAG basic generator (see Knuth (1981)); two sets of Wichmann–Hill generators (see Maclaren (1989) and Wichmann and Hill (2006)); the Mersenne Twister (see Matsumoto and Nishimura (1998)); the ACORN generator (see Wikramaratna (1989)); and the L'Ecuyer generator (see L'Ecuyer and Simard (2002)).

#### 2.1.1  NAG Basic Generator

The NAG basic generator is a linear congruential generator (LCG) and, like all linear congruential generators, has the form:
$$x_i = a_1 x_{i-1} \bmod m_1, \qquad u_i = \frac{x_i}{m_1},$$
where the $u_i$, for $i=1,2,\ldots$, form the required sequence.
The NAG basic generator uses $a_1=13^{13}$ and $m_1=2^{59}$, which gives a period of approximately $2^{57}$.
This generator has been part of the NAG Library since Mark 6 and as such has been widely used. It suffers from no known problems, other than those due to the lattice structure inherent in all linear congruential generators, and, even though the period is relatively short compared to many of the newer generators, it is sufficiently large for many practical problems.
The performance of the NAG basic generator has been analysed by the Spectral Test, see Section 3.3.4 of Knuth (1981), yielding the following results in the notation of Knuth (1981).
| $n$ | $\nu_n$ | Upper bound for $\nu_n$ |
|---|---|---|
| 2 | $3.44\times10^{8}$ | $4.08\times10^{8}$ |
| 3 | $4.29\times10^{5}$ | $5.88\times10^{5}$ |
| 4 | $1.72\times10^{4}$ | $2.32\times10^{4}$ |
| 5 | $1.92\times10^{3}$ | $3.33\times10^{3}$ |
| 6 | 593 | 939 |
| 7 | 198 | 380 |
| 8 | 108 | 197 |
| 9 | 67 | 120 |

The right-hand column gives an upper bound for the values of $\nu_n$ attainable by any multiplicative congruential generator working modulo $2^{59}$.
An informal interpretation of the quantities $\nu_n$ is that consecutive $n$-tuples are statistically uncorrelated to an accuracy of $1/\nu_n$. This is a theoretical result; in practice the degree of randomness is usually much greater than the above figures might support. More details are given in Knuth (1981), and in the references cited therein.
Note that the achievable accuracy drops rapidly as the number of dimensions increases. This is a property of all multiplicative congruential generators and is the reason why very long periods are needed even for samples of only a few random numbers.
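The recurrence above can be sketched in a few lines of Python. This is only an illustration of the formula with the quoted constants, not the library's implementation; the function name `lcg_stream` and the choice of an odd starting state are assumptions made for the example.

```python
A1 = 13 ** 13   # multiplier a_1 quoted in the text
M1 = 2 ** 59    # modulus m_1 quoted in the text


def lcg_stream(seed, count):
    """Return `count` variates u_i = x_i / m_1, starting from state x_0 = seed.

    Assumption for this sketch: `seed` is an odd integer in (0, m_1).
    """
    x = seed
    out = []
    for _ in range(count):
        x = (A1 * x) % M1      # x_i = a_1 * x_{i-1} mod m_1
        out.append(x / M1)     # u_i = x_i / m_1
    return out


if __name__ == "__main__":
    print(lcg_stream(1, 3))
```

Python integers have arbitrary precision, so the product `A1 * x` is formed exactly; a fixed-width implementation would need extended arithmetic for the same step.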

#### 2.1.2  Wichmann–Hill I Generator

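This series of Wichmann–Hill base generators (see Maclaren (1989)) uses a combination of four linear congruential generators and has the form:
$$\begin{aligned} w_i &= a_1 w_{i-1} \bmod m_1 \\ x_i &= a_2 x_{i-1} \bmod m_2 \\ y_i &= a_3 y_{i-1} \bmod m_3 \\ z_i &= a_4 z_{i-1} \bmod m_4 \\ u_i &= \left( \frac{w_i}{m_1} + \frac{x_i}{m_2} + \frac{y_i}{m_3} + \frac{z_i}{m_4} \right) \bmod 1, \end{aligned} \tag{1}$$
where the $u_i$, for $i=1,2,\ldots$, form the required sequence.
The constants $a_i$ are in the range 112 to 127 and the constants $m_j$ are prime numbers in the range 16718909 to 16776971, which are close to $2^{24}=16777216$. These constants have been chosen so that each of the resulting 273 generators is essentially independent, all calculations can be carried out in 32-bit integer arithmetic and the generators give good results with the spectral test, see Knuth (1981) and Maclaren (1989). The period of each of these generators would be at least $2^{92}$ if it were not for common factors between $m_1-1$, $m_2-1$, $m_3-1$ and $m_4-1$. However, each generator should still have a period of at least $2^{80}$. Further discussion of the properties of these generators is given in Maclaren (1989).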

#### 2.1.3  Wichmann–Hill II Generator

This Wichmann–Hill base generator (see Wichmann and Hill (2006)) is of the same form as that described in Section 2.1.2, i.e., a combination of four linear congruential generators. In this case $a_1=11600$, $m_1=2147483579$, $a_2=47003$, $m_2=2147483543$, $a_3=23000$, $m_3=2147483423$, $a_4=33000$, $m_4=2147483123$.
Unlike in the original Wichmann–Hill generator, these values are too large to carry out the calculations detailed in (1) using 32-bit integer arithmetic. However, if
$$w_i = 11600\, w_{i-1} \bmod 2147483579$$
then setting
$$W_i = 11600 \left( w_{i-1} \bmod 185127 \right) - 10379 \left\lfloor w_{i-1} / 185127 \right\rfloor$$
gives
$$w_i = \begin{cases} W_i & \text{if } W_i \ge 0, \\ 2147483579 + W_i & \text{otherwise,} \end{cases}$$
and $W_i$ can be calculated in 32-bit integer arithmetic. Similar expressions exist for $x_i$, $y_i$ and $z_i$. The period of this generator is approximately $2^{121}$.
Further details of implementing this algorithm and its properties are given in Wichmann and Hill (2006). This paper also gives some useful guidelines on testing PRNGs.
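The decomposition used for $w_i$ can be checked numerically. The sketch below (function name `wh2_step` is hypothetical) evaluates the quotient and remainder form and compares it with the direct product; it relies on the identity $2147483579 = 11600 \times 185127 + 10379$, and every intermediate value stays below $2^{31}$ in magnitude.

```python
M = 2147483579        # modulus m_1 from the text
A = 11600             # multiplier a_1 from the text
Q, R = divmod(M, A)   # Q = 185127, R = 10379, so M = A * Q + R


def wh2_step(w_prev):
    """Advance w by one step using only quantities that fit in 32-bit integers."""
    big_w = A * (w_prev % Q) - R * (w_prev // Q)
    return big_w if big_w >= 0 else M + big_w


if __name__ == "__main__":
    for w in (1, 185127, 123456789, M - 1):
        # The direct product needs more than 32 bits; the two results agree.
        print(w, wh2_step(w), (A * w) % M)
```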

#### 2.1.4  Mersenne Twister Generator

The Mersenne Twister (see Matsumoto and Nishimura (1998)) is a twisted generalized feedback shift register generator. The algorithm underlying the Mersenne Twister is as follows:
(i) Set some arbitrary initial values $x_1,x_2,\ldots,x_r$, each consisting of $w$ bits.
(ii) Letting
$$A = \begin{pmatrix} 0 & I_{w-1} \\ a_w & a_{w-1} \cdots a_1 \end{pmatrix},$$
where $I_{w-1}$ is the $(w-1)\times(w-1)$ identity matrix and each of the $a_i$, $i=1$ to $w$, takes a value of either 0 or 1 (i.e., they can be represented as bits), define
$$x_{i+r} = x_{i+s} \oplus \left( x_i^{(w:l+1)} \,\middle|\, x_{i+1}^{(l:1)} \right) A,$$
where $x_i^{(w:l+1)} \,|\, x_{i+1}^{(l:1)}$ indicates the concatenation of the most significant (upper) $w-l$ bits of $x_i$ and the least significant (lower) $l$ bits of $x_{i+1}$.
(iii) Perform the following operations sequentially:
$$\begin{aligned} z &= x_{i+r} \oplus (x_{i+r} \gg t_1) \\ z &= z \oplus ((z \ll t_2) \text{ AND } m_1) \\ z &= z \oplus ((z \ll t_3) \text{ AND } m_2) \\ z &= z \oplus (z \gg t_4) \\ u_{i+r} &= z / (2^w - 1), \end{aligned}$$
where $t_1$, $t_2$, $t_3$ and $t_4$ are integers, $m_1$ and $m_2$ are bit-masks, '$\gg t$' and '$\ll t$' represent a $t$-bit shift right and left respectively, $\oplus$ is the bit-wise exclusive or (xor) operation and 'AND' is the bit-wise and operation.
The $u_{i+r}$, for $i=1,2,\ldots$, form the required sequence. The supplied implementation of the Mersenne Twister uses the following values for the algorithmic constants:

| Constant | Value |
|---|---|
| $w$ | 32 |
| $a$ | 0x9908b0df |
| $l$ | 31 |
| $r$ | 624 |
| $s$ | 397 |
| $t_1$ | 11 |
| $t_2$ | 7 |
| $t_3$ | 15 |
| $t_4$ | 18 |
| $m_1$ | 0x9d2c5680 |
| $m_2$ | 0xefc60000 |

where the notation 0xDD indicates the bit pattern of the integer whose hexadecimal representation is DD.
This algorithm has a period length of approximately $2^{19937}-1$ and has been shown to be uniformly distributed in 623 dimensions (see Matsumoto and Nishimura (1998)).
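Steps (ii) and (iii) can be illustrated with the constants listed above. The following Python sketch is not the library's code. In particular, step (i) leaves the initial values arbitrary, so the seeding rule in `initial_state` is an assumption borrowed from the widely circulated 2002 reference implementation of MT19937, and all function names are hypothetical.

```python
# Constants quoted in the text.
W, R, S, L = 32, 624, 397, 31
A = 0x9908B0DF
T1, T2, T3, T4 = 11, 7, 15, 18
M1, M2 = 0x9D2C5680, 0xEFC60000
WORD = (1 << W) - 1     # 2^w - 1, also the mask for a w-bit word
LOWER = (1 << L) - 1    # mask for the lower l bits
UPPER = WORD ^ LOWER    # mask for the upper w - l bits


def initial_state(seed):
    """Step (i). Assumed seeding rule; the text does not prescribe one."""
    x = [seed & WORD]
    for i in range(1, R):
        x.append((1812433253 * (x[-1] ^ (x[-1] >> 30)) + i) & WORD)
    return x


def twist(x):
    """Step (ii): regenerate all r words of the state in place."""
    for i in range(R):
        y = (x[i] & UPPER) | (x[(i + 1) % R] & LOWER)
        # Multiplying by A: shift right one bit, xor in `a` if the low bit was set.
        x[i] = x[(i + S) % R] ^ (y >> 1) ^ (A if y & 1 else 0)


def temper(z):
    """Step (iii): the four shift-and-mask operations."""
    z ^= z >> T1
    z ^= (z << T2) & M1
    z ^= (z << T3) & M2
    z ^= z >> T4
    return z


def mt_uniforms(seed, count):
    """Return `count` variates u = z / (2^w - 1)."""
    x = initial_state(seed)
    out = []
    for n in range(count):
        if n % R == 0:
            twist(x)
        out.append(temper(x[n % R]) / WORD)
    return out


if __name__ == "__main__":
    print(mt_uniforms(5489, 3))
```

With the assumed seeding rule and seed 5489, the first tempered word is 3499211612, the value commonly quoted for a default-seeded MT19937, which gives a quick check that the twist and tempering steps are wired correctly.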

#### 2.1.5  ACORN Generator

The ACORN generator is a special case of a multiple recursive generator (see Wikramaratna (1989) and Wikramaratna (2007)). The algorithm underlying ACORN is as follows:
(i) Choose an integer value $k \ge 1$.
(ii) Choose an integer value $M$, and an integer seed $Y_0^{(0)}$, such that $0 < Y_0^{(0)} < M$ and $Y_0^{(0)}$ and $M$ are relatively prime.
(iii) Choose an arbitrary set of $k$ initial integer values, $Y_0^{(1)}, Y_0^{(2)}, \ldots, Y_0^{(k)}$, such that $0 \le Y_0^{(m)} < M$, for all $m=1,2,\ldots,k$.
(iv) Perform the following sequentially:
$$Y_i^{(m)} = \left( Y_i^{(m-1)} + Y_{i-1}^{(m)} \right) \bmod M$$
for $m=1,2,\ldots,k$.
(v) Set $u_i = Y_i^{(k)}/M$.
The $u_i$, for $i=1,2,\ldots$, then form a pseudorandom sequence, with $u_i \in [0,1)$, for all $i$.
Although you can choose any value for $k$, $M$, $Y_0^{(0)}$ and the $Y_0^{(m)}$, within the constraints mentioned in (i) to (iii) above, it is recommended that $k \ge 10$, $M$ is chosen to be a large power of two with $M \ge 2^{60}$ and $Y_0^{(0)}$ is chosen to be odd.
The period of the ACORN generator, with the modulus $M$ equal to a power of two, and an odd value for $Y_0^{(0)}$, has been shown to be an integer multiple of $M$ (see Wikramaratna (1992)). Therefore, increasing $M$ will give a series with a longer period.
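Steps (i) to (v) translate directly into a short loop. The sketch below is illustrative only; it assumes that the order-zero term stays fixed at the seed, i.e. $Y_i^{(0)} = Y_0^{(0)}$ for all $i$, which the text does not state explicitly, and the function name `acorn` is hypothetical.

```python
def acorn(k, modulus, seed, initial, count):
    """Return `count` ACORN variates.

    k        order of the generator, k >= 1
    modulus  M
    seed     Y_0^(0), with 0 < seed < M and gcd(seed, M) = 1
    initial  the k values Y_0^(1), ..., Y_0^(k), each in [0, M)
    Assumption: Y_i^(0) equals the seed for every i.
    """
    y = [seed] + list(initial)   # y[m] holds the current Y^(m), m = 0..k
    out = []
    for _ in range(count):
        for m in range(1, k + 1):
            # y[m - 1] has already been updated for this i; y[m] is still Y_{i-1}^(m).
            y[m] = (y[m - 1] + y[m]) % modulus
        out.append(y[k] / modulus)
    return out


if __name__ == "__main__":
    # Tiny parameters, chosen so the arithmetic can be followed by hand.
    print(acorn(2, 8, 1, [0, 0], 4))   # [0.125, 0.375, 0.75, 0.25]
    # Parameters in line with the recommendations in the text.
    print(acorn(10, 2 ** 60, 123456789, [0] * 10, 3))
```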

#### 2.1.6  L'Ecuyer MRG32k3a Combined Recursive Generator

The base generator L'Ecuyer MRG32k3a (see L'Ecuyer and Simard (2002)) combines two multiple recursive generators:
$$\begin{aligned} x_i &= (a_{11} x_{i-1} + a_{12} x_{i-2} + a_{13} x_{i-3}) \bmod m_1 \\ y_i &= (a_{21} y_{i-1} + a_{22} y_{i-2} + a_{23} y_{i-3}) \bmod m_2 \\ z_i &= (x_i - y_i) \bmod m_1 \\ u_i &= (z_i + 1) / d \end{aligned}$$
where $a_{11}=0$, $a_{12}=1403580$, $a_{13}=-810728$, $m_1=2^{32}-209$, $a_{21}=527612$, $a_{22}=0$, $a_{23}=-1370589$, $m_2=2^{32}-22853$, and $u_i$, $i=1,2,\ldots$, form the required sequence. If $d=m_1$ then $u_i \in (0,1]$, else if $d=m_1+1$ then $u_i \in (0,1)$. Combining the two multiple recursive generators (MRG) results in sequences with better statistical properties in high dimensions and longer periods compared with those generated from a single MRG. The combined generator described above has a period length of approximately $2^{191}$.
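The combined recurrence can be sketched as follows. This is an illustration of the formulas with the quoted constants rather than the library's implementation; the function name `mrg32k3a`, the way the two three-word seeds are passed, and the seed values in the usage example are assumptions.

```python
M1 = 2 ** 32 - 209
M2 = 2 ** 32 - 22853
A12, A13 = 1403580, -810728    # a11 = 0
A21, A23 = 527612, -1370589    # a22 = 0


def mrg32k3a(seed_x, seed_y, count, d=M1 + 1):
    """Return `count` variates u_i = (z_i + 1) / d.

    seed_x, seed_y: the three most recent values of each component,
    ordered oldest first, e.g. (x_{i-3}, x_{i-2}, x_{i-1}).
    d = M1 + 1 gives values in (0, 1); d = M1 gives values in (0, 1].
    """
    x, y = list(seed_x), list(seed_y)
    out = []
    for _ in range(count):
        xi = (A12 * x[1] + A13 * x[0]) % M1   # uses x_{i-2} and x_{i-3}
        yi = (A21 * y[2] + A23 * y[0]) % M2   # uses y_{i-1} and y_{i-3}
        x = [x[1], x[2], xi]
        y = [y[1], y[2], yi]
        z = (xi - yi) % M1
        out.append((z + 1) / d)
    return out


if __name__ == "__main__":
    print(mrg32k3a((12345, 12345, 12345), (12345, 12345, 12345), 3))
```

Python's `%` operator returns a non-negative result for a positive modulus, so the negative multipliers need no special handling here.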

### 2.2  Quasi-random Numbers

Low discrepancy (quasi-random) sequences are used in numerical integration, simulation and optimization. Like pseudorandom numbers they are uniformly distributed but they are not statistically independent, rather they are designed to give more even distribution in multidimensional space (uniformity). Therefore they are often more efficient than pseudorandom numbers in multidimensional Monte–Carlo methods.
The quasi-random number generators implemented in this chapter generate a set of points $x^1,x^2,\ldots,x^N$ with high uniformity in the $S$-dimensional unit cube $I^S=[0,1]^S$. One measure of the uniformity is the discrepancy, which is defined as follows:
• Given a set of points $x^1,x^2,\ldots,x^N \in I^S$ and a subset $G \subset I^S$, define the counting function $S_N(G)$ as the number of points $x^i \in G$. For each $x=(x_1,x_2,\ldots,x_S) \in I^S$, let $G_x$ be the rectangular $S$-dimensional region
$$G_x = [0,x_1) \times [0,x_2) \times \cdots \times [0,x_S)$$
with volume $x_1 x_2 \cdots x_S$. Then the discrepancy of the points $x^1,x^2,\ldots,x^N$ is
$$D_N^*(x^1,x^2,\ldots,x^N) = \sup_{x \in I^S} \left| S_N(G_x) - N \prod_{k=1}^{S} x_k \right|.$$
The discrepancy of the first $N$ terms of such a sequence has the form
$$D_N^*(x^1,x^2,\ldots,x^N) \le C_S (\log N)^S + O\left((\log N)^{S-1}\right) \quad \text{for all } N \ge 2.$$
The principal aim in the construction of low-discrepancy sequences is to find sequences of points in IS with a bound of this form where the constant CS is as small as possible.
Three types of low-discrepancy sequences are supplied in this library, due to Sobol, Faure and Niederreiter. Two sets of Sobol sequences are supplied: the first is based on the work of Joe and Kuo (2008) and the second on the work of Bratley and Fox (1988). More information on quasi-random number generation and the Sobol, Faure and Niederreiter sequences in particular can be found in Bratley and Fox (1988) and Fox (1986).
The efficiency of a simulation exercise may often be increased by the use of variance reduction methods (see Morgan (1984)). It is also worth considering whether a simulation is the best approach to solving the problem. For example, low-dimensional integrals are usually more efficiently calculated by routines in Chapter D01 rather than by Monte–Carlo integration.

### 2.3  Scrambled Quasi-random Numbers

Scrambled quasi-random sequences are an extension of standard quasi-random sequences that attempt to eliminate the bias inherent in a quasi-random sequence whilst retaining the low-discrepancy properties. The use of a scrambled sequence allows error estimation of Monte–Carlo results by performing a number of iterates and computing the variance of the results.
This implementation of scrambled quasi-random sequences is based on TOMS algorithm 823 and details can be found in the accompanying paper, Hong and Hickernell (2003). Three methods of scrambling are supplied; the first a restricted form of Owen's scrambling (Owen (1995)), the second based on the method of Faure and Tezuka (2000) and the last method combines the first two.
Scrambled versions of both Sobol sequences and the Niederreiter sequence can be obtained.

### 2.4  Non-uniform Random Numbers

Random numbers from other distributions may be obtained from the uniform random numbers by the use of transformations and rejection techniques, and for discrete distributions, by table based methods.
(a) Transformation Methods. For a continuous random variable, if the cumulative distribution function (CDF) is $F(x)$ then for a uniform $(0,1)$ random variate $u$, $y=F^{-1}(u)$ will have CDF $F(x)$. This method is only efficient in a few simple cases such as the exponential distribution with mean $\mu$, in which case $F^{-1}(u)=-\mu\log u$. Other transformations are based on the joint distribution of several random variables. In the bivariate case, if $v$ and $w$ are random variates there may be a function $g$ such that $y=g(v,w)$ has the required distribution; for example, the Student's $t$-distribution with $n$ degrees of freedom, in which $v$ has a Normal distribution, $w$ has a gamma distribution and $g(v,w)=v\sqrt{n/w}$.
(b) Rejection Methods. Rejection techniques are based on the ability to easily generate random numbers from a distribution (called the envelope) similar to the distribution required. The value from the envelope distribution is then accepted as a random number from the required distribution with a certain probability; otherwise, it is rejected and a new number is generated from the envelope distribution.
(c) Table Search Methods. For discrete distributions, if the cumulative probabilities, $P_i=\mathrm{Prob}(x \le i)$, are stored in a table then, given $u$ from a uniform $(0,1)$ distribution, the table is searched for $i$ such that $P_{i-1} < u \le P_i$.
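As a small illustration of methods (a) and (c), the Python sketch below inverts the exponential CDF and performs a table search with a binary search. The function names are hypothetical, and the search rule assumed is that the returned index $i$ is the smallest one whose cumulative probability $P_i$ is at least $u$.

```python
import bisect
import math
import random


def exponential_variate(u, mean):
    """Transformation method: F^{-1}(u) = -mean * log(u) for u in (0, 1]."""
    return -mean * math.log(u)


def table_search(cum_probs, u):
    """Table search: smallest index i with u <= cum_probs[i] (0-based).

    `cum_probs` must be non-decreasing and end at 1.0.
    """
    return bisect.bisect_left(cum_probs, u)


if __name__ == "__main__":
    rng = random.Random(2024)
    # Discrete distribution on {0, 1, 2} with probabilities 0.2, 0.3, 0.5.
    table = [0.2, 0.5, 1.0]
    print([table_search(table, rng.random()) for _ in range(10)])
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    print([exponential_variate(1.0 - rng.random(), 2.0) for _ in range(3)])
```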

### 2.5  Copulas

A copula is a function that links the univariate marginal distributions with their multivariate distribution. Sklar's theorem (see Sklar (1973)) states that if $f$ is an $m$-dimensional distribution function with continuous margins $f_1, f_2, \ldots, f_m$, then $f$ has a unique copula representation, $c$, such that
$$f(x_1, x_2, \ldots, x_m) = c\left(f_1(x_1), f_2(x_2), \ldots, f_m(x_m)\right).$$
The copula, $c$, is a multivariate uniform distribution whose dependence structure is defined by the dependence structure of the multivariate distribution $f$, with
$$c(u_1, u_2, \ldots, u_m) = f\left(f_1^{-1}(u_1), f_2^{-1}(u_2), \ldots, f_m^{-1}(u_m)\right),$$
where $u_i \in [0,1]$. This relationship can be used to simulate variates from distributions defined by the dependence structure of one distribution and each of the marginal distributions given by another. For additional information see Nelsen (1998) or Boye (Unpublished manuscript) and the references therein.
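One way to picture this relationship is a bivariate Gaussian copula combined with exponential margins. The sketch below is a hedged illustration using only the Python standard library; the choice of copula, the margins, the correlation value and all function names are assumptions for the example and do not describe any particular library routine.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_pairs(rho, count, rng):
    """Draw `count` pairs (u1, u2) from a bivariate Gaussian copula.

    The dependence structure comes from a bivariate Normal distribution
    with correlation `rho`; each margin is mapped to uniform by its CDF.
    """
    phi = NormalDist().cdf
    pairs = []
    for _ in range(count):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = g1
        z2 = rho * g1 + math.sqrt(1.0 - rho * rho) * g2   # corr(z1, z2) = rho
        pairs.append((phi(z1), phi(z2)))
    return pairs


def exponential_margin(u, mean):
    """Inverse CDF of an exponential margin, valid for u in [0, 1)."""
    return -mean * math.log(1.0 - u)


if __name__ == "__main__":
    rng = random.Random(1)
    for u1, u2 in gaussian_copula_pairs(0.8, 5, rng):
        # Dependence from the Normal distribution, margins from the exponential.
        print(exponential_margin(u1, 1.0), exponential_margin(u2, 3.0))
```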

### 2.6  Brownian Bridge

#### 2.6.1  Brownian Bridge Process

Fix two times $t_0<T$ and let $W = (W_t)_{0 \le t \le T-t_0}$ be a standard $d$-dimensional Wiener process on the interval $[0,T-t_0]$. Recall that the terms Wiener process and Brownian motion are often used interchangeably.
A standard $d$-dimensional Brownian bridge $B = (B_t)_{t_0 \le t \le T}$ on $[t_0,T]$ is defined (see Revuz and Yor (1999)) as
$$B_t = W_{t-t_0} - \frac{t-t_0}{T-t_0}\, W_{T-t_0}.$$
The process is continuous, starts at zero at time $t_0$ and ends at zero at time $T$. It is Gaussian, has zero mean and has a covariance structure given by
$$\mathbb{E}\left[B_s B_t^T\right] = \frac{(s-t_0)(T-t)}{T-t_0}\, I_d$$
for any $s \le t$ in $[t_0,T]$, where $I_d$ is the $d$-dimensional identity matrix. The Brownian bridge is often called a non-free or 'pinned' Wiener process since it is forced to be 0 at time $T$, but is otherwise very similar to a standard Wiener process.
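The definition of $B_t$ can be tried out directly in one dimension: simulate a Wiener path on a grid and subtract the linear term. The Python sketch below is a minimal illustration of that formula, with hypothetical function names and an arbitrary uniform time grid; it is not the bridge construction algorithm discussed later in this section.

```python
import math
import random


def wiener_path(times, rng):
    """Sample a standard one-dimensional Wiener process at increasing `times`.

    `times[0]` must be 0; independent Normal increments have variance equal
    to the time step.
    """
    w = [0.0]
    for prev, cur in zip(times, times[1:]):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(cur - prev)))
    return w


def brownian_bridge(t0, big_t, steps, rng):
    """Return (grid, B) with B_t = W_{t-t0} - (t-t0)/(T-t0) * W_{T-t0}."""
    grid = [t0 + (big_t - t0) * j / steps for j in range(steps + 1)]
    w = wiener_path([t - t0 for t in grid], rng)
    bridge = [w[j] - (grid[j] - t0) / (big_t - t0) * w[-1]
              for j in range(steps + 1)]
    return grid, bridge


if __name__ == "__main__":
    grid, bridge = brownian_bridge(1.0, 3.0, 8, random.Random(7))
    for t, b in zip(grid, bridge):
        print(f"{t:5.2f} {b: .4f}")
```

The first and last printed values are zero (up to rounding), matching the pinned end points described above.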
We can generalize this construction as follows. Fix points $x,w \in \mathbb{R}^d$, let $\Sigma$ be a $d \times d$ covariance matrix and choose any $d \times d$ matrix $C$ such that $CC^T=\Sigma$. The generalized $d$-dimensional Brownian bridge $X = (X_t)_{t_0 \le t \le T}$ is defined by setting
$$X_t = \frac{(t-t_0)w + (T-t)x}{T-t_0} + C B_t = \frac{(t-t_0)w + (T-t)x}{T-t_0} + C W_{t-t_0} - \frac{t-t_0}{T-t_0}\, C W_{T-t_0}$$
for all $t \in [t_0,T]$. The process $X$ is continuous, starts at $x$ at time $t_0$ and ends at $w$ at time $T$. It has mean $\left((t-t_0)w + (T-t)x\right)/(T-t_0)$ and covariance structure
$$\mathbb{E}\left[(X_s - \mathbb{E}X_s)(X_t - \mathbb{E}X_t)^T\right] = \mathbb{E}\left[C B_s B_t^T C^T\right] = \frac{(s-t_0)(T-t)}{T-t_0}\, \Sigma$$
for all $s \le t$ in $[t_0,T]$. This is a non-free Wiener process since it is forced to be equal to $w$ at time $T$. However, if we set $w = x + C W_{T-t_0}$, then $X$ simplifies to
$$X_t = x + C W_{t-t_0}$$
for all $t \in [t_0,T]$, which is nothing other than a $d$-dimensional Wiener process with covariance given by $\Sigma$.
Figure 1: Two sample paths for a two-dimensional free Wiener process
Figure 1 shows two sample paths for a two-dimensional free Wiener process $X = (X_t^1, X_t^2)_{0 \le t \le 2}$. The correlation coefficient between the one-dimensional processes $X^1$ and $X^2$ at any time is $\rho=0.80$. Note that the red and green paths in each figure are uncorrelated. However, it is fairly evident that the two red paths are correlated, and that the two green paths are correlated (when one path increases so does the other, and vice versa).
Figure 2: Two sample paths for a two-dimensional non-free Wiener process. The process starts at $(0,0)$ and ends at $(1,-1)$
Figure 2 shows two sample paths for a two-dimensional non-free Wiener process. The process starts at $(0,0)$ and ends at $(1,-1)$. The correlation coefficient between the one-dimensional processes is again $\rho=0.80$. The red and green paths in each figure are uncorrelated, while the two red paths tend to increase and decrease together, as do the two green paths. Both Figure 1 and Figure 2 were constructed using G05XBF.

#### 2.6.2  Brownian Bridge Algorithm

The order in which the successive interpolation times $t_j$ are chosen is called the bridge construction order. Since all construction orders will produce a correct process, the question arises whether one construction order should be preferred over another. When the $Z$ values are drawn from a pseudorandom generator, the answer is typically no. However, the bridge algorithm is frequently used with quasi-random numbers, and in this case the bridge construction order can be important.

#### 2.6.3  Bridge Construction Order and Quasi-random Sequences

Consider the one-dimensional case of a free Wiener process where $d=C=1$. The Brownian bridge is frequently combined with low-discrepancy (quasi-random) sequences to perform quasi-Monte–Carlo integration. Quasi-random points $Z^1, Z^2, Z^3, \ldots$ are generated from the standard Normal distribution, where each quasi-random point $Z^i = (Z_1^i, Z_2^i, \ldots, Z_D^i)$ consists of $D$ one-dimensional values. The process $X$ starts at $X_{t_0}=x$ which is known. There remain $N+1$ time points at which the bridge is to be computed, namely $(X_{t_i})_{1 \le i \le N}$ and $X_T$ (recall we are considering a free Wiener process). In this case $D$ is set equal to $N+1$, so that $N+1$ dimensional quasi-random points are generated. A single quasi-random point is used to construct one Wiener sample path.
The question is how to use the dimension values of each $N+1$ dimensional quasi-random point. Often the 'lower' dimension values ($Z_1^i$, $Z_2^i$, etc.) display better uniformity properties than the 'higher' dimension values ($Z_{N+1}^i$, $Z_N^i$, etc.) so that the 'lower' dimension values should be used to construct the most important sections of the sample path. For example, consider a model which is particularly sensitive to the behaviour of the underlying process at time 3. When constructing the sample paths, one would therefore ensure that time 3 was one of the interpolation points of the bridge, and that a 'lower' dimension value was used in (2) to construct the corresponding bridge point $X_3$. Indeed, one would most likely also ensure that $X_3$ was one of the first bridge points that was constructed: 'lower' dimension values would be used to construct both the left and right bridge points used in (2) to interpolate $X_3$, so that the distribution of $X_3$ benefits as much as possible from the uniformity properties of the quasi-random sequence. For further discussions in this regard we refer to Glasserman (2004). These remarks extend readily to the case of a non-free Wiener process.

### 2.7  Random Fields

A random field is a stochastic process, taking values in a Euclidean space, and defined over a parameter space of dimensionality at least one. They are often used to simulate some physical space-dependent parameter, such as the permeability of rock, which cannot be measured at every point in the space. The simulated values can then be used to model other dependent quantities, for example, underground flow of water, often through the use of partial differential equations (PDEs).
A $d$-dimensional random field $Z(x)$ is a function which is random at every point $x \in D$ for some domain $D \subset \mathbb{R}^d$, so $Z(x)$ is a random variable for each $x$. The random field has a mean function $\mu(x)=\mathbb{E}[Z(x)]$ and a symmetric positive semidefinite covariance function $C(x,y)=\mathbb{E}\left[(Z(x)-\mu(x))(Z(y)-\mu(y))\right]$.
A random field, $Z(x)$, is a Gaussian random field if, for any choice of $n \in \mathbb{N}$ and $x_1,\ldots,x_n \in \mathbb{R}^d$, the random vector $[Z(x_1),\ldots,Z(x_n)]^T$ follows a multivariate Gaussian distribution.
A Gaussian random field $Z(x)$ is stationary if $\mu(x)$ is constant for all $x$ and $C(x,y)=C(x+a,y+a)$ for all $x,y,a \in \mathbb{R}^d$, and hence we can express the covariance function $C(x,y)$ as a function $\gamma$ of one variable: $C(x,y)=\gamma(x-y)$. $\gamma$ is known as a variogram (or more correctly, a semivariogram) and includes the multiplicative factor $\sigma^2$ representing the variance such that $\gamma(0)=\sigma^2$. There are a number of commonly used variograms, including:
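1. Symmetric stable variogram
$$\gamma(x) = \sigma^2 \exp\left(-(x')^{\nu}\right).$$
2. Cauchy variogram
$$\gamma(x) = \sigma^2 \left(1+(x')^2\right)^{-\nu}.$$
3. Differential variogram with compact support
$$\gamma(x) = \begin{cases} \sigma^2 \left(1+8x'+25(x')^2+32(x')^3\right)(1-x')^8, & x'<1, \\ 0, & x' \ge 1. \end{cases}$$
4. Exponential variogram
$$\gamma(x) = \sigma^2 \exp(-x').$$
5. Gaussian variogram
$$\gamma(x) = \sigma^2 \exp\left(-(x')^2\right).$$
6. Nugget variogram
$$\gamma(x) = \begin{cases} \sigma^2, & x=0, \\ 0, & x \ne 0. \end{cases}$$
7. Spherical variogram
$$\gamma(x) = \begin{cases} \sigma^2 \left(1-1.5x'+0.5(x')^3\right), & x'<1, \\ 0, & x' \ge 1. \end{cases}$$
8. Bessel variogram
$$\gamma(x) = \sigma^2\, \frac{2^{\nu}\,\Gamma(\nu+1)\,J_{\nu}(x')}{(x')^{\nu}},$$
9. Hole effect variogram
$$\gamma(x) = \sigma^2\, \frac{\sin(x')}{x'}.$$
10. Whittle–Matérn variogram
$$\gamma(x) = \sigma^2\, \frac{2^{1-\nu}(x')^{\nu} K_{\nu}(x')}{\Gamma(\nu)},$$
11. Continuously parameterised variogram with compact support
$$\gamma(x) = \begin{cases} \sigma^2\, \dfrac{2^{1-\nu}(x')^{\nu} K_{\nu}(x')}{\Gamma(\nu)} \left(1+8x''+25(x'')^2+32(x'')^3\right)(1-x'')^8, & x''<1, \\ 0, & x'' \ge 1, \end{cases}$$
12. Generalized hyperbolic distribution variogram
$$\gamma(x) = \sigma^2\, \frac{\left(\delta^2+(x')^2\right)^{\lambda/2}}{\delta^{\lambda} K_{\lambda}(\kappa\delta)}\, K_{\lambda}\!\left(\kappa\left(\delta^2+(x')^2\right)^{1/2}\right),$$
13. Cosine variogram
$$\gamma(x) = \sigma^2 \cos(x'),$$
where $x'$ is a scaled norm of $x$.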
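A few of the simpler variograms are easy to evaluate directly. In the Python sketch below the scaled norm is assumed, purely for illustration, to be the Euclidean norm divided by a single correlation length; the text does not specify the scaling, and all function names are hypothetical.

```python
import math


def scaled_norm(x, length):
    """Assumed scaling: Euclidean norm of x divided by a correlation length."""
    return math.sqrt(sum(c * c for c in x)) / length


def exponential_variogram(xs, sigma2):
    """sigma^2 * exp(-x'), with xs the scaled norm x'."""
    return sigma2 * math.exp(-xs)


def gaussian_variogram(xs, sigma2):
    """sigma^2 * exp(-(x')^2)."""
    return sigma2 * math.exp(-xs * xs)


def spherical_variogram(xs, sigma2):
    """sigma^2 * (1 - 1.5 x' + 0.5 x'^3) for x' < 1, and 0 otherwise."""
    if xs >= 1.0:
        return 0.0
    return sigma2 * (1.0 - 1.5 * xs + 0.5 * xs ** 3)


if __name__ == "__main__":
    xs = scaled_norm((0.3, 0.4), length=1.0)   # Euclidean norm is 0.5
    for gamma in (exponential_variogram, gaussian_variogram, spherical_variogram):
        print(gamma.__name__, gamma(xs, sigma2=2.0))
```

Each function returns $\sigma^2$ at $x'=0$, in line with $\gamma(0)=\sigma^2$.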

### 2.8  Other Random Structures

In addition to random numbers from various distributions, random compound structures can be generated. These include random time series, random matrices and random samples.

### 2.9  Multiple Streams of Pseudorandom Numbers

It is often advantageous to be able to generate variates from multiple, independent, streams (or sequences) of random variates, for example when running a simulation in parallel on several processors. There are four ways of generating multiple streams using the routines available in this chapter:
(i) using different initial values (seeds);
(ii) using different generators;
(iii) skip ahead (also called block-splitting);
(iv) leap-frogging.

#### 2.9.1  Multiple Streams via Different Initial Values (Seeds)

A different sequence of variates can be generated from the same base generator by initializing the generator using a different set of seeds. The statistical properties of the base generators are only guaranteed within, not between sequences. For example, two sequences generated from two different starting points may overlap if these initial values are not far enough apart. The potential for overlapping sequences is reduced if the period of the generator being used is large. In general, of the four methods for creating multiple streams described here, this is the least satisfactory.
The one exception to this is the Wichmann–Hill II generator. The Wichmann and Hill (2006) paper describes a method of generating blocks of variates, with lengths up to $2^{90}$, by fixing the first three seed values of the generator ($w_0$, $x_0$ and $y_0$), and setting $z_0$ to a different value for each stream required. This is similar to the skip-ahead method described in Section 2.9.3, in that the full sequence of the Wichmann–Hill II generator is split into a number of different blocks, in this case with a fixed length of $2^{90}$, but without the computationally intensive initialization usually required for the skip-ahead method.

#### 2.9.2  Multiple Streams via Different Generators

Independent sequences of variates can be generated using a different base generator for each sequence. For example, sequence 1 can be generated using the NAG basic generator, sequence 2 using Mersenne Twister, sequence 3 the ACORN generator and sequence 4 using L'Ecuyer generator. The Wichmann–Hill I generator implemented in this chapter is, in fact, a series of 273 independent generators. The particular sub-generator to use is selected using the SUBID variable. Therefore, in total, 278 independent streams can be generated with each using a different generator (273 Wichmann–Hill I generators, and 5 additional base generators).

#### 2.9.3  Multiple Streams via Skip-ahead

Independent sequences of variates can be generated from a single base generator through the use of block-splitting, or skipping-ahead. This method consists of splitting the sequence into k non-overlapping blocks, each of length n, where n is no smaller than the maximum number of variates required from any of the sequences. For example,
$$\underbrace{x_1, x_2, \ldots, x_n}_{\text{block 1}},\ \underbrace{x_{n+1}, x_{n+2}, \ldots, x_{2n}}_{\text{block 2}},\ \underbrace{x_{2n+1}, x_{2n+2}, \ldots, x_{3n}}_{\text{block 3}},\ \text{etc.}$$
where $x_1,x_2,\ldots$ is the sequence produced by the generator of interest. Each of the $k$ blocks provides an independent sequence.
The skip-ahead algorithm therefore requires the sequence to be advanced a large number of places, as to generate values from, say, block $b$, you must skip over the $(b-1)n$ values in the first $b-1$ blocks. Due to their form this can be done efficiently for linear congruential generators and multiple congruential generators. A skip-ahead algorithm is also provided for the Mersenne Twister generator.
Although skip-ahead requires some additional computation at the initialization stage (to ‘fast forward’ the sequence) no additional computation is required at the generation stage.

#### 2.9.4  Multiple Streams via Leap-frog

Independent sequences of variates can also be generated from a single base generator through the use of leap-frogging. This method involves splitting the sequence from a single generator into k disjoint subsequences. For example:
$$\begin{array}{ll} \text{Subsequence } 1: & x_1, x_{k+1}, x_{2k+1}, \ldots \\ \text{Subsequence } 2: & x_2, x_{k+2}, x_{2k+2}, \ldots \\ \quad\vdots & \\ \text{Subsequence } k: & x_k, x_{2k}, x_{3k}, \ldots, \end{array}$$
where $x_1,x_2,\ldots$ is the sequence produced by the generator of interest. Each of the $k$ subsequences then provides an independent stream of variates.
The leap-frog algorithm therefore requires the generation of every kth variate from the base generator. Due to their form this can be done efficiently for linear congruential generators and multiple congruential generators. A leap-frog algorithm is provided for the NAG Basic generator, both the Wichmann–Hill I and Wichmann–Hill II generators and L'Ecuyer generator.
It is known that, dependent on the number of streams required, leap-frogging can lead to sequences with poor statistical properties, especially when applied to linear congruential generators. In addition, leap-frogging can increase the time required to generate each variate. Therefore leap-frogging should be avoided unless absolutely necessary.

#### 2.9.5  Skip-ahead and Leap-frog for a Linear Congruential Generator (LCG): An Example

As an illustrative example, a brief description of the algebra behind the implementation of the leap-frog and skip-ahead algorithms for a linear congruential generator is given. A linear congruential generator has the form $x_{i+1} = a_1 x_i \bmod m_1$. The recursive nature of a linear congruential generator means that
$$\begin{aligned} x_{i+v} &= a_1 x_{i+v-1} \bmod m_1 \\ &= a_1 \left(a_1 x_{i+v-2} \bmod m_1\right) \bmod m_1 \\ &= a_1^2 x_{i+v-2} \bmod m_1 \\ &= a_1^v x_i \bmod m_1. \end{aligned}$$
The sequence can therefore be quickly advanced $v$ places by multiplying the current state ($x_i$) by $a_1^v \bmod m_1$, hence skipping the sequence ahead. Leap-frogging can be implemented by using $a_1^k$, where $k$ is the number of streams required, in place of $a_1$ in the standard linear congruential generator recursive formula, in order to advance $k$ places, rather than one, at each iteration.
In a linear congruential generator the multiplier $a_1$ is constructed so that the generator has good statistical properties in, for example, the spectral test. When using leap-frogging to construct multiple streams this multiplier is replaced with $a_1^k$, and there is no guarantee that this new multiplier will have suitable properties, especially as the value of $k$ depends on the number of streams required and so is likely to change depending on the application. This problem can be exacerbated by the lattice structure of linear congruential generators. Similarly, the value of $a_1$ is often chosen such that the computation $a_1 x_i \bmod m_1$ can be performed efficiently. When $a_1$ is replaced by $a_1^k$, this is often no longer the case.
Note that, due to rounding, when using a distributional generator, a sequence generated using leap-frogging and a sequence constructed by taking every $k$th value from a set of variates generated without leap-frogging may differ slightly. These differences should only affect the least significant digit.
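The algebra above can be checked with a few lines of Python, here using the NAG basic generator constants quoted in Section 2.1.1. The function names and the starting state are hypothetical, and the sketch works on the integer states $x_i$ rather than on transformed variates.

```python
A1 = 13 ** 13
M1 = 2 ** 59


def step(x):
    """One step of the generator: x_{i+1} = a_1 x_i mod m_1."""
    return (A1 * x) % M1


def skip_ahead(x, v):
    """Advance the state v places at once using a_1^v mod m_1."""
    return (pow(A1, v, M1) * x) % M1


def leap_frog_stream(x0, k, stream, count):
    """Return `count` states of leap-frog stream `stream` (0-based) out of k.

    Stream s consists of x_{s+1}, x_{s+1+k}, x_{s+1+2k}, ...
    """
    a_k = pow(A1, k, M1)            # replacement multiplier a_1^k
    x = skip_ahead(x0, stream + 1)  # first member of this stream
    out = []
    for _ in range(count):
        out.append(x)
        x = (a_k * x) % M1
    return out


if __name__ == "__main__":
    x0 = 12345
    seq, x = [], x0
    for _ in range(12):             # x_1, ..., x_12 generated one step at a time
        x = step(x)
        seq.append(x)
    print(skip_ahead(x0, 7) == seq[6])                      # True
    print(leap_frog_stream(x0, 3, 1, 4) == seq[1::3][:4])   # True
```

The three-argument form of `pow` performs modular exponentiation by repeated squaring, so the cost of skipping ahead grows only with the logarithm of the distance skipped.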

#### 2.9.6  Skip-ahead and Leap-frog for the Mersenne Twister: An Example

Skipping ahead with the Mersenne Twister generator is based on the definition of a $k \times k$ (where $k=19937$) transition matrix, $A$, over the finite field $\mathbb{F}_2$ (with elements 0 and 1). Multiplying $A$ by the current state $x_n$, represented as a vector of bits, produces the next state vector $x_{n+1}$:
$$x_{n+1} = A x_n.$$
Thus, skipping ahead $v$ places in a sequence is equivalent to multiplying by $A^v$:
$$x_{n+v} = A^v x_n.$$
Since calculating $A^v$ by a standard square and multiply algorithm is $O(k^3 \log v)$ and requires over 47MB of memory (see Haramoto et al. (2008)), an indirect calculation is performed which relies on a property of the characteristic polynomial $p(z)$ of $A$, namely that $p(A)=0$. We then define
$$g(z) = z^v \bmod p(z) = a_{k-1} z^{k-1} + \cdots + a_1 z + a_0,$$
and observe that
$$g(z) = z^v + q(z)\, p(z)$$
for a polynomial $q(z)$. Since $p(A)=0$, we have that $g(A) = A^v$ and
$$A^v x_n = \left(a_{k-1} A^{k-1} + \cdots + a_1 A + a_0 I\right) x_n.$$
This polynomial evaluation can be performed using Horner's method:
$$A^v x_n = A\left(\cdots A\left(A\left(A\, a_{k-1} x_n + a_{k-2} x_n\right) + a_{k-3} x_n\right) + \cdots + a_1 x_n\right) + a_0 x_n,$$
which reduces the problem to advancing the generator $k-1$ places from state $x_n$ and adding (where addition is as defined over $\mathbb{F}_2$) the intermediate states for which $a_i$ is nonzero.
There are therefore two stages to skipping the Mersenne Twister ahead v places:
(i) Calculate the coefficients of the polynomial $g(z) = z^v \bmod p(z)$;
(ii) advance the sequence $k-1$ places from the starting state and add the intermediate states that correspond to nonzero coefficients in the polynomial calculated in the first step.
The resulting state is that for position v in the sequence.
The cost of calculating the polynomial is $O(k^2 \log v)$ and the cost of applying it to state is constant. Skip ahead functionality is typically used in order to generate $n$ independent pseudorandom number streams (e.g., for separate threads of computation). There are two options for generating the $n$ states:
(i) On the master thread calculate the polynomial for a skip ahead distance of $v$ and apply this polynomial to state $n$ times, after each iteration $j$ saving the current state for later usage by thread $j$.
(ii) Have each thread $j$ independently and in parallel with other threads calculate the polynomial for a distance of $(j+1)v$ and apply to the original state.
Since $\log(nv)/\log v \to 1$ as $v \to \infty$, for large $v$ the cost of generating the polynomial for a skip ahead distance of $nv$ (i.e., the calculation performed by thread $n-1$ in option (ii) above) is approximately the same as generating that for a distance of $v$ (i.e., the calculation performed by thread 0). However, only one application to state need be made per thread, and if $n$ is sufficiently large the cost of applying the polynomial to state becomes the dominant cost in option (i), in which case it is desirable to use option (ii). Tests have shown that as a guideline it becomes worthwhile to switch from option (i) to option (ii) for approximately $n>30$.
Leap frog calculations with the Mersenne Twister are performed by computing the sequence fully up to the required size and discarding the redundant numbers for a given stream.
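The polynomial technique is easier to follow on a much smaller $\mathbb{F}_2$-linear generator. The Python sketch below applies the same two stages to a toy 16-bit linear feedback shift register, not to the Mersenne Twister itself; the register size, the tap positions and all function names are assumptions chosen for illustration. For this register the polynomial $p(z) = z^{16} + z^5 + z^3 + z^2 + 1$ satisfies $p(A)=0$, because every state vector obeys the same recurrence as the underlying bit sequence.

```python
K = 16                    # state size in bits (toy value; the text has k = 19937)
TAPS = (0, 2, 3, 5)       # s[n+16] = s[n] ^ s[n+2] ^ s[n+3] ^ s[n+5]
P = (1 << K) | sum(1 << j for j in TAPS)   # p(z), bit i = coefficient of z^i


def step(state):
    """Advance one place; bit i of `state` holds s[n+i]."""
    new_bit = 0
    for j in TAPS:
        new_bit ^= (state >> j) & 1
    return (state >> 1) | (new_bit << (K - 1))


def poly_mod(a, p):
    """Remainder of polynomial a modulo p over F_2 (polynomials as bit masks)."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a


def poly_mul_mod(a, b, p):
    """Product of polynomials a and b modulo p over F_2."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a = poly_mod(a << 1, p)
    return result


def z_pow_mod(v, p):
    """Stage (i): coefficients of g(z) = z^v mod p(z), by square and multiply."""
    result, base = 1, 2            # the polynomials 1 and z
    while v:
        if v & 1:
            result = poly_mul_mod(result, base, p)
        base = poly_mul_mod(base, base, p)
        v >>= 1
    return result


def skip_ahead(state, v):
    """Stage (ii): advance K-1 places, adding states where g has a nonzero coefficient."""
    g = z_pow_mod(v, P)
    acc = 0
    for i in range(K):
        if (g >> i) & 1:
            acc ^= state           # addition over F_2 is xor
        state = step(state)
    return acc


if __name__ == "__main__":
    x = 0xACE1
    direct = x
    for _ in range(1000):
        direct = step(direct)
    print(skip_ahead(x, 1000) == direct)   # True
```

The accumulation here runs forward through the intermediate states rather than in Horner form, but it adds exactly the same states.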

## 3  Recommendations on Choice and Use of Available Routines

### 3.1  Pseudorandom Numbers

#### 3.1.1  Initialization

Prior to generating any variates the base generator must be initialized. Two utility routines are provided for this, G05KFF and G05KGF, both of which allow any of the base generators to be chosen.
G05KFF selects and initializes a base generator to a repeatable (when executed serially) state: two calls of G05KFF with the same argument-values will result in the same subsequent sequences of random numbers (when both generated serially).
G05KGF selects and initializes a base generator to a non-repeatable state in such a way that different calls of G05KGF, either in the same run or different runs of the program, will almost certainly result in different subsequent sequences of random numbers.
No utilities for saving, retrieving or copying the current state of a generator have been provided. All of the information on the current state of a generator (or stream, if multiple streams are being used) is stored in the integer array STATE and as such this array can be treated as any other integer array, allowing for easy copying, restoring, etc.
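The distinction between repeatable and non-repeatable initialization, and the idea of saving and restoring a generator by copying its state, can be sketched as follows. Python's `random.Random` is used as a stand-in for a base generator; `make_repeatable`, `make_nonrepeatable` and `draw` are hypothetical helper names, and the saved state object plays the role of a copy of the STATE array.

```python
import random


def make_repeatable(seed):
    """Analogue of a repeatable initialization: same seed, same sequence."""
    return random.Random(seed)


def make_nonrepeatable():
    """Analogue of a non-repeatable initialization: seeded from system entropy."""
    return random.Random()


def draw(rng, count):
    """Generate `count` uniform variates from the given generator."""
    return [rng.random() for _ in range(count)]


rng = make_repeatable(42)
saved_state = rng.getstate()   # copy of the generator state
first_run = draw(rng, 3)
rng.setstate(saved_state)      # restore the copy
second_run = draw(rng, 3)      # identical to first_run
```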

#### 3.1.2  Repeated initialization

As mentioned in Section 2.9.1, it is important to note that the statistical properties of pseudorandom numbers are only guaranteed within sequences and not between sequences produced by the same generator. Repeated initialization will thus render the numbers obtained less rather than more independent. In a simple case there should be only one call to G05KFF or G05KGF and this call should be before any call to an actual generation routine.
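The extreme case of repeated initialization is easy to demonstrate. In the sketch below, which again uses Python's `random.Random` as a stand-in for a base generator and hypothetical function names, re-initializing with the same seed before every draw returns the same value each time, whereas a single initialization yields a proper sequence.

```python
import random


def draws_single_init(seed, count):
    """Recommended pattern: initialize once, then generate."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


def draws_reinit_each_time(seed, count):
    """Anti-pattern: re-initialize before every draw."""
    values = []
    for _ in range(count):
        rng = random.Random(seed)   # resets the generator to the same state
        values.append(rng.random())
    return values
```

Re-initializing with varying seeds is less obviously wrong, but the resulting values come from different sequences and so carry no guarantee of independence.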

#### 3.1.3  Choice of Base Generator

When choosing a base generator, the period of the chosen generator should be borne in mind. A good rule of thumb is never to use more numbers than the square root of the period in any one experiment, as beyond this point the statistical properties are impaired. For closely related reasons, breaking numbers down into their bit patterns and using individual bits may also cause trouble.
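As a worked instance of this rule of thumb, the sketch below computes the largest recommended number of draws for a given period. The helper name `max_draws` is illustrative; the period $2^{57}$ is the approximate period of the NAG basic generator, and $2^{19937} - 1$ is the well-known period of the Mersenne Twister.

```python
import math


def max_draws(period):
    """Largest integer whose square does not exceed the period."""
    return math.isqrt(period)


NAG_BASIC_PERIOD = 2 ** 57            # approximate period
MERSENNE_TWISTER_PERIOD = 2 ** 19937 - 1

nag_basic_budget = max_draws(NAG_BASIC_PERIOD)        # roughly 3.8e8 draws
mt_budget = max_draws(MERSENNE_TWISTER_PERIOD)        # far beyond any practical need
```

For the NAG basic generator the budget of a few hundred million values can be exhausted by a large simulation, which is one reason to prefer a generator with a longer period for such work.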

#### 3.1.4  Choice of Method for Generating Multiple Streams

If the Wichmann–Hill II base generator is being used, and a period of $2^{90}$ is sufficient, then the method described in Section 2.9.1 can be used. If a different generator is used, or a longer period length is required, then generating multiple streams by altering the initial values should be avoided.
Using a different generator for each stream works well if fewer than 277 streams are required.
Of the remaining two methods, both skip-ahead and leap-frogging use the sequence from a single generator, both guarantee that the different sequences will not overlap and both can be scaled to an arbitrary number of streams. Leap-frogging requires no a priori knowledge about the number of variates being generated, whereas skip-ahead requires you to know (approximately) the maximum number of variates required from each stream. Skip-ahead requires no a priori information on the number of streams required. In contrast, leap-frogging requires you to know the maximum number of streams required, prior to generating the first value. Of these two, if possible, skip-ahead should be used in preference to leap-frogging. Both methods require additional computation compared with generating a single sequence, but for skip-ahead this computation occurs only at initialization. For leap-frogging additional computation is required both at initialization and during the generation of the variates. In addition, as mentioned in Section 2.9.4, using leap-frogging can, in some instances, change the statistical properties of the sequences being generated.
Leap-frogging is performed by calling G05KHF after the initialization routine (G05KFF or G05KGF). For skip-ahead, either G05KJF or G05KKF can be called. Of these, G05KKF restricts the amount being skipped to a power of 2, but allows for a large ‘skip’ to be performed.
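The difference in what must be known in advance can be seen by looking at which positions of the single underlying sequence each stream consumes. The sketch below works with sequence indices only; the function names are illustrative and do not correspond to the arguments of G05KHF, G05KJF or G05KKF.

```python
def skip_ahead_indices(stream, block_length, count):
    """Skip-ahead: stream j owns a contiguous block, so the block length
    (maximum variates per stream) must be chosen in advance."""
    if count > block_length:
        raise ValueError("stream would run into the next stream's block")
    start = stream * block_length
    return list(range(start, start + count))


def leap_frog_indices(stream, n_streams, count):
    """Leap-frog: stream j takes every n_streams-th value, so the number
    of streams must be chosen in advance."""
    return list(range(stream, stream + count * n_streams, n_streams))
```

For example, with a block length of 100 the second skip-ahead stream uses positions 100, 101, 102, ..., while with four leap-frog streams the second stream uses positions 1, 5, 9, ....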

#### 3.1.5  Copulas

After calling one of the copula routines the inverse cumulative distribution function (CDF) can be applied to convert the uniform marginal distribution into the required form. Scalar and vector routines for evaluating the CDF, for a range of distributions, are supplied in Chapter G01. It should be noted that these routines are often described as computing the ‘deviates’ of the distribution.
When using the inverse CDF routines from Chapter G01 it should be noted that some are limited in the number of significant figures they return. This may affect the statistical properties of the resulting sequence of variates. Section 7 of the individual routine documentation will give a discussion of the accuracy of the particular algorithm being used and any available alternatives.
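The transformation step can be sketched as follows. The example assumes a pair of uniforms with Gaussian-copula dependence, built here by hand from correlated Normal variates rather than by any NAG routine, and maps them to an exponential margin and a Normal margin through the respective inverse CDFs. All function names and parameter values are illustrative.

```python
import math
import random
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def gaussian_copula_pair(rho, rng):
    """Pair of dependent uniforms on (0, 1) with Gaussian-copula dependence."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return STANDARD_NORMAL.cdf(z1), STANDARD_NORMAL.cdf(z2)


def exponential_inv_cdf(u, rate):
    """Inverse CDF of the exponential distribution with the given rate."""
    return -math.log1p(-u) / rate


def to_margins(u1, u2):
    """Apply a different inverse CDF to each uniform margin."""
    x1 = exponential_inv_cdf(u1, rate=2.0)
    x2 = NormalDist(mu=10.0, sigma=3.0).inv_cdf(u2)
    return x1, x2
```

The dependence structure is fixed by the copula; only the margins change when the inverse CDFs are applied.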

### 3.2  Quasi-random Numbers

Prior to generating any quasi-random variates the generator being used must be initialized via G05YLF or G05YNF. Of these, G05YLF can be used to initialize a standard Sobol, Faure or Niederreiter sequence and G05YNF can be used to initialize a scrambled Sobol or Niederreiter sequence.
Due to the random nature of the scrambling, prior to calling the initialization routine G05YNF one of the pseudorandom initialization routines, G05KFF or G05KGF, must be called.
Once a quasi-random generator has been initialized, using either G05YLF or G05YNF, one of three generation routines can be called to generate uniformly distributed sequences (G05YMF), Normally distributed sequences (G05YJF) or sequences with a log-normal distribution (G05YKF). For example, for a repeatable sequence of scrambled quasi-random variates from the Normal distribution, G05KFF must be called first (to initialize a pseudorandom generator), followed by G05YNF (to initialize a scrambled quasi-random generator) and then G05YJF can be called to generate the sequence from the required distribution.
See the last paragraph of Section 3.1.5 on how sequences from other distributions can be obtained using the inverse CDF.
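A minimal sketch of that approach for quasi-random numbers is given below. A base-2 van der Corput sequence is assumed as a simple one-dimensional low-discrepancy stand-in for the uniform sequences the quasi-random generators would supply, and the exponential inverse CDF is applied to it; the function names are illustrative.

```python
import math


def van_der_corput(i, base=2):
    """i-th term of the van der Corput sequence: the digits of i in the
    given base, mirrored about the radix point."""
    value, denominator = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denominator *= base
        value += digit / denominator
    return value


def quasi_exponential(count, rate):
    """Exponential variates obtained by inverting the CDF at quasi-random points."""
    # Start at i = 1 so that the point 0 is skipped.
    return [-math.log1p(-van_der_corput(i)) / rate for i in range(1, count + 1)]
```

Because the inverse CDF is monotone, the even coverage of the unit interval by the low-discrepancy points carries over to the transformed values.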

### 3.3  Brownian Bridge

G05XBF may be used to generate sample paths from a (free or non-free) Wiener process using the Brownian bridge algorithm. Prior to calling G05XBF, the generator must be initialized by a call to G05XAF. G05XAF requires you to specify a bridge construction order. The routine G05XEF can be used to convert a set of input times into one of several common bridge construction orders, which can then be used in the initialization call to G05XAF.
G05XDF may be used to generate the scaled increments of the sample paths of a (free or non-free) Wiener process. Prior to calling G05XDF, the generator must be initialized by a call to G05XCF. Note that G05XDF generates these scaled increments directly; it is not necessary to call G05XBF before calling G05XDF. As before, G05XEF can be used to convert a set of input times into a bridge construction order which can be passed to G05XCF.
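The idea behind the Brownian bridge construction can be sketched as follows for a free Wiener process starting at zero: the terminal value is drawn first, and interior points are then filled in by conditioning on their already-known neighbours. The breadth-first bisection order used here is just one possible construction order, chosen for illustration; the function name and interface are assumptions and do not mirror G05XAF or G05XBF.

```python
import math
from collections import deque


def brownian_bridge_path(times, normals):
    """Build W(t) at increasing positive `times` from standard Normal
    variates `normals` (one per time point), with W(0) = 0."""
    n = len(times)
    path = [0.0] * n
    path[n - 1] = math.sqrt(times[-1]) * normals[0]   # terminal value first
    used = 1
    intervals = deque([(-1, n - 1)])                  # index -1 denotes time 0
    while intervals:
        left, right = intervals.popleft()
        if right - left < 2:
            continue                                  # no interior point left
        mid = (left + right) // 2
        t_left = 0.0 if left < 0 else times[left]
        w_left = 0.0 if left < 0 else path[left]
        t_right, w_right = times[right], path[right]
        t_mid = times[mid]
        # Conditional mean and variance of W(t_mid) given both end points.
        mean = w_left + (t_mid - t_left) / (t_right - t_left) * (w_right - w_left)
        variance = (t_mid - t_left) * (t_right - t_mid) / (t_right - t_left)
        path[mid] = mean + math.sqrt(variance) * normals[used]
        used += 1
        intervals.append((left, mid))
        intervals.append((mid, right))
    return path
```

Because the first few variates determine the coarse shape of the path, this ordering is often paired with quasi-random input, where the leading dimensions are the most evenly distributed.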

### 3.4  Random Fields

Routines for simulating from either a one-dimensional or a two-dimensional stationary Gaussian random field are provided. These routines use the circulant embedding method of Dietrich and Newsam (1997) to efficiently generate from the required field. In both cases a setup routine is called, which defines the domain and variogram to use, followed by the generation routine. A number of preset variograms are supplied or a user-defined subroutine can be used.
• One-dimensional random field:
  • G05ZNF setup routine, using a preset variogram.
  • G05ZMF setup routine, using a user-defined variogram.
  • G05ZPF generation routine.
• Two-dimensional random field:
  • G05ZRF setup routine, using a preset variogram.
  • G05ZQF setup routine, using a user-defined variogram.
  • G05ZSF generation routine.
In addition to generating a random field, it is possible to use the circulant embedding method to generate realisations of fractional Brownian motion; this functionality is provided in G05ZTF.
Prior to calling G05ZPF, G05ZSF or G05ZTF one of the initialization routines, G05KFF or G05KGF, must be called.
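The setup and generation steps of circulant embedding can be sketched in one dimension as follows. The example assumes a covariance function on an equally spaced grid (an exponential covariance with an arbitrary length scale), uses a naive discrete Fourier transform to stay self-contained, and follows the standard construction in outline only; it is not the algorithm implemented in the G05Z routines, and all names are illustrative.

```python
import cmath
import math
import random


def dft(values):
    """Naive O(m^2) discrete Fourier transform, adequate for a small sketch."""
    m = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * j * k / m) for j in range(m))
            for k in range(m)]


def embedding_eigenvalues(cov, n):
    """Setup step: eigenvalues of the circulant matrix whose first row is
    c(0), ..., c(n-1), c(n-2), ..., c(1)."""
    row = [cov(k) for k in range(n)] + [cov(k) for k in range(n - 2, 0, -1)]
    return [value.real for value in dft(row)]


def simulate_field(cov, n, rng):
    """Generation step: one realisation of the field at n grid points."""
    eigenvalues = embedding_eigenvalues(cov, n)
    if min(eigenvalues) < -1e-10:
        raise ValueError("embedding is not non-negative definite; enlarge it")
    m = len(eigenvalues)
    weighted = [math.sqrt(max(lam, 0.0) / m)
                * complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
                for lam in eigenvalues]
    # The real part of the transform has the required covariance on the first n points.
    return [value.real for value in dft(weighted)[:n]]


def exponential_cov(lag, length_scale=5.0):
    """Assumed example covariance: exp(-|lag| / length_scale)."""
    return math.exp(-abs(lag) / length_scale)
```

A call such as `simulate_field(exponential_cov, 16, random.Random(1))` returns one realisation at 16 grid points. In practice a fast Fourier transform replaces the naive transform, which is what makes the method efficient.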

## 4  Functionality Index

Brownian bridge,
    circulant embedding generator,
        generate fractional Brownian motion G05ZTF
    increments generator,
        generate Wiener increments G05XDF
        initialize generator G05XCF
    path generator,
        create bridge construction order G05XEF
        generate a free or non-free (pinned) Wiener process for a given set of time steps G05XBF
        initialize generator G05XAF
Generating samples, matrices and tables,
    random correlation matrix G05PYF
    random orthogonal matrix G05PXF
    random permutation of an integer vector G05NCF
    random sample from an integer vector,
        unequal weights, without replacement G05NEF
        unweighted, without replacement G05NDF
    random table G05PZF
Generation of time series,
    asymmetric GARCH Type II G05PEF
    asymmetric GJR GARCH G05PFF
    EGARCH G05PGF
    exponential smoothing G05PMF
    type I AGARCH G05PDF
    univariate ARMA G05PHF
    vector ARMA G05PJF
Pseudorandom numbers,
    array of variates from multivariate distributions,
        Dirichlet distribution G05SEF
        multinomial distribution G05TGF
        Normal distribution G05RZF
        Student's t distribution G05RYF
    copulas,
        Clayton/Cook–Johnson copula (bivariate) G05REF
        Clayton/Cook–Johnson copula (multivariate) G05RHF
        Frank copula (bivariate) G05RFF
        Frank copula (multivariate) G05RJF
        Gaussian copula G05RDF
        Gumbel–Hougaard copula G05RKF
        Plackett copula G05RGF
        Student's t copula G05RCF
    initialize generator,
        multiple streams,
            leap-frog G05KHF
        nonrepeatable sequence G05KGF
        repeatable sequence G05KFF
    vector of variates from discrete univariate distributions,
        binomial distribution G05TAF
        geometric distribution G05TCF
        hypergeometric distribution G05TEF
        logarithmic distribution G05TFF
        logical value .TRUE. or .FALSE. G05TBF
        negative binomial distribution G05THF
        Poisson distribution G05TJF
        uniform distribution G05TLF
        user-supplied distribution G05TDF
    variate array from discrete distributions with array of parameters,
        Poisson distribution with varying mean G05TKF
    vectors of variates from continuous univariate distributions,
        beta distribution G05SBF
        Cauchy distribution G05SCF
        exponential mix distribution G05SGF
        F-distribution G05SHF
        gamma distribution G05SJF
        logistic distribution G05SLF
        log-normal distribution G05SMF
        negative exponential distribution G05SFF
        Normal distribution G05SKF
        real number from the continuous uniform distribution G05SAF
        Student's t-distribution G05SNF
        triangular distribution G05SPF
        uniform distribution G05SQF
        von Mises distribution G05SRF
        Weibull distribution G05SSF
        $\chi^2$ distribution G05SDF
Quasi-random numbers,
    array of variates from univariate distributions,
        log-normal distribution G05YKF
        Normal distribution G05YJF
        uniform distribution G05YMF
    initialize generator,
        scrambled Sobol or Niederreiter G05YNF
        Sobol, Niederreiter or Faure G05YLF
Random fields,
    one-dimensional,
        generation G05ZPF
        initialize generator,
            preset variogram G05ZNF
            user-defined variogram G05ZMF
    two-dimensional,
        generation G05ZSF
        initialize generator,
            preset variogram G05ZRF
            user-defined variogram G05ZQF

## 5  Auxiliary Routines Associated with Library Routine Arguments

None.

## 6  Routines Withdrawn or Scheduled for Withdrawal

The following lists all those routines that have been withdrawn since Mark 17 of the Library or are scheduled for withdrawal at one of the next two marks.

| Withdrawn Routine | Mark of Withdrawal | Replacement Routine(s) |
| --- | --- | --- |
| G05CAF | 22 | G05SAF |
| G05CBF | 22 | G05KFF |
| G05CCF | 22 | G05KGF |
| G05CFF | 22 | F06DFF |
| G05CGF | 22 | F06DFF |
| G05DAF | 22 | G05SQF |
| G05DBF | 22 | G05SFF |
| G05DCF | 22 | G05SLF |
| G05DDF | 22 | G05SKF |
| G05DEF | 22 | G05SMF |
| G05DFF | 22 | G05SCF |
| G05DHF | 22 | G05SDF |
| G05DJF | 22 | G05SNF |
| G05DKF | 22 | G05SHF |
| G05DPF | 22 | G05SSF |
| G05DRF | 22 | G05TKF |
| G05DYF | 22 | G05TLF |
| G05DZF | 22 | G05TBF |
| G05EAF | 22 | G05RZF |
| G05EBF | 22 | G05TLF |
| G05ECF | 22 | G05TJF |
| G05EDF | 22 | G05TAF |
| G05EEF | 22 | G05THF |
| G05EFF | 22 | G05TEF |
| G05EGF | 22 | G05PHF |
| G05EHF | 22 | G05NCF |
| G05EJF | 22 | G05NDF |
| G05EWF | 22 | G05PHF |
| G05EXF | 22 | G05TDF |
| G05EYF | 22 | G05TDF |
| G05EZF | 22 | G05RZF |
| G05FAF | 22 | G05SQF |
| G05FBF | 22 | G05SFF |
| G05FDF | 22 | G05SKF |
| G05FEF | 22 | G05SBF |
| G05FFF | 22 | G05SJF |
| G05FSF | 22 | G05SRF |
| G05GAF | 22 | G05PXF |
| G05GBF | 22 | G05PYF |
| G05HDF | 22 | G05PJF |
| G05HKF | 24 | G05PDF |
| G05HLF | 24 | G05PEF |
| G05HMF | 24 | G05PFF |
| G05HNF | 24 | G05PGF |
| G05KAF | 24 | G05SAF |
| G05KBF | 24 | G05KFF |
| G05KCF | 24 | G05KGF |
| G05KEF | 24 | G05TBF |
| G05LAF | 24 | G05SKF |
| G05LBF | 24 | G05SNF |
| G05LCF | 24 | G05SDF |
| G05LDF | 24 | G05SHF |
| G05LEF | 24 | G05SBF |
| G05LFF | 24 | G05SJF |
| G05LGF | 24 | G05SQF |
| G05LHF | 24 | G05SPF |
| G05LJF | 24 | G05SFF |
| G05LKF | 24 | G05SMF |
| G05LLF | 24 | G05SJF |
| G05LMF | 24 | G05SSF |
| G05LNF | 24 | G05SLF |
| G05LPF | 24 | G05SRF |
| G05LQF | 24 | G05SGF |
| G05LXF | 24 | G05RYF |
| G05LYF | 24 | G05RZF |
| G05LZF | 24 | G05RZF |
| G05MAF | 24 | G05TLF |
| G05MBF | 24 | G05TCF |
| G05MCF | 24 | G05THF |
| G05MDF | 24 | G05TFF |
| G05MEF | 24 | G05TKF |
| G05MJF | 24 | G05TAF |
| G05MKF | 24 | G05TJF |
| G05MLF | 24 | G05TEF |
| G05MRF | 24 | G05TGF |
| G05MZF | 24 | G05TDF |
| G05NAF | 24 | G05NCF |
| G05NBF | 24 | G05NDF |
| G05PAF | 24 | G05PHF |
| G05PCF | 24 | G05PJF |
| G05QAF | 24 | G05PXF |
| G05QBF | 24 | G05PYF |
| G05QDF | 24 | G05PZF |
| G05RAF | 24 | G05RDF |
| G05RBF | 24 | G05RCF |
| G05YAF | 23 | G05YLF and G05YMF |
| G05YBF | 23 | G05YLF and either G05YJF or G05YKF |
| G05YCF | 24 | G05YLF |
| G05YDF | 24 | G05YMF |
| G05YEF | 24 | G05YLF |
| G05YFF | 24 | G05YMF |
| G05YGF | 24 | G05YLF |
| G05YHF | 24 | G05YMF |
| G05ZAF | 22 | No replacement routine required |