g01ez Method
g01ez returns the probability associated with the upper tail of the Kolmogorov–Smirnov two sample distribution.

# Syntax

C#
```
public static double g01ez(
    int n1,
    int n2,
    double d,
    out int ifail
)
```
Visual Basic
```
Public Shared Function g01ez ( _
    n1 As Integer, _
    n2 As Integer, _
    d As Double, _
    <OutAttribute> ByRef ifail As Integer _
) As Double
```
Visual C++
```
public:
static double g01ez(
    int n1,
    int n2,
    double d,
    [OutAttribute] int% ifail
)
```
F#
```
static member g01ez :
    n1 : int *
    n2 : int *
    d : float *
    ifail : int byref -> float
```

#### Parameters

n1
Type: System.Int32
On entry: the number of observations in the first sample, ${n}_{1}$.
Constraint: ${\mathbf{n1}}\ge 1$.
n2
Type: System.Int32
On entry: the number of observations in the second sample, ${n}_{2}$.
Constraint: ${\mathbf{n2}}\ge 1$.
d
Type: System.Double
On entry: the test statistic ${D}_{{n}_{1},{n}_{2}}$, for the two sample Kolmogorov–Smirnov goodness-of-fit test, that is the maximum difference between the empirical cumulative distribution functions (CDFs) of the two samples.
Constraint: $0.0\le {\mathbf{d}}\le 1.0$.
ifail
Type: System.Int32%
On exit: ${\mathbf{ifail}}={0}$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

#### Return Value

g01ez returns the probability associated with the upper tail of the Kolmogorov–Smirnov two sample distribution.

# Description

Let ${F}_{{n}_{1}}\left(x\right)$ and ${G}_{{n}_{2}}\left(x\right)$ denote the empirical cumulative distribution functions for the two samples, where ${n}_{1}$ and ${n}_{2}$ are the sizes of the first and second samples respectively.
The function g01ez computes the upper tail probability for the Kolmogorov–Smirnov two sample two-sided test statistic ${D}_{{n}_{1},{n}_{2}}$, where
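 ${D}_{{n}_{1},{n}_{2}}=\underset{x}{\mathrm{sup}}\phantom{\rule{0.125em}{0ex}}\left|{F}_{{n}_{1}}\left(x\right)-{G}_{{n}_{2}}\left(x\right)\right|.$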
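As an illustration of this definition, the following minimal Python sketch evaluates the statistic from two raw samples. It is not part of the library: the function name `ks_two_sample_statistic` is hypothetical, and the code relies only on the fact that both empirical CDFs are right-continuous step functions, so the supremum is attained at one of the observed values.

```python
from bisect import bisect_right


def ks_two_sample_statistic(x, y):
    """Return sup_x |F_n1(x) - G_n2(x)| for two non-empty samples.

    Hypothetical helper for illustration; it does not call g01ez.
    """
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every data point of the pooled sample.
    for v in xs + ys:
        f = bisect_right(xs, v) / n1  # F_n1(v): fraction of x-values <= v
        g = bisect_right(ys, v) / n2  # G_n2(v): fraction of y-values <= v
        d = max(d, abs(f - g))
    return d
```

The value returned would then be passed as the argument d, together with the two sample sizes.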
The probability is computed exactly if ${n}_{1}{n}_{2}\le 10000$ and $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)\le 2500$, using a method given by Kim and Jenrich (1973). Otherwise, if $\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)$ is at most $10\%$ of $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)$ and $\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)\le 80$, the Smirnov approximation is used. For all other cases the Kolmogorov approximation is used. These two approximations are discussed in Kim and Jenrich (1973).
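For small samples the exact tail probability can be reproduced by counting lattice paths. The Python sketch below uses a standard path-counting recursion; it is offered as an illustration only and is not claimed to be how g01ez implements the exact method. It assumes continuous data (no ties) and takes "upper tail" to mean $P\left({D}_{{n}_{1},{n}_{2}}\ge d\right)$; both points are assumptions, and the name `exact_upper_tail` is hypothetical.

```python
from fractions import Fraction
from math import comb


def exact_upper_tail(n1, n2, d):
    """Exact P(D >= d) for the two-sided two-sample statistic (sketch).

    Each ordering of the pooled sample is a lattice path from (0, 0) to
    (n1, n2); at the point (i, j) the ECDF difference is i/n1 - j/n2.
    """
    # Compare n1*n2*|F - G| = |i*n2 - j*n1| with d*n1*n2 in exact arithmetic.
    bound = Fraction(d) * n1 * n2
    paths = [0] * (n2 + 1)  # paths[j]: admissible paths reaching (i, j)
    for i in range(n1 + 1):
        for j in range(n2 + 1):
            if abs(i * n2 - j * n1) >= bound:
                paths[j] = 0  # the path would give D >= d here
            elif i == 0 and j == 0:
                paths[j] = 1
            else:
                from_prev_j = paths[j - 1] if j > 0 else 0  # step from (i, j-1)
                from_prev_i = paths[j] if i > 0 else 0      # step from (i-1, j)
                paths[j] = from_prev_j + from_prev_i
    # paths[n2] counts orderings with D < d; the rest form the upper tail.
    return 1 - Fraction(paths[n2], comb(n1 + n2, n1))
```

The cost grows with the product of the sample sizes, which is in keeping with the documented switch to an approximation for larger samples.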

# References

Conover W J (1980) Practical Nonparametric Statistics Wiley
Feller W (1948) On the Kolmogorov–Smirnov limit theorems for empirical distributions Ann. Math. Statist. 19 179–181
Kendall M G and Stuart A (1973) The Advanced Theory of Statistics (Volume 2) (3rd Edition) Griffin
Kim P J and Jenrich R I (1973) Tables of exact sampling distribution of the two sample Kolmogorov–Smirnov criterion ${D}_{mn}\left(m<n\right)$ Selected Tables in Mathematical Statistics 1 80–129 American Mathematical Society
Siegel S (1956) Non-parametric Statistics for the Behavioral Sciences McGraw–Hill
Smirnov N (1948) Table for estimating the goodness of fit of empirical distributions Ann. Math. Statist. 19 279–281

# Error Indicators and Warnings

Errors or warnings detected by the method:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{n1}}<1$, or ${\mathbf{n2}}<1$.
${\mathbf{ifail}}=2$
 On entry, ${\mathbf{d}}<0.0$, or ${\mathbf{d}}>1.0$.
${\mathbf{ifail}}=3$
The approximate solution did not converge in $500$ iterations. A tail probability of $1.0$ is returned by g01ez.
${\mathbf{ifail}}=-9000$
An error occurred; see the message report.
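A caller may wish to check the arguments before invoking the method. The following Python sketch mirrors the two input checks listed above; the helper name `check_inputs` is hypothetical, and the order in which the checks are applied is an assumption.

```python
def check_inputs(n1, n2, d):
    """Return the ifail value the documented constraints suggest (sketch).

    0 means both constraints hold; convergence failures (ifail = 3)
    cannot be predicted from the inputs and are not modelled here.
    """
    if n1 < 1 or n2 < 1:
        return 1  # sample sizes must be at least 1
    if d < 0.0 or d > 1.0:
        return 2  # the statistic must lie in [0, 1]
    return 0
```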

# Accuracy

The large sample distributions used as approximations to the exact distribution should have a relative error of less than 5% for most cases.

# Parallelism and Performance

None.

# Further Comments

The upper tail probability for the one-sided statistics, ${D}_{{n}_{1},{n}_{2}}^{+}$ or ${D}_{{n}_{1},{n}_{2}}^{-}$, can be approximated by halving the two-sided upper tail probability returned by g01ez, that is $p/2$. This approximation to the upper tail probability for either ${D}_{{n}_{1},{n}_{2}}^{+}$ or ${D}_{{n}_{1},{n}_{2}}^{-}$ is good for small probabilities (e.g., $p\le 0.10$) but becomes poor for larger probabilities.
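The quality of the $p/2$ rule can be checked on small cases by brute-force path counting. The Python sketch below is illustrative only: the function `tail` is hypothetical, it assumes no ties, it treats the tail as $P\left(D\ge d\right)$, and it takes ${D}^{+}$ to be the supremum of ${F}_{{n}_{1}}\left(x\right)-{G}_{{n}_{2}}\left(x\right)$. None of this is taken from the g01ez implementation.

```python
from fractions import Fraction
from math import comb


def tail(n1, n2, d, two_sided):
    """Exact P(D >= d) (two-sided) or P(D+ >= d) (one-sided), as a sketch."""
    bound = Fraction(d) * n1 * n2
    paths = [0] * (n2 + 1)  # paths[j]: paths to (i, j) staying below the bound
    for i in range(n1 + 1):
        for j in range(n2 + 1):
            diff = i * n2 - j * n1  # equals n1*n2*(F - G) at lattice point (i, j)
            if two_sided:
                diff = abs(diff)
            if diff >= bound:
                paths[j] = 0
            elif i == 0 and j == 0:
                paths[j] = 1
            else:
                paths[j] = (paths[j - 1] if j > 0 else 0) + (paths[j] if i > 0 else 0)
    return 1 - Fraction(paths[n2], comb(n1 + n2, n1))


def halving_error(n1, n2, d):
    """Return (p / 2) minus the exact one-sided tail probability."""
    return tail(n1, n2, d, True) / 2 - tail(n1, n2, d, False)
```

With ${n}_{1}={n}_{2}=5$ and $d=0.8$ the two-sided probability is small and `halving_error` is zero, whereas with ${n}_{1}={n}_{2}=2$ and $d=0.5$ the two-sided probability is $1$ and halving it underestimates the one-sided value of $2/3$.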
The time taken by the method increases with ${n}_{1}$ and ${n}_{2}$, until ${n}_{1}{n}_{2}>10000$ or $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)\ge 2500$. At this point one of the approximations is used and the time decreases significantly. The time then increases again modestly with ${n}_{1}$ and ${n}_{2}$.
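The switch between methods can be summarised as a small decision function. The Python sketch below encodes the thresholds quoted in the Description, reading the first condition as the product ${n}_{1}{n}_{2}\le 10000$ in line with the timing note above. The function name `method_used` and its string labels are hypothetical, and the treatment of the boundary $\mathrm{max}\left({n}_{1},{n}_{2}\right)=2500$ follows the Description, which is an assumption because the timing note states the boundary differently.

```python
def method_used(n1, n2):
    """Name the method the documented thresholds suggest (sketch only)."""
    lo, hi = min(n1, n2), max(n1, n2)
    # Exact computation for small problems; max == 2500 is treated as exact
    # here, following the Description (assumed boundary convention).
    if n1 * n2 <= 10000 and hi <= 2500:
        return "exact"
    # Smirnov approximation when one sample is much smaller than the other:
    # min <= 10% of max (written in integers) and min <= 80.
    if 10 * lo <= hi and lo <= 80:
        return "smirnov"
    return "kolmogorov"
```

For example, two samples of 200 observations each exceed the product threshold and are not unbalanced, so the Kolmogorov approximation would be expected.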

# Example

The following example reads in $10$ different sample sizes and values for the test statistic ${D}_{{n}_{1},{n}_{2}}$. The upper tail probability is computed and printed for each case.

Example program (C#): g01eze.cs

Example program data: g01eze.d

Example program results: g01eze.r