g01ez returns the probability associated with the upper tail of the Kolmogorov–Smirnov two sample distribution.

# Syntax

C# |
---|

public static double g01ez( int n1, int n2, double d, out int ifail ) |

Visual Basic |
---|

Public Shared Function g01ez ( n1 As Integer, n2 As Integer, d As Double, <OutAttribute> ByRef ifail As Integer ) As Double |

Visual C++ |
---|

public: static double g01ez( int n1, int n2, double d, [OutAttribute] int% ifail ) |

F# |
---|

static member g01ez : n1 : int * n2 : int * d : float * ifail : int byref -> float |

#### Parameters

- n1
- Type: System.Int32
*On entry*: the number of observations in the first sample, ${n}_{1}$. *Constraint*: ${\mathbf{n1}}\ge 1$.

- n2
- Type: System.Int32
*On entry*: the number of observations in the second sample, ${n}_{2}$. *Constraint*: ${\mathbf{n2}}\ge 1$.

- d
- Type: System.Double
*On entry*: the test statistic ${D}_{{n}_{1},{n}_{2}}$ for the two sample Kolmogorov–Smirnov goodness-of-fit test, that is, the maximum difference between the empirical cumulative distribution functions (CDFs) of the two samples. *Constraint*: $0.0\le {\mathbf{d}}\le 1.0$.

- ifail
- Type: System.Int32%
*On exit*: ${\mathbf{ifail}}={0}$ unless the method detects an error or a warning has been flagged (see Error Indicators and Warnings).

#### Return Value

g01ez returns the probability associated with the upper tail of the Kolmogorov–Smirnov two sample distribution.
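As a minimal sketch of how the `out int ifail` parameter and the return value fit together, the helper below takes the method as a delegate so that it compiles without the library. The delegate type `TailProbability` and the helper `TryUpperTail` are hypothetical names introduced here; the delegate only mirrors the C# signature shown under Syntax.

```csharp
using System;

public static class G01ezCall
{
    // Hypothetical delegate with the same shape as the C# signature of g01ez.
    public delegate double TailProbability(int n1, int n2, double d, out int ifail);

    // Calls the supplied method and reports whether it finished with ifail = 0.
    // The probability is passed back through p in either case.
    public static bool TryUpperTail(TailProbability g01ez,
                                    int n1, int n2, double d, out double p)
    {
        int ifail;
        p = g01ez(n1, n2, d, out ifail);
        return ifail == 0;
    }
}
```

In a project that references the library, the library method itself would be passed as the first argument in place of a stub.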

# Description

Let ${F}_{{n}_{1}}\left(x\right)$ and ${G}_{{n}_{2}}\left(x\right)$ denote the empirical cumulative distribution functions for the two samples, where ${n}_{1}$ and ${n}_{2}$ are the sizes of the first and second samples respectively.

The function g01ez computes the upper tail probability for the Kolmogorov–Smirnov two sample two-sided test statistic ${D}_{{n}_{1},{n}_{2}}$, where

$${D}_{{n}_{1},{n}_{2}}={\mathrm{sup}}_{x}\left|{F}_{{n}_{1}}\left(x\right)-{G}_{{n}_{2}}\left(x\right)\right|\text{.}$$

The probability is computed exactly if ${n}_{1}{n}_{2}\le 10000$ and $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)\le 2500$, using a method given by Kim and Jenrich (1973). Otherwise, if $\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)$ is at most $10\%$ of $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)$ and $\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)\le 80$, the Smirnov approximation is used. In all other cases the Kolmogorov approximation is used. Both approximations are discussed in Kim and Jenrich (1973).
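g01ez takes the statistic ${D}_{{n}_{1},{n}_{2}}$ as input rather than the raw samples. The following is a minimal sketch of one way to obtain it from two arrays of observations, by walking both sorted samples and tracking the gap between the empirical CDFs. The class and method names are illustrative and not part of the library.

```csharp
using System;
using System.Linq;

public static class KsStatistic
{
    // Two-sided two sample statistic: the largest absolute gap
    // between the empirical CDFs of x and y.
    public static double TwoSampleD(double[] x, double[] y)
    {
        if (x.Length < 1 || y.Length < 1)
            throw new ArgumentException("Both samples need at least one observation.");

        double[] xs = x.OrderBy(v => v).ToArray();
        double[] ys = y.OrderBy(v => v).ToArray();

        int i = 0, j = 0;
        double d = 0.0;
        while (i < xs.Length && j < ys.Length)
        {
            // Consume every observation up to the current smallest value in
            // both samples, so tied values are handled before comparing.
            double t = Math.Min(xs[i], ys[j]);
            while (i < xs.Length && xs[i] <= t) i++;
            while (j < ys.Length && ys[j] <= t) j++;

            double f = (double)i / xs.Length; // empirical CDF of x at t
            double g = (double)j / ys.Length; // empirical CDF of y at t
            d = Math.Max(d, Math.Abs(f - g));
        }
        // Once one sample is exhausted its CDF is 1 and the gap can only shrink.
        return d;
    }
}
```

The result, together with the two array lengths, gives the arguments `d`, `n1` and `n2`.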

# References

Conover W J (1980) *Practical Nonparametric Statistics* Wiley

Feller W (1948) On the Kolmogorov–Smirnov limit theorems for empirical distributions *Ann. Math. Statist.* **19** 179–181

Kendall M G and Stuart A (1973) *The Advanced Theory of Statistics (Volume 2)* (3rd Edition) Griffin

Kim P J and Jenrich R I (1973) Tables of exact sampling distribution of the two sample Kolmogorov–Smirnov criterion ${D}_{mn}\left(m<n\right)$ *Selected Tables in Mathematical Statistics* **1** 80–129 American Mathematical Society

Siegel S (1956) *Non-parametric Statistics for the Behavioral Sciences* McGraw–Hill

Smirnov N (1948) Table for estimating the goodness of fit of empirical distributions *Ann. Math. Statist.* **19** 279–281

# Error Indicators and Warnings

Errors or warnings detected by the method:

- ${\mathbf{ifail}}=1$
On entry, ${\mathbf{n1}}<1$, or ${\mathbf{n2}}<1$.

- ${\mathbf{ifail}}=2$
On entry, ${\mathbf{d}}<0.0$, or ${\mathbf{d}}>1.0$.

- ${\mathbf{ifail}}=3$
The approximate solution did not converge in $500$ iterations. A tail probability of $1.0$ is returned by g01ez.
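A caller will usually want to turn the value of `ifail` into a readable diagnostic. The sketch below maps the values listed above to short messages; the class and method names are hypothetical, and the wording of the messages is a paraphrase of this section rather than text produced by the library.

```csharp
using System;

public static class G01ezDiagnostics
{
    // Maps the documented ifail values to short descriptions.
    public static string Describe(int ifail)
    {
        switch (ifail)
        {
            case 0: return "No error or warning.";
            case 1: return "On entry, n1 < 1 or n2 < 1.";
            case 2: return "On entry, d < 0.0 or d > 1.0.";
            case 3: return "Approximation did not converge in 500 iterations; 1.0 was returned.";
            default: return "Unrecognised ifail value " + ifail + ".";
        }
    }

    // Only ifail = 0 indicates a probability computed without error or warning.
    public static bool IsClean(int ifail)
    {
        return ifail == 0;
    }
}
```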

# Accuracy

The large sample distributions used as approximations to the exact distribution should have a relative error of less than 5% for most cases.
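For orientation, the classical large-sample Kolmogorov limit can be sketched as follows. This is the textbook asymptotic series $2\sum_{k\ge 1}{(-1)}^{k-1}{e}^{-2{k}^{2}{\lambda }^{2}}$ with $\lambda =d\sqrt{{n}_{1}{n}_{2}/({n}_{1}+{n}_{2})}$, and it is an assumption that it resembles what g01ez evaluates; the variants actually used are those discussed in Kim and Jenrich (1973). The names, the term cap and the tolerance are illustrative choices.

```csharp
using System;

public static class KolmogorovApprox
{
    // Textbook asymptotic upper tail probability for the two-sided statistic.
    // Returns 1.0 if the series has not settled within maxTerms terms.
    public static double UpperTail(int n1, int n2, double d, int maxTerms = 500)
    {
        if (n1 < 1) throw new ArgumentOutOfRangeException(nameof(n1));
        if (n2 < 1) throw new ArgumentOutOfRangeException(nameof(n2));
        if (d < 0.0 || d > 1.0) throw new ArgumentOutOfRangeException(nameof(d));

        double lambda = d * Math.Sqrt((double)n1 * n2 / ((double)n1 + n2));
        if (lambda == 0.0) return 1.0; // D = 0 is always exceeded or equalled

        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k <= maxTerms; k++)
        {
            double term = Math.Exp(-2.0 * k * k * lambda * lambda);
            sum += sign * term;
            if (term < 1e-12)
            {
                // Clamp to [0, 1] to absorb rounding in the alternating sum.
                return Math.Min(1.0, Math.Max(0.0, 2.0 * sum));
            }
            sign = -sign;
        }
        return 1.0; // not converged within maxTerms terms
    }
}
```

Being a limiting distribution, this series is only meaningful for large samples, which is the regime in which the approximations are used.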

# Parallelism and Performance

None.

# Further Comments

The upper tail probability for the one-sided statistics, ${D}_{{n}_{1},{n}_{2}}^{+}$ or ${D}_{{n}_{1},{n}_{2}}^{-}$, can be approximated by halving the two-sided upper tail probability $p$ returned by g01ez, that is, by $p/2$. This approximation is good for small probabilities (e.g., $p\le 0.10$) but becomes poor for larger probabilities.
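The halving rule can be written as a small helper. In this sketch the names are hypothetical, and the $0.10$ cut-off is simply the example value quoted above, not a sharp boundary.

```csharp
using System;

public static class OneSidedKs
{
    // Approximates the one-sided upper tail probability as p / 2,
    // where p is the two-sided probability returned by g01ez.
    public static double ApproxUpperTail(double twoSidedP)
    {
        if (twoSidedP < 0.0 || twoSidedP > 1.0)
            throw new ArgumentOutOfRangeException(nameof(twoSidedP));
        return twoSidedP / 2.0;
    }

    // True when p lies in the range where the text describes the
    // approximation as good (using the example threshold p <= 0.10).
    public static bool IsInGoodRange(double twoSidedP)
    {
        return twoSidedP <= 0.10;
    }
}
```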

The time taken by the method increases with ${n}_{1}$ and ${n}_{2}$, until ${n}_{1}{n}_{2}>10000$ or $\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left({n}_{1},{n}_{2}\right)\ge 2500$. At this point one of the approximations is used and the time decreases significantly. The time then increases again modestly with ${n}_{1}$ and ${n}_{2}$.
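To anticipate which regime a given pair of sample sizes falls into, the selection rules from the Description can be sketched as below. This assumes the exact method applies when the product ${n}_{1}{n}_{2}\le 10000$ and $\mathrm{max}({n}_{1},{n}_{2})\le 2500$, which is the reading consistent with the paragraph above; the enum and method names are hypothetical, and the library's internal logic may differ in detail.

```csharp
using System;

public enum KsMethod { Exact, Smirnov, Kolmogorov }

public static class KsMethodChooser
{
    // Mirrors the documented thresholds for choosing between the exact
    // computation and the two large-sample approximations.
    public static KsMethod Choose(int n1, int n2)
    {
        if (n1 < 1 || n2 < 1)
            throw new ArgumentOutOfRangeException("Sample sizes must be at least 1.");

        long product = (long)n1 * n2; // long avoids overflow for large sizes
        int max = Math.Max(n1, n2);
        int min = Math.Min(n1, n2);

        if (product <= 10000 && max <= 2500) return KsMethod.Exact;

        // "min is at most 10% of max", written with integers to avoid rounding.
        if (10L * min <= max && min <= 80) return KsMethod.Smirnov;

        return KsMethod.Kolmogorov;
    }
}
```

For instance, under these assumptions sizes $(50,50)$ fall in the exact regime, $(10,2000)$ in the Smirnov regime and $(200,200)$ in the Kolmogorov regime.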

# Example

The following example reads in $10$ different sample sizes and values for the test statistic ${D}_{{n}_{1},{n}_{2}}$. The upper tail probability is computed and printed for each case.

Example program (C#): g01eze.cs