g01dd calculates Shapiro and Wilk's $W$ statistic and its significance level for testing Normality.

# Syntax

C# |
---|

public static void g01dd( double[] x, int n, bool calwts, double[] a, out double w, out double pw, out int ifail ) |

Visual Basic |
---|

Public Shared Sub g01dd ( _ x As Double(), _ n As Integer, _ calwts As Boolean, _ a As Double(), _ <OutAttribute> ByRef w As Double, _ <OutAttribute> ByRef pw As Double, _ <OutAttribute> ByRef ifail As Integer _ ) |

Visual C++ |
---|

public: static void g01dd( array<double>^ x, int n, bool calwts, array<double>^ a, [OutAttribute] double% w, [OutAttribute] double% pw, [OutAttribute] int% ifail ) |

F# |
---|

static member g01dd : x : float[] * n : int * calwts : bool * a : float[] * w : float byref * pw : float byref * ifail : int byref -> unit |

#### Parameters

- x
- Type: array<System.Double>, an array of size n
*On entry*: the ordered sample values, ${x}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$.

- n
- Type: System.Int32
*On entry*: $n$, the sample size. *Constraint*: $3\le {\mathbf{n}}\le 5000$.

- calwts
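- Type: System.Boolean
*On entry*: set to true if g01dd is to calculate the weights in a; set to false if a already contains the weights calculated in a previous call to g01dd for the same value of n.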

- a
- Type: array<System.Double>, an array of size n
*On entry*: if calwts has been set to false then before entry a must contain the $n$ weights as calculated in a previous call to g01dd, otherwise a need not be set. *On exit*: the $n$ weights required to calculate ${\mathbf{w}}$.

- w
- Type: System.Double%
*On exit*: the value of the statistic, ${\mathbf{w}}$.

- pw
- Type: System.Double%
*On exit*: the significance level of ${\mathbf{w}}$.

- ifail
- Type: System.Int32%
*On exit*: ${\mathbf{ifail}}={0}$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

# Description

g01dd calculates Shapiro and Wilk's $W$ statistic and its significance level for any sample size between $3$ and $5000$. It is an adaptation of the Applied Statistics Algorithm AS R94, see Royston (1995). The full description of the theory behind this algorithm is given in Royston (1992).

Given a set of observations ${x}_{1},{x}_{2},\dots ,{x}_{n}$ sorted into either ascending or descending order (M01CAF, which is not in this release, may be used to sort the data), this method calculates the value of Shapiro and Wilk's $W$ statistic defined as:

$$W=\frac{{\left(\sum _{i=1}^{n}{a}_{i}{x}_{i}\right)}^{2}}{\sum _{i=1}^{n}{\left({x}_{i}-\bar{x}\right)}^{2}}\text{,}$$

where $\bar{x}=\frac{1}{n}\sum _{i=1}^{n}{x}_{i}$ is the sample mean and ${a}_{i}$, for $i=1,2,\dots ,n$, are a set of ‘weights’ whose values depend only on the sample size $n$.

On exit, the values of ${a}_{i}$, for $\mathit{i}=1,2,\dots ,n$, are only of interest should you wish to call the method again to calculate ${\mathbf{w}}$ and its significance level for a different sample of the same size.
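As a minimal sketch of the definition of $W$ (written in Python purely as neutral notation; it is not the g01dd implementation and it does not compute the significance level), the function below evaluates the ratio from an ordered sample and a supplied weight vector. The name `shapiro_wilk_w` is hypothetical. The weights used in the demonstration are the exact values for $n=3$, $(-1/\sqrt{2},\,0,\,1/\sqrt{2})$; for any other $n$ the weights would have to come from g01dd itself.

```python
import math


def shapiro_wilk_w(x, a):
    """Evaluate W = (sum a_i x_i)^2 / sum (x_i - mean)^2 for an ordered sample x.

    Hypothetical helper for illustration only; a holds the n weights.
    """
    if len(x) != len(a):
        raise ValueError("x and a must have the same length")
    n = len(x)
    mean = sum(x) / n
    # Sum of squares about the mean (the denominator of W).
    ss = sum((xi - mean) ** 2 for xi in x)
    if ss == 0.0:
        raise ValueError("all sample values are equal; W is undefined")
    # Squared weighted sum of the ordered values (the numerator of W).
    num = sum(ai * xi for ai, xi in zip(a, x)) ** 2
    return num / ss


# Exact weights for n = 3 (assumed ascending order; the squared numerator
# gives the same W for descending order).
A3 = [-math.sqrt(0.5), 0.0, math.sqrt(0.5)]

w_linear = shapiro_wilk_w([1.0, 2.0, 3.0], A3)  # evenly spaced sample
w_tied = shapiro_wilk_w([0.0, 0.0, 3.0], A3)    # sample with a tie
```

In this sketch an evenly spaced sample of size three gives $W=1$, while the tied sample gives the smallest value possible for $n=3$.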

# References

Royston J P (1982) Algorithm AS 181: the $W$ test for normality *Appl. Statist.* **31** 176–180

Royston J P (1986) A remark on AS 181: the $W$ test for normality *Appl. Statist.* **35** 232–234

Royston J P (1992) Approximating the Shapiro–Wilk's $W$ test for non-normality *Statistics & Computing* **2** 117–119

Royston J P (1995) A remark on AS R94: A remark on Algorithm AS 181: the $W$ test for normality *Appl. Statist.* **44(4)** 547–551

# Error Indicators and Warnings

Errors or warnings detected by the method:

- ${\mathbf{ifail}}=1$
On entry, ${\mathbf{n}}<3$.

- ${\mathbf{ifail}}=2$
On entry, ${\mathbf{n}}>5000$.

- ${\mathbf{ifail}}=3$
On entry, the elements in x are not in ascending or descending order or are all equal.
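The three conditions above can be checked before calling the method. The following is a rough Python sketch of such a pre-check; the function name `check_sample` is hypothetical, the sample size is taken from the length of the list rather than passed separately, and the order in which the conditions are tested is an assumption rather than documented behaviour of g01dd.

```python
def check_sample(x):
    """Return an ifail-style code (0, 1, 2 or 3) for the sample x.

    Hypothetical pre-check mirroring the documented error conditions.
    """
    n = len(x)
    if n < 3:
        return 1  # sample too small
    if n > 5000:
        return 2  # sample larger than the supported range
    ascending = all(x[i] <= x[i + 1] for i in range(n - 1))
    descending = all(x[i] >= x[i + 1] for i in range(n - 1))
    if not (ascending or descending):
        return 3  # not ordered
    if x[0] == x[-1]:
        return 3  # ordered with equal end points, so all values are equal
    return 0
```

Ties inside an otherwise ordered sample pass this check; only a sample whose values are all equal is rejected.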

# Accuracy

There may be a loss of significant figures for large $n$.

# Parallelism and Performance

None.

# Further Comments

The time taken by g01dd depends roughly linearly on the value of $n$.

For very small samples the power of the test may not be very high.

The contents of the array a should not be modified between calls to g01dd for a given sample size, unless calwts is reset to true before each call of g01dd.
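One way to follow this advice is to keep a single, unmodified weight vector per sample size and to record whether it had to be calculated. The Python sketch below illustrates that bookkeeping only; `WeightCache` and `compute_weights` are hypothetical names, and the stub used in the demonstration returns placeholder values, not real Shapiro and Wilk weights.

```python
class WeightCache:
    """Keep one weight vector per sample size n (illustrative only)."""

    def __init__(self, compute_weights):
        # compute_weights is a hypothetical callable: n -> sequence of n weights.
        self._compute = compute_weights
        self._store = {}

    def get(self, n):
        """Return (weights, calwts).

        calwts is True when the weights had to be calculated for this n,
        and False when a previously calculated vector is being reused.
        """
        if n not in self._store:
            # Stored as a tuple so the cached weights cannot be modified.
            self._store[n] = tuple(self._compute(n))
            return self._store[n], True
        return self._store[n], False


calls = []


def stub_weights(n):
    """Placeholder standing in for the weight calculation."""
    calls.append(n)
    return [0.0] * n  # not real weights


cache = WeightCache(stub_weights)
_, first = cache.get(20)   # first sample of size 20: weights calculated
_, second = cache.get(20)  # second sample of size 20: weights reused
_, other = cache.get(30)   # different size: weights calculated again
```

The returned flag plays the role of calwts: true on the first call for a given sample size and false afterwards.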

Shapiro and Wilk's $W$ test is very sensitive to ties. If the data has been rounded, the test can be improved by using Sheppard's correction to adjust the sum of squares about the mean. This produces an adjusted value of $W$,

$$WA=W\frac{\sum _{i=1}^{n}{\left({x}_{\left(i\right)}-\bar{x}\right)}^{2}}{\left\{\sum _{i=1}^{n}{\left({x}_{\left(i\right)}-\bar{x}\right)}^{2}-\frac{n-1}{12}{\omega }^{2}\right\}}\text{,}$$

where $\omega $ is the rounding width. $WA$ can be compared with a standard Normal distribution, but a further approximation is given by Royston (1986).
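As an illustrative sketch of Sheppard's correction, the Python function below multiplies a given $W$ by the ratio of the sum of squares about the mean to that sum of squares reduced by $(n-1){\omega }^{2}/12$. The name `sheppard_adjusted_w` and the guard against a non-positive corrected sum of squares are assumptions made for this example; neither is part of g01dd.

```python
def sheppard_adjusted_w(w, x, omega):
    """Return WA = W * S / (S - (n - 1) / 12 * omega**2).

    S is the sum of squares of x about its mean and omega is the rounding
    width. Hypothetical helper for illustration only.
    """
    n = len(x)
    mean = sum(x) / n
    ss = sum((xi - mean) ** 2 for xi in x)
    corrected = ss - (n - 1) / 12.0 * omega ** 2
    if corrected <= 0.0:
        # Assumed guard: the correction is meaningless in this case.
        raise ValueError("rounding width is too large for this sample")
    return w * ss / corrected


# Values rounded to the nearest integer, so the rounding width is 1.
wa = sheppard_adjusted_w(0.9, [1.0, 2.0, 3.0], 1.0)
```

Here the sum of squares is $2$ and the correction is $2/12$, so the adjusted value is $0.9\times 12/11$. With a rounding width of zero the function returns $W$ unchanged.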

If ${\mathbf{n}}>5000$, values for w and pw are returned, but their accuracy may not be acceptable. See [References] for more details.

# Example

This example tests the following two samples (each of size $20$) for Normality.

Sample Number | Data |
---|---|
1 | $0.11$, $7.87$, $4.61$, $10.14$, $7.95$, $3.14$, $0.46$, $4.43$, $0.21$, $4.75$, $0.71$, $1.52$, $3.24$, $0.93$, $0.42$, $4.97$, $9.53$, $4.55$, $0.47$, $6.66$ |
2 | $1.36$, $1.14$, $2.92$, $2.55$, $1.46$, $1.06$, $5.27$, $-1.11$, $3.48$, $1.10$, $0.88$, $-0.51$, $1.46$, $0.52$, $6.20$, $1.69$, $0.08$, $3.67$, $2.81$, $3.49$ |

Example program (C#): g01dde.cs
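The two samples are listed unsorted, whereas x must be supplied in ascending or descending order. As a small sketch of that preparation step (in Python, used here only as neutral notation; this is not the contents of g01dde.cs), the data can be sorted and inspected for ties before the method is called:

```python
SAMPLE_1 = [0.11, 7.87, 4.61, 10.14, 7.95, 3.14, 0.46, 4.43, 0.21, 4.75,
            0.71, 1.52, 3.24, 0.93, 0.42, 4.97, 9.53, 4.55, 0.47, 6.66]
SAMPLE_2 = [1.36, 1.14, 2.92, 2.55, 1.46, 1.06, 5.27, -1.11, 3.48, 1.10,
            0.88, -0.51, 1.46, 0.52, 6.20, 1.69, 0.08, 3.67, 2.81, 3.49]

# Ascending order, as required for the array x.
ordered_1 = sorted(SAMPLE_1)
ordered_2 = sorted(SAMPLE_2)

# Repeated values are worth noting because the W test is sensitive to ties.
has_ties_1 = len(set(ordered_1)) < len(ordered_1)
has_ties_2 = len(set(ordered_2)) < len(ordered_2)
```

The second sample contains one repeated value ($1.46$); the first has none.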