g01dd Method
g01dd calculates Shapiro and Wilk's $W$ statistic and its significance level for testing Normality.

# Syntax

C#
```csharp
public static void g01dd(
    double[] x,
    int n,
    bool calwts,
    double[] a,
    out double w,
    out double pw,
    out int ifail
)
```
Visual Basic
```vb
Public Shared Sub g01dd ( _
    x As Double(), _
    n As Integer, _
    calwts As Boolean, _
    a As Double(), _
    <OutAttribute> ByRef w As Double, _
    <OutAttribute> ByRef pw As Double, _
    <OutAttribute> ByRef ifail As Integer _
)
```
Visual C++
```cpp
public:
static void g01dd(
    array<double>^ x,
    int n,
    bool calwts,
    array<double>^ a,
    [OutAttribute] double% w,
    [OutAttribute] double% pw,
    [OutAttribute] int% ifail
)
```
F#
```fsharp
static member g01dd :
    x : float[] *
    n : int *
    calwts : bool *
    a : float[] *
    w : float byref *
    pw : float byref *
    ifail : int byref -> unit
```

#### Parameters

x
Type: System.Double[]
An array of size [n]
On entry: the ordered sample values, ${x}_{\mathit{i}}$, for $\mathit{i}=1,2,\dots ,n$.
n
Type: System.Int32
On entry: $n$, the sample size.
Constraint: $3\le {\mathbf{n}}\le 5000$.
calwts
Type: System.Boolean
On entry: must be set to true if you wish g01dd to calculate the elements of a.
calwts should be set to false if you have saved the values in a from a previous call to g01dd.
If in doubt, set calwts equal to true.
a
Type: System.Double[]
An array of size [n]
On entry: if calwts has been set to false then before entry a must contain the $n$ weights as calculated in a previous call to g01dd, otherwise a need not be set.
On exit: the $n$ weights required to calculate ${\mathbf{w}}$.
w
Type: System.Double%
On exit: the value of the statistic, ${\mathbf{w}}$.
pw
Type: System.Double%
On exit: the significance level of ${\mathbf{w}}$.
ifail
Type: System.Int32%
On exit: ${\mathbf{ifail}}={0}$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

# Description

g01dd calculates Shapiro and Wilk's $W$ statistic and its significance level for any sample size between $3$ and $5000$. It is an adaptation of the Applied Statistics Algorithm AS R94, see Royston (1995). The full description of the theory behind this algorithm is given in Royston (1992).
Given a set of observations ${x}_{1},{x}_{2},\dots ,{x}_{n}$ sorted into either ascending or descending order (M01CAF, which is not in this release, may be used to sort the data), this method calculates the value of Shapiro and Wilk's $W$ statistic, defined as:
 $W=\frac{{\left(\sum _{i=1}^{n}{a}_{i}{x}_{i}\right)}^{2}}{\sum _{i=1}^{n}{\left({x}_{i}-\bar{x}\right)}^{2}},$
where $\bar{x}=\frac{1}{n}\sum _{i=1}^{n}{x}_{i}$ is the sample mean and ${a}_{i}$, for $i=1,2,\dots ,n$, are a set of ‘weights’ whose values depend only on the sample size $n$.
On exit, the values of ${a}_{i}$, for $\mathit{i}=1,2,\dots ,n$, are only of interest should you wish to call the method again to calculate ${\mathbf{w}}$ and its significance level for a different sample of the same size.
It is recommended that the method is used in conjunction with a Normal $\left(Q-Q\right)$ plot of the data. Methods g01da and g01db can be used to obtain the required Normal scores.
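The definition of $W$ can be written out directly as a short C# sketch. The class and method names here are hypothetical and not part of the library; the sketch only shows how the weights ${a}_{i}$ and the ordered data ${x}_{i}$ combine, and it does not compute the weights or the significance level.

```csharp
using System;

// Hypothetical helper, not a library routine: evaluates the W formula
// for weights a and ordered data x that are supplied by the caller.
public static class ShapiroWilkFormula
{
    // W = (sum a_i x_i)^2 / sum (x_i - mean)^2
    public static double W(double[] a, double[] x)
    {
        if (a.Length != x.Length || x.Length == 0)
            throw new ArgumentException("a and x must have the same non-zero length");

        int n = x.Length;
        double mean = 0.0;
        for (int i = 0; i < n; i++) mean += x[i];
        mean /= n;

        double weighted = 0.0;    // sum of a_i * x_i
        double sumSquares = 0.0;  // sum of squares about the mean
        for (int i = 0; i < n; i++)
        {
            weighted += a[i] * x[i];
            sumSquares += (x[i] - mean) * (x[i] - mean);
        }
        return weighted * weighted / sumSquares;
    }
}
```

For $n=3$ the Shapiro and Wilk weights are commonly tabulated as $\left(-1/\sqrt{2},0,1/\sqrt{2}\right)$. With those values the equally spaced sample $1,2,3$ gives $W=1$. Whether g01dd returns the weights in exactly this form is not stated on this page.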

# References

Royston J P (1982) Algorithm AS 181: the $W$ test for normality Appl. Statist. 31 176–180
Royston J P (1986) A remark on AS 181: the $W$ test for normality Appl. Statist. 35 232–234
Royston J P (1992) Approximating the Shapiro–Wilk's $W$ test for non-normality Statistics & Computing 2 117–119
Royston J P (1995) A remark on AS R94: A remark on Algorithm AS 181: the $W$ test for normality Appl. Statist. 44(4) 547–551

# Error Indicators and Warnings

Errors or warnings detected by the method:
${\mathbf{ifail}}=1$
 On entry, ${\mathbf{n}}<3$.
${\mathbf{ifail}}=2$
 On entry, ${\mathbf{n}}>5000$.
${\mathbf{ifail}}=3$
 On entry, the elements in x are not in ascending or descending order or are all equal.
${\mathbf{ifail}}=-9000$
An error occurred, see message report.
${\mathbf{ifail}}=-8000$
Negative dimension for array $〈\mathit{\text{value}}〉$
${\mathbf{ifail}}=-6000$
Invalid Parameters $〈\mathit{\text{value}}〉$
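The input conditions behind ${\mathbf{ifail}}=1$, $2$ and $3$ can be checked before calling the method. The following is a minimal sketch with a hypothetical helper name. It assumes that tied values are allowed within an ordered sample and that x holds at least n elements, and it does not model the negative ifail values.

```csharp
using System;

// Hypothetical pre-check, not part of the library: mirrors the documented
// conditions for ifail = 1, 2 and 3 so bad input can be caught early.
public static class G01ddInputCheck
{
    // Returns 1 if n < 3, 2 if n > 5000, 3 if the first n elements of x
    // are unordered or all equal, and 0 otherwise.
    public static int ExpectedIfail(double[] x, int n)
    {
        if (n < 3) return 1;
        if (n > 5000) return 2;

        bool ascending = true;   // non-decreasing (ties allowed: an assumption)
        bool descending = true;  // non-increasing
        for (int i = 1; i < n; i++)
        {
            if (x[i] < x[i - 1]) ascending = false;
            if (x[i] > x[i - 1]) descending = false;
        }

        // Data that are both non-decreasing and non-increasing are constant.
        bool allEqual = ascending && descending;
        bool ordered = ascending || descending;
        return (allEqual || !ordered) ? 3 : 0;
    }
}
```

A non-zero result from this helper suggests that the call would fail with the same code, but the value of ifail returned by g01dd itself remains the authority.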

# Accuracy

There may be a loss of significant figures for large $n$.

# Parallelism and Performance

None.

# Further Comments

The time taken by g01dd depends roughly linearly on the value of $n$.
For very small samples the power of the test may not be very high.
The contents of the array a should not be modified between calls to g01dd for a given sample size, unless calwts is reset to true before each call of g01dd.
Shapiro and Wilk's $W$ test is very sensitive to ties. If the data have been rounded, the test can be improved by using Sheppard's correction to adjust the sum of squares about the mean. This produces an adjusted value of ${\mathbf{w}}$,
 ${W}_{A}=W\frac{\sum _{i=1}^{n}{\left({x}_{i}-\bar{x}\right)}^{2}}{\sum _{i=1}^{n}{\left({x}_{i}-\bar{x}\right)}^{2}-\frac{n-1}{12}{\omega }^{2}},$
where $\omega$ is the rounding width. ${W}_{A}$ can be compared with a standard Normal distribution, but a further approximation is given by Royston (1986).
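The adjustment can be sketched in C# as follows. The helper name is hypothetical and not part of the library; it takes the value of w returned by g01dd, the sample, and a rounding width $\omega$ chosen by the caller, and applies Sheppard's correction to the sum of squares about the mean.

```csharp
using System;

// Hypothetical helper, not a library routine: applies Sheppard's correction
// to a W value that has already been computed.
public static class SheppardCorrection
{
    public static double AdjustW(double w, double[] x, double omega)
    {
        int n = x.Length;
        double mean = 0.0;
        for (int i = 0; i < n; i++) mean += x[i];
        mean /= n;

        double sumSquares = 0.0;  // sum of squares about the mean
        for (int i = 0; i < n; i++)
            sumSquares += (x[i] - mean) * (x[i] - mean);

        // Corrected sum of squares: subtract (n - 1) / 12 * omega^2.
        double corrected = sumSquares - (n - 1) / 12.0 * omega * omega;
        if (corrected <= 0.0)
            throw new ArgumentException("rounding width is too large for this sample");

        return w * sumSquares / corrected;
    }
}
```

With $\omega =0$ the helper returns w unchanged, which is a quick check that the correction has been applied in the right direction.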
If ${\mathbf{n}}>5000$, values for w and pw are returned, but their accuracy may not be acceptable. See [References] for more details.

# Example

This example tests the following two samples (each of size $20$) for Normality.
| Sample Number | Data |
| --- | --- |
| 1 | $0.11$, $7.87$, $4.61$, $10.14$, $7.95$, $3.14$, $0.46$, $4.43$, $0.21$, $4.75$, $0.71$, $1.52$, $3.24$, $0.93$, $0.42$, $4.97$, $9.53$, $4.55$, $0.47$, $6.66$ |
| 2 | $1.36$, $1.14$, $2.92$, $2.55$, $1.46$, $1.06$, $5.27$, $-1.11$, $3.48$, $1.10$, $0.88$, $-0.51$, $1.46$, $0.52$, $6.20$, $1.69$, $0.08$, $3.67$, $2.81$, $3.49$ |
The elements of a are calculated only in the first call of g01dd, and are re-used in the second call.
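The calling pattern described here (weights computed on the first call, then re-used) might be sketched as below. This is not the shipped example program g01dde.cs. The delegate and class names are hypothetical; the delegate simply copies the parameter list from the Syntax section so that the sketch compiles without the library.

```csharp
using System;

// Hypothetical driver, not the shipped example: tests several samples of the
// same size, asking for the weights to be calculated only on the first call.
public static class ShapiroWilkBatch
{
    // Same parameter list as g01dd in the Syntax section.
    public delegate void WTest(double[] x, int n, bool calwts, double[] a,
                               out double w, out double pw, out int ifail);

    public static (double W, double Pw, int Ifail)[] Run(WTest test, double[][] samples)
    {
        int n = samples[0].Length;
        double[] a = new double[n];  // weights, shared across calls
        var results = new (double W, double Pw, int Ifail)[samples.Length];
        bool calwts = true;          // calculate weights on the first call

        for (int k = 0; k < samples.Length; k++)
        {
            if (samples[k].Length != n)
                throw new ArgumentException("all samples must have the same size");

            // The method expects ordered data, so sort a copy of the sample.
            double[] x = (double[])samples[k].Clone();
            Array.Sort(x);

            test(x, n, calwts, a, out double w, out double pw, out int ifail);
            results[k] = (w, pw, ifail);

            // Cautious choice: re-use the weights only after a successful call.
            if (ifail == 0) calwts = false;
        }
        return results;
    }
}
```

Assuming the library exposes the method as a static member of a class such as `G01` (an assumption, not confirmed on this page), the call would look like `ShapiroWilkBatch.Run(G01.g01dd, samples)`, with samples holding the two data sets above.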

Example program (C#): g01dde.cs

Example program data: g01dde.d

Example program results: g01dde.r