g02bu calculates the sample means and sums of squares and cross-products, or sums of squares and cross-products of deviations from the mean, in a single pass for a set of data. The data may be weighted.

Syntax

C#
public static void g02bu(
	string mean,
	string weight,
	int n,
	int m,
	double[,] x,
	double[] wt,
	out double sw,
	double[] wmean,
	double[] c,
	out int ifail
)
Visual Basic
Public Shared Sub g02bu ( _
	mean As String, _
	weight As String, _
	n As Integer, _
	m As Integer, _
	x As Double(,), _
	wt As Double(), _
	<OutAttribute> ByRef sw As Double, _
	wmean As Double(), _
	c As Double(), _
	<OutAttribute> ByRef ifail As Integer _
)
Visual C++
public:
static void g02bu(
	String^ mean, 
	String^ weight, 
	int n, 
	int m, 
	array<double,2>^ x, 
	array<double>^ wt, 
	[OutAttribute] double% sw, 
	array<double>^ wmean, 
	array<double>^ c, 
	[OutAttribute] int% ifail
)
F#
static member g02bu : 
        mean : string * 
        weight : string * 
        n : int * 
        m : int * 
        x : float[,] * 
        wt : float[] * 
        sw : float byref * 
        wmean : float[] * 
        c : float[] * 
        ifail : int byref -> unit 

Parameters

mean
Type: System.String
On entry: indicates whether g02bu is to calculate sums of squares and cross-products, or sums of squares and cross-products of deviations about the mean.
mean="M"
The sums of squares and cross-products of deviations about the mean are calculated.
mean="Z"
The sums of squares and cross-products are calculated.
Constraint: mean="M" or "Z".
weight
Type: System.String
On entry: indicates whether the data is weighted or not.
weight="U"
The calculations are performed on unweighted data.
weight="W"
The calculations are performed on weighted data.
Constraint: weight="W" or "U".
n
Type: System.Int32
On entry: n, the number of observations in the dataset.
Constraint: n≥1.
m
Type: System.Int32
On entry: m, the number of variables.
Constraint: m≥1.
x
Type: System.Double[,]
An array of size [dim1, m]
Note: dim1 must satisfy the constraint: dim1≥n
On entry: x[i-1,j-1] must contain the ith observation on the jth variable, for i=1,2,…,n and j=1,2,…,m.
wt
Type: System.Double[]
An array of size [dim1]
Note: the dimension of the array wt must be at least n if weight="W", and at least 1 otherwise.
On entry: the optional weights of each observation.
If weight="U", wt is not referenced.
If weight="W", wt[i-1] must contain the weight for the ith observation.
Constraint: if weight="W", wt[i]≥0.0, for i=0,1,…,n-1.
sw
Type: System.Double%
On exit: the sum of weights.
If weight="U", sw contains the number of observations, n.
wmean
Type: System.Double[]
An array of size [m]
On exit: the sample means. wmean[j-1] contains the mean for the jth variable.
c
Type: System.Double[]
An array of size [(m×m+m)/2]
On exit: the cross-products.
If mean="M", c contains the upper triangular part of the matrix of (weighted) sums of squares and cross-products of deviations about the mean.
If mean="Z", c contains the upper triangular part of the matrix of (weighted) sums of squares and cross-products.
These are stored packed by columns, i.e., the cross-product between the jth and kth variable, k≥j, is stored in c[k×(k-1)/2+j-1].
ifail
Type: System.Int32%
On exit: ifail=0 unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).
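
The packed layout of c described above can be sketched as follows. This is an illustrative helper only, not part of the library; the class name PackedUpper and its methods are hypothetical. It maps a pair of 1-based variable numbers to the zero-based position in c and expands the packed array into a full symmetric matrix.

```csharp
using System;

// Hypothetical helper (not part of the library) for the packed-by-columns
// upper triangle returned in c.
public static class PackedUpper
{
    // Zero-based position in c of the cross-product between the jth and kth
    // variables (both 1-based). The pair is swapped if needed so that k >= j.
    public static int Index(int j, int k)
    {
        if (j > k) { int t = j; j = k; k = t; }
        return k * (k - 1) / 2 + j - 1;
    }

    // Expands packed c (length (m*m+m)/2) into a full symmetric m-by-m matrix.
    public static double[,] Unpack(double[] c, int m)
    {
        var full = new double[m, m];
        for (int k = 1; k <= m; k++)
        {
            for (int j = 1; j <= k; j++)
            {
                double v = c[Index(j, k)];
                full[j - 1, k - 1] = v;
                full[k - 1, j - 1] = v;
            }
        }
        return full;
    }
}
```

For m=3 the packed order is (1,1), (1,2), (2,2), (1,3), (2,3), (3,3), so the cross-product between variables 2 and 3 sits at c[4].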

Description

g02bu is an adaptation of West's WV2 algorithm; see West (1979). This method calculates the (optionally weighted) sample means and (optionally weighted) sums of squares and cross-products or sums of squares and cross-products of deviations from the (weighted) mean for a sample of $n$ observations on $m$ variables $X_j$, for $j=1,2,\ldots,m$. The algorithm makes a single pass through the data.
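For the first $i-1$ observations let the mean of the $j$th variable be $\bar{x}_j(i-1)$, the cross-product about the mean for the $j$th and $k$th variables be $c_{jk}(i-1)$ and the sum of weights be $W_{i-1}$. These are updated by the $i$th observation, $x_{ij}$, for $j=1,2,\ldots,m$, with weight $w_i$ as follows:
$$W_i = W_{i-1} + w_i, \qquad \bar{x}_j(i) = \bar{x}_j(i-1) + \frac{w_i}{W_i}\left(x_{ij} - \bar{x}_j(i-1)\right), \quad j=1,2,\ldots,m$$
and
$$c_{jk}(i) = c_{jk}(i-1) + \frac{w_i}{W_i}\left(x_{ij} - \bar{x}_j(i-1)\right)\left(x_{ik} - \bar{x}_k(i-1)\right)W_{i-1}, \quad j=1,2,\ldots,m \text{ and } k=j,j+1,\ldots,m.$$
The algorithm is initialized by taking $\bar{x}_j(1)=x_{1j}$, the first observation, and $c_{jk}(1)=0.0$.
For the unweighted case $w_i=1$ and $W_i=i$ for all $i$.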
Note that only the upper triangle of the matrix is calculated and returned packed by column.
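
To make the update formulae concrete, here is a minimal self-contained sketch of the single-pass recurrence for the deviations-about-the-mean case. It is not the library's implementation: the class WestSinglePass and its method Accumulate are hypothetical names, passing null for wt is this sketch's own convention for unweighted data, and skipping zero-weight observations is an assumption made here to avoid dividing by a zero sum of weights.

```csharp
using System;

// Illustrative sketch of the single-pass update; not the library routine.
public static class WestSinglePass
{
    // x is n-by-m; wt has length n, or is null for unweighted data
    // (a convention of this sketch). On return, c holds the upper triangle
    // of cross-products of deviations about the mean, packed by columns.
    public static void Accumulate(double[,] x, double[] wt,
                                  out double sw, out double[] wmean, out double[] c)
    {
        int n = x.GetLength(0), m = x.GetLength(1);
        sw = 0.0;
        wmean = new double[m];
        c = new double[(m * m + m) / 2];
        var d = new double[m];

        for (int i = 0; i < n; i++)
        {
            double w = (wt == null) ? 1.0 : wt[i];
            if (w == 0.0) continue;          // assumption: zero weights contribute nothing

            double wPrev = sw;               // W_{i-1}
            sw += w;                         // W_i
            double r = w / sw;               // w_i / W_i

            // Deviations of the new observation from the previous means.
            for (int j = 0; j < m; j++) d[j] = x[i, j] - wmean[j];

            // Update the packed upper triangle (zero-based j <= k).
            for (int k = 0; k < m; k++)
                for (int j = 0; j <= k; j++)
                    c[k * (k + 1) / 2 + j] += r * wPrev * d[j] * d[k];

            // Update the means last, so the deviations above use the old means.
            for (int j = 0; j < m; j++) wmean[j] += r * d[j];
        }
    }
}
```

Starting from a zero sum of weights and zero means reproduces the stated initialization: the first observation with positive weight sets the means to its own values and leaves the cross-products at zero.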

References

Chan T F, Golub G H and Leveque R J (1982) Updating formulae and a pairwise algorithm for computing sample variances. Compstat, Physica-Verlag
West D H D (1979) Updating mean and variance estimates: an improved method. Comm. ACM 22 532–535

Error Indicators and Warnings

Errors or warnings detected by the method:
Some error messages may refer to parameters that are dropped from this interface (LDX). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.
ifail=1
On entry, m<1,
or n<1.
ifail=2
On entry, mean≠"M" or "Z".
ifail=3
On entry, weight≠"W" or "U".
ifail=-9000
An error occurred, see message report.
ifail=-6000
Invalid Parameters value
ifail=-4000
Invalid dimension for array value
ifail=-8000
Negative dimension for array value

Accuracy

For a detailed discussion of the accuracy of this algorithm see Chan et al. (1982) or West (1979).

Parallelism and Performance

None.

Further Comments

g02bw may be used to calculate the correlation coefficients from the cross-products of deviations about the mean. The cross-products of deviations about the mean may be scaled using f06fd (F06EDF is not in this release) to give a variance-covariance matrix.
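
The two post-processing steps mentioned here can be sketched directly on the packed array, without calling any library routine. The class and method names below are hypothetical, and the divisor sw-1 is an assumption (the usual unbiased divisor for unweighted data); the text does not state which scale factor to use.

```csharp
using System;

// Illustrative post-processing of packed cross-products of deviations
// about the mean; these helpers are not library routines.
public static class CrossProductScaling
{
    // Scales every packed element by 1/(sw - 1). The divisor is an
    // assumption of this sketch; it requires sw > 1.
    public static double[] ToCovariance(double[] c, double sw)
    {
        double alpha = 1.0 / (sw - 1.0);
        var v = new double[c.Length];
        for (int i = 0; i < c.Length; i++) v[i] = alpha * c[i];
        return v;
    }

    // Correlation r_jk = c_jk / sqrt(c_jj * c_kk), in the same packed layout.
    // A variable with zero sum of squares produces NaN or infinity.
    public static double[] ToCorrelation(double[] c, int m)
    {
        var r = new double[c.Length];
        for (int k = 0; k < m; k++)
        {
            double ckk = c[k * (k + 1) / 2 + k];
            for (int j = 0; j <= k; j++)
            {
                double cjj = c[j * (j + 1) / 2 + j];
                int p = k * (k + 1) / 2 + j;
                r[p] = c[p] / Math.Sqrt(cjj * ckk);
            }
        }
        return r;
    }
}
```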
The means and cross-products produced by g02bu may be updated by adding or removing observations using g02bt.
Two sets of means and cross-products, as produced by g02bu, can be combined using G02BZF, which is not in this release.
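
For readers who need to combine two sets by hand, the standard pairwise combination formula can be sketched as below. This is not the G02BZF implementation, and the class and method names are hypothetical. It assumes both sets hold cross-products of deviations about the mean in the same packed layout.

```csharp
using System;

// Illustrative pairwise combination of two (sw, means, packed c) sets;
// not a library routine.
public static class CrossProductCombine
{
    public static void Combine(double swA, double[] meanA, double[] cA,
                               double swB, double[] meanB, double[] cB,
                               out double sw, out double[] mean, out double[] c)
    {
        int m = meanA.Length;
        sw = swA + swB;
        mean = new double[m];
        c = new double[cA.Length];
        if (sw == 0.0) return;               // nothing to combine

        // Difference between the two sets of means.
        var delta = new double[m];
        for (int j = 0; j < m; j++)
        {
            delta[j] = meanB[j] - meanA[j];
            mean[j] = meanA[j] + (swB / sw) * delta[j];
        }

        // c_jk = cA_jk + cB_jk + (swA * swB / sw) * delta_j * delta_k
        double f = swA * swB / sw;
        for (int k = 0; k < m; k++)
            for (int j = 0; j <= k; j++)
            {
                int p = k * (k + 1) / 2 + j;
                c[p] = cA[p] + cB[p] + f * delta[j] * delta[k];
            }
    }
}
```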

Example

A program to calculate the means, the required sums of squares and cross-products matrix, and the variance matrix for a set of 3 observations of 3 variables.

Example program (C#): g02bue.cs

Example program data: g02bue.d

Example program results: g02bue.r
