g02la fits an orthogonal scores partial least squares (PLS) regression by using singular value decomposition.

Syntax

C#
public static void g02la(
	int n,
	int mx,
	double[,] x,
	int[] isx,
	int ip,
	int my,
	double[,] y,
	double[] xbar,
	double[] ybar,
	int iscale,
	double[] xstd,
	double[] ystd,
	int maxfac,
	double[,] xres,
	double[,] yres,
	double[,] w,
	double[,] p,
	double[,] t,
	double[,] c,
	double[,] u,
	double[] xcv,
	double[,] ycv,
	out int ifail
)

Visual Basic
Public Shared Sub g02la ( _
	n As Integer, _
	mx As Integer, _
	x As Double(,), _
	isx As Integer(), _
	ip As Integer, _
	my As Integer, _
	y As Double(,), _
	xbar As Double(), _
	ybar As Double(), _
	iscale As Integer, _
	xstd As Double(), _
	ystd As Double(), _
	maxfac As Integer, _
	xres As Double(,), _
	yres As Double(,), _
	w As Double(,), _
	p As Double(,), _
	t As Double(,), _
	c As Double(,), _
	u As Double(,), _
	xcv As Double(), _
	ycv As Double(,), _
	<OutAttribute> ByRef ifail As Integer _
)

Visual C++
public:
static void g02la(
	int n, 
	int mx, 
	array<double,2>^ x, 
	array<int>^ isx, 
	int ip, 
	int my, 
	array<double,2>^ y, 
	array<double>^ xbar, 
	array<double>^ ybar, 
	int iscale, 
	array<double>^ xstd, 
	array<double>^ ystd, 
	int maxfac, 
	array<double,2>^ xres, 
	array<double,2>^ yres, 
	array<double,2>^ w, 
	array<double,2>^ p, 
	array<double,2>^ t, 
	array<double,2>^ c, 
	array<double,2>^ u, 
	array<double>^ xcv, 
	array<double,2>^ ycv, 
	[OutAttribute] int% ifail
)

F#
static member g02la : 
        n : int * 
        mx : int * 
        x : float[,] * 
        isx : int[] * 
        ip : int * 
        my : int * 
        y : float[,] * 
        xbar : float[] * 
        ybar : float[] * 
        iscale : int * 
        xstd : float[] * 
        ystd : float[] * 
        maxfac : int * 
        xres : float[,] * 
        yres : float[,] * 
        w : float[,] * 
        p : float[,] * 
        t : float[,] * 
        c : float[,] * 
        u : float[,] * 
        xcv : float[] * 
        ycv : float[,] * 
        ifail : int byref -> unit

Parameters

n: Type: System.Int32
On entry: $n$, the number of observations.

Constraint: $n > 1$.

mx: Type: System.Int32
On entry: the number of predictor variables.

Constraint: $mx > 1$.

x: Type: System.Double[,]
An array of size [dim1, mx]
Note: dim1 must satisfy the constraint: $dim1 \geq n$
On entry: $x[i - 1, j - 1]$ must contain the $i$th observation on the $j$th predictor variable, for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, mx$.

isx: Type: System.Int32[]
An array of size [mx]
On entry: indicates which predictor variables are to be included in the model.

$isx[j - 1] = 1$: The $j$th predictor variable (with variates in the $j$th column of $X$) is included in the model.
$isx[j - 1] = 0$: Otherwise.

Constraint: the sum of elements in isx must equal ip.
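
The relationship between isx and ip can be sketched as follows. This is a minimal illustration and not part of the library: the helper names `CountSelected` and `SelectColumns` are hypothetical, and the sketch assumes that the selected variables keep their original column order.

```csharp
using System;

public static class IsxSketch
{
    // Counts the flagged predictors; this is the value that ip must take.
    public static int CountSelected(int[] isx)
    {
        int ip = 0;
        foreach (int flag in isx)
        {
            if (flag != 0 && flag != 1)
                throw new ArgumentException("Each element of isx must be 0 or 1.");
            ip += flag;
        }
        return ip;
    }

    // Copies the first n rows of the flagged columns of x into an n by ip array.
    // Assumption: selected variables keep their original column order.
    public static double[,] SelectColumns(double[,] x, int[] isx, int n)
    {
        int ip = CountSelected(isx);
        var selected = new double[n, ip];
        int col = 0;
        for (int j = 0; j < isx.Length; j++)
        {
            if (isx[j] == 0) continue;
            for (int i = 0; i < n; i++)
                selected[i, col] = x[i, j];
            col++;
        }
        return selected;
    }
}
```

With `isx = { 1, 0, 1 }`, `CountSelected` returns 2, so ip must be 2 and only the first and third columns of x enter the model.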

ip: Type: System.Int32
On entry: $m$, the number of predictor variables in the model.

Constraint: $1 < ip \leq mx$.

my: Type: System.Int32
On entry: $r$, the number of response variables.

Constraint: $my \geq 1$.

y: Type: System.Double[,]
An array of size [dim1, my]
Note: dim1 must satisfy the constraint: $dim1 \geq n$
On entry: $y[i - 1, j - 1]$ must contain the $i$th observation for the $j$th response variable, for $i = 1, 2, \dots, n$ and $j = 1, 2, \dots, my$.

xbar: Type: System.Double[]
An array of size [ip]
On exit: mean values of predictor variables in the model.

ybar: Type: System.Double[]
An array of size [my]
On exit: the mean value of each response variable.

iscale: Type: System.Int32
On entry: indicates how predictor variables are scaled.

$iscale = 1$: Data are scaled by the standard deviation of variables.
$iscale = 2$: Data are scaled by user-supplied scalings.
$iscale = -1$: No scaling.

Constraint: $iscale = -1$, $1$ or $2$.

xstd: Type: System.Double[]
An array of size [ip]
On entry: if $iscale = 2$, $xstd[j - 1]$ must contain the user-supplied scaling for the $j$th predictor variable in the model, for $j = 1, 2, \dots, ip$. Otherwise xstd need not be set.
On exit: if $iscale = 1$, standard deviations of predictor variables in the model. Otherwise xstd is not changed.

ystd: Type: System.Double[]
An array of size [my]
On entry: if $iscale = 2$, $ystd[j - 1]$ must contain the user-supplied scaling for the $j$th response variable in the model, for $j = 1, 2, \dots, my$. Otherwise ystd need not be set.
On exit: if $iscale = 1$, the standard deviation of each response variable. Otherwise ystd is not changed.
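
The centring and scaling selected by iscale can be illustrated with a short sketch. It is not the library's code: the name `CentreAndScale` is hypothetical, and the sketch assumes the sample standard deviation with divisor $n - 1$, which this page does not specify. Constant columns are left unscaled here to avoid dividing by zero; how g02la treats them is not stated.

```csharp
using System;

public static class ScalingSketch
{
    // Centres each column of data in place and, when scale is true, divides it
    // by its standard deviation (roughly what iscale = 1 describes).
    // Assumption: the divisor is n - 1; the source text does not say which
    // divisor g02la uses.
    public static void CentreAndScale(double[,] data, bool scale,
                                      out double[] mean, out double[] std)
    {
        int n = data.GetLength(0), cols = data.GetLength(1);
        mean = new double[cols];
        std = new double[cols];
        for (int j = 0; j < cols; j++)
        {
            for (int i = 0; i < n; i++) mean[j] += data[i, j];
            mean[j] /= n;

            double sumSq = 0.0;
            for (int i = 0; i < n; i++)
            {
                data[i, j] -= mean[j];
                sumSq += data[i, j] * data[i, j];
            }
            std[j] = Math.Sqrt(sumSq / (n - 1));

            // Constant columns are left unscaled in this sketch.
            if (!scale || std[j] == 0.0) continue;
            for (int i = 0; i < n; i++) data[i, j] /= std[j];
        }
    }
}
```

Under these assumptions the returned `mean` and `std` arrays play the role of xbar and xstd (or ybar and ystd) on exit when $iscale = 1$.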

maxfac: Type: System.Int32
On entry: $k$, the number of latent variables to calculate.

Constraint: $1 \leq maxfac \leq ip$.

xres: Type: System.Double[,]
An array of size [dim1, ip]
Note: dim1 must satisfy the constraint: $dim1 \geq n$
On exit: the predictor variables' residual matrix $X_{k}$.

yres: Type: System.Double[,]
An array of size [dim1, my]
Note: dim1 must satisfy the constraint: $dim1 \geq n$
On exit: the residuals for each response variable, $Y_{k}$.

w: Type: System.Double[,]
An array of size [dim1, maxfac]
Note: dim1 must satisfy the constraint: $dim1 \geq ip$
On exit: the $j$th column of $W$ contains the $x$-weights $w_{j}$, for $j = 1, 2, \dots, maxfac$.

p: Type: System.Double[,]
An array of size [dim1, maxfac]
Note: dim1 must satisfy the constraint: $dim1 \geq ip$
On exit: the $j$th column of $P$ contains the $x$-loadings $p_{j}$, for $j = 1, 2, \dots, maxfac$.

t: Type: System.Double[,]
An array of size [dim1, maxfac]
Note: dim1 must satisfy the constraint: $dim1 \geq n$
On exit: the $j$th column of $T$ contains the $x$-scores $t_{j}$, for $j = 1, 2, \dots, maxfac$.

c: Type: System.Double[,]
An array of size [dim1, maxfac]
Note: dim1 must satisfy the constraint: $dim1 \geq my$
On exit: the $j$th column of $C$ contains the $y$-loadings $c_{j}$, for $j = 1, 2, \dots, maxfac$.

u: Type: System.Double[,]
An array of size [dim1, maxfac]
Note: dim1 must satisfy the constraint: $dim1 \geq n$
On exit: the $j$th column of $U$ contains the $y$-scores $u_{j}$, for $j = 1, 2, \dots, maxfac$.

xcv: Type: System.Double[]
An array of size [maxfac]
On exit: $xcv[j - 1]$ contains the cumulative percentage of variance in the predictor variables explained by the first $j$ factors, for $j = 1, 2, \dots, maxfac$.

ycv: Type: System.Double[,]
An array of size [dim1, my]
Note: dim1 must satisfy the constraint: $dim1 \geq maxfac$
On exit: $ycv[i - 1, j - 1]$ is the cumulative percentage of variance of the $j$th response variable explained by the first $i$ factors, for $i = 1, 2, \dots, maxfac$ and $j = 1, 2, \dots, my$.

ifail: Type: System.Int32%
On exit: $ifail = 0$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).
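
Because none of the array parameters is declared `out`, the caller has to allocate every array, including those that are only written on exit. The sketch below gathers the minimum sizes listed above in one place. The class name `G02laArrays` is hypothetical, and the call in the trailing comment assumes that the routine is exposed as `NagLibrary.G02.g02la`, which this page does not state.

```csharp
public sealed class G02laArrays
{
    public readonly double[] Xbar, Ybar, Xstd, Ystd, Xcv;
    public readonly double[,] Xres, Yres, W, P, T, C, U, Ycv;

    // Allocates each array with the minimum dimensions documented above.
    public G02laArrays(int n, int ip, int my, int maxfac)
    {
        Xbar = new double[ip];
        Ybar = new double[my];
        Xstd = new double[ip];
        Ystd = new double[my];
        Xres = new double[n, ip];
        Yres = new double[n, my];
        W = new double[ip, maxfac];
        P = new double[ip, maxfac];
        T = new double[n, maxfac];
        C = new double[my, maxfac];
        U = new double[n, maxfac];
        Xcv = new double[maxfac];
        Ycv = new double[maxfac, my];
    }
}

// Assumed call shape (namespace and class are assumptions, not taken from this page):
//
//   var a = new G02laArrays(n, ip, my, maxfac);
//   int ifail;
//   NagLibrary.G02.g02la(n, mx, x, isx, ip, my, y, a.Xbar, a.Ybar, iscale,
//                        a.Xstd, a.Ystd, maxfac, a.Xres, a.Yres, a.W, a.P,
//                        a.T, a.C, a.U, a.Xcv, a.Ycv, out ifail);
```

If $iscale = 2$, the caller would also fill `Xstd` and `Ystd` with the user-supplied scalings before the call.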

Description

Let $X_{1}$ be the mean-centred $n$ by $m$ data matrix $X$ of $n$ observations on $m$ predictor variables. Let $Y_{1}$ be the mean-centred $n$ by $r$ data matrix $Y$ of $n$ observations on $r$ response variables.

The first of the $k$ factors PLS methods extract from the data predicts both $X_{1}$ and $Y_{1}$ by regressing on $t_{1}$, a column vector of $n$ scores:

$$\begin{matrix} {\hat{X}}_{1} = t_{1} p_{1}^{T} \\ {\hat{Y}}_{1} = t_{1} c_{1}^{T}, \quad \text{with } t_{1}^{T} t_{1} = 1, \end{matrix}$$

where the column vectors of $m$ $x$-loadings $p_{1}$ and $r$ $y$-loadings $c_{1}$ are calculated in the least squares sense:

$$\begin{matrix} p_{1}^{T} = t_{1}^{T} X_{1} \\ c_{1}^{T} = t_{1}^{T} Y_{1}. \end{matrix}$$

The $x$-score vector $t_{1} = X_{1} w_{1}$ is the linear combination of predictor data $X_{1}$ that has maximum covariance with the $y$-scores $u_{1} = Y_{1} c_{1}$, where the $x$-weights vector $w_{1}$ is the normalised first left singular vector of $X_{1}^{T} Y_{1}$.

The method extracts subsequent PLS factors by repeating the above process with the residual matrices:

$$\begin{matrix} X_{i} = X_{i - 1} - {\hat{X}}_{i - 1} \\ Y_{i} = Y_{i - 1} - {\hat{Y}}_{i - 1}, \quad i = 2, 3, \dots, k, \end{matrix}$$

and with orthogonal scores:

$$t_{i}^{T} t_{j} = 0, \quad j = 1, 2, \dots, i - 1.$$

Optionally, in addition to being mean-centred, the data matrices $X_{1}$ and $Y_{1}$ may be scaled by standard deviations of the variables. If data are supplied mean-centred, the calculations are not affected within numerical accuracy.
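
The factor extraction described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the routine's implementation: the class `PlsSketch` is hypothetical, the leading left singular vector of $X^{T} Y$ is approximated by power iteration rather than by a full singular value decomposition, the score vector is rescaled after forming $X w$ so that $t^{T} t = 1$, and $X^{T} Y$ is assumed to be nonzero at every step.

```csharp
using System;

public static class PlsSketch
{
    static void Normalise(double[] v)
    {
        double sumSq = 0.0;
        foreach (double e in v) sumSq += e * e;
        double norm = Math.Sqrt(sumSq);
        for (int i = 0; i < v.Length; i++) v[i] /= norm;
    }

    // Extracts k factors from mean-centred x (n by m) and y (n by r).
    // x and y are overwritten with the residuals left after the last factor.
    // Returns the n by k matrix whose columns are the x-scores t_1, ..., t_k.
    public static double[,] ExtractScores(double[,] x, double[,] y, int k)
    {
        int n = x.GetLength(0), m = x.GetLength(1), r = y.GetLength(1);
        var scores = new double[n, k];
        for (int f = 0; f < k; f++)
        {
            // S = X^T Y, an m by r matrix.
            var s = new double[m, r];
            for (int j = 0; j < m; j++)
                for (int l = 0; l < r; l++)
                    for (int i = 0; i < n; i++)
                        s[j, l] += x[i, j] * y[i, l];

            // Start the power iteration from the column of S with the largest norm.
            var w = new double[m];
            double best = -1.0;
            for (int l = 0; l < r; l++)
            {
                double sq = 0.0;
                for (int j = 0; j < m; j++) sq += s[j, l] * s[j, l];
                if (sq <= best) continue;
                best = sq;
                for (int j = 0; j < m; j++) w[j] = s[j, l];
            }
            Normalise(w);

            // Power iteration on S S^T (stand-in for the SVD, fixed iteration count).
            for (int it = 0; it < 100; it++)
            {
                var z = new double[r];                  // z = S^T w
                for (int l = 0; l < r; l++)
                    for (int j = 0; j < m; j++)
                        z[l] += s[j, l] * w[j];
                for (int j = 0; j < m; j++)             // w = S z
                {
                    w[j] = 0.0;
                    for (int l = 0; l < r; l++) w[j] += s[j, l] * z[l];
                }
                Normalise(w);
            }

            // x-scores t = X w, rescaled so that t^T t = 1.
            var t = new double[n];
            for (int i = 0; i < n; i++)
                for (int j = 0; j < m; j++)
                    t[i] += x[i, j] * w[j];
            Normalise(t);

            // Loadings p = X^T t and c = Y^T t, then deflate X and Y.
            var p = new double[m];
            var c = new double[r];
            for (int i = 0; i < n; i++)
            {
                for (int j = 0; j < m; j++) p[j] += t[i] * x[i, j];
                for (int l = 0; l < r; l++) c[l] += t[i] * y[i, l];
            }
            for (int i = 0; i < n; i++)
            {
                for (int j = 0; j < m; j++) x[i, j] -= t[i] * p[j];
                for (int l = 0; l < r; l++) y[i, l] -= t[i] * c[l];
                scores[i, f] = t[i];
            }
        }
        return scores;
    }
}
```

The orthogonality of the scores follows from the deflation step: since $X_{i} = (I - t_{i-1} t_{i-1}^{T}) X_{i-1}$, every later score vector $t_{i} = X_{i} w_{i}$ is orthogonal to $t_{i-1}$, so no explicit orthogonalisation is needed in the sketch.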

References

None.

Error Indicators and Warnings

Errors or warnings detected by the method:

Some error messages may refer to parameters that are dropped from this interface (LDX, LDY, LDXRES, LDYRES, LDW, LDP, LDT, LDC, LDU, LDYCV). In these cases, an error in another parameter has usually caused an incorrect value to be inferred.

$ifail = 1$

On entry,	$n < 2$,
or	$mx < 2$,
or	an element of $isx \neq 0$ or $1$,
or	$my < 1$,
or	$iscale \neq -1$, $1$ or $2$.

$ifail = 2$

On entry,	$ip < 2$ or $ip > mx$,
or	$maxfac < 1$ or $maxfac > ip$.

$ifail = 3$: ip does not equal the sum of elements in isx.

$ifail = -9000$: An error occurred, see message report.
$ifail = -6000$: Invalid Parameters $〈value〉$
$ifail = -4000$: Invalid dimension for array $〈value〉$
$ifail = -8000$: Negative dimension for array $〈value〉$
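
The documented conditions for $ifail = 1$, $2$ and $3$ can be checked before the call. The following is a minimal sketch with a hypothetical helper, `ExpectedIfail`; it assumes that the conditions are tested in the order of their error codes, which this page does not confirm, and it does not cover the negative codes.

```csharp
public static class G02laChecks
{
    // Returns the ifail value implied by the documented constraints,
    // or 0 if all of them hold. The order of the checks is an assumption.
    public static int ExpectedIfail(int n, int mx, int[] isx, int ip,
                                    int my, int iscale, int maxfac)
    {
        // ifail = 1: basic size, flag and option checks.
        if (n < 2 || mx < 2 || my < 1) return 1;
        if (iscale != -1 && iscale != 1 && iscale != 2) return 1;
        int sum = 0;
        foreach (int flag in isx)
        {
            if (flag != 0 && flag != 1) return 1;
            sum += flag;
        }

        // ifail = 2: ranges of ip and maxfac.
        if (ip < 2 || ip > mx) return 2;
        if (maxfac < 1 || maxfac > ip) return 2;

        // ifail = 3: ip must equal the number of flagged predictors.
        if (sum != ip) return 3;
        return 0;
    }
}
```

A check of this kind lets a caller report a specific message before the library is invoked, instead of decoding ifail afterwards.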

Accuracy

The computed singular value decomposition is nearly the exact singular value decomposition for a nearby matrix $(A + E)$, where

$${\|E\|}_{2} = O(\epsilon) {\|A\|}_{2},$$

and $\epsilon$ is the machine precision.

Parallelism and Performance

None.

Further Comments

g02la allocates internally $2mr + A + \max(3(A + B), 5A) + r$ elements of real storage, where $A = \min(m, r)$ and $B = \max(m, r)$.
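
The storage formula is easy to evaluate ahead of a call. The sketch below simply codes the expression given above; the helper name `InternalElements` is hypothetical, and the result counts real elements, not bytes.

```csharp
using System;

public static class G02laStorage
{
    // Evaluates 2mr + A + max(3(A + B), 5A) + r with A = min(m, r), B = max(m, r).
    // long arithmetic is used so that large m and r do not overflow int.
    public static long InternalElements(int m, int r)
    {
        long a = Math.Min(m, r);
        long b = Math.Max(m, r);
        return 2L * m * r + a + Math.Max(3 * (a + b), 5 * a) + r;
    }
}
```

For instance, $m = 15$ and $r = 1$ give $A = 1$, $B = 15$ and $30 + 1 + \max(48, 5) + 1 = 80$ elements.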

Example

This example reads in data from an experiment to measure the biological activity in a chemical compound and estimates a PLS model from them.

Example program (C#): g02lae.cs

Example program data: g02lae.d

Example program results: g02lae.r