NAG Library Manual

1  Purpose

nag_regsn_mult_linear_addrem_obs (g02dcc) adds or deletes an observation from a general regression model fitted by nag_regsn_mult_linear (g02dac).

2  Specification

 #include <nag.h>
 #include <nagg02.h>
 void nag_regsn_mult_linear_addrem_obs (Nag_UpdateObserv update, Nag_IncludeMean mean, Integer m, const Integer sx[], double q[], Integer tdq, Integer ip, const double x[], Integer nr, Integer tdx, Integer ix, double y, const double wt[], double *rss, NagError *fail)

3  Description

nag_regsn_mult_linear (g02dac) fits a general linear regression model to a dataset. You may wish to change the model by either adding or deleting an observation from the dataset. nag_regsn_mult_linear_addrem_obs (g02dcc) takes the results from nag_regsn_mult_linear (g02dac) and makes the required changes to the vector $c$ and the upper triangular matrix $R$ produced by nag_regsn_mult_linear (g02dac). The regression coefficients, standard errors and the variance-covariance matrix of the regression coefficients can be obtained from nag_regsn_mult_linear_upd_model (g02ddc) after all required changes to the dataset have been made.
nag_regsn_mult_linear (g02dac) performs a $QR$ decomposition on the (weighted) $X$ matrix of independent variables. To add a new observation to a model with $p$ parameters, the upper triangular matrix $R$ and vector ${c}_{1}$, the first $p$ elements of $c$, are augmented by the new observation on the independent variables in ${x}^{\mathrm{T}}$ and the dependent variable $y$. Givens rotations are then used to restore the upper triangular form:
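 $\left(\begin{array}{cc} R & {c}_{1} \\ {x}^{\mathrm{T}} & y \end{array}\right) \longrightarrow \left(\begin{array}{cc} {R}^{*} & {c}_{1}^{*} \\ 0 & {y}^{*} \end{array}\right)$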
To delete an observation Givens rotations are applied to give:
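 $\left(\begin{array}{cc} R & {c}_{1} \end{array}\right) \longrightarrow \left(\begin{array}{cc} {R}^{*} & {c}_{1}^{*} \\ {x}^{\mathrm{T}} & y \end{array}\right)$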
Note: only $R$ and the upper part of $c$ are updated; the remainder of the $Q$ matrix is unchanged.

4  References

Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Hammarling S (1985) The singular value decomposition in multivariate statistics SIGNUM Newsl. 20(3) 2–25

5  Arguments

1:    $\mathbf{update}$    Nag_UpdateObserv    Input
On entry: indicates if an observation is to be added or deleted.
${\mathbf{update}}=\mathrm{Nag_ObservAdd}$
The observation is added.
${\mathbf{update}}=\mathrm{Nag_ObservDel}$
The observation is deleted.
Constraint: ${\mathbf{update}}=\mathrm{Nag_ObservAdd}$ or $\mathrm{Nag_ObservDel}$.
2:    $\mathbf{mean}$    Nag_IncludeMean    Input
On entry: indicates if a mean has been used in the model.
${\mathbf{mean}}=\mathrm{Nag_MeanInclude}$
A mean term or intercept will have been included in the model by nag_regsn_mult_linear (g02dac).
${\mathbf{mean}}=\mathrm{Nag_MeanZero}$
A model with no mean term or intercept will have been fitted by nag_regsn_mult_linear (g02dac).
Constraint: ${\mathbf{mean}}=\mathrm{Nag_MeanInclude}$ or $\mathrm{Nag_MeanZero}$.
3:    $\mathbf{m}$    Integer    Input
On entry: the total number of independent variables in the dataset.
Constraint: ${\mathbf{m}}\ge 1$.
4:    $\mathbf{sx}\left[{\mathbf{m}}\right]$    const Integer    Input
On entry: if ${\mathbf{sx}}\left[\mathit{j}\right]$ is greater than 0, then the value contained in ${\mathbf{x}}\left[{\mathbf{tdx}}×\left({\mathbf{ix}}-1\right)+\mathit{j}\right]$ is to be included as a value of ${x}^{\mathrm{T}}$, an observation on an independent variable, for $\mathit{j}=0,1,\dots ,{\mathbf{m}}-1$.
Constraint: if ${\mathbf{mean}}=\mathrm{Nag_MeanInclude}$, then exactly ${\mathbf{ip}}-1$ elements of sx must be $>0$ and if ${\mathbf{mean}}=\mathrm{Nag_MeanZero}$, then exactly ip elements of sx must be $>0$.
5:    $\mathbf{q}\left[{\mathbf{ip}}×{\mathbf{tdq}}\right]$    double    Input/Output
Note: the $\left(i,j\right)$th element of the matrix $Q$ is stored in ${\mathbf{q}}\left[\left(i-1\right)×{\mathbf{tdq}}+j-1\right]$.
On entry: q must be array q as output by nag_regsn_mult_linear (g02dac), nag_regsn_mult_linear_add_var (g02dec), nag_regsn_mult_linear_delete_var (g02dfc), or a previous call to nag_regsn_mult_linear_addrem_obs (g02dcc).
On exit: the first ip elements of the first column of q will contain ${c}_{1}^{*}$, the upper triangular part of columns 2 to ${\mathbf{ip}}+1$ will contain ${R}^{*}$, the remainder is unchanged.
6:    $\mathbf{tdq}$    Integer    Input
On entry: the stride separating matrix column elements in the array q.
Constraint: ${\mathbf{tdq}}\ge {\mathbf{ip}}+1$.
7:    $\mathbf{ip}$    Integer    Input
On entry: the number of linear terms in the general linear regression model (including the mean if there is one).
Constraint: ${\mathbf{ip}}\ge 1$.
8:    $\mathbf{x}\left[{\mathbf{nr}}×{\mathbf{tdx}}\right]$    const double    Input
On entry: the ip values for the independent variables of the observation to be added or deleted, ${x}^{\mathrm{T}}$. The positions of the values extracted from x depend on ix and tdx.
9:    $\mathbf{nr}$    Integer    Input
On entry: the number of rows of the notional two-dimensional array x.
Constraint: ${\mathbf{nr}}\ge 1$.
10:  $\mathbf{tdx}$    Integer    Input
On entry: the stride separating matrix column elements in the array x.
Constraint: ${\mathbf{tdx}}\ge {\mathbf{m}}$.
11:  $\mathbf{ix}$    Integer    Input
On entry: the row of the notional two-dimensional array x that contains the values for the independent variables of the observation to be added or deleted.
Constraint: $1\le {\mathbf{ix}}\le {\mathbf{nr}}$.
12:  $\mathbf{y}$    double    Input
On entry: the value of the dependent variable for the observation to be added or deleted, $y$.
13:  $\mathbf{wt}\left[1\right]$    const double    Input
On entry: if the new observation is to be weighted, then wt must contain the weight to be used with the new observation. If ${\mathbf{wt}}\left[0\right]=0.0$, then the observation is not included in the model. If the new observation is to be unweighted, then wt must be supplied as NULL.
Constraint: if the new observation is to be weighted, ${\mathbf{wt}}\left[0\right]\ge 0.0$.
14:  $\mathbf{rss}$    double *    Input/Output
On entry: the value of the residual sum of squares for the original set of observations.
Constraint: ${\mathbf{rss}}\ge 0.0$.
On exit: the updated value of the residual sum of squares.
Note: this will only be valid if the model is of full rank.
15:  $\mathbf{fail}$    NagError *    Input/Output
The NAG error argument (see Section 3.6 in the Essential Introduction).
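The storage conventions given for sx, x and q above can be illustrated with a few small accessors. This is a sketch under assumptions rather than library code: the helper names `gather_obs`, `q_elem` and `r_elem` are hypothetical, plain `int` stands in for the library's Integer type, and placing a leading 1 in the row when a mean term is present is an assumption about how the intercept enters ${x}^{\mathrm{T}}$.

```c
#include <assert.h>

/* Hypothetical helper: copy the selected independent variables of row ix
   (1-based) of x into xt, and return how many values were written. */
static int gather_obs(int mean_included, int m, const int sx[],
                      const double x[], int nr, int tdx, int ix, double xt[])
{
    assert(ix >= 1 && ix <= nr && tdx >= m);  /* constraints quoted in this section */
    int k = 0;
    if (mean_included)
        xt[k++] = 1.0;                        /* assumption: intercept enters as a 1 */
    for (int j = 0; j < m; j++)
        if (sx[j] > 0)
            xt[k++] = x[tdx * (ix - 1) + j];  /* x[tdx*(ix-1)+j], as in the text for sx */
    return k;                                 /* expected to equal ip */
}

/* Element (i, j) of Q, 1-based, stored at q[(i-1)*tdq + j-1]. */
static double q_elem(const double q[], int tdq, int i, int j)
{
    return q[(i - 1) * tdq + (j - 1)];
}

/* R(i, j) for 1 <= i <= j <= ip: R occupies columns 2 to ip+1 of Q,
   while c1(i) is q_elem(q, tdq, i, 1). */
static double r_elem(const double q[], int tdq, int i, int j)
{
    return q_elem(q, tdq, i, j + 1);
}
```

With these conventions, the count returned by `gather_obs` gives a quick check of the constraint linking sx, mean and ip before the function is called.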

6  Error Indicators and Warnings

NE_2_INT_ARG_GT
On entry, ${\mathbf{ix}}=〈\mathit{\text{value}}〉$ while ${\mathbf{nr}}=〈\mathit{\text{value}}〉$. These arguments must satisfy ${\mathbf{ix}}\le {\mathbf{nr}}$.
NE_2_INT_ARG_LT
On entry, ${\mathbf{tdq}}=〈\mathit{\text{value}}〉$ while ${\mathbf{ip}}+1=〈\mathit{\text{value}}〉$. These arguments must satisfy ${\mathbf{tdq}}\ge {\mathbf{ip}}+1$.
On entry, ${\mathbf{tdx}}=〈\mathit{\text{value}}〉$ while ${\mathbf{m}}=〈\mathit{\text{value}}〉$. These arguments must satisfy ${\mathbf{tdx}}\ge {\mathbf{m}}$.
NE_ALLOC_FAIL
Dynamic memory allocation failed.
NE_BAD_PARAM
On entry, mean had an illegal value.
On entry, update had an illegal value.
NE_INT_ARG_LT
On entry, ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ip}}\ge 1$.
On entry, ${\mathbf{ix}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{ix}}\ge 1$.
On entry, ${\mathbf{m}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{m}}\ge 1$.
On entry, ${\mathbf{nr}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{nr}}\ge 1$.
NE_IP_INCOMP_WITH_SX
On entry, for ${\mathbf{mean}}=\mathrm{Nag_MeanInclude}$, number of nonzero values of sx must be equal to ${\mathbf{ip}}-1$: number of nonzero values of ${\mathbf{sx}}=〈\mathit{\text{value}}〉$, ${\mathbf{ip}}-1=〈\mathit{\text{value}}〉$.
On entry, for ${\mathbf{mean}}=\mathrm{Nag_MeanZero}$, number of nonzero values of sx must be equal to ip: number of nonzero values of ${\mathbf{sx}}=〈\mathit{\text{value}}〉$, ${\mathbf{ip}}=〈\mathit{\text{value}}〉$.
NE_MAT_NOT_UPD
The $R$ matrix could not be updated: either an attempt was made to delete a nonexistent observation, or an observation was to be added to an $R$ matrix with a zero diagonal element.
NE_REAL_ARG_LT
On entry, ${\mathbf{rss}}=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{rss}}\ge 0.0$.
On entry, ${\mathbf{wt}}\left[0\right]=〈\mathit{\text{value}}〉$.
Constraint: ${\mathbf{wt}}\left[0\right]\ge 0.0$.
NE_RSS_NOT_UPD
The rss could not be updated because the input rss was less than the calculated decrease in rss when the new observation was deleted.

7  Accuracy

Higher accuracy is achieved by updating the $R$ matrix than by the traditional methods of updating ${X}^{\mathrm{T}}X$.

8  Parallelism and Performance

Not applicable.

9  Further Comments

Care should be taken with the use of this function.
(a) It is possible to delete observations which were not included in the original model.
(b) If several additions/deletions have been performed you are advised to recompute the regression using nag_regsn_mult_linear (g02dac).
(c) Adding or deleting observations can alter the rank of the model. Such changes will only be detected when a call to nag_regsn_mult_linear_upd_model (g02ddc) has been made. nag_regsn_mult_linear_upd_model (g02ddc) should also be used to compute the new residual sum of squares when the model is not of full rank.
nag_regsn_mult_linear_addrem_obs (g02dcc) may also be used after nag_regsn_mult_linear_add_var (g02dec) and nag_regsn_mult_linear_delete_var (g02dfc).

10  Example

A dataset consisting of 12 observations with four independent variables is read in and a general linear regression model fitted by nag_regsn_mult_linear (g02dac) and parameter estimates printed. The last observation is then dropped and the parameter estimates recalculated, using nag_regsn_mult_linear_upd_model (g02ddc), and printed.

10.1  Program Text

Program Text (g02dcce.c)

10.2  Program Data

Program Data (g02dcce.d)

10.3  Program Results

Program Results (g02dcce.r)