g02mc:: Correlation and Regression Analysis (NAG Toolbox)

Description

nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb) fit either a LARS, forward stagewise linear regression, LASSO or positive LASSO model to a vector of n observed values, y = \{y_{i} : i = 1, 2, \dots, n\}, and an n \times p design matrix X, where the jth column of X is given by the jth independent variable x_{j}. The models are fit using the LARS algorithm of Efron et al. (2004).

Figure 1

The full solution path for all four of these models follows a similar pattern, in which the parameter estimate for a given variable is piecewise linear. One such path, for a LARS model with six variables (p = 6), can be seen in Figure 1. Both nag_correg_lars (g02ma) and nag_correg_lars_xtx (g02mb) return the vector of p parameter estimates, β_{k}, at K points along this path (so k = 1, 2, \dots, K). Each point corresponds to a step of the LARS algorithm. The number of steps taken depends on the model being fitted. In the case of a LARS model, K = p and each step corresponds to a new variable being included in the model. In the case of the LASSO models, each step corresponds to either a new variable being included in the model or an existing variable being removed from the model; the value of K is therefore no longer bounded by the number of parameters. For forward stagewise linear regression, each step no longer corresponds to the addition or removal of a variable; therefore the number of possible steps is often markedly greater than for a corresponding LASSO model.

nag_correg_lars_param (g02mc) uses the piecewise linear nature of the solution path to predict the parameter estimates, \tilde{β}, at a different point on this path. The location of the solution can either be defined in terms of a (fractional) step number or a function of the L_{1} norm of the parameter estimates.
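As an illustration of what piecewise linearity implies for a fractional step number, the estimate between two consecutive steps can be written as a convex combination of the estimates at those steps. The symbols \nu (the requested fractional step), t (its fractional part) and \beta_{0} (the estimate before the first step) are introduced here for illustration only and do not appear in the routine's interface. Taking \beta_{0} = 0 is an assumption, although the nk = 0.2 row of the Example output is consistent with it. This is a sketch of the idea, not a statement of how g02mc is implemented.

```latex
% Assumed interpretation of a fractional step number \nu, 0 \le \nu \le K
k = \lfloor \nu \rfloor, \qquad t = \nu - k, \qquad \beta_{0} = 0 \ \text{(assumption)}

\tilde{\beta}(\nu) = (1 - t)\,\beta_{k} + t\,\beta_{k+1}

% Worked instance using the printed Example values for the first parameter,
% with \nu = 4.5 (so k = 4, t = 0.5):
\tilde{\beta}_{1}(4.5) = 0.5 \times (-0.628) + 0.5 \times (-1.060) = -0.844
```

The value -0.844 matches the first entry of the nk = 4.5 row printed in the Example.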

References

Parameters

Compulsory Input Parameters

Optional Input Parameters

Output Parameters

Error Indicators and Warnings

Accuracy

Further Comments

Example

function g02mc_example


fprintf('g02mc example results\n\n');

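% Fit a LAR model via g02ma, with g02ma mean-centring y and
% normalising X about the mean.
% Note: pred and prey are set here but are not passed to g02ma in the
% call below, so that call relies on g02ma's default settings.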
mtype = int64(1);
pred = int64(3);
prey = int64(1);

% Independent variables
d = [10.28  1.77  9.69 15.58  8.23 10.44;
      9.08  8.99 11.53  6.57 15.89 12.58;
     17.98 13.10  1.04 10.45 10.12 16.68;
     14.82 13.79 12.23  7.00  8.14  7.79;
     17.53  9.41  6.24  3.75 13.12 17.08;
      7.78 10.38  9.83  2.58 10.13  4.25;
     11.95 21.71  8.83 11.00 12.59 10.52;
     14.60 10.09 -2.70  9.89 14.67  6.49;
      3.63  9.07 12.59 14.09  9.06  8.19;
      6.35  9.79  9.40 12.79  8.38 16.79;
      4.66  3.55 16.82 13.83 21.39 13.88;
      8.32 14.04 17.17  7.93  7.39 -1.09;
     10.86 13.68  5.75 10.44 10.36 10.06;
      4.76  4.92 17.83  2.90  7.58 11.97;
      5.05 10.41  9.89  9.04  7.90 13.12;
      5.41  9.32  5.27 15.53  5.06 19.84;
      9.77  2.37  9.54 20.23  9.33  8.82;
     14.28  4.34 14.23 14.95 18.16 11.03;
     10.17  6.80  3.17  8.57 16.07 15.93;
      5.39  2.67  6.37 13.56 10.68  7.35];

% Dependent variable
y = [-46.47; -35.80; -129.22;  -42.44; -73.51;
     -26.61; -63.90;  -76.73;  -32.64; -83.29;
     -16.31;  -5.82;  -47.75;   18.38; -54.71;
     -55.62; -45.28;  -22.76; -104.32; -55.94];

% g02ma can issue warnings, but return sensible results,
% so save current warning state and turn warnings on
warn_state = nag_issue_warnings();
nag_issue_warnings(true);

% Call the model fitting routine
[b,fitsum,ifail] = g02ma(mtype,d,y);

% Reset the warning state to its initial value
nag_issue_warnings(warn_state);

% Set how the additional estimates will be specified

% Location of additional parameter estimates (as defined by the
% LARS step number)
ktype = int64(1);
nk = [0.2; 1.2; 3.2; 4.5; 5.2];

% Calculate the additional parameter estimates
[nb,ifail] = g02mc(b,fitsum,ktype,nk);

% Print the results
ip = size(b,1);
K = size(b,2) - 2;
lnk = size(nk,1);

fprintf(' Parameter Estimates from g02ma\n');
fprintf('  Step %s Parameter Estimate\n ',repmat(' ',1,max(ip-2,0)*5));
fprintf(repmat('-',1,5+ip*10));
fprintf('\n');
for k = 1:K
  fprintf('  %3d',k);
  for j = 1:ip
    fprintf(' %9.3f',b(j,k));
  end
  fprintf('\n');
end
fprintf('\n');

fprintf(' Additional Parameter Estimates from g02mc\n');
fprintf('   nk  %s Parameter Estimate\n ',repmat(' ',1,max(ip-2,0)*5));
fprintf(repmat('-',1,5+ip*10));
fprintf('\n');
for k = 1:lnk
  fprintf('  %4.1f',nk(k));
  for j = 1:ip
    fprintf(' %9.3f',nb(j,k));
  end
  fprintf('\n');
end

g02mc example results

 Parameter Estimates from g02ma
  Step                      Parameter Estimate
 -----------------------------------------------------------------
    1     0.000     0.000     3.125     0.000     0.000     0.000
    2     0.000     0.000     3.792     0.000     0.000    -0.713
    3    -0.446     0.000     3.998     0.000     0.000    -1.151
    4    -0.628    -0.295     4.098     0.000     0.000    -1.466
    5    -1.060    -1.056     4.110    -0.864     0.000    -1.948
    6    -1.073    -1.132     4.118    -0.935    -0.059    -1.981

 Additional Parameter Estimates from g02mc
   nk                       Parameter Estimate
 -----------------------------------------------------------------
   0.2     0.000     0.000     0.625     0.000     0.000     0.000
   1.2     0.000     0.000     3.258     0.000     0.000    -0.143
   3.2    -0.483    -0.059     4.018     0.000     0.000    -1.214
   4.5    -0.844    -0.676     4.104    -0.432     0.000    -1.707
   5.2    -1.062    -1.071     4.112    -0.878    -0.012    -1.955
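As a rough cross-check of the output above, the following sketch linearly interpolates the printed g02ma estimates at the fractional step numbers in nk, using only the standard MATLAB function interp1 and no NAG routines. It assumes that the solution path starts from an all-zero estimate at step 0. That is an assumption, although the nk = 0.2 row is consistent with it. The names B, path_est, est and nb_printed are illustrative and are not part of the toolbox. This is not how g02mc is implemented, and because the inputs are copied from the three-decimal printout, agreement can only be expected up to that rounding.

```matlab
% Printed g02ma estimates: rows are steps 1..6, columns are the six
% parameters. Values are copied from the output above (3 decimals only).
B = [ 0.000  0.000  3.125  0.000  0.000  0.000;
      0.000  0.000  3.792  0.000  0.000 -0.713;
     -0.446  0.000  3.998  0.000  0.000 -1.151;
     -0.628 -0.295  4.098  0.000  0.000 -1.466;
     -1.060 -1.056  4.110 -0.864  0.000 -1.948;
     -1.073 -1.132  4.118 -0.935 -0.059 -1.981];

% Assumption: at step 0 every parameter estimate is zero.
steps    = (0:size(B,1))';
path_est = [zeros(1,size(B,2)); B];

% Fractional step numbers used in the example
nk = [0.2; 1.2; 3.2; 4.5; 5.2];

% Linear interpolation along the path; one row of estimates per nk value
est = interp1(steps, path_est, nk);

% Printed g02mc estimates, copied from the output above
nb_printed = [ 0.000  0.000  0.625  0.000  0.000  0.000;
               0.000  0.000  3.258  0.000  0.000 -0.143;
              -0.483 -0.059  4.018  0.000  0.000 -1.214;
              -0.844 -0.676  4.104 -0.432  0.000 -1.707;
              -1.062 -1.071  4.112 -0.878 -0.012 -1.955];

% Largest absolute disagreement between the sketch and the printed values
max_diff = max(abs(est(:) - nb_printed(:)));
fprintf('Largest difference from printed g02mc values: %.4f\n', max_diff);
```

Any difference reported should be of the order of the rounding in the printed tables. A noticeably larger value would suggest that the zero starting point assumed here does not hold for the model being checked.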

NAG Toolbox: nag_correg_lars_param (g02mc)


Purpose

Syntax