NAG Library Routine Document
G13AEF
1 Purpose
G13AEF fits a seasonal autoregressive integrated moving average (ARIMA) model to an observed time series, using a nonlinear least squares procedure incorporating backforecasting. Parameter estimates are obtained, together with appropriate standard errors. The residual series is returned, and the information needed to forecast the time series is produced for use by the routines G13AGF and G13AHF.
The estimation procedure is iterative, starting with initial parameter values such as may be obtained using G13ADF. It continues until a specified convergence criterion is satisfied, or until a specified number of iterations has been carried out. The progress of the procedure can be monitored by means of a user-supplied routine.
2 Specification
SUBROUTINE G13AEF (MR, PAR, NPAR, C, KFC, X, NX, ICOUNT, EX, EXR, AL, IEX, S, G, IGH, SD, H, LDH, ST, IST, NST, PIV, KPIV, NIT, ITC, ZSP, KZSP, ISF, WA, IWA, HC, IFAIL)
INTEGER             MR(7), NPAR, KFC, NX, ICOUNT(6), IEX, IGH, LDH, IST, NST, KPIV, NIT, ITC, KZSP, ISF(4), IWA, IFAIL
REAL (KIND=nag_wp)  PAR(NPAR), C, X(NX), EX(IEX), EXR(IEX), AL(IEX), S, G(IGH), SD(IGH), H(LDH,IGH), ST(IST), ZSP(4), WA(IWA), HC(LDH,IGH)
EXTERNAL            PIV
3 Description
The time series $x_1, x_2, \ldots, x_n$ supplied to G13AEF is assumed to follow a seasonal autoregressive integrated moving average (ARIMA) model defined as follows:
$$\nabla^d \nabla_s^D x_t - c = w_t,$$
where $\nabla^d \nabla_s^D x_t$ is the result of applying non-seasonal differencing of order $d$ and seasonal differencing of seasonality $s$ and order $D$ to the series $x_t$, as outlined in the description of G13AAF. The differenced series is then of length $N = n - d'$, where $d' = d + D \times s$ is the generalized order of differencing. The scalar $c$ is the expected value of the differenced series, and the series $w_1, w_2, \ldots, w_N$ follows a zero-mean stationary autoregressive moving average (ARMA) model defined by a pair of recurrence equations. These express $w_t$ in terms of an uncorrelated series $a_t$, via an intermediate series $e_t$. The first equation describes the seasonal structure:
$$w_t = \Phi_1 w_{t-s} + \Phi_2 w_{t-2 \times s} + \cdots + \Phi_P w_{t-P \times s} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2 \times s} - \cdots - \Theta_Q e_{t-Q \times s}.$$
The second equation describes the non-seasonal structure. If the model is purely non-seasonal the first equation is redundant and $e_t$ above is equated with $w_t$:
$$e_t = \phi_1 e_{t-1} + \phi_2 e_{t-2} + \cdots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \cdots - \theta_q a_{t-q}.$$
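The differencing step can be illustrated with a short Python sketch. It is not part of the NAG Library: the function name `difference` and the sample data are assumptions made purely for illustration, and the order in which the seasonal and non-seasonal operators are applied is immaterial because they commute.

```python
def difference(x, d, D, s, c=0.0):
    """Return w_t = (nabla^d nabla_s^D x_t) - c, a list of length n - (d + D*s)."""
    w = list(x)
    for _ in range(D):
        # Seasonal differencing at lag s.
        w = [w[t] - w[t - s] for t in range(s, len(w))]
    for _ in range(d):
        # Non-seasonal differencing at lag 1.
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    return [v - c for v in w]


x = [1.0, 3.0, 6.0, 10.0, 15.0, 21.0]
w = difference(x, d=1, D=0, s=0)   # one non-seasonal difference, no constant
```

Here `len(w)` is `len(x) - 1`, matching N = n - d′ with d′ = 1.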
Estimates of the model parameters defined by
$$\phi_1, \ldots, \phi_p, \quad \theta_1, \ldots, \theta_q, \quad \Phi_1, \ldots, \Phi_P, \quad \Theta_1, \ldots, \Theta_Q$$
and (optionally) $c$ are obtained by minimizing a quadratic form in the vector $w = (w_1, w_2, \ldots, w_N)'$.
This is $QF = w' V^{-1} w$, where $V$ is the covariance matrix of $w$, and is a function of the model parameters. This matrix is not explicitly evaluated, since $QF$ may be expressed as a ‘sum of squares’ function. When moving average parameters $\theta_i$ or $\Theta_i$ are present, so that the generalized moving average order $q' = q + s \times Q$ is positive, backforecasts $w_{1-q'}, w_{2-q'}, \ldots, w_0$ are introduced as nuisance parameters. The ‘sum of squares’ function may then be written as
$$S(pm) = \sum_{t=1-q'}^{N} a_t^2 + \sum_{t=1-q'-p'}^{-q'} b_t^2,$$
where $pm$ is a combined vector of parameters, consisting of the backforecasts followed by the ARMA model parameters.
The terms $a_t$ correspond to the ARMA model residual series, and $p' = p + s \times P$ is the generalized autoregressive order. The terms $b_t$ are only present if autoregressive parameters are in the model, and serve to correct for transient errors introduced at the start of the autoregression.
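The equations defining $a_t$ and $b_t$ are precisely:
- $e_t = w_t - \Phi_1 w_{t-s} - \Phi_2 w_{t-2 \times s} - \cdots - \Phi_P w_{t-P \times s} + \Theta_1 e_{t-s} + \Theta_2 e_{t-2 \times s} + \cdots + \Theta_Q e_{t-Q \times s}$, for $t = 1-q', 2-q', \ldots, N$.
- $a_t = e_t - \phi_1 e_{t-1} - \phi_2 e_{t-2} - \cdots - \phi_p e_{t-p} + \theta_1 a_{t-1} + \theta_2 a_{t-2} + \cdots + \theta_q a_{t-q}$, for $t = 1-q', 2-q', \ldots, N$.
- $f_t = w_t - \Phi_1 w_{t+s} - \Phi_2 w_{t+2 \times s} - \cdots - \Phi_P w_{t+P \times s} + \Theta_1 f_{t-s} + \Theta_2 f_{t-2 \times s} + \cdots + \Theta_Q f_{t-Q \times s}$, for $t = 1-q'-s \times P, 2-q'-s \times P, \ldots, -q'+P$.
- $b_t = f_t - \phi_1 f_{t+1} - \phi_2 f_{t+2} - \cdots - \phi_p f_{t+p} + \theta_1 b_{t-1} + \theta_2 b_{t-2} + \cdots + \theta_q b_{t-q}$, for $t = 1-q'-p', 2-q'-p', \ldots, -q'$.
For all four of these equations, the following conditions hold:
- $w_i = 0$ if $i < 1-q'$
- $e_i = 0$ if $i < 1-q'$
- $a_i = 0$ if $i < 1-q'$
- $f_i = 0$ if $i < 1-q'-s \times P$
- $b_i = 0$ if $i < 1-q'-p'$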
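The forward recursions for $e_t$ and $a_t$ can be sketched in Python as follows. This is a simplified illustration, not the routine's algorithm: it takes every value before $t = 1$ as zero instead of estimating backforecasts, and it omits the $f_t$ and $b_t$ corrections. The function name `arma_residuals` and its argument names are assumptions made for the example.

```python
def arma_residuals(w, phi, theta, sphi, stheta, s):
    """Compute e_t and a_t from w_t, taking every pre-sample value as zero."""
    N = len(w)
    e = [0.0] * N
    a = [0.0] * N

    def at(series, t):
        # Values before the start of the series are taken as zero.
        return series[t] if t >= 0 else 0.0

    for t in range(N):
        # Seasonal equation:
        #   e_t = w_t - sum_i Phi_i w_{t-i*s} + sum_i Theta_i e_{t-i*s}
        e[t] = (w[t]
                - sum(c * at(w, t - (i + 1) * s) for i, c in enumerate(sphi))
                + sum(c * at(e, t - (i + 1) * s) for i, c in enumerate(stheta)))
        # Non-seasonal equation:
        #   a_t = e_t - sum_i phi_i e_{t-i} + sum_i theta_i a_{t-i}
        a[t] = (e[t]
                - sum(c * at(e, t - (i + 1)) for i, c in enumerate(phi))
                + sum(c * at(a, t - (i + 1)) for i, c in enumerate(theta)))
    return e, a


# Purely non-seasonal AR(1) with phi_1 = 0.5: e_t equals w_t.
e_ar, a_ar = arma_residuals([1.0, 2.0, 3.0], [0.5], [], [], [], 0)
# Purely non-seasonal MA(1) with theta_1 = 0.5.
e_ma, a_ma = arma_residuals([1.0, 2.0, 3.0], [], [0.5], [], [], 0)
```

The signs follow the equations above: autoregressive terms are subtracted and moving average terms are added when solving for the residual.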
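Minimization of $S$ with respect to $pm$ uses an extension of the algorithm of Marquardt (1963).
The first derivatives of $S$ with respect to the parameters are calculated as
$$2 \sum_{t=1-q'}^{N} a_t \times a_{t,i} + 2 \sum_{t=1-q'-p'}^{-q'} b_t \times b_{t,i} = 2 \times G_i,$$
where $a_{t,i}$ and $b_{t,i}$ are derivatives of $a_t$ and $b_t$ with respect to the $i$th parameter.
The second derivative of $S$ is approximated by
$$2 \sum_{t=1-q'}^{N} a_{t,i} \times a_{t,j} + 2 \sum_{t=1-q'-p'}^{-q'} b_{t,i} \times b_{t,j} = 2 \times H_{ij}.$$
Successive parameter iterates are obtained by calculating a vector of corrections $dpm$ by solving the equations
$$(H + \alpha \times D) \times dpm = -G,$$
where $G$ is a vector with elements $G_i$, $H$ is a matrix with elements $H_{ij}$, $\alpha$ is a scalar used to control the search and $D$ is the diagonal matrix of $H$.
The new parameter values are then $pm + dpm$.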
The scalar α controls the step size, to which it is inversely related.
If a step results in new parameter values which give a reduced value of S, then α is reduced by a factor β. If a step results in new parameter values which give an increased value of S, or in ARMA model parameters which in any way contravene the stationarity and invertibility conditions, then the new parameters are rejected, α is increased by the factor β, and the revised equations are solved for a new parameter correction.
This action is repeated until either a reduced value of S is obtained, or α reaches the limit of $10^9$, which is used to indicate a failure of the search procedure.
This failure may be due to a badly conditioned sum of squares function or to too strict a convergence criterion. Convergence is deemed to have occurred if the fractional reduction in the residual sum of squares in successive iterations is less than a value γ, while α<1.0.
The stationarity and invertibility conditions are tested to within a specified tolerance multiple δ of machine accuracy. Upon convergence, or completion of the specified maximum number of iterations without convergence, statistical properties of the estimates are derived. In the latter case the sequence of iterates should be checked to ensure that convergence is adequate for practical purposes, otherwise these properties are not reliable.
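The step-control logic described above can be sketched in Python on a toy two-parameter least squares problem. This is an illustration of the general scheme only, not the code used by G13AEF: the function name `marquardt`, the straight-line test problem and the use of Cramer's rule for the 2×2 solve are assumptions, and the stationarity and invertibility checks are omitted because the toy problem has no such constraints.

```python
def marquardt(residuals, jacobian, p, alpha=0.001, beta=10.0, gamma=1e-7,
              max_iter=25, alpha_limit=1e9):
    """Minimise S(p) = sum of r_t(p)^2 for a two-parameter model."""
    def sum_sq(q):
        return sum(r * r for r in residuals(q))

    s_old = sum_sq(p)
    for itc in range(1, max_iter + 1):
        r, J = residuals(p), jacobian(p)
        # G_i = sum_t r_t * r_{t,i};  H_ij = sum_t r_{t,i} * r_{t,j}
        G = [sum(rt * row[i] for rt, row in zip(r, J)) for i in range(2)]
        H = [[sum(row[i] * row[j] for row in J) for j in range(2)]
             for i in range(2)]
        while True:
            # Solve (H + alpha*D) dp = -G with D = diag(H), by Cramer's rule.
            a11 = H[0][0] * (1.0 + alpha)
            a22 = H[1][1] * (1.0 + alpha)
            a12 = H[0][1]
            det = a11 * a22 - a12 * a12
            dp = [(-G[0] * a22 + G[1] * a12) / det,
                  (-G[1] * a11 + G[0] * a12) / det]
            trial = [p[0] + dp[0], p[1] + dp[1]]
            s_new = sum_sq(trial)
            if s_new < s_old:
                alpha /= beta      # step accepted: permit a larger step next
                break
            alpha *= beta          # step rejected: shorten the step and retry
            if alpha >= alpha_limit:
                return p, s_old, itc, False   # search failure
        reduction = (s_old - s_new) / s_old
        p, s_old = trial, s_new
        if reduction < gamma and alpha < 1.0:
            return p, s_old, itc, True        # converged
    return p, s_old, max_iter, False          # iteration limit reached


# Toy problem: fit y = p0 + p1*t to four points that are not exactly collinear.
t = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 8.0]
res = lambda p: [yi - (p[0] + p[1] * ti) for ti, yi in zip(t, y)]
jac = lambda p: [[-1.0, -ti] for ti in t]
p_hat, s_min, itc, converged = marquardt(res, jac, [0.0, 0.0])
```

The default values of `alpha`, `beta` and `gamma` mirror the defaults quoted for ZSP. The sketch assumes a nonsingular H and a nonzero starting sum of squares; a production implementation would guard both.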
The estimated residual variance is
$$erv = S_{\min} / df,$$
where $S_{\min}$ is the final value of $S$, and the residual number of degrees of freedom is given by
$$df = N - p - q - P - Q \quad (-1 \text{ if } c \text{ is estimated}).$$
The covariance matrix of the vector of estimates $pm$ is given by
$$erv \times H^{-1},$$
where $H$ is evaluated at the final parameter values.
From this expression are derived the vector of standard deviations, and the correlation matrix for the whole parameter set. These are asymptotic approximations.
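As a small numerical illustration of how standard deviations and correlations follow from the covariance matrix, the Python sketch below works through a two-parameter case. It assumes that H denotes the cross-product matrix of first derivatives of the residuals (with no factor of 2), so that the covariance matrix is the residual variance times the inverse of H. The function name `estimate_stats` and the input numbers are hypothetical.

```python
import math


def estimate_stats(s_min, df, H):
    """Return (erv, standard deviations, correlation) for a 2x2 matrix H."""
    erv = s_min / df                      # estimated residual variance
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    inv = [[H[1][1] / det, -H[0][1] / det],
           [-H[1][0] / det, H[0][0] / det]]
    cov = [[erv * v for v in row] for row in inv]
    sd = [math.sqrt(cov[i][i]) for i in range(2)]
    corr = cov[0][1] / (sd[0] * sd[1])    # correlation between the estimates
    return erv, sd, corr


erv, sd, corr = estimate_stats(s_min=0.3, df=2, H=[[4.0, 6.0], [6.0, 14.0]])
```

For these inputs the variance of the first estimate is 0.15 × 0.7 = 0.105 and that of the second is 0.15 × 0.2 = 0.03.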
The differenced series $w_t$ (now uncorrected for the constant), intermediate series $e_t$ and residual series $a_t$ are all available upon completion of the iterations over the range (extended by backforecasts)
$$t = 1-q', 2-q', \ldots, N.$$
The values $a_t$ can only properly be interpreted as residuals for $t \ge 1 + p' - q'$, as the earlier values are corrupted by transients if $p' > 0$.
In consequence of the manner in which differencing is implemented, the residual $a_t$ is the one step ahead forecast error for $x_{t+d'}$.
For convenient application in forecasting, the following quantities constitute the ‘state set’, which contains the minimum amount of time series information needed to construct forecasts:
(i) the differenced series $w_t$, for $N - s \times P < t \le N$,
(ii) the $d'$ values required to reconstitute the original series $x_t$ from the differenced series $w_t$,
(iii) the intermediate series $e_t$, for $N - \max(p, Q \times s) < t \le N$,
(iv) the residual series $a_t$, for $N - q < t \le N$.
This state set is available upon completion of the iterations. The routine may be used purely for the construction of this state set, given a previously estimated model and time series $x_t$, by requesting zero iterations. Backforecasts are estimated, but the model parameter values are unchanged. If later observations become available and it is desired to update the state set, G13AGF can be used.
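Counting the four components listed above gives the size of the state set, which agrees with the lower bound stated for IST in Section 5. A minimal Python sketch of that count follows; the function name `state_set_size` is an assumption made for illustration.

```python
def state_set_size(mr):
    """Number of state set values for orders mr = (p, d, q, P, D, Q, s)."""
    p, d, q, P, D, Q, s = mr
    n_w = P * s              # (i)   differenced series values
    n_recon = d + D * s      # (ii)  reconstitution values, d'
    n_e = max(p, Q * s)      # (iii) intermediate series values
    n_a = q                  # (iv)  residual series values
    return n_w + n_recon + n_e + n_a


nst_example = state_set_size((1, 1, 2, 0, 0, 0, 0))     # orders used in Section 9
nst_seasonal = state_set_size((0, 1, 1, 0, 1, 1, 12))   # a monthly seasonal model
```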
4 References
Box G E P and Jenkins G M (1976) Time Series Analysis: Forecasting and Control (Revised Edition) Holden–Day
Marquardt D W (1963) An algorithm for least-squares estimation of nonlinear parameters J. Soc. Indust. Appl. Math. 11 431
5 Parameters
- 1: MR(7) – INTEGER array, Input
On entry: the orders vector (p, d, q, P, D, Q, s) of the ARIMA model whose parameters are to be estimated. p, q, P and Q refer respectively to the number of autoregressive (ϕ), moving average (θ), seasonal autoregressive (Φ) and seasonal moving average (Θ) parameters. d, D and s refer respectively to the order of non-seasonal differencing, the order of seasonal differencing and the seasonal period.
Constraints:
- p, d, q, P, D, Q, s ≥ 0;
- p+q+P+Q > 0;
- s ≠ 1;
- if s=0, P+D+Q=0;
- if s>1, P+D+Q>0;
- d+s×(P+D) ≤ n;
- p+d-q+s×(P+D-Q) ≤ n.
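The constraints on MR can be checked before calling the routine. The Python sketch below is an illustration only: the function name `check_orders` is an assumption, and the last two constraints are read as d+s×(P+D) ≤ n and p+d-q+s×(P+D-Q) ≤ n.

```python
def check_orders(mr, n):
    """Return a list of violated MR constraints (empty if MR is acceptable)."""
    p, d, q, P, D, Q, s = mr
    problems = []
    if min(mr) < 0:
        problems.append("all orders must be non-negative")
    if p + q + P + Q <= 0:
        problems.append("p+q+P+Q must be positive")
    if s == 1:
        problems.append("s must not equal 1")
    if s == 0 and P + D + Q != 0:
        problems.append("P+D+Q must be 0 when s=0")
    if s > 1 and P + D + Q <= 0:
        problems.append("P+D+Q must be positive when s>1")
    if d + s * (P + D) > n:
        problems.append("d+s*(P+D) must not exceed n")
    if p + d - q + s * (P + D - Q) > n:
        problems.append("p+d-q+s*(P+D-Q) must not exceed n")
    return problems


ok = check_orders((1, 1, 2, 0, 0, 0, 0), 30)      # orders used in Section 9
bad = check_orders((0, 1, 1, 0, 1, 1, 1), 100)    # seasonal period of 1
```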
- 2: PAR(NPAR) – REAL (KIND=nag_wp) array, Input/Output
On entry: the initial estimates of the p values of the ϕ parameters, the q values of the θ parameters, the P values of the Φ parameters and the Q values of the Θ parameters, in that order.
On exit: the latest values of the estimates of these parameters.
- 3: NPAR – INTEGER, Input
On entry: the total number of ϕ, θ, Φ and Θ parameters to be estimated.
Constraint: NPAR=p+q+P+Q.
- 4: C – REAL (KIND=nag_wp), Input/Output
On entry: if KFC=0, C must contain the expected value, c, of the differenced series. If KFC=1, C must contain an initial estimate of c.
On exit: if KFC=0, C is unchanged. If KFC=1, C contains the latest estimate of c.
Therefore, if C and KFC are both zero on entry, there is no constant correction.
- 5: KFC – INTEGER, Input
On entry: must be set to 1 if the constant, c, is to be estimated and 0 if it is to be held fixed at its initial value.
Constraint: KFC=0 or 1.
- 6: X(NX) – REAL (KIND=nag_wp) array, Input
On entry: the n values of the original undifferenced time series.
- 7: NX – INTEGER, Input
On entry: n, the length of the original undifferenced time series.
- 8: ICOUNT(6) – INTEGER array, Output
On exit: the sizes of various output arrays.
- ICOUNT(1) contains q+Q×s, the number of backforecasts.
- ICOUNT(2) contains n-d-D×s, the number of differenced values.
- ICOUNT(3) contains d+D×s, the number of values of reconstitution information.
- ICOUNT(4) contains n+q+Q×s, the number of values held in each of the series EX, EXR and AL.
- ICOUNT(5) contains n-d-D×s-p-q-P-Q-KFC, the number of degrees of freedom associated with S.
- ICOUNT(6) contains ICOUNT(1)+NPAR+KFC, the number of parameters being estimated.
These values are always computed regardless of the exit value of IFAIL.
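Because several array dimensions depend on these counts, it can be convenient to compute them before the call. The following Python sketch evaluates the six formulas above; the function name `icount` is an assumption made for illustration.

```python
def icount(mr, npar, kfc, n):
    """Evaluate the six ICOUNT values from the orders, NPAR, KFC and n."""
    p, d, q, P, D, Q, s = mr
    c1 = q + Q * s                               # backforecasts
    c2 = n - d - D * s                           # differenced values
    c3 = d + D * s                               # reconstitution values
    c4 = n + q + Q * s                           # length of EX, EXR and AL
    c5 = n - d - D * s - p - q - P - Q - kfc     # degrees of freedom for S
    c6 = c1 + npar + kfc                         # parameters being estimated
    return [c1, c2, c3, c4, c5, c6]


# Orders used in Section 9: p=1, d=1, q=2, no seasonal terms, constant estimated.
counts = icount((1, 1, 2, 0, 0, 0, 0), npar=3, kfc=1, n=30)
```

For this case the constraints stated in this section give IEX ≥ 32, IGH ≥ 6 and LDH ≥ 7.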
- 9: EX(IEX) – REAL (KIND=nag_wp) array, Output
On exit: the extended differenced series, which is made up of:
ICOUNT(1) backforecast values of the differenced series;
ICOUNT(2) actual values of the differenced series;
ICOUNT(3) values of reconstitution information.
The total number of these values held in EX is ICOUNT(4).
If the routine exits because of a faulty input parameter, the contents of EX will be indeterminate.
- 10: EXR(IEX) – REAL (KIND=nag_wp) array, Output
On exit: the values of the model residuals, which are made up of:
ICOUNT(1) residuals corresponding to the backforecasts in the differenced series;
ICOUNT(2) residuals corresponding to the actual values in the differenced series.
The remaining ICOUNT(3) values contain zeros.
If the routine exits with IFAIL holding a value other than 0 or 9, the contents of EXR will be indeterminate.
- 11: AL(IEX) – REAL (KIND=nag_wp) array, Output
On exit: the intermediate series, which is made up of:
ICOUNT(1) intermediate series values corresponding to the backforecasts in the differenced series;
ICOUNT(2) intermediate series values corresponding to the actual values in the differenced series.
The remaining ICOUNT(3) values contain zeros.
If the routine exits with IFAIL≠0, the contents of AL will be indeterminate.
- 12: IEX – INTEGER, Input
On entry: the dimension of the arrays EX, EXR and AL as declared in the (sub)program from which G13AEF is called.
Constraint: IEX ≥ q+Q×s+n, which is equivalent to the exit value of ICOUNT(4).
- 13: S – REAL (KIND=nag_wp), Output
On exit: the residual sum of squares after the latest series of parameter estimates has been incorporated into the model. If the routine exits with a faulty input parameter, S contains zero.
- 14: G(IGH) – REAL (KIND=nag_wp) array, Output
On exit: the latest value of the derivatives of S with respect to each of the parameters being estimated (backforecasts, PAR parameters, and where relevant the constant, in that order). The contents of G will be indeterminate if the routine exits with a faulty input parameter.
- 15: IGH – INTEGER, Input
On entry: the dimension of the arrays G and SD and the second dimension of the arrays H and HC as declared in the (sub)program from which G13AEF is called.
Constraint: IGH ≥ q+Q×s+NPAR+KFC, which is equivalent to the exit value of ICOUNT(6).
- 16: SD(IGH) – REAL (KIND=nag_wp) array, Output
On exit: the standard deviations corresponding to each of the parameters being estimated (backforecasts, PAR parameters, and where relevant the constant, in that order).
If the routine exits with IFAIL containing a value other than 0 or 9, or if the required number of iterations is zero, the contents of SD will be indeterminate.
- 17: H(LDH,IGH) – REAL (KIND=nag_wp) array, Output
On exit: H holds
(a) the latest values of an approximation to the second derivative of S with respect to each of the q+Q×s+NPAR+KFC parameters being estimated (backforecasts, PAR parameters, and where relevant the constant, in that order), and
(b) the correlation coefficients relating to each pair of these parameters.
These are held in a matrix defined by the first q+Q×s+NPAR+KFC rows and the first q+Q×s+NPAR+KFC columns of H. (Note that ICOUNT(6) contains the value of this expression.) The values of (a) are contained in the upper triangle, and the values of (b) in the strictly lower triangle.
These correlation coefficients are zero during intermediate printout using PIV, and indeterminate if IFAIL contains on exit a value other than 0 or 9.
All the contents of H are indeterminate if the required number of iterations is zero. The (q+Q×s+NPAR+KFC+1)th row of H is used internally as workspace.
- 18: LDH – INTEGER, Input
On entry: the first dimension of the arrays H and HC as declared in the (sub)program from which G13AEF is called.
Constraint: LDH ≥ 1+q+Q×s+NPAR+KFC, which is one more than the exit value of ICOUNT(6).
- 19: ST(IST) – REAL (KIND=nag_wp) array, Output
On exit: the NST values of the state set array. If the routine exits with IFAIL containing a value other than 0 or 9, the contents of ST will be indeterminate.
- 20: IST – INTEGER, Input
On entry: the dimension of the array ST as declared in the (sub)program from which G13AEF is called.
Constraint: IST ≥ P×s+d+D×s+q+max(p, Q×s).
- 21: NST – INTEGER, Output
On exit: the number of values in the state set array ST.
- 22: PIV – SUBROUTINE, supplied by the NAG Library or the user. External Procedure
PIV is used to monitor the progress of the optimization.
The specification of PIV is:
SUBROUTINE PIV (MR, PAR, NPAR, C, KFC, ICOUNT, S, G, H, LDH, IGH, ITC, ZSP)
INTEGER             MR(7), NPAR, KFC, ICOUNT(6), LDH, IGH, ITC
REAL (KIND=nag_wp)  PAR(NPAR), C, S, G(IGH), H(LDH,IGH), ZSP(4)
PIV is called on each iteration by G13AEF when the input value of KPIV is nonzero and is bypassed when it is 0.
The routine G13AFZ may be used as PIV. It prints the heading
G13AFZ MONITORING OUTPUT - ITERATION n
followed by the parameter values and the residual sum of squares. Output is directed to the advisory channel defined by X04ABF.
- 1: MR(7) – INTEGER array, Input
- 2: PAR(NPAR) – REAL (KIND=nag_wp) array, Input
- 3: NPAR – INTEGER, Input
- 4: C – REAL (KIND=nag_wp), Input
- 5: KFC – INTEGER, Input
- 6: ICOUNT(6) – INTEGER array, Input
- 7: S – REAL (KIND=nag_wp), Input
- 8: G(IGH) – REAL (KIND=nag_wp) array, Input
- 9: H(LDH,IGH) – REAL (KIND=nag_wp) array, Input
- 10: LDH – INTEGER, Input
- 11: IGH – INTEGER, Input
- 12: ITC – INTEGER, Input
- 13: ZSP(4) – REAL (KIND=nag_wp) array, Input
On entry: all the parameters are defined as for G13AEF itself.
PIV must either be a module subprogram USEd by, or declared as EXTERNAL in, the (sub)program from which G13AEF is called. Parameters denoted as Input must not be changed by this procedure.
If KPIV=0 a dummy PIV must be supplied.
- 23: KPIV – INTEGER, Input
On entry: must be nonzero if the progress of the optimization is to be monitored using PIV. Otherwise KPIV must contain 0.
- 24: NIT – INTEGER, Input
On entry: the maximum number of iterations to be performed.
Constraint: NIT≥0.
- 25: ITC – INTEGER, Output
On exit: the number of iterations performed.
- 26: ZSP(4) – REAL (KIND=nag_wp) array, Input/Output
On entry: when KZSP=1, the first four elements of ZSP must contain the four values used to guide the search procedure. These are as follows.
ZSP(1) contains α, the value used to constrain the magnitude of the search procedure steps.
ZSP(2) contains β, the multiplier which regulates the value α.
ZSP(3) contains δ, the value of the stationarity and invertibility test tolerance factor.
ZSP(4) contains γ, the value of the convergence criterion.
If KZSP≠1 on entry, default values for ZSP are supplied by the routine.
These are 0.001, 10.0, 1000.0 and max(100×machine precision, 0.0000001) respectively.
On exit: ZSP contains the values, default or otherwise, used by the routine.
Constraint: if KZSP=1, ZSP(1)>0.0, ZSP(2)>1.0, ZSP(3)≥1.0, 0≤ZSP(4)<1.0.
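The defaults and constraints for ZSP can be summarised in a short Python sketch. It is an illustration only: the function name `resolve_zsp` is an assumption, and Python's `sys.float_info.epsilon` stands in for the library's machine precision, which may be defined slightly differently. In double precision either definition leaves 0.0000001 as the larger value.

```python
import sys


def resolve_zsp(kzsp, zsp=None, machine_precision=sys.float_info.epsilon):
    """Return [alpha, beta, delta, gamma], applying defaults unless kzsp == 1."""
    if kzsp != 1:
        # Default values quoted in the text above.
        return [0.001, 10.0, 1000.0, max(100.0 * machine_precision, 0.0000001)]
    alpha, beta, delta, gamma = zsp
    if not (alpha > 0.0 and beta > 1.0 and delta >= 1.0 and 0.0 <= gamma < 1.0):
        # G13AEF reports this condition as IFAIL=3.
        raise ValueError("ZSP violates the documented constraints")
    return [alpha, beta, delta, gamma]


defaults = resolve_zsp(0)
custom = resolve_zsp(1, [0.01, 5.0, 100.0, 1e-5])
```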
- 27: KZSP – INTEGER, Input
On entry: the value 1 if the routine is to use the input values of ZSP, and any other value if the default values of ZSP are to be used.
- 28: ISF(4) – INTEGER array, Output
On exit: contains success/failure indicators, one for each of the four types of parameter in the model (autoregressive, moving average, seasonal autoregressive, seasonal moving average), in that order.
Each indicator has the interpretation:
-2: On entry parameters of this type have initial estimates which do not satisfy the stationarity or invertibility test conditions.
-1: The search procedure has failed to converge because the latest set of parameter estimates of this type is invalid.
0: No parameter of this type is in the model.
1: Valid final estimates for parameters of this type have been obtained.
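When IFAIL=10 is returned, the ISF values identify which parameter type caused the problem. The Python sketch below turns an ISF array into readable text. It assumes the indicator values are -2, -1, 0 and 1 in the order listed above; the names `ISF_MEANING`, `PARAMETER_TYPES` and `describe_isf` are hypothetical.

```python
ISF_MEANING = {
    -2: "initial estimates fail the stationarity or invertibility test",
    -1: "search failed because the latest estimates are invalid",
    0: "no parameter of this type is in the model",
    1: "valid final estimates obtained",
}

PARAMETER_TYPES = (
    "autoregressive",
    "moving average",
    "seasonal autoregressive",
    "seasonal moving average",
)


def describe_isf(isf):
    """Pair each of the four ISF indicators with its parameter type."""
    return [f"{name}: {ISF_MEANING[code]}"
            for name, code in zip(PARAMETER_TYPES, isf)]


report = describe_isf([1, -1, 0, 0])
```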
- 29: WA(IWA) – REAL (KIND=nag_wp) array, Workspace
- 30: IWA – INTEGER, Input
On entry: the dimension of the array WA as declared in the (sub)program from which G13AEF is called.
Constraint: IWA ≥ F1×F2+9×NPAR,
where F1=NX+1+p+P×s+q+Q×s
and
F2=8 if KFC=1;
F2=7 if KFC=0, Q>0;
F2=6 if KFC=0, Q=0, P>0;
F2=5 if KFC=0, Q=0, P=0, q>0;
F2=4 otherwise.
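The workspace bound can be evaluated with a few lines of Python. This is a direct transcription of the formula above; the function name `min_iwa` is an assumption made for illustration.

```python
def min_iwa(mr, npar, kfc, nx):
    """Smallest IWA satisfying IWA >= F1*F2 + 9*NPAR."""
    p, d, q, P, D, Q, s = mr
    f1 = nx + 1 + p + P * s + q + Q * s
    if kfc == 1:
        f2 = 8
    elif Q > 0:
        f2 = 7
    elif P > 0:
        f2 = 6
    elif q > 0:
        f2 = 5
    else:
        f2 = 4
    return f1 * f2 + 9 * npar


# Orders used in Section 9 with 30 observations.
iwa_with_constant = min_iwa((1, 1, 2, 0, 0, 0, 0), npar=3, kfc=1, nx=30)
iwa_fixed_constant = min_iwa((1, 1, 2, 0, 0, 0, 0), npar=3, kfc=0, nx=30)
```

With the constant estimated, F1=34 and F2=8, so the bound is 34×8+27=299.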
- 31: HC(LDH,IGH) – REAL (KIND=nag_wp) array, Workspace
- 32: IFAIL – INTEGER, Input/Output
On entry: IFAIL must be set to 0, -1 or 1. If you are unfamiliar with this parameter you should refer to Section 3.3 in the Essential Introduction for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value -1 or 1 is recommended. If the output of error messages is undesirable, then the value 1 is recommended. Otherwise, because for this routine the values of the output parameters may be useful even if IFAIL≠0 on exit, the recommended value is -1.
When the value -1 or 1 is used it is essential to test the value of IFAIL on exit.
On exit: IFAIL=0 unless the routine detects an error or a warning has been flagged (see Section 6).
6 Error Indicators and Warnings
If on entry IFAIL=0 or -1, explanatory error messages are output on the current error message unit (as defined by X04AAF).
Note: G13AEF may return useful information for one or more of the following detected errors or warnings.
Errors or warnings detected by the routine:
- IFAIL=1
On entry, NPAR≠p+q+P+Q,
or the orders vector MR is invalid (check it against the constraints in Section 5),
or KFC≠0 or 1.
- IFAIL=2
On entry, NX-d-D×s ≤ NPAR+KFC, i.e., the number of terms in the differenced series is not greater than the number of parameters in the model. The model is over-parameterised.
- IFAIL=3
On entry, one or more of the user-supplied criteria for controlling the iterative process are invalid:
NIT<0,
or if KZSP=1, ZSP(1)≤0.0;
or if KZSP=1, ZSP(2)≤1.0;
or if KZSP=1, ZSP(3)<1.0;
or if KZSP=1, ZSP(4)<0.0;
or if KZSP=1, ZSP(4)≥1.0.
- IFAIL=4
On entry, the state set array ST is too small. The output value of NST contains the required value (see the description of IST in Section 5 for the formula).
- IFAIL=5
On entry, the workspace array WA is too small. Check the value of IWA against the constraints in Section 5.
- IFAIL=6
On entry, IEX<q+Q×s+NX,
or IGH<q+Q×s+NPAR+KFC,
or LDH≤q+Q×s+NPAR+KFC.
- IFAIL=7
This indicates a failure in the search procedure, with ZSP(1)≥1.0E09.
Some output parameters may contain meaningful values; see Section 5 for details.
- IFAIL=8
This indicates a failure to invert H.
Some output parameters may contain meaningful values; see Section 5 for details.
- IFAIL=9
This indicates a failure in F04ASF which is used to solve the equations giving the latest estimates of the backforecasts.
Some output parameters may contain meaningful values; see Section 5 for details.
- IFAIL=10
Satisfactory parameter estimates could not be obtained for all parameter types in the model. Inspect array ISF for further information on the parameter type(s) in error.
7 Accuracy
The computations are believed to be stable.
8 Further Comments
The time taken by G13AEF is approximately proportional to $\mathrm{NX} \times \mathrm{ITC} \times (q + Q \times s + \mathrm{NPAR} + \mathrm{KFC})^2$.
9 Example
The following program reads 30 observations from a time series relating to the rate of the earth's rotation about its polar axis. Differencing of order 1 is applied, and the number of non-seasonal parameters is 3: one autoregressive (ϕ) and two moving average (θ). No seasonal effects are taken into account.
The constant is estimated. Up to 25 iterations are allowed.
The initial estimates of ϕ1, θ1, θ2 and c are zero.
9.1 Program Text
Program Text (g13aefe.f90)
9.2 Program Data
Program Data (g13aefe.d)
9.3 Program Results
Program Results (g13aefe.r)