e04dg minimizes an unconstrained nonlinear function of several variables using a pre-conditioned, limited-memory quasi-Newton conjugate gradient method. First derivatives (or an ‘acceptable’ finite difference approximation to them) are required. It is intended for use on large-scale problems.

# Syntax

C#

    public static void e04dg(
        int n,
        E04.E04DG_OBJFUN objfun,
        out int iter,
        out double objf,
        double[] objgrd,
        double[] x,
        E04.e04dgOptions options,
        out int ifail
    )

Visual Basic

    Public Shared Sub e04dg ( _
        n As Integer, _
        objfun As E04.E04DG_OBJFUN, _
        <OutAttribute> ByRef iter As Integer, _
        <OutAttribute> ByRef objf As Double, _
        objgrd As Double(), _
        x As Double(), _
        options As E04.e04dgOptions, _
        <OutAttribute> ByRef ifail As Integer _
    )

Visual C++

    public:
    static void e04dg(
        int n,
        E04::E04DG_OBJFUN^ objfun,
        [OutAttribute] int% iter,
        [OutAttribute] double% objf,
        array<double>^ objgrd,
        array<double>^ x,
        E04::e04dgOptions^ options,
        [OutAttribute] int% ifail
    )

F#

    static member e04dg :
        n : int *
        objfun : E04.E04DG_OBJFUN *
        iter : int byref *
        objf : float byref *
        objgrd : float[] *
        x : float[] *
        options : E04.e04dgOptions *
        ifail : int byref
        -> unit

#### Parameters

- n
- Type: System.Int32
*On entry*: $n$, the number of variables.
*Constraint*: ${\mathbf{n}}>0$.

- objfun
- Type: NagLibrary.E04.E04DG_OBJFUN
A delegate of type E04DG_OBJFUN. objfun must calculate the objective function $F\left(x\right)$, and possibly its gradient, for a specified $n$-element vector $x$.

- iter
- Type: System.Int32%
*On exit*: the total number of iterations performed.

- objf
- Type: System.Double%
*On exit*: the value of the objective function at the final iterate.

- objgrd
- Type: System.Double[]
An array of size [n].
*On exit*: the gradient of the objective function at the final iterate (or its finite difference approximation).

- x
- Type: System.Double[]
An array of size [n].
*On entry*: an initial estimate of the solution.
*On exit*: the final estimate of the solution.

- options
- Type: NagLibrary.E04.e04dgOptions
An object of type E04.e04dgOptions, used to configure optional parameters to this method.

- ifail
- Type: System.Int32%
*On exit*: ${\mathbf{ifail}}={0}$ unless the method detects an error or a warning has been flagged (see [Error Indicators and Warnings]).

# Description

e04dg is designed to solve unconstrained minimization problems of the form

$$\underset{x\in {R}^{n}}{\mathrm{minimize}}\phantom{\rule{0.25em}{0ex}}F\left(x\right)\text{\hspace{1em} subject to \hspace{1em}}-\infty \le x\le \infty \text{,}$$

where $x$ is an $n$-element vector.

You must supply an initial estimate of the solution.

The method used by e04dg is described in [Algorithmic Details].

# References

Gill P E and Murray W (1979) Conjugate-gradient methods for large-scale nonlinear optimization *Technical Report SOL 79-15* Department of Operations Research, Stanford University

Gill P E, Murray W and Wright M H (1981) *Practical Optimization* Academic Press

# Error Indicators and Warnings

**Note:** e04dg may return useful information for one or more of the following detected errors or warnings.

Errors or warnings detected by the method:

- ${\mathbf{ifail}}<0$

- ${\mathbf{ifail}}=1$
- Not used by this method.

- ${\mathbf{ifail}}=2$
- Not used by this method.

- ${\mathbf{ifail}}=3$
- The limiting number of iterations, as determined by the optional parameter Iteration Limit ($\text{default value}=\mathrm{max}\phantom{\rule{0.125em}{0ex}}\left(50,5n\right)$), has been reached. If the algorithm appears to be making satisfactory progress, then the optional parameter Iteration Limit may be too small. If so, increase its value and rerun e04dg. If the algorithm seems to be making little or no progress, then you should check for incorrect gradients as described under ${\mathbf{ifail}}={7}$.

- ${\mathbf{ifail}}=4$
- The computed upper bound on the step length taken during the linesearch was too small. A rerun with an increased value of the optional parameter Maximum Step Length ($\rho $ say) may be successful unless $\rho \ge {10}^{20}$ (the default value), in which case the current point cannot be improved upon.

- ${\mathbf{ifail}}=5$
- Not used by this method.

- ${\mathbf{ifail}}=6$
- The conditions for an acceptable solution (see parameter ifail in [Parameters]) have not all been met, but a lower point could not be found. If objfun computes the objective function and its gradient correctly, then this may occur because an overly stringent accuracy has been requested, i.e., the value of the optional parameter Optimality Tolerance ($\text{default value}={\epsilon}^{0.8}$) is too small, or because ${\alpha}_{k}\simeq 0$. In this case you should apply the three tests described under ${\mathbf{ifail}}={0}$ to determine whether or not the final solution is acceptable. For a discussion of attainable accuracy see Gill et al. (1981). If many iterations have occurred in which essentially no progress has been made, or e04dg has failed to move from the initial point, objfun may be incorrect. You should refer to the comments below under ${\mathbf{ifail}}={7}$ and check the gradients using the optional parameter Verify ($\text{default value}=0$). Unfortunately, there may be small errors in the objective gradients that cannot be detected by the verification process. Finite difference approximations to first derivatives are catastrophically affected by even small inaccuracies.

- ${\mathbf{ifail}}=7$
- The user-supplied derivatives of the objective function appear to be incorrect. Large errors were found in the derivatives of the objective function. This value of ifail will occur if the verification process indicated that at least one gradient element had no correct figures. You should refer to the printed output to determine which elements are suspected to be in error. As a first step, you should check that the code for the objective values is correct – for example, by computing the function at a point where the correct value is known. However, care should be taken that the chosen point fully tests the evaluation of the function. It is remarkable how often the values $x=0$ or $x=1$ are used to test function evaluation procedures, and how often the special properties of these numbers make the test meaningless. Special care should be used in this test if computation of the objective function involves subsidiary data communicated in storage. Although the first evaluation of the function may be correct, subsequent calculations may be in error because some of the subsidiary data has accidentally been overwritten. Errors in programming the function may be quite subtle in that the function value is almost correct. For example, the function may not be accurate to full precision because of the inaccurate calculation of a subsidiary quantity, or the limited accuracy of data upon which the function depends. A common error on machines where numerical calculations are usually performed in double precision is to include even one single precision constant in the calculation of the function; since some compilers do not convert such constants to double precision, half the correct figures may be lost by such a seemingly trivial error.

- ${\mathbf{ifail}}=8$
- The gradient $\left(g=\frac{\partial F}{\partial x}\right)$ at the starting point ${x}_{0}$ is ‘too small’. More precisely, the value of $g{\left({x}_{0}\right)}^{\mathrm{T}}g\left({x}_{0}\right)$ is less than ${\epsilon}_{r}\left|1+F\left({x}_{0}\right)\right|$, where ${\epsilon}_{r}$ is the value of the optional parameter Function Precision ($\text{default value}={\epsilon}^{0.9}$). The problem should be rerun from a different starting point.

- ${\mathbf{ifail}}=9$
- An input parameter is invalid.
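A derivative check of the kind suggested under ${\mathbf{ifail}}={7}$ can be sketched with central finite differences. The following Python sketch is illustrative only and is not the NAG verification code; the tolerance and step size are assumptions, and the example objective is the one from the Example section.

```python
import math

def objfun(x):
    # Example objective: F = exp(x1) * (4*x1^2 + 2*x2^2 + 4*x1*x2 + 2*x2 + 1)
    x1, x2 = x
    return math.exp(x1) * (4*x1**2 + 2*x2**2 + 4*x1*x2 + 2*x2 + 1)

def objgrd(x):
    # Analytic gradient of the example objective (to be checked)
    x1, x2 = x
    f = objfun(x)
    return [f + math.exp(x1) * (8*x1 + 4*x2),
            math.exp(x1) * (4*x1 + 4*x2 + 2)]

def check_gradient(f, g, x, h=1e-6, tol=1e-4):
    """Compare each analytic gradient element with a central difference."""
    for j in range(len(x)):
        xp = list(x); xp[j] += h
        xm = list(x); xm[j] -= h
        fd = (f(xp) - f(xm)) / (2.0 * h)   # central difference estimate
        if abs(fd - g(x)[j]) > tol * (1.0 + abs(fd)):
            return False
    return True
```

A gradient that fails such a check is a strong hint that objfun is in error, though, as noted above, small inaccuracies may escape detection.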

# Accuracy

On successful exit (${\mathbf{ifail}}={0}$) the accuracy of the solution will be as defined by the optional parameter Optimality Tolerance ($\text{default value}={\epsilon}^{0.8}$).

# Parallelism and Performance

None.

# Further Comments

To evaluate an ‘acceptable’ set of finite difference intervals using e04xa requires $2$ function evaluations per variable for a well-scaled problem and up to $6$ function evaluations per variable for a badly scaled problem.

# Description of Printed Output

This section describes the intermediate printout and final printout produced by e04dg. You can control the level of printed output (see the description of the optional parameter Print Level). Note that the intermediate printout and final printout are produced only if ${\mathbf{Print\; Level}}\ge 10$ (the default).

The following line of summary output ($<80$ characters) is produced at every iteration. In all cases, the values of the quantities are those in effect on completion of the given iteration.

- Itn is the iteration count.

- Step is the step ${\alpha}_{k}$ taken along the computed search direction. On reasonably well-behaved problems, the unit step (i.e., ${\alpha}_{k}=1$) will be taken as the solution is approached.

- Nfun is the cumulative number of evaluations of the objective function needed for the linesearch. Evaluations needed for the verification of the gradients by finite differences are not included. Nfun is printed as a guide to the amount of work required for the linesearch. e04dg will perform at most $11$ function evaluations per iteration.

- Objective is the value of the objective function at ${x}_{k}$.

- Norm G is the Euclidean norm of the gradient of the objective function at ${x}_{k}$.

- Norm X is the Euclidean norm of ${x}_{k}$.

- Norm (X(k-1)-X(k)) is the Euclidean norm of ${x}_{k-1}-{x}_{k}$.

The following describes the printout for each variable.

- Variable gives the name (Varbl) and index $\mathit{j}$, for $\mathit{j}=1,2,\dots ,n$, of the variable.

- Value is the value of the variable at the final iteration.

- Gradient Value is the value of the gradient of the objective function with respect to the $j$th variable at the final iteration.

Numerical values are output with a fixed number of digits; they are not guaranteed to be accurate to this precision.

# Example

This example finds a minimum of the function

$$F={e}^{{x}_{1}}\left(4{x}_{1}^{2}+2{x}_{2}^{2}+4{x}_{1}{x}_{2}+2{x}_{2}+1\right)\text{.}$$

The initial point is

$${x}_{0}={\left(-1.0,1.0\right)}^{\mathrm{T}}\text{,}$$

and $F\left({x}_{0}\right)=1.8394$ (to five figures).

The optimal solution is

$${x}^{*}={\left(0.5,-1.0\right)}^{\mathrm{T}}\text{,}$$

and $F\left({x}^{*}\right)=0$.

Example program (C#): e04dge.cs

# Algorithmic Details

This section contains a description of the method used by e04dg.

e04dg uses a pre-conditioned conjugate gradient method and is based upon algorithm PLMA as described in Section 4.8.3 of Gill and Murray (1979) and Gill et al. (1981).

The algorithm proceeds as follows:

Let ${x}_{0}$ be a given starting point and let $k$ denote the current iteration, starting with $k=0$. The iteration requires ${g}_{k}$, the gradient vector evaluated at ${x}_{k}$, the $k$th estimate of the minimum. At each iteration a vector ${p}_{k}$ (known as the direction of search) is computed and the new estimate ${x}_{k+1}$ is given by ${x}_{k}+{\alpha}_{k}{p}_{k}$, where ${\alpha}_{k}$ (the step length) minimizes the function $F\left({x}_{k}+{\alpha}_{k}{p}_{k}\right)$ with respect to the scalar ${\alpha}_{k}$. The initial step ${\alpha}_{0}$ is taken as

$${\alpha}_{0}=\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left\{1,2\times \left|{F}_{k}-{F}_{\mathrm{est}}\right|/{g}_{k}^{\mathrm{T}}{g}_{k}\right\}\text{,}$$

where ${F}_{\mathrm{est}}$ is a user-supplied estimate of the function value at the solution. If ${F}_{\mathrm{est}}$ is not specified, the unit step length is always chosen for ${\alpha}_{0}$. Subsequent step length estimates are computed using cubic interpolation with safeguards.
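The initial-step rule can be sketched in a few lines of Python. This is illustrative only; `initial_step` is not a NAG routine, and the `None` convention for an unspecified ${F}_{\mathrm{est}}$ is an assumption of this sketch.

```python
def initial_step(f_k, g_k, f_est=None):
    """alpha_0 = min{1, 2*|F_k - F_est| / (g_k^T g_k)}; unit step if F_est absent."""
    if f_est is None:
        return 1.0                       # no estimate supplied: unit step
    gtg = sum(gi * gi for gi in g_k)     # g_k^T g_k
    return min(1.0, 2.0 * abs(f_k - f_est) / gtg)
```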

A quasi-Newton method can be used to compute the search direction ${p}_{k+1}$ by updating the inverse of the approximate Hessian $\left({H}_{k}\right)$ and computing

$${p}_{k+1}=-{H}_{k+1}{g}_{k+1}\text{.}$$ (1)

The updating formula for the approximate inverse is given by

$${H}_{k+1}={H}_{k}-\frac{1}{{y}_{k}^{\mathrm{T}}{s}_{k}}\left({H}_{k}{y}_{k}{s}_{k}^{\mathrm{T}}+{s}_{k}{y}_{k}^{\mathrm{T}}{H}_{k}\right)+\frac{1}{{y}_{k}^{\mathrm{T}}{s}_{k}}\left(1+\frac{{y}_{k}^{\mathrm{T}}{H}_{k}{y}_{k}}{{y}_{k}^{\mathrm{T}}{s}_{k}}\right){s}_{k}{s}_{k}^{\mathrm{T}}\text{,}$$ (2)

where ${y}_{k}={g}_{k+1}-{g}_{k}$ and ${s}_{k}={x}_{k+1}-{x}_{k}={\alpha}_{k}{p}_{k}$.

The method used to obtain the search direction is based upon computing ${p}_{k+1}$ as $-{H}_{k+1}{g}_{k+1}$, where ${H}_{k+1}$ is a matrix obtained by updating the identity matrix with a limited number of quasi-Newton corrections. The storage of an $n$ by $n$ matrix is avoided by storing only the vectors that define the rank-two corrections, hence the term ‘limited-memory’ quasi-Newton method. The precise method depends upon the number of updating vectors stored. For example, the direction obtained with the ‘one-step’ limited memory update is given by (1) using (2) with ${H}_{k}$ equal to the identity matrix, viz.

$${p}_{k+1}=-{g}_{k+1}+\frac{1}{{y}_{k}^{\mathrm{T}}{s}_{k}}\left({s}_{k}^{\mathrm{T}}{g}_{k+1}{y}_{k}+{y}_{k}^{\mathrm{T}}{g}_{k+1}{s}_{k}\right)-\frac{{s}_{k}^{\mathrm{T}}{g}_{k+1}}{{y}_{k}^{\mathrm{T}}{s}_{k}}\left(1+\frac{{y}_{k}^{\mathrm{T}}{y}_{k}}{{y}_{k}^{\mathrm{T}}{s}_{k}}\right){s}_{k}\text{.}$$

Using a limited-memory quasi-Newton formula, such as the one above, guarantees ${p}_{k+1}$ to be a descent direction if all the inner products ${y}_{k}^{\mathrm{T}}{s}_{k}$ are positive for all vectors ${y}_{k}$ and ${s}_{k}$ used in the updating formula.
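The one-step limited-memory direction can be written out term by term without ever forming an $n$ by $n$ matrix. The following Python sketch is illustrative only (pure Python, not the NAG implementation):

```python
def dot(a, b):
    """Inner product of two same-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def one_step_direction(g_new, s, y):
    """p = -g + (s.g*y + y.g*s)/(y.s) - (s.g/(y.s)) * (1 + y.y/(y.s)) * s,
    i.e. the one-step limited-memory update with H_k = I."""
    ys = dot(y, s)                       # y_k^T s_k; must be positive for descent
    sg = dot(s, g_new)                   # s_k^T g_{k+1}
    yg = dot(y, g_new)                   # y_k^T g_{k+1}
    coef = (sg / ys) * (1.0 + dot(y, y) / ys)
    return [-g + (sg * yi + yg * si) / ys - coef * si
            for g, si, yi in zip(g_new, s, y)]
```

With ${y}_{k}^{\mathrm{T}}{s}_{k}>0$ the resulting $p$ satisfies ${p}^{\mathrm{T}}{g}_{k+1}<0$, i.e., it is a descent direction, in line with the guarantee stated above.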