NAG Library Routine Document
d03edf (dim2_ellip_mgrid)
1
Purpose
d03edf solves seven-diagonal systems of linear equations which arise from the discretization of an elliptic partial differential equation on a rectangular region. This routine uses a multigrid technique.
2
Specification
Fortran Interface
Subroutine d03edf (ngx, ngy, lda, a, rhs, ub, maxit, acc, us, u, iout, numit, ifail)
Integer, Intent (In) :: ngx, ngy, lda, maxit, iout
Integer, Intent (Inout) :: ifail
Integer, Intent (Out) :: numit
Real (Kind=nag_wp), Intent (In) :: acc
Real (Kind=nag_wp), Intent (Inout) :: a(lda,7), rhs(lda), ub(ngx*ngy)
Real (Kind=nag_wp), Intent (Out) :: us(lda), u(lda)

C Header Interface
#include <nagmk26.h>
void d03edf_ (const Integer *ngx, const Integer *ngy, const Integer *lda, double a[], double rhs[], double ub[], const Integer *maxit, const double *acc, double us[], double u[], const Integer *iout, Integer *numit, Integer *ifail)

3
Description
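d03edf solves, by multigrid iteration, the seven-point scheme
$${A}_{ij}^{6}{u}_{i-1,j+1}+{A}_{ij}^{7}{u}_{i,j+1}+{A}_{ij}^{3}{u}_{i-1,j}+{A}_{ij}^{4}{u}_{ij}+{A}_{ij}^{5}{u}_{i+1,j}+{A}_{ij}^{1}{u}_{i,j-1}+{A}_{ij}^{2}{u}_{i+1,j-1}={f}_{ij},\quad i=1,2,\dots ,{n}_{x};\ j=1,2,\dots ,{n}_{y},$$
which arises from the discretization of an elliptic partial differential equation of the form
$$\alpha \left(x,y\right){U}_{xx}+\beta \left(x,y\right){U}_{xy}+\gamma \left(x,y\right){U}_{yy}+\delta \left(x,y\right){U}_{x}+\epsilon \left(x,y\right){U}_{y}+\varphi \left(x,y\right)U=\psi \left(x,y\right)$$
and its boundary conditions, defined on a rectangular region. This we write in matrix form as
$$Au=f.$$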
The algorithm is described in separate reports by Wesseling (1982a), Wesseling (1982b) and McCarthy (1983).
Systems of linear equations, matching the seven-point stencil defined above, are solved by a multigrid iteration. You must provide an initial estimate of the solution. A zero guess may be supplied if no better approximation is available.
A ‘smoother’ based on incomplete Crout decomposition is used to eliminate the high frequency components of the error. A restriction operator is then used to map the system on to a sequence of coarser grids. The errors are then smoothed and prolongated (mapped onto successively finer grids). When the finest grid is reached, the approximation to the solution is corrected. The cycle is repeated for maxit iterations or until the required accuracy, acc, is reached.
d03edf automatically determines the number $l$ of possible coarse grids (the ‘levels’ of the multigrid scheme) for a particular problem. In other words, d03edf determines the maximum integer $l$ so that ${n}_{x}$ and ${n}_{y}$ can be expressed in the form
$${n}_{x}=m{2}^{l-1}+1,\quad {n}_{y}=n{2}^{l-1}+1,\quad \text{with }m\ge 2\text{ and }n\ge 2.$$
It should be noted that the rate of convergence improves significantly with the number of levels used (see McCarthy (1983)), so that ${n}_{x}$ and ${n}_{y}$ should be carefully chosen so that ${n}_{x}-1$ and ${n}_{y}-1$ have factors of the form ${2}^{l}$, with $l$ as large as possible. For good convergence the integer $l$ should be at least $2$.
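The number of levels can be estimated before a call is made. The following is a minimal sketch, not the routine's own code: it assumes the decomposition ${n}_{x}=m{2}^{l-1}+1$, ${n}_{y}=n{2}^{l-1}+1$ with $m,n\ge 2$, and the function name multigrid_levels is hypothetical.

```python
def multigrid_levels(ngx: int, ngy: int) -> int:
    """Largest l with ngx = m*2**(l-1) + 1 and ngy = n*2**(l-1) + 1, m, n >= 2.

    Illustrative helper only (assumed decomposition); not part of the NAG Library.
    """
    if ngx < 3 or ngy < 3:
        raise ValueError("ngx and ngy must both be at least 3")
    level = 1
    step = 2  # candidate value of 2**(l-1) for l = level + 1
    # Accept one more level while both (n - 1) values divide by step
    # and the quotients m and n stay at least 2.
    while ((ngx - 1) % step == 0 and (ngy - 1) % step == 0
           and (ngx - 1) // step >= 2 and (ngy - 1) // step >= 2):
        level += 1
        step *= 2
    return level
```

Under this assumption a 65 by 65 grid gives six levels, whereas changing either dimension to 64 leaves only one.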
d03edf has been found to be robust in application but, as with any iterative method, divergence can occur. For a strictly diagonally dominant matrix $A$ no such problem is foreseen. Diagonal dominance of $A$ is not a necessary condition, but should this condition be strongly violated then divergence may occur. The quickest test is to try the routine.
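Strict diagonal dominance can also be checked before the routine is called. The following is a minimal sketch, not part of the NAG Library: the function name is hypothetical, the coefficients are held as rows of seven values, and column 4 is assumed to hold the coefficient of ${u}_{ij}$ (the centre of the stencil). Coefficients that would refer to points outside the grid are assumed to have been set to zero.

```python
def is_strictly_diagonally_dominant(a, centre=4):
    """Check |a[p][centre]| > sum of the other |a[p][k]| in every row p.

    `a` is a sequence of rows, each holding the seven stencil coefficients
    for columns 1..7 (stored here at Python indices 0..6). The default
    centre=4 is an assumption about which column multiplies u_ij.
    """
    c = centre - 1  # convert the 1-based column number to a Python index
    for row in a:
        off_diagonal = sum(abs(v) for k, v in enumerate(row) if k != c)
        if abs(row[c]) <= off_diagonal:
            return False
    return True
```

A result of False does not imply divergence, since dominance is sufficient rather than necessary.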
4
References
McCarthy G J (1983) Investigation into the multigrid code MGD1 Report AERE-R 10889 Harwell
Wesseling P (1982a) MGD1 – a robust and efficient multigrid method Multigrid Methods. Lecture Notes in Mathematics 960 614–630 Springer–Verlag
Wesseling P (1982b) Theoretical aspects of a multigrid method SIAM J. Sci. Statist. Comput. 3 387–407
5
Arguments
 1: $\mathbf{ngx}$ – Integer, Input

On entry: the number of interior grid points in the $x$-direction, ${n}_{x}$. ${\mathbf{ngx}}-1$ should preferably be divisible by as high a power of $2$ as possible.
Constraint: ${\mathbf{ngx}}\ge 3$.
 2: $\mathbf{ngy}$ – Integer, Input

On entry: the number of interior grid points in the $y$-direction, ${n}_{y}$. ${\mathbf{ngy}}-1$ should preferably be divisible by as high a power of $2$ as possible.
Constraint: ${\mathbf{ngy}}\ge 3$.
 3: $\mathbf{lda}$ – Integer, Input

On entry: the first dimension of the array a, which must also be a lower bound for the dimension of the arrays rhs, us and u as declared in the (sub)program from which d03edf is called. It is always sufficient to set ${\mathbf{lda}}\ge \left(4\times \left({\mathbf{ngx}}+1\right)\times \left({\mathbf{ngy}}+1\right)\right)/3$, but slightly smaller values may be permitted, depending on the values of ngx and ngy. If on entry, lda is too small, an error message gives the minimum permitted value. (lda must be large enough to allow space for the coarse-grid approximations.)
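The sufficient value of lda can be computed directly from ngx and ngy. The sketch below is illustrative only: the function name is hypothetical, and rounding the quotient up to the next integer is a conservative choice made here, not something the routine requires.

```python
def sufficient_lda(ngx: int, ngy: int) -> int:
    """Smallest integer not less than 4*(ngx+1)*(ngy+1)/3."""
    numerator = 4 * (ngx + 1) * (ngy + 1)
    return -(-numerator // 3)  # ceiling division using integers only
```

The minimum value accepted by the routine may be slightly smaller than this bound.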
 4: $\mathbf{a}\left({\mathbf{lda}},7\right)$ – Real (Kind=nag_wp) array, Input/Output

On entry: ${\mathbf{a}}\left(\mathit{i}+\left(\mathit{j}-1\right)\times {\mathbf{ngx}},\mathit{k}\right)$ must be set to ${A}_{\mathit{i}\mathit{j}}^{\mathit{k}}$, for $\mathit{i}=1,2,\dots ,{\mathbf{ngx}}$, $\mathit{j}=1,2,\dots ,{\mathbf{ngy}}$ and $\mathit{k}=1,2,\dots ,7$.
On exit: a is overwritten.
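The grid-to-array mapping $p=i+\left(j-1\right)\times {\mathbf{ngx}}$, which is shared by a, rhs, ub and u, can be sketched as follows. Both helper names are hypothetical, and all indices are kept 1-based to match the Fortran convention of this document.

```python
def flat_index(i: int, j: int, ngx: int) -> int:
    """1-based position p = i + (j-1)*ngx of grid point (i, j)."""
    return i + (j - 1) * ngx


def grid_point(p: int, ngx: int) -> tuple:
    """Inverse of flat_index: recover the 1-based pair (i, j) from p."""
    j, i = divmod(p - 1, ngx)
    return i + 1, j + 1
```

The $x$ index varies fastest, so consecutive elements of each array run along a grid row of constant $j$.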
 5: $\mathbf{rhs}\left({\mathbf{lda}}\right)$ – Real (Kind=nag_wp) array, Input/Output

On entry: ${\mathbf{rhs}}\left(\mathit{i}+\left(\mathit{j}-1\right)\times {\mathbf{ngx}}\right)$ must be set to ${f}_{\mathit{i}\mathit{j}}$, for $\mathit{i}=1,2,\dots ,{\mathbf{ngx}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{ngy}}$.
On exit: the first ${\mathbf{ngx}}\times {\mathbf{ngy}}$ elements are unchanged and the rest of the array is used as workspace.
 6: $\mathbf{ub}\left({\mathbf{ngx}}\times {\mathbf{ngy}}\right)$ – Real (Kind=nag_wp) array, Input/Output

On entry: ${\mathbf{ub}}\left(i+\left(j-1\right)\times {\mathbf{ngx}}\right)$ must be set to the initial estimate for the solution ${u}_{ij}$.
On exit: the corresponding component of the residual $r=f-Au$.
 7: $\mathbf{maxit}$ – Integer, Input

On entry: the maximum permitted number of multigrid iterations. If ${\mathbf{maxit}}=0$, no multigrid iterations are performed, but the coarse-grid approximations and incomplete Crout decompositions are computed, and may be output if iout is set accordingly.
Constraint: ${\mathbf{maxit}}\ge 0$.
 8: $\mathbf{acc}$ – Real (Kind=nag_wp), Input

On entry: the required tolerance for convergence of the residual $2$-norm:
$${\Vert r\Vert }_{2}=\sqrt{\sum _{k=1}^{{\mathbf{ngx}}\times {\mathbf{ngy}}}{\left({r}_{k}\right)}^{2}}$$
where $r=f-Au$ and $u$ is the computed solution. Note that the norm is not scaled by the number of equations. The routine will stop after fewer than maxit iterations if the residual $2$-norm is less than the specified tolerance. (If ${\mathbf{maxit}}>0$, at least one iteration is always performed.)
If on entry ${\mathbf{acc}}=0.0$, the machine precision is used as a default value for the tolerance; if ${\mathbf{acc}}>0.0$, but acc is less than the machine precision, the routine will stop when the residual $2$-norm is less than the machine precision and ifail will be set to $4$.
Constraint: ${\mathbf{acc}}\ge 0.0$.
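The rules for acc can be sketched as follows. This is an illustration under stated assumptions rather than the routine's implementation: both function names are hypothetical, and Python's float epsilon is used only as a stand-in for the machine precision, which the NAG Library defines for itself.

```python
import math
import sys


def residual_two_norm(r):
    """Unscaled 2-norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in r))


def effective_tolerance(acc, machine_precision=sys.float_info.epsilon):
    """Tolerance actually applied, following the rules described for acc.

    machine_precision defaults to Python's float epsilon, which is only a
    stand-in; the NAG Library defines its own machine precision value.
    """
    if acc < 0.0:
        raise ValueError("acc must be non-negative")
    # acc = 0 and 0 < acc < machine precision both fall back to the
    # machine precision; larger values of acc are used unchanged.
    return max(acc, machine_precision)
```

Because the norm is not scaled, it grows with the grid size: a residual with every component equal to $1$ on a $65\times 65$ grid has $2$-norm $65$, so a tolerance chosen for one grid may need adjusting for another.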
 9: $\mathbf{us}\left({\mathbf{lda}}\right)$ – Real (Kind=nag_wp) array, Output

On exit: the residual $2$-norm, stored in element ${\mathbf{us}}\left(1\right)$.
 10: $\mathbf{u}\left({\mathbf{lda}}\right)$ – Real (Kind=nag_wp) array, Output

On exit: the computed solution ${u}_{\mathit{i}\mathit{j}}$ is returned in ${\mathbf{u}}\left(\mathit{i}+\left(\mathit{j}-1\right)\times {\mathbf{ngx}}\right)$, for $\mathit{i}=1,2,\dots ,{\mathbf{ngx}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{ngy}}$.
 11: $\mathbf{iout}$ – Integer, Input

On entry: controls the output of printed information to the advisory message unit as returned by x04abf:
 ${\mathbf{iout}}=0$
 No output.
 ${\mathbf{iout}}=1$
 The solution ${u}_{\mathit{i}\mathit{j}}$, for $\mathit{i}=1,2,\dots ,{\mathbf{ngx}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{ngy}}$.
 ${\mathbf{iout}}=2$
 The residual $2$-norm after each iteration, with the reduction factor over the previous iteration.
 ${\mathbf{iout}}=3$
 As for ${\mathbf{iout}}=1$ and ${\mathbf{iout}}=2$.
 ${\mathbf{iout}}=4$
 As for ${\mathbf{iout}}=3$, plus the final residual (as returned in ub).
 ${\mathbf{iout}}=5$
 As for ${\mathbf{iout}}=4$, plus the initial elements of a and rhs.
 ${\mathbf{iout}}=6$
 As for ${\mathbf{iout}}=5$, plus the Galerkin coarse grid approximations.
 ${\mathbf{iout}}=7$
 As for ${\mathbf{iout}}=6$, plus the incomplete Crout decompositions.
 ${\mathbf{iout}}=8$
 As for ${\mathbf{iout}}=7$, plus the residual after each iteration.
The elements ${\mathbf{a}}\left(p,k\right)$, the Galerkin coarse grid approximations and the incomplete Crout decompositions are output in the format:
 Y-index $=j$
 X-index $=i$  ${\mathbf{a}}\left(p,1\right)$  ${\mathbf{a}}\left(p,2\right)$  ${\mathbf{a}}\left(p,3\right)$  ${\mathbf{a}}\left(p,4\right)$  ${\mathbf{a}}\left(p,5\right)$  ${\mathbf{a}}\left(p,6\right)$  ${\mathbf{a}}\left(p,7\right)$
 where $p=\mathit{i}+\left(\mathit{j}-1\right)\times {\mathbf{ngx}}$, for $\mathit{i}=1,2,\dots ,{\mathbf{ngx}}$ and $\mathit{j}=1,2,\dots ,{\mathbf{ngy}}$.
The vectors ${\mathbf{u}}\left(p\right)$, ${\mathbf{ub}}\left(p\right)$, ${\mathbf{rhs}}\left(p\right)$ are output in matrix form with ngy rows and ngx columns. Where ${\mathbf{ngx}}>10$, the ngx values for a given $j$ value are produced in rows of $10$. Values of ${\mathbf{iout}}>4$ may yield considerable amounts of output.
Constraint: $0\le {\mathbf{iout}}\le 8$.
 12: $\mathbf{numit}$ – Integer, Output

On exit: the number of iterations performed.
 13: $\mathbf{ifail}$ – Integer, Input/Output

On entry: ifail must be set to $0$, $-1\text{ or }1$. If you are unfamiliar with this argument you should refer to Section 3.4 in How to Use the NAG Library and its Documentation for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value $-1\text{ or }1$ is recommended. If the output of error messages is undesirable, then the value $1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is $0$. When the value $-\mathbf{1}\text{ or }\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit: ${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see Section 6).
6
Error Indicators and Warnings
If on entry ${\mathbf{ifail}}=0$ or $-1$, explanatory error messages are output on the current error message unit (as defined by x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

On entry,  ${\mathbf{ngx}}<3$, 
or  ${\mathbf{ngy}}<3$, 
or  lda is too small, 
or  ${\mathbf{acc}}<0.0$, 
or  ${\mathbf{maxit}}<0$, 
or  ${\mathbf{iout}}<0$, 
or  ${\mathbf{iout}}>8$. 
 ${\mathbf{ifail}}=2$

maxit iterations have been performed with the residual $2$-norm decreasing at each iteration but the residual $2$-norm has not been reduced to less than the specified tolerance (see acc). Examine the progress of the iteration by setting ${\mathbf{iout}}\ge 2$.
 ${\mathbf{ifail}}=3$

As for ${\mathbf{ifail}}={\mathbf{2}}$, except that at one or more iterations the residual $2$-norm did not decrease. It is likely that the method fails to converge for the given matrix $A$.
 ${\mathbf{ifail}}=4$

On entry, acc is less than the machine precision. The routine terminated because the residual norm is less than the machine precision.
 ${\mathbf{ifail}}=-99$
An unexpected error has been triggered by this routine. Please contact NAG.
See Section 3.9 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=-399$
Your licence key may have expired or may not have been installed correctly.
See Section 3.8 in How to Use the NAG Library and its Documentation for further information.
 ${\mathbf{ifail}}=-999$
Dynamic memory allocation failed.
See Section 3.7 in How to Use the NAG Library and its Documentation for further information.
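The distinction between the convergence-related exits (${\mathbf{ifail}}=0$, $2$ and $3$) can be illustrated from a history of residual $2$-norms, such as that printed when ${\mathbf{iout}}\ge 2$. The following sketch is an interpretation of the descriptions above, not the routine's logic: the function name is hypothetical, the first entry of the history is assumed to be the initial residual norm, and the ${\mathbf{ifail}}=4$ case is not modelled.

```python
def classify_outcome(norms, tol):
    """Map a residual-norm history onto the documented exit categories.

    norms[0] is taken as the initial residual 2-norm and norms[k] as the
    value after iteration k (an assumption of this sketch). Returns 0, 2
    or 3 in the spirit of ifail.
    """
    if len(norms) < 2:
        raise ValueError("need the initial norm and at least one iteration")
    if norms[-1] < tol:
        return 0  # tolerance reached
    # Not converged: distinguish steady decrease from stagnation or growth.
    decreasing = all(b < a for a, b in zip(norms, norms[1:]))
    return 2 if decreasing else 3
```

A result of 2 suggests that increasing maxit may help, whereas a result of 3 suggests that the iteration is unlikely to converge for the given matrix.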
7
Accuracy
8
Parallelism and Performance
d03edf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the Users' Note for your implementation for any additional implementation-specific information.
9
Further Comments
The rate of convergence of this routine is strongly dependent upon the number of levels, $l$, in the multigrid scheme, and thus the choice of ngx and ngy is very important. You are advised to experiment with different values of ngx and ngy to see the effect they have on the rate of convergence; for example, using a value such as ${\mathbf{ngx}}=65$ ($={2}^{6}+1$) followed by ${\mathbf{ngx}}=64$ (for which $l=1$).
10
Example
The program solves the elliptic partial differential equation
on the unit square
$0\le x,y\le 1$, with boundary conditions
For the equation to be elliptic,
$\alpha $ must be less than
$2$.
The equation is discretized on a square grid with mesh spacing
$h$ in both directions using the following approximations:
Figure 1
Thus the following equations are solved:
10.1
Program Text
Program Text (d03edfe.f90)
10.2
Program Data
Program Data (d03edfe.d)
10.3
Program Results
Program Results (d03edfe.r)