NAG Library Routine Document

f01bsf factorizes a real sparse matrix using the pivotal sequence previously obtained by f01brf when a matrix of the same sparsity pattern was factorized.

2

Specification

Fortran Interface

Subroutine f01bsf (

n, nz, a, licn, ivect, jvect, icn, ikeep, iw, w, grow, eta, rpmin, abort, idisp, ifail)

Integer, Intent (In)	::	n, nz, licn, ivect(nz), jvect(nz), ikeep(5*n), idisp(2)
Integer, Intent (Inout)	::	icn(licn), ifail
Integer, Intent (Out)	::	iw(5*n)
Real (Kind=nag_wp), Intent (In)	::	eta
Real (Kind=nag_wp), Intent (Inout)	::	a(licn)
Real (Kind=nag_wp), Intent (Out)	::	w(n), rpmin
Logical, Intent (In)	::	grow, abort

C Header Interface

#include <nagmk26.h>

void

f01bsf_ (const Integer *n, const Integer *nz, double a[], const Integer *licn, const Integer ivect[], const Integer jvect[], Integer icn[], const Integer ikeep[], Integer iw[], double w[], const logical *grow, const double *eta, double *rpmin, const logical *abort, const Integer idisp[], Integer *ifail)

3

Description

f01bsf accepts as input a real sparse matrix of the same sparsity pattern as a matrix previously factorized by a call of f01brf. It first applies to the matrix the same permutations as were used by f01brf, both for permutation to block triangular form and for pivoting, and then performs Gaussian elimination to obtain the

L U

factorization of the diagonal blocks.

Extensive data checks are made; duplicated nonzeros can be accumulated.

The factorization is intended to be used by f04axf to solve sparse systems of linear equations

A x = b

A^{T} x = b

f01bsf is much faster than f01brf and in some applications it is expected that there will be many calls of f01bsf for each call of f01brf.

The method is fully described in Duff (1977).

A more recent algorithm for the same calculation is provided by f11mef.

4

References

Duff I S (1977) MA28 – a set of Fortran subroutines for sparse unsymmetric linear equations AERE Report R8730 HMSO

5

Arguments

1: $n$ – IntegerInput

On entry:

n

, the order of the matrix

A

Constraint:

n > 0

2: $nz$ – IntegerInput

On entry: the number of nonzero elements in the matrix

A

Constraint:

nz > 0

3: $a (licn)$ – Real (Kind=nag_wp) arrayInput/Output

On entry:

a (i)

, for

i = 1, 2, \dots, nz

, must contain the nonzero elements of the sparse matrix

A

. They can be in any order since f01bsf will reorder them.

On exit: the nonzero elements in the

L U

factorization. The array must not be changed by you between a call of f01bsf and a call of f04axf.

4: $licn$ – IntegerInput

On entry: the dimension of the arrays a and icn as declared in the (sub)program from which f01bsf is called. It should have the same value as it had for f01brf.

Constraint:

licn \geq nz

5: $ivect (nz)$ – Integer arrayInput

6: $jvect (nz)$ – Integer arrayInput

On entry:

ivect (i)

and

jvect (i)

, for

i = 1, 2, \dots, nz

, must contain the row index and the column index respectively of the nonzero element stored in

a (i)

7: $icn (licn)$ – Integer arrayInput

icn contains, on entry, the same information as output by f01brf. It must not be changed by you between a call of f01bsf and a call of f04axf.

icn is used as internal workspace prior to being restored on exit and hence is unchanged.

8: $ikeep (5 \times n)$ – Integer arrayCommunication Array

On entry: the same indexing information about the factorization as output in ikeep by f01brf.

You must not change ikeep between a call of f01bsf and subsequent calls to f04axf.

9: $iw (5 \times n)$ – Integer arrayWorkspace

10: $w (n)$ – Real (Kind=nag_wp) arrayOutput

On exit: if

grow = .TRUE.

w (1)

contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization (see grow); the rest of the array is used as workspace.

grow = .FALSE.

, the array is not used.

11: $grow$ – LogicalInput

On entry: if

grow = .TRUE.

, then on exit

w (1)

contains an estimate (an upper bound) of the increase in size of elements encountered during the factorization. If the matrix is well-scaled (see Section 9), then a high value for

w (1)

indicates that the

L U

factorization may be inaccurate and you should be wary of the results and perhaps increase the argument pivot for subsequent runs (see Section 7).

12: $eta$ – Real (Kind=nag_wp)Input

On entry: the relative pivot threshold below which an error diagnostic is provoked and ifail is set to

ifail = 7

. If eta is greater than

1.0

, then no check on pivot size is made.

Suggested value:

eta = 10^{- 4}

13: $rpmin$ – Real (Kind=nag_wp)Output

On exit: if eta is less than

1.0

, then rpmin gives the smallest ratio of the pivot to the largest element in the row of the corresponding upper triangular factor thus monitoring the stability of the factorization. If rpmin is very small it may be advisable to perform a new factorization using f01brf.

14: $abort$ – LogicalInput

On entry: if

abort = .TRUE.

, f01bsf exits immediately (with

ifail = 8

) if it finds duplicate elements in the input matrix.

abort = .FALSE.

, f01bsf proceeds using a value equal to the sum of the duplicate elements.

In either case details of each duplicate element are output on the current advisory message unit (see x04abf), unless suppressed by the value of ifail on entry.

Suggested value:

abort = .TRUE.

15: $idisp (2)$ – Integer arrayCommunication Array

On entry:

idisp (1)

and

idisp (2)

must be as output in idisp by the previous call of f01brf.

16: $ifail$ – IntegerInput/Output

For this routine, the normal use of ifail is extended to control the printing of error and warning messages as well as specifying hard or soft failure (see Section 3.4 in How to Use the NAG Library and its Documentation).

On entry: ifail must be set to a value with the decimal expansion

cba

, where each of the decimal digits

c

b

and

a

must have a value of

0

1

$a = 0$	specifies hard failure, otherwise soft failure;
$b = 0$	suppresses error messages, otherwise error messages will be printed (see Section 6);
$c = 0$	suppresses warning messages, otherwise warning messages will be printed (see Section 6).

The recommended value for inexperienced users is

110

(i.e., hard failure with all messages printed).

On exit:

ifail = 0

unless the routine detects an error or a warning has been flagged (see Section 6).

6

Error Indicators and Warnings

If on entry

ifail = 0

- 1

, explanatory error messages are output on the current error message unit (as defined by x04aaf).

Errors or warnings detected by the routine:

$ifail = 1$: On entry, $n = 〈value〉$ .
Constraint: $n > 0$ .

$ifail = 2$: On entry, $nz = 〈value〉$ .
Constraint: $nz > 0$ .

$ifail = 3$: On entry, $licn = 〈value〉$ and $nz = 〈value〉$ .
Constraint: $licn \geq nz$ .

$ifail = 4$: On entry, $irn (I)$ or $icn (I)$ is out of range: $I = 〈value〉$ , $a (I) = 〈value〉$ $irn (I) = 〈value〉$ , $icn (I) = 〈value〉$ .

$ifail = 5$: Nonzero element ( $〈value〉$ , $〈value〉$ ) in zero off-diagonal block.

Nonzero element ( $〈value〉$ , $〈value〉$ ) was not in L/U pattern.

$ifail = 6$: Numerical singularity in row $〈value〉$ - decomposition aborted.

$ifail = 7$: Subthreshold pivot in row $〈value〉$ - decomposition completed.

$ifail = 8$: On entry, duplicate elements found – see advisory messages.

$ifail = - 99$: An unexpected error has been triggered by this routine. Please contact NAG.
See Section 3.9 in How to Use the NAG Library and its Documentation for further information.

$ifail = - 399$: Your licence key may have expired or may not have been installed correctly.
See Section 3.8 in How to Use the NAG Library and its Documentation for further information.

$ifail = - 999$: Dynamic memory allocation failed.
See Section 3.7 in How to Use the NAG Library and its Documentation for further information.

7

Accuracy

The factorization obtained is exact for a perturbed matrix whose

(i, j)

th element differs from

a_{i j}

by less than

3 ε ρ m_{i j}

where

ε

is the machine precision,

ρ

is the growth value returned in

w (1)

grow = .TRUE.

, and

m_{i j}

the number of Gaussian elimination operations applied to element

(i, j)

ρ = w (1)

is very large or rpmin is very small, then a fresh call of f01brf is recommended.

8

Parallelism and Performance

f01bsf is not threaded in any implementation.

9

Further Comments

If you have a sequence of problems with the same sparsity pattern then f01bsf is recommended after f01brf has been called for one such problem. It is typically

4

7

times faster but is potentially unstable since the previous pivotal sequence is used. Further details on timing are given in the document for f01brf.

If growth estimation is performed (

grow = .TRUE.

), then the time increases by between

5 %

and

10 %

. Pivot size monitoring (

eta \leq 1.0

) involves a similar overhead.

We normally expect this routine to be entered with a matrix having the same pattern of nonzeros as was earlier presented to f01brf. However there is no record of this pattern, but rather a record of the pattern including all fill-ins. Therefore we permit additional nonzeros in positions corresponding to fill-ins.

If singular matrices are being treated then it is also required that the present matrix be sufficiently like the previous one for the same permutations to be suitable for factorization with the same set of zero pivots.

10

Example

This example factorizes the real sparse matrices

(\begin{array}{r} 5 & 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & - 1 & 2 & 0 & 0 \\ 0 & 0 & 3 & 0 & 0 & 0 \\ - 2 & 0 & 0 & 1 & 1 & 0 \\ - 1 & 0 & 0 & - 1 & 2 & - 3 \\ - 1 & - 1 & 0 & 0 & 0 & 6 \end{array})

and

(\begin{array}{r} 10 & 0 & 0 & 0 & 0 & 0 \\ 0 & 12 & - 3 & - 1 & 0 & 0 \\ 0 & 0 & 15 & 0 & 0 & 0 \\ - 2 & 0 & 0 & 10 & - 1 & 0 \\ - 1 & 0 & 0 & - 5 & 1 & - 1 \\ - 1 & - 2 & 0 & 0 & 0 & 6 \end{array}) .

This example program simply prints the values of

w (1)

and rpmin returned by f01bsf. Normally the calls of f01brf and f01bsf would be followed by calls of f04axf.

NAG Library Routine Document

f01bsf (real_gen_sparse_lu_reuse)

▸▿ Contents

1 Purpose

2 Specification

3 Description

4 References

5 Arguments

6 Error Indicators and Warnings

7 Accuracy

8 Parallelism and Performance

9 Further Comments

10 Example

10.1 Program Text

10.2 Program Data

10.3 Program Results

1

Purpose

2

Specification

3

Description

4

References

5

Arguments

6

Error Indicators and Warnings

7

Accuracy

8

Parallelism and Performance

9

Further Comments

10

Example

10.1

Program Text

10.2

Program Data

10.3

Program Results