Testing Matrix Function Algorithms Using Identities
Edvin Deadman and Nick Higham (University of Manchester) write:
In a previous blog post we explained how testing new algorithms is difficult. We discussed the forward error (how far from the actual solution are we?) and the backward error (what problem have we actually solved?) and how we'd like the backward error to be close to the unit roundoff, u.
For matrix functions, we also mentioned the idea of using identities such as sin2A + cos2A = I to test algorithms. In practice, rather than I, we might find that we obtain a matrix R close to I, perhaps with ||R-I|| 10-13. What does this tell us about how the algorithms for sin A and cos A are performing? In particular, does it tell us anything about the backward errors? We've just written a paper which aims to answer these questions. This work is an output of NAG's Knowledge Transfer Partnership with the University of Manchester, so we thought we'd blog about it here.
Let's consider the identity exp(log A) - A = 0. Suppose that when we evaluate the left-hand side in floating point arithmetic we get a nonzero residual R rather than 0. We'll assume that this residual is caused by some backward errors E1 and E2 so that exp(log(A + E1) + E2) = R. We'd like to investigate how big R can be when E1 and E2 are small, so we expand the left-hand side in a Taylor series to linear order. After a bit of algebra, the result is a linear operator relating R to E1 and E2: R = L(E1, E2). The operator is different for each identity considered, but it always involves the Fréchet derivatives of the matrix functions in the identity (the full gory details, including formulae for the linear operators associated with various identities, are in our paper).
We now have two options. The first option is to estimate the norm of the linear operator. This enables us to determine the maximum value of ||R|| consistent with backward errors of size u. The second option is to use the linear operator to explicitly estimate E1 and E2. This works just as well, but it is more expensive.
Our paper contains several examples demonstrating our approach in practice, so we'll just pick one here, involving the matrix exponential. The 10 x 10 Forsythe matrix is