﻿ F08 Class
This chapter provides methods for the solution of linear least squares problems, eigenvalue problems and singular value problems, as well as associated computations. It provides methods for:
• – solution of linear least squares problems
• – solution of symmetric eigenvalue problems
• – solution of nonsymmetric eigenvalue problems
• – solution of singular value problems
• – solution of generalized linear least squares problems
• – solution of generalized symmetric-definite eigenvalue problems
• – solution of generalized nonsymmetric eigenvalue problems
• – solution of generalized singular value problems
• – matrix factorizations associated with the above problems
• – estimating condition numbers of eigenvalue and eigenvector problems
• – estimating the numerical rank of a matrix
• – solution of the Sylvester matrix equation
Methods are provided for both real and complex data.
For a general introduction to the solution of linear least squares problems, you should turn first to (F04 not in this release). The decision trees, at the end of (F04 not in this release), direct you to the most appropriate methods in (F04 not in this release) F08 class. (F04 not in this release) F08 class contain Black Box (or driver) methods which enable standard linear least squares problems to be solved by a call to a single method.
For a general introduction to eigenvalue and singular value problems, you should turn first to (F02 not in this release). The decision trees, at the end of (F02 not in this release), direct you to the most appropriate methods in (F02 not in this release) F08 class. (F02 not in this release) F08 class contain Black Box (or driver) methods which enable standard types of problem to be solved by a call to a single method. Often methods in (F02 not in this release) call F08 class methods to perform the necessary computational tasks.
The methods in this chapter ( F08 class) handle only dense, band, tridiagonal and Hessenberg matrices (not matrices with more specialised structures, or general sparse matrices). The tables in [Recommendations on Choice and Use of Available Methods] and the decision trees in [Decision Trees] direct you to the most appropriate methods in F08 class.
The methods in this chapter have all been derived from the LAPACK project (see Anderson et al. (1999)). They have been designed to be efficient on a wide range of high-performance computers, without compromising efficiency on conventional serial machines.
It is not expected that you will need to read all of the following sections, but rather you will pick out those sections relevant to your particular problem.

Syntax

C#
`public static class F08`
Visual Basic
`Public NotInheritable Class F08`
Visual C++
`public ref class F08 abstract sealed`
F#
```[<AbstractClassAttribute>]
[<SealedAttribute>]
type F08 =  class end```

Background to the Problems

This section is only a brief introduction to the numerical solution of linear least squares problems, eigenvalue and singular value problems. Consult a standard textbook for a more thorough discussion, for example Golub and Van Loan (2012).

Linear Least Squares Problems

The linear least squares problem is
 $minimizexb-Ax2,$ (1)
where $A$ is an $m$ by $n$ matrix, $b$ is a given $m$ element vector and $x$ is an $n$-element solution vector.
In the most usual case $m\ge n$ and $\mathrm{rank}\left(A\right)=n$, so that $A$ has full rank and in this case the solution to problem (1) is unique; the problem is also referred to as finding a least squares solution to an overdetermined system of linear equations.
When $m and $\mathrm{rank}\left(A\right)=m$, there are an infinite number of solutions $x$ which exactly satisfy $b-Ax=0$. In this case it is often useful to find the unique solution $x$ which minimizes ${‖x‖}_{2}$, and the problem is referred to as finding a minimum norm solution to an underdetermined system of linear equations.
In the general case when we may have $\mathrm{rank}\left(A\right)<\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$ – in other words, $A$ may be rank-deficient – we seek the minimum norm least squares solution $x$ which minimizes both ${‖x‖}_{2}$ and ${‖b-Ax‖}_{2}$.
This chapter ( F08 class) contains driver methods to solve these problems with a single call, as well as computational methods that can be combined with methods in F07 class to solve these linear least squares problems. (F04 not in this release) also contains Black Box methods to solve these linear least squares problems in standard cases. The next two sections discuss the factorizations that can be used in the solution of linear least squares problems.

Orthogonal Factorizations and Least Squares Problems

A number of methods are provided for factorizing a general rectangular $m$ by $n$ matrix $A$, as the product of an orthogonal matrix (unitary if complex) and a triangular (or possibly trapezoidal) matrix.
A real matrix $Q$ is orthogonal if ${Q}^{\mathrm{T}}Q=I$; a complex matrix $Q$ is unitary if ${Q}^{\mathrm{H}}Q=I$. Orthogonal or unitary matrices have the important property that they leave the $2$-norm of a vector invariant, so that
 $x2=Qx2,$
if $Q$ is orthogonal or unitary. They usually help to maintain numerical stability because they do not amplify rounding errors.
Orthogonal factorizations are used in the solution of linear least squares problems. They may also be used to perform preliminary steps in the solution of eigenvalue or singular value problems, and are useful tools in the solution of a number of other problems.

$QR$ factorization

The most common, and best known, of the factorizations is the $QR$ factorization given by
 $A=QR0, if ​m≥n,$
where $R$ is an $n$ by $n$ upper triangular matrix and $Q$ is an $m$ by $m$ orthogonal (or unitary) matrix. If $A$ is of full rank $n$, then $R$ is nonsingular. It is sometimes convenient to write the factorization as
 $A=Q1Q2R0$
which reduces to
 $A=Q1R,$
where ${Q}_{1}$ consists of the first $n$ columns of $Q$, and ${Q}_{2}$ the remaining $m-n$ columns.
If $m, $R$ is trapezoidal, and the factorization can be written
 $A=QR1R2, if ​m
where ${R}_{1}$ is upper triangular and ${R}_{2}$ is rectangular.
The $QR$ factorization can be used to solve the linear least squares problem (1) when $m\ge n$ and $A$ is of full rank, since
 $b-Ax2=QTb-QTAx2=c1-Rxc22,$
where
 $c≡c1c2=Q1TbQ2Tb=QTb;$
and ${c}_{1}$ is an $n$-element vector. Then $x$ is the solution of the upper triangular system
 $Rx=c1.$
The residual vector $r$ is given by
 $r=b-Ax=Q0c2.$
The residual sum of squares ${{‖r‖}_{2}}^{2}$ may be computed without forming $r$ explicitly, since
 $r2=b-Ax2=c22.$

$LQ$ factorization

The $LQ$ factorization is given by
 $A=L 0Q=L 0Q1Q2=LQ1, if ​m≤n,$
where $L$ is $m$ by $m$ lower triangular, $Q$ is $n$ by $n$ orthogonal (or unitary), ${Q}_{1}$ consists of the first $m$ rows of $Q$, and ${Q}_{2}$ the remaining $n-m$ rows.
The $LQ$ factorization of $A$ is essentially the same as the $QR$ factorization of ${A}^{\mathrm{T}}$ (${A}^{\mathrm{H}}$ if $A$ is complex), since
 $A=L 0Q⇔AT=QTLT0.$
The $LQ$ factorization may be used to find a minimum norm solution of an underdetermined system of linear equations $Ax=b$ where $A$ is $m$ by $n$ with $m and has rank $m$. The solution is given by
 $x=QTL-1b0.$

$QR$ factorization with column pivoting

To solve a linear least squares problem (1) when $A$ is not of full rank, or the rank of $A$ is in doubt, we can perform either a $QR$ factorization with column pivoting or a singular value decomposition.
The $QR$ factorization with column pivoting is given by
 $A=QR0PT, m≥n,$
where $Q$ and $R$ are as before and $P$ is a (real) permutation matrix, chosen (in general) so that
 $r11≥r22≥⋯≥rnn$
and moreover, for each $k$,
 $rkk≥Rk:j,j2, j=k+1,…,n.$
If we put
 $R=R11R120R22$
where ${R}_{11}$ is the leading $k$ by $k$ upper triangular sub-matrix of $R$ then, in exact arithmetic, if $\mathrm{rank}\left(A\right)=k$, the whole of the sub-matrix ${R}_{22}$ in rows and columns $k+1$ to $n$ would be zero. In numerical computation, the aim must be to determine an index $k$, such that the leading sub-matrix ${R}_{11}$ is well-conditioned, and ${R}_{22}$ is negligible, so that
 $R=R11R120R22≃R11R1200.$
Then $k$ is the effective rank of $A$. See Golub and Van Loan (2012) for a further discussion of numerical rank determination.
The so-called basic solution to the linear least squares problem (1) can be obtained from this factorization as
 $x=PR11-1c^10,$
where ${\stackrel{^}{c}}_{1}$ consists of just the first $k$ elements of $c={Q}^{\mathrm{T}}b$.

Complete orthogonal factorization

The $QR$ factorization with column pivoting does not enable us to compute a minimum norm solution to a rank-deficient linear least squares problem, unless ${R}_{12}=0$. However, by applying for further orthogonal (or unitary) transformations from the right to the upper trapezoidal matrix $\left(\begin{array}{cc}{R}_{11}& {R}_{12}\end{array}\right)$, ${R}_{12}$ can be eliminated:
 $R11R12Z=T110.$
This gives the complete orthogonal factorization
 $AP=QT11000ZT$
from which the minimum norm solution can be obtained as
 $x=PZT11-1c^10.$

Updating a $QR$ factorization

[$QR$ factorization] gave the forms of the $QR$ factorization of an $m$ by $n$ matrix $A$ for the two cases $m\ge n$ and $m. Taking first the case $m\ge n$, the least squares solution of
 $Ax=b=n(b1)m-nb2$
is the solution of
 $Rx=Q1Tb.$
If the original system is now augmented by the addition of $p$ rows so that we require the solution of
 $ABx=m(b)pb3$
where $B$ is $p$ by $n$, then this is equivalent to finding the least squares solution of
 $A^x=nn(R)pBx=Q1Tbb3=b^.$
This now requires the $QR$ factorization of the $n+p$ by $n$ triangular-rectangular matrix $\stackrel{^}{A}$.
For the case $m, the least squares solution of the augmented system reduces to
 $A^x=BR1R2x=b3QTb=b^,$
where $\stackrel{^}{A}$ is pentagonal.
In both cases $\stackrel{^}{A}$ can be written as a special case of a triangular-pentagonal matrix consisting of an upper triangular part on top of a rectangular part which is itself on top of a trapezoidal part. In the first case there is no trapezoidal part, in the second case a zero upper triangular part can be added, and more generally the two cases can be combined.

Other factorizations

The $QL$ and $RQ$ factorizations are given by
 $A=Q0L, if ​m≥n,$
and
 $A=0RQ, if ​m≤n.$
The factorizations are less commonly used than either the $QR$ or $LQ$ factorizations described above, but have applications in, for example, the computation of generalized $QR$ factorizations.

The Singular Value Decomposition

The singular value decomposition (SVD) of an $m$ by $n$ matrix $A$ is given by
 $A=UΣVT, A=UΣVHin the complex case$
where $U$ and $V$ are orthogonal (unitary) and $\Sigma$ is an $m$ by $n$ diagonal matrix with real diagonal elements, ${\sigma }_{i}$, such that
 $σ1≥σ2≥⋯≥σminm,n≥0.$
The ${\sigma }_{i}$ are the singular values of $A$ and the first $\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$ columns of $U$ and $V$ are the left and right singular vectors of $A$. The singular values and singular vectors satisfy
 $Avi=σiui and ATui=σivi or ​AHui=σivi$
where ${u}_{i}$ and ${v}_{i}$ are the $i$th columns of $U$ and $V$ respectively.
The computation proceeds in the following stages.
 1 The matrix $A$ is reduced to bidiagonal form $A={U}_{1}B{V}_{1}^{\mathrm{T}}$ if $A$ is real ($A={U}_{1}B{V}_{1}^{\mathrm{H}}$ if $A$ is complex), where ${U}_{1}$ and ${V}_{1}$ are orthogonal (unitary if $A$ is complex), and $B$ is real and upper bidiagonal when $m\ge n$ and lower bidiagonal when $m, so that $B$ is nonzero only on the main diagonal and either on the first superdiagonal (if $m\ge n$) or the first subdiagonal (if $m). 2 The SVD of the bidiagonal matrix $B$ is computed as $B={U}_{2}\Sigma {V}_{2}^{\mathrm{T}}$, where ${U}_{2}$ and ${V}_{2}$ are orthogonal and $\Sigma$ is diagonal as described above. The singular vectors of $A$ are then $U={U}_{1}{U}_{2}$ and $V={V}_{1}{V}_{2}$.
If $m\gg n$, it may be more efficient to first perform a $QR$ factorization of $A$, and then compute the SVD of the $n$ by $n$ matrix $R$, since if $A=QR$ and $R=U\Sigma {V}^{\mathrm{T}}$, then the SVD of $A$ is given by $A=\left(QU\right)\Sigma {V}^{\mathrm{T}}$.
Similarly, if $m\ll n$, it may be more efficient to first perform an $LQ$ factorization of $A$.
This chapter supports two primary algorithms for computing the SVD of a bidiagonal matrix. They are:
 (i) the divide and conquer algorithm; (ii) the $QR$ algorithm.
The divide and conquer algorithm is much faster than the $QR$ algorithm if singular vectors of large matrices are required.

The Singular Value Decomposition and Least Squares Problems

The SVD may be used to find a minimum norm solution to a (possibly) rank-deficient linear least squares problem (1). The effective rank, $k$, of $A$ can be determined as the number of singular values which exceed a suitable threshold. Let $\stackrel{^}{\Sigma }$ be the leading $k$ by $k$ sub-matrix of $\Sigma$, and $\stackrel{^}{V}$ be the matrix consisting of the first $k$ columns of $V$. Then the solution is given by
 $x=V^Σ^-1c^1,$
where ${\stackrel{^}{c}}_{1}$ consists of the first $k$ elements of $c={U}^{\mathrm{T}}b={U}_{2}^{\mathrm{T}}{U}_{1}^{\mathrm{T}}b$.

Generalized Linear Least Squares Problems

The simple type of linear least squares problem described in [Linear Least Squares Problems] can be generalized in various ways.
1. Linear least squares problems with equality constraints:
 $find ​x​ to minimize ​S=c-Ax22 subject to Bx=d,$
where $A$ is $m$ by $n$ and $B$ is $p$ by $n$, with $p\le n\le m+p$. The equations $Bx=d$ may be regarded as a set of equality constraints on the problem of minimizing $S$. Alternatively the problem may be regarded as solving an overdetermined system of equations
 $ABx=cd,$
where some of the equations (those involving $B$) are to be solved exactly, and the others (those involving $A$) are to be solved in a least squares sense. The problem has a unique solution on the assumptions that $B$ has full row rank $p$ and the matrix $\left(\begin{array}{c}A\\ B\end{array}\right)$ has full column rank $n$. (For linear least squares problems with inequality constraints, refer to E04 class.)
2. General Gauss–Markov linear model problems:
 $minimize ​y2 subject to d=Ax+By,$
where $A$ is $m$ by $n$ and $B$ is $m$ by $p$, with $n\le m\le n+p$. When $B=I$, the problem reduces to an ordinary linear least squares problem. When $B$ is square and nonsingular, it is equivalent to a weighted linear least squares problem:
 $find ​x​ to minimize ​B-1d-Ax2.$
The problem has a unique solution on the assumptions that $A$ has full column rank $n$, and the matrix $\left(A,B\right)$ has full row rank $m$. Unless $B$ is diagonal, for numerical stability it is generally preferable to solve a weighted linear least squares problem as a general Gauss–Markov linear model problem.

Generalized $QR$ Factorization

The generalized $QR$ (GQR) factorization of an $n$ by $m$ matrix $A$ and an $n$ by $p$ matrix $B$ is given by the pair of factorizations
 $A=QR and B=QTZ,$
where $Q$ and $Z$ are respectively $n$ by $n$ and $p$ by $p$ orthogonal matrices (or unitary matrices if $A$ and $B$ are complex). $R$ has the form
 $R=mm(R11)n-m0, if ​n≥m,$
or
 $R=nm-nn(R11R12), if ​n
where ${R}_{11}$ is upper triangular. $T$ has the form
 $T=p-nnn(0T12), if ​n≤p,$
or
 $T=pn-p(T11)pT21, if ​n>p,$
where ${T}_{12}$ or ${T}_{21}$ is upper triangular.
Note that if $B$ is square and nonsingular, the GQR factorization of $A$ and $B$ implicitly gives the $QR$ factorization of the matrix ${B}^{-1}A$:
 $B-1A=ZTT-1R$
without explicitly computing the matrix inverse ${B}^{-1}$ or the product ${B}^{-1}A$ (remembering that the inverse of an invertible upper triangular matrix and the product of two upper triangular matrices is an upper triangular matrix).
The GQR factorization can be used to solve the general (Gauss–Markov) linear model problem (GLM) (see [Generalized Linear Least Squares Problems], but note that $A$ and $B$ are dimensioned differently there as $m$ by $n$ and $p$ by $n$ respectively). Using the GQR factorization of $A$ and $B$, we rewrite the equation $d=Ax+By$ as
 $QTd=QTAx+QTBy=Rx+TZy.$
We partition this as
 $d1d2=mm(R11)n-m0x+p-n+mn-mm(T11T12)n-m0T22y1y2$
where
 $d1d2≡QTd, and y1y2≡Zy.$
The GLM problem is solved by setting
 $y1=0 and y2=T22-1d2$
from which we obtain the desired solutions
 $x=R11-1d1-T12y2 and y=ZT0y2.$

Generalized $RQ$ Factorization

The generalized $RQ$ (GRQ) factorization of an $m$ by $n$ matrix $A$ and a $p$ by $n$ matrix $B$ is given by the pair of factorizations
 $A=RQ, B=ZTQ$
where $Q$ and $Z$ are respectively $n$ by $n$ and $p$ by $p$ orthogonal matrices (or unitary matrices if $A$ and $B$ are complex). $R$ has the form
 $R=n-mmm(0R12), if ​m≤n,$
or
 $R=nm-n(R11)nR21, if ​m>n,$
where ${R}_{12}$ or ${R}_{21}$ is upper triangular. $T$ has the form
 $T=nn(T11)p-n0, if ​p≥n,$
or
 $T=pn-pp(T11T12), if ​p
where ${T}_{11}$ is upper triangular.
Note that if $B$ is square and nonsingular, the GRQ factorization of $A$ and $B$ implicitly gives the $RQ$ factorization of the matrix $A{B}^{-1}$:
 $AB-1=RT-1ZT$
without explicitly computing the matrix ${B}^{-1}$ or the product $A{B}^{-1}$ (remembering that the inverse of an invertible upper triangular matrix and the product of two upper triangular matrices is an upper triangular matrix).
The GRQ factorization can be used to solve the linear equality-constrained least squares problem (LSE) (see [Generalized Linear Least Squares Problems]). We use the GRQ factorization of $B$ and $A$ (note that $B$ and $A$ have swapped roles), written as
 $B=TQ and A=ZRQ.$
We write the linear equality constraints $Bx=d$ as
 $TQx=d,$
which we partition as:
 $n-ppp(0T12)x1x2=d where x1x2≡Qx.$
Therefore ${x}_{2}$ is the solution of the upper triangular system
 $T12x2=d.$
Furthermore,
 $Ax-c2=ZTAx-ZTc2=RQx-ZTc2.$
We partition this expression as:
 $n-ppn-p(R11R12)p+m-n0R22x1x2-c1c2,$
where $\left(\begin{array}{c}{c}_{1}\\ {c}_{2}\end{array}\right)\equiv {Z}^{\mathrm{T}}c$.
To solve the LSE problem, we set
 $R11x1+R12x2-c1=0$
which gives ${x}_{1}$ as the solution of the upper triangular system
 $R11x1=c1-R12x2.$
Finally, the desired solution is given by
 $x=QTx1x2.$

Generalized Singular Value Decomposition (GSVD)

The generalized (or quotient) singular value decomposition of an $m$ by $n$ matrix $A$ and a $p$ by $n$ matrix $B$ is given by the pair of factorizations
 $A=UΣ10,RQT and B=VΣ20,RQT.$
The matrices in these factorizations have the following properties:
 – $U$ is $m$ by $m$, $V$ is $p$ by $p$, $Q$ is $n$ by $n$, and all three matrices are orthogonal. If $A$ and $B$ are complex, these matrices are unitary instead of orthogonal, and ${Q}^{\mathrm{T}}$ should be replaced by ${Q}^{\mathrm{H}}$ in the pair of factorizations. – $R$ is $r$ by $r$, upper triangular and nonsingular. $\left[0,R\right]$ is $r$ by $n$ (in other words, the $0$ is an $r$ by $n-r$ zero matrix). The integer $r$ is the rank of $\left(\begin{array}{c}A\\ B\end{array}\right)$, and satisfies $r\le n$. – ${\Sigma }_{1}$ is $m$ by $r$, ${\Sigma }_{2}$ is $p$ by $r$, both are real, non-negative and diagonal, and ${\Sigma }_{1}^{\mathrm{T}}{\Sigma }_{1}+{\Sigma }_{2}^{\mathrm{T}}{\Sigma }_{2}=I$. Write ${\Sigma }_{1}^{\mathrm{T}}{\Sigma }_{1}=\mathrm{diag}\left({\alpha }_{1}^{2},\dots ,{\alpha }_{r}^{2}\right)$ and ${\Sigma }_{2}^{\mathrm{T}}{\Sigma }_{2}=\mathrm{diag}\left({\beta }_{1}^{2},\dots ,{\beta }_{r}^{2}\right)$, where ${\alpha }_{i}$ and ${\beta }_{i}$ lie in the interval from $0$ to $1$. The ratios ${\alpha }_{1}/{\beta }_{1},\dots ,{\alpha }_{r}/{\beta }_{r}$ are called the generalized singular values of the pair $A$, $B$. If ${\beta }_{i}=0$, then the generalized singular value ${\alpha }_{i}/{\beta }_{i}$ is infinite.
${\Sigma }_{1}$ and ${\Sigma }_{2}$ have the following detailed structures, depending on whether $m\ge r$ or $m. In the first case, $m\ge r$, then
 $Σ1=klk(I0)l0Cm-k-l00 and Σ2=kll(0S)p-l00.$
Here $l$ is the rank of $B$, $k=r-l$, $C$ and $S$ are diagonal matrices satisfying ${C}^{2}+{S}^{2}=I$, and $S$ is nonsingular. We may also identify ${\alpha }_{1}=\cdots ={\alpha }_{k}=1$, ${\alpha }_{k+i}={c}_{ii}$, for $i=1,2,\dots ,l$, ${\beta }_{1}=\cdots ={\beta }_{k}=0$, and ${\beta }_{k+i}={s}_{ii}$, for $i=1,2,\dots ,l$. Thus, the first $k$ generalized singular values ${\alpha }_{1}/{\beta }_{1},\dots ,{\alpha }_{k}/{\beta }_{k}$ are infinite, and the remaining $l$ generalized singular values are finite.
In the second case, when $m,
 $Σ1=km-kk+l-mk(I00)m-k0C0$
and
 $Σ2=km-kk+l-mm-k(0S0)k+l-m00Ip-l000.$
Again, $l$ is the rank of $B$, $k=r-l$, $C$ and $S$ are diagonal matrices satisfying ${C}^{2}+{S}^{2}=I$, and $S$ is nonsingular, and we may identify ${\alpha }_{1}=\cdots ={\alpha }_{k}=1$, ${\alpha }_{k+i}={c}_{ii}$, for $i=1,2,\dots ,m-k$, ${\alpha }_{m+1}=\cdots ={\alpha }_{r}=0$, ${\beta }_{1}=\cdots ={\beta }_{k}=0$, ${\beta }_{k+i}={s}_{ii}$, for $i=1,2,\dots ,m-k$ and ${\beta }_{m+1}=\cdots ={\beta }_{r}=1$. Thus, the first $k$ generalized singular values ${\alpha }_{1}/{\beta }_{1},\dots ,{\alpha }_{k}/{\beta }_{k}$ are infinite, and the remaining $l$ generalized singular values are finite.
Here are some important special cases of the generalized singular value decomposition. First, if $B$ is square and nonsingular, then $r=n$ and the generalized singular value decomposition of $A$ and $B$ is equivalent to the singular value decomposition of $A{B}^{-1}$, where the singular values of $A{B}^{-1}$ are equal to the generalized singular values of the pair $A$, $B$:
 $AB-1=UΣ1RQTVΣ2RQT-1=UΣ1Σ2-1VT.$
Second, for the matrix $C$, where
 $C≡AB$
if the columns of $C$ are orthonormal, then $r=n$, $R=I$ and the generalized singular value decomposition of $A$ and $B$ is equivalent to the CS (Cosine–Sine) decomposition of $C$:
 $AB=U00VΣ1Σ2QT.$
Third, the generalized eigenvalues and eigenvectors of ${A}^{\mathrm{T}}A-\lambda {B}^{\mathrm{T}}B$ can be expressed in terms of the generalized singular value decomposition: Let
 $X=QI00R-1.$
Then
 $XTATAX=000Σ1TΣ1 and XTBTBX=000Σ2TΣ2.$
Therefore, the columns of $X$ are the eigenvectors of ${A}^{\mathrm{T}}A-\lambda {B}^{\mathrm{T}}B$, and ‘nontrivial’ eigenvalues are the squares of the generalized singular values (see also [Generalized Symmetric-definite Eigenvalue Problems]). ‘Trivial’ eigenvalues are those corresponding to the leading $n-r$ columns of $X$, which span the common null space of ${A}^{\mathrm{T}}A$ and ${B}^{\mathrm{T}}B$. The ‘trivial eigenvalues’ are not well defined.

The Full CS Decomposition of Orthogonal Matrices

In [Generalized Singular Value Decomposition (GSVD)] the CS (Cosine-Sine) decomposition of an orthogonal matrix partitioned into two submatrices $A$ and $B$ was given by
 $AB=U00VΣ1Σ2QT.$
The full CS decomposition of an $m$ by $m$ orthogonal matrix $X$ partitions $X$ into four submatrices and factorizes as
 $X11X12X21X22=U100U2Σ11-Σ12Σ21Σ22V100V2T$
where, ${X}_{11}$ is a $p$ by $q$ submatrix (which implies the dimensions of ${X}_{12}$, ${X}_{21}$ and ${X}_{22}$); ${U}_{1}$, ${U}_{2}$, ${V}_{1}$ and ${V}_{2}$ are orthogonal matrices of dimensions $p$, $m-p$, $q$ and $m-q$ respectively; ${\Sigma }_{11}$ is the $p$ by $q$ single-diagonal matrix
 $Σ11=k11-rrq-k11k11-r(I00)r0C0p-k1100, k11=minp,q$
${\Sigma }_{12}$ is the $p$ by $m-q$ single-diagonal matrix
 $Σ12=m-q-k12rk12-rp-k12(00)r0S0k12-r00I, k12=minp,m-q,$
${\Sigma }_{21}$ is the $m-p$ by $q$ single-diagonal matrix
 $Σ21=q-k21rk21-rm-p-k21(00)r0S0k21-r00I, k21=minm-p,q,$
and, ${\Sigma }_{21}$ is the $m-p$ by $q$ single-diagonal matrix
 $Σ22=k22-rrm-q-k22k22-r(I00)r0C0m-p-k2200, k22=minm-p,m-q$
where $r=\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(p,m-p,q,m-q\right)$ and the missing zeros remind us that either the column or the row is missing. The $r$ by $r$ diagonal matrices $C$ and $S$ are such that ${C}^{2}+{S}^{2}=I$.
This is equivalent to the simultaneous singular value decomposition of the four submatrices ${X}_{11}$, ${X}_{12}$, ${X}_{21}$ and ${X}_{22}$.

Symmetric Eigenvalue Problems

The symmetric eigenvalue problem is to find the eigenvalues, $\lambda$, and corresponding eigenvectors, $z\ne 0$, such that
 $Az=λz, A=AT, where ​A​ is real.$
For the Hermitian eigenvalue problem we have
 $Az=λz, A=AH, where ​A​ is complex.$
For both problems the eigenvalues $\lambda$ are real.
When all eigenvalues and eigenvectors have been computed, we write
 $A=ZΛZT or ​A=ZΛZH​ if complex,$
where $\Lambda$ is a diagonal matrix whose diagonal elements are the eigenvalues, and $Z$ is an orthogonal (or unitary) matrix whose columns are the eigenvectors. This is the classical spectral factorization of $A$.
The basic task of the symmetric eigenproblem methods is to compute values of $\lambda$ and, optionally, corresponding vectors $z$ for a given matrix $A$. This computation proceeds in the following stages.
 1 The real symmetric or complex Hermitian matrix $A$ is reduced to real tridiagonal form $T$. If $A$ is real symmetric this decomposition is $A=QT{Q}^{\mathrm{T}}$ with $Q$ orthogonal and $T$ symmetric tridiagonal. If $A$ is complex Hermitian, the decomposition is $A=QT{Q}^{\mathrm{H}}$ with $Q$ unitary and $T$, as before, real symmetric tridiagonal. 2 Eigenvalues and eigenvectors of the real symmetric tridiagonal matrix $T$ are computed. If all eigenvalues and eigenvectors are computed, this is equivalent to factorizing $T$ as $T=S\Lambda {S}^{\mathrm{T}}$, where $S$ is orthogonal and $\Lambda$ is diagonal. The diagonal entries of $\Lambda$ are the eigenvalues of $T$, which are also the eigenvalues of $A$, and the columns of $S$ are the eigenvectors of $T$; the eigenvectors of $A$ are the columns of $Z=QS$, so that $A=Z\Lambda {Z}^{\mathrm{T}}$ ($Z\Lambda {Z}^{\mathrm{H}}$ when $A$ is complex Hermitian).
This chapter supports four primary algorithms for computing eigenvalues and eigenvectors of real symmetric matrices and complex Hermitian matrices. They are:
 (i) the divide-and-conquer algorithm; (ii) the $QR$ algorithm; (iii) bisection followed by inverse iteration; (iv) the Relatively Robust Representation (RRR).
The divide-and-conquer algorithm is generally more efficient than the traditional $QR$ algorithm for computing all eigenvalues and eigenvectors, but the RRR algorithm tends to be fastest of all. For further information and references see Anderson et al. (1999).

Generalized Symmetric-definite Eigenvalue Problems

This section is concerned with the solution of the generalized eigenvalue problems $Az=\lambda Bz$, $ABz=\lambda z$, and $BAz=\lambda z$, where $A$ and $B$ are real symmetric or complex Hermitian and $B$ is positive definite. Each of these problems can be reduced to a standard symmetric eigenvalue problem, using a Cholesky factorization of $B$ as either $B=L{L}^{\mathrm{T}}$ or $B={U}^{\mathrm{T}}U$ ($L{L}^{\mathrm{H}}$ or ${U}^{\mathrm{H}}U$ in the Hermitian case).
With $B=L{L}^{\mathrm{T}}$, we have
 $Az=λBz⇒L-1AL-TLTz=λLTz.$
Hence the eigenvalues of $Az=\lambda Bz$ are those of $Cy=\lambda y$, where $C$ is the symmetric matrix $C={L}^{-1}A{L}^{-\mathrm{T}}$ and $y={L}^{\mathrm{T}}z$. In the complex case $C$ is Hermitian with $C={L}^{-1}A{L}^{-\mathrm{H}}$ and $y={L}^{\mathrm{H}}z$.
Table 1 summarises how each of the three types of problem may be reduced to standard form $Cy=\lambda y$, and how the eigenvectors $z$ of the original problem may be recovered from the eigenvectors $y$ of the reduced problem. The table applies to real problems; for complex problems, transposed matrices must be replaced by conjugate-transposes.
 Type of problem Factorization of $B$ Reduction Recovery of eigenvectors 1. $Az=\lambda Bz$ $B=L{L}^{\mathrm{T}}$, $B={U}^{\mathrm{T}}U$ $C={L}^{-1}A{L}^{-\mathrm{T}}$, $C={U}^{-\mathrm{T}}A{U}^{-1}$ $z={L}^{-\mathrm{T}}y$, $z={U}^{-1}y$ 2. $ABz=\lambda z$ $B=L{L}^{\mathrm{T}}$, $B={U}^{\mathrm{T}}U$ $C={L}^{\mathrm{T}}AL$, $C=UA{U}^{\mathrm{T}}$ $z={L}^{-\mathrm{T}}y$, $z={U}^{-1}y$ 3. $BAz=\lambda z$ $B=L{L}^{\mathrm{T}}$, $B={U}^{\mathrm{T}}U$ $C={L}^{\mathrm{T}}AL$, $C=UA{U}^{\mathrm{T}}$ $z=Ly$, $z={U}^{\mathrm{T}}y$
Table 1
Reduction of generalized symmetric-definite eigenproblems to standard problems
When the generalized symmetric-definite problem has been reduced to the corresponding standard problem $Cy=\lambda y$, this may then be solved using the methods described in the previous section. No special methods are needed to recover the eigenvectors $z$ of the generalized problem from the eigenvectors $y$ of the standard problem, because these computations are simple applications of Level 2 or Level 3 BLAS (see F06 class).

Packed Storage for Symmetric Matrices

Methods which handle symmetric matrices are usually designed so that they use either the upper or lower triangle of the matrix; it is not necessary to store the whole matrix. If either the upper or lower triangle is stored conventionally in the upper or lower triangle of a two-dimensional array, the remaining elements of the array can be used to store other useful data. However, that is not always convenient, and if it is important to economize on storage, the upper or lower triangle can be stored in a one-dimensional array of length $n\left(n+1\right)/2$; that is, the storage is almost halved.
This storage format is referred to as packed storage; it is described in [Packed storage] in the F07 class Chapter Introduction.
Methods designed for packed storage are usually less efficient, especially on high-performance computers, so there is a trade-off between storage and efficiency.

Band Matrices

A band matrix is one whose elements are confined to a relatively small number of subdiagonals or superdiagonals on either side of the main diagonal. Algorithms can take advantage of bandedness to reduce the amount of work and storage required. The storage scheme for band matrices is described in [Band storage] in the F07 class Chapter Introduction.
If the problem is the generalized symmetric definite eigenvalue problem $Az=\lambda Bz$ and the matrices $A$ and $B$ are additionally banded, the matrix $C$ as defined in [Generalized Symmetric-definite Eigenvalue Problems] is, in general, full. We can reduce the problem to a banded standard problem by modifying the definition of $C$ thus:
 $C=XTAX, where X=U-1Q or ​L-TQ,$
where $Q$ is an orthogonal matrix chosen to ensure that $C$ has bandwidth no greater than that of $A$.
A further refinement is possible when $A$ and $B$ are banded, which halves the amount of work required to form $C$. Instead of the standard Cholesky factorization of $B$ as ${U}^{\mathrm{T}}U$ or $L{L}^{\mathrm{T}}$, we use a split Cholesky factorization $B={S}^{\mathrm{T}}S$, where
 $S=U11M21L22$
with ${U}_{11}$ upper triangular and ${L}_{22}$ lower triangular of order approximately $n/2$; $S$ has the same bandwidth as $B$.

Nonsymmetric Eigenvalue Problems

The nonsymmetric eigenvalue problem is to find the eigenvalues, $\lambda$, and corresponding eigenvectors, $v\ne 0$, such that
 $Av=λv.$
More precisely, a vector $v$ as just defined is called a right eigenvector of $A$, and a vector $u\ne 0$ satisfying
 $uTA=λuT uHA=λuH when ​u​ is complex$
is called a left eigenvector of $A$.
A real matrix $A$ may have complex eigenvalues, occurring as complex conjugate pairs.
This problem can be solved via the Schur factorization of $A$, defined in the real case as
 $A=ZTZT,$
where $Z$ is an orthogonal matrix and $T$ is an upper quasi-triangular matrix with $1$ by $1$ and $2$ by $2$ diagonal blocks, the $2$ by $2$ blocks corresponding to complex conjugate pairs of eigenvalues of $A$. In the complex case, the Schur factorization is
 $A=ZTZH,$
where $Z$ is unitary and $T$ is a complex upper triangular matrix.
The columns of $Z$ are called the Schur vectors. For each $k$ ($1\le k\le n$), the first $k$ columns of $Z$ form an orthonormal basis for the invariant subspace corresponding to the first $k$ eigenvalues on the diagonal of $T$. Because this basis is orthonormal, it is preferable in many applications to compute Schur vectors rather than eigenvectors. It is possible to order the Schur factorization so that any desired set of $k$ eigenvalues occupy the $k$ leading positions on the diagonal of $T$.
The two basic tasks of the nonsymmetric eigenvalue methods are to compute, for a given matrix $A$, all $n$ values of $\lambda$ and, if desired, their associated right eigenvectors $v$ and/or left eigenvectors $u$, and the Schur factorization.
These two basic tasks can be performed in the following stages.
 1 A general matrix $A$ is reduced to upper Hessenberg form $H$ which is zero below the first subdiagonal. The reduction may be written $A=QH{Q}^{\mathrm{T}}$ with $Q$ orthogonal if $A$ is real, or $A=QH{Q}^{\mathrm{H}}$ with $Q$ unitary if $A$ is complex. 2 The upper Hessenberg matrix $H$ is reduced to Schur form $T$, giving the Schur factorization $H=ST{S}^{\mathrm{T}}$ (for $H$ real) or $H=ST{S}^{\mathrm{H}}$ (for $H$ complex). The matrix $S$ (the Schur vectors of $H$) may optionally be computed as well. Alternatively $S$ may be postmultiplied into the matrix $Q$ determined in stage $1$, to give the matrix $Z=QS$, the Schur vectors of $A$. The eigenvalues are obtained from the diagonal elements or diagonal blocks of $T$. 3 Given the eigenvalues, the eigenvectors may be computed in two different ways. Inverse iteration can be performed on $H$ to compute the eigenvectors of $H$, and then the eigenvectors can be multiplied by the matrix $Q$ in order to transform them to eigenvectors of $A$. Alternatively the eigenvectors of $T$ can be computed, and optionally transformed to those of $H$ or $A$ if the matrix $S$ or $Z$ is supplied.
The accuracy with which eigenvalues can be obtained can often be improved by balancing a matrix. This is discussed further in [Balancing and condition for the nonsymmetric eigenproblem] below.

Generalized Nonsymmetric Eigenvalue Problem

The generalized nonsymmetric eigenvalue problem is to find the eigenvalues, $\lambda$, and corresponding eigenvectors, $v\ne 0$, such that
 $Av=λBv.$
More precisely, a vector $v$ as just defined is called a right eigenvector of the matrix pair $\left(A,B\right)$, and a vector $u\ne 0$ satisfying
 $uTA=λuTB uHA=λuHB​ when ​u​ is complex$
is called a left eigenvector of the matrix pair $\left(A,B\right)$.
If $B$ is singular then the problem has one or more infinite eigenvalues $\lambda =\infty$, corresponding to $Bv=0$. Note that if $A$ is nonsingular, then the equivalent problem $\mu Av=Bv$ is perfectly well defined and an infinite eigenvalue corresponds to $\mu =0$. To deal with both finite (including zero) and infinite eigenvalues, the methods in this chapter do not compute $\lambda$ explicitly, but rather return a pair of numbers $\left(\alpha ,\beta \right)$ such that if $\beta \ne 0$
 $λ=α/β$
and if $\alpha \ne 0$ and $\beta =0$ then $\lambda =\infty$. $\beta$ is always returned as real and non-negative. Of course, computationally an infinite eigenvalue may correspond to a small $\beta$ rather than an exact zero.
For a given pair $\left(A,B\right)$ the set of all the matrices of the form $\left(A-\lambda B\right)$ is called a matrix pencil and $\lambda$ and $v$ are said to be an eigenvalue and eigenvector of the pencil $\left(A-\lambda B\right)$. If $A$ and $B$ are both singular and share a common null space then
 $detA-λB≡0$
so that the pencil $\left(A-\lambda B\right)$ is singular for all $\lambda$. In other words any $\lambda$ can be regarded as an eigenvalue. In exact arithmetic a singular pencil will have $\alpha =\beta =0$ for some $\left(\alpha ,\beta \right)$. Computationally if some pair $\left(\alpha ,\beta \right)$ is small then the pencil is singular, or nearly singular, and no reliance can be placed on any of the computed eigenvalues. Singular pencils can also manifest themselves in other ways; see, in particular, Sections 2.3.5.2 and 4.11.1.4 of Anderson et al. (1999) for further details.
The generalized eigenvalue problem can be solved via the generalized Schur factorization of the pair $\left(A,B\right)$ defined in the real case as
 $A=QSZT, B=QTZT,$
where $Q$ and $Z$ are orthogonal, $T$ is upper triangular with non-negative diagonal elements and $S$ is upper quasi-triangular with $1$ by $1$ and $2$ by $2$ diagonal blocks, the $2$ by $2$ blocks corresponding to complex conjugate pairs of eigenvalues. In the complex case, the generalized Schur factorization is
 $A=QSZH, B=QTZH,$
where $Q$ and $Z$ are unitary and $S$ and $T$ are upper triangular, with $T$ having real non-negative diagonal elements. The columns of $Q$ and $Z$ are called respectively the left and right generalized Schur vectors and span pairs of deflating subspaces of $A$ and $B$, which are a generalization of invariant subspaces.
It is possible to order the generalized Schur factorization so that any desired set of $k$ eigenvalues correspond to the $k$ leading positions on the diagonals of the pair $\left(S,T\right)$.
The two basic tasks of the generalized nonsymmetric eigenvalue methods are to compute, for a given pair $\left(A,B\right)$, all $n$ values of $\lambda$ and, if desired, their associated right eigenvectors $v$ and/or left eigenvectors $u$, and the generalized Schur factorization.
These two basic tasks can be performed in the following stages.
 1 The matrix pair $\left(A,B\right)$ is reduced to generalized upper Hessenberg form $\left(H,R\right)$, where $H$ is upper Hessenberg (zero below the first subdiagonal) and $R$ is upper triangular. The reduction may be written as $A={Q}_{1}H{Z}_{1}^{\mathrm{T}},B={Q}_{1}R{Z}_{1}^{\mathrm{T}}$ in the real case with ${Q}_{1}$ and ${Z}_{1}$ orthogonal, and $A={Q}_{1}H{Z}_{1}^{\mathrm{H}},B={Q}_{1}R{Z}_{1}^{\mathrm{H}}$ in the complex case with ${Q}_{1}$ and ${Z}_{1}$ unitary. 2 The generalized upper Hessenberg form $\left(H,R\right)$ is reduced to the generalized Schur form $\left(S,T\right)$ using the generalized Schur factorization $H={Q}_{2}S{Z}_{2}^{\mathrm{T}}$, $R={Q}_{2}T{Z}_{2}^{\mathrm{T}}$ in the real case with ${Q}_{2}$ and ${Z}_{2}$ orthogonal, and $H={Q}_{2}S{Z}_{2}^{\mathrm{H}},R={Q}_{2}T{Z}_{2}^{\mathrm{H}}$ in the complex case. The generalized Schur vectors of $\left(A,B\right)$ are given by $Q={Q}_{1}{Q}_{2}$, $Z={Z}_{1}{Z}_{2}$. The eigenvalues are obtained from the diagonal elements (or blocks) of the pair $\left(S,T\right)$. 3 Given the eigenvalues, the eigenvectors of the pair $\left(S,T\right)$ can be computed, and optionally transformed to those of $\left(H,R\right)$ or $\left(A,B\right)$.
The accuracy with which eigenvalues can be obtained can often be improved by balancing a matrix pair. This is discussed further in [Balancing the generalized eigenvalue problem] below.

The Sylvester Equation and the Generalized Sylvester Equation

The Sylvester equation is a matrix equation of the form
 $AX+XB=C,$
where $A$, $B$, and $C$ are given matrices with $A$ being $m$ by $m$, $B$ an $n$ by $n$ matrix and $C$, and the solution matrix $X$, $m$ by $n$ matrices. The solution of a special case of this equation occurs in the computation of the condition number for an invariant subspace, but a combination of methods in this chapter allows the solution of the general Sylvester equation.
Methods are also provided for solving a special case of the generalized Sylvester equations
 $AR-LB=C, DR-LE=F,$
where $\left(A,D\right)$, $\left(B,E\right)$ and $\left(C,F\right)$ are given matrix pairs, and $R$ and $L$ are the solution matrices.

Error and Perturbation Bounds and Condition Numbers

In this section we discuss the effects of rounding errors in the solution process and the effects of uncertainties in the data, on the solution to the problem. A number of the methods in this chapter return information, such as condition numbers, that allow these effects to be assessed. First we discuss some notation used in the error bounds of later sections.
The bounds usually contain the factor $p\left(n\right)$ (or $p\left(m,n\right)$), which grows as a function of the matrix dimension $n$ (or matrix dimensions $m$ and $n$). It measures how errors can grow as a function of the matrix dimension, and represents a potentially different function for each problem. In practice, it usually grows just linearly; $p\left(n\right)\le 10n$ is often true, although generally only much weaker bounds can be actually proved. We normally describe $p\left(n\right)$ as a ‘modestly growing’ function of $n$. For detailed derivations of various $p\left(n\right)$, see Golub and Van Loan (2012) and Wilkinson (1965).
For linear equation (see F07 class) and least squares solvers, we consider bounds on the relative error $‖x-\stackrel{^}{x}‖/‖x‖$ in the computed solution $\stackrel{^}{x}$, where $x$ is the true solution. For eigenvalue problems we consider bounds on the error $\left|{\lambda }_{i}-{\stackrel{^}{\lambda }}_{i}\right|$ in the $i$th computed eigenvalue ${\stackrel{^}{\lambda }}_{i}$, where ${\lambda }_{i}$ is the true $i$th eigenvalue. For singular value problems we similarly consider bounds $\left|{\sigma }_{i}-{\stackrel{^}{\sigma }}_{i}\right|$.
Bounding the error in computed eigenvectors and singular vectors ${\stackrel{^}{v}}_{i}$ is more subtle because these vectors are not unique: even though we restrict ${‖{\stackrel{^}{v}}_{i}‖}_{2}=1$ and ${‖{v}_{i}‖}_{2}=1$, we may still multiply them by arbitrary constants of absolute value $1$. So to avoid ambiguity we bound the angular difference between ${\stackrel{^}{v}}_{i}$ and the true vector ${v}_{i}$, so that
 $θvi,v^i=acute angle between ​vi​ and ​v^i=arccosviHv^i.$ (2)
Here $\mathrm{arccos}\left(\theta \right)$ is in the standard range: $0\le \mathrm{arccos}\left(\theta \right)<\pi$. When $\theta \left({v}_{i},{\stackrel{^}{v}}_{i}\right)$ is small, we can choose a constant $\alpha$ with absolute value $1$ so that ${‖\alpha {v}_{i}-{\stackrel{^}{v}}_{i}‖}_{2}\approx \theta \left({v}_{i},{\stackrel{^}{v}}_{i}\right)$.
In addition to bounds for individual eigenvectors, bounds can be obtained for the spaces spanned by collections of eigenvectors. These may be much more accurately determined than the individual eigenvectors which span them. These spaces are called invariant subspaces in the case of eigenvectors, because if $v$ is any vector in the space, $Av$ is also in the space, where $A$ is the matrix. Again, we will use angle to measure the difference between a computed space $\stackrel{^}{S}$ and the true space $S$:
 $θS,S^=acute angle between ​S​ and ​S^=maxs∈Ss≠0mins^∈S^s^≠0θs,s^ or maxs^∈S^s^≠0mins∈Ss≠0θs,s^$ (3)
$\theta \left(S,\stackrel{^}{S}\right)$ may be computed as follows. Let $S$ be a matrix whose columns are orthonormal and $\mathrm{span} S$. Similarly let $\stackrel{^}{S}$ be an orthonormal matrix with columns spanning $\stackrel{^}{S}$. Then
 $θS,S^=arccos σminSHS^.$
Finally, we remark on the accuracy of the bounds when they are large. Relative errors like $‖\stackrel{^}{x}-x‖/‖x‖$ and angular errors like $\theta \left({\stackrel{^}{v}}_{i},{v}_{i}\right)$ are only of interest when they are much less than $1$. Some stated bounds are not strictly true when they are close to $1$, but rigorous bounds are much more complicated and supply little extra information in the interesting case of small errors. These bounds are indicated by using the symbol $\lesssim$, or ‘approximately less than’, instead of the usual $\le$. Thus, when these bounds are close to 1 or greater, they indicate that the computed answer may have no significant digits at all, but do not otherwise bound the error.
A number of methods in this chapter return error estimates and/or condition number estimates directly. In other cases Anderson et al. (1999) gives code fragments to illustrate the computation of these estimates, and a number of the F08 class example programs, for the driver methods, implement these code fragments.

Least squares problems

The conventional error analysis of linear least squares problems goes as follows. The problem is to find the $x$ minimizing ${‖Ax-b‖}_{2}$. Let $\stackrel{^}{x}$ be the solution computed using one of the methods described above. We discuss the most common case, where $A$ is overdetermined (i.e., has more rows than columns) and has full rank.
Then the computed solution $\stackrel{^}{x}$ has a small normwise backward error. In other words $\stackrel{^}{x}$ minimizes ${‖\left(A+E\right)\stackrel{^}{x}-\left(b+f\right)‖}_{2}$, where
 $maxE2A2,f2b2≤pnε$
and $p\left(n\right)$ is a modestly growing function of $n$ and $\epsilon$ is the machine precision. Let ${\kappa }_{2}\left(A\right)={\sigma }_{\mathrm{max}}\left(A\right)/{\sigma }_{\mathrm{min}}\left(A\right)$, $\rho ={‖Ax-b‖}_{2}$, and $\mathrm{sin}\left(\theta \right)=\rho /{‖b‖}_{2}$. Then if $p\left(n\right)\epsilon$ is small enough, the error $\stackrel{^}{x}-x$ is bounded by
 $x-x^2x2≲pnε2κ2Acosθ+tanθκ22A.$
If $A$ is rank-deficient, the problem can be regularized by treating all singular values less than a user-specified threshold as exactly zero. See Golub and Van Loan (2012) for error bounds in this case, as well as for the underdetermined case.
The solution of the overdetermined, full-rank problem may also be characterised as the solution of the linear system of equations
 $IAAT0rx=b0.$
By solving this linear system (see F07 class) component-wise error bounds can also be obtained (see Arioli et al. (1989)).

The singular value decomposition

The usual error analysis of the SVD algorithm is as follows (see Golub and Van Loan (2012)).
The computed SVD, $\stackrel{^}{U}\stackrel{^}{\Sigma }{\stackrel{^}{V}}^{\mathrm{T}}$, is nearly the exact SVD of $A+E$, i.e., $A+E=\left(\stackrel{^}{U}+\delta \stackrel{^}{U}\right)\stackrel{^}{\Sigma }\left(\stackrel{^}{V}+\delta \stackrel{^}{V}\right)$ is the true SVD, so that $\stackrel{^}{U}+\delta \stackrel{^}{U}$ and $\stackrel{^}{V}+\delta \stackrel{^}{V}$ are both orthogonal, where ${‖E‖}_{2}/{‖A‖}_{2}\le p\left(m,n\right)\epsilon$, $‖\delta \stackrel{^}{U}‖\le p\left(m,n\right)\epsilon$, and $‖\delta \stackrel{^}{V}‖\le p\left(m,n\right)\epsilon$. Here $p\left(m,n\right)$ is a modestly growing function of $m$ and $n$ and $\epsilon$ is the machine precision. Each computed singular value ${\stackrel{^}{\sigma }}_{i}$ differs from the true ${\sigma }_{i}$ by an amount satisfying the bound
 $σ^i-σi≤pm,nεσ1.$
Thus large singular values (those near ${\sigma }_{1}$) are computed to high relative accuracy and small ones may not be.
The angular difference between the computed left singular vector ${\stackrel{^}{u}}_{i}$ and the true ${u}_{i}$ satisfies the approximate bound
 $θu^i,ui≲pm,nεA2gapi$
where
 $gapi=minj≠iσi-σj$
is the absolute gap between ${\sigma }_{i}$ and the nearest other singular value. Thus, if ${\sigma }_{i}$ is close to other singular values, its corresponding singular vector ${u}_{i}$ may be inaccurate. The same bound applies to the computed right singular vector ${\stackrel{^}{v}}_{i}$ and the true vector ${v}_{i}$. The gaps may be easily obtained from the computed singular values.
Let $\stackrel{^}{\mathbit{S}}$ be the space spanned by a collection of computed left singular vectors $\left\{{\stackrel{^}{u}}_{i},i\in \mathbit{I}\right\}$, where $\mathbit{I}$ is a subset of the integers from $1$ to $n$. Let $\mathbit{S}$ be the corresponding true space. Then
 $θS^,S≲pm,nεA2gapI.$
where
 $gapI=minσi-σj for ​i∈I,j∉I$
is the absolute gap between the singular values in $\mathbit{I}$ and the nearest other singular value. Thus, a cluster of close singular values which is far away from any other singular value may have a well determined space $\stackrel{^}{\mathbit{S}}$ even if its individual singular vectors are ill-conditioned. The same bound applies to a set of right singular vectors $\left\{{\stackrel{^}{v}}_{i},i\in \mathbit{I}\right\}$.
In the special case of bidiagonal matrices, the singular values and singular vectors may be computed much more accurately (see Demmel and Kahan (1990)). A bidiagonal matrix $B$ has nonzero entries only on the main diagonal and the diagonal immediately above it (or immediately below it). Reduction of a dense matrix to bidiagonal form $B$ can introduce additional errors, so the following bounds for the bidiagonal case do not apply to the dense case.
Using the methods in this chapter, each computed singular value of a bidiagonal matrix is accurate to nearly full relative accuracy, no matter how tiny it is, so that
 $σ^i-σi≤pm,nεσi.$
The computed left singular vector ${\stackrel{^}{u}}_{i}$ has an angular error at most about
 $θu^i,ui≲pm,nεrelgapi$
where
 $relgapi=minj≠iσi-σj/σi+σj$
is the relative gap between ${\sigma }_{i}$ and the nearest other singular value. The same bound applies to the right singular vector ${\stackrel{^}{v}}_{i}$ and ${v}_{i}$. Since the relative gap may be much larger than the absolute gap, this error bound may be much smaller than the previous one. The relative gaps may be easily obtained from the computed singular values.

The symmetric eigenproblem

The usual error analysis of the symmetric eigenproblem is as follows (see Parlett (1998)).
The computed eigendecomposition $\stackrel{^}{Z}\stackrel{^}{\Lambda }{\stackrel{^}{Z}}^{\mathrm{T}}$ is nearly the exact eigendecomposition of $A+E$, i.e., $A+E=\left(\stackrel{^}{Z}+\delta \stackrel{^}{Z}\right)\stackrel{^}{\Lambda }{\left(\stackrel{^}{Z}+\delta \stackrel{^}{Z}\right)}^{\mathrm{T}}$ is the true eigendecomposition so that $\stackrel{^}{Z}+\delta \stackrel{^}{Z}$ is orthogonal, where ${‖E‖}_{2}/{‖A‖}_{2}\le p\left(n\right)\epsilon$ and ${‖\delta \stackrel{^}{Z}‖}_{2}\le p\left(n\right)\epsilon$ and $p\left(n\right)$ is a modestly growing function of $n$ and $\epsilon$ is the machine precision. Each computed eigenvalue ${\stackrel{^}{\lambda }}_{i}$ differs from the true ${\lambda }_{i}$ by an amount satisfying the bound
 $λ^i-λi≤pnεA2.$
Thus large eigenvalues (those near $\underset{i}{\mathrm{max}}\phantom{\rule{0.25em}{0ex}}\left|{\lambda }_{i}\right|={‖A‖}_{2}$) are computed to high relative accuracy and small ones may not be.
The angular difference between the computed unit eigenvector ${\stackrel{^}{z}}_{i}$ and the true ${z}_{i}$ satisfies the approximate bound
 $θz^i,zi≲pnεA2gapi$
if $p\left(n\right)\epsilon$ is small enough, where
 $gapi=minj≠iλi-λj$
is the absolute gap between ${\lambda }_{i}$ and the nearest other eigenvalue. Thus, if ${\lambda }_{i}$ is close to other eigenvalues, its corresponding eigenvector ${z}_{i}$ may be inaccurate. The gaps may be easily obtained from the computed eigenvalues.
Let $\stackrel{^}{\mathbit{S}}$ be the invariant subspace spanned by a collection of eigenvectors $\left\{{\stackrel{^}{z}}_{i},i\in \mathbit{I}\right\}$, where $\mathbit{I}$ is a subset of the integers from $1$ to $n$. Let $\mathbit{S}$ be the corresponding true subspace. Then
 $θS^,S≲pnεA2gapI$
where
 $gapI=minλi-λj for ​i∈I,j∉I$
is the absolute gap between the eigenvalues in $\mathbit{I}$ and the nearest other eigenvalue. Thus, a cluster of close eigenvalues which is far away from any other eigenvalue may have a well determined invariant subspace $\stackrel{^}{\mathbit{S}}$ even if its individual eigenvectors are ill-conditioned.
In the special case of a real symmetric tridiagonal matrix $T$, methods in this chapter can compute the eigenvalues and eigenvectors much more accurately. See Anderson et al. (1999) for further details.

The generalized symmetric-definite eigenproblem

The three types of problem to be considered are $A-\lambda B$, $AB-\lambda I$ and $BA-\lambda I$. In each case $A$ and $B$ are real symmetric (or complex Hermitian) and $B$ is positive definite. We consider each case in turn, assuming that methods in this chapter are used to transform the generalized problem to the standard symmetric problem, followed by the solution of the symmetric problem. In all cases
 $gapi=minj≠iλi-λj$
is the absolute gap between ${\lambda }_{i}$ and the nearest other eigenvalue.
1. $A-\lambda B$. The computed eigenvalues ${\stackrel{^}{\lambda }}_{i}$ can differ from the true eigenvalues ${\lambda }_{i}$ by an amount
 $λ^i-λi≲pnεB-12A2.$
The angular difference between the computed eigenvector ${\stackrel{^}{z}}_{i}$ and the true eigenvector ${z}_{i}$ is
 $θz^i,zi≲pnεB-12A2κ2B1/2gapi.$
2. $AB-\lambda I$ or $BA-\lambda I$. The computed eigenvalues ${\stackrel{^}{\lambda }}_{i}$ can differ from the true eigenvalues ${\lambda }_{i}$ by an amount
 $λ^i-λi≲pnεB2A2.$
The angular difference between the computed eigenvector ${\stackrel{^}{z}}_{i}$ and the true eigenvector ${z}_{i}$ is
 $θz^i,zi≲pnεB2A2κ2B1/2gapi.$
These error bounds are large when $B$ is ill-conditioned with respect to inversion (${\kappa }_{2}\left(B\right)$ is large). It is often the case that the eigenvalues and eigenvectors are much better conditioned than indicated here. One way to get tighter bounds is effective when the diagonal entries of $B$ differ widely in magnitude, as for example with a graded matrix.
 1 $A-\lambda B$. Let $D=\mathrm{diag}\left({b}_{11}^{-1/2},\dots ,{b}_{nn}^{-1/2}\right)$ be a diagonal matrix. Then replace $B$ by $DBD$ and $A$ by $DAD$ in the above bounds. 2 $AB-\lambda I$ or $BA-\lambda I$. Let $D=\mathrm{diag}\left({b}_{11}^{-1/2},\dots ,{b}_{nn}^{-1/2}\right)$ be a diagonal matrix. Then replace $B$ by $DBD$ and $A$ by ${D}^{-1}A{D}^{-1}$ in the above bounds.
Further details can be found in Anderson et al. (1999).

The nonsymmetric eigenproblem

The nonsymmetric eigenvalue problem is more complicated than the symmetric eigenvalue problem. In this section, we just summarise the bounds. Further details can be found in Anderson et al. (1999).
We let ${\stackrel{^}{\lambda }}_{i}$ be the $i$th computed eigenvalue and ${\lambda }_{i}$ the $i$th true eigenvalue. Let ${\stackrel{^}{v}}_{i}$ be the corresponding computed right eigenvector, and ${v}_{i}$ the true right eigenvector (so $A{v}_{i}={\lambda }_{i}{v}_{i}$). If $\mathbit{I}$ is a subset of the integers from $1$ to $n$, we let ${\lambda }_{\mathbit{I}}$ denote the average of the selected eigenvalues: ${\lambda }_{\mathbit{I}}=\left(\sum _{i\in \mathbit{I}}\phantom{\rule{0.25em}{0ex}}{\lambda }_{i}\right)/\left(\sum _{i\in \mathbit{I}}\phantom{\rule{0.25em}{0ex}}1\right)$, and similarly for ${\stackrel{^}{\lambda }}_{\mathbit{I}}$. We also let ${\mathbit{S}}_{\mathbit{I}}$ denote the subspace spanned by $\left\{{v}_{i},i\in \mathbit{I}\right\}$; it is called a right invariant subspace because if $v$ is any vector in ${\mathbit{S}}_{\mathbit{I}}$ then $Av$ is also in ${\mathbit{S}}_{\mathbit{I}}$. ${\stackrel{^}{\mathbit{S}}}_{\mathbit{I}}$ is the corresponding computed subspace.
The algorithms for the nonsymmetric eigenproblem are normwise backward stable: they compute the exact eigenvalues, eigenvectors and invariant subspaces of slightly perturbed matrices $\left(A+E\right)$, where $‖E‖\le p\left(n\right)\epsilon ‖A‖$. Some of the bounds are stated in terms of ${‖E‖}_{2}$ and others in terms of ${‖E‖}_{F}$; one may use $p\left(n\right)\epsilon$ for either quantity.
Methods are provided so that, for each (${\stackrel{^}{\lambda }}_{i},{\stackrel{^}{v}}_{i}$) pair the two values ${s}_{i}$ and ${\mathit{sep}}_{i}$, or for a selected subset $\mathbit{I}$ of eigenvalues the values ${s}_{\mathbit{I}}$ and ${\mathit{sep}}_{\mathbit{I}}$ can be obtained, for which the error bounds in Table 2 are true for sufficiently small $‖E‖$, (which is why they are called asymptotic):
 Simple eigenvalue $\left|{\stackrel{^}{\lambda }}_{i}-{\lambda }_{i}\right|\lesssim {‖E‖}_{2}/{s}_{i}$ Eigenvalue cluster $\left|{\stackrel{^}{\lambda }}_{\mathbit{I}}-{\lambda }_{\mathbit{I}}\right|\lesssim {‖E‖}_{2}/{s}_{\mathbit{I}}$ Eigenvector $\theta \left({\stackrel{^}{v}}_{i},{v}_{i}\right)\lesssim {‖E‖}_{F}/{\mathit{sep}}_{i}$ Invariant subspace $\theta \left({\stackrel{^}{S}}_{\mathbit{I}},{S}_{\mathbit{I}}\right)\lesssim {‖E‖}_{F}/{\mathit{sep}}_{\mathbit{I}}$
Table 2
Asymptotic error bounds for the nonsymmetric eigenproblem
If the problem is ill-conditioned, the asymptotic bounds may only hold for extremely small $‖E‖$. The global error bounds of Table 3 are guaranteed to hold for all ${‖E‖}_{F}:
 Simple eigenvalue $\left|{\stackrel{^}{\lambda }}_{i}-{\lambda }_{i}\right|\le n{‖E‖}_{2}/{s}_{i}$ Holds for all $E$ Eigenvalue cluster $\left|{\stackrel{^}{\lambda }}_{\mathbit{I}}-{\lambda }_{\mathbit{I}}\right|\le 2{‖E‖}_{2}/{s}_{\mathbit{I}}$ Requires ${‖E‖}_{F}<{s}_{\mathbit{I}}×{\mathit{sep}}_{\mathbit{I}}/4$ Eigenvector $\theta \left({\stackrel{^}{v}}_{i},{v}_{i}\right)\le \mathrm{arctan}\left(2{‖E‖}_{F}/\left({\mathit{sep}}_{i}-4{‖E‖}_{F}/{s}_{i}\right)\right)$ Requires ${‖E‖}_{F}<{s}_{i}×{\mathit{sep}}_{i}/4$ Invariant subspace $\theta \left({\stackrel{^}{S}}_{\mathbit{I}},{S}_{\mathbit{I}}\right)\le \mathrm{arctan}\left(2{‖E‖}_{F}/\left({\mathit{sep}}_{\mathbit{I}}-4{‖E‖}_{F}/{s}_{\mathbit{I}}\right)\right)$ Requires ${‖E‖}_{F}<{s}_{\mathbit{I}}×{\mathit{sep}}_{\mathbit{I}}/4$
Table 3
Global error bounds for the nonsymmetric eigenproblem

Balancing and condition for the nonsymmetric eigenproblem

There are two preprocessing steps one may perform on a matrix $A$ in order to make its eigenproblem easier. The first is permutation, or reordering the rows and columns to make $A$ more nearly upper triangular (closer to Schur form): ${A}^{\prime }=PA{P}^{\mathrm{T}}$, where $P$ is a permutation matrix. If ${A}^{\prime }$ is permutable to upper triangular form (or close to it), then no floating-point operations (or very few) are needed to reduce it to Schur form. The second is scaling by a diagonal matrix $D$ to make the rows and columns of ${A}^{\prime }$ more nearly equal in norm: ${A}^{\prime \prime }=D{A}^{\prime }{D}^{-1}$. Scaling can make the matrix norm smaller with respect to the eigenvalues, and so possibly reduce the inaccuracy contributed by roundoff (see Chapter 11 of Wilkinson and Reinsch (1971)). We refer to these two operations as balancing.
Permuting has no effect on the condition numbers or their interpretation as described previously. Scaling, however, does change their interpretation and further details can be found in Anderson et al. (1999).

The generalized nonsymmetric eigenvalue problem

The algorithms for the generalized nonsymmetric eigenvalue problem are normwise backward stable: they compute the exact eigenvalues (as the pairs $\left(\alpha ,\beta \right)$), eigenvectors and deflating subspaces of slightly perturbed pairs $\left(A+E,B+F\right)$, where
 $E,FF≤pnεA,BF.$
Asymptotic and global error bounds can be obtained, which are generalizations of those given in Tables 2 and 3. See Section 4.11 of Anderson et al. (1999) for details. Methods are provided to compute estimates of reciprocal conditions numbers for eigenvalues and eigenspaces.

Balancing the generalized eigenvalue problem

As with the standard nonsymmetric eigenvalue problem, there are two preprocessing steps one may perform on a matrix pair $\left(A,B\right)$ in order to make its eigenproblem easier; permutation and scaling, which together are referred to as balancing, as indicated in the following two steps.
1. The balancing method first attempts to permute $A$ and $B$ to block upper triangular form by a similarity transformation:
 $PAPT=F=F11F12F13F22F23F33,PBPT=G=G11G12G13G22G23G33,$
where $P$ is a permutation matrix, ${F}_{11}$, ${F}_{33}$, ${G}_{11}$ and ${G}_{33}$ are upper triangular. Then the diagonal elements of the matrix $\left({F}_{11},{G}_{11}\right)$ and $\left({G}_{33},{H}_{33}\right)$ are generalized eigenvalues of $\left(A,B\right)$. The rest of the generalized eigenvalues are given by the matrix pair $\left({F}_{22},{G}_{22}\right)$. Subsequent operations to compute the eigenvalues of $\left(A,B\right)$ need only be applied to the matrix $\left({F}_{22},{G}_{22}\right)$; this can save a significant amount of work if $\left({F}_{22},{G}_{22}\right)$ is smaller than the original matrix pair $\left(A,B\right)$. If no suitable permutation exists (as is often the case), then there is no gain in efficiency or accuracy.
2. The balancing method applies a diagonal similarity transformation to $\left(F,G\right)$, to make the rows and columns of $\left({F}_{22},{G}_{22}\right)$ as close as possible in the norm:
 $DFD-1=ID22IF11F12F13F22F23F33ID22-1I,DGD-1=ID22IG11G12G13G22G23G33ID22-1I.$
This transformation usually improves the accuracy of computed generalized eigenvalues and eigenvectors. However, there are exceptional occasions when this transformation increases the norm of the pencil; in this case accuracy could be lower with diagonal balancing.
See Anderson et al. (1999) for further details.

Other problems

Error bounds for other problems such as the generalized linear least squares problem and generalized singular value decomposition can be found in Anderson et al. (1999).

Block Partitioned Algorithms

A number of the methods in this chapter use what is termed a block partitioned algorithm. This means that at each major step of the algorithm a block of rows or columns is updated, and much of the computation is performed by matrix-matrix operations on these blocks. The matrix-matrix operations are performed by calls to the Level 3 BLAS (see F06 class), which are the key to achieving high performance on many modern computers. In the case of the $QR$ algorithm for reducing an upper Hessenberg matrix to Schur form, a multishift strategy is used in order to improve performance. See Golub and Van Loan (2012) or Anderson et al. (1999) for more about block partitioned algorithms and the multishift strategy.
The performance of a block partitioned algorithm varies to some extent with the block size – that is, the number of rows or columns per block. This is a machine-dependent parameter, which is set to a suitable value when the library is implemented on each range of machines. You do not normally need to be aware of what value is being used. Different block sizes may be used for different methods. Values in the range $16$ to $64$ are typical.
On more conventional machines there is often no advantage from using a block partitioned algorithm, and then the methods use an unblocked algorithm (effectively a block size of $1$), relying solely on calls to the Level 2 BLAS (see F06 class again).
The only situation in which you need some awareness of the block size is when it affects the amount of workspace to be supplied to a particular method. This is discussed in [Length of work arrays].

References

Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J J, Du Croz J J, Greenbaum A, Hammarling S, McKenney A and Sorensen D (1999) LAPACK Users' Guide (3rd Edition) SIAM, Philadelphia
Arioli M, Duff I S and de Rijk P P M (1989) On the augmented system approach to sparse least squares problems Numer. Math. 55 667–684
Demmel J W and Kahan W (1990) Accurate singular values of bidiagonal matrices SIAM J. Sci. Statist. Comput. 11 873–912
Golub G H and Van Loan C F (2012) Matrix Computations (4th Edition) Johns Hopkins University Press, Baltimore
Moler C B and Stewart G W (1973) An algorithm for generalized matrix eigenproblems SIAM J. Numer. Anal. 10 241–256
Parlett B N (1998) The Symmetric Eigenvalue Problem SIAM, Philadelphia
Stewart G W and Sun J-G (1990) Matrix Perturbation Theory Academic Press, London
Ward R C (1981) Balancing the generalized eigenvalue problem SIAM J. Sci. Stat. Comp. 2 141–152
Wilkinson J H (1965) The Algebraic Eigenvalue Problem Oxford University Press, Oxford
Wilkinson J H and Reinsch C (1971) Handbook for Automatic Computation II, Linear Algebra Springer–Verlag

Inheritance Hierarchy

System..::..Object
NagLibrary..::..F08