NAG FL Interface
f10caf (svd_rowext_real)
1
Purpose
f10caf computes the singular value decomposition (SVD) of a real $m$ by $n$ matrix $A$, optionally computing the left and/or right singular vectors by using a randomised numerical linear algebra (RNLA) method.
2
Specification
Fortran Interface
Subroutine f10caf ( 
jobu, jobvt, m, n, a, lda, k, rtol_abs, rtol_rel, state, s, u, ldu, vt, ldvt, r, ifail) 
Integer, Intent (In) 
:: 
m, n, lda, k, ldu, ldvt 
Integer, Intent (Inout) 
:: 
state(*), ifail 
Integer, Intent (Out) 
:: 
r 
Real (Kind=nag_wp), Intent (In) 
:: 
a(lda,*), rtol_abs, rtol_rel 
Real (Kind=nag_wp), Intent (Inout) 
:: 
s(k), u(ldu,*), vt(ldvt,*) 
Character (1), Intent (In) 
:: 
jobu, jobvt 

C Header Interface
#include <nag.h>
void 
f10caf_ (const char *jobu, const char *jobvt, const Integer *m, const Integer *n, const double a[], const Integer *lda, const Integer *k, const double *rtol_abs, const double *rtol_rel, Integer state[], double s[], double u[], const Integer *ldu, double vt[], const Integer *ldvt, Integer *r, Integer *ifail, const Charlen length_jobu, const Charlen length_jobvt) 

C++ Header Interface
#include <nag.h> extern "C" {
void 
f10caf_ (const char *jobu, const char *jobvt, const Integer &m, const Integer &n, const double a[], const Integer &lda, const Integer &k, const double &rtol_abs, const double &rtol_rel, Integer state[], double s[], double u[], const Integer &ldu, double vt[], const Integer &ldvt, Integer &r, Integer &ifail, const Charlen length_jobu, const Charlen length_jobvt) 
}

The routine may be called by the names f10caf or nagf_rnla_svd_rowext_real.
3
Description
The SVD is written as
where
$\Sigma $ is an
$m$ by
$n$ matrix which is zero except for its
$\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$ diagonal elements,
$U$ is an
$m$ by
$m$ orthogonal matrix, and
$V$ is an
$n$ by
$n$ orthogonal matrix. The diagonal elements of
$\Sigma $ are the singular values of
$A$; they are real and nonnegative, and are returned in descending order. The first
$\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$ columns of
$U$ and
$V$ are the left and right singular vectors of
$A$.
Note that the routine returns ${V}^{\mathrm{T}}$, not $V$.
If the rank of $A$ is $r<\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$, then $\sigma $ has $r$ nonzero elements, and only $r$ columns of $U$ and $V$ are welldefined. In this case we can reduce $\sigma $ to an $r$ by $r$ matrix, $U$ to an $m$ by $r$ matrix and ${V}^{\mathrm{T}}$ to an $r$ by $n$ matrix.
f10caf is designed for efficiently computing the SVD in the case $r<\mathrm{min}\phantom{\rule{0.125em}{0ex}}\left(m,n\right)$. The input argument $k$ should be greater than $r$ by a small oversampling parameter, $\delta $, such that $k=r+\delta $. A reasonable value for $\delta $, to compute the SVD to within machine precision, is $\delta =5$. The value of $\delta $ should not vary based on $m$ or $n$. If $r$ is not known then the routine can be used iteratively to refine the estimate and accuracy of the computed SVD; using a larger value of $\delta $ than necessary increases the computational cost of the routine.
As a byproduct of computing the SVD, the routine estimates $r$.
If the input argument
$k$ is less than
$r$ the accuracy depends on the
$\left(k+1\right)$th singular value,
${\sigma}_{k+1}$. See
Section 7 for more details.
A call to
f10caf consists of the following:

1.A random projection is applied, $Y=A\Omega $, where $\Omega $ is an $n$ by $k$ matrix. (Note that the product $A\Omega $ is computed using a Fast Fourier Transform, so can be computed in $\mathit{O}\left(mn\mathrm{log}\left(k\right)\right)$ time.) See f10daf for more details on the random projection.

2.A pivoted $QR$ decomposition of $Y$ is calculated (see f08bef for more details). The rank estimate is then $r$ such that, on the diagonal of $R$,
where ${\epsilon}_{a}$ and ${\epsilon}_{r}$ are the absolute and relative error tolerances, respectively, and $r$ is the largest diagonal index for which the above relation holds.

3.Obtain the SVD from the $QR$ decomposition of $Y$ (or, depending on the rank, an approximation to the SVD) of $A$. This is referred to as row extraction.
Further details of the randomized SVD procedure can be found in Sections 4 and 5 of
Halko et al. (2011).
4
References
Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J J, Du Croz J J, Greenbaum A, Hammarling S, McKenney A and Sorensen D (1999)
LAPACK Users' Guide (3rd Edition) SIAM, Philadelphia
https://www.netlib.org/lapack/lug
Golub G H and Van Loan C F (1996) Matrix Computations (3rd Edition) Johns Hopkins University Press, Baltimore
Halko N (2012) Randomized methods for computing lowrank approximations of matrices PhD thesis
Halko N, Martinsson P G and Tropp J A (2011) Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions
SIAM Rev. 53(2) 217–288
https://epubs.siam.org/doi/abs/10.1137/090771806
5
Arguments

1:
$\mathbf{jobu}$ – Character(1)
Input

On entry: specifies options for computing part of or none of the matrix
$U$.
 ${\mathbf{jobu}}=\text{'S'}$
 The first $k$ columns of $U$ (the left singular vectors) are returned in the array u.
 ${\mathbf{jobu}}=\text{'N'}$
 No columns of $U$ (no left singular vectors) are computed.
Constraint:
${\mathbf{jobu}}=\text{'S'}$ or $\text{'N'}$.

2:
$\mathbf{jobvt}$ – Character(1)
Input

On entry: specifies options for computing part of or none of the matrix
${V}^{\mathrm{T}}$.
 ${\mathbf{jobvt}}=\text{'S'}$
 The first $k$ rows of ${V}^{\mathrm{T}}$ (the right singular vectors) are returned in the array vt.
 ${\mathbf{jobvt}}=\text{'N'}$
 No rows of ${V}^{\mathrm{T}}$ (no right singular vectors) are computed.
Constraint:
${\mathbf{jobvt}}=\text{'S'}$ or $\text{'N'}$.

3:
$\mathbf{m}$ – Integer
Input

On entry: $m$, the number of rows of the matrix $A$.
Constraint:
${\mathbf{m}}>0$.

4:
$\mathbf{n}$ – Integer
Input

On entry: $n$, the number of columns of the matrix $A$.
Constraint:
${\mathbf{n}}>0$.

5:
$\mathbf{a}\left({\mathbf{lda}},*\right)$ – Real (Kind=nag_wp) array
Input

Note: the second dimension of the array
a
must be at least
${\mathbf{n}}$.
On entry: the $m$ by $n$ matrix $A$.

6:
$\mathbf{lda}$ – Integer
Input

On entry: the first dimension of the array
a as declared in the (sub)program from which
f10caf is called.
Constraint:
${\mathbf{lda}}\ge {\mathbf{m}}$.

7:
$\mathbf{k}$ – Integer
Input

On entry: $k$, number of columns in random projection, $Y=A\Omega $.
Constraint:
$0<{\mathbf{k}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{m}},{\mathbf{n}}\right)$.

8:
$\mathbf{rtol\_abs}$ – Real (Kind=nag_wp)
Input

On entry: the absolute tolerance, ${\epsilon}_{a}$ used in defining the threshold on estimating the rank of $A$. If ${\mathbf{rtol\_abs}}<0.0$ then $0.0$ is used unless ${\mathbf{rtol\_rel}}\le 0.0$ in which case $\sqrt{\mathit{machineprecision}}$ is used.

9:
$\mathbf{rtol\_rel}$ – Real (Kind=nag_wp)
Input

On entry: the relative tolerance, ${\epsilon}_{r}$ used in defining the threshold on estimating the rank of $A$. If ${\mathbf{rtol\_rel}}<0.0$ then $0.0$ is used unless ${\mathbf{rtol\_abs}}\le 0.0$ in which case $\sqrt{\mathit{machineprecision}}$ is used.

10:
$\mathbf{state}\left(*\right)$ – Integer array
Communication Array
Note: the actual argument supplied
must be the array
state supplied to the initialization routines
g05kff or
g05kgf.
On entry: contains information on the selected base generator and its current state.
On exit: contains updated information on the state of the generator.

11:
$\mathbf{s}\left({\mathbf{k}}\right)$ – Real (Kind=nag_wp) array
Output

On exit: the first
r elements of
s contain the
r largest singular values of
$A$ in descending order. The remaining values are set to zero.

12:
$\mathbf{u}\left({\mathbf{ldu}},*\right)$ – Real (Kind=nag_wp) array
Output

Note: the second dimension of the array
u
must be at least
${\mathbf{k}}$ if
${\mathbf{jobu}}=\text{'S'}$.
On exit: if
${\mathbf{jobu}}=\text{'S'}$,
u contains the first
r columns of
$U$ (the left singular vectors, stored columnwise); the remaining elements of
u are set to zero.
If
${\mathbf{jobu}}=\text{'N'}$,
u is not referenced.

13:
$\mathbf{ldu}$ – Integer
Input

On entry: the first dimension of the array
u as declared in the (sub)program from which
f10caf is called.
Constraint:
if ${\mathbf{jobu}}=\text{'S'}$, ${\mathbf{ldu}}\ge {\mathbf{m}}$.

14:
$\mathbf{vt}\left({\mathbf{ldvt}},*\right)$ – Real (Kind=nag_wp) array
Output

Note: the second dimension of the array
vt
must be at least
${\mathbf{n}}$ if
${\mathbf{jobvt}}=\text{'S'}$, and at least
$1$ otherwise.
On exit: if
${\mathbf{jobvt}}=\text{'S'}$,
vt contains the first
r rows of
${V}^{\mathrm{T}}$ (the right singular vectors); the remaining elements of
vt are set to zero.
If
${\mathbf{jobvt}}=\text{'N'}$,
vt is not referenced.

15:
$\mathbf{ldvt}$ – Integer
Input

On entry: the first dimension of the array
vt as declared in the (sub)program from which
f10caf is called.
Constraint:
if ${\mathbf{jobvt}}=\text{'S'}$, ${\mathbf{ldvt}}\ge {\mathbf{k}}$.

16:
$\mathbf{r}$ – Integer
Output

On exit:
$r$, contains estimated rank of array
a.

17:
$\mathbf{ifail}$ – Integer
Input/Output

On entry:
ifail must be set to
$0$,
$1\text{or}1$. If you are unfamiliar with this argument you should refer to
Section 4 in the Introduction to the NAG Library FL Interface for details.
For environments where it might be inappropriate to halt program execution when an error is detected, the value
$1\text{or}1$ is recommended. If the output of error messages is undesirable, then the value
$1$ is recommended. Otherwise, if you are not familiar with this argument, the recommended value is
$0$.
When the value $\mathbf{1}\text{or}\mathbf{1}$ is used it is essential to test the value of ifail on exit.
On exit:
${\mathbf{ifail}}={\mathbf{0}}$ unless the routine detects an error or a warning has been flagged (see
Section 6).
6
Error Indicators and Warnings
If on entry
${\mathbf{ifail}}=0$ or
$1$, explanatory error messages are output on the current error message unit (as defined by
x04aaf).
Errors or warnings detected by the routine:
 ${\mathbf{ifail}}=1$

On entry, ${\mathbf{jobu}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{jobu}}=\text{'S'}$ or $\text{'N'}$.
 ${\mathbf{ifail}}=2$

On entry, ${\mathbf{jobvt}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{jobvt}}=\text{'S'}$ or $\text{'N'}$.
 ${\mathbf{ifail}}=3$

On entry, ${\mathbf{m}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{m}}>0$.
 ${\mathbf{ifail}}=4$

On entry, ${\mathbf{n}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{n}}>0$.
 ${\mathbf{ifail}}=6$

On entry, ${\mathbf{lda}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: ${\mathbf{lda}}\ge {\mathbf{m}}$.
 ${\mathbf{ifail}}=7$

On entry, ${\mathbf{k}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: $0<{\mathbf{k}}\le \mathrm{min}\phantom{\rule{0.125em}{0ex}}\left({\mathbf{m}},{\mathbf{n}}\right)$.
 ${\mathbf{ifail}}=8$

On entry,
state vector has been corrupted or not initialized.
 ${\mathbf{ifail}}=9$

On entry, ${\mathbf{jobu}}=\u2329\mathit{\text{value}}\u232a$ and ${\mathbf{ldu}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: if ${\mathbf{jobu}}=\text{'S'}$, ${\mathbf{ldu}}\ge {\mathbf{m}}$.
 ${\mathbf{ifail}}=10$

On entry, ${\mathbf{jobvt}}=\u2329\mathit{\text{value}}\u232a$ and ${\mathbf{ldvt}}=\u2329\mathit{\text{value}}\u232a$.
Constraint: if ${\mathbf{jobvt}}=\text{'S'}$, ${\mathbf{ldvt}}\ge {\mathbf{k}}$.
 ${\mathbf{ifail}}=21$

On exit,
${\mathbf{r}}={\mathbf{k}}$, the rank of
$A$ may be larger than
r.
Increase
k to obtain a more accurate rank estimate.
Smallest diagonal element of
$R$, from
$QR$ of
$Y$,
$=$$\u2329\mathit{\text{value}}\u232a$.
Tolerance used to determine rank
$=$$\u2329\mathit{\text{value}}\u232a$.
 ${\mathbf{ifail}}=22$

$A$ has effective rank of zero.
First diagonal element of $R$, from $QR$ of $Y$, $=$$\u2329\mathit{\text{value}}\u232a$.
Tolerance used to determine rank $=$$\u2329\mathit{\text{value}}\u232a$.
 ${\mathbf{ifail}}=99$
An unexpected error has been triggered by this routine. Please
contact
NAG.
See
Section 7 in the Introduction to the NAG Library FL Interface for further information.
 ${\mathbf{ifail}}=399$
Your licence key may have expired or may not have been installed correctly.
See
Section 8 in the Introduction to the NAG Library FL Interface for further information.
 ${\mathbf{ifail}}=999$
Dynamic memory allocation failed.
See
Section 9 in the Introduction to the NAG Library FL Interface for further information.
7
Accuracy
The error is approximately,
where,
The norm on the lefthand side of the first equation is the spectral norm, and
${\sigma}_{k+1}$ is the
$\left(k+1\right)$th singular value of
$A$. More details on the error bound can be found in Sections 5 and 11 of
Halko et al. (2011).
8
Parallelism and Performance
f10caf is threaded by NAG for parallel execution in multithreaded implementations of the NAG Library.
f10caf makes calls to BLAS and/or LAPACK routines, which may be threaded within the vendor library used by this implementation. Consult the documentation for the vendor library for further information.
Please consult the
X06 Chapter Introduction for information on how to control and interrogate the OpenMP environment used within this routine. Please also consult the
Users' Note for your implementation for any additional implementationspecific information.
The total number of floatingpoint operations is $\mathit{O}\left(\mathrm{log}\left(k\right)mn+{k}^{2}\left(m+n\right)\right)$. The first term corresponds to applying the random projection, i.e., computing $Y=A\Omega $. The second term corresponds to the $QR$ decomposition of $Y$ and the steps required to obtain the SVD of the original matrix $A$.
Deterministic SVD solvers, such as
f08kbf, require
$\mathit{O}\left(m{n}^{2}\right)$ operations when
$m\ge n$ and
$\mathit{O}\left(n{m}^{2}\right)$ operations when
$n\ge m$.
The default values for
rtol_abs and
rtol_rel assume that you need an accurate approximation to
$A$. If you only need to use a small number of singular values or singular vectors, larger values for these tolerances are appropriate. Increasing tolerances sufficiently will decrease
r, the estimated rank. Decreasing
r means that
k can then be decreased to reduce the runtime of the routine.
10
Example
This example finds the singular values, the left and right singular vectors, and the rank of the
$6$ by
$6$ matrix
using the randomised solver,
f10caf, and a deterministic solver,
f08kbf for comparison.
10.1
Program Text
10.2
Program Data
10.3
Program Results