Subsection 11.3.3 Jacobi's method for computing the Singular Value Decomposition
ΒΆJust like the QR algorithm for computing the Spectral Decomposition was modified to compute the SVD, so can the Jacobi Method for computing the Spectral Decomposition.
The insight is very simple. Let \(A \in \mathbb R^{m \times n} \) and partition it by columns:
One could form \(B = A^T A \) and then compute Jacobi rotations to diagonalize it:
We recall that if we order the columns of \(Q \) and diagonal elements of \(D \) appropriately, then choosing \(V = Q \) and \(\Sigma = D^{1/2} \) yields
or, equivalently,
This means that if we apply the Jacobi rotations \(J_{2,1}, J_{3,1}, \ldots \) from the right to \(A \text{,}\)
then, once \(B \) has become (approximately) diagonal, the columns of \(\widehat A = ( ( A J_{2,1} ) J_{3,1} ) \cdots \) are mutually orthogonal. By scaling them to have length one, setting \(\Sigma = \diag{ \| \widehat a_0 \|_2, \| \widehat a_1 \|_2, \ldots , \| \widehat a_{n-1} \|_2} \text{,}\) we find that
The only problem is that in forming \(B \text{,}\) we may introduce unnecessary error since it squares the condition number.
Here is a more practical algorithm. We notice that
We observe that we don't need to form all of \(B \text{.}\) When it is time to compute \(J_{i,j} \text{,}\) we need only compute
from which \(J_{i,j} \) can be computed. By instead applying this Jacobi rotation to \(B \text{,}\) we observe that
and hence the Jacobi rotation can instead be used to take linear combinations of the \(i \)th and \(j\)th columns of \(A \text{:}\)
We have thus outlined an algorithm:
Starting with matrix \(A \text{,}\) compute a sequence of Jacobi rotations (e.g., corresponding to a column-cyclic Jacobi method) until the off-diagonal elements of \(A^T A \) (parts of which are formed as Jacobi rotations are computed) become small. Every time a Jacobi rotation is computed, it updates the appropriate columns of \(A \text{.}\)
-
Accumulate the Jacobi rotations into matrix \(V \text{,}\) by applying them from the right to an identity matrix:
\begin{equation*} V = (( I \times J_{2,1}) J_{3,1} ) \cdots \end{equation*} -
Upon completion,
\begin{equation*} \Sigma = \diag{ \| a_0 \|_2, \| a_1 \|_2, \ldots , \| a_{n-1} \|_2} \end{equation*}and
\begin{equation*} U = A \Sigma^{-1}, \end{equation*}meaning that each column of the updated \(A \) is divided by its length.
If necessary, reorder the columns of \(U \) and \(V \) and the diagonal elements of \(\Sigma \text{.}\)
Obviously, there are variations on this theme. Such methods are known as one-sided Jacobi methods.