Subsection 11.2.4 Implicitly shifted bidiagonal QR algorithm
ΒΆConverting a (tridiagonal) implicitly shifted QR algorithm into a (bidiagonal) implicitly shifted QR algorithm now hinges on some key insights, which we will illustrate with a \(4 \times 4 \) example.
-
We start with a bidiagonal matrix \(B^{(k)}\)
\begin{equation*} B^{(k)} = \left( \begin{array}{c c c c} \beta_{0,0} \amp \beta_{0,1} \amp 0 \amp 0 \\ 0 \amp \beta_{1,1} \amp \beta_{1,2} \amp 0 \\ 0 \amp 0 \amp \beta_{2,2} \amp \beta_{2,3} \\ 0 \amp 0 \amp 0 \amp \beta_{3,3} \\ \end{array} \right) , \end{equation*}which is our "current iteration."
-
If we explicitly form \(T^{(k)} = B^{(k)\,T} B^{(k)} \text{,}\) then we would have to form
\begin{equation*} \begin{array}{rcl} T^{(k)} \amp=\amp \left( \begin{array}{c c c c} \tau_{0,0} \amp \tau_{1,0} \amp 0 \amp 0 \\ \tau_{1,0} \amp \tau_{1,1} \amp \tau_{2,1} \amp 0 \\ 0 \amp \tau_{2,1} \amp \tau_{2,2} \amp \tau_{3,2} \\ 0 \amp 0 \amp \tau_{3,2} \amp \tau_{3,3} \end{array} \right) \\ \amp=\amp \left( \begin{array}{c c c c } \beta_{0,0}^2 \amp \beta_{0,1} \beta_{0,0} \amp 0 \amp 0 \\ \beta_{0,1} \beta_{0,0}\amp \beta_{0,1}^2 + \beta_{1,1}^2 \amp \beta_{1,2} \beta_{1,1} \amp 0 \\ 0 \amp \beta_{1,2} \beta_{1,1}\amp \beta_{1,2}^2 + \beta_{2,2}^2 \amp \beta_{2,3} \beta_{2,2} \\ 0 \amp 0 \amp \beta_{2,3} \beta_{2,2}\amp \beta_{2,3}^2 + \beta_{3,3}^2 \end{array} \right) \end{array} \end{equation*} -
The Francis Implicit QR Step would then compute a first Givens' rotation so that
\begin{equation} \begin{array}[t]{c} \underbrace{ \left( \begin{array}{c c} \gamma_0 \amp -\sigma_0 \\ \sigma_0 \amp \gamma_0 \end{array} \right)^T } \\ G_0^T \end{array} \left( \begin{array}{c} \tau_{0,0} - \tau_{3,3} \\ \tau_{1,0} \end{array}\right) = \left( \begin{array}{c} \times \\ 0 \end{array}\right).\label{chapter11-G0}\tag{11.2.1} \end{equation} -
With this Givens' rotation, it would introduce a bulge
\begin{equation*} \begin{array}{l} \left( \begin{array}{| c c | c c} \times \amp \times \amp {\color{red} \times} \amp 0 \\ \hline \times \amp \times \amp \times \amp 0 \\ {\color{red} \times} \amp \times \amp \times \amp \times \\ \hline 0 \amp 0 \amp \times \amp \times \end{array} \right) \\ = \left( \begin{array}{c | c c} G_0^T \amp 0 \amp 0\\ \hline 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array} \right) \left( \begin{array}{c c | c c} \tau_{0,0} \amp \tau_{1,0} \amp 0 \amp 0 \\ \tau_{1,0} \amp \tau_{1,1} \amp \tau_{2,1} \amp 0 \\ \hline 0 \amp \tau_{2,1} \amp \tau_{2,2} \amp \tau_{3,2} \\ 0 \amp 0 \amp \tau_{3,2} \amp \tau_{3,3} \end{array} \right) \left( \begin{array}{c | c c} G_0 \amp 0 \amp 0\\ \hline 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array} \right) \end{array} \end{equation*} -
Finally, the bulge would be chased out
\begin{equation*} \begin{array}{l} T^{(k+1)} = \left( \begin{array}{c c c c} \times \amp \times \amp 0 \amp 0 \\ \times \amp \times \amp \times \amp 0 \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \end{array} \right) \\ = \left( \begin{array}{c c | c } 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ \hline 0 \amp 0\amp G_2^T \end{array} \right) \\ ~~~~~~ \begin{array}[t]{c} \underbrace{ \left[ \left( \begin{array}{c | c | c} 1 \amp 0 \amp 0 \\ \hline 0 \amp G_1^T \amp 0 \\ \hline 0 \amp 0 \amp 1 \end{array} \right) \left( \begin{array}{c | c c | c} \times \amp \times \amp {\color{red} \times} \amp 0 \\ \hline \times \amp \times \amp \times \amp 0 \\ {\color{red} \times} \amp \times \amp \times \amp \times \\ \hline 0 \amp 0 \amp \times \amp \times \end{array} \right) \left( \begin{array}{c | c | c} 1 \amp 0 \amp 0 \\ \hline 0 \amp G_1 \amp 0 \\ \hline 0 \amp 0 \amp 1 \end{array} \right) \right] } \\ \left( \begin{array}{c c | c c |} \times \amp \times \amp 0 \amp 0 \\ \times \amp \times \amp \times \amp {\color{red} \times} \\ \hline 0 \amp \times \amp \times \amp \times \\ 0 \amp {\color{red} \times} \amp \times \amp \times \\ \hline \end{array} \right) \end{array} \left( \begin{array}{c c | c } 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp 0 \\ \hline 0 \amp 0\amp G_2 \end{array} \right) . \end{array} \end{equation*}
Obviously, this extends.
Let us examine what would happen if we instead apply these Givens' rotations to \(B^{(k)\,T} B^{(k)} \text{.}\) Since \(T^{(k)} = B^{(k)\,T} B^{(k)} \text{,}\) we find that
The observation now is that if we can find two sequences of Givens' rotations such that
then, by the Implicit Q Theorem,
If we iterate in this way, we know that \(T^{(k)} \) converge to a diagonal matrix (under mild conditions). This means that the matrices \(B^{(k)} \) converge to a diagonal matrix, \(\Sigma_B \text{.}\) If we accumulate all Givens' rotations into matrices \(U_B \) and \(V_B \text{,}\) then we end up with the SVD of \(B \text{:}\)
modulo, most likely, a reordering of the diagonal elements of \(\Sigma_B \) and a corresponding reordering of the columns of \(U_B \) and \(V_B \text{.}\)
This leaves us with the question of how to find the two sequences of Givens' rotations mentioned in (11.2.2).
We know \(G_0 \text{,}\) which was computed from (11.2.1). Importantly, computing this first Givens' rotation requires only that the elements \(\tau_{0,0} \text{,}\) \(\tau_{1,0} \text{,}\) and \(\tau_{m-1,m-1} \) of \(T^{(k)} \) to be explicitly formed.
-
If we apply it to \(B^{(k)} \text{,}\) we introduce a bulge:
\begin{equation*} \left( \begin{array}{| c c | c c} \times \amp \times \amp 0 \amp 0 \\ {\color{red} \times} \amp \times \amp \times \amp 0 \\ 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp 0 \amp \times \end{array} \right) = \left( \begin{array}{| c c | c c} \beta_{0,0} \amp \beta_{0,1} \amp 0 \amp 0 \\ 0 \amp \beta_{1,1} \amp \beta_{1,2} \amp 0 \\ 0 \amp 0 \amp \beta_{2,2} \amp \beta_{2,3} \\ 0 \amp 0 \amp \amp \beta_{3,3} \\ \end{array} \right) \left( \begin{array}{| c | c c} G_0 \amp 0 \amp 0\\ \hline 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array} \right). \end{equation*} -
We next compute a Givens' rotation, \(\widehat G_0 \text{,}\) that changes the nonzero that was introduced below the diagonal back into a zero.
\begin{equation*} \left( \begin{array}{c c c c} \hline \times \amp \times \amp {\color{red} \times} \amp 0 \\ 0 \amp \times \amp \times \amp 0 \\ \hline 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp 0 \amp \times \end{array} \right) = \left( \begin{array}{c | c c} \widehat G_0^T \amp 0 \amp 0\\ \hline 0 \amp 1 \amp 0 \\ 0 \amp 0 \amp 1 \end{array} \right) \left( \begin{array}{c c c c} \hline \times \amp \times \amp 0 \amp 0 \\ {\color{red} \times} \amp \times \amp \times \amp 0 \\ \hline 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp 0 \amp \times \end{array} \right) . \end{equation*} -
This means we now need to chase the bulge that has appeared above the superdiagonal:
\begin{equation*} \left( \begin{array}{c | c c | c} \times \amp \times \amp 0 \amp 0 \\ 0 \amp \times \amp \times \amp 0 \\ 0 \amp {\color{red} \times} \amp \times \amp \times \\ 0 \amp 0 \amp 0 \amp \times \end{array} \right) = \left( \begin{array}{c | c c | c} \times \amp \times \amp {\color{red} \times} \amp 0 \\ 0 \amp \times \amp \times \amp 0 \\ 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp 0 \amp \times \end{array} \right) \left( \begin{array}{c | c | c} 1 \amp 0 \amp 0 \\ \hline 0 \amp \widetilde G_1 \amp 0 \\ \hline 0 \amp 0 \amp 1 \end{array} \right) \end{equation*} We continue like this until the bulge is chased out the end of the matrix.
The net result is an implicitly shifted bidiagonal QR algorithm that is applied directly to the bidiagonal matrix, maintains the bidiagonal form from one iteration to the next, and converges to a diagonal matrix that has the singular values of \(B\) on its diagonal. Obviously, deflation can be added to this scheme to further reduce its cost.