Subsection 3.5.2 Summary
ΒΆClassical Gram-Schmidt orthogonalization: Given a set of linearly independent vectors \(\{ a_0, \ldots,
a_{n-1} \} \subset \Cm \text{,}\) the Gram-Schmidt process computes an orthonormal basis \(\{ q_0, \ldots, q_{n-1} \} \) that spans the same subspace as the original vectors, i.e.
\begin{equation*}
\Span ( \{ a_0, \ldots, a_{n-1} \} ) = \Span ( \{ q_0, \ldots,
q_{n-1} \} ) .
\end{equation*}
The process proceeds as follows:
-
Compute vector \(q_0 \) of unit length so that \(\Span( \{
a_0 \} ) = \Span( \{ q_0 \} ) \text{:}\)
-
\(\rho_{0,0} = \| a_0 \|_2 \)
Computes the length of vector \(a_0 \text{.}\)
-
\(q_0 = a_0 / \rho_{0,0} \)
Sets \(q_0 \) to a unit vector in the direction of \(a_0 \text{.}\)
Notice that \(a_0 = q_0 \rho_{0,0} \)
-
Compute vector \(q_1 \) of unit length so that \(\Span( \{
a_0, a_1 \} ) = \Span( \{ q_0, q_1 \} ) \text{:}\)
-
\(\rho_{0,1} = q_0^H a_1 \)
Computes \(\rho_{0,1} \) so that \(\rho_{0,1}
q_0 = q_0^H a_1 q_0 \) equals the component of \(a_1 \) in the direction of \(q_0 \text{.}\)
-
\(a_1^\perp = a_1 - \rho_{0,1} q_0 \)
Computes the component of \(a_1 \) that is orthogonal to \(q_0 \text{.}\)
-
\(\rho_{1,1} = \| a_1^\perp \|_2 \)
Computes the length of vector \(a_1^\perp \text{.}\)
-
\(q_1 = a_1^\perp / \rho_{1,1} \)
Sets \(q_1 \) to a unit vector in the direction of \(a_1^\perp \text{.}\)
Notice that
\begin{equation*}
\left( \begin{array}{c | c}
a_0 \amp a_1
\end{array} \right)
=
\left( \begin{array}{c | c}
q_0 \amp q_1
\end{array} \right)
\left( \begin{array}{c | c}
\rho_{0,0} \amp \rho_{0,1} \\ \hline
0 \amp \rho_{1,1}
\end{array} \right) .
\end{equation*}
-
Compute vector \(q_2 \) of unit length so that \(\Span( \{
a_0, a_1, a_2 \} ) = \Span( \{ q_0, q_1, q_2 \} ) \text{:}\)
-
\(\begin{array}{l}
\rho_{0,2} = q_0^H a_2 \\
\rho_{1,2} = q_1^H a_2
\end{array}
\mbox{ or, equivalently, }
\left( \begin{array}{c}
\rho_{0,2} \\
\rho_{1,2}
\end{array} \right) =
\left( \begin{array}{c c}
q_0 \amp q_1
\end{array} \right)^H a_2\)
Computes \(\rho_{0,2} \) so that \(\rho_{0,2}
q_0 = q_0^H a_2 q_0 \) and \(\rho_{1,2}
q_1 = q_1^H a_2 q_1 \) equal the components of \(a_2 \) in the directions of \(q_0 \) and \(q_1 \text{.}\)
Or, equivalently, \(\left( \begin{array}{c c}
q_0 \amp q_1
\end{array} \right) \left( \begin{array}{c}
\rho_{0,2} \\
\rho_{1,2}
\end{array} \right) \) is the component in \(\Span( \{ q_0, q_1 \} ) \text{.}\)
-
\(a_2^\perp = a_2 - \rho_{0,2} q_0 - \rho_{1,2}
q_1 =
a_2 - \left( \begin{array}{c c}
q_0 \amp q_1
\end{array} \right) \left( \begin{array}{c}
\rho_{0,2} \\
\rho_{1,2}
\end{array} \right) \)
Computes the component of \(a_2 \) that is orthogonal to \(q_0 \) and \(q_1 \text{.}\)
-
\(\rho_{2,2} = \| a_2^\perp \|_2 \)
Computes the length of vector \(a_2^\perp \text{.}\)
-
\(q_2 = a_2^\perp / \rho_{2,2} \)
Sets \(q_2 \) to a unit vector in the direction of \(a_2^\perp \text{.}\)
Notice that
\begin{equation*}
\left( \begin{array}{c c | c}
a_0 \amp a_1 \amp a_2
\end{array} \right)
=
\left( \begin{array}{c c | c}
q_0 \amp q_1 \amp q_2
\end{array} \right)
\left( \begin{array}{c c | c}
\rho_{0,0} \amp \rho_{0,1} \amp \rho_{0,2} \\
0 \amp \rho_{1,1} \amp \rho_{1,2} \\ \hline
0 \amp 0 \amp \rho_{2,2}
\end{array} \right) .
\end{equation*}
And so forth.
Theorem 3.5.2.1. QR Decomposition Theorem.
Let \(A \in \Cmxn \) have linearly independent columns. Then there exists an orthonormal matrix \(Q \) and upper triangular matrix \(R \) such that \(A = Q R \text{,}\) its QR decomposition. If the diagonal elements of \(R \) are taken to be real and positive, then the decomposition is unique.
Projection a vector \(y \) onto the orthonormal columns of \(Q \in \Cmxn \text{:}\)
\begin{equation*}
\begin{array}{|l | l|} \hline
[ y^\perp, r ] = {\rm Proj\!\!\perp\!\! Q}_{\rm CGS}( Q,
y ) ~~~~~~~~~~~~
\amp
[ y^\perp, r ] = {\rm Proj\!\!\perp\!\! Q}_{\rm MGS}( Q,
y ) ~~~~~~~~~~~~ \\
{\rm (used~by~CGS)}
\amp
{\rm (used~by~MGS)} \\ \hline
y^\perp = y
\amp
y^\perp = y \\
{\bf for~} i=0, \ldots, k-1
\amp
{\bf for~} i=0, \ldots, k-1 \\
~~~\rho_i := q_i^H y
\amp
~~~ \rho_i := q_i^H y^\perp \\
~~~ y^\perp := y^\perp - \rho_i q_i
\amp
~~~ y^\perp := y^\perp - \rho_i q_i \\
{\bf endfor} \amp {\bf endfor} \\ \hline
\end{array}
\end{equation*}
Gram-Schmidt orthogonalization algorithms:
\begin{equation*}
\newcommand{\operation}{ \left[Q, R \right] := \mbox{Gram-Schmidt}( A ) }
\newcommand{\routinename}{ \left[ A, R \right] :=
\mbox{GS}( A ) \quad \mbox{ (overwrites $ A $ with $ Q $)}}
% Step 3: Loop-guard
\newcommand{\guard}{
n( A_L ) \lt n( A )
}
% Step 4: Initialize
\newcommand{\partitionings}{
A \rightarrow
\FlaOneByTwo{A_L}{A_R}
,
R \rightarrow
\FlaTwoByTwo{R_{TL}}{R_{TR}}
{0}{R_{BR}}
}
\newcommand{\partitionsizes}{
A_L {\rm ~has~} 0 {\rm ~columns~and~}
R_{TL} {\rm ~is~} 0 \times 0
}
% Step 5a: Repartition the operands
\newcommand{\repartitionings}{
\FlaOneByTwo{A_L}{A_R}
\rightarrow \FlaOneByThreeR{A_0}{a_1}{A_2}
,
\FlaTwoByTwo{R_{TL}}{R_{TR}}
{0}{R_{BR}}
\rightarrow
\FlaThreeByThreeBR{R_{00}}{r_{01}}{R_{02}}
{0}{\rho_{11}}{r_{12}^T}
{0}{0}{R_{22}}
}
% \newcommand{\repartitionsizes}{
% a_1 {\rm ~and~} q_1 {\rm ~are~columns,~}
% \rho_{11} {\rm is~a~scalar}
% }
% Step 5b: Move the double lines
\newcommand{\moveboundaries}{
\FlaOneByTwo{A_L}{A_R}
\leftarrow \FlaOneByThreeL{A_0}{a_1}{A_2}
,
\FlaTwoByTwo{R_{TL}}{R_{TR}}
{0}{R_{BR}}
\leftarrow
\FlaThreeByThreeTL{R_{00}}{r_{01}}{R_{02}}
{0}{\rho_{11}}{r_{12}^T}
{0}{0}{R_{22}}
}
% Step 8: Insert the updates required to change the
% state from that given in Step 6 to that given in Step 7
% Note: The below needs editing!!!
\newcommand{\update}{
\begin{array}{l | l | l}
\hline
{\rm CGS} \amp {\rm MGS}
\amp {\rm MGS~(alternative)} \\
r_{01} := A_0^H a_1 \amp \amp \\
a_1 := a_1 - A_0 r_{01} \amp
[ a_1, r_{01} ] = {\rm Proj\!\!\perp\!\! toQ}_{\rm MGS}( A_0, a_1 ) \amp \\
\rho_{11} := \| a_1 \|_2 \amp \rho_{11} := \| a_1 \|_2 \amp
\rho_{11} := \| a_1 \|_2 \\
a_1 := a_1 / \rho_{11} \amp q_1 := a_1 / \rho_{11}
\amp a_1 := a_1 / \rho_{11} \\
\amp \amp r_{12}^T := a_1^H A_2 \\
\amp \amp A_2 := A_2 - a_1 r_{12}^T \\ \hline
\end{array}
}
\FlaAlgorithm % this command generates the worksheet using the
% commands that were defined between the \resetsteps
% and the worksheet command
\end{equation*}
Classic example that shows that the columns of \(Q \text{,}\) computed by MGS, are "more orthogonal" than those computed by CGS:
\begin{equation*}
A = \left(
\begin{array}{c | c | c } 1 \amp 1 \amp 1 \\ \epsilon \amp 0
\amp 0 \\ 0 \amp \epsilon \amp 0 \\ 0 \amp 0 \amp \epsilon
\end{array} \right) = \left( \begin{array}{c | c | c } a_0
\amp a_1 \amp a_2 \end{array} \right).
\end{equation*}
Cost of Gram-Schmidt algorithms: approximately \(2 m n^2\) flops.
Definition 3.5.2.2.
Let \(u \in \Cn \) be a vector of unit length (\(\| u \|_2 = 1 \)). Then \(H = I - 2 u u^H \) is said to be a Householder transformation or (Householder) reflector.
If \(H \) is a Householder transformation (reflector), then
Computing a Householder transformation \(I - 2 u u^H \text{:}\)
Practical computation of \(u \) and \(\tau \) so that \(I - u u^H / tau \) is a Householder transformation (reflector):
\begin{equation*}
\begin{array}{|l|} \hline
{\rm Algorithm:~}
\left[
\FlaTwoByOneSingleLine
{\rho}
{u_2} , \tau \right]
=
{\rm Housev}
\left( \FlaTwoByOneSingleLine
{\chi_1}
{x_2} \right) \\ \hline
\begin{array}{l | l}
\amp
\chi_2 := \| x_2 \|_2 \\
\amp
\alpha := \left\| \left( \begin{array}{c}
\chi_1 \\
\chi_2
\end{array}
\right)
\right\|_2 (= \| x \|_2) \\
\rho = - \sign( \chi_1 ) \| x \|_2
\amp
\rho := - \sign( \chi_1 ) \alpha \\
\nu_1 = \chi_1 + \sign( \chi_1 ) \| x \|_2
\amp
\nu_1 := \chi_1 - \rho \\
u_2 = x_2 / \nu_1
\amp
u_2 := x_2 / \nu_1 \\
\amp
\chi_2 = \chi_2 / \vert \nu_1 \vert ( = \| u_2 \|_2 ) \\
\tau = ( 1 + u_2^H u_2 ) / 2
\amp
\tau = ( 1 + \chi_2^2 ) / 2
\end{array}
\\ \hline
\end{array}
\end{equation*}
Householder QR factorization algorithm:
\begin{equation*}
\newcommand{\routinename}{[ A, t ] = {\rm HQR\_unb\_var1}( A )}
\newcommand{\guard}{n( A_{BR} ) > 0}
\newcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} }
{\rm ~and~}
t \rightarrow
\FlaTwoByOne{ t_{T} }
{ t_{B} }
}
\newcommand{\partitionsizes}{
A_{TL} {\rm ~is~} 0 \times 0 {\rm ~and~} t_T {\rm ~has~} 0 {\rm
~elements}
}
\newcommand{\repartitionings}{
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} } \rightarrow
\FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} }
{ a_{10}^T }{ \alpha_{11} }{ a_{12}^T }
{ A_{20} }{ a_{21} }{ A_{22} }
{\rm ~and~}
\FlaTwoByOne{ t_{T} }
{ t_{B} } \rightarrow
\FlaThreeByOneB{ t_{0} }
{ \tau_{1} }
{ t_{2} }
}
\newcommand{\moveboundaries}{
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} } \leftarrow
\FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} }
{ a_{10}^T }{ \alpha_{11} }{ a_{12}^T }
{ A_{20} }{ a_{21} }{ A_{22} }
{\rm ~and~}
\FlaTwoByOne{ t_{T} }
{ t_{B} } \leftarrow
\FlaThreeByOneT{ t_{0} }
{ \tau_{1} }
{ t_{2} }
}
\newcommand{\update}{
\begin{array}{l}
\left[
\FlaTwoByOneSingleLine
{ \alpha_{11} }
{ a_{21}}, \tau_1
\right]
:=
\left[
\FlaTwoByOneSingleLine
{ \rho_{11} }
{ u_{21}}, \tau_1
\right]
=
{\rm Housev}
\left( \begin{array}{c}
\alpha_{11} \\
a_{21}
\end{array}
\right) \\
{\rm Update~}
\left(
\begin{array}{c}
a_{12}^T \\
A_{22}
\end{array}
\right) :=
\left(
I -
\frac{1}{\tau_1}
\left(
\begin{array}{c}
1 \\
u_{21}
\end{array} \right)
\left(
\begin{array}{c c }
1 \amp u_{21}^H
\end{array}
\right)
\right)
\left(
\begin{array}{c}
a_{12}^T \\
A_{22}
\end{array}
\right) \\
~~~~~{\rm ~via~the~steps~} \\
~~~~~~~~~~~~~~~
w_{12}^T := ( a_{12}^T + a_{21}^H A_{22} )/\tau_1 \\
~~~~~~~~~~~~~~~\FlaTwoByOneSingleLine
{ a_{12}^T }
{ A_{22} }
:=
\FlaTwoByOneSingleLine
{ a_{12}^T - { w_{12}^T } }
{ A_{22} - { a_{21} } { w_{12}^T } }
\end{array}
}
\FlaAlgorithm
\end{equation*}
Cost: approximately \(2 m n^2 - \frac{2}{3} n^3 \) flops.
Algorithm for forming \(Q \) given output of Householder QR factorization algorithm:
\begin{equation*}
\newcommand{\routinename}{[ A ] = {\rm FormQ}( A, t )}
\newcommand{\guard}{n( A_{TL} ) > 0}
\newcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{ A_{TL} } { A_{TR}}
{ A_{BL} } { A_{BR} },
t \rightarrow
\FlaTwoByOne{ t_{T} }
{ t_{B} }
}
\newcommand{\partitionsizes}{
A_{TL} {\rm ~is~} n(A) \times n(A) {\rm ~and~} t_T {\rm ~has~} n(A)
{\rm ~elements}
}
\newcommand{\repartitionings}{
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} } \rightarrow
\FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} }
{ a_{10}^T }{ \alpha_{11} }{ a_{12}^T }
{ A_{20} }{ a_{21} }{ A_{22} },
\FlaTwoByOne{ t_{T} }
{ t_{B} } \rightarrow
\FlaThreeByOneT{ t_{0} }
{ \tau_{1} }
{ t_{2} }
}
\newcommand{\moveboundaries}{
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} } \leftarrow
\FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} }
{ a_{10}^T }{ \alpha_{11} }{ a_{12}^T }
{ A_{20} }{ a_{21} }{ A_{22} },
\FlaTwoByOne{ t_{T} }
{ t_{B} } \leftarrow
\FlaThreeByOneB{ t_{0} }
{ \tau_{1} }
{ t_{2} }
}
\newcommand{\update}{
\begin{array}{l}
{\rm Update~}
\left(
\begin{array}{c | c}
\alpha_{11} \amp a_{12}^T \\ \hline
a_{21} \amp A_{22}
\end{array}
\right) := \\
~~~~~~~~~~~~~~~ \left(
I -
\frac{1}{\tau_1}
\left(
\begin{array}{c}
1 \\ \hline
u_{21}
\end{array} \right)
\left(
\begin{array}{c | c }
1 \amp u_{21}^H
\end{array}
\right)
\right)
\left(
\begin{array}{c | c}
1 \amp 0 \\ \hline
0 \amp A_{22}
\end{array}
\right) \\
~~~~{\rm via~the~steps} \\
~~~~~~~~~~{ \alpha_{11} }
:=
{ 1 - 1/\tau_1 } \\
~~~~~~~~~~a_{12}^T := - ( a_{21}^H A_{22} )/\tau_1 \\
~~~~~~~~~~{ A_{22} }
:=
{ A_{22} + { a_{21} } { a_{12}^T } }\\
~~~~~~~~~~{ a_{21} }
:=
{ - a_{21}/\tau_1 }
\end{array}
}
\FlaAlgorithm
\end{equation*}
Cost: approximately \(2 m n^2 - \frac{2}{3} n^3 \) flops.
Algorithm for applying \(Q^H \) given output of Householder QR factorization algorithm:
\begin{equation*}
\newcommand{\routinename}{[ y ] = {\rm Apply\_QH}( A, t, y )}
\newcommand{\guard}{n( A_{BR} ) \lt 0}
\newcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} }
,
t \rightarrow
\FlaTwoByOne{ t_{T} }
{ t_{B} }
,
y \rightarrow
\FlaTwoByOne{ y_{T} }
{ y_{B} }
}
\newcommand{\partitionsizes}{
A_{TL} {\rm ~is~} 0 \times 0 {\rm ~and~} t_T, y_T {\rm ~have~} 0
{\rm ~elements}}
\newcommand{\repartitionings}{
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} } \rightarrow
\FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} }
{ a_{10}^T }{ \alpha_{11} }{ a_{12}^T }
{ A_{20} }{ a_{21} }{ A_{22} }
,\\
\FlaTwoByOne{ t_{T} }
{ t_{B} } \rightarrow
\FlaThreeByOneB{ t_{0} }
{ \tau_{1} }
{ t_{2} }
,
\FlaTwoByOne{ y_{T} }
{ y_{B} } \rightarrow
\FlaThreeByOneB{ y_{0} }
{ \psi_{1} }
{ y_{2} }
}
\newcommand{\moveboundaries}{
\FlaTwoByTwo{ A_{TL} } {A_{TR}}
{ A_{BL} } { A_{BR} } \leftarrow
\FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} }
{ a_{10}^T }{ \alpha_{11} }{ a_{12}^T }
{ A_{20} }{ a_{21} }{ A_{22} }
, \\
\FlaTwoByOne{ t_{T} }
{ t_{B} } \leftarrow
\FlaThreeByOneT{ t_{0} }
{ \tau_{1} }
{ t_{2} }
,
\FlaTwoByOne{ y_{T} }
{ y_{B} } \leftarrow
\FlaThreeByOneT{ y_{0} }
{ \psi_{1} }
{ y_{2} }
}
\newcommand{\update}{
\begin{array}{l}
{\rm Update~}
\left(
\begin{array}{c}
\psi_1 \\
y_{2}
\end{array}
\right) :=
\left(
I -
\frac{1}{\tau_1}
\left(
\begin{array}{c}
1 \\
u_{21}
\end{array} \right)
\left(
\begin{array}{c c }
1 \amp u_{21}^H
\end{array}
\right)
\right)
\left(
\begin{array}{c}
\psi_1 \\
y_{2}
\end{array}
\right) \\
~~~~~~~~~~{\rm via~the~steps} \\
~~~~~~~~~~~~~~~ \omega_{1} := ( \psi_{1} + a_{21}^H y_{2} )/\tau_1 \\
~~~~~~~~~~~~~
\FlaTwoByOneSingleLine
{ \psi_{1} }
{ y_{2} }
:=
\FlaTwoByOneSingleLine
{ \psi_{1} - { \omega_{1} } }
{ y_{2} - \omega_1 { u_{2} } }
\end{array}
}
\FlaAlgorithm
\end{equation*}
Cost: approximately \(4 m n - n^2 \) flops.