Subsection 5.2.4 Gaussian elimination via Gauss transforms
Definition 5.2.4.1 . A matrix \(L_k \) of the form
\begin{equation*}
L_k =
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ l_{21} }{ I },
\end{equation*}
where \(I_k \) is the \(k \times k \) identity matrix and \(I \) is an identity matrix "of appropriate size," is called a Gauss transform. When applied to a matrix, a Gauss transform takes multiples of the row indexed with \(k \) and adds these multiples to the rows below it. In our use of Gauss transforms to explain the LU factorization, we subtract instead:
Example 5.2.4.2 .
Evaluate
\begin{equation*}
\left( \begin{array}{c | c c c }
1 \amp 0 \amp 0 \amp 0 \\ \hline
0 \amp 1 \amp 0 \amp 0 \\
0 \amp - \lambda_{21} \amp 1 \amp 0 \\
0 \amp - \lambda_{31} \amp 0 \amp 1
\end{array}
\right)
\left(
\begin{array}{c}
\widetilde a_0^T \\ \hline
\widetilde a_1^T \\
\widetilde a_2^T \\
\widetilde a_3^T
\end{array}
\right)
=
\end{equation*}
Solution
\begin{equation*}
\left( \begin{array}{c | c c c }
1 \amp 0 \amp 0 \amp 0 \\ \hline
0 \amp 1 \amp 0 \amp 0 \\
0 \amp - \lambda_{21} \amp 1 \amp 0 \\
0 \amp - \lambda_{31} \amp 0 \amp 1
\end{array}
\right)
\left(
\begin{array}{c}
\widetilde a_0^T \\ \hline
\widetilde a_1^T \\
\widetilde a_2^T \\
\widetilde a_3^T
\end{array}
\right)
=
\left(
\begin{array}{c}
\widetilde a_0^T \\ \hline
\widetilde a_1^T \\
\left( \begin{array}{c}
\widetilde a_2^T\\
\widetilde a_3^T
\end{array}
\right)
-
\left( \begin{array}{c}
\lambda_{21} \\
\lambda_{31}
\end{array}
\right)
\widetilde a_1^T
\end{array}
\right)
=
\left(
\begin{array}{c}
\widetilde a_0^T \\ \hline
\widetilde a_1^T \\
\widetilde a_2^T - \lambda_{21} \widetilde a_1^T\\
\widetilde a_3^T - \lambda_{31} \widetilde a_1^T\\
\end{array}
\right).
\end{equation*}
Notice the similarity with what one does in Gaussian elimination: taking multiples of one row and subtracting these from other rows.
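To see this computation concretely, here is a small Python/NumPy sketch (not part of the original materials) that builds the \(4 \times 4 \) Gauss transform from the example for made-up values of \(\lambda_{21} \) and \(\lambda_{31} \) and verifies that applying it subtracts the corresponding multiples of \(\widetilde a_1^T \) from the rows below it.
\begin{verbatim}
import numpy as np

# Made-up values for lambda_21, lambda_31 and for the rows a~_0^T, ..., a~_3^T.
lam21, lam31 = 2.0, -3.0
A = np.arange(16, dtype=float).reshape(4, 4)   # row i plays the role of a~_i^T

G = np.eye(4)              # the Gauss transform from Example 5.2.4.2
G[2, 1] = -lam21
G[3, 1] = -lam31

result = G @ A
# Rows 2 and 3 become a~_2^T - lam21 a~_1^T and a~_3^T - lam31 a~_1^T:
assert np.allclose(result[2], A[2] - lam21 * A[1])
assert np.allclose(result[3], A[3] - lam31 * A[1])
print(result)
\end{verbatim}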
Homework 5.2.4.1 .
Evaluate
\begin{equation*}
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ -l_{21} }{ I }
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}}
\end{equation*}
where \(I_k \) is the \(k \times k \) identity matrix and \(A_{00} \) has \(k \) rows. If we compute
\begin{equation*}
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{\widehat a_{21}}{\widehat A_{22}}
:=
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ -l_{21} }{ I }
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}}
\end{equation*}
how should \(l_{21} \) be chosen if we want \(\widehat a_{21}\) to be a zero vector?
Solution
\begin{equation*}
\begin{array}{l}
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ - l_{21} }{ I }
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}} \\
~~~ =
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{-l_{21} \alpha_{11} + a_{21}}{
-l_{21} a_{12}^T + A_{22}} \\
~~~ =
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21} -\alpha_{11} l_{21}}
{A_{22} -l_{21} a_{12}^T}
\end{array}
\end{equation*}
If \(l_{21} = a_{21} / \alpha_{11} \) then \(\widehat a_{21} = a_{21} - \alpha_{11} l_{21} = a_{21} - \alpha_{11} ( a_{21} /
\alpha_{11} ) = 0 \text{.}\)
Hopefully you notice the parallels between the computation in the last homework and the algorithm in Figure 5.2.1.1 .
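A quick numerical illustration of the last homework, as a Python/NumPy sketch with a made-up \(4 \times 4 \) matrix (an assumption for the sake of the example): with \(k = 1 \text{,}\) choosing \(l_{21} = a_{21} / \alpha_{11} \) and applying the Gauss transform zeroes out the entries below \(\alpha_{11} \text{.}\)
\begin{verbatim}
import numpy as np

# A matrix whose first column below the diagonal has already been eliminated,
# so A_00 is the 1 x 1 upper triangular block and k = 1.
A = np.array([[2., 1., 1., 0.],
              [0., 1., 1., 1.],
              [0., 3., 5., 5.],
              [0., 5., 8., 9.]])
k = 1
alpha11 = A[k, k]
l21 = A[k+1:, k] / alpha11        # the choice from Homework 5.2.4.1
Lk = np.eye(4)
Lk[k+1:, k] = -l21                # the Gauss transform
A = Lk @ A
print(A[k+1:, k])                 # a21 - alpha11 * l21 = 0
\end{verbatim}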
Now, assume that the right-looking LU factorization has proceeded to where \(A \) contains
\begin{equation*}
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}},
\end{equation*}
where \(A_{00} \) is upper triangular (recall: it is being overwritten by \(U \text{!}\)). What we would like to do is eliminate the elements in \(a_{21} \) by taking multiples of the "current row" \(\left( \begin{array}{c| c}\alpha_{11} \amp a_{12}^T
\end{array}\right) \) and subtracting these from the remaining rows \(\left( \begin{array}{c| c} a_{21} \amp A_{22}
\end{array}\right) \) in order to introduce zeroes below \(\alpha_{11} \text{.}\) The vehicle is an appropriately chosen Gauss transform, inspired by Homework 5.2.4.1 . We must determine \(l_{21} \) so that
\begin{equation*}
\FlaThreeByThreeBR
{I}{0}{0}
{0}{1}{0}
{0}{-l_{21}}{I}
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}}
=
\FlaThreeByThreeBR
{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{0}{A_{22} - l_{21} a_{12}^T}.
\end{equation*}
As we saw in Homework 5.2.4.1 , this means we must pick \(l_{21} = a_{21} / \alpha_{11} \text{.}\) The resulting algorithm is summarized in Figure 5.2.4.3 . Notice that this algorithm is, once again, identical to the algorithm in Figure 5.2.1.1 (except that it does not overwrite the lower triangular matrix).
\begin{equation*}
\newcommand{\routinename}{ A = \mbox{GE-via-Gauss-transforms}( A )}
\newcommand{\guard}{
n( A_{TL} ) \lt n( A )
}
\newcommand{\partitionings}{
A \rightarrow
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
% ,
% L \rightarrow
% \FlaTwoByTwo{L_{TL}}{L_{TR}}
% {L_{BL}}{L_{BR}}
}
\newcommand{\partitionsizes}{
A_{TL} {\rm ~is~} 0 \times 0
% , L_{TL} {\rm ~are~} 0 \times 0
}
\newcommand{\repartitionings}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\rightarrow
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
% ,
% \FlaTwoByTwo{L_{TL}}{L_{TR}}
% {L_{BL}}{L_{BR}}
% \rightarrow
% \FlaThreeByThreeBR{L_{00}}{l_{01}}{L_{02}}
% {l_{10}^T}{\lambda_{11}}{l_{12}^T}
% {L_{20}}{l_{21}}{L_{22}}
}
\newcommand{\moveboundaries}{
\FlaTwoByTwo{A_{TL}}{A_{TR}}
{A_{BL}}{A_{BR}}
\leftarrow
\FlaThreeByThreeTL{A_{00}}{a_{01}}{A_{02}}
{a_{10}^T}{\alpha_{11}}{a_{12}^T}
{A_{20}}{a_{21}}{A_{22}}
% ,
% \FlaTwoByTwo{L_{TL}}{L_{TR}}
% {L_{BL}}{L_{BR}}
% \leftarrow
% \FlaThreeByThreeTL{L_{00}}{l_{01}}{L_{02}}
% {l_{10}^T}{\lambda_{11}}{l_{12}^T}
% {L_{20}}{l_{21}}{L_{22}}
}
\newcommand{\update}{
\begin{array}{ll}
l_{21} \becomes a_{21} / \alpha_{11} \\
\\
\left.
\begin{array}{l}
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}} \\
~~~~~
\becomes
\FlaThreeByThreeBR{I}{0}{0}
{0}{1}{0}
{0}{-l_{21}}{I}
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{a_{21}}{A_{22}} \\
~~~~~=~
\FlaThreeByThreeBR{A_{00}}{a_{01}}{A_{02}}
{0}{\alpha_{11}}{a_{12}^T}
{0}{0}{A_{22}-l_{21} a_{12}^T}
\end{array}
\right\}
\begin{array}{l}
a_{21} := 0 \\
A_{22} := A_{22} - l_{21} a_{12}^T
\end{array}
\end{array}
}
\FlaAlgorithm
\end{equation*}
Figure 5.2.4.3. Gaussian elimination, formulated as a sequence of applications of Gauss transforms.
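The algorithm in Figure 5.2.4.3 can be sketched in Python/NumPy as follows. This is only an illustration under the assumption that no zero pivot is encountered, and it forms and applies each Gauss transform explicitly, which is wasteful but mirrors the derivation above.
\begin{verbatim}
import numpy as np

def ge_via_gauss_transforms(A):
    """Return U = L_{n-2} ... L_1 L_0 A by applying one Gauss transform per column.

    A sketch of Figure 5.2.4.3: no pivoting, so all pivots are assumed nonzero.
    (The final transform L_{n-1} is the identity and is skipped.)"""
    A = np.array(A, dtype=float)    # work on a copy
    n = A.shape[0]
    for k in range(n - 1):
        l21 = A[k+1:, k] / A[k, k]  # l21 := a21 / alpha11
        Lk = np.eye(n)
        Lk[k+1:, k] = -l21          # the Gauss transform of (5.2.2)
        A = Lk @ A                  # a21 := 0,  A22 := A22 - l21 * a12^T
    return A

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
print(ge_via_gauss_transforms(A))   # upper triangular U
\end{verbatim}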
Homework 5.2.4.2 .
Show that
\begin{equation*}
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ -l_{21} }{ I }^{-1}
=
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ l_{21} }{ I}
\end{equation*}
where \(I_k \) denotes the \(k \times k \) identity matrix.
Hint To show that \(B = A^{-1} \text{,}\) it suffices to show that \(B A = I \) (if \(A \) and \(B \) are square).
Solution
\begin{equation*}
\begin{array}{l}
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ -l_{21} }{ I_{(n-k-1) \times (n-k-1)} }
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ l_{21} }{ I } \\
~~~ =
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ -l_{21} + l_{21} }{ I } \\
~~~ =
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ 0 }{ I }
\end{array}
\end{equation*}
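In Python/NumPy, the result of this homework can be checked numerically (a throwaway sketch with an arbitrary \(l_{21} \)): the inverse of a Gauss transform is obtained by simply flipping the sign of \(l_{21} \text{.}\)
\begin{verbatim}
import numpy as np

n, k = 5, 1
l21 = np.array([2.0, -1.0, 0.5])    # arbitrary vector of length n - k - 1

Lk = np.eye(n)
Lk[k+1:, k] = -l21                   # the Gauss transform
Lk_inv = np.eye(n)
Lk_inv[k+1:, k] = l21                # same matrix, sign of l21 flipped

print(np.allclose(Lk_inv @ Lk, np.eye(n)))   # True: it is the inverse
\end{verbatim}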
Starting with an \(n \times n \) matrix \(A \text{,}\) the algorithm computes a sequence of \(n \) Gauss transforms \(L_0 , \ldots , L_{n-1} \text{,}\) each of the form
\begin{equation}
L_k =
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ -l_{21} }{ I }, \label{chapter05-gauss-transform-eqn1}\tag{5.2.2}
\end{equation}
such that \(L_{n-1} L_{n-2} \cdots L_1 L_0 A = U \text{.}\) Equivalently, \(A = L_0^{-1} L_1^{-1} \cdots L_{n-2}^{-1} L_{n-1}^{-1} U
\text{,}\) where
\begin{equation*}
L_k^{-1} =
\FlaThreeByThreeBR
{ I_k }{ 0 }{ 0 }
{ 0 }{ 1 }{ 0 }
{ 0 }{ l_{21} }{ I }.
\end{equation*}
It is easy to show that the product of unit lower triangular matrices is itself unit lower triangular. Hence
\begin{equation*}
L = L_0^{-1} L_1^{-1} \cdots L_{n-2}^{-1} L_{n-1}^{-1}
\end{equation*}
is unit lower triangular. However, it turns out that this \(L \) is particularly easy to compute, as the following homework suggests.
Homework 5.2.4.3 .
Let
\begin{equation*}
\tilde L_{k-1} = L_0^{-1} L_1^{-1} \cdots L_{k-1}^{-1}
=
\FlaThreeByThreeBR
{ L_{00} }{0}{0}
{ l_{10}^T }{1}{0}
{ L_{20} }{0}{I}
\quad
\mbox{and}
\quad
L_k^{-1} =
\FlaThreeByThreeBR
{ I_k }{0}{0}
{ 0 }{1}{0}
{ 0 }{ l_{21} }{I},
\end{equation*}
where \(L_{00} \) is a \(k \times k \) unit lower triangular matrix. Show that
\begin{equation*}
\tilde L_{k} = \tilde L_{k-1} L_k^{-1} =
\FlaThreeByThreeBR
{ L_{00} }{0}{0}
{ l_{10}^T }{1}{0}
{ L_{20} }{ l_{21}}{I}.
\end{equation*}
Solution
\begin{equation*}
\begin{array}{rcl}
\tilde L_{k} \amp = \amp
L_0^{-1} L_1^{-1} \cdots L_{k-1}^{-1} L_k^{-1}
=
\tilde L_{k-1} L_k^{-1} \\
\amp = \amp
\FlaThreeByThreeBR
{ L_{00} }{0}{0}
{ l_{10}^T }{1}{0}
{ L_{20} }{ 0 }{I}
\FlaThreeByThreeBR
{ I_k }{0}{0}
{ 0 }{1}{0}
{ 0 }{ l_{21}}{I} \\
\amp = \amp
\FlaThreeByThreeBR
{ L_{00} }{0}{0}
{ l_{10}^T }{1}{0}
{ L_{20} }{ l_{21}}{I}.
\end{array}
\end{equation*}
What this exercise shows is that \(L = L_0^{-1} L_1^{-1} \cdots L_{n-2}^{-1} L_{n-1}^{-1} \) is the unit lower triangular matrix obtained by simply placing the computed vectors \(l_{21} \) below its diagonal. This insight explains the "magic" observed in Homework 5.2.1.3 . We conclude that the algorithm in Figure 5.2.1.1 overwrites the \(n \times n \) matrix \(A \) with a unit lower triangular matrix \(L \) and an upper triangular matrix \(U \) such that \(A = L U \text{.}\) This is known as the LU factorization or LU decomposition of \(A \text{.}\)
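This insight translates directly into code. The following Python/NumPy sketch (again only an illustration, assuming nonzero pivots and no pivoting) accumulates \(L \) by placing each computed \(l_{21} \) below the diagonal while the rest of the computation produces \(U \text{,}\) and then checks that \(A = L U \text{.}\)
\begin{verbatim}
import numpy as np

def lu_right_looking(A):
    """Return unit lower triangular L and upper triangular U with A = L U.

    A sketch without pivoting: each l21 is simply stored below the diagonal of L."""
    A = np.array(A, dtype=float)     # work on a copy; it is overwritten with U
    n = A.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        l21 = A[k+1:, k] / A[k, k]
        L[k+1:, k] = l21                             # place l21 below the diagonal of L
        A[k+1:, k] = 0.0                             # a21 := 0
        A[k+1:, k+1:] -= np.outer(l21, A[k, k+1:])   # A22 := A22 - l21 a12^T
    return L, A

A = np.array([[2., 1., 1.],
              [4., 3., 3.],
              [8., 7., 9.]])
L, U = lu_right_looking(A)
print(np.allclose(L @ U, A))          # True: A = L U
\end{verbatim}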
Ponder This 5.2.4.4 .
Let
\begin{equation*}
L_k = \left( \begin{array}{c | c c}
I_{k \times k} \amp 0 \amp 0 \\ \hline
0 \amp 1 \amp 0 \\
0 \amp - l_{21} \amp I
\end{array} \right).
\end{equation*}
Show that
\begin{equation*}
\kappa_2( L_k ) \geq \| l_{21} \|_2^2.
\end{equation*}
What does this mean about how error in \(A \) may be amplified if the pivot (the \(\alpha_{11} \) by which entries in \(a_{21} \) are divided to compute \(l_{21} \)) encountered in the right-looking LU factorization algorithm is small in magnitude relative to the elements below it? How can we choose which row to swap so as to minimize \(\| l_{21} \|_2 \text{?}\)
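The inequality itself is easy to explore numerically. The sketch below (Python/NumPy, with made-up values) compares \(\kappa_2( L_k ) \) to \(\| l_{21} \|_2^2 \) when the pivot \(\alpha_{11} \) is small relative to the entries of \(a_{21} \text{.}\)
\begin{verbatim}
import numpy as np

# A small pivot relative to the entries below it makes l21 large ...
alpha11 = 1.0e-4
a21 = np.array([1.0, 2.0, -1.0])
l21 = a21 / alpha11

# ... and hence makes the Gauss transform L_k ill-conditioned.
n, k = 5, 1
Lk = np.eye(n)
Lk[k+1:, k] = -l21

kappa2 = np.linalg.cond(Lk, 2)
norm_sq = np.linalg.norm(l21)**2
print(kappa2, norm_sq, kappa2 >= norm_sq)   # the last value prints True
\end{verbatim}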