Subsection 6.4.1 Numerical stability of triangular solve
We now use the error results for the dot product to derive a backward error result for solving \(L x = y \text{,}\) where \(L \) is an \(n \times n \) lower triangular matrix, via the algorithm in Figure 6.4.1.1, a variation on the algorithm in Figure 5.3.5.1 that stores the result in vector \(x \) and does not assume that \(L \) is unit lower triangular.
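The algorithm amounts to forward substitution, which can be sketched in plain NumPy (a minimal illustration, not the notation of Figure 6.4.1.1; the function name `trsv_lower` and the dense-array interface are ours):

```python
import numpy as np

def trsv_lower(L, y):
    """Solve L x = y by forward substitution for a nonsingular lower
    triangular L (not assumed unit lower triangular), storing the
    result in a new vector x."""
    n = L.shape[0]
    x = np.empty(n, dtype=L.dtype)
    for i in range(n):
        # chi_1 := ( psi_1 - l_{10}^T x_0 ) / lambda_{11} for this row
        x[i] = (y[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x
```

Each iteration performs one dot product, one subtraction, and one division; it is the rounding error in exactly this combination that the analysis below quantifies.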
To establish the backward error result for this algorithm, we need to understand the error incurred in the key computation \(\chi_1 := ( \psi_1 - l_{10}^T x_0 ) / \lambda_{11} \text{.}\)
The following theorem gives the required (forward error) result, abstracted away from the specifics of how it occurs in the lower triangular solve algorithm.
Lemma 6.4.1.2.
Let \(n \geq 1 \text{,}\) \(\alpha, \lambda, \nu \in \R \) and \(x, y \in \Rn \text{.}\) Assume \(\lambda \neq 0\) and consider the computation
\begin{equation*} \nu := ( \alpha - x^T y ) / \lambda . \end{equation*}
Then
\begin{equation*} ( \lambda + \delta\!\lambda ) \check \nu = \alpha - ( x + \delta\!x )^T y, \end{equation*}
where \(\vert \delta\!x \vert \leq \gamma_n \vert x \vert \) and \(\vert \delta\!\lambda \vert \leq \gamma_2 \vert \lambda \vert \text{.}\)
Homework 6.4.1.1.
Prove Lemma 6.4.1.2.
Use the Alternative Computations Model (Subsubsection 6.2.3.3) appropriately.
We know that
- From Corollary 6.3.3.2 R-1B: if \(\beta = x^T y \) then \(\check \beta = (x + \delta\!x)^T y \) where \(\vert \delta\!x \vert \leq \gamma_n \vert x \vert \text{.}\)
- From the ACM (Subsubsection 6.2.3.3): if \(\nu = ( \alpha - \beta ) / \lambda \) then
\begin{equation*} \check \nu = \frac{ \alpha - \beta }{\lambda} \frac{1}{( 1 + \epsilon_- ) (1 + \epsilon_/)}, \end{equation*}
where \(\vert \epsilon_-\vert \leq \meps \) and \(\vert \epsilon_/\vert \leq \meps \text{.}\)
Hence
\begin{equation*} \check \nu = \frac{ \alpha - (x + \delta\!x)^T y }{\lambda} \frac{1}{( 1 + \epsilon_- ) (1 + \epsilon_/)}, \end{equation*}
or, equivalently,
\begin{equation*} \lambda ( 1 + \epsilon_- ) (1 + \epsilon_/) \check \nu = \alpha - (x + \delta\!x)^T y, \end{equation*}
or,
\begin{equation*} \lambda ( 1 + \theta_2 ) \check \nu = \alpha - (x + \delta\!x)^T y, \end{equation*}
where \(\vert \theta_2 \vert \leq \gamma_2 \text{,}\) which can also be written as
\begin{equation*} ( \lambda + \delta\!\lambda ) \check \nu = \alpha - (x + \delta\!x)^T y, \end{equation*}
where \(\delta\!\lambda = \theta_2 \lambda \) and hence \(\vert \delta\!\lambda \vert \leq \gamma_2 \vert \lambda \vert \text{.}\)
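The lemma can be probed numerically. It implies the residual bound \(\vert \alpha - x^T y - \lambda \check \nu \vert \leq \gamma_2 \vert \lambda \vert \, \vert \check \nu \vert + \gamma_n \vert x \vert^T \vert y \vert \text{,}\) and the experiment below (our own illustration, not part of the text) performs the computation in single precision and measures the residual in double precision:

```python
import numpy as np

# Probe the residual bound implied by Lemma 6.4.1.2:
#   | alpha - x^T y - lambda * nu_check |
#       <= gamma_2 |lambda| |nu_check| + gamma_n |x|^T |y|.
rng = np.random.default_rng(0)
n = 100
u = np.finfo(np.float32).eps / 2            # unit roundoff for float32
gamma = lambda k: k * u / (1 - k * u)

x = rng.standard_normal(n).astype(np.float32)
y = rng.standard_normal(n).astype(np.float32)
alpha, lam = np.float32(1.0), np.float32(2.0)

nu_check = (alpha - x @ y) / lam            # entire computation in float32

xd, yd = x.astype(np.float64), y.astype(np.float64)
residual = abs(float(alpha) - xd @ yd - float(lam) * float(nu_check))
bound = (gamma(2) * abs(float(lam)) * abs(float(nu_check))
         + gamma(n) * (np.abs(xd) @ np.abs(yd)))
assert residual <= bound
```

In practice the observed residual sits far below the bound, since \(\gamma_n \) captures a worst case that random data rarely approaches.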
The error result for the algorithm in Figure 6.4.1.1 is given by
Theorem 6.4.1.3.
Let \(L \in \Rnxn \) be a nonsingular lower triangular matrix and let \(\check x \) be the computed result when executing Figure 6.4.1.1 to solve \(L x = y \) under the computation model from Subsection 6.2.3. Then there exists a matrix \(\Delta\!L \) such that
\begin{equation*} ( L + \Delta\!L ) \check x = y \mbox{ where } \vert \Delta\!L \vert \leq \max( \gamma_2, \gamma_{n-1} ) \vert L \vert. \end{equation*}
The reasoning behind the result is that one expects the maximal error to be incurred during the final iteration when computing \(\chi_1 := ( \psi_1 - l_{10}^T x_0 ) / \lambda_{11} \text{.}\) This fits Lemma 6.4.1.2, except that this assignment involves a dot product with vectors of length \(n-1 \) rather than of length \(n \text{.}\)
You now prove Theorem 6.4.1.3 by first proving the special cases where \(n = 1 \) and \(n = 2 \text{,}\) and then the general case.
Homework 6.4.1.2.
Prove Theorem 6.4.1.3 for the case where \(n = 1 \text{.}\)
Case 1: \(n =1 \text{.}\)
The system looks like \(\lambda_{11} \chi_{1} = \psi_{1} \) so that
\begin{equation*} \chi_1 = \psi_1 / \lambda_{11} \end{equation*}
and
\begin{equation*} \check \chi_1 = \frac{\psi_1}{\lambda_{11}} \frac{1}{1 + \epsilon_/}, \mbox{ where } \vert \epsilon_/ \vert \leq \meps. \end{equation*}
Rearranging gives us
\begin{equation*} \lambda_{11} ( 1 + \epsilon_/ ) \check \chi_1 = \psi_1 \end{equation*}
or
\begin{equation*} ( \lambda_{11} + \delta\!\lambda_{11} ) \check \chi_1 = \psi_1, \end{equation*}
where \(\delta\!\lambda_{11} = \epsilon_{/} \lambda_{11} \) and hence
\begin{equation*} \vert \delta\!\lambda_{11} \vert \leq \meps \vert \lambda_{11} \vert \leq \gamma_2 \vert \lambda_{11} \vert = \max( \gamma_2, \gamma_{n-1} ) \vert \lambda_{11} \vert . \end{equation*}
Homework 6.4.1.3.
Prove Theorem 6.4.1.3 for the case where \(n = 2 \text{.}\)
Case 2: \(n =2 \text{.}\)
The system now looks like
\begin{equation*} \left( \begin{array}{c | c} \lambda_{00} \amp 0 \\ \hline \lambda_{10} \amp \lambda_{11} \end{array} \right) \left( \begin{array}{c} \chi_0 \\ \chi_1 \end{array} \right) = \left( \begin{array}{c} \psi_0 \\ \psi_1 \end{array} \right) . \end{equation*}
From the proof of Case 1 we know that
\begin{equation} ( \lambda_{00} + \delta\!\lambda_{00} ) \check \chi_0 = \psi_0, \mbox{ where } \vert \delta\!\lambda_{00} \vert \leq \gamma_1 \vert \lambda_{00} \vert.\tag{6.4.1} \end{equation}
Since \(\chi_1 = ( \psi_1 - \lambda_{10} \check \chi_0 )/ \lambda_{11} \text{,}\) Lemma 6.4.1.2 tells us that
\begin{equation} ( \lambda_{10} + \delta\!\lambda_{10} ) \check \chi_0 + ( \lambda_{11} + \delta\!\lambda_{11} ) \check \chi_1 = \psi_1,\tag{6.4.2} \end{equation}
where \(\vert \delta\!\lambda_{10} \vert \leq \gamma_1 \vert \lambda_{10} \vert \) and \(\vert \delta\!\lambda_{11} \vert \leq \gamma_2 \vert \lambda_{11} \vert \text{.}\)
(6.4.1) and (6.4.2) can be combined into
\begin{equation*} \left( \begin{array}{c | c} \lambda_{00} + \delta\!\lambda_{00} \amp 0 \\ \hline \lambda_{10} + \delta\!\lambda_{10} \amp \lambda_{11} + \delta\!\lambda_{11} \end{array} \right) \left( \begin{array}{c} \check \chi_0 \\ \check \chi_1 \end{array} \right) = \left( \begin{array}{c} \psi_0 \\ \psi_1 \end{array} \right), \end{equation*}
where
\begin{equation*} \left( \begin{array}{c | c} \vert \delta\!\lambda_{00} \vert \amp 0 \\ \hline \vert \delta\!\lambda_{10} \vert \amp \vert \delta\!\lambda_{11} \vert \end{array} \right) \leq \left( \begin{array}{c | c} \gamma_1 \vert \lambda_{00} \vert \amp 0 \\ \hline \gamma_1 \vert \lambda_{10} \vert \amp \gamma_2 \vert \lambda_{11} \vert \end{array} \right) . \end{equation*}
Since \(\gamma_1 \leq \gamma_2 \text{,}\)
\begin{equation*} \left\vert \left( \begin{array}{c | c} \delta\!\lambda_{00} \amp 0 \\ \hline \delta\!\lambda_{10} \amp \delta\!\lambda_{11} \end{array} \right) \right\vert \leq \max( \gamma_2, \gamma_{1} ) \left\vert \left( \begin{array}{c | c} \lambda_{00} \amp 0 \\ \hline \lambda_{10} \amp \lambda_{11} \end{array} \right) \right\vert = \max( \gamma_2, \gamma_{n-1} ) \vert L \vert . \end{equation*}
Homework 6.4.1.4.
Prove Theorem 6.4.1.3 for \(n \geq 1 \text{.}\)
We will utilize a proof by induction.
- Case 1: \(n = 1 \text{.}\) See Homework 6.4.1.2.
- Case 2: \(n = 2 \text{.}\) See Homework 6.4.1.3.
- Case 3: \(n \gt 2 \text{.}\)
The system now looks like
\begin{equation} \left( \begin{array}{c | c} L_{00} \amp 0 \\ \hline l_{10}^T \amp \lambda_{11} \end{array} \right) \left( \begin{array}{c} x_0 \\ \chi_1 \end{array} \right) = \left( \begin{array}{c} y_0 \\ \psi_1 \end{array} \right) ,\label{trsv-eq-1c}\tag{6.4.3} \end{equation}where \(L_{00} \in \R^{(n-1) \times (n-1)} \text{,}\) and the inductive hypothesis states that
\begin{equation*} (L_{00} + \Delta\!L_{00}) \check x_0 = y_0 \mbox{ where } \vert \Delta\!L_{00} \vert \leq \max( \gamma_2, \gamma_{n-2} ) \vert L_{00} \vert. \end{equation*}Since \(\chi_1 = ( \psi_1 - l_{10}^T \check x_0 )/ \lambda_{11} \text{,}\) Lemma 6.4.1.2 tells us that
\begin{equation} ( l_{10} + \delta\!l_{10})^T\check x_0 + (\lambda_{11} + \delta\!\lambda_{11}) \check \chi_1 = \psi_1,\label{trsv-eq-2c}\tag{6.4.4} \end{equation}where \(\vert \delta\!l_{10} \vert \leq \gamma_{n-1} \vert l_{10} \vert \) and \(\vert \delta\!\lambda_{11} \vert \leq \gamma_2 \vert \lambda_{11} \vert \text{.}\)
(6.4.3) and (6.4.4) can be combined into
\begin{equation*} \left( \begin{array}{c | c} L_{00} + \delta\!L_{00} \amp 0 \\ \hline (l_{10} + \delta\!l_{10})^T \amp \lambda_{11} + \delta\!\lambda_{11} \end{array} \right) \left( \begin{array}{c} \check x_0 \\ \check \chi_1 \end{array} \right) = \left( \begin{array}{c} y_0 \\ \psi_1 \end{array} \right), \end{equation*}where
\begin{equation*} \left( \begin{array}{c | c} \vert \delta\!L_{00} \vert \amp 0 \\ \hline \vert \delta\!l_{10}^T \vert \amp \vert \delta\!\lambda_{11} \vert \end{array} \right) \leq \left( \begin{array}{c | c} \max( \gamma_2, \gamma_{n-2} ) \vert L_{00} \vert\amp 0 \\ \hline \gamma_{n-1} \vert l_{10}^T \vert \amp \gamma_2 \vert \lambda_{11} \vert \end{array} \right) \end{equation*}and hence
\begin{equation*} \left\vert \left( \begin{array}{c | c} \delta\!L_{00} \amp 0 \\ \hline \delta\!l_{10}^T \amp \delta\!\lambda_{11} \end{array} \right) \right\vert \leq \max( \gamma_2, \gamma_{n-1} ) \left\vert \left( \begin{array}{c | c} L_{00} \amp 0 \\ \hline l_{10}^T \amp \lambda_{11} \end{array} \right) \right\vert. \end{equation*} By the Principle of Mathematical Induction, the result holds for all \(n \geq 1 \text{.}\)
A careful examination of the solution to Homework 6.4.1.2, together with the fact that \(\gamma_{n-1} \leq \gamma_n \text{,}\) allows us to state a slightly looser, but cleaner, version of Theorem 6.4.1.3:
Corollary 6.4.1.4.
Let \(L \in \Rnxn \) be a nonsingular lower triangular matrix and let \(\check x \) be the computed result when executing Figure 6.4.1.1 to solve \(L x = y \) under the computation model from Subsection 6.2.3. Then there exists a matrix \(\Delta\!L \) such that
\begin{equation*} ( L + \Delta\!L ) \check x = y \mbox{ where } \vert \Delta\!L \vert \leq \gamma_n \vert L \vert. \end{equation*}
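The corollary can be checked empirically: it implies the componentwise residual bound \(\vert y - L \check x \vert \leq \gamma_n \vert L \vert \, \vert \check x \vert \text{.}\) The following single-precision experiment (our own illustration, not part of the text; all names are ours) probes that bound:

```python
import numpy as np

# Probe the componentwise residual bound implied by Corollary 6.4.1.4:
#   |y - L x_check| <= gamma_n |L| |x_check|   (elementwise).
rng = np.random.default_rng(1)
n = 50
u = np.finfo(np.float32).eps / 2            # unit roundoff for float32
gamma_n = n * u / (1 - n * u)

L = np.tril(rng.standard_normal((n, n))).astype(np.float32)
np.fill_diagonal(L, np.abs(np.diag(L)) + 1.0)   # keep L safely nonsingular
y = rng.standard_normal(n).astype(np.float32)

x = np.empty(n, dtype=np.float32)           # forward substitution in float32
for i in range(n):
    x[i] = (y[i] - L[i, :i] @ x[:i]) / L[i, i]

Ld, xd = L.astype(np.float64), x.astype(np.float64)
residual = np.abs(y.astype(np.float64) - Ld @ xd)
bound = gamma_n * (np.abs(Ld) @ np.abs(xd))
assert np.all(residual <= bound)
```

Note that the bound is componentwise in \(\vert L \vert \, \vert \check x \vert \text{,}\) which is stronger than a normwise statement: rows with small entries get correspondingly small residual guarantees.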