Subsection 1.5.1 Condition number estimation
¶It has been observed that high-quality numerical software should not only provide routines for solving a given problem, but, when possible, should also (optionally) provide the user with feedback on the conditioning (sensitivity to changes in the input) of the problem. In this enrichment, we relate this to what you have learned this week.
Given a vector norm \(\| \cdot \| \) and induced matrix norm \(\| \cdot \| \text{,}\) the condition number of matrix \(A \) using that norm is given by \(\kappa( A ) = \| A \| \| A^{-1} \| \text{.}\) When trying to practically compute the condition number, this leads to two issues:
Which norm should we use? A case has been made in this week that the 1-norm and \(\infty \)-norm are canditates since they are easy and cheap to compute.
It appears that \(A^{-1} \) needs to be computed. We will see in future weeks that this is costly: \(O( m^3 ) \) computation when \(A \) is \(m \times m \text{.}\) This is generally considered to be expensive.
This leads to the question "Can a reliable estimate of the condition number be cheaply computed?" In this unit, we give a glimpse of how this can be achieved and then point the interested learner to related papers.
Partition \(m \times m \) matrix \(A \text{:}\)
We recall that
-
The \(\infty \)-norm is defined by
\begin{equation*} \| A \|_\infty = \max_{\| x \|_\infty = 1}\| A x \|_\infty. \end{equation*} -
From Homework 1.3.6.2, we know that the \(\infty \)-norm can be practically computed as
\begin{equation*} \| A \|_\infty = \max_{0 \leq i \lt m} \| \widetilde a_i \|_1, \end{equation*}where \(\widetilde a_i = (\widetilde a_i^T)^T \text{.}\) This means that \(\| A \|_\infty \) can be computed in \(O( m^2 ) \) operations.
-
From the solution to Homework 1.3.6.2, we know that there is a vector \(x \) with \(\vert \chi_j \vert = 1 \) for \(0 \leq j \lt m \) such that \(\| A \|_\infty = \| A x \|_\infty \text{.}\) This \(x \) satisfies \(\| x \|_\infty = 1 \text{.}\)
More precisely: \(\| A \|_\infty = \| \widetilde a_k^T \|_1 \) for some \(k \text{.}\) For simplicity, assume \(A \) is real valued. Then
\begin{equation*} \begin{array}{rcl} \| A \|_\infty \amp = \amp \vert \alpha_{k,0} \vert + \cdots + \vert \alpha_{k,m-1} \vert \\ \amp = \amp \alpha_{k,0} \chi_0 + \cdots + \alpha_{k,m-1} \chi_{m-1}, \end{array} \end{equation*}where each \(\chi_j = \pm 1 \) is chosen so that \(\chi_j \alpha_{k,j} = \vert \alpha_{k,j} \vert \text{.}\) That vector \(x \) then has the property that \(\| A \|_\infty = \| \widetilde a_k \|_1 = \| A x \|_\infty \text{.}\)
From this we conclude that
where \({\cal S} \) is the set of all vectors \(x \) with \(\vert \chi_j \vert =1 \text{,}\) \(0 \leq j \lt n \text{.}\)
We will illustrate the techniques that underly efficient condition number estimation by looking at the simpler case where we wish to estimate the condition number of a real-valued nonsingular upper triangular \(m \times m \) matrix \(U \text{,}\) using the \(\infty\)-norm. Since \(U \) is real-valued, \(\vert \chi_i \vert =1 \) means \(\chi_i = \pm 1 \text{.}\) The problem is that it appears we must compute \(\| U^{-1} \|_\infty \text{.}\) Computing \(U^{-1} \) when \(U \) is dense requires \(O( m^3 ) \) operations (a topic we won't touch on until much later in the course).
Our observations tell us that
where \({\cal S} \) is the set of all vectors \(x \) with elements \(\chi_i \in \{ -1, 1 \} \text{.}\) This is equivalent to
where \({\cal T} \) is the set of all vectors \(z \) that satisfy \(U z = y \) for some \(y \) with elements \(\psi_i \in \{ -1, 1 \} \text{.}\) So, we could solve \(U z = y \) for all vectors \(y \in {\cal S} \text{,}\) compute the \(\infty \)-norm for all those vectors \(z \text{,}\) and pick the maximum of those values. But that is not practical.
One simple solution is to try to construct a vector \(y \) that results in a large amplification (in the \(\infty\)-norm) when solving \(U z = y \text{,}\) and to then use that amplification as an estimate for \(\| U^{-1} \|_\infty \text{.}\) So how do we do this? Consider
Here is a heuristic for picking \(y\in {\cal S} \text{:}\)
-
We want to pick \(\psi_{m-1} \in \{ -1,1 \} \) in order to construct a vector \(y \in {\cal S} \text{.}\) We can pick \(\psi_{m-1} = 1 \) since picking it equal to \(-1\) will simply carry through negation in the appropriate way in the scheme we are describing.
From this \(\psi_{m-1} \) we can compute \(\zeta_{m-1} \text{.}\)
-
Now,
\begin{equation*} \upsilon_{m-2,m-2} \zeta_{m-2} + \upsilon_{m-2,m-1} \zeta_{m-1}= \psi_{m-2} \end{equation*}where \(\zeta_{m-1} \) is known and \(\psi_{m-2}\) can be strategically chosen. We want \(z \) to have a large \(\infty \)-norm and hence a heuristic is to now pick \(\psi_{m-2} \in \{ -1,1 \} \) in such a way that \(\zeta_{m-2} \) is as large as possible in magnitude.
With this \(\psi_{m-2} \) we can compute \(\zeta_{m-2} \text{.}\)
And so forth!
When done, the magnification equals \(\| z \|_\infty = \vert \zeta_k \vert \text{,}\) where \(\zeta_k \) is the element of \(z\) with largest magnitude. This approach provides an estimate for \(\| U^{-1} \|_\infty \) with \(O( m^2 ) \) operations.
The described method underlies the condition number estimator for LINPACK, developed in the 1970s [16], as described in [11]:
A.K. Cline, C.B. Moler, G.W. Stewart, and J.H. Wilkinson, An estimate for the condition number of a matrix, SIAM J. Numer. Anal., 16 (1979).
The method discussed in that paper yields a lower bound on \(\| A^{-1} \|_\infty\) and with that on \(\kappa_\infty( A ) \text{.}\)
Remark 1.5.1.1.
Alan Cline has his office on our floor at UT-Austin. G.W. (Pete) Stewart was Robert's Ph.D. advisor. Cleve Moler is the inventor of Matlab. John Wilkinson received the Turing Award for his contributions to numerical linear algebra.
More sophisticated methods are discussed in [21]:
N. Higham, A Survey of Condition Number Estimates for Triangular Matrices, SIAM Review, 1987.
His methods underlie the LAPACK [1] condition number estimator and are remarkably accurate: most of the time they provides an almost exact estimate of the actual condition number.