Subsection 4.2.3 Solving the normal equations
¶Let us review a method you have likely seen before for solving the LLS problem when matrix \(A \) has linearly independent columns. We already used these results in Subsection 2.1.1
We wish to solve \(A^H A \hat x = A^H b \text{,}\) where \(A \) has linearly independent columns. If we form \(B = A^H A \) and \(y = A^H b \text{,}\) we can instead solve \(B \hat x = y \text{.}\) Some observations:
Since \(A \) has linearly independent columns, \(B\) is nonsingular. Hence, \(\hat x \) is unique.
\(B \) is Hermitian since \(B^H = (A^H A )^H = A^H (A^H)^H = A^H A = B \text{.}\)
-
\(B \) is Hermitian Positive Definite (HPD): \(x \neq 0 \) implies that \(x^H B x \gt 0 \text{.}\) This follows from the fact that
\begin{equation*} x^H B x = x^H A^H A x = ( A x )^H ( A x ) = \| A x \|_2^2. \end{equation*}Since \(A \) has linearly independent columns, \(x \neq 0 \) implies that \(A x \neq 0\) and hence \(\| A x \|_2^2 \gt 0 \text{.}\)
In Section 5.4, you will find out that since \(B \) is HPD, there exists a lower triangular matrix \(L \) such that \(B = L L^H \text{.}\) This is known as the Cholesky factorization of \(B \text{.}\) The steps for solving the normal equations then become
-
Compute \(B = A^H A \text{.}\)
Notice that since \(B \) is Hermitian symmetric, only the lower or upper triangular part needs to be computed. This is known as a Hermitian rank-k update (where in this case \(k = n \)). The cost is, approximately, \(m n^2 \) flops. (See Subsection C.1.)
-
Compute \(y = A^H b \text{.}\)
The cost of this matrix-vector multiplication is, approximately, \(2 m n \) flops. (See Subsection C.1.)
-
Compute the Cholesky factorization \(B \rightarrow L L^H \text{.}\)
Later we will see that this costs, approximately, \(\frac{1}{3} n^3 \) flops. (See Subsection 5.4.3.)
-
Solve
\begin{equation*} L z = y \end{equation*}(solve with a lower triangular matrix) followed by
\begin{equation*} L^H \hat x = z \end{equation*}(solve with an upper triangular matrix).
Together, these triangular solves cost, approximately, \(2 n^2 \) flops. (See Subsection C.1.)
We will revisit this in Section 5.4.