Subsection 4.4.1 \(A \) has linearly independent columns
Theorem 4.4.1.1.
Assume \(A \in \Cmxn \) has linearly independent columns and let \(A = Q R \) be its QR factorization with orthonormal matrix \(Q \in \Cmxn \) and upper triangular matrix \(R \in \Cnxn \text{.}\) Then the LLS problem
\begin{equation*}
\min_x \| b - A x \|_2
\end{equation*}
is solved by the unique solution \(\widehat x \) of
\begin{equation*}
R \widehat x = Q^H b \text{.}
\end{equation*}
Proof 1.
Since \(A = Q R \text{,}\) minimizing \(\| b - A x \|_2 \) means minimizing
\begin{equation*}
\| b - Q R x \|_2 \text{.}
\end{equation*}
Since \(R \) is nonsingular, we can first find the \(z \) that minimizes
\begin{equation*}
\| b - Q z \|_2 \text{,}
\end{equation*}
after which we can solve \(R x = z \) for \(x \text{.}\) But from the Method of Normal Equations we know that the minimizing \(z\) solves
\begin{equation*}
Q^H Q z = Q^H b \text{.}
\end{equation*}
Since \(Q \) has orthonormal columns, \(Q^H Q = I \) and we thus deduce that
\begin{equation*}
z = Q^H b \text{.}
\end{equation*}
Hence, the desired \(\widehat x \) must satisfy
\begin{equation*}
R \widehat x = Q^H b \text{.}
\end{equation*}
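The result in Proof 1 is easy to check numerically. Below is a minimal sketch, assuming NumPy, in which the matrix sizes, random data, and variable names are illustrative rather than taken from the text: it forms the reduced QR factorization of a random \(A \) with linearly independent columns, solves \(R \widehat x = Q^H b \text{,}\) and compares the result to a reference least squares solver.
\begin{verbatim}
import numpy as np

# Illustrative sketch: solve the LLS problem min_x || b - A x ||_2 via A = Q R.
# The sizes m, n and the random data are placeholders, not from the text.
rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))   # columns are (almost surely) linearly independent
b = rng.standard_normal(m)

Q, R = np.linalg.qr(A, mode="reduced")        # Q: m x n orthonormal, R: n x n upper triangular
x_hat = np.linalg.solve(R, Q.conj().T @ b)    # solve R x = Q^H b (R is nonsingular)

x_ref, *_ = np.linalg.lstsq(A, b, rcond=None) # reference LLS solution
print(np.allclose(x_hat, x_ref))              # True: both give the unique LLS solution
\end{verbatim}
In practice one would exploit the triangularity of \(R \) with a back substitution (a triangular solve); the general solve above is used only to keep the sketch short and self-contained.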
Proof 2.
Let \(A = Q_L R_{TL} \) be the QR factorization of \(A \text{.}\) We know that then there exists a matrix \(Q_R \) such that \(Q = \FlaOneByTwo{ Q_L }{ Q_R } \) is unitary: the columns of \(Q_R \) form an orthonormal basis for the space orthogonal to the space spanned by the columns of \(Q_L \text{.}\) Now,
\begin{equation*}
\begin{array}{rcl}
\min_x \| b - A x \|_2^2
& = & \min_x \| b - Q_L R_{TL} x \|_2^2 \\
& = & \min_x \| Q^H ( b - Q_L R_{TL} x ) \|_2^2 \\
& = & \min_x \left\| \left( \begin{array}{c} Q_L^H b \\ Q_R^H b \end{array} \right) - \left( \begin{array}{c} R_{TL} x \\ 0 \end{array} \right) \right\|_2^2 \\
& = & \min_x \left( \| Q_L^H b - R_{TL} x \|_2^2 + \| Q_R^H b \|_2^2 \right) \text{,}
\end{array}
\end{equation*}
where the second equality holds because multiplication by the unitary matrix \(Q^H \) preserves the 2-norm.
Thus, the desired \(\widehat x \) that minimizes the linear least squares problem solves \(R_{TL} \widehat x = Q_L^H b \text{.}\) The solution is unique because \(R_{TL} \) is nonsingular (because \(A \) has linearly independent columns).
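The decomposition in Proof 2 also identifies the minimal residual: it equals \(\| Q_R^H b \|_2 \text{.}\) The following sketch, again assuming NumPy and with illustrative names mirroring the proof, uses the full QR factorization to check this.
\begin{verbatim}
import numpy as np

# Illustrative check of Proof 2: with unitary Q = [ Q_L  Q_R ], the squared residual
# splits as || Q_L^H b - R_TL x ||_2^2 + || Q_R^H b ||_2^2, so its minimum is || Q_R^H b ||_2.
rng = np.random.default_rng(1)
m, n = 8, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

Q, R = np.linalg.qr(A, mode="complete")   # Q: m x m unitary, R: m x n
Q_L, Q_R = Q[:, :n], Q[:, n:]             # Q_L spans the column space of A; Q_R its orthogonal complement
R_TL = R[:n, :]                           # n x n upper triangular block

x_hat = np.linalg.solve(R_TL, Q_L.conj().T @ b)      # R_TL x = Q_L^H b
print(np.isclose(np.linalg.norm(b - A @ x_hat),
                 np.linalg.norm(Q_R.conj().T @ b)))  # minimal residual equals || Q_R^H b ||_2
\end{verbatim}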
Homework 4.4.1.1.
Yet another alternative proof for Theorem 4.4.1.1 starts with the observation that the solution is given by \(\widehat x = (A^H A)^{-1} A^H b \) and then substitutes in \(A = Q R \text{.}\) Give a proof that builds on this insight.
Recall that we saw in Subsection 4.2.2 that, if \(A \) has linearly independent columns, the LLS solution is given by \(\widehat x = ( A^H A )^{-1} A^H b\) (the solution to the normal equations). Also, if \(A \) has linearly independent columns and \(A = Q R \) is its QR factorization, then the upper triangular matrix \(R \) is nonsingular (and hence has no zeroes on its diagonal).
Now,
\begin{equation*}
\begin{array}{rcl}
\widehat x
& = & ( A^H A )^{-1} A^H b \\
& = & ( ( Q R )^H ( Q R ) )^{-1} ( Q R )^H b \\
& = & ( R^H Q^H Q R )^{-1} R^H Q^H b \\
& = & ( R^H R )^{-1} R^H Q^H b \\
& = & R^{-1} R^{-H} R^H Q^H b \\
& = & R^{-1} Q^H b \text{,}
\end{array}
\end{equation*}
where we use that \(Q^H Q = I \) and that \(R \) is nonsingular.
Thus, the \(\widehat x \) that solves \(R \widehat x = Q^H b \) solves the LLS problem.
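As a sanity check of the algebra above, here is a brief sketch, assuming NumPy and with illustrative complex test data, confirming that \(( A^H A )^{-1} A^H b \) and \(R^{-1} Q^H b \) agree up to roundoff.
\begin{verbatim}
import numpy as np

# Illustrative check: (A^H A)^{-1} A^H b equals R^{-1} Q^H b when A = Q R
# and A has linearly independent columns.
rng = np.random.default_rng(2)
A = rng.standard_normal((8, 3)) + 1j * rng.standard_normal((8, 3))
b = rng.standard_normal(8) + 1j * rng.standard_normal(8)

Q, R = np.linalg.qr(A, mode="reduced")
x_normal = np.linalg.solve(A.conj().T @ A, A.conj().T @ b)   # normal equations solution
x_qr = np.linalg.solve(R, Q.conj().T @ b)                    # R x = Q^H b

print(np.allclose(x_normal, x_qr))   # True (up to roundoff)
\end{verbatim}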
Ponder This 4.4.1.2.
Create a picture similar to Figure 4.3.2.1 that uses the QR factorization rather than the SVD.