Subsection 4.4.4 \(A \) has linearly dependent columns
¶Let us now consider the case where \(A \in \Cmxn \) has rank \(r \leq n \text{.}\) In other words, it has \(r \) linearly independent columns. Let \(p \in \Rn \) be a permutation vector, by which we mean a permutation of the vector
\begin{equation*}
\left( \begin{array}{c}
0 \\
1 \\
\vdots \\
n-1
\end{array}\right)
\end{equation*}
And \(P( p ) \) be the matrix that, when applied to a vector \(x \in \Cn \) permutes the entries of \(x \) according to the vector \(p \text{:}\)
\begin{equation*}
P( p ) x =
\begin{array}[t]{c}
\underbrace{
\left( \begin{array}{c}
e_{\pi_0}^T \\
e_{\pi_1}^T \\
\vdots \\
e_{\pi_{n-1}}^T
\end{array} \right)
} \\
P( p )
\end{array}
x
=
\left( \begin{array}{c}
e_{\pi_0}^T x\\
e_{\pi_1}^T x\\
\vdots \\
e_{\pi_{n-1}}^T x
\end{array} \right)
=
\left( \begin{array}{c}
\chi_{\pi_0} \\
\chi_{\pi_1} \\
\vdots \\
\chi_{\pi_{n-1}}
\end{array} \right) .
\end{equation*}
where \(e_j \) equals the columns of \(I \in \Rnxn \) indexed with \(j \) (and hence the standard basis vector indexed with \(j \)).
If we apply \(P( p )^T \) to \(A \in \Cmxn \) from the right, we get
\begin{equation*}
\begin{array}{l}
A P( p )^T \\
~~~= ~~~~ \lt \mbox{ definition of } P( p ) \gt \\
A
\left( \begin{array}{c}
e_{\pi_0}^T \\
\vdots \\
e_{\pi_{n-1}}^T
\end{array} \right)^T \\
~~~= ~~~~ \lt \mbox{ transpose } \gt \\
A
\left( \begin{array}{c | c | c }
e_{\pi_0} \amp \cdots \amp
e_{\pi_{n-1}}
\end{array} \right) \\
~~~= ~~~~ \lt \mbox{ matrix multiplication by columns } \gt \\
\left( \begin{array}{c | c | c }
A e_{\pi_0} \amp \cdots \amp
A e_{\pi_{n-1}}
\end{array} \right) \\
~~~= ~~~~ \lt B e_j = b_j \gt \\
\left( \begin{array}{c | c | c }
a_{\pi_0} \amp \cdots \amp
a_{\pi_{n-1}}
\end{array} \right) .
\end{array}
\end{equation*}
In other words, applying the transpose of the permutation matrix to \(A \) from the right permutes its columns as indicated by the permutation vector \(p \text{.}\)
The discussion about permutation matrices gives us the ability to rearrange the columns of \(A \) so that the first \(r = \rank( A ) \) columns are linearly independent.
Theorem 4.4.4.1.
Assume \(A \in \Cmxn \) and that \(r = \rank( A ) \text{.}\) Then there exists a permutation vector \(p \in \Rn \text{,}\) orthonormal matrix \(Q_L \in \C^{m \times
r} \text{,}\) upper triangular matrix \(R_{TL} \in \C^{r \times r} \text{,}\) and \(R_{TR} \in \C^{r \times (n-r)} \) such that
\begin{equation*}
A P( p )^T =
Q_L
\left( \begin{array}{c | c }
R_{TL} \amp R_{TR}
\end{array} \right).
\end{equation*}
Proof.
Let \(p \) be the permutation vector such that the first \(r \) columns of \(A^P = A P( p )^T \) are linearly independent. Partition
\begin{equation*}
A^P = A P( p )^T =
\left( \begin{array}{c | c }
A_L^P \amp A_R^P
\end{array} \right)
\end{equation*}
where \(A_L^P \in \C^{m \times r} \text{.}\) Since \(A_L^P \) has linearly independent columns, its QR factorization, \(A^P = Q_L R_{TL} \text{,}\) exists. Since all the linearly independent columns of matrix \(A\) were permuted to the left, the remaining columns, now part of \(A_R^P \text{,}\) are in the column space of \(A_L^P \) and hence in the column space of \(Q_L \text{.}\) Hence \(A_R^P = Q_L R_{TR} \) for some matrix \(R_{TR}
\text{,}\) which then must satisfy \(Q_L^H A_R^P = R_{TR} \) giving us a means by which to compute it. We conclude that
\begin{equation*}
A^P = A P( p )^T =
\left( \begin{array}{c | c }
A_L^P \amp A_R^P
\end{array} \right)
=
Q_L
\left( \begin{array}{c | c }
R_{TL} \amp R_{TR}
\end{array} \right).
\end{equation*}
Let us examine how this last theorem can help us solve the LLS
\begin{equation*}
\mbox{Find } \hat x \in \Cn
\mbox{ such that }
\| b - A \hat x \|_2 = \min_{x \in \Cn} \| b - A x \|_2
\end{equation*}
when \(\rank( A ) \leq n \text{:}\)
\begin{equation*}
\begin{array}{l}
\min_{x \in \Cn} \| b - A x \|_2 \\
~~~=~~~~ \lt P(p)^T P( p ) = I \gt \\
\min_{x \in \Cn} \| b - A P( p )^T P( p ) x \|_2 \\
~~~=~~~~ \lt A P( p )^T = Q_L \left( \begin{array}{c | c } R_{TL} \amp R_{TR}
\end{array} \right) \gt \\
\min_{x \in \Cn} \| b - Q_L
\begin{array}[t]{c}
\underbrace{
\left( \begin{array}{c | c } R_{TL} \amp R_{TR}
\end{array} \right)
P( p ) x
} \\
w
\end{array} \|_2 \\
~~~=~~~~ \lt \mbox{ substitute } w = \left( \begin{array}{c | c } R_{TL} \amp R_{TR}
\end{array} \right)
P( p ) x \gt \\
\min_{w \in \C^r} \| b - Q_L w \|_2 \\
\end{array}
\end{equation*}
which is minimized when \(w = Q_L^H b \text{.}\) Thus, we are looking for vector \(\hat x \) such that
\begin{equation*}
\left( \begin{array}{c | c } R_{TL} \amp R_{TR}
\end{array} \right)
P( p ) \hat x
= Q_L^H b .
\end{equation*}
Substituting
\begin{equation*}
z =
\left( \begin{array}{c}
z_T \\ \hline
z_B
\end{array}
\right)
\end{equation*}
for \(P(p) \hat x \) we find that
\begin{equation*}
\left( \begin{array}{c | c } R_{TL} \amp R_{TR}
\end{array} \right)
\left( \begin{array}{c}
z_T \\ \hline
z_B
\end{array}
\right)
= Q_L^H b .
\end{equation*}
Now, we can pick \(z_B \in \C^{n-r} \) to be an arbitrary vector, and determine a corresponding \(z_T \) by solving
\begin{equation*}
R_{TL} z_T = Q_L^H b - R_{TR} z_B.
\end{equation*}
A convenient choice is \(z_B = 0 \) so that \(z_T \) solves
\begin{equation*}
R_{TL} z_T = Q_L^H b.
\end{equation*}
Regardless of choice of \(z_B \text{,}\) the solution \(\hat x \) is given by
\begin{equation*}
\hat x = P( p )^T
\left( \begin{array}{c}
R_{TL}^{-1} ( Q_L^H b - R_{TR} z_B )\\ \hline
z_B
\end{array}
\right).
\end{equation*}
(a permutation of vector \(z \text{.}\)) This defines an infinite number of solutions if \(\rank( A ) \lt n \text{.}\)
The problem is that we don't know which columns are linearly independent in advance. In enrichments in Subsection 4.5.1 and Subsection 4.5.2, rank-revealing QR factorization algorithms are discussed that overcome this problem.