Skip to main content

Subsection 3.2.3 Classical Gram-Schmidt algorithm

Remark 3.2.3.1.

If the FLAME notation used in this unit is not intuitively obvious, you may want to review some of the materials in Weeks 3-5 of Linear Algebra: Foundations to Frontiers (http://www.ulaff.net).

An alternative for motivating that algorithm is as follows:

  • Consider \(A = Q R \text{.}\)

  • Partition \(A \text{,}\) \(Q \text{,}\) and \(R \) to yield

    \begin{equation*} \FlaOneByThreeR{ A_0 }{ a_1 }{ A_2 } = \FlaOneByThreeR{ Q_0 }{ q_1 }{ Q_2 } \FlaThreeByThreeBR { R_{00} }{ r_{01}}{ R_{02} } { 0 }{ \rho_{11}}{ r_{12}^T } { 0 }{ 0 }{ R_{22} }. \end{equation*}
  • Assume that \(Q_0 \) and \(R_{00} \) have already been computed.

  • Since corresponding columns of both sides must be equal, we find that

    \begin{equation} a_1 = Q_0 r_{01} + q_1 \rho_{11} .\label{eqn-GS1}\tag{3.2.1} \end{equation}

    Also, \(Q_0^H Q_0 = I \) and \(Q_0^H q_1 = 0 \text{,}\) since the columns of \(Q \) are mutually orthonormal.

  • Hence

    \begin{equation*} Q_0^H a_1 = Q_0^H Q_0 r_{01} + Q_0^H q_1 \rho_{11} = r_{01}. \end{equation*}
  • This shows how \(r_{01} \) can be computed from \(Q_0 \) and \(a_1 \text{,}\) which are already known:

    \begin{equation*} r_{01} := Q_0^H a_1. \end{equation*}
  • Next,

    \begin{equation*} a_1^\perp := a_1 - Q_0 r_{01} \end{equation*}

    is computed from (3.2.1). This is the component of \(a_1 \) that is perpendicular (orthogonal) to the columns of \(Q_0 \text{.}\) We know it is nonzero since the columns of \(A \) are linearly independent.

  • Since \(\rho_{11} q_1 = a_1^\perp \) and we know that \(q_1 \) has unit length, we now compute

    \begin{equation*} \rho_{11} := \| a_1^\perp \|_2 \end{equation*}

    and

    \begin{equation*} q_1 := a_1^\perp / \rho_{11}, \end{equation*}

These insights are summarized in the algorithm in Figure 3.2.3.2.

\begin{equation*} \newcommand{\routinename}{[ Q, R ] = \mbox{CGS-QR}( A )} \newcommand{\guard}{ n( A_L ) \lt n( A ) } \newcommand{\partitionings}{ A \rightarrow \FlaOneByTwo{A_L}{A_R} , Q \rightarrow \FlaOneByTwo{Q_L}{Q_R} , R \rightarrow \FlaTwoByTwo{R_{TL}}{R_{TR}} {0}{R_{BR}} } \newcommand{\partitionsizes}{ A_L {\rm ~and~} Q_L {\rm ~has~} 0 {\rm ~columns~and~} R_{TL} {\rm ~is~} 0 \times 0 } \newcommand{\repartitionings}{ \FlaOneByTwo{A_L}{A_R} \rightarrow \FlaOneByThreeR{A_0}{a_1}{A_2} , \FlaOneByTwo{Q_L}{Q_R} \rightarrow \FlaOneByThreeR{Q_0}{q_1}{Q_2} ,\\ \FlaTwoByTwo{R_{TL}}{R_{TR}} {0}{R_{BR}} \rightarrow \FlaThreeByThreeBR{R_{00}}{r_{01}}{R_{02}} {0}{\rho_{11}}{r_{12}^T} {0}{0}{R_{22}} } \newcommand{\repartitionsizes}{ a_1 {\rm ~and~} q_1 {\rm ~are~columns,~} \rho_{11} {\rm ~is~a~scalar} } \newcommand{\moveboundaries}{ \FlaOneByTwo{A_L}{A_R} \leftarrow \FlaOneByThreeL{A_0}{a_1}{A_2} , \FlaOneByTwo{Q_L}{Q_R} \leftarrow \FlaOneByThreeL{Q_0}{q_1}{Q_2} , \\ \FlaTwoByTwo{R_{TL}}{R_{TR}} {0}{R_{BR}} \leftarrow \FlaThreeByThreeTL{R_{00}}{r_{01}}{R_{02}} {0}{\rho_{11}}{r_{12}^T} {0}{0}{R_{22}} } \newcommand{\update}{ \begin{array}{l} r_{01} := Q_0^H a_1 \\ a_1^\perp := a_1 - Q_0 r_{01} \\ \rho_{11} := \| a_1^\perp \|_2 \\ q_1 := a_1^\perp / \rho_{11} \end{array} } \FlaAlgorithm \end{equation*}
Figure 3.2.3.2. (Classical) Gram-Schmidt (CGS) algorithm for computing the QR factorization of a matrix \(A \text{.}\)

Having presented the algorithm in FLAME notation, we can provide a formal proof of Theorem 3.2.2.1.

Informal proof: The process described earlier in this unit constructs the \(Q R \) decomposition. The computation of \(\rho_{j,j} \) is unique if it is restricted to be a real and positive number. This then prescribes all other results along the way.

Formal proof:

(By induction). Note that \(n \leq m \) since \(A \) has linearly independent columns.

  • Base case: \(n = 1 \text{.}\) In this case \(A = \left( \begin{array}{c| c} A_0 \amp a_1 \end{array} \right) \text{,}\) where \(A_0 \) has no columns. Since \(A \) has linearly independent columns, \(a_1 \neq 0 \text{.}\) Then

    \begin{equation*} A = \left( \begin{array}{c} a_1 \end{array} \right) = \left( q_1 \right) \left( \rho_{11} \right), \end{equation*}

    where \(\rho_{11} = \| a_1 \|_2 \) and \(q_1 = a_1 / \rho_{11} \text{,}\) so that \(Q = \left( q_1 \right) \) and \(R = \left( \rho_{11} \right) \text{.}\)

  • Inductive step: Assume that the result is true for all \(A_0 \) with \(k \) linearly independent columns. We will show it is true for \(A\) with \(k+1 \) linearly independent columns.

    Let \(A \in \C^{m \times (k+1)} \text{.}\) Partition \(A \rightarrow \left( \begin{array}{c | c} A_0 \amp a_1 \end{array} \right) \text{.}\)

    By the induction hypothesis, there exist \(Q_0 \) and \(R_{00} \) such that \(Q_0^H Q_0 = I \text{,}\) \(R_{00} \) is upper triangular with nonzero diagonal entries and \(A_0 = Q_0 R_{00} \text{.}\) Also, by induction hypothesis, if the elements on the diagonal of \(R_{00} \) are chosen to be positive, then the factorization \(A_0 = Q_0 R_{00} \) is unique.

    We are looking for

    \begin{equation*} \left( \begin{array}{c | c} \widetilde Q_0 \amp q_1 \end{array} \right) \mbox{ and } \left( \begin{array}{c | c} \widetilde R_{00} \amp r_{01} \\ \hline 0 \amp \rho_{11} \end{array} \right) \end{equation*}

    so that

    \begin{equation*} \left( \begin{array}{c | c} A_0 \amp a_1 \end{array} \right) = \left( \begin{array}{c | c} \widetilde Q_0 \amp q_1 \end{array} \right) \left( \begin{array}{c | c} \widetilde R_{00} \amp r_{01} \\ \hline 0 \amp \rho_{11} \end{array} \right) . \end{equation*}

    This means that

    • \(A_0 = \widetilde Q_0 \widetilde R_{00}, \)

      We choose \(\widetilde Q_0 = Q_0\) and \(\widetilde R_{00} = R_{00} \text{.}\) If we insist that the elements on the diagonal be positive, this choice is unique. Otherwise, it is a choice that allows us to prove existence.

    • \(a_1 = Q_0 r_{01} + \rho_{11} q_1 \) which is the unique choice if we insist on positive elements on the diagonal.

      \(a_1 = Q_0 r_{01} + \rho_{11} q_1 .\) Multiplying both sides by \(Q_0^H \) we find that \(r_{01} \) must equal \(Q_0^H a_1\) (and is uniquely determined by this if we insist on positive elements on the diagonal).

    • Letting \(a_1^\perp = a_1 - Q_0 r_{01} \) (which equals the component of \(a_1 \) orthogonal to \(\Col( Q_0 ) \)), we find that \(\rho_{11} q_1 = a_1^\perp .\) Since \(q_1 \) has unit length, we can choose \(\rho_{11} = \| a_1 ^\perp \|_2 \text{.}\) If we insist on positive elements on the diagonal, then this choice is unique.

    • Finally, we let \(q_1 = a_1^\perp / \rho_{11} \text{.}\)

  • By the Principle of Mathematical Induction the result holds for all matrices \(A \in \C^{m \times n}\) with \(m \geq n \text{.}\)