ALAFF Ran -Revealing QR (RRQR) via MGS

Subsection 4.5.1 Ran -Revealing QR (RRQR) via MGS

The discussion in Subsection 4.4.4 falls short of being a practical algorithm for at least two reasons:

One needs to be able to determine in advance what columns of \(A \) are linearly independent; and
Due to roundoff error or error in the data from which the matrix was created, a column may be linearly independent of other columns when for practical purposes it should be considered dependent.

We now discuss how the MGS algorithm can be modified so that appropriate linearly independent columns can be determined "on the fly" as well as the defacto rank of the matrix. The result is known as the Rank Revealing QR factorization (RRQR). It is also known as QR factorization with column pivoting. We are going to give a modification of the MGS algorithm for computing the RRQR.

For our discussion, we introduce an elementary pivot matrix, \(\widetilde P( j ) \in \Cnxn \text{,}\) that swaps the first element of the vector to which it is applied with the element indexed with \(j \text{:}\)

\begin{equation*} \widetilde P( j ) x = \left( \begin{array}{c} e_j^T \\ e_1^T \\ \vdots \\ e_{j-1}^T \\ e_0^T \\ e_{j+1}^T \\ \vdots \\ e_{n-1}^T \end{array} \right) x = \left( \begin{array}{c} e_j^T x \\ e_1^T x \\ \vdots \\ e_{j-1}^T x \\ e_0^T x \\ e_{j+1}^T x \\ \vdots \\ e_{n-1}^T x \end{array} \right) = \left( \begin{array}{c} \chi_j \\ \chi_1 \\ \vdots \\ \chi_{j-1} \\ \chi_0 \\ \chi_{j+1} \\ \vdots \\ \chi_{n-1} \end{array} \right) . \end{equation*}

Another way of stating this is that

\begin{equation*} \widetilde P( j ) = \left( \begin{array}{c c c c } 0 \amp 0 \amp 1 \amp 0 \\ 0 \amp I_{(j-1)\times(j-1)} \amp 0 \amp 0 \\ 1 \amp 0 \amp 0 \amp 0 \\ 0 \amp 0 \amp 0 \amp I_{(n-j-1)\times(n-j-1)} \end{array} \right), \end{equation*}

where \(I_{k \times k} \) equals the \(k \times k \) identity matrix. When applying \(\widetilde P(j) \) from the right to a matrix, it swaps the first column and the column indexed with \(j \text{.}\) Notice that \(\widetilde P ( j )^T = \widetilde P( j ) \) and \(\widetilde P( j ) = \widetilde P( j )^{-1} \text{.}\)

Remark 4.5.1.1.

For a more detailed discussion of permutation matrices, you may want to consult Week 7 of "Linear Algebra: Foundations to Frontiers" (LAFF) [26]. We also revisit this in Section 5.3 when discussing LU factorization with partial pivoting.

Here is an outline of the algorithm:

Determine the index \(\pi_1 \) such that the column of \(A \) indexed with \(\pi_1 \) has the largest 2-norm (is the longest).
Permute \(A := A \widetilde P( \pi_1 ) \text{,}\) swapping the first column with the column that is longest.
Partition

\begin{equation*} A \rightarrow \FlaOneByTwoSingleLine{ a_1 }{ A_2 } , Q \rightarrow \FlaOneByTwoSingleLine{ q_1 }{ Q_2 }, R \rightarrow \FlaTwoByTwoSingleLine { \rho_{11} }{ r_{12}^T } { 0 }{ R_{22} }, p \rightarrow \FlaTwoByOneSingleLine{ \pi_1 }{ p_2 } \end{equation*}
Compute \(\rho_{11} := \| a_1 \|_2 \text{.}\)
\(q_1 := a_1 /\rho_{11} \text{.}\)
Compute \(r_{12}^T := q_1^T A_2 \text{.}\)
Update \(A_2 := A_2 - q_1 r_{12}^T \text{.}\)

This substracts the component of each column that is in the direction of \(q_1 \text{.}\)
Continue the process with the updated matrix \(A_2 \text{.}\)

The complete algorithm, which overwrites \(A \) with \(Q \text{,}\) is given in Figure 4.5.1.2. Observe that the elements on the diagonal of \(R \) will be positive and in non-increasing order because updating \(A_2 := A_2 - q_1 r_{12}^T \) inherently does not increase the length of the columns of \(A_2 \text{.}\) After all, the component in the direction of \(q_1 \) is being subtracted from each column of \(A_2 \text{,}\) leaving the component orthogonal to \(q_1 \text{.}\)

\begin{equation*} \newcommand{\routinename}{ \left[ A, R, p \right] := {\rm RRQR\_MGS\_simple}( A, R, p ) } \newcommand{\guard}{ n( A_L ) \lt n( A ) } \newcommand{\partitionings}{ A \rightarrow \FlaOneByTwo{A_L}{A_R} , R \rightarrow \FlaTwoByTwo{R_{TL}}{R_{TR}} {R_{BL}}{R_{BR}} , p \rightarrow \FlaTwoByOne{p_{T}} {p_{B}} } \newcommand{\partitionsizes}{ A_L {\rm ~has~} 0 {\rm ~columns}, R_{TL} {\rm ~is~} 0 \times 0 , p_{T} {\rm ~has~} 0 {\rm ~rows} } \newcommand{\repartitionings}{ \FlaOneByTwo{A_L}{A_R} \rightarrow \FlaOneByThreeR{A_0}{a_1}{A_2} , \\ \FlaTwoByTwo{R_{TL}}{R_{TR}} {R_{BL}}{R_{BR}} \rightarrow \FlaThreeByThreeBR{R_{00}}{r_{01}}{R_{02}} {r_{10}^T}{\rho_{11}}{r_{12}^T} {R_{20}}{r_{21}}{R_{22}} , \FlaTwoByOne{ p_T } { p_B } \rightarrow \FlaThreeByOneB{p_0} {\pi_1} {p_2} } \newcommand{\moveboundaries}{ \FlaOneByTwo{A_L}{A_R} \leftarrow \FlaOneByThreeL{A_0}{a_1}{A_2} , \\ \FlaTwoByTwo{R_{TL}}{R_{TR}} {R_{BL}}{R_{BR}} \leftarrow \FlaThreeByThreeTL{R_{00}}{r_{01}}{R_{02}} {r_{10}^T}{\rho_{11}}{r_{12}^T} {R_{20}}{r_{21}}{R_{22}} , \FlaTwoByOne{ p_T } { p_B } \leftarrow \FlaThreeByOneT{p_0} {\pi_1} {p_2} } \newcommand{\update}{ \begin{array}{l} \pi_1 = \mbox{DetermineColumnIndex}( \FlaOneByTwoSingleLine{a_1}{A_2} ) \\ \FlaOneByTwoSingleLine{a_1}{A_2} := \FlaOneByTwoSingleLine{a_1}{A_2} \widetilde P( \pi_1 ) \\ \rho_{11} := \| a_1 \|_2 \\ a_1 := a_1 / \rho_{11} \\ r_{12}^T := a_1^T A_2 \\ A_2 := A_2 - a_1 r_{12}^T \end{array} } \FlaAlgorithm \end{equation*}

Figure 4.5.1.2. Simple implementation of RRQR via MGS. Incorporating a stopping critera that checks whether \(\rho_{11}\) is small would allow the algorithm to determine the effective rank of the input matrix.

The problem with the algorithm in Figure 4.5.1.2 is that determining the index \(\pi_1 \) requires the 2-norm of all columns in \(A_R \) to be computed, which costs \(O( m (n-j) ) \) flops when \(A_L\) has \(j \) columns (and hence \(A_R \) has \(n-j\) columns). The following insight reduces this cost: Let \(A = \left( \begin{array}{c | c | c | c} a_0 \amp a_1 \amp \cdots \amp a_{n-1} \end{array} \right) \text{,}\) \(v = \left( \begin{array}{c} \nu_0 \\ \nu_1 \\ \vdots \\ \nu_{n-1} \end{array} \right) = \left( \begin{array}{c} \| a_0 \|_2^2 \\ \| a_1 \|_2^2 \\ \vdots \\ \| a_{n-1} \|_2^2 \end{array} \right) \text{,}\) \(q^T q = 1 \) (here \(q \) is of the same size as the columns of \(A \)), and \(r = A^T q = \left( \begin{array}{c} \rho_0 \\ \rho_1 \\ \vdots \\ \rho_{n-1} \end{array} \right) \text{.}\) Compute \(B := A - q r^T \) with \(B = \left( \begin{array}{c | c | c | c} b_0 \amp b_1 \amp \cdots \amp b_{n-1} \end{array} \right) \text{.}\) Then

\begin{equation*} \left( \begin{array}{c} \| b_0 \|_2^2 \\ \| b_1 \|_2^2 \\ \vdots \\ \| b_{n-1} \|_2^2 \end{array} \right) = \left( \begin{array}{c} \nu_0 - \rho_0^2 \\ \nu_1 - \rho_1^2 \\ \vdots \\ \nu_{n-1} - \rho_{n-1}^2 \end{array} \right). \end{equation*}

To verify this, notice that

\begin{equation*} a_i = ( a_i - a_i^T q q ) + a_i^T q q \end{equation*}

and

\begin{equation*} (a_i - a_i^T q q )^T q = a_i^T q - a_i^T q q^T q = a_i^T q - a_i^T q = 0 . \end{equation*}

This means that

\begin{equation*} \| a_i \|_2^2 = \| ( a_i - a_i^T q q ) + a_i^T q q \|_2^2 = \| a_i - a_i^T q q \|_2^2 + \| a_i^T q q \|_2^2 = \| a_i - \rho_i q \|_2^2 + \| \rho_i q \|_2^2 = \| b_i \|_2^2 + \rho_i^2 \end{equation*}

so that

\begin{equation*} \| b_i \|_2^2 = \| a_i \|_2^2 - \rho_i^2 = \nu_i - \rho_i^2. \end{equation*}

Building on this insight, we make an important observation that greatly reduces the cost of determining the column that is longest. Let us start by computing \(v \) as the vector such that the \(i \)th entry in \(v \) equals the square of the length of the \(i \)th column of \(A \text{.}\) In other words, the \(i \)th entry of \(v \) equals the dot product of the \(i \) column of \(A \) with itself. In the above outline for the MGS with column pivoting, we can then also partition

\begin{equation*} v \rightarrow \FlaTwoByOneSingleLine {\nu_1}{v_2}. \end{equation*}

The question becomes how \(v_2 \) before the update \(A_2 := A_2 - q_1 r_{12}^T \) compares to \(v_2 \) after that update. The answer is that the \(i \)th entry of \(v_2 \) must be updated by subtracting off the square of the \(i \)th entry of \(r_{12}^T \text{.}\)

Let us introduce the functions v = ComputeWeights( A ) and v = UpdateWeights( v, r ) to compute the described weight vector \(v \) and to update a weight vector \(v \) by subtracting from its elements the squares of the corresponding entries of \(r \text{.}\) Also, the function DeterminePivot returns the index of the largest in the vector, and swaps that entry with the first entry. An optimized RRQR via MGS algorithm, RRQR-MGS, is now given in Figure 4.5.1.3. In that algorithm, \(A \) is overwritten with \(Q \text{.}\)

\begin{equation*} \newcommand{\initialize}{ v := {\rm ComputeWeights}( A )} \newcommand{\routinename}{ \left[ A, R, p \right] := {\rm RRQR\_MSG}( A, R, p ) } \newcommand{\guard}{ n( A_L ) \lt n( A ) } \newcommand{\partitionings}{ A \rightarrow \FlaOneByTwo{A_L}{A_R} , R \rightarrow \FlaTwoByTwo{R_{TL}}{R_{TR}} {R_{BL}}{R_{BR}} , p \rightarrow \FlaTwoByOne{p_{T}} {p_{B}} , v \rightarrow \FlaTwoByOne{v_{T}} {v_{B}} } \newcommand{\partitionsizes}{ A_L {\rm ~has~} 0 {\rm ~columns}, R_{TL} {\rm ~is~} 0 \times 0 , p_{T} {\rm ~has~} 0 {\rm ~rows}, v_{T} {\rm ~has~} 0 {\rm ~rows} } \newcommand{\repartitionings}{ \FlaOneByTwo{A_L}{A_R} \rightarrow \FlaOneByThreeR{A_0}{a_1}{A_2} , \\ \FlaTwoByTwo{R_{TL}}{R_{TR}} {R_{BL}}{R_{BR}} \rightarrow \FlaThreeByThreeBR{R_{00}}{r_{01}}{R_{02}} {r_{10}^T}{\rho_{11}}{r_{12}^T} {R_{20}}{r_{21}}{R_{22}} , \\ \FlaTwoByOne{ p_T } { p_B } \rightarrow \FlaThreeByOneB{p_0} {\pi_1} {p_2} , \FlaTwoByOne{ v_T } { v_B } \rightarrow \FlaThreeByOneB{v_0} {\nu_1} {v_2} } \newcommand{\moveboundaries}{ \FlaOneByTwo{A_L}{A_R} \leftarrow \FlaOneByThreeL{A_0}{a_1}{A_2} , \\ \FlaTwoByTwo{R_{TL}}{R_{TR}} {R_{BL}}{R_{BR}} \leftarrow \FlaThreeByThreeTL{R_{00}}{r_{01}}{R_{02}} {r_{10}^T}{\rho_{11}}{r_{12}^T} {R_{20}}{r_{21}}{R_{22}} , \\ \FlaTwoByOne{ p_T } { p_B } \leftarrow \FlaThreeByOneT{p_0} {\pi_1} {p_2} , \FlaTwoByOne{ v_T } { v_B } \leftarrow \FlaThreeByOneT{v_0} {\nu_1} {v_2} } \newcommand{\update}{ \begin{array}{l} [ \left( \begin{array}{c} \nu_1 \\ \hline v_2 \end{array} \right), \pi_1 ] = {\rm DeterminePivot}( \left( \begin{array}{c} \nu_1 \\ \hline v_2 \end{array} \right) ) \\ \FlaOneByThreeR{A_0}{a_1}{A_2} := \FlaOneByThreeR{A_0}{a_1}{A_2} \widetilde P( \pi_1 )^T \\ \rho_{11} := \| a_1 \|_2 \\ a_1 := a_1 / \rho_{11} \\ r_{12}^T := q_1^T A_2 \\ A_2 := A_2 - q_1 r_{12}^T \\ v_2 := {\rm UpdateWeights}( v_2, r_{12} ) \end{array} } \FlaAlgorithmWithInit \end{equation*}

Figure 4.5.1.3. RRQR via MGS, with optimization. Incorporating a stopping critera that checks whether \(\rho_{11}\) is small would allow the algorithm to determine the effective rank of the input matrix.

Let us revisit the fact that the diagonal elements of \(R \) are positive and in nonincreasing order. This upper triangular matrix is singular if a diagonal element equals zero (and hence all subsequent diagonal elements equal zero). Hence, if \(\rho_{11} \) becomes small relative to prior diagonal elements, the remaining columns of the (updated) \(A_R \) are essentially zero vectors, and the original matrix can be approximated with

\begin{equation*} A \approx Q_L \left( \begin{array}{c c} R_{TL} \amp R_{TR} \end{array} \right) = \begin{array}[t]{|c|} \hline ~~~ \\ ~~ \\ ~~ \\ \hline \end{array}. \begin{array}[t]{|c|} \hline ~~~~~~~~~~~~ \\ \hline \end{array}. \end{equation*}

If \(Q_L \) has \(k \) columns, then this becomes a rank-k approximation.

Remark 4.5.1.4.

Notice that in updating the weight vector \(v \text{,}\) the accuracy of the entries may progressively deteriorate due to catastrophic cancellation. Since these values are only used to determine the order of the columns and, importantly, when they become very small the rank of the matrix has revealed itself, this is in practice not a problem.