Skip to main content

Subsection 3.3.4 Householder QR factorization algorithm

Let \(A \) be an \(m \times n \) with \(m \geq n \text{.}\) We will now show how to compute \(A \rightarrow QR \text{,}\) the QR factorization, as a sequence of Householder transformations applied to \(A\text{,}\) which eventually zeroes out all elements of that matrix below the diagonal. The process is illustrated in Figure 3.3.4.1.

\begin{equation*} \begin{array}{ c | c | c | c } \mbox{Original matrix} \amp \begin{array}{l} \left[ \FlaTwoByOneSingleLine { \rho_{11} } { u_{21} }, \tau_{1} \right] = \\ ~~~~ {\rm Housev} \FlaTwoByOneSingleLine { \color{blue} \alpha_{11} } { \color{blue} a_{21} } \end{array} \amp \begin{array}{l} \FlaTwoByTwoSingleLine {\alpha_{11} }{ a_{12}^{T} } {a_{21} }{ A_{22} } :=\\ \FlaTwoByTwoSingleLine {\color{red} {\rho_{11}}}{ \color{red} {a_{12}^{T} - w_{12}^{T}} } {0}{ \color{red} {A_{22} - u_{21} w_{12}^{T}} } \end{array} \amp {\rm ``Move~forward''} \\[-0.2in] \\ \hline \amp \amp \amp \\[-0.2in] \begin{array}{| c c c c} \hline \times \amp \times \amp \times \amp \times \\ \times \amp \times \amp \times \amp \times \\ \times \amp \times \amp \times \amp \times \\ \times \amp \times \amp \times \amp \times \\ \times \amp \times \amp \times \amp \times \end{array} \amp \begin{array}{| c c c c} \hline \color{blue} \times \amp \times \amp \times \amp \times \\ \color{blue} \times \amp \times \amp \times \amp \times \\ \color{blue} \times \amp \times \amp \times \amp \times \\ \color{blue} \times \amp \times \amp \times \amp \times \\ \color{blue} \times \amp \times \amp \times \amp \times \end{array} \amp \begin{array}{| c c c c} \hline \color{red} \times \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \end{array} \amp \begin{array}{c | c c c} \times \amp \times \amp \times \amp \times \\ \hline 0 \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \end{array} \\[-0.2in] \amp \amp \amp \\ \hline \amp \amp \amp \\[-0.2in] \amp \begin{array}{c | c c c} \times \amp \times \amp \times \amp \times \\ \hline 0 \amp \color{blue} \times \amp \times \amp \times \\ 0 \amp \color{blue} \times \amp \times \amp \times \\ 0 \amp \color{blue} \times \amp \times \amp \times \\ 0 \amp \color{blue} \times \amp \times \amp \times \end{array} \amp \begin{array}{c | c c c} \times \amp \times \amp \times \amp \times \\ \hline 0 \amp \color{red} \times \amp \color{red} \times \amp \color{red} \times \\ 0 \amp 0 \amp \color{red} \times \amp \color{red} \times \\ 0 \amp 0 \amp \color{red} \times \amp \color{red} \times \\ 0 \amp 0 \amp \color{red} \times \amp \color{red} \times \end{array} \amp \begin{array}{c c | c c} \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ \hline 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \end{array} \\[-0.2in] \amp \amp \amp \\ \hline \amp \amp \amp \\[-0.2in] \amp \begin{array}{c c | c c} \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ \hline 0 \amp 0 \amp \color{blue} \times \amp \times \\ 0 \amp 0 \amp \color{blue} \times \amp \times \\ 0 \amp 0 \amp \color{blue} \times \amp \times \end{array} \amp \begin{array}{c c | c c} \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ \hline 0 \amp 0 \amp \color{red} \times \amp \color{red} \times \\ 0 \amp 0 \amp 0 \amp \color{red} \times \\ 0 \amp 0 \amp 0 \amp \color{red} \times \end{array} \amp \begin{array}{c c c | c} \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \\ \hline 0 \amp 0 \amp 0 \amp \times \\ 0 \amp 0 \amp 0 \amp \times \end{array} \\[-0.2in] \amp \amp \amp \\ \hline \amp \amp \amp \\[-0.2in] \amp \begin{array}{c c c | c } \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \\ \hline 0 \amp 0 \amp 0 \amp \color{blue} \times \\ 0 \amp 0 \amp 0 \amp \color{blue} \times \end{array} \amp \begin{array}{c c c | c } \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \\ \hline 0 \amp 0 \amp 0 \amp \color{red} \times \\ 0 \amp 0 \amp 0 \amp 0 \end{array} \amp \begin{array}{c c c c | } \times \amp \times \amp \times \amp \times \\ 0 \amp \times \amp \times \amp \times \\ 0 \amp 0 \amp \times \amp \times \\ 0 \amp 0 \amp 0 \amp \times \\ \hline 0 \amp 0 \amp 0 \amp 0 \end{array} \end{array} \end{equation*}
Figure 3.3.4.1. Illustration of Householder QR factorization.

In the first iteration, we partition

\begin{equation*} A \rightarrow \FlaTwoByTwoSingleLine { \alpha_{11} }{ a_{12}^{T} } { a_{21} }{ A_{22} }. \end{equation*}

Let

\begin{equation*} \left[ \FlaTwoByOneSingleLine { \rho_{11} } { u_{21} }, \tau_{1} \right] = {\rm Housev} \left( \FlaTwoByOneSingleLine { \alpha_{11} } { a_{21} } \right) \end{equation*}

be the Householder transform computed from the first column of \(A \text{.}\) Then applying this Householder transform to \(A \) yields

\begin{equation*} \begin{array}{rcl} \FlaTwoByTwoSingleLine { \alpha_{11} }{ a_{12}^{T} } { a_{21} }{ A_{22} } \amp:=\amp \left( I - \frac{1}{\tau_1} \FlaTwoByOneSingleLine { 1 } { u_{21} } \FlaTwoByOneSingleLine { 1 } { u_{21} }^H \right) \FlaTwoByTwoSingleLine { \alpha_{11} }{ a_{12}^{T} } { a_{21} }{ A_{22} } \\ \amp=\amp \FlaTwoByTwoSingleLine { \rho_{11} }{ a_{12}^{T} - w_{12}^{T} } { 0 }{ A_{22} - u_{21} w_{12}^{T} }, \end{array} \end{equation*}

where \(w_{12}^{T} = ( a_{12}^{T} + u_{21}^{H} A_{22} ) / \tau_1 \text{.}\) Computation of a full QR factorization of \(A \) will now proceed with the updated matrix \(A_{22} \text{.}\)

Homework 3.3.4.1.

Show that

\begin{equation*} \FlaTwoByTwo { I } { 0 } { 0 } { I - \frac{1}{\tau_1} \FlaTwoByOneSingleLine { 1 } { u_{21} } \FlaTwoByOneSingleLine { 1 } { u_{21} }^H } = \left( I - \frac{1}{\tau_1} \FlaThreeByOneB { 0 } { 1 } { u_{21} } \FlaThreeByOneB { 0 } { 1 } { u_{21} }^H \right). \end{equation*}
Solution
\begin{equation*} \begin{array}{rcl} \FlaTwoByTwo { I } { 0 } { 0 } { I - \frac{1}{\tau_1} \FlaTwoByOneSingleLine { 1 } { u_{21} } \FlaTwoByOneSingleLine { 1 } { u_{21} }^H } \amp=\amp I - \FlaTwoByTwo {0 } { 0 } { 0 } { \frac{1}{\tau_1} \FlaTwoByOneSingleLine { 1 } { u_{21} } \FlaTwoByOneSingleLine { 1 } { u_{21} }^H } \\ \amp=\amp I - \frac{1}{\tau_1} \FlaTwoByTwo {0 } { 0 } { 0 } { \FlaTwoByOneSingleLine { 1 } { u_{21} } \FlaTwoByOneSingleLine { 1 } { u_{21} }^H } \\ \amp=\amp I - \frac{1}{\tau_1} \FlaThreeByThreeBR {0 } { 0 }{0} { 0 }{1}{u_{21}^H} { 0}{u_{21}}{ u_{21} u_{21}^H} \\ \amp=\amp \left( I - \frac{1}{\tau_1} \FlaThreeByOneB { 0 } { 1 } { u_{21} } \FlaThreeByOneB { 0 } { 1 } { u_{21} }^H \right). \end{array} \end{equation*}

More generally, let us assume that after \(k \) iterations of the algorithm matrix \(A \) contains

\begin{equation*} A \rightarrow \FlaTwoByTwo { R_{TL} }{ R_{TR} } { 0 }{ A_{BR} } = \FlaThreeByThreeBR { R_{00} }{ r_{01} } { R_{02} } { 0 }{ \alpha_{11}}{ a_{12}^{T} } { 0 }{ a_{21} }{ A_{22} }, \end{equation*}

where \(R_{TL} \) and \(R_{00} \) are \(k \times k \) upper triangular matrices. Let

\begin{equation*} \left[ \FlaTwoByOneSingleLine { \rho_{11}} { u_{21} }, \tau_{1} \right] = {\rm Housev} \left( \FlaTwoByOneSingleLine { \alpha_{11} } { a_{21} }\right). \end{equation*}

and update

\begin{equation*} \begin{array}{rcl} A \amp:=\amp \FlaTwoByTwo { I } { 0 } { 0 } { \left( I - \frac{1}{\tau_1} \FlaTwoByOneSingleLine { 1 } { u_{21} } \FlaTwoByOneSingleLine { 1 } { u_{21} }^H \right) } \FlaThreeByThreeBR { R_{00} }{ r_{01} } { R_{02} } { 0 }{ \alpha_{11}}{ a_{12}^{T} } { 0 }{ a_{21} }{ A_{22} } \\ \amp=\amp \left( I - \frac{1}{\tau_1} \FlaThreeByOneB { 0 } { 1 } { u_{21} } \FlaThreeByOneB { 0 } { 1 } { u_{21} }^H \right) \FlaThreeByThreeBR { R_{00} }{ r_{01} } { R_{02} } { 0 }{ \alpha_{11}}{ a_{12}^{T} } { 0 }{ a_{21} }{ A_{22} } \\ \amp=\amp \FlaThreeByThreeBR { R_{00} }{ r_{01} } { R_{02} } { 0 }{ \rho_{11}}{ a_{12}^{T} - w_{12}^{T} } { 0 }{ 0 }{ A_{22} - u_{21} w_{12}^{T} }, \end{array} \end{equation*}

where, again, \(w_{12}^{T} = ( a_{12}^{T} + u_{21}^{H} A_{22} ) / \tau_1 \text{.}\)

Let

\begin{equation*} H_k = \left( I - \frac{1}{\tau_1} \FlaThreeByOneB { 0_k } { 1 } { u_{21} } \FlaThreeByOneB { 0_k } { 1 } { u_{21} }^H \right) \end{equation*}

be the Householder transform so computed during the \((k+1) \)st iteration. Then upon completion matrix \(A \) contains

\begin{equation*} R = \FlaTwoByOne { R_{TL} } { 0 } = H_{n-1} \cdots H_1 H_0 \widehat{A} \end{equation*}

where \(\widehat{A} \) denotes the original contents of \(A \) and \(R_{TL} \) is an upper triangular matrix. Rearranging this we find that

\begin{equation*} \widehat{A} = H_0 H_1 \cdots H_{n-1} R \end{equation*}

which shows that if \(Q = H_0 H_1 \cdots H_{n-1} \) then \(\widehat{A} = Q R \text{.}\)

Typically, the algorithm overwrites the original matrix \(A \) with the upper triangular matrix, and at each step \(u_{21} \) is stored over the elements that become zero, thus overwriting \(a_{21} \text{.}\) (It is for this reason that the first element of \(u \) was normalized to equal "1".) In this case \(Q \) is usually not explicitly formed as it can be stored as the separate Householder vectors below the diagonal of the overwritten matrix. The algorithm that overwrites \(A \) in this manner is given in Figure 3.3.4.2.

\begin{equation*} \newcommand{\routinename}{[ A, t ] = {\rm HQR\_unb\_var1}( A )} \newcommand{\guard}{n( A_{BR} ) > 0} \newcommand{\partitionings}{ A \rightarrow \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } {\rm ~and~} t \rightarrow \FlaTwoByOne{ t_{T} } { t_{B} } } \newcommand{\partitionsizes}{ A_{TL} {\rm ~is~} 0 \times 0 {\rm ~and~} t_T {\rm ~has~} 0 {\rm ~elements} } \newcommand{\repartitionings}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \rightarrow \FlaThreeByThreeBR{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} } {\rm ~and~} \FlaTwoByOne{ t_{T} } { t_{B} } \rightarrow \FlaThreeByOneB{ t_{0} } { \tau_{1} } { t_{2} } } \newcommand{\moveboundaries}{ \FlaTwoByTwo{ A_{TL} } {A_{TR}} { A_{BL} } { A_{BR} } \leftarrow \FlaThreeByThreeTL{ A_{00} }{ a_{01} }{ A_{02} } { a_{10}^T }{ \alpha_{11} }{ a_{12}^T } { A_{20} }{ a_{21} }{ A_{22} } {\rm ~and~} \FlaTwoByOne{ t_{T} } { t_{B} } \leftarrow \FlaThreeByOneT{ t_{0} } { \tau_{1} } { t_{2} } } \newcommand{\update}{ \begin{array}{l} \left[ \FlaTwoByOneSingleLine { \alpha_{11} } { a_{21}}, \tau_1 \right] := \left[ \FlaTwoByOneSingleLine { \rho_{11} } { u_{21}}, \tau_1 \right] = {\rm Housev} \left( \begin{array}{c} \alpha_{11} \\ a_{21} \end{array} \right) \\ {\rm Update~} \left( \begin{array}{c} a_{12}^T \\ A_{22} \end{array} \right) := \left( I - \frac{1}{\tau_1} \left( \begin{array}{c} 1 \\ u_{21} \end{array} \right) \left( \begin{array}{c c } 1 \amp u_{21}^H \end{array} \right) \right) \left( \begin{array}{c} a_{12}^T \\ A_{22} \end{array} \right) \\ ~~~~~{\rm ~via~the~steps~} \\ ~~~~~~~~~~~~~~~ w_{12}^T := ( a_{12}^T + a_{21}^H A_{22} )/\tau_1 \\ ~~~~~~~~~~~~~~~\FlaTwoByOneSingleLine { a_{12}^T } { A_{22} } := \FlaTwoByOneSingleLine { a_{12}^T - { w_{12}^T } } { A_{22} - { a_{21} } { w_{12}^T } } \end{array} } \FlaAlgorithm \end{equation*}
Figure 3.3.4.2. Unblocked Householder transformation based QR factorization.

In that figure,

\begin{equation*} \left[ A, t \right] = {\rm HQR\_unb\_var1}( A ) \end{equation*}

denotes the operation that computes the QR factorization of \(m \times n \) matrix \(A \text{,}\) with \(m \geq n \text{,}\) via Householder transformations. It returns the Householder vectors and matrix \(R \) in the first argument and the vector of scalars "\(\tau_{i}\)" that are computed as part of the Householder transformations in \(t \text{.}\)

Homework 3.3.4.2.

Given \(A \in \R^{m \times n} \) show that the cost of the algorithm in Figure 3.3.4.2 is given by

\begin{equation*} C_{\rm HQR}( m, n ) \approx 2 m n^2 - \frac{2}{3} n^3 {\rm ~flops}. \end{equation*}
Solution

The bulk of the computation is in

\begin{equation*} w_{12}^{T} = ( a_{12}^{T} + u_{21}^{H} A_{22} ) / \tau_1 \end{equation*}

and

\begin{equation*} A_{22} - u_{21} w_{12}^{T} \text{.} \end{equation*}

During the \(k \)th iteration (when \(R_{TL} \) is \(k \times k \)), this means a matrix-vector multiplication (\(u_{21}^H A_{22} \)) and rank-1 update with matrix \(A_{22} \) which is of size approximately \((m-k) \times (n-k) \) for a cost of \(4 (m-k)(n-k) \) flops. Thus the total cost is approximately

\begin{equation*} \begin{array}{l} \sum_{k=0}^{n-1} 4 (m-k)(n-k) \\ ~~~ = ~~~~\\ 4 \sum_{j=0}^{n-1} (m - n + j )j \\ ~~~ = ~~~~ \\ 4 ( m-n) \sum_{j=0}^{n-1} j + 4 \sum_{j=0}^{n-1} j^2\\ ~~~ = ~~~~ \\ 2 ( m-n) n ( n-1 ) + 4 \sum_{j=0}^{n-1} j^2 \\ ~~~ \approx ~~~~ \\ 2 ( m-n) n^2 + 4 \int_0^n x^2 dx \\ ~~~ = ~~~~ \\ 2 m n^2 - 2 n^3 + \frac{4}{3} n^3 \\ ~~~ = ~~~~ \\ 2 m n^2 - \frac{2}{3} n^3. \end{array} \end{equation*}
Homework 3.3.4.3.

Implement the algorithm given in Figure 3.3.4.2 as

function [ A_out, t ] = HQR( A  )

by completing the code in Assignments/Week03/matlab/HQR.m. Input is an \(m \times n \) matrix A. Output is the matrix A_out with the Householder vectors below its diagonal and \(R \) in its upper triangular part. You may want to use Assignments/Week03/matlab/test_HQR.m to check your implementation.

Solution

See Assignments/Week03/answers/HQR.m. Warning: it only checks if \(R \) is computed correctly.