Subsection 9.3.4 The Rayleigh Quotient Iteration
¶A basic idea that allows one to accelerate the convergence of the inverse iteration is captured by the following exercise:
Homework 9.3.4.1.
Let \(A \in \mathbb C^{m \times m} \text{,}\) \(\rho \in \mathbb C \text{,}\) and \(( \lambda, x ) \) an eigenpair of \(A \) .
Which of the follow is an eigenpair of the shifted matrix \(A - \rho I \text{:}\)
\(( \lambda, x ) \text{.}\)
\(( \lambda, A^{-1} x ) \text{.}\)
\(( \lambda - \rho, x ) \text{.}\)
\(( 1/(\lambda - \rho), x ) \text{.}\)
\(( \lambda - \rho, x ) \text{.}\)
Now justify your answer.
Let \(A x = \lambda x \text{.}\) Then
We conclude that \(( \lambda- \rho, x ) \) is an eigenpair of \(A-\rho I \text{.}\)
The matrix \(A - \rho I \) is referred to as the matrix \(A \) that has been "shifted" by \(\rho \text{.}\) What the next lemma captures is that shifting \(A \) by \(\rho\) shifts the spectrum of \(A \) by \(\rho \text{:}\)
Lemma 9.3.4.1.
Let \(A \in \C^{m \times m} \text{,}\) \(A = X \Lambda X^{-1} \) and \(\rho \in \C \text{.}\) Then \(A - \rho I = X ( \Lambda - \rho I ) X^{-1} \text{.}\)
Homework 9.3.4.2.
Prove Lemma 9.3.4.1.
This suggests the following (naive) iteration: Pick a value \(\rho \) close to \(\lambda_{m-1} \text{.}\) Iterate
Of course one would solve \(( A - \rho I ) v^{(k+1)} = v^{(k)} \) rather than computing and applying the inverse of \(A \text{.}\)
If we index the eigenvalues so that
then
The closer to \(\lambda_{m-1} \) the shift \(\rho \) is chosen, the more favorable the ratio (constant) that dictates the linear convergence of this modified Inverse Power Method.
A more practical algorithm is given by
where instead of multiplying by the inverse one would want to solve the linear system \(( A - \rho I ) v^{(k+1)} = v^{(k)}\) instead.
The question now becomes how to chose \(\rho \) so that it is a good guess for \(\lambda_{m-1} \text{.}\) Often an application inherently supplies a reasonable approximation for the smallest eigenvalue or an eigenvalue of particular interest. Alternatively, we know that eventually \(v^{(k)} \) becomes a good approximation for \(x_{m-1} \) and therefore the Rayleigh quotient gives us a way to find a good approximation for \(\lambda_{m-1} \text{.}\) This suggests the (naive) Rayleigh-quotient iteration:
Here \(\lambda_{m-1} \) is the eigenvalue to which the method eventually converges.
with
which means superlinear convergence is observed. In fact, it can be shown that once \(k \) is large enough
thus achieving quadratic convergence. Roughly speaking this means that every iteration doubles the number of correct digits in the current approximation. To prove this, one shows that \(\vert \lambda_{m-1} - \rho_k \vert \leq K \| v^{(k)} - \psi_{m-1} x_{m-1} \|_{X^{-1}}\) for some constant \(K \text{.}\) Details go beyond this discussion.
Better yet, it can be shown that if \(A \) is Hermitian, then (once \(k \) is large enough)
for some constant \(C \) and hence the naive Rayleigh Quotient Iteration achieves cubic convergence for Hermitian matrices. Here our norm \(\| \cdot \|_{X^{-1}} \) becomes any p-norm since the Spectral Decomposition Theorem tells us that for Hermitian matrices \(X \) can be taken to equal a unitary matrix. Roughly speaking this means that every iteration triples the number of correct digits in the current approximation. This is mind-boggling fast convergence!
A practical Rayleigh Quotient Iteration is given by
Remark 9.3.4.2.
A concern with the (Shifted) Inverse Power Method and Rayleigh Quotient Iteration is that the matrix with which one solves is likely nearly singular. It turns out that this actually helps: the error that is amplified most is in the direction of the eigenvector associated with the smallest eigenvalue (after shifting, if appropriate).