Subsection 9.2.4 The Schur and Spectral Decompositions
Practical methods for computing eigenvalues and eigenvectors transform a given matrix into a simpler matrix (for example, diagonal or tridiagonal) via a sequence of eigenvalue-preserving transformations known as similarity transformations.
Definition 9.2.4.1.
Given a nonsingular matrix \(Y \text{,}\) the transformation \(Y^{-1} A Y \) is called a similarity transformation (applied to matrix \(A \)).
Definition 9.2.4.2.
Matrices \(A \) and \(B \) are said to be similar if there exists a nonsingular matrix \(Y \) such that \(B = Y^{-1} A Y \text{.}\)
Homework 9.2.4.1.
Let \(A, B, Y \in \mathbb C^{m \times m} \text{,}\) where \(Y \) is nonsingular, and \(( \lambda, x ) \) an eigenpair of \(A \text{.}\)
Which of the following is an eigenpair of \(B = Y^{-1} A Y \text{:}\)
\(( \lambda, x ) \text{.}\)
\(( \lambda, Y^{-1} x ) \text{.}\)
\(( \lambda, Y x ) \text{.}\)
\(( 1/\lambda, Y^{-1} x ) \text{.}\)
The key observation is that similarity transformations preserve the eigenvalues of a matrix, as summarized in the following theorem.
Theorem 9.2.4.3.
Let \(A,Y,B \in \C^{m \times m} \text{,}\) assume \(Y \) is nonsingular, and let \(B = Y^{-1} A Y \text{.}\) Then \(\Lambda( A ) = \Lambda( B ) \text{.}\)
Proof.
Let \(\lambda \in \Lambda( A ) \) and \(x \) be an associated eigenvector. Then \(A x = \lambda x \) if and only if \(Y^{-1} A Y Y^{-1} x = Y^{-1} \lambda x \) if and only if \(B ( Y^{-1} x ) = \lambda ( Y^{-1} x ) \text{.}\) Since \(Y \) is nonsingular, \(Y^{-1} x \neq 0 \) whenever \(x \neq 0 \text{,}\) and hence \(\lambda \in \Lambda( A ) \) if and only if \(\lambda \in \Lambda( B ) \text{.}\) We conclude that \(\Lambda( A ) = \Lambda( B ) \text{.}\)
It is not hard to expand the last proof to show that if \(A\) is similar to \(B \) and \(\lambda \in \Lambda( A )\) has algebraic multiplicity \(k\) then \(\lambda \in \Lambda( B ) \) has algebraic multiplicity \(k\text{.}\)
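The following is a small numerical check of this fact, not part of the text's development; it assumes NumPy is available and uses a random complex matrix \(A \) and a random (almost surely nonsingular) matrix \(Y \text{.}\)

# Check that B = Y^{-1} A Y has the same eigenvalues as A, and that
# (lambda, Y^{-1} x) is an eigenpair of B whenever (lambda, x) is one of A.
import numpy as np

rng = np.random.default_rng(0)
m = 5
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Y = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))  # almost surely nonsingular

B = np.linalg.solve(Y, A @ Y)            # B = Y^{-1} A Y without forming Y^{-1}

lam_A = np.sort_complex(np.linalg.eigvals(A))
lam_B = np.sort_complex(np.linalg.eigvals(B))
print(np.allclose(lam_A, lam_B))         # True: Lambda(A) = Lambda(B)

lam, X = np.linalg.eig(A)                # columns of X are eigenvectors of A
z = np.linalg.solve(Y, X[:, 0])          # z = Y^{-1} x
print(np.allclose(B @ z, lam[0] * z))    # True: (lambda, Y^{-1} x) is an eigenpair of B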
In Subsection 2.2.7, we argued that the application of unitary matrices is desirable, since they preserve length and hence don't amplify error. For this reason, unitary similarity transformations are our weapon of choice when designing algorithms for computing eigenvalues and eigenvectors.
Definition 9.2.4.4.
Given unitary matrix \(Q \) the transformation \(Q^H A Q \) is called a unitary similarity transformation (applied to matrix \(A \)).
The following is a fundamental theorem for the algebraic eigenvalue problem that is key to practical algorithms for finding eigenvalues and eigenvectors.
Theorem 9.2.4.5. Schur Decomposition Theorem.
Let \(A \in \C^{m \times m} \text{.}\) Then there exist a unitary matrix \(Q \) and upper triangular matrix \(U\) such that \(A = Q U Q^H \text{.}\) This decomposition is called the Schur decomposition of matrix \(A\text{.}\)
Proof.
We will outline how to construct \(Q \) so that \(Q^H A Q = U \text{,}\) an upper triangular matrix.
Since the characteristic polynomial of \(A \) has degree \(m \) and hence has at least one root, matrix \(A \) has at least one eigenvalue, \(\lambda_1\text{,}\) and corresponding eigenvector \(q_1 \text{,}\) which we normalize to have length one. Thus \(A q_1 = \lambda_1 q_1 \text{.}\) Choose \(Q_2 \) so that \(Q = \left( \begin{array}{ c|c} q_1 \amp Q_2 \end{array} \right) \) is unitary. Then
\begin{equation*}
Q^H A Q =
\left( \begin{array}{ c|c} q_1 \amp Q_2 \end{array} \right)^H A \left( \begin{array}{ c|c} q_1 \amp Q_2 \end{array} \right) =
\left( \begin{array}{ c|c} q_1^H A q_1 \amp q_1^H A Q_2 \\ \hline Q_2^H A q_1 \amp Q_2^H A Q_2 \end{array} \right) =
\left( \begin{array}{ c|c} \lambda_1 \amp w^T \\ \hline 0 \amp B \end{array} \right),
\end{equation*}
where \(w^T = q_1^H A Q_2 \) and \(B = Q_2^H A Q_2 \text{.}\) Here \(q_1^H A q_1 = \lambda_1 q_1^H q_1 = \lambda_1 \) and \(Q_2^H A q_1 = \lambda_1 Q_2^H q_1 = 0 \) because the columns of \(Q_2 \) are orthogonal to \(q_1 \text{.}\) This insight can be used to construct an inductive proof.
In other words: Given matrix \(A \text{,}\) there exists a unitary matrix \(Q \) such that applying the unitary similarity transformation \(Q^H A Q \) yields an upper triangular matrix \(U \text{.}\) Since then \(\Lambda( A ) = \Lambda( U ) \text{,}\) the eigenvalues of \(A \) can be found on the diagonal of \(U \text{.}\) The eigenvectors of \(U \) can be computed and from those the eigenvectors of \(A \) can be recovered.
One should not mistake the above theorem and its proof for a constructive way to compute the Schur decomposition: finding an eigenvalue, \(\lambda_1 \) and/or the eigenvector associated with it, \(q_1 \text{,}\) is difficult. Also, completing the unitary matrix \(\left( \begin{array}{c | c } q_1 \amp Q_2 \end{array} \right) \) is expensive (requiring the equivalent of a QR factorization).
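In practice, one relies on a library routine for the Schur decomposition. The following sketch, which assumes SciPy is available (it is an illustration, not the algorithm developed in this course), uses scipy.linalg.schur to compute \(Q \) and \(U \) and confirms the properties stated in the theorem.

# Compute the (complex) Schur decomposition A = Q U Q^H and check that
# Q is unitary and diag(U) contains the eigenvalues of A.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
m = 6
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))

U, Q = schur(A, output='complex')        # U upper triangular, Q unitary
print(np.allclose(A, Q @ U @ Q.conj().T))                  # A = Q U Q^H
print(np.allclose(Q.conj().T @ Q, np.eye(m)))              # Q^H Q = I
print(np.allclose(np.sort_complex(np.diag(U)),
                  np.sort_complex(np.linalg.eigvals(A))))  # diag(U) = Lambda(A)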
Homework 9.2.4.2.
Let \(A \in \C^{m \times m} \text{,}\) \(A = Q U Q^H \) be its Schur decomposition, and \(X^{-1} U X = \Lambda \text{,}\) where \(\Lambda \) is a diagonal matrix and \(X \) is nonsingular.
How are the elements of \(\Lambda \) related to the elements of \(U \text{?}\)
How are the columns of \(X \) related to the eigenvectors of \(A \text{?}\)
Solution.
How are the elements of \(\Lambda \) related to the elements of \(U \text{?}\)
The diagonal elements of \(U \) equal the diagonal elements of \(\Lambda \text{.}\)
How are the columns of \(X \) related to the eigenvectors of \(A \text{?}\)
\begin{equation*} A = Q U Q^H = Q X \Lambda X^{-1} Q^H = ( Q X ) \Lambda ( Q X )^{-1}. \end{equation*}
Hence the columns of \(Q X \) are eigenvectors of \(A \text{.}\)
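The following is a small numerical check of this last observation, assuming NumPy and SciPy are available; it is an illustration only.

# If A = Q U Q^H and U X = X Lambda, then the columns of Q X are eigenvectors of A.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
m = 5
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))

U, Q = schur(A, output='complex')
lam, X = np.linalg.eig(U)                # U X = X Lambda with Lambda = diag(lam)
V = Q @ X                                # candidate eigenvectors of A
print(np.allclose(A @ V, V * lam))       # each column v_j satisfies A v_j = lambda_j v_j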
If the matrix is Hermitian, then the Schur decomposition has the added property that \(U \) is diagonal. The resulting decomposition is known as the spectral decomposition.
Theorem 9.2.4.6. Spectral Decomposition Theorem.
Let \(A \in \C^{m \times m} \) be Hermitian. Then there exist a unitary matrix \(Q \) and diagonal matrix \(D \in \mathbb R^{m \times m} \) such that \(A = Q D Q^H \text{.}\) This decomposition is called the spectral decomposition of matrix \(A\text{.}\)
Proof.
Let \(A = Q U Q^H \) be the Schur decomposition of \(A \text{.}\) Then \(U = Q^H A Q \text{.}\) Since \(A \) is Hermitian, so is \(U \text{,}\) because \(U^H = ( Q^H A Q )^H = Q^H A^H Q = Q^H A Q = U \text{.}\) A triangular matrix that is Hermitian is diagonal, so \(U \) is in fact a diagonal matrix, which we call \(D \text{.}\) Since any Hermitian matrix has a real-valued diagonal, \(D \) has real-valued elements on its diagonal.
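For Hermitian matrices, library routines compute the spectral decomposition directly. A minimal sketch, assuming NumPy is available: numpy.linalg.eigh returns real eigenvalues and a unitary matrix of eigenvectors.

# Spectral decomposition of a Hermitian matrix: A = Q D Q^H with D real and diagonal.
import numpy as np

rng = np.random.default_rng(3)
m = 5
B = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
A = (B + B.conj().T) / 2                 # construct a Hermitian matrix

d, Q = np.linalg.eigh(A)                 # d holds the (real) eigenvalues
print(np.isrealobj(d))                                      # True: real spectrum
print(np.allclose(Q.conj().T @ Q, np.eye(m)))               # Q is unitary
print(np.allclose(A, Q @ np.diag(d) @ Q.conj().T))          # A = Q D Q^H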
In practical algorithms, it will often occur that an intermediate result can be partitioned into smaller subproblems. This is known as deflating the problem and builds on the following insights.
Theorem 9.2.4.7.
Let \(A \in \C^{m \times m} \) be of form \(A = \left( \begin{array}{c | c} A_{TL} \amp A_{TR} \\ \hline 0 \amp A_{BR} \end{array} \right) \text{,}\) where \(A_{TL} \) and \(A_{BR} \) are square submatrices. Then \(\Lambda( A ) = \Lambda( A_{TL} ) \cup \Lambda( A_{BR} ) \text{.}\)
The proof of the above theorem follows from the next homework, which shows how the Schur decomposition of \(A \) can be constructed from the Schur decompositions of \(A_{TL} \) and \(A_{BR} \text{.}\)
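Before turning to the homework, here is a quick numerical illustration of Theorem 9.2.4.7, assuming NumPy is available.

# For a block upper triangular matrix, the spectrum is the union of the
# spectra of the diagonal blocks: Lambda(A) = Lambda(A_TL) u Lambda(A_BR).
import numpy as np

rng = np.random.default_rng(4)
mT, mB = 3, 4
ATL = rng.standard_normal((mT, mT)) + 1j * rng.standard_normal((mT, mT))
ATR = rng.standard_normal((mT, mB)) + 1j * rng.standard_normal((mT, mB))
ABR = rng.standard_normal((mB, mB)) + 1j * rng.standard_normal((mB, mB))
A = np.block([[ATL, ATR],
              [np.zeros((mB, mT)), ABR]])

lam_A = np.sort_complex(np.linalg.eigvals(A))
lam_blocks = np.sort_complex(np.concatenate([np.linalg.eigvals(ATL),
                                             np.linalg.eigvals(ABR)]))
print(np.allclose(lam_A, lam_blocks))    # True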
Homework 9.2.4.3.
Let \(A \in \C^{m \times m} \) be of form
\begin{equation*}
A = \left( \begin{array}{c | c} A_{TL} \amp A_{TR} \\ \hline 0 \amp A_{BR} \end{array} \right),
\end{equation*}
where \(A_{TL} \) and \(A_{BR} \) are square submatrices with Schur decompositions \(A_{TL} = Q_{TL} U_{TL} Q_{TL}^H \) and \(A_{BR} = Q_{BR} U_{BR} Q_{BR}^H \text{.}\)
Give the Schur decomposition of \(A \text{.}\)
Homework 9.2.4.4.
Generalize the result in the last homework to block upper triangular matrices:
\begin{equation*}
A = \left( \begin{array}{c c c c}
A_{0,0} \amp A_{0,1} \amp \cdots \amp A_{0,N-1} \\
0 \amp A_{1,1} \amp \cdots \amp A_{1,N-1} \\
\vdots \amp \vdots \amp \ddots \amp \vdots \\
0 \amp 0 \amp \cdots \amp A_{N-1,N-1}
\end{array} \right),
\end{equation*}
where the diagonal blocks \(A_{i,i} \) are square. For \(i = 0, \ldots , N-1 \text{,}\) let \(A_{i,i} = Q_i U_{i} Q_i^H \) be the Schur decomposition of \(A_{i,i} \text{.}\) Give the Schur decomposition of \(A \text{.}\)
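The following sketch can be used to experiment with the last two homeworks; it assumes SciPy is available, and it effectively contains the construction, so you may wish to attempt the homeworks first. It stacks the Schur factors of the diagonal blocks into a block diagonal unitary matrix and checks that the result yields a Schur decomposition of \(A \text{.}\)

# Assemble a Schur decomposition of a block upper triangular matrix from the
# Schur decompositions of its diagonal blocks, using Q = diag(Q_0, ..., Q_{N-1}).
import numpy as np
from scipy.linalg import schur, block_diag

rng = np.random.default_rng(5)
sizes = [2, 3, 2]                        # N = 3 diagonal blocks
m = sum(sizes)
offsets = np.cumsum([0] + sizes)

# Build a block upper triangular A with dense complex diagonal blocks A_ii.
A = np.triu(rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)))
for b, e in zip(offsets[:-1], offsets[1:]):
    blk = (e - b, e - b)
    A[b:e, b:e] = rng.standard_normal(blk) + 1j * rng.standard_normal(blk)

# Schur factor of each diagonal block: A_ii = Q_i U_i Q_i^H.
Qs = [schur(A[b:e, b:e], output='complex')[1]
      for b, e in zip(offsets[:-1], offsets[1:])]
Q = block_diag(*Qs)                      # block diagonal with unitary blocks, hence unitary

U = Q.conj().T @ A @ Q
print(np.allclose(U, np.triu(U)))        # U is upper triangular
print(np.allclose(np.sort_complex(np.diag(U)),
                  np.sort_complex(np.linalg.eigvals(A))))   # Lambda(U) = Lambda(A)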