Subsection 1.3.1 Of linear transformations and matrices
We briefly review the relationship between linear transformations and matrices, which is key to understanding why linear algebra is all about matrices and vectors.
Definition 1.3.1.1. Linear transformations and matrices.
Let \(L : \Cn \rightarrow \Cm \text{.}\) Then \(L \) is said to be a linear transformation if for all \(\alpha \in \mathbb C \) and \(x, y \in \Cn \)
\(L( \alpha x ) = \alpha L( x ) \text{.}\) That is, scaling first and then transforming yields the same result as transforming first and then scaling.
\(L( x + y ) = L( x ) + L( y ) \text{.}\) That is, adding first and then transforming yields the same result as transforming first and then adding.
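To make this concrete, here is a minimal numerical sketch (in Python with NumPy; the matrix \(A \) and the vectors are made up for the illustration) that checks both properties for the transformation \(L( x ) = A x \text{:}\)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 2))   # an arbitrary 3 x 2 matrix (made up for this sketch)

def L(x):
    # The candidate transformation: multiplication by A.
    return A @ x

alpha = 2.5
x = rng.standard_normal(2)
y = rng.standard_normal(2)

# Scaling first and then transforming equals transforming first and then scaling.
print(np.allclose(L(alpha * x), alpha * L(x)))   # True

# Adding first and then transforming equals transforming first and then adding.
print(np.allclose(L(x + y), L(x) + L(y)))        # True
```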
The importance of linear transformations comes in part from the fact that many problems in science boil down to, given a function \(F: \Cn \rightarrow \Cm \) and vector \(y \in \Cm \text{,}\) find \(x \) such that \(F( x ) = y \text{.}\) This is known as an inverse problem. Under mild conditions, \(F \) can be locally approximated with a linear transformation \(L \) and then, as part of a solution method, one would want to solve \(L x = y \text{.}\)
The following theorem provides the link between linear transformations and matrices:
Theorem 1.3.1.2.
Let \(L: \Cn \rightarrow \Cm \) be a linear transformation, \(v_0, v_1, \cdots, v_{k-1} \in \Cn \text{,}\) and \(x \in \C^k \text{.}\) Then
\begin{equation*} L( \chi_0 v_0 + \chi_1 v_1 + \cdots + \chi_{k-1} v_{k-1} ) = \chi_0 L( v_0 ) + \chi_1 L( v_1 ) + \cdots + \chi_{k-1} L( v_{k-1} ), \end{equation*}
where
\begin{equation*} x = \left( \begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{k-1} \end{array} \right). \end{equation*}
Proof.
A simple inductive proof yields the result. For details, see Week 2 of Linear Algebra: Foundations to Frontiers (LAFF) [27].
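If you want to convince yourself numerically, the following sketch (again with a made-up matrix playing the role of the linear transformation) checks the statement of the theorem for \(k = 3 \text{:}\)

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))              # L(v) = A v is a linear transformation
L = lambda v: A @ v

v0, v1, v2 = (rng.standard_normal(3) for _ in range(3))
chi0, chi1, chi2 = rng.standard_normal(3)    # the entries of x

lhs = L(chi0 * v0 + chi1 * v1 + chi2 * v2)
rhs = chi0 * L(v0) + chi1 * L(v1) + chi2 * L(v2)
print(np.allclose(lhs, rhs))                 # True
```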
The following set of vectors ends up playing a crucial role throughout this course:
Definition 1.3.1.3. Standard basis vector.
In this course, we will use \(e_j \in \Cm \) to denote the standard basis vector with a "1" in the position indexed with \(j \text{.}\) So,
\begin{equation*} e_j = \left( \begin{array}{c} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{array} \right), \end{equation*}
where the "1" appears as the entry indexed with \(j \) and all other entries equal "0".
Key is the fact that any vector \(x \in \Cn \) can be written as a linear combination of the standard basis vectors of \(\Cn \text{:}\)
\begin{equation*} x = \left( \begin{array}{c} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{array} \right) = \chi_0 e_0 + \chi_1 e_1 + \cdots + \chi_{n-1} e_{n-1}. \end{equation*}
Hence, if \(L \) is a linear transformation,
\begin{equation*} L( x ) = L( \chi_0 e_0 + \chi_1 e_1 + \cdots + \chi_{n-1} e_{n-1} ) = \chi_0 L( e_0 ) + \chi_1 L( e_1 ) + \cdots + \chi_{n-1} L( e_{n-1} ). \end{equation*}
If we now let \(a_j = L( e_j ) \) (the vector \(a_j \) is the transformation of the standard basis vector \(e_j \)) and collect these vectors into a two-dimensional array of numbers:
\begin{equation} A = \left( \begin{array}{c | c | c | c} a_0 \amp a_1 \amp \cdots \amp a_{n-1} \end{array} \right), \tag{1.3.1} \end{equation}
then we notice that information for evaluating \(L( x )\) can be found in this array, since \(L\) can then alternatively be computed by
\begin{equation*} L( x ) = \chi_0 a_0 + \chi_1 a_1 + \cdots + \chi_{n-1} a_{n-1}. \end{equation*}
The array \(A \) in (1.3.1) we call a matrix and the operation \(A x = \chi_0 a_0 + \chi_1 a_1 + \cdots + \chi_{n-1} a_{n-1} \) we call matrix-vector multiplication. Clearly \(A x = L( x ) \text{.}\)
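The sketch below illustrates this construction: it builds the matrix that represents a (made-up) linear transformation by applying it to the standard basis vectors, and then checks that \(A x \text{,}\) the linear combination of the columns of \(A \text{,}\) and \(L( x ) \) all agree:

```python
import numpy as np

def L(x):
    # A made-up linear transformation from R^3 to R^2.
    chi0, chi1, chi2 = x
    return np.array([chi0 + 2 * chi1, 3 * chi1 - chi2])

n = 3
# Column j of A is a_j = L(e_j), where e_j is column j of the identity.
A = np.column_stack([L(np.eye(n)[:, j]) for j in range(n)])

x = np.array([1.0, -2.0, 0.5])
print(L(x))                                   # evaluating the transformation directly
print(A @ x)                                  # matrix-vector multiplication
print(sum(x[j] * A[:, j] for j in range(n)))  # linear combination of the columns of A
```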
Remark 1.3.1.4. Notation.
In these notes, as a rule,
Roman upper case letters are used to denote matrices.
Roman lower case letters are used to denote vectors.
Greek lower case letters are used to denote scalars.
Corresponding letters from these three sets are used to refer to a matrix, the rows or columns of that matrix, and the elements of that matrix. If \(A \in \mathbb C^{m \times n} \) then
\begin{equation*} A = \left( \begin{array}{c | c | c | c} a_0 \amp a_1 \amp \cdots \amp a_{n-1} \end{array} \right) = \left( \begin{array}{c} \widetilde a_0^T \\ \hline \widetilde a_1^T \\ \hline \vdots \\ \hline \widetilde a_{m-1}^T \end{array} \right) = \left( \begin{array}{c c c c} \alpha_{0,0} \amp \alpha_{0,1} \amp \cdots \amp \alpha_{0,n-1} \\ \alpha_{1,0} \amp \alpha_{1,1} \amp \cdots \amp \alpha_{1,n-1} \\ \vdots \amp \vdots \amp \amp \vdots \\ \alpha_{m-1,0} \amp \alpha_{m-1,1} \amp \cdots \amp \alpha_{m-1,n-1} \end{array} \right), \end{equation*}
where \(a_j \) denotes the column of \(A \) indexed with \(j \text{,}\) \(\widetilde a_i^T \) denotes the row of \(A \) indexed with \(i \text{,}\) and \(\alpha_{i,j} \) denotes the element of \(A \) in row \(i \) and column \(j \text{.}\)
We now notice that the standard basis vector \(e_j \in \Cm \) equals the column of the \(m \times m \) identity matrix indexed with \(j \text{:}\)
\begin{equation*} I = \left( \begin{array}{c | c | c | c} e_0 \amp e_1 \amp \cdots \amp e_{m-1} \end{array} \right). \end{equation*}
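A quick check of this fact with NumPy:

```python
import numpy as np

m = 4
I = np.eye(m)
print(I[:, 2])                      # the standard basis vector e_2: [0. 0. 1. 0.]
x = np.array([1.0, 2.0, 3.0, 4.0])
print(np.allclose(I @ x, x))        # True: I x = x
```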
Remark 1.3.1.5.
The important thing to note is that a matrix is a convenient representation of a linear transformation and matrix-vector multiplication is an alternative way for evaluating that linear transformation.
Let's investigate matrix-matrix multiplication and its relationship to linear transformations. Consider two linear transformations
\begin{equation*} L_A: \C^k \rightarrow \Cm \mbox{ (represented by the } m \times k \mbox{ matrix } A \mbox{)} \quad \mbox{and} \quad L_B: \Cn \rightarrow \C^k \mbox{ (represented by the } k \times n \mbox{ matrix } B \mbox{)} \end{equation*}
and define
\begin{equation*} L_C( x ) = L_A( L_B( x ) ) \end{equation*}
as the composition of \(L_A\) and \(L_B \text{.}\) Then it can be easily shown that \(L_C \) is also a linear transformation. Let \(m \times n\) matrix \(C \) represent \(L_C \text{.}\) How are \(A \text{,}\) \(B \text{,}\) and \(C \) related? If we let \(c_j \) equal the column of \(C \) indexed with \(j \text{,}\) then because of the link between matrices, linear transformations, and standard basis vectors
\begin{equation*} c_j = L_C( e_j ) = L_A( L_B( e_j ) ) = L_A( B e_j ) = L_A( b_j ) = A b_j, \end{equation*}
where \(b_j \) equals the column of \(B \) indexed with \(j \text{.}\) Now, we say that \(C = A B \) is the product of \(A \) and \(B \) defined by
\begin{equation*} \left( \begin{array}{c | c | c} c_0 \amp \cdots \amp c_{n-1} \end{array} \right) = A \left( \begin{array}{c | c | c} b_0 \amp \cdots \amp b_{n-1} \end{array} \right) = \left( \begin{array}{c | c | c} A b_0 \amp \cdots \amp A b_{n-1} \end{array} \right) \end{equation*}
and define the matrix-matrix multiplication as the operation that computes
\begin{equation*} C := A B, \end{equation*}
which you will want to pronounce "C becomes A times B" to distinguish assignment from equality. If you think carefully how individual elements of \(C \) are computed, you will realize that they equal the usual "dot product of rows of \(A \) with columns of \(B \text{.}\)"
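The following sketch (with arbitrary made-up matrices) checks that the matrix \(C = A B \) indeed represents the composition, and that an element of \(C \) is the dot product of a row of \(A \) with a column of \(B \text{:}\)

```python
import numpy as np

rng = np.random.default_rng(2)
m, k, n = 4, 3, 2
A = rng.standard_normal((m, k))    # represents L_A
B = rng.standard_normal((k, n))    # represents L_B
C = A @ B                          # C := A B represents the composition

x = rng.standard_normal(n)
print(np.allclose(C @ x, A @ (B @ x)))          # True: C x = L_A( L_B( x ) )

# Element (i, j) of C is the dot product of row i of A with column j of B.
i, j = 1, 0
print(np.isclose(C[i, j], A[i, :] @ B[:, j]))   # True
```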
As already mentioned, throughout this course, it will be important that you can think about matrices in terms of their columns and rows, and matrix-matrix multiplication (and other operations with matrices and vectors) in terms of columns and rows. It is also important to be able to think about matrix-matrix multiplication in three different ways. If we partition each matrix by rows and by columns:
\begin{equation*} C = \left( \begin{array}{c | c | c} c_0 \amp \cdots \amp c_{n-1} \end{array} \right) = \left( \begin{array}{c} \widetilde c_0^T \\ \hline \vdots \\ \hline \widetilde c_{m-1}^T \end{array} \right), \quad A = \left( \begin{array}{c | c | c} a_0 \amp \cdots \amp a_{k-1} \end{array} \right) = \left( \begin{array}{c} \widetilde a_0^T \\ \hline \vdots \\ \hline \widetilde a_{m-1}^T \end{array} \right), \end{equation*}
and
\begin{equation*} B = \left( \begin{array}{c | c | c} b_0 \amp \cdots \amp b_{n-1} \end{array} \right) = \left( \begin{array}{c} \widetilde b_0^T \\ \hline \vdots \\ \hline \widetilde b_{k-1}^T \end{array} \right), \end{equation*}
then \(C := A B \) can be computed in the following ways (all three are illustrated in the sketch that follows this list):
-
By columns:
\begin{equation*} \left( \begin{array}{c | c | c} c_0 \amp \cdots \amp c_{n-1} \end{array} \right) := A \left( \begin{array}{c | c | c} b_0 \amp \cdots \amp b_{n-1} \end{array} \right) = \left( \begin{array}{c | c | c} A b_0 \amp \cdots \amp A b_{n-1} \end{array} \right). \end{equation*}In other words, \(c_j := A b_j \) for all columns of \(C \text{.}\)
-
By rows:
\begin{equation*} \left( \begin{array}{c} \widetilde c_0^T \\ \hline \vdots \\ \hline \widetilde c_{m-1}^T \end{array} \right) := \left( \begin{array}{c} \widetilde a_0^T \\ \hline \vdots \\ \hline \widetilde a_{m-1}^T \end{array} \right) B = \left( \begin{array}{c} \widetilde a_0^T B \\ \hline \vdots \\ \hline \widetilde a_{m-1}^T B \end{array} \right). \end{equation*}In other words, \(\widetilde c_i^T = \widetilde a_i^T B\) for all rows of \(C \text{.}\)
-
One you may not have thought about much before:
\begin{equation*} C := \left( \begin{array}{c | c | c} a_0 \amp \cdots \amp a_{k-1} \end{array} \right) \left( \begin{array}{c} \widetilde b_0^T \\ \hline \vdots \\ \hline \widetilde b_{k-1}^T \end{array} \right) = a_0 \widetilde b_0^T + \cdots + a_{k-1} \widetilde b_{k-1} ^T, \end{equation*}which should be thought of as a sequence of rank-1 updates, since each term is an outer product and an outer product has rank of at most one.
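The sketch below (with arbitrary made-up matrices) computes \(C := A B \) in all three ways and verifies that they agree:

```python
import numpy as np

rng = np.random.default_rng(3)
m, k, n = 4, 3, 2
A = rng.standard_normal((m, k))
B = rng.standard_normal((k, n))
C = A @ B

# By columns: c_j := A b_j.
by_columns = np.column_stack([A @ B[:, j] for j in range(n)])

# By rows: row i of C is (row i of A) times B.
by_rows = np.vstack([A[i, :] @ B for i in range(m)])

# As a sum of rank-1 updates (outer products of columns of A with rows of B).
rank1_updates = sum(np.outer(A[:, p], B[p, :]) for p in range(k))

print(np.allclose(by_columns, C), np.allclose(by_rows, C), np.allclose(rank1_updates, C))
```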
These three cases are special cases of the more general observation that, if we can partition \(C \text{,}\) \(A \text{,}\) and \(B \) by blocks (submatrices),
\begin{equation*} C = \left( \begin{array}{c | c | c} C_{0,0} \amp \cdots \amp C_{0,N-1} \\ \hline \vdots \amp \amp \vdots \\ \hline C_{M-1,0} \amp \cdots \amp C_{M-1,N-1} \end{array} \right), \quad A = \left( \begin{array}{c | c | c} A_{0,0} \amp \cdots \amp A_{0,K-1} \\ \hline \vdots \amp \amp \vdots \\ \hline A_{M-1,0} \amp \cdots \amp A_{M-1,K-1} \end{array} \right), \end{equation*}
and
\begin{equation*} B = \left( \begin{array}{c | c | c} B_{0,0} \amp \cdots \amp B_{0,N-1} \\ \hline \vdots \amp \amp \vdots \\ \hline B_{K-1,0} \amp \cdots \amp B_{K-1,N-1} \end{array} \right), \end{equation*}
where the partitionings are "conformal", then
\begin{equation*} C_{i,j} = \sum_{p=0}^{K-1} A_{i,p} B_{p,j}. \end{equation*}
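Here is a small sketch of the blocked view, partitioning made-up matrices into \(2 \times 2 \) blocks of (arbitrarily chosen) conformal sizes:

```python
import numpy as np

rng = np.random.default_rng(4)
m, k, n = 4, 6, 5
A = rng.standard_normal((m, k))
B = rng.standard_normal((k, n))

# Conformal 2 x 2 partitionings (block sizes chosen arbitrarily).
mb, kb, nb = 2, 4, 3
A00, A01, A10, A11 = A[:mb, :kb], A[:mb, kb:], A[mb:, :kb], A[mb:, kb:]
B00, B01, B10, B11 = B[:kb, :nb], B[:kb, nb:], B[kb:, :nb], B[kb:, nb:]

# Each block of C is the sum over p of A_{i,p} B_{p,j}.
C = np.block([[A00 @ B00 + A01 @ B10, A00 @ B01 + A01 @ B11],
              [A10 @ B00 + A11 @ B10, A10 @ B01 + A11 @ B11]])

print(np.allclose(C, A @ B))   # True
```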
Remark 1.3.1.6.
If the above review of linear transformations, matrices, matrix-vector multiplication, and matrix-matrix multiplication makes you exclaim "That is all a bit too fast for me!" then it is time for you to take a break and review Weeks 2-5 of our introductory linear algebra course "Linear Algebra: Foundations to Frontiers." Information, including notes [27] (optionally downloadable for free) and a link to the course on edX [28] (which can be audited for free), can be found at http://ulaff.net.