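We now show how the parallel matrix-vector multiplication and rank-1 update can be used to implement matrix-matrix multiplication and other matrix-matrix operations. For simplicity, we concentrate on the case $C = A B$, where all three matrices are $n \times n$. In all our explanations, we will use the following partitionings:

$$ X = \left( x_0 \mid x_1 \mid \cdots \mid x_{n-1} \right) $$

with $X \in \{ A, B, C \}$ and where $x_j$ represents the $j$th column of matrix $X$. Also,

$$ X = \begin{pmatrix} \hat{x}_0^T \\ \hat{x}_1^T \\ \vdots \\ \hat{x}_{n-1}^T \end{pmatrix} $$

where $\hat{x}_i^T$ represents the $i$th row of matrix $X$.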
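
To make the role of the column and row partitionings concrete, the following is a minimal sequential sketch, not the parallel implementation discussed in the text. It assumes matrices stored as Python lists of rows with 0-based indices, and all function names are hypothetical. It shows two ways of forming C = A B: column by column with matrix-vector multiplications (column j of C equals A times column j of B), and as a sum of rank-1 updates (column k of A times row k of B, accumulated over k).

```python
# Illustrative sketch only: sequential, pure Python, no data distribution.
# All function names here are hypothetical and not taken from the text.

def matvec(A, x):
    """Return y = A x for a matrix A (list of rows) and a vector x."""
    return [sum(a_ik * x_k for a_ik, x_k in zip(row, x)) for row in A]

def column(X, j):
    """Return column j of X as a list."""
    return [row[j] for row in X]

def matmul_by_matvec(A, B):
    """Form C = A B one column at a time: c_j = A b_j."""
    n_rows, n_cols = len(A), len(B[0])
    C = [[0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        c_j = matvec(A, column(B, j))
        for i in range(n_rows):
            C[i][j] = c_j[i]
    return C

def rank1_update(C, x, y):
    """Update C in place: C := C + x y^T."""
    for i, x_i in enumerate(x):
        for j, y_j in enumerate(y):
            C[i][j] += x_i * y_j

def matmul_by_rank1(A, B):
    """Form C = A B as a sum of rank-1 updates, one per column of A."""
    n_rows, n_cols = len(A), len(B[0])
    C = [[0] * n_cols for _ in range(n_rows)]
    for k in range(len(B)):
        # Column k of A times row k of B.
        rank1_update(C, column(A, k), B[k])
    return C
```

Both functions compute the same product; they differ only in which partitioning drives the outer loop, which is the distinction that matters once the matrix-vector multiplication and the rank-1 update are replaced by their parallel counterparts.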