SUMMA: Scalable Universal Matrix Multiplication Algorithm
- Robert A. van de Geijn
- Department of Computer Sciences
- University of Texas
- Austin, TX 78712
- Jerrell Watts
- Scalable Concurrent Programming Laboratory
- California Institute of Technology
- Pasadena, California 91125
- jwatts@scp.caltech.edu
Abstract
We give a straightforward, highly efficient,
scalable implementation of common matrix multiplication
operations.
The algorithms are much simpler than previously
published methods, yield better performance, and
require less workspace.
MPI implementations are given, as are performance results
on the Intel Paragon system.