SUMMA: Scalable Universal Matrix Multiplication Algorithm
- Robert A. van de Geijn
- Department of Computer Sciences
- University of Texas
- Austin, TX 78712
- rvdg@cs.utexas.edu
- Jerrell Watts
- Scalable Concurrent Programming Laboratory
- California Institute of Technology
- Pasadena, California 91125
- jwatts@scp.caltech.edu
Abstract
In this paper, we give a straightforward, highly efficient,
scalable implementation of common matrix multiplication
operations.
The algorithms are much simpler than previously
published methods, yield better performance, and
require less work space.
MPI implementations are given, as are performance results
on the Intel Paragon system.
Robert van de Geijn and Jerrell Watts,
``SUMMA: Scalable Universal Matrix Multiplication Algorithm,''
submitted to Concurrency: Practice and Experience.
Robert van de Geijn and Jerrell Watts,
``SUMMA: Scalable Universal Matrix Multiplication Algorithm,''
Department of Computer Sciences, The University of Texas,
TR-95-13, April 1995.
Also:
LAPACK Working Note #96, May 1995.