SUMMA: Scalable Universal Matrix Multiplication Algorithm

Robert A. van de Geijn
Department of Computer Sciences
University of Texas
Austin, TX 78712
rvdg@cs.utexas.edu
Jerrell Watts
Scalable Concurrent Programming Laboratory
California Institute of Technology
Pasadena, California 91125
jwatts@scp.caltech.edu

Abstract

In this paper, we give a straightforward, highly efficient, scalable implementation of common matrix multiplication operations. The algorithms are much simpler than previously published methods, yield better performance, and require less workspace. MPI implementations are given, as are performance results on the Intel Paragon system.

Robert van de Geijn and Jerrell Watts, "SUMMA: Scalable Universal Matrix Multiplication Algorithm," submitted to Concurrency: Practice and Experience.

Robert van de Geijn and Jerrell Watts, "SUMMA: Scalable Universal Matrix Multiplication Algorithm," Department of Computer Sciences, The University of Texas, TR-95-13, April 1995. Also: LAPACK Working Note #96, May 1995.