A High Performance Parallel Strassen Implementation

Brian Grayson
Department of Electrical and Computer Engineering
University of Texas
Austin, TX 78712
bgrayson@pine.ece.utexas.edu

Ajay Pankaj Shah
Department of Computer Sciences
University of Texas
Austin, TX 78712
ajay@cs.utexas.edu

Robert A. van de Geijn
Department of Computer Sciences
University of Texas
Austin, TX 78712
rvdg@cs.utexas.edu

Abstract

In this paper, we give what we believe to be the first high performance parallel implementation of Strassen's algorithm for matrix multiplication. We show how, under restricted conditions, this algorithm can be implemented so that it is plug-compatible with standard parallel matrix multiplication algorithms. Results obtained on a large Intel Paragon system show a 10-20% reduction in execution time compared to what we believe to be the fastest standard parallel matrix multiplication implementation available at this time.

Brian Grayson, Ajay Shah, and Robert van de Geijn, "A High Performance Parallel Strassen Implementation," Department of Computer Sciences, The University of Texas, TR-95-24, June 1995. Journal version: Parallel Processing Letters, Vol. 6, No. 1 (1996) 3-12.