SELECTED PLAPACK IMPLEMENTATIONS OF MATRIX OPERATIONS
Index
- Parallel Level-2 BLAS:
  - PLA_Gemv: Parallel General Matrix-Vector Multiplication
- Parallel Level-3 BLAS:
  - PLA_Gemm: Parallel General Matrix-Matrix Multiplication
- Parallel Factorization Routines:
  - PLA_Chol: Parallel Cholesky Factorization
PLA_Gemv: General Matrix-Vector Multiplication
The best way to justify the Abstract Programming Interface used by
PLAPACK is to examine how a parallel implementation looks when it is
coded in a more traditional fashion. Compare with the corresponding
ScaLAPACK code (a sketch of the PLAPACK calling style follows the
list):
- pdgemv_.c: ScaLAPACK Parallel BLAS (PBLAS) routine
- pbdgemv.f: ScaLAPACK Parallel Blocked BLAS (PBBLAS) routine
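For contrast, a PLAPACK call operates on distributed linear-algebra
objects rather than on explicit local arrays and descriptor tuples.
The following is a minimal sketch of a driver for y := alpha A x +
beta y, assuming the object-based creation routines and constants
described in the Users' Guide (PLA_Matrix_create, PLA_Mvector_create,
PLA_ALIGN_FIRST, etc.); names and argument orders should be checked
against the PLAPACK release at hand.

    /* Hedged sketch: y := 1.0 * A * x + 0.0 * y with PLAPACK objects.
       All creation calls and constants are assumed from the Users'
       Guide and may differ in detail from a given release. */
    #include <mpi.h>
    #include "PLA.h"              /* assumed PLAPACK header */

    void gemv_sketch( PLA_Template templ, int n )  /* hypothetical driver */
    {
      PLA_Obj A = NULL, x = NULL, y = NULL;
      PLA_Obj minus_one = NULL, zero = NULL, one = NULL;

      /* One template object describes the distribution; no per-process
         leading dimensions or index arithmetic appear in user code. */
      PLA_Matrix_create ( MPI_DOUBLE, n, n, templ,
                          PLA_ALIGN_FIRST, PLA_ALIGN_FIRST, &A );
      PLA_Mvector_create( MPI_DOUBLE, n, 1, templ, PLA_ALIGN_FIRST, &x );
      PLA_Mvector_create( MPI_DOUBLE, n, 1, templ, PLA_ALIGN_FIRST, &y );

      /* ... fill A and x ... */

      /* Scaling factors are themselves (duplicated) objects. */
      PLA_Create_constants_conf_to( A, &minus_one, &zero, &one );

      PLA_Gemv( PLA_NO_TRANSPOSE, one, A, x, zero, y );

      PLA_Obj_free( &A );         PLA_Obj_free( &x );
      PLA_Obj_free( &y );         PLA_Obj_free( &minus_one );
      PLA_Obj_free( &zero );      PLA_Obj_free( &one );
    }

Note how the call itself carries no information about the data
distribution; that is the point of the comparison with pdgemv_.c,
where descriptors and global indices appear explicitly in the
argument list.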
PLA_Gemm: General Matrix-Matrix Multiplication
- Main routine: PLA_Gemm.c
This routine chooses among three different routines, depending
on the shapes of the matrices involved:
- If matrix C contains the most data, it is left in place and
  A and B are communicated. The algorithm is implemented
  as a sequence of rank-k updates (sketched below).
- If matrix A contains the most data, it is left in place and
  B and C are communicated. The algorithm is implemented
  as a sequence of matrix-panel (of columns) multiplies.
- If matrix B contains the most data, it is left in place and
  A and C are communicated. The algorithm is implemented
  as a sequence of panel (of rows)-matrix multiplies.
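The "C stays in place" case has the loop structure of the following
sequential sketch: the k dimension is cut into panels of width nb, and
each panel pair contributes one rank-nb update of all of C. This is a
plain-C illustration of the loop structure only (the function name is
hypothetical); in the parallel routine the panels of A and B are
broadcast within process rows and columns, as in SUMMA.

    /* Sequential sketch of C := C + A*B as a sequence of rank-nb
       updates: C += A(:, k:k+nb) * B(k:k+nb, :).  Column-major storage. */
    void gemm_rank_k_sketch( int m, int n, int kdim, int nb,
                             const double *A,   /* m x kdim */
                             const double *B,   /* kdim x n */
                             double *C )        /* m x n    */
    {
      for ( int k = 0; k < kdim; k += nb ) {
        int kb = ( kdim - k < nb ) ? kdim - k : nb;  /* panel width */
        /* rank-kb update touching all of C; only the panels of A and
           B (the communicated data) change from iteration to iteration */
        for ( int j = 0; j < n; j++ )
          for ( int p = 0; p < kb; p++ ) {
            double b_kj = B[ ( k + p ) + j * kdim ];
            for ( int i = 0; i < m; i++ )
              C[ i + j * m ] += A[ i + ( k + p ) * m ] * b_kj;
          }
      }
    }

The other two cases have the analogous structure, with the outer loop
running over panels of columns of C (and B) or over panels of rows of
C (and A) instead.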
- Parameter checking: PLA_Gemm_enter_exit.c
The best way to justify the Abstract Programming Interface used by
PLAPACK is to examine how a parallel implementation looks when it is
coded in a more traditional fashion. Compare with the corresponding
ScaLAPACK code:
- pdgemm_.c: ScaLAPACK Parallel BLAS (PBLAS) routine
- pbdgemm.f: ScaLAPACK Parallel Blocked BLAS (PBBLAS) routine
References:
- R. van de Geijn, Using PLAPACK (Users' Guide), The MIT Press, 1997.
- Robert van de Geijn and Jerrell Watts,
  "SUMMA: Scalable Universal Matrix Multiplication Algorithm,"
  Concurrency: Practice and Experience, Vol. 9 (4), pp. 255-274,
  April 1997.
- John Gunnels, Calvin Lin, Greg Morrow, and Robert van de Geijn,
  "A Flexible Class of Parallel Matrix Multiplication Algorithms,"
  Proceedings of the First Merged International Parallel Processing
  Symposium and Symposium on Parallel and Distributed Processing
  (1998 IPPS/SPDP '98), pp. 110-116, 1998.
PLA_Chol: Cholesky Factorization
- Main routine: PLA_Chol.c
- Parameter checking: PLA_Chol_enter_exit.c
- A much simpler implementation, which really shows how a PLAPACK
  implementation is just a direct translation of the way an algorithm
  is naturally expressed (the algorithm itself is sketched after the
  ScaLAPACK listings below), is given by
The best way to justify the Abstract Programming Interface used by
PLAPACK is to examine how a parallel implementation looks when it is
coded in a more traditional fashion. Compare with the corresponding
ScaLAPACK code:
- pdpotrf.c: ScaLAPACK Blocked Cholesky Factorization
- pdpotf2.c: ScaLAPACK Unblocked Cholesky Factorization (needed by the
  blocked factorization)
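The natural expression being translated is the standard blocked
right-looking Cholesky factorization: factor the leading block, solve
a triangular system for the panel below it, update the trailing
matrix, and repeat. The sequential sketch below (lower-triangular,
column-major, no pivoting since A is assumed symmetric positive
definite) shows the three steps; the function names are hypothetical,
and it is assumed here that PLA_Chol.c follows this blocked variant,
applying the same steps to views of the distributed matrix.

    #include <math.h>

    /* Unblocked lower Cholesky of the nb x nb block at A (in place),
       leading dimension lda; the analogue of LAPACK's dpotf2. */
    static void chol_unb( int nb, double *A, int lda )
    {
      for ( int j = 0; j < nb; j++ ) {
        A[ j + j*lda ] = sqrt( A[ j + j*lda ] );
        for ( int i = j+1; i < nb; i++ )        /* scale column j   */
          A[ i + j*lda ] /= A[ j + j*lda ];
        for ( int k = j+1; k < nb; k++ )        /* rank-1 update of */
          for ( int i = k; i < nb; i++ )        /* trailing matrix  */
            A[ i + k*lda ] -= A[ i + j*lda ] * A[ k + j*lda ];
      }
    }

    /* Blocked right-looking Cholesky; the loop body is the natural
       three-step algorithm. */
    void chol_blocked( int n, double *A, int lda, int nb )
    {
      for ( int j = 0; j < n; j += nb ) {
        int jb = ( n - j < nb ) ? n - j : nb;
        int m2 = n - j - jb;                    /* rows below block */
        double *A11 = &A[  j       + j*lda ];
        double *A21 = &A[ (j + jb) + j*lda ];
        double *A22 = &A[ (j + jb) + (j + jb)*lda ];

        chol_unb( jb, A11, lda );               /* A11 := Chol(A11) */

        /* A21 := A21 * inv(L11)^T  (triangular solve, a trsm)      */
        for ( int k = 0; k < jb; k++ ) {
          for ( int i = 0; i < m2; i++ )
            A21[ i + k*lda ] /= A11[ k + k*lda ];
          for ( int c = k+1; c < jb; c++ )
            for ( int i = 0; i < m2; i++ )
              A21[ i + c*lda ] -= A21[ i + k*lda ] * A11[ c + k*lda ];
        }

        /* A22 := A22 - A21 * A21^T  (symmetric rank-jb update)     */
        for ( int c = 0; c < m2; c++ )
          for ( int k = 0; k < jb; k++ )
            for ( int i = c; i < m2; i++ )
              A22[ i + c*lda ] -= A21[ i + k*lda ] * A21[ c + k*lda ];
      }
    }

In PLAPACK each of the three steps would map onto one call on views of
the distributed matrix, which is what makes the parallel code read so
much like this sketch.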
References:
- R. van de Geijn, Using PLAPACK (Users' Guide), The MIT Press, 1997.
- Greg Morrow and Robert van de Geijn,
  "Zen and the Art of High-Performance Parallel Computing."
- Greg Baker, John Gunnels, Greg Morrow, Beatrice Riviere, and Robert
  van de Geijn, "PLAPACK: High Performance through High Level
  Abstraction," Proceedings of ICPP '98, 1998.
- Philip Alpatov, Greg Baker, Carter Edwards, John Gunnels, Greg
  Morrow, James Overfelt, Robert van de Geijn, and Yuan-Jye J. Wu,
  "PLAPACK: Parallel Linear Algebra Libraries Design Overview,"
  Proceedings of SC97, 1997.
- Philip Alpatov, Greg Baker, Carter Edwards, John Gunnels, Greg
  Morrow, James Overfelt, Robert van de Geijn, and Yuan-Jye J. Wu,
  "PLAPACK: Parallel Linear Algebra Package," Proceedings of the SIAM
  Parallel Processing Conference, 1997.
Send mail to plapack@cs.utexas.edu
Last Updated: Feb. 8, 2000