Robert A. van de Geijn

CS 395T Parallel Techniques for Numerical Algorithms

WHAT'S NEW

Team assignments for semester project
User manual for SL library (in preparation)
Example routine for dealing with communicators, making them 2D, and extracting row and column communicators.
Information on how to use the SP-2 accounts
Information on the TA

Please start by filling out the survey

About the Course

CS 395T
Parallel Techniques for Numerical Algorithms
Unique Number: 47470
Time: MW 3:30-4:45
Room: GAR 201

About the Professor

Robert A. van de Geijn
Associate Professor of Computer Sciences
Office: Tay 4.115C
e-mail: rvdg@cs.utexas.edu
phone: 471-9720
Office Hours: to be determined

Materials

Required:
MPI: THE COMPLETE REFERENCE by Snir, Otto, Huss-Lederman, Walker, and Dongarra, MIT Press, 1995.
DO NOT PRINT THESE BOOKS OUT.
Reference:
Using MPI by Gropp, Lusk, and Skjellum, MIT Press, 1994
Other:
Other materials will be placed on the course webpage, http://www.cs.utexas.edu/users/rvdg/395T.96

Syllabus

Why Parallel Computing?
Why Message Passing?
Collective Communication
High Performance Parallel Dense Linear Algebra Algorithms
Sparse Methods
Selected Topics

Grading

Grades will be based on one or two initial programming projects and a semester project of the student's choice. Group projects are encouraged.

Reference Information

Enrolled Students

The Argonne National Laboratory MPI site.

Information on various vendors

Intel Scalable Systems Division: Intel Paragon
IBM: MPI on SP-2
Cray T3D and T3E
Silicon Graphics Incorporated: Power Challenge
Convex SPP

Various documents used in the class

A Street Guide to Collective Communication on Parallel Computers

Chapter 1: Handout, Jan. 22
Chapter 1: Handout, Jan. 24
Chapter 1-2: Handout, Jan. 29

BLAS materials

BLAS quick reference guide (hard copies handed out in class)
Level 2 BLAS paper
Level 3 BLAS paper

Using the iPSC/860: see James Overfelt's page

Communication Materials

NEW Prasenjit Mitra, David Payne, Lance Shuler, Robert van de Geijn, and Jerrell Watts, "Fast Collective Communication Libraries, Please," Proceedings of the Intel Supercomputing Users' Group Meeting 1995.

Hypercube Global Combine Paper
Mesh Global Combine Paper
Barnett et al., "Broadcasting on Meshes with Wormhole Routing," Journal of Parallel and Distributed Computing
Jerrell Watts and Robert van de Geijn, Mesh Pipelined Broadcast Paper
M. Barnett, S. Gupta, D. Payne, L. Shuler, R. van de Geijn, and J. Watts, "Interprocessor Collective Communication Library (InterCom)," Scalable High Performance Computing Conference 1994.

Matrix-vector Multiplication Materials

Output from matrix-vector example
SHPCC94 MV multiply paper
Manuscript on MV multiplication

Matrix-Matrix Multiplication Materials

NEW Almadena Chtchelkanova, John Gunnels, Greg Morrow, James Overfelt, Robert A. van de Geijn, "Parallel Implementation of BLAS: General Techniques for Level 3 BLAS," submitted to Concurrency: Practice and Experience

NEW Robert van de Geijn and Jerrell Watts, "SUMMA: Scalable Universal Matrix Multiplication Algorithm," Concurrency: Practice and Experience, to appear.

NEW Brian Grayson and Robert van de Geijn, "A High Performance Parallel Strassen Implementation," Parallel Processing Letters, to appear.

Choi et al.: PUMMA paper
Huss-Lederman, S., E. Jacobson, A. Tsao, & G. Zhang, "Matrix Multiplication on the Intel Touchstone DELTA," Technical Report SRC-TR-93-101 (revised), Supercomputing Research Center, (Feb. 1994, 21 pages). This is an expanded version of PRISM Working Note #7 and supersedes PRISM Working Note #11. Appeared in Concurrency: Practice and Experience, Vol. 6 (7), Oct. 1994, pp. 571-594.

Dense Factorization Materials

Dongarra, van de Geijn, and Walker, "Scalability Issues Affecting the Design of a Dense Linear Algebra Library," Journal of Parallel and Distributed Computing, Vol. 22, No. 3, Sept. 1994.
LAPACK LU factorization codes

dgetrf.f (level 3)
dgetf2.f (level 2)

Parallel LU factorization codes

pzlubr.f (level 3)
pzlur.f (level 2)

Eigenvalue Materials

Jack Dongarra and Robert van de Geijn, "Reduction to Condensed Form on Distributed Memory Architectures," LAPACK Working Note 30, in Parallel Computing, 18, pp. 973-982, 1992.
J. Choi, J. Dongarra, and D. Walker, "The Design of a Parallel Dense Linear Algebra Software Library: Reduction to Hessenberg, Tridiagonal, and Bidiagonal Form," UT, CS-95-275, February 1995.