ITXGEMM: Performance


The latest performance numbers have moved to a separate page.

Platform (Robert's laptop):


Matrix-matrix multiplication

To check the performance of the ITXGEMM dgemm routine in isolation, we timed the ATLAS dgemm against our ITXGEMM dgemm.
Case timed: DGEMM( "N", "N", ... )
Comparing only the peak is for sissies. Let's compare all matrix sizes!
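To make the "all matrix sizes" idea concrete, here is a minimal sketch of a timing loop that sweeps over n and reports a rate for each size. It is not the driver used for the numbers on this page: `gemm_nn`, `mflops`, and `time_gemm` are hypothetical names, the multiply is a naive pure-Python stand-in for a tuned dgemm, and the rate assumes the conventional count of 2n^3 floating-point operations for an n x n multiply. The rates it prints will be far below those of any optimized library.

```python
import random
import time


def gemm_nn(a, b, c):
    """Naive C := C + A * B for square lists of lists (the "N", "N" case)."""
    n = len(a)
    for i in range(n):
        ai, ci = a[i], c[i]
        for p in range(n):
            aip, bp = ai[p], b[p]
            for j in range(n):
                ci[j] += aip * bp[j]
    return c


def mflops(n, seconds):
    """Rate for an n x n multiply, assuming 2*n^3 floating-point operations."""
    return 2.0 * n ** 3 / seconds / 1.0e6


def time_gemm(n, repeats=3):
    """Best-of-`repeats` wall-clock time, in seconds, for one n x n multiply."""
    rng = random.Random(0)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    best = float("inf")
    for _ in range(repeats):
        c = [[0.0] * n for _ in range(n)]
        start = time.perf_counter()
        gemm_nn(a, b, c)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Sweep a range of sizes instead of quoting a single peak number.
    for n in (20, 40, 80):
        t = time_gemm(n)
        print(f"n = {n:4d}   {mflops(n, t):8.2f} MFLOPS")
```

Taking the best of several repeats reduces noise from other activity on the machine; swapping `gemm_nn` for a call into a real BLAS would leave the rest of the loop unchanged.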

LU factorization with pivoting

To check the performance of the ITXGEMM dgemm routine in context, we wrote a blocked LU factorization with partial pivoting. The driver routine times the LU factorization and associated forward and backward substitution.
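For readers unfamiliar with the algorithm, the following is a rough sketch of a blocked LU factorization with partial pivoting, followed by the forward and backward substitution. It is not the code that was timed: `lu_blocked`, `lu_solve`, and the block size `nb` are hypothetical names, and everything is written as plain Python loops. The point is to show where the matrix-matrix multiply appears (step 3), which is the part a fast dgemm accelerates.

```python
def lu_blocked(a, nb=2):
    """Factor square `a` in place so that P*A = L*U; return the row permutation.

    L (unit lower triangular) and U overwrite `a`; `nb` is the block size.
    """
    n = len(a)
    piv = list(range(n))  # piv[i] = original index of the row now in position i
    for k in range(0, n, nb):
        kb = min(nb, n - k)
        # 1. Unblocked LU with partial pivoting on the panel a[k:n, k:k+kb].
        for j in range(k, k + kb):
            p = max(range(j, n), key=lambda i: abs(a[i][j]))
            if a[p][j] == 0.0:
                raise ZeroDivisionError("matrix is singular")
            if p != j:
                a[j], a[p] = a[p], a[j]  # swap whole rows
                piv[j], piv[p] = piv[p], piv[j]
            for i in range(j + 1, n):
                a[i][j] /= a[j][j]
                for c in range(j + 1, k + kb):
                    a[i][c] -= a[i][j] * a[j][c]
        # 2. Triangular solve: A12 := inv(L11) * A12.
        for j in range(k, k + kb):
            for i in range(j + 1, k + kb):
                for c in range(k + kb, n):
                    a[i][c] -= a[i][j] * a[j][c]
        # 3. Trailing update A22 := A22 - A21 * A12 (the matrix-matrix multiply).
        for i in range(k + kb, n):
            for p in range(k, k + kb):
                for c in range(k + kb, n):
                    a[i][c] -= a[i][p] * a[p][c]
    return piv


def lu_solve(lu, piv, b):
    """Solve A*x = b given the output of lu_blocked."""
    n = len(lu)
    x = [b[piv[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):  # forward substitution: L*y = P*b
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):  # backward substitution: U*x = y
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


if __name__ == "__main__":
    a = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
    b = [7.0, 19.0, 49.0]  # equals A * [1, 2, 3]
    piv = lu_blocked(a, nb=2)
    print(lu_solve(a, piv, b))
```

As the block size grows relative to the panel work, most of the arithmetic moves into step 3, which is why the speed of the underlying dgemm dominates the speed of the factorization for large n.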

Versions timed:


Performance:

Postscript graph
   n    ITX-FLAME    ATLAS    ATL-FLAME
 100      179.4      234.9
 200      328.7      289.9
 300      359.6      314.2
 400      377.7      337.3
 500      390.2      350.4
 600      402.4      365.5
 700      412.8      377.2
 800      411.4      384.0
 900      423.5      389.7
1000      430.9      398.2
1250      445.8      412.7
1500      454.6      421.3
2000      465.4      432.2
2500      474.0      441.0


How to do your own performance evaluation.

Note: ATLAS includes implementations of some LAPACK routines (e.g. dgetrf) as part of its library. Thus, for a fair comparison between ATLAS and ITXGEMM, you will need to order the libraries at link time as follows:


flame@cs.utexas.edu
Last Updated: Nov. 15, 2000