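To derive a level-3 BLAS left-looking variant for computing the factorization, consider the partitioning
\[
\begin{pmatrix} A_{00} & \star & \star \\ A_{10} & A_{11} & \star \\ A_{20} & A_{21} & A_{22} \end{pmatrix}
=
\begin{pmatrix} \mathbf{L}_{00} & 0 & 0 \\ \mathbf{L}_{10} & L_{11} & 0 \\ \mathbf{L}_{20} & L_{21} & L_{22} \end{pmatrix}
\begin{pmatrix} \mathbf{L}_{00}^T & \mathbf{L}_{10}^T & \mathbf{L}_{20}^T \\ 0 & L_{11}^T & L_{21}^T \\ 0 & 0 & L_{22}^T \end{pmatrix}
\]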
The assumption is that bold-face parts of the lower triangular
matrix have already been computed, and have overwritten the
corresponding parts of $A$. The rest of the matrix has not
been updated at all, and the object of the next step is
to compute the next parts of the lower triangular matrix,
$L_{11}$ and $L_{21}$, overwriting the corresponding parts of $A$.
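From the above equation, we derive
\[
A_{11} = \mathbf{L}_{10} \mathbf{L}_{10}^T + L_{11} L_{11}^T, \qquad
A_{21} = \mathbf{L}_{20} \mathbf{L}_{10}^T + L_{21} L_{11}^T
\]
or
\[
L_{11} L_{11}^T = A_{11} - \mathbf{L}_{10} \mathbf{L}_{10}^T, \qquad
L_{21} = \left( A_{21} - \mathbf{L}_{20} \mathbf{L}_{10}^T \right) L_{11}^{-T}
\]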
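Using the above equations, the algorithm for the left-looking version of the Cholesky factorization can be given as follows:
1. Partition $A$ as above, where $A_{11}$ is $n_b \times n_b$ and $A_{00}$, $A_{10}$, $A_{20}$ already hold $\mathbf{L}_{00}$, $\mathbf{L}_{10}$, $\mathbf{L}_{20}$.
2. $A_{11} \leftarrow A_{11} - \mathbf{L}_{10} \mathbf{L}_{10}^T$ (symmetric rank-k update)
3. $A_{21} \leftarrow A_{21} - \mathbf{L}_{20} \mathbf{L}_{10}^T$ (matrix-matrix multiply)
4. $A_{11} \leftarrow L_{11} = \mathrm{Chol}( A_{11} )$
5. $A_{21} \leftarrow L_{21} = A_{21} L_{11}^{-T}$ (triangular solve with multiple right-hand sides)
6. Continue with the next block of columns until all of $A$ has been overwritten.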
The PLAPACK implementation using global level-3 BLAS is given in Figure 8.4. This time the bulk of the computation is in the update $A_{21} \leftarrow A_{21} - \mathbf{L}_{20} \mathbf{L}_{10}^T$, a matrix-panel multiply, so the algorithmic block size is obtained with the call
PLA_Environ_nb_alg( PLA_OP_MAT_PAN, template, &nb_alg );
Notice how the code reflects the algorithm described above, which could have been taken straight from a number of textbooks (e.g., []).