In the implementations in Section , an algorithmic blocking size is passed as a parameter to the parallel matrix-matrix multiplication routines. A natural question is what value this parameter should take. Notice that the matrix-matrix multiplication examples all use one of the following basic operations: panel-panel update (rank-k update), matrix-panel multiply, or panel-matrix multiply. Whatever blocking size makes these operations optimal can therefore be expected to yield fast implementations of matrix-matrix multiply. In general, all level-3 BLAS can be implemented using these basic operations and their equivalents that operate on only the upper or lower triangular portion of the matrix: symmetric panel-panel update (symmetric rank-k update), triangular matrix-panel multiply, and panel-triangular matrix multiply. Thus, we provide an environment inquiry routine that, given which of these operations underlies the algorithm being implemented, returns a suggested algorithmic blocking size.
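To make the idea of such an inquiry concrete, the following is a minimal sketch in C of a lookup keyed by the six basic operations listed above. It is not the library's routine: the names `basic_op`, `inquire_block_size`, and `block_size_for_rank_k_update` are hypothetical, and the table entries are placeholder values that would have to be tuned for a given machine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags, one per basic operation named in the text. */
typedef enum {
    OP_PANEL_PANEL,       /* panel-panel update (rank-k update)      */
    OP_MATRIX_PANEL,      /* matrix-panel multiply                   */
    OP_PANEL_MATRIX,      /* panel-matrix multiply                   */
    OP_SYM_PANEL_PANEL,   /* symmetric panel-panel update            */
    OP_TRI_MATRIX_PANEL,  /* triangular matrix-panel multiply        */
    OP_PANEL_TRI_MATRIX,  /* panel-triangular matrix multiply        */
    OP_COUNT              /* number of tags, not an operation itself */
} basic_op;

/* Placeholder block sizes only (assumption): real values are
   machine dependent and would come from tuning experiments. */
static const int suggested_nb[OP_COUNT] = { 64, 64, 64, 64, 64, 64 };

/* Store the suggested algorithmic blocking size for `op` in *nb.
   Returns 0 on success, -1 on an invalid operation or null pointer. */
int inquire_block_size(basic_op op, int *nb)
{
    if (nb == NULL || (int)op < 0 || op >= OP_COUNT)
        return -1;
    *nb = suggested_nb[op];
    return 0;
}

/* Usage: an algorithm built on rank-k updates asks once, then passes
   the result on as its algorithmic blocking size. */
int block_size_for_rank_k_update(void)
{
    int nb = 0;
    int rc = inquire_block_size(OP_PANEL_PANEL, &nb);
    assert(rc == 0);
    (void)rc;  /* silence unused-variable warning when NDEBUG is set */
    return nb;
}
```

Keying the inquiry on the underlying operation rather than on the calling algorithm means that a new algorithm built from the same basic operations can reuse an existing entry without any retuning.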
Currently, the input parameter operation can take on the values