The above algorithm generalizes in a straightforward manner
to ,
where x and y can have any valid vector distribution,
including projected and/or duplicated.
As with the matrix-vector multiply,
some care must be taken in creating xdup and ydup.
Notice that xdup must be aligned with the columns of
a, while ydup must be aligned with the rows
of a.
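The alignment requirement can be illustrated without any library calls. The following is a minimal sketch in plain C, not PLAPACK code. It assumes a simple contiguous block distribution over an nprows x npcols mesh, which is a simplification and not necessarily the distribution the library uses. It also reads "aligned with the columns of a" as "entry c of xdup lives wherever column c of a lives", and likewise for ydup and the rows. All function names (block_owner, holds_a, holds_xdup, holds_ydup, alignment_is_local, copies_of_xdup_entry) are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper (not a PLAPACK routine): which of p mesh rows (or
   mesh columns) owns global index g when n indices are split into
   contiguous, nearly equal blocks. */
static int block_owner(int g, int n, int p)
{
    assert(p > 0 && g >= 0 && g < n);
    int base  = n / p;              /* minimum block size               */
    int extra = n % p;              /* first `extra` owners hold base+1 */
    int cut   = extra * (base + 1); /* first index owned by a short block */
    if (g < cut)
        return g / (base + 1);
    return extra + (g - cut) / base;
}

/* Node (i, j) stores a(r, c) iff row r maps to mesh row i and
   column c maps to mesh column j. */
static int holds_a(int r, int c, int m, int n,
                   int i, int j, int nprows, int npcols)
{
    return block_owner(r, m, nprows) == i &&
           block_owner(c, n, npcols) == j;
}

/* xdup is aligned with the columns of a and duplicated over all mesh
   rows: every node in mesh column j holds entry c if column c maps to j. */
static int holds_xdup(int c, int n, int j, int npcols)
{
    return block_owner(c, n, npcols) == j;
}

/* ydup is aligned with the rows of a and duplicated over all mesh
   columns: every node in mesh row i holds entry r if row r maps to i. */
static int holds_ydup(int r, int m, int i, int nprows)
{
    return block_owner(r, m, nprows) == i;
}

/* Returns 1 if, on every node, each locally stored a(r, c) has the
   matching xdup entry c and ydup entry r stored locally as well. */
int alignment_is_local(int m, int n, int nprows, int npcols)
{
    for (int i = 0; i < nprows; i++)
        for (int j = 0; j < npcols; j++)
            for (int r = 0; r < m; r++)
                for (int c = 0; c < n; c++)
                    if (holds_a(r, c, m, n, i, j, nprows, npcols) &&
                        !(holds_xdup(c, n, j, npcols) &&
                          holds_ydup(r, m, i, nprows)))
                        return 0;
    return 1;
}

/* Counts the nodes that hold entry c of xdup; with duplication over all
   mesh rows this equals nprows. */
int copies_of_xdup_entry(int c, int n, int nprows, int npcols)
{
    int count = 0;
    for (int i = 0; i < nprows; i++)
        for (int j = 0; j < npcols; j++)
            if (holds_xdup(c, n, j, npcols))
                count++;
    return count;
}
```

Under these assumptions, calling alignment_is_local(7, 5, 2, 3) checks a 7 x 5 matrix on a 2 x 3 mesh. Once the duplicated vectors are aligned this way, any local computation that combines a(r, c) with entry c of xdup and entry r of ydup needs no further communication.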
Creating xdup and ydup is now accomplished through the calls

    PLA_Pvector_create_conf_to( a, PLA_PROJ_ONTO_ROW, PLA_ALL_ROWS, &xdup );
    PLA_Pvector_create_conf_to( a, PLA_PROJ_ONTO_COL, PLA_ALL_COLS, &ydup );

After this, all required communication and alignment are again hidden in the PLA_Copy routines. A code that generalizes even further, implementing the full functionality of the sequential