
Example: Parallelizing "axpy" for Vector Objects

Given two global vectors x and y in objects x and y, and a scaling factor $\alpha$ in object alpha, we wish to compute the scaled addition $y \leftarrow \alpha x + y$.

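If both x and y are of object type vector and are aligned identically to the template, and alpha is a multiscalar duplicated to all nodes, then the result can be computed by having each node update its own portion of y: $y_i \leftarrow \alpha x_i + y_i$, where $x_i$ and $y_i$ denote the subvectors of x and y assigned to node i.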

Thus, in this situation it suffices to call the local axpy routine on the local data.
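The aligned case can be sketched with a small serial simulation. This is only an illustration in plain Python, not the library's interface: the "nodes" are list entries, and the names local_axpy, split, and aligned_axpy are hypothetical helpers introduced here. It assumes x and y are cut into pieces of identical sizes and that every node holds its own copy of the scaling factor.

```python
def local_axpy(alpha, x_local, y_local):
    """Update y_local in place: y_local <- alpha * x_local + y_local."""
    for i, xi in enumerate(x_local):
        y_local[i] += alpha * xi


def split(vec, sizes):
    """Cut a global vector into consecutive pieces, one per simulated node."""
    pieces, start = [], 0
    for n in sizes:
        pieces.append(list(vec[start:start + n]))
        start += n
    return pieces


def aligned_axpy(alpha_copies, x_parts, y_parts):
    """Each simulated node touches only its own alpha, x piece, and y piece."""
    for alpha, x_local, y_local in zip(alpha_copies, x_parts, y_parts):
        local_axpy(alpha, x_local, y_local)


# Assumed layout: three nodes owning 2, 3, and 1 entries respectively.
sizes = [2, 3, 1]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
alpha = 2.0

x_parts = split(x, sizes)                 # x and y use the same partition,
y_parts = split(y, sizes)                 # i.e. they are "aligned"
alpha_copies = [alpha] * len(sizes)       # scalar duplicated to all nodes

aligned_axpy(alpha_copies, x_parts, y_parts)
result = [v for part in y_parts for v in part]
```

Because the pieces line up, no data moves between nodes: concatenating the updated local pieces gives the same vector as applying axpy to the global vectors.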

If x and y are not distributed as vectors and/or $\alpha$ is not available on all nodes, one approach is to create two temporary vectors that are aligned and a multiscalar duplicated to all nodes, copy x, y, and $\alpha$ into the temporary vectors and multiscalar, and execute the procedure described above for aligned vectors. The code would look similar to that given in the figure for the parallel inner product.

We instead illustrate an alternative approach, which redistributes the object x like the output object y before the local operations proceed (see the accompanying figure). In the code given in that figure, $\alpha$, x, and y are passed to the routine as objects alpha, x, and y. The first is a multiscalar. The last two we assume are vectors, possibly projected and/or duplicated. We first create a temporary object, temp_x, of object type vector and distributed like the output object y. We also create temp_alpha, a multiscalar duplicated to all nodes, so that every node has a local copy of the scaling factor. Next, we copy x to temp_x and alpha to temp_alpha, after which a local axpy operation on each node completes the computation.
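The redistribution approach can be mimicked in the same serial style. Again this is a plain Python sketch under stated assumptions, not the code from the figure and not the library's routines: redistribute_like and duplicate_to_all are hypothetical stand-ins for the copy operations, and the communication they would require is replaced by ordinary list manipulation. Only the names temp_x and temp_alpha are taken from the text.

```python
def redistribute_like(src_parts, target_parts):
    """Return a copy of the source data cut into pieces sized like target_parts."""
    flat = [v for part in src_parts for v in part]   # stands in for communication
    if len(flat) != sum(len(part) for part in target_parts):
        raise ValueError("x and y must have the same global length")
    temp, start = [], 0
    for part in target_parts:
        temp.append(flat[start:start + len(part)])
        start += len(part)
    return temp


def duplicate_to_all(value, num_nodes):
    """Give every simulated node its own copy of a scalar."""
    return [value] * num_nodes


def axpy_redistributed(alpha, x_parts, y_parts):
    """Compute y <- alpha * x + y when x is not laid out like y."""
    temp_x = redistribute_like(x_parts, y_parts)          # copy x to temp_x
    temp_alpha = duplicate_to_all(alpha, len(y_parts))    # copy alpha to temp_alpha
    for a, x_local, y_local in zip(temp_alpha, temp_x, y_parts):
        for i, xi in enumerate(x_local):                  # local axpy on each node
            y_local[i] += a * xi
    return y_parts


# Assumed layouts: x is split 1/3/2 across three nodes, y is split 2/3/1.
x_parts = [[1.0], [2.0, 3.0, 4.0], [5.0, 6.0]]
y_parts = [[10.0, 20.0], [30.0, 40.0, 50.0], [60.0]]

axpy_redistributed(0.5, x_parts, y_parts)
```

Only x and the scalar are copied. The output y is updated where it already lives, which is the point of distributing temp_x like y rather than moving both vectors.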


rvdg@cs.utexas.edu