Given two global vectors $x$ and $y$ in objects x and y,
and scaling factor $\alpha$ in object alpha,
we wish to compute the scaled addition $y \leftarrow \alpha x + y$.
If both x and y are of
object type vector and they are aligned identically to
the template, and alpha is a multiscalar
duplicated to all nodes,
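then the result can be computed independently on each node: every node updates its local part of $y$ with $\alpha$ times its local part of $x$.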
Thus, in this situation it suffices to call the local axpy routine on the local data.
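
The aligned case can be sketched as a toy, single-process simulation in plain C. This is not library code: the function names local_axpy and aligned_axpy, the contiguous block distribution, and the way nodes are modeled as index ranges are all assumptions made for illustration. The point it shows is that, when x and y share a distribution, each node touches only its own index range and no data moves between nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Local axpy on one node: y_loc <- alpha * x_loc + y_loc. */
void local_axpy(double alpha, const double *x_loc, double *y_loc, size_t n_loc)
{
    for (size_t i = 0; i < n_loc; i++)
        y_loc[i] += alpha * x_loc[i];
}

/* Simulate p nodes that each own one contiguous block of the global
   vectors (a hypothetical distribution chosen for this sketch).
   Because x and y are aligned identically, node k owns the same index
   range of both vectors, so a purely local update is sufficient.
   Passing alpha by value to every local call models a scaling factor
   that is duplicated to all nodes. */
void aligned_axpy(double alpha, const double *x, double *y, size_t n, size_t p)
{
    assert(p > 0);
    size_t block = (n + p - 1) / p;          /* entries per node, rounded up */
    for (size_t node = 0; node < p; node++) {
        size_t first = node * block;          /* first global index on this node */
        if (first >= n)
            break;                            /* remaining nodes own nothing */
        size_t n_loc = (n - first < block) ? n - first : block;
        local_axpy(alpha, x + first, y + first, n_loc);
    }
}
```

Calling aligned_axpy with n = 5 and p = 2 gives the first node three entries and the second node two; the result is the same as a single sequential axpy over all five entries.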
If x and y are not distributed as vectors
and/or $\alpha$ is not available on all nodes,
one approach is to create two temporary vectors
that are aligned and a multiscalar duplicated
to all nodes, copy $x$, $y$, and $\alpha$ into the
temporary vectors and multiscalar, and execute the procedure described above
for aligned vectors. The code would look similar to that given in the figure
for the parallel inner product.
We instead illustrate an alternative to the above approach,
which redistributes the object x like
the output object y before local operations proceed
(see the accompanying figure).
In the code given in that figure,
$\alpha$, $x$, and $y$ are passed to the routine
as objects alpha, x, and y.
The first is a multiscalar.
The last two are assumed to be vectors, possibly projected
and/or duplicated.
We now create a temporary object,
temp_x, of object
type vector and distributed like the output object y.
We also create temp_alpha, a
multiscalar duplicated
to all nodes, so that all nodes will have a local
copy of the scaling factor.
Next, we copy x to temp_x
and alpha to temp_alpha, after which a local
axpy operation on temp_alpha, temp_x, and y completes the computation.
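
The redistribution steps can be illustrated with another toy, single-process simulation in plain C. It is a sketch under stated assumptions, not the code from the figure and not the library's interface: the two-node setup, the cyclic layout assumed for x, the block layout assumed for y, and the names copy_cyclic_to_block and redistributed_axpy are all hypothetical. Only temp_x and temp_alpha follow the naming used in the text.

```c
#include <assert.h>
#include <stddef.h>

enum { NODES = 2, MAXLOC = 4 };   /* hypothetical toy sizes */

/* Local axpy on one node: y_loc <- alpha * x_loc + y_loc. */
void local_axpy(double alpha, const double *x_loc, double *y_loc, size_t n_loc)
{
    for (size_t i = 0; i < n_loc; i++)
        y_loc[i] += alpha * x_loc[i];
}

/* Copy x, stored cyclically (global index i lives on node i % NODES),
   into temp_x, stored in blocks like y (index i lives on node i / block).
   On a real machine this step is where communication would happen. */
void copy_cyclic_to_block(double x_cyc[NODES][MAXLOC],
                          double temp_x[NODES][MAXLOC],
                          size_t n, size_t block)
{
    for (size_t i = 0; i < n; i++)
        temp_x[i / block][i % block] = x_cyc[i % NODES][i / NODES];
}

/* y <- alpha * x + y, where x and y are distributed differently. */
void redistributed_axpy(double alpha,
                        double x_cyc[NODES][MAXLOC],
                        double y_blk[NODES][MAXLOC],
                        size_t n)
{
    assert(n <= (size_t)NODES * MAXLOC);
    size_t block = (n + NODES - 1) / NODES;      /* block size of y's layout */
    double temp_x[NODES][MAXLOC] = {{0}};        /* temporary, laid out like y */
    double temp_alpha[NODES];                    /* one copy of alpha per node */

    copy_cyclic_to_block(x_cyc, temp_x, n, block);   /* copy x to temp_x */
    for (size_t node = 0; node < NODES; node++)      /* copy alpha to temp_alpha */
        temp_alpha[node] = alpha;

    /* Everything is now aligned with y, so each node works locally. */
    for (size_t node = 0; node < NODES; node++) {
        size_t first = node * block;
        if (first >= n)
            break;
        size_t n_loc = (n - first < block) ? n - first : block;
        local_axpy(temp_alpha[node], temp_x[node], y_blk[node], n_loc);
    }
}
```

Redistributing only x, rather than copying both vectors into aligned temporaries, means the result is written directly into y and never has to be copied back.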