Imagine the case where a node has computed a vector of data that must be added to a linear algebra object, e.g. all or part of a column of a matrix. There are a number of possibilities:
- The given node may own the target entries in the linear algebra object. In this case, a local operation suffices.
- Another node owns the target entries in the linear algebra object. Now, communication is required before the data can be added to the linear algebra object. Moreover, if conventional MPI communication calls are used, the target node must be aware that the data is about to arrive.
- Many nodes own some of the target entries in the linear algebra object. Now a number of communications are required, and all the target nodes must be aware of the data that is about to arrive.
Clearly, considerable complexity results from managing the required communication. Add to this the distinct possibility that many or all nodes may be simultaneously generating vectors of data to be entered in linear algebra objects, and the code that generates the problems to be passed to PLAPACK or other parallel linear algebra packages can become unmanageable.
Once an application has entered the PLAPACK API-active state, a call to PLA_API_axpy_vector_to_global achieves the operation described above, while PLA_API_axpy_global_to_vector retrieves information from global objects.
We first discuss PLA_API_axpy_vector_to_global. On the calling node, the local vector is given by its size, in size, the address where it starts in memory, in local_vector, and the stride in memory between entries in the vector, in local_stride. The target object is given by obj, which can be of any object type compatible with a vector (unit global length or width). The displacement, in displ, gives the starting index in the target object where the data in the local vector is to be placed. The scaling parameter, in alpha, allows a multiple of the local data to be added. The operation performed is

    y( displ : displ+size-1 ) <- alpha * x + y( displ : displ+size-1 )
where y represents the target (global) object and x the local vector. (Notice that the call is like the ``axpy'' BLAS call.) The notation y( i:j ) is used to indicate the sub-vector of y starting at entry i and ending at entry j. The data type of the data at the address alpha is assumed to be the same as the data type of the local vector and the target object. Notice that the call is only performed on the node that owns the local vector. All nodes can simultaneously perform such calls to enter local vectors in global objects.
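As a concrete illustration, the following is a minimal sketch (not a definitive calling sequence) of one contribution phase. The arguments to PLA_API_axpy_vector_to_global follow the calling sequence that appears in the loop later in this chapter; the bracketing routines PLA_API_begin, PLA_Obj_API_open, PLA_Obj_API_close, and PLA_API_end are assumed to enter and leave the API-active state, and the names local_buf, x_global, my_size, and my_displ are illustrative.

    #include "PLA.h"    /* PLAPACK header */

    /* Sketch: every node executes this routine; a node that computed a
       segment of length my_size contributes it to the global vector object
       x_global, starting at global index my_displ.                         */
    void contribute_segment( PLA_Obj x_global, double *local_buf,
                             int my_size, int my_displ )
    {
      double alpha = 1.0;                  /* add the data unscaled          */

      PLA_API_begin();                     /* enter the API-active state     */
      PLA_Obj_API_open( x_global );        /* open the object for API calls  */

      if ( my_size > 0 )
        /* x_global( my_displ : my_displ+my_size-1 ) += alpha * local_buf    */
        PLA_API_axpy_vector_to_global( my_size, &alpha, local_buf, 1,
                                       x_global, my_displ );

      PLA_Obj_API_close( x_global );       /* data guaranteed in place here  */
      PLA_API_end();                       /* leave the API-active state     */
    }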
The operation performed by PLA_API_axpy_global_to_vector reverses the above. Now y equals the local vector and x the global object, so that the operation performed becomes

    y( 0 : size-1 ) <- alpha * x( displ : displ+size-1 ) + y( 0 : size-1 )
Again, only the node that owns the local vector makes the call and all nodes can make similar calls simultaneously.
It is important to realize that the above calls only initiate the operation. The data cannot be assumed to have arrived in the target until the object is closed, or a call to PLA_Obj_API_sync, to be discussed later in this chapter, is performed for the given global object.
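The sketch below illustrates the reverse direction together with the synchronization just described. It assumes the calling node is already in the API-active state with x_global open; the argument order of PLA_API_axpy_global_to_vector (global object and displacement first, then local buffer and stride) is an assumption made to mirror PLA_API_axpy_vector_to_global and should be checked against the PLAPACK headers.

    /* Sketch: gather "size" entries of the global vector x_global, starting
       at global index displ, into local_buf on the calling node.  Assumes
       the API-active state has been entered and x_global has been opened.  */
    void gather_segment( PLA_Obj x_global, double *local_buf,
                         int size, int displ )
    {
      double alpha = 1.0;
      int    i;

      /* The operation is y <- alpha * x( displ : displ+size-1 ) + y, so
         clear the local buffer to receive a plain copy of the entries.     */
      for ( i=0; i<size; i++ ) local_buf[ i ] = 0.0;

      /* Argument order assumed to mirror PLA_API_axpy_vector_to_global.    */
      PLA_API_axpy_global_to_vector( size, &alpha, x_global, displ,
                                     local_buf, 1 );

      /* The call only initiates the transfer; the data cannot be assumed
         to be in local_buf until the object is closed or a sync is done.   */
      PLA_Obj_API_sync( x_global );
    }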
Often, a locally computed vector contributes to a given object, but not as a contiguous sub-vector. For this, PLAPACK provides a call that contributes multiple sub-vectors in one operation. PLAPACK also provides the mirror operation, gathering a number of sub-vectors from a global object into a local vector.
Notice that the code segment

    int nsub, *sizes, *displs;
    double *y, alpha;
    PLA_Obj obj = NULL;
    <...>
    PLA_API_multi_axpy_vector_to_global( nsub, sizes, &alpha, y, 1,
                                         obj, displs );
    <...>

is equivalent to the loop

    int nsub, *sizes, *displs, i, local_displ;
    double *y, alpha;
    PLA_Obj obj = NULL;
    <...>
    local_displ = 0;
    for ( i=0; i<nsub; i++ ) {
      PLA_API_axpy_vector_to_global( sizes[i], &alpha, &y[ local_displ ], 1,
                                     obj, displs[i] );
      local_displ += sizes[i];
    }
    <...>
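To make the multi-sub-vector call concrete, here is a hypothetical sketch in which a node contributes five locally computed values destined for two disjoint ranges of a global vector x_global. The sizes, displacements, and names are illustrative, and the API-active state and object open/close are assumed to be handled as in the earlier sketches.

    /* Sketch: contribute two non-contiguous sub-vectors with a single call.
       Entries y[0..2] go to x_global( 3:5 ), entries y[3..4] go to
       x_global( 10:11 ).  All values and names are illustrative.           */
    void contribute_two_segments( PLA_Obj x_global )
    {
      double y[ 5 ]      = { 1.0, 2.0, 3.0, 4.0, 5.0 };  /* local data       */
      int    sizes[ 2 ]  = { 3, 2 };     /* lengths of the two sub-vectors   */
      int    displs[ 2 ] = { 3, 10 };    /* global starting indices          */
      int    nsub        = 2;
      double alpha       = 1.0;

      PLA_API_multi_axpy_vector_to_global( nsub, sizes, &alpha, y, 1,
                                           x_global, displs );
    }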