We postulate that one should never start by considering how to decompose the matrix. Rather, one should start by considering how to decompose the physical problem to be solved. Notice that it is the elements of vectors that are typically associated with data of physical significance and it is therefore their distribution to nodes that is directly related to the distribution of the problem to be solved. A matrix (discretized operator) merely represents the relation between two vectors (discretized spaces):
Since it is more natural to start with distributing the problem to nodes, we partition x and y , and assign portions of these vectors to nodes. The matrix A should then be distributed to nodes in a fashion consistent with the distribution of the vectors, as we shall show next. We will call a matrix distribution physically based if the layout of the vectors which induce the distribution of A to nodes is consistent with where an application would naturally want them. We will use the abbreviation PBMD for Physically Based Matrix Distribution.
As discussed, we must start by describing the distribution of
the vectors, x and y , to nodes, after which
we will show how the matrix distribution is induced
(derived) by the vector distribution.
Let and
be permutations so that
Here and
are the permutations that order the elements of
x and y , respectively,
that are to be assigned to the first node first,
then the ones assigned to the second node, and so forth.
Thus if the nodes are labeled
,
and
are assigned to
.
Notice that the above discussion links the linear algebra
object ``vector'' to a mapping to the nodes.
In most other approaches to matrix distribution, vectors
appear as special cases of matrices, or as somehow linked
to the rows and columns of matrices, after the distribution
of matrices is already specified.
We will also link rows and columns of matrices to vectors,
but only after the distribution of the vectors has
been determined, as prescribed by the application.
We again emphasize that this
means we inherently start with the (discretized) physical problem,
rather than the (discretized) operator.
Next, we partition matrix A conformally:
Notice that
and thus
This exposes a natural tie between
sub-vectors of and corresponding blocks of columns of
.
Also
so
there is a natural tie between
sub-vectors of and corresponding blocks of rows of
.
It has been well documented [, ] that for
scalability reasons, it is often important to assign matrices to nodes
of a distributed memory architecture using a so-called
two-dimensional
matrix distribution.
To do so, the p = rc nodes are viewed as a logical
mesh,
, with
and
. This requires us to decide how
to distribute the sub-vectors to the two-dimensional mesh.
We will assume this is done in column-major order.
(Notice that by appropriate choice of
and
, this
can always be enforced.)
The distribution of A is now
induced by requiring block of columns
to be assigned to the same column of nodes
as sub-vector
, and block of rows
to the same
row of nodes as sub-vector
(and thus
).
This is illustrated in
Figure 1.1.
Often, for load balancing reasons, it becomes important to over-decompose the vectors and matrices, and wrap the sub-blocks onto the node mesh. This can be described by now partitioning x and y so that
where N >> p . Partitioning A conformally yields the blocked matrix
An explicitly
two-dimensional wrapped distribution for the matrix
can now be obtained by
wrapping the blocks of x and y , which induces
a wrapping in the distribution of A , as illustrated
in
Figure 1.2.
Again, the sub-vectors are assigned to the node mesh in column-major
order, wrapping as necessary.
As before, the distribution of A is now
induced by requiring block of columns to be assigned to the same column of nodes
as sub-vector
, and block or rows
to the same
row of nodes as sub-vector
(and thus
).
This is also illustrated in
Figure 1.2.
Notice that the wrapping of the vectors induces a tight wrapping
of blocks of rows of
, and a looser wrapping (by a factor of r ) of the
blocks of columns of
.
We emphasize that the distributions described in
Figures
1.1
and
1.2
may appear to be special cases
of Physically Based Matrix Distribution, since
corresponding sub-blocks of x and y are assigned to the
same node. Notice that by choosing in Equation 1.2,
a general distribution can be attained [].
Having just stated that we can achieve total generality, we need to also clearly state that in the initial implementation of PLAPACK, we actually only implement a special case: