Thursday, March 19, 2020

HPC Lecture Notes for 19/03/2020

6.6 Collective Communication and Computation Operations

MPI provides an extensive set of functions for performing many commonly used collective
communication operations. In particular, the majority of the basic communication operations
described in Chapter 4 are supported by MPI. All of the collective communication functions
provided by MPI take as an argument a communicator that defines the group of processes that
participate in the collective operation. All the processes that belong to this communicator
participate in the operation, and all of them must call the collective communication function.
Even though collective communication operations do not act like barriers (i.e., it is possible for
a process to go past its call to the collective communication operation even before other
processes have reached it), they act as a virtual synchronization step in the following sense: the
parallel program should be written such that it behaves correctly even if a global
synchronization is performed before and after the collective call. Since the operations are
virtually synchronous, they do not require tags. In some of the collective functions data is
required to be sent from a single process (source-process) or to be received by a single process
(target-process). In these functions, the source- or target-process is one of the arguments
supplied to the routines. All the processes in the group (i.e., communicator) must specify the
same source- or target-process. For most collective communication operations, MPI provides
two different variants. The first transfers equal-size data to or from each process, and the
second transfers data that can be of different sizes.
6.6.1 Barrier
The barrier synchronization operation is performed in MPI using the MPI_Barrier function.
int MPI_Barrier(MPI_Comm comm)
The only argument of MPI_Barrier is the communicator that defines the group of processes
that are synchronized. The call to MPI_Barrier returns only after all the processes in the group
have called this function.
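As a minimal sketch (the printed messages are only illustrative), the following program makes every process wait at the barrier before any of them proceeds:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("Process %d reached the barrier\n", rank);
    MPI_Barrier(MPI_COMM_WORLD);   /* returns only after all processes have called it */
    printf("Process %d passed the barrier\n", rank);

    MPI_Finalize();
    return 0;
}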
6.6.2 Broadcast
The one-to-all broadcast operation described in Section 4.1 is performed in MPI using the
MPI_Bcast function.
int MPI_Bcast(void *buf, int count, MPI_Datatype datatype,
int source, MPI_Comm comm)
MPI_Bcast sends the data stored in the buffer buf of process source to all the other processes
in the group. The data received by each process is stored in the buffer buf. The data that is
broadcast consists of count entries of type datatype. The amount of data sent by the source
process must be equal to the amount of data that is being received by each process; i.e., the
count and datatype fields must match on all processes.
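A small illustrative sketch: rank 0 is chosen (arbitrarily) as the source and broadcasts a single int to every process in MPI_COMM_WORLD.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        value = 42;   /* arbitrary value to broadcast; only the source sets it */

    /* Every process, including the source, passes the same count and datatype. */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d now holds value = %d\n", rank, value);
    MPI_Finalize();
    return 0;
}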
6.6.3 Reduction
The all-to-one reduction operation described in Section 4.1 is performed in MPI using the
MPI_Reduce function.
int MPI_Reduce(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, int target,
MPI_Comm comm)
MPI_Reduce combines the elements stored in the buffer sendbuf of each process in the group,
using the operation specified in op , and returns the combined values in the buffer recvbuf of
the process with rank target. Both sendbuf and recvbuf must contain count items of type
datatype. Note that all processes must provide a recvbuf array, even if
they are not the target of the reduction operation. When count is more than one, then the
combine operation is applied element-wise on each entry of the sequence. All the processes
must call MPI_Reduce with the same value for count , datatype , op , target , and comm .
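As a small illustration, the sketch below sums the ranks of all processes onto rank 0; the choice of MPI_SUM as the operation and of rank 0 as the target is arbitrary.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, sum = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process contributes its rank; rank 0 receives the global sum.
       Non-target processes still pass a receive buffer, which is simply ignored. */
    MPI_Reduce(&rank, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of all ranks = %d\n", sum);

    MPI_Finalize();
    return 0;
}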
MPI provides a list of predefined operations that can be used to combine the elements stored in
sendbuf . MPI also allows programmers to define their own operations, which is not covered in
this book. The predefined operations are shown in Table 6.3 . For example, in order to compute
the maximum of the elements stored in sendbuf , the MPI_MAX value must be used for the op
argument. Not all of these operations can be applied to all possible data-types supported by
MPI. For example, a bit-wise OR operation (i.e., op = MPI_BOR ) is not defined for real-valued
data-types such as MPI_FLOAT and MPI_REAL . The last column of Table 6.3 shows the various
data-types that can be used with each operation.
Table 6.3. Predefined reduction operations.

Operation     Meaning                          Datatypes
MPI_MAX       Maximum                          C integers and floating point
MPI_MIN       Minimum                          C integers and floating point
MPI_SUM       Sum                              C integers and floating point
MPI_PROD      Product                          C integers and floating point
MPI_LAND      Logical AND                      C integers
MPI_BAND      Bit-wise AND                     C integers and byte
MPI_LOR       Logical OR                       C integers
MPI_BOR       Bit-wise OR                      C integers and byte
MPI_LXOR      Logical XOR                      C integers
MPI_BXOR      Bit-wise XOR                     C integers and byte
MPI_MAXLOC    Maximum value and its location   Data-pairs
MPI_MINLOC    Minimum value and its location   Data-pairs
The operation MPI_MAXLOC combines pairs of values (v_i, l_i) and returns the pair (v, l) such
that v is the maximum among all the v_i and l is the smallest among all the l_i for which v_i = v.
Similarly, MPI_MINLOC combines pairs of values and returns the pair (v, l) such that v is the
minimum among all the v_i and l is the smallest among all the l_i for which v_i = v. One possible
application of MPI_MAXLOC or MPI_MINLOC is to compute the maximum or minimum of a list of
numbers each residing on a different process and also the rank of the first process that stores
this maximum or minimum, as illustrated in Figure 6.6 . Since both MPI_MAXLOC and
MPI_MINLOC require datatypes that correspond to pairs of values, a new set of MPI datatypes
have been defined as shown in Table 6.4 . In C, these datatypes are implemented as structures
containing the corresponding types.
Figure 6.6. An example use of the MPI_MINLOC and MPI_MAXLOC operators.
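A possible sketch of the computation illustrated in Figure 6.6: each process contributes an arbitrary local integer paired with its own rank, and rank 0 learns the maximum value together with the lowest rank that holds it. The pair is laid out as two consecutive ints so that it matches the MPI_2INT datatype from Table 6.4.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank;
    struct { int value; int rank; } local, global;   /* layout matches MPI_2INT */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local.value = (rank * 37) % 11;   /* arbitrary per-process number */
    local.rank  = rank;               /* the "location" part of the pair */

    /* Rank 0 receives the maximum value and the lowest rank that holds it. */
    MPI_Reduce(&local, &global, 1, MPI_2INT, MPI_MAXLOC, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("max = %d, first held by rank %d\n", global.value, global.rank);

    MPI_Finalize();
    return 0;
}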
Table 6.4. MPI datatypes for data-pairs used with the MPI_MAXLOC and MPI_MINLOC reduction operations.

MPI Datatype          C Datatype
MPI_2INT              pair of ints
MPI_SHORT_INT         short and int
MPI_LONG_INT          long and int
MPI_LONG_DOUBLE_INT   long double and int
MPI_FLOAT_INT         float and int
MPI_DOUBLE_INT        double and int

When the result of the reduction operation is needed by all the processes, MPI provides the
MPI_Allreduce operation that returns the result to all the processes. This function provides the
functionality of the all-reduce operation described in Section 4.3.

int MPI_Allreduce(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
Note that there is no target argument since all processes receive the result of the operation.
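A brief sketch: every process learns the global maximum of an arbitrary local value, so no target rank appears in the call.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, nprocs, global_max;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int local = (rank * 7) % nprocs;   /* arbitrary local value */

    /* Every process receives the maximum; no target argument is needed. */
    MPI_Allreduce(&local, &global_max, 1, MPI_INT, MPI_MAX, MPI_COMM_WORLD);

    printf("Process %d sees global max = %d\n", rank, global_max);
    MPI_Finalize();
    return 0;
}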
6.6.4 Prefix
The prefix-sum operation described in Section 4.3 is performed in MPI using the MPI_Scan
function.
int MPI_Scan(void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)
MPI_Scan performs a prefix reduction of the data stored in the buffer sendbuf at each process
and returns the result in the buffer recvbuf . The receive buffer of the process with rank i will
store, at the end of the operation, the reduction of the send buffers of the processes whose
ranks range from 0 up to and including i . The type of supported operations (i.e., op ) as well as
the restrictions on the various arguments of MPI_Scan are the same as those for the reduction
operation MPI_Reduce .
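For instance, if every process contributes the value 1 under MPI_SUM, the prefix reduction leaves i + 1 in the receive buffer of the process with rank i, as in this small sketch:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, prefix;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int one = 1;
    /* Rank i receives the sum of the send buffers of ranks 0..i, i.e., i + 1. */
    MPI_Scan(&one, &prefix, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    printf("Process %d: prefix sum = %d\n", rank, prefix);
    MPI_Finalize();
    return 0;
}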
6.6.5 Gather
The gather operation described in Section 4.4 is performed in MPI using the MPI_Gather
function.
int MPI_Gather(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, int target, MPI_Comm comm)
Each process, including the target process, sends the data stored in the array sendbuf to the
target process. As a result, if p is the number of processes in the communicator comm, the
target process receives a total of p buffers. The data are stored in the array recvbuf of the
target process in rank order; that is, the data from the process with rank i are stored in recvbuf
starting at location i * sendcount (assuming that the array recvbuf is of the same type as
recvdatatype).
The data sent by each process must be of the same size and type. That is, MPI_Gather must be
called with the sendcount and senddatatype arguments having the same values at each
process. The information about the receive buffer (its length and type) applies only to the target
process and is ignored by all the other processes. The argument recvcount specifies the
number of elements received from each process, not the total number of elements received.
So, recvcount must be the same as sendcount, and their datatypes must match.
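A small sketch: each process contributes its rank, and rank 0 (the assumed target) gathers them in rank order. Only the target allocates the receive buffer.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *all = NULL;
    if (rank == 0)                       /* recvbuf matters only on the target */
        all = malloc(nprocs * sizeof(int));

    /* Each process sends one int; rank 0 stores them in rank order. */
    MPI_Gather(&rank, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        for (int i = 0; i < nprocs; i++)
            printf("all[%d] = %d\n", i, all[i]);
        free(all);
    }
    MPI_Finalize();
    return 0;
}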
MPI also provides the MPI_Allgather function in which the data are gathered to all the
processes and not only at the target process.
int MPI_Allgather(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, MPI_Comm comm)
The meanings of the various parameters are similar to those for MPI_Gather ; however, each
process must now supply a recvbuf array that will store the gathered data.
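A brief sketch of the all-gather variant: every process ends up with the full array of ranks, so every process allocates a receive buffer.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Unlike MPI_Gather, every process must provide a receive buffer. */
    int *all = malloc(nprocs * sizeof(int));
    MPI_Allgather(&rank, 1, MPI_INT, all, 1, MPI_INT, MPI_COMM_WORLD);

    printf("Process %d: last gathered rank = %d\n", rank, all[nprocs - 1]);
    free(all);
    MPI_Finalize();
    return 0;
}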
In addition to the above versions of the gather operation, in which the sizes of the arrays sent
by each process are the same, MPI also provides versions in which the size of the arrays can be
different. MPI refers to these operations as the vector variants. The vector variants of the
MPI_Gather and MPI_Allgather operations are provided by the functions MPI_Gatherv and
MPI_Allgatherv , respectively.
int MPI_Gatherv(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf,
int *recvcounts, int *displs,
MPI_Datatype recvdatatype, int target, MPI_Comm comm)
int MPI_Allgatherv(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf,
int *recvcounts, int *displs, MPI_Datatype recvdatatype,
MPI_Comm comm)
These functions allow a different number of data elements to be sent by each process by
replacing the recvcount parameter with the array recvcounts . The amount of data sent by
process i is equal to recvcounts[i] . Note that the size of recvcounts is equal to the size of
the communicator comm . The array parameter displs , which is also of the same size, is used
to determine where in recvbuf the data sent by each process will be stored. In particular, the
data sent by process i are stored in recvbuf starting at location displs[i] . Note that, as
opposed to the non-vector variants, the sendcount parameter can be different for different
processes.
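A sketch of the vector variant in which, by an arbitrary convention, process i contributes i + 1 elements; the target builds recvcounts and displs from that assumed pattern before calling MPI_Gatherv.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int sendcount = rank + 1;            /* process i contributes i + 1 elements */
    int senddata[sendcount];             /* C99 variable-length array */
    for (int j = 0; j < sendcount; j++)
        senddata[j] = rank;

    int *recvcounts = NULL, *displs = NULL, *recvbuf = NULL;
    if (rank == 0) {
        recvcounts = malloc(nprocs * sizeof(int));
        displs     = malloc(nprocs * sizeof(int));
        int total = 0;
        for (int i = 0; i < nprocs; i++) {
            recvcounts[i] = i + 1;       /* how much rank i will send */
            displs[i]     = total;       /* where its data starts in recvbuf */
            total        += recvcounts[i];
        }
        recvbuf = malloc(total * sizeof(int));
    }

    MPI_Gatherv(senddata, sendcount, MPI_INT,
                recvbuf, recvcounts, displs, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) { free(recvbuf); free(recvcounts); free(displs); }
    MPI_Finalize();
    return 0;
}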
6.6.6 Scatter
The scatter operation described in Section 4.4 is performed in MPI using the MPI_Scatter
function.
int MPI_Scatter(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, int source, MPI_Comm comm)
The source process sends a different part of the send buffer sendbuf to each process,
including itself. The data that are received are stored in recvbuf . Process i receives sendcount
contiguous elements of type senddatatype starting from the i * sendcount location of the
sendbuf of the source process (assuming that sendbuf is of the same type as senddatatype ).
MPI_Scatter must be called by all the processes with the same values for the sendcount ,
senddatatype , recvcount , recvdatatype , source , and comm arguments. Note again that
sendcount is the number of elements sent to each individual process.
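A small sketch: rank 0 (the assumed source) scatters one int to each process; the values 100 + i are arbitrary.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs, myvalue;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *senddata = NULL;
    if (rank == 0) {                     /* sendbuf matters only on the source */
        senddata = malloc(nprocs * sizeof(int));
        for (int i = 0; i < nprocs; i++)
            senddata[i] = 100 + i;       /* element i goes to process i */
    }

    MPI_Scatter(senddata, 1, MPI_INT, &myvalue, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Process %d received %d\n", rank, myvalue);
    if (rank == 0) free(senddata);
    MPI_Finalize();
    return 0;
}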
Similarly to the gather operation, MPI provides a vector variant of the scatter operation, called
MPI_Scatterv , that allows different amounts of data to be sent to different processes.
int MPI_Scatterv(void *sendbuf, int *sendcounts, int *displs,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, int source, MPI_Comm comm)
As we can see, the parameter sendcount has been replaced by the array sendcounts, which
determines the number of elements to be sent to each process. In particular, the source
process sends sendcounts[i] elements to process i . Also, the array displs is used to
determine where in sendbuf these elements will be sent from. In particular, if sendbuf is of the
same type as senddatatype , the data sent to process i start at location displs[i] of array
sendbuf . Both the sendcounts and displs arrays are of size equal to the number of processes
in the communicator. Note that by appropriately setting the displs array we can use
MPI_Scatterv to send overlapping regions of sendbuf .
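A sketch in which the source sends i + 1 elements to process i (an arbitrary pattern), with displs computed so the blocks are contiguous and non-overlapping.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *sendbuf = NULL, *sendcounts = NULL, *displs = NULL;
    if (rank == 0) {                     /* only the source fills sendbuf, sendcounts, displs */
        sendcounts = malloc(nprocs * sizeof(int));
        displs     = malloc(nprocs * sizeof(int));
        int total = 0;
        for (int i = 0; i < nprocs; i++) {
            sendcounts[i] = i + 1;       /* process i receives i + 1 elements */
            displs[i]     = total;       /* its block starts here in sendbuf */
            total        += sendcounts[i];
        }
        sendbuf = malloc(total * sizeof(int));
        for (int j = 0; j < total; j++)
            sendbuf[j] = j;              /* arbitrary data */
    }

    int recvcount = rank + 1;
    int recvbuf[recvcount];              /* C99 variable-length array */

    MPI_Scatterv(sendbuf, sendcounts, displs, MPI_INT,
                 recvbuf, recvcount, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) { free(sendbuf); free(sendcounts); free(displs); }
    MPI_Finalize();
    return 0;
}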
6.6.7 All-to-All
The all-to-all personalized communication operation described in Section 4.5 is performed in
MPI by using the MPI_Alltoall function.
int MPI_Alltoall(void *sendbuf, int sendcount,
MPI_Datatype senddatatype, void *recvbuf, int recvcount,
MPI_Datatype recvdatatype, MPI_Comm comm)
Each process sends a different portion of the sendbuf array to each other process, including
itself. Each process sends to process i sendcount contiguous elements of type senddatatype
starting from the i * sendcount location of its sendbuf array. The data that are received are
stored in the recvbuf array. Each process receives from process i recvcount elements of type
recvdatatype and stores them in its recvbuf array starting at location i * recvcount .
MPI_Alltoall must be called by all the processes with the same values for the sendcount ,
senddatatype , recvcount , recvdatatype , and comm arguments. Note that sendcount and
recvcount are the number of elements sent to, and received from, each individual process.
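A brief sketch with sendcount = recvcount = 1: element i of each process's sendbuf is delivered to process i, and afterwards recvbuf[i] holds the element received from process i.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *sendbuf = malloc(nprocs * sizeof(int));
    int *recvbuf = malloc(nprocs * sizeof(int));
    for (int i = 0; i < nprocs; i++)
        sendbuf[i] = rank * nprocs + i;  /* element i is destined for process i */

    /* After the call, recvbuf[i] holds the element that process i sent to us. */
    MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

    free(sendbuf); free(recvbuf);
    MPI_Finalize();
    return 0;
}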
MPI also provides a vector variant of the all-to-all personalized communication operation called
MPI_Alltoallv that allows different amounts of data to be sent to and received from each
process.
int MPI_Alltoallv(void *sendbuf, int *sendcounts, int *sdispls,
MPI_Datatype senddatatype, void *recvbuf, int *recvcounts,
int *rdispls, MPI_Datatype recvdatatype, MPI_Comm comm)
The parameter sendcounts is used to specify the number of elements sent to each process, and
the parameter sdispls is used to specify the location in sendbuf in which these elements are
stored. In particular, each process sends to process i , starting at location sdispls[i] of the
array sendbuf , sendcounts[i] contiguous elements. The parameter recvcounts is used to
specify the number of elements received by each process, and the parameter rdispls is used to
specify the location in recvbuf in which these elements are stored. In particular, each process
receives from process i recvcounts[i] elements that are stored in contiguous locations of
recvbuf starting at location rdispls[i] . MPI_Alltoallv must be called by all the processes
with the same values for the senddatatype , recvdatatype , and comm arguments.
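A sketch of the vector variant: by an arbitrary convention, process r sends r + 1 elements to every process, so on every receiver recvcounts[i] is i + 1. The displacement arrays are built so that the blocks are packed contiguously.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    int rank, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int *sendcounts = malloc(nprocs * sizeof(int));
    int *recvcounts = malloc(nprocs * sizeof(int));
    int *sdispls    = malloc(nprocs * sizeof(int));
    int *rdispls    = malloc(nprocs * sizeof(int));

    int stotal = 0, rtotal = 0;
    for (int i = 0; i < nprocs; i++) {
        sendcounts[i] = rank + 1;        /* this process sends rank + 1 elements to everyone */
        recvcounts[i] = i + 1;           /* and receives i + 1 elements from process i */
        sdispls[i] = stotal;  stotal += sendcounts[i];
        rdispls[i] = rtotal;  rtotal += recvcounts[i];
    }

    int *sendbuf = malloc(stotal * sizeof(int));
    int *recvbuf = malloc(rtotal * sizeof(int));
    for (int j = 0; j < stotal; j++)
        sendbuf[j] = rank;               /* arbitrary data */

    MPI_Alltoallv(sendbuf, sendcounts, sdispls, MPI_INT,
                  recvbuf, recvcounts, rdispls, MPI_INT, MPI_COMM_WORLD);

    free(sendbuf); free(recvbuf);
    free(sendcounts); free(recvcounts); free(sdispls); free(rdispls);
    MPI_Finalize();
    return 0;
}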

Introduction to OpenMP

Watch the videos below in sequence.






