5.1 Install the MPI benchmark
https://github.com/LLNL/mpiBench
and measure the performance of collective operations as follows:
Run the default set of tests (on your VM cluster, repeat with 2, 4, and 8 processes):
mpirun -n 2 ./mpiBench
Run the benchmark over the given message-size range and iteration count for Alltoall, Scatter, Bcast, Allreduce, Allgather, and Barrier, saving the output for later analysis:
mpirun -n 2 ./mpiBench -b 32 -e 2K -i 100 Alltoall Scatter Bcast Allreduce Allgather Barrier > out.txt
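To automate the three cluster sizes, a small driver along these lines can run the benchmark once per size and keep one output file per run (a sketch: it assumes the mpiBench binary is in the current directory and reuses the flags above; the out.<n>.txt file names are illustrative):

# run_mpibench.py -- sketch: run mpiBench for 2, 4 and 8 processes
# and save one output file per run for crunch_mpiBench.
import subprocess

ops = ["Alltoall", "Scatter", "Bcast", "Allreduce", "Allgather", "Barrier"]
for n in (2, 4, 8):
    cmd = ["mpirun", "-n", str(n), "./mpiBench",
           "-b", "32", "-e", "2K", "-i", "100"] + ops
    with open(f"out.{n}.txt", "w") as f:   # e.g. out.2.txt, out.4.txt, out.8.txt
        subprocess.run(cmd, stdout=f, check=True)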
Use crunch_mpiBench to summarize the data:
crunch_mpiBench -op Alltoall,Scatter,Bcast,Allreduce,Allgather,Barrier out.txt
For each operation, plot the bandwidth for every tested buffer size, with the VM cluster size (2, 4, 8) on the x-axis.
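A plotting sketch (assuming matplotlib is available; the zero values are placeholders to be replaced with the bandwidths reported by crunch_mpiBench for each VM size and buffer size):

# plot_bw.py -- sketch: bandwidth vs. VM cluster size, one figure per operation.
import matplotlib.pyplot as plt

vm_sizes = [2, 4, 8]
# bw[operation][buffer size in bytes] -> one bandwidth value per VM size (MB/s).
# All zeros below are placeholders for your measured results.
bw = {
    "Alltoall": {32: [0, 0, 0], 2048: [0, 0, 0]},
    "Bcast":    {32: [0, 0, 0], 2048: [0, 0, 0]},
    # ... Scatter, Allreduce, Allgather (and time for Barrier) likewise
}

for op, series in bw.items():
    plt.figure()
    for size, values in series.items():
        plt.plot(vm_sizes, values, marker="o", label=f"{size} B")
    plt.xlabel("VM cluster size (processes)")
    plt.ylabel("Bandwidth (MB/s)")
    plt.title(op)
    plt.legend()
    plt.savefig(f"{op}_bw.png")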
5.2 Consider the matrix multiplication example below. How is the communication set up?
Modify the code to use block-style matrix multiplication as in the lecture.
Divide A, B, and C into blocks of size N/4 x N/4,
distribute the A and B blocks across the nodes, and let each process compute its block of C from the A and B blocks it currently holds.
The A and B blocks are then circulated. Assume the matrices are 4K x 4K (N = 4096) and filled with random numbers (NumPy arrays).
(https://edoras.sdsu.edu/~mthomas/sp17.605/lectures/MPI-MatMatMult.pdf)
Compare the running time with the original mpi4py example.
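A minimal sketch of the compute-then-circulate pattern, simplified to a 1D row-block ring: each of the P processes holds an N/P x N strip of A and B, multiplies the matching column slice of its A strip with the B strip it currently holds, then passes the B strip around the ring. Adapt the block shape to the N/4 x N/4 scheme from the lecture; the local random initialization and all names here are illustrative assumptions.

# block_matmul.py -- 1D ring block matrix multiplication sketch (mpi4py + NumPy).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
P = comm.Get_size()            # assumed to divide N evenly

N = 4096                       # 4K x 4K matrices, as in the assignment
nb = N // P                    # rows per block

# Each process creates its own row blocks of A and B with random numbers.
# (Alternatively, generate the full matrices on rank 0 and Scatter them.)
A_local = np.random.rand(nb, N)
B_block = np.random.rand(nb, N)
C_local = np.zeros((nb, N))

prev_rank = (rank - 1) % P     # where the circulating B block is sent
next_rank = (rank + 1) % P     # where the next B block comes from

t0 = MPI.Wtime()
for step in range(P):
    blk = (rank + step) % P    # index of the B block this process holds now
    # Multiply the matching column slice of A_local with the current B block.
    C_local += A_local[:, blk * nb:(blk + 1) * nb] @ B_block
    if step < P - 1:
        # Circulate: pass the B block backwards, receive the next one.
        comm.Sendrecv_replace(B_block, dest=prev_rank, source=next_rank)
elapsed = MPI.Wtime() - t0

if rank == 0:
    print(f"P={P}, N={N}, time={elapsed:.3f} s")

Timing the loop with MPI.Wtime(), as above, gives the number to compare against the original example.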
- Running on a cluster
Run sudo apt-get install -y python-mpi4py (python3-mpi4py on newer releases)
on all nodes.
Test the installation: mpiexec -n 5 python -m mpi4py.bench helloworld
Create a machinefile in ~/
with the IP addresses of the nodes:
farmer@192.168.17.11
farmer@192.168.17.12
farmer@192.168.17.13
farmer@192.168.17.14
Run:
mpirun -n 4 -machinefile ~/machinefile python -m mpi4py.bench helloworld
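If python -m mpi4py.bench is unavailable in the installed version, an equivalent minimal script (hypothetical name helloworld.py, launched the same way with python helloworld.py) serves the same purpose:

# helloworld.py -- minimal mpi4py check: each rank reports itself.
from mpi4py import MPI

comm = MPI.COMM_WORLD
print(f"Hello from rank {comm.Get_rank()} of {comm.Get_size()} "
      f"on {MPI.Get_processor_name()}")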
Resources:
- https://mpi4py.readthedocs.io/en/stable/tutorial.html#running-python-scripts-with-mpi
- https://github.com/happy-labs/mpi/blob/master/multi_process_multiplier.py