The Message Passing Interface (MPI) is used on High Performance Computing (HPC) clusters to parallelize computationally intensive workloads. This resource does not cover job schedulers such as SGE (Rocks), PBS (Torque), or SLURM, which you will also need to understand in order to run jobs on cluster computing resources.
MPI can be a difficult topic to learn: the documentation assumes familiarity with software architecture concepts, and many tutorials are hard to follow. The tutorial provided here uses a Monte Carlo sampler as a running example through which we can understand the different mechanisms of MPI.
The first step in developing an MPI application is to write a serial version of the problem. We will begin with a serial Monte Carlo sampler, implemented with simple object-oriented (OOP) techniques.
Consider the problem x1 + x2 = x3, where x1 and x2 are drawn from prior distributions. The pseudocode would look like:
set_seed(seed)
X1.initialize_distribution()
X2.initialize_distribution()
for n from 1...n_samples:
    x1 = X1.draw_from_distribution()
    x2 = X2.draw_from_distribution()
    x3 = x1 + x2
    archive(x1, x2, x3)
post_process()
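A minimal Python sketch of this pseudocode might look like the following. The Prior class, its parameters, and the choice of normal priors are illustrative assumptions rather than a fixed design:

import numpy as np

class Prior:
    # Thin wrapper standing in for X1/X2 in the pseudocode; the
    # normal distribution is an illustrative choice of prior.
    def __init__(self, rng, mean=0.0, std=1.0):
        self.rng = rng
        self.mean = mean
        self.std = std

    def draw_from_distribution(self):
        return self.rng.normal(self.mean, self.std)

def run_sampler(n_samples, seed=0):
    rng = np.random.default_rng(seed)      # set_seed(seed)
    X1 = Prior(rng, mean=1.0)              # X1.initialize_distribution()
    X2 = Prior(rng, mean=2.0)              # X2.initialize_distribution()
    archive = []
    for _ in range(n_samples):
        x1 = X1.draw_from_distribution()
        x2 = X2.draw_from_distribution()
        x3 = x1 + x2
        archive.append((x1, x2, x3))       # archive(x1, x2, x3)
    return archive

def post_process(archive):
    # post_process(): report the sample mean of x3 as a simple summary
    x3 = np.array([row[2] for row in archive])
    print(f"mean(x3) = {x3.mean():.4f} over {len(x3)} samples")

if __name__ == "__main__":
    post_process(run_sampler(n_samples=10_000))

Sharing one seeded generator between both priors keeps the whole run reproducible from a single seed, which matters for the next point.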
Since random numbers in computer simulations aren't truly random, but are generated deterministically from a pseudorandom sequence, knowing the initial seed used in a computation allows results from Monte Carlo techniques to be replicated.
import numpy as np

seed = 0
rng = np.random.default_rng(seed)  # one generator, seeded once, drives all draws
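To see the effect of seeding, here is a short sketch (using NumPy's default_rng generator API) showing that two generators created with the same seed reproduce exactly the same draws:

import numpy as np

rng_a = np.random.default_rng(seed=0)
rng_b = np.random.default_rng(seed=0)

# Both generators walk the same pseudorandom sequence,
# so their draws are identical.
assert np.array_equal(rng_a.normal(size=5), rng_b.normal(size=5))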