Matlab Parallel Jobs

Simple PARFOR

In the MATLAB script, simply replace for loops with parfor loops, but note that parfor loops cannot be nested.

In MATLAB script:

for i=1:n => parfor i=1:n
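As a minimal sketch of this substitution (the variable names, bounds, and loop body here are illustrative, not taken from the cluster examples):

```matlab
n = 1000;
results = zeros(1, n);     % preallocate a sliced output variable
parfor i = 1:n
    results(i) = sqrt(i);  % each iteration is independent, so any worker can run it
end
```

Each iteration must be independent of the others; MATLAB distributes the iterations across the available workers.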

In SLURM script:

You need to reserve the whole node. Here, the number of processors (-c) and the memory (--mem) depend on the type of node selected by the keyword. Visit Server & Storage for details.

#SBATCH -N 1 -c <x> -C <keyword> --mem=<m>gb
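Putting this together, a whole-node job script might look like the following sketch. The core count, keyword, memory, walltime, module name, and script name are placeholders or assumptions you should adjust for your cluster:

```bash
#!/bin/bash
#SBATCH -N 1 -c <x> -C <keyword> --mem=<m>gb
#SBATCH --time=01:00:00

module load matlab    # load the cluster's MATLAB module (name may differ)
matlab -nodisplay -r "myParforScript; exit"   # myParforScript.m is your parfor script
```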

Inherently Parallel Jobs

Some MATLAB operations and functions are inherently parallel and will try to use as many cores as possible. Please follow the same job-script rule as suggested for simple PARFOR. You can try the example script "solveEq.m" at /usr/local/doc/MATLAB.

MDCS (Matlab Distributed Computing Server)

Configuration and Validation:

It is a good idea to create a separate directory for each version of MATLAB under the matlab directory in your home directory:

mkdir -p /home/<caseID>/matlab/<version>

Distributed Jobs

In a distributed MDCS job, workers (processors) on different nodes compute different tasks of a job.

Interactive Submission:

Copy the distributed-job script "createjob.m" from /usr/local/doc/MATLAB; it looks like the following.

%MATLAB Distributed Job
myCluster = parcluster;              % returns a cluster object identified by the default cluster profile
j = createJob(myCluster);
createTask(j,@rand,1,{{1} {2} {3}}); % three rand tasks with three different inputs
submit(j);                           % tasks may be submitted to different nodes
wait(j);                             % wait for all tasks to complete
results = getAllOutputArguments(j);  % get the results
celldisp(results);                   % display the results
destroy(j);                          % destroy all traces for garbage management

Open the MATLAB terminal following the Interactive Job Submission Procedure above. Make sure that you are in the directory where your "createjob.m" file is and then type:

createjob

outputs:

results{1} =

    0.3246

    ....

In this example, we are using MATLAB's built-in rand function. For a user-defined function, you need to add your script, or the path to the directory containing it, in the TorqueProfile under "Files and Folders". Your createTask statement then looks like this:

createTask(j,@<YOUR-FUNCTION>,<# of outputs>,{{<task1-input>} {<task2-input>} {...}});
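As a concrete sketch, suppose you have a user-defined function mySquare.m (a hypothetical example, not shipped in /usr/local/doc/MATLAB) that returns the square of its input. The task creation would then be:

```matlab
% mySquare.m (hypothetical user-defined function):
%   function y = mySquare(x)
%       y = x.^2;
%   end

myCluster = parcluster;
j = createJob(myCluster);
createTask(j, @mySquare, 1, {{1} {2} {3}});  % one output, three tasks with inputs 1, 2, 3
submit(j);
wait(j);
results = getAllOutputArguments(j);          % cell array holding 1, 4, and 9
```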

BATCH Submission:

You can copy the dependency file "primeNumbersDist_serial.m" and the distributed job file "primeDist_serial.m" from /usr/local/doc/MATLAB to test them. The script "primeNumbersDist_serial.m" counts the prime numbers below a given upper bound.

The SLURM script for batch submission looks similar to the one for the Monte Carlo Method. If you want to run a distributed job with a user-defined function as a batch submission, copy the SLURM script "primeDist.slurm" from /usr/local/doc/MATLAB and submit it using:

sbatch primeDist.slurm

Parpool

With parpool, MATLAB workers each handle part of the iterations on the same node (shared memory).

poolObj = parpool(4);        % assign 4 workers
parfor i = lower : upper
    ...
end
delete(poolObj);             % garbage collection
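A runnable version of the pool pattern above (the bounds and loop body are illustrative assumptions, not from the cluster examples) might look like:

```matlab
poolObj = parpool(4);    % open a pool of 4 workers on the node
total = 0;
parfor i = 1:1000
    total = total + i;   % reduction variable: MATLAB combines the workers' partial sums
end
disp(total);             % 500500
delete(poolObj);         % release the workers
```

Because total is a reduction variable, parfor accumulates it safely across workers without any explicit synchronization.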

Example: Central Limit Theorem

The central limit theorem simulation is an example that exhibits slicing. It investigates the performance of the central limit theorem in the deep tails of the distribution, simulating from a t distribution with df = 3. To increase speed and avoid overloading memory, each simulation is divided into a set of batches using a parfor statement. This shows the basic idea of taking advantage of parallelism. Also, although it uses plot functions, the GUI output is printed to another file format (.ps) because the job is submitted as a batch job. This is just example code to guide you in writing optimal parallel code.
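The batching idea can be sketched as follows. The batch count, samples per batch, and tail threshold are illustrative assumptions, not values taken from "central_theorem.m":

```matlab
nBatches   = 100;                 % number of independent batches
perBatch   = 1e4;                 % simulations per batch, kept small to limit memory use
tailCounts = zeros(1, nBatches);  % sliced output, one entry per batch
parfor b = 1:nBatches
    x = trnd(3, perBatch, 1);     % draw from a t distribution with df = 3
    tailCounts(b) = sum(x > 3);   % count deep-tail exceedances in this batch
end
tailProb = sum(tailCounts) / (nBatches * perBatch);  % empirical tail probability
```

Because tailCounts is sliced by the loop index, each worker writes only its own batch result, and no batch ever holds more than perBatch samples in memory at once.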

Run as a Batch Job

Copy the MATLAB pool script file "central_theorem.m" from /usr/local/doc/MATLAB and create a job script file "runCenTh.slurm" using the template above.

Submit the script:

sbatch runCenTh.slurm

In recent versions of MATLAB (R2014 and later), use "poolObj = parpool(n)" at the beginning and "delete(poolObj)" at the end, where n is the number of workers. If your SLURM profile configuration is not recognized even when it is selected as the default configuration, you need to explicitly provide the full path to the exported profile in the MATLAB script:

myProfile = parallel.importProfile('/home/CaseID/<path-to-slurm-profile-file>/SLURMProfile2015b')

poolObj = parpool(myProfile,4);

To see the plot:

evince plot_central.ps