OpenMP

OpenMP (https://computing.llnl.gov/tutorials/openMP/) is an API that enables direct multi-threaded, shared-memory parallelism.

Important Notes

Compiling OpenMP Code

The compilers available on the HPC cluster support OpenMP. Find the available compilers in the Software Guide and use the one appropriate to your source code (C/C++ or Fortran), as shown below:

Using GNU Compiler

gcc -fopenmp <your-c-code> -o <executable-name>

g++ -fopenmp <your-c++-code> -o <executable-name>

gfortran -fopenmp <your-fortran-code> -o <executable-name> 

Using Intel Compiler 

icc -openmp <your-c/c++-code> -o <executable-name> 

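ifort -openmp <your-Fortran-code> -o <executable-name>

Note: recent Intel compiler releases spell this option -qopenmp; -openmp is the older form of the same flag.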

Using PGI Compiler 

The PGI module must be loaded before you can use the PGI compiler. Use the command "module avail pgi" to see the available versions. To load the default version, type:

module load pgi

Compiling:

pgcc -mp <your-c/c++-code> -o <executable-name> 

pgf90 -mp <your-Fortran-code> -o <executable-name>

Running Batch Job

Example - hello.c using Intel Compiler:

Copy and paste the following code into a file named hello.c in your home directory.

#include <omp.h>
#include <stdio.h>

int main(void) {
  int nthreads, tid;

  /* Fork a team of threads with each thread having a private tid variable */
  #pragma omp parallel private(tid)
  {
    /* Obtain and print thread id */
    tid = omp_get_thread_num();
    printf("Hello World from thread = %d\n", tid);

    /* Only master thread does this */
    if (tid == 0)
    {
      nthreads = omp_get_num_threads();
      printf("Number of threads = %d\n", nthreads);
    }
  } /* All threads join master thread and terminate */

  return 0;
}

Compile with the Intel compiler icc. The Intel module is loaded by default.

icc -openmp hello.c -o hello

After compiling the hello executable, create the following job.slurm script:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH -J openmp_test
#SBATCH --output=hello.txt

# Use one thread per CPU allocated to the task
nproc=$SLURM_CPUS_PER_TASK
echo $nproc threads

cp hello $PFSDIR
cd $PFSDIR

# Set the number of threads and execute
export OMP_NUM_THREADS=$nproc
module load intel
./hello

cp * $SLURM_SUBMIT_DIR

Submit the job:

sbatch job.slurm

Output file (hello.txt). The order of the thread lines can differ from run to run:

4 threads
Hello World from thread = 0
Hello World from thread = 1
Hello World from thread = 2
Number of threads = 4
Hello World from thread = 3

Scale-Up

Copy the C code "pi_trape.c", which calculates pi using trapezoidal integration. Compile the code and run it as a batch job as described above. Note the shorter elapsed time when the code runs with OpenMP threads.

More examples can be found at https://computing.llnl.gov/tutorials/openMP/exercise.html
