
OpenMP

OpenMP (https://computing.llnl.gov/tutorials/openMP/) is an API used to explicitly direct multi-threaded, shared-memory parallelism.

Important Notes

  • Request a number of processors that matches the number of OpenMP threads you want to use.
  • If you need a large number of threads, look for available SMP nodes at HPC Resource View.
  • Want to start from basic C++ using omp pragmas? Visit this site.

Compiling OpenMP Code

The compilers available on HPC support OpenMP. Find the available compilers in the Software Guide and use the one appropriate to your source code (C/C++ or Fortran), as shown below:

Using GNU Compiler
gcc -fopenmp <your-c-code> -o <executable-name>
g++ -fopenmp <your-c++-code> -o <executable-name>
gfortran -fopenmp <your-fortran-code> -o <executable-name> 

Using Intel Compiler 
icc -openmp <your-c/c++-code> -o <executable-name>
ifort -openmp <your-fortran-code> -o <executable-name>
Note: newer Intel compiler releases spell this flag -qopenmp; use it if -openmp is rejected.

Using PGI Compiler 
The PGI module must be loaded before using the PGI compiler. Use the command "module avail pgi" to see the available versions. To load the default version, type:
module load pgi

Compiling:
pgcc -mp <your-c/c++-code> -o <executable-name>
pgf90 -mp <your-fortran-code> -o <executable-name>

Running Batch Job

Example - hello.c using Intel Compiler:
Copy and paste the following code into a file named hello.c in your home directory.
#include <omp.h>
#include <stdio.h>

int main(void) {
    int nthreads, tid;

    /* Fork a team of threads with each thread having a private tid variable */
    #pragma omp parallel private(tid)
    {
        /* Obtain and print thread id */
        tid = omp_get_thread_num();
        printf("Hello World from thread = %d\n", tid);

        /* Only master thread does this */
        if (tid == 0)
        {
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    } /* All threads join master thread and terminate */

    return 0;
}

Compile with the Intel compiler icc. The Intel module is loaded by default.
icc -openmp hello.c -o hello

After compiling the hello executable, create the following job.slurm script:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=4
#SBATCH -J openmp_test
#SBATCH --output=hello.txt 
nproc=$(( $SLURM_JOB_CPUS_PER_NODE * $SLURM_NNODES ))
echo $nproc threads
cp hello $PFSDIR
cd $PFSDIR
# Set the number of Threads and Execute 
export OMP_NUM_THREADS=$nproc
module load intel
./hello
cp * $SLURM_SUBMIT_DIR

Submit the job:
sbatch job.slurm

Output file (hello.txt):
4 threads
Hello World from thread = 0
Hello World from thread = 1
Hello World from thread = 2
Number of threads = 4
Hello World from thread = 3

Scale-Up

Copy the C code "pi_trape.c", which calculates pi using trapezoidal integration. Compile the code and run it as a batch job as described above. Note the smaller elapsed time when the code runs with OpenMP threads.

