GROMACS

GROMACS[1] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions that usually dominate simulations, many groups are also using it for research on non-biological systems, e.g. polymers. 

For more information, including tutorials on how to use GROMACS for molecular dynamics, see the official GROMACS web site, especially its Tutorials page. There is also a lot of good information on the GROMACS Wiki. Finally, there is a paper on optimization by the core developer team [4].

Important Notes

Installed Versions

All available versions of GROMACS can be listed with the following command (the same approach works for other installed applications):

module spider gromacs

output:

----------------------------------------------------------------------------

  gromacs: gromacs/2016.5

----------------------------------------------------------------------------

    Description:

      GROMACS is a versatile package to perform molecular dynamics, i.e.

      simulate the Newtonian equations of motion for systems with hundreds

      to millions of particles.

     Other possible modules matches:

        gromacs-plumed

    You will need to load all module(s) on any one of the lines below before the "gromacs/2016.5" module is available to load.

      gcc/6.3.0  openmpi/2.0.1

 

----------------------------------------------------------------------------

  To find other possible module matches do:

      module -r spider '.*gromacs.*'

The default version is identified by "(default)" after the module name and can be loaded as:

module load gcc/6.3.0  openmpi/2.0.1

module load gromacs

The other versions of GROMACS can be loaded as:

module load gromacs/<version>

Running GROMACS on the SLURM Cluster

GROMACS can run both CPU and GPU jobs using the same executable. A job will use GPUs only if they are available on the node; otherwise it falls back to the CPU. In both cases below, a GPU node is requested.
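If you want to control this behaviour explicitly, mdrun's -nb option selects where the non-bonded interactions are computed (option values as documented for GROMACS 5.1/2016; a sketch only):

gmx_mpi mdrun -v -deffnm em -nb auto   # default: use a GPU if one is detected
gmx_mpi mdrun -v -deffnm em -nb gpu    # require the GPU for the non-bonded kernels
gmx_mpi mdrun -v -deffnm em -nb cpu    # stay on the CPU even on a GPU node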

Interactive Job

Request a GPU node (in this case, with 12 cores per task and 2 GPUs):

srun --x11 -N 1 -c 12 -p gpu -C gpup100 --gres=gpu:2 --pty /bin/bash

Load the modules:

module load gcc/6.3.0  openmpi/2.0.1

module load gromacs

List all the GROMACS commands:

gmx_mpi help commands

output:

:-) GROMACS - gmx help, VERSION 5.1 (-:

GROMACS is written by:

Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar

..

Available commands:

anadock Cluster structures from Autodock runs

anaeig Analyze eigenvectors/normal modes

analyze Analyze data sets

angle Calculate distributions and correlations for angles and dihedrals

...

Try to generate all the files in tutorial [2]. The commands you will use are:

gmx_mpi pdb2gmx -f 1AKI.pdb -o 1AKI_processed.gro -water spce

(enter '15' when prompted)

gmx_mpi editconf -f 1AKI_processed.gro -o 1AKI_newbox.gro -c -d 1.0 -bt cubic

gmx_mpi solvate -cp 1AKI_newbox.gro -cs spc216.gro -o 1AKI_solv.gro -p topol.top

gmx_mpi grompp -f ions.mdp -c 1AKI_solv.gro -p topol.top -o ions.tpr

gmx_mpi genion -s ions.tpr -o 1AKI_solv_ions.gro -p topol.top -pname NA -nname CL -nn 8

(enter '13' when prompted)

gmx_mpi grompp -f minim.mdp -c 1AKI_solv_ions.gro -p topol.top -o em.tpr

gmx_mpi mdrun -v -deffnm em

Batch Job

Copy the "lysozyme" directory from /usr/local/doc/GROMACS adopted from tutorial [2] and cd to it

cp -r /usr/local/doc/GROMACS/lysozyme .

cd lysozyme

You will see all the required files for Energy Minimization (EM), including the job file "job-parallel.slurm".
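For reference, a minimal sketch of what a job script like job-parallel.slurm might contain (the actual file on the cluster may request different resources or a different time limit):

#!/bin/bash
#SBATCH -N 1
#SBATCH -c 12
#SBATCH -p gpu
#SBATCH -C gpup100
#SBATCH --gres=gpu:2
#SBATCH -t 1:00:00

module load gcc/6.3.0 openmpi/2.0.1
module load gromacs

# Run the energy minimization prepared in em.tpr
srun gmx_mpi mdrun -v -deffnm em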

Submit the job for EM:

sbatch  job-parallel.slurm

You will see the output in slurm-<jobid>.out:

Running on 1 node with total 12 cores, 12 logical cores, 1 compatible GPU

Hardware detected on host gpu009t (the node of MPI rank 0):

  CPU info:

   ...

  GPU info:

    Number of GPUs detected: 2

    ...

Reading file em.tpr, VERSION 5.1 (single precision)

Using 1 MPI process

Using 12 OpenMP threads 

...

Steepest Descents:

   Tolerance (Fmax)   =  1.00000e+03

   Number of steps    =        50000

Step=    0, Dmax= 1.0e-02 nm, Epot= -4.52207e+05 Fmax= 2.48682e+05, atom= 710

...

writing lowest energy coordinates.

Steepest Descents converged to Fmax < 1000 in 402 steps

Potential Energy  = -6.5317369e+05

Maximum force     =  8.8877417e+02 on atom 1515

Norm of force     =  3.2293442e+01

NOTE: The GPU has >25% less load than the CPU. This imbalance causes

      performance loss.

NOTE: 23 % of the run time was spent in pair search,

      you might want to increase nstlist (this has no effect on accuracy)
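If you see this note in your own runs, you can raise nstlist in the .mdp file; with the Verlet scheme this only changes how often the pair list is rebuilt and, as the note says, has no effect on accuracy. A hypothetical fragment of minim.mdp illustrating the setting:

cutoff-scheme = Verlet   ; required for GPU runs and for tuning nstlist
nstlist       = 20       ; rebuild the pair list every 20 steps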

GROMACS with PLUMED

This tutorial [5] uses PLUMED to analyze molecular dynamics simulations on the fly, to analyze existing trajectories, and to perform enhanced sampling.
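As an illustration of the kind of input PLUMED reads, a minimal hypothetical plumed.dat that monitors the backbone dihedrals of alanine dipeptide could look like the following (the atom indices depend on the topology, and the tutorial's actual file may differ):

# backbone torsions of alanine dipeptide (atom indices are topology dependent)
phi: TORSION ATOMS=5,7,9,15
psi: TORSION ATOMS=7,9,15,17
# write the collective variables to COLVAR every 100 steps
PRINT ARG=phi,psi FILE=COLVAR STRIDE=100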

Copy the "gromacs-plumed" directory from /usr/local/doc/GROMACS and cd into it:

cp -r /usr/local/doc/GROMACS/gromacs-plumed .

cd gromacs-plumed

You will find the job.slurm file.
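Judging from the command line recorded in the log below, its contents are probably along these lines (a sketch only; the installed job.slurm may request different resources and modules):

#!/bin/bash
#SBATCH -N 1
#SBATCH -c 12
#SBATCH -t 30:00

module load gcc/6.3.0 openmpi/2.0.1
module load gromacs-plumed   # PLUMED-enabled build listed by 'module spider gromacs'

# Run 10000 MD steps with PLUMED reading plumed.dat
srun gmx_mpi mdrun -s topolA.tpr -nsteps 10000 -plumed plumed.dat

Submit the job: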

sbatch job.slurm

You will get the output files along with the SLURM log file:

...

Running on 1 node with total 12 cores, 12 logical cores, 0 compatible GPUs

Hardware detected on host comp103t (the node of MPI rank 0):

..

GROMACS:      gmx mdrun, VERSION 5.1.4

Executable:   /usr/local/gromacs/5.1.4-plumed2/bin/gmx_mpi

Data prefix:  /usr/local/gromacs/5.1.4-plumed2

Command line:

  gmx_mpi mdrun -s topolA.tpr -nsteps 10000 -plumed plumed.dat

Back Off! I just backed up md.log to ./#md.log.1#

+++ Loading the PLUMED kernel runtime +++

+++ PLUMED_KERNEL="/usr/local/plumed2/2.3.2/lib/libplumedKernel.so" +++

+++ PLUMED kernel successfully loaded +++

...

starting mdrun 'alanine dipeptide in vacuum'

10000 steps,     20.0 ps.

Writing final coordinates.

Back Off! I just backed up confout.gro to ./#confout.gro.1#

               Core t (s)   Wall t (s)        (%)

       Time:        0.774        0.807       95.9

                 (ns/day)    (hour/ns)

Performance:     2141.397        0.011

gcq#379: "It takes money to make money, they say" (Lou Reed)

Visualize the trajectory using VMD

Request a compute node

srun --x11 --pty bash

Load VMD module:

module load vmd

Run VMD

vmd confout.gro traj_comp.xtc

The rendered structure should look like Fig. 1: alanine dipeptide.

Managing Interrelated Jobs

Here we use the Replica Exchange tutorial by Mark Abraham [3] to apply GROMACS productivity features in an HPC context with the SLURM scheduler. The tutorial assumes a stand-alone machine rather than a cluster, so use the notes here to adapt it to the cluster environment.

Obtain a copy from /usr/local/doc/GROMACS/remd.tgz. Copy it to your work area and extract the files using 'tar -xzvf remd.tgz'. Change directory to remd/stage1.

We will concentrate on two topics: using a shell script to launch jobs from multiple subdirectories, and using the mdrun -multidir flag. The latter runs multiple instances of the same simulation, with the inputs for each instance kept in its own directory, so they can vary from instance to instance. Before the production run, each replica must first be equilibrated at its own temperature. The grompp command only prepares an input file and does not run across multiple nodes.

Equilibration

The main goal here is to be efficient while avoiding launching all the equilibration jobs on the login node. Examine the submission script equil.sh in the stage1 directory:

#!/bin/bash

#SBATCH -n 1

#SBATCH -t 30:00

#SBATCH -o equil.out%j

echo $SLURM_JOB_NODELIST

srun gmx_mpi grompp -f equil -c confout

For this tutorial, the resource requirements are best left to the scheduler, since the jobs are very small and complete within 30 seconds; hence equil.sh is very simple. The '%j' placeholder labels the output file with the job ID. The equil.sh script should be run from within each of the equil0/...equil3/ directories. The script mult_grompp_srun.sh adapts the shell for-loop from the tutorial to copy equil.sh into each directory and submit it with sbatch; inside each job, srun runs the gmx_mpi grompp command:

#!/bin/bash

for dir in equil[0-3]; do

   cp equil.sh $dir;

   cd $dir;

   sbatch equil.sh;

   cd ..;

done

Run the script with './mult_grompp_srun.sh' and check that the jobs launch and run properly. The terminal output lists the job numbers; use squeue -u <caseid> to see that the submitted jobs are running. Refer to the tutorial to ensure that the proper outputs are generated from the jobs in each directory equil[0-3]/.

REMD Simulations

Apply the same ideas as above to Part 2 of the REMD tutorial. Remember to avoid running jobs on the login node.
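A single SLURM job can drive all the replicas through mdrun's -multidir flag, attempting exchanges with -replex. A sketch assuming four replicas (the directory names, resource requests, and exchange interval are placeholders; follow the tutorial for the real values):

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 4
#SBATCH -t 1:00:00

module load gcc/6.3.0 openmpi/2.0.1
module load gromacs

# One MPI rank per replica; each directory must contain its own topol.tpr.
# Attempt replica exchanges every 100 steps (illustrative value).
srun -n 4 gmx_mpi mdrun -multidir sim0 sim1 sim2 sim3 -replex 100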

Refer to HPC Guide to Molecular Modeling and Visualization and HPC Software Guide for more information.

References:

[1] GROMACS Home

[2] GROMACS Tutorial

[3] GROMACS REMD (Replica Exchange) Tutorial

[4] GROMACS Optimization Paper

[5] GROMACS with PLUMED