GROMACS
GROMACS[1] is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions that usually dominate simulations, many groups are also using it for research on non-biological systems, e.g. polymers.
For more information, including tutorials on how to use GROMACS for molecular dynamics, see the official GROMACS web site, especially the Tutorials page. There is also a lot of good information at the GROMACS Wiki page. Finally, there is a paper on optimization by the core developer team[3].
Important Notes
The name of the GROMACS executable may change between versions. You can find the executables in the bin directory reported by the "module display gromacs" command; update your job file accordingly.
Refer to the official GROMACS web site to understand the constraints regarding GPU jobs.
Use a GPU queue (e.g. gpu) to run jobs, as GROMACS has GPU capability. Refer to the HPC Resource View for available GPU queues.
When using the GPU features of GROMACS, limit the number of MPI tasks to the number of GPUs (see the sketch after these notes).
Important: Only the gromacs-plumed module and gromacs/2018.7 have MPI support.
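For example, a minimal batch sketch that matches MPI tasks to GPUs might look like the following; the partition name, resource counts, and module versions are assumptions taken from the examples later on this page, so adjust them to your allocation:
#!/bin/bash
#SBATCH -p gpu                 # GPU partition (assumed name)
#SBATCH -N 1
#SBATCH --gres=gpu:2           # request 2 GPUs
#SBATCH -n 2                   # 2 MPI tasks, one per GPU
#SBATCH -c 6                   # OpenMP threads per task
module load gcc/6.3.0 openmpi/2.0.1
module load gromacs
# one MPI rank per GPU; OpenMP threads fill the remaining cores
srun -n 2 gmx_mpi mdrun -ntomp 6 -deffnm em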
Installed Versions
All available versions of GROMACS can be viewed by issuing the following command (the same approach works for other applications as well).
module spider gromacs
output:
----------------------------------------------------------------------------
gromacs: gromacs/2016.5
----------------------------------------------------------------------------
Description:
GROMACS is a versatile package to perform molecular dynamics, i.e.
simulate the Newtonian equations of motion for systems with hundreds
to millions of particles.
Other possible modules matches:
gromacs-plumed
You will need to load all module(s) on any one of the lines below before the "gromacs/2016.5" module is available to load.
gcc/6.3.0 openmpi/2.0.1
----------------------------------------------------------------------------
To find other possible module matches do:
module -r spider '.*gromacs.*'
The default version is identified by "(default)" after the module name and can be loaded as:
module load gcc/6.3.0 openmpi/2.0.1
module load gromacs
The other versions of GROMACS can be loaded as:
module load gromacs/<version>
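For example, to load the MPI-enabled gromacs/2018.7 build mentioned in the notes above (its prerequisite modules may differ from those for 2016.5, so check with module spider first):
module spider gromacs/2018.7
module load gromacs/2018.7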
Running GROMACS on the SLURM Cluster
GROMACS can run both CPU and GPU jobs using the same executable. Jobs run on GPUs only if GPUs are available on the node; otherwise GROMACS falls back to the CPU. In either case, a GPU node is required to run GROMACS.
Interactive Job
Request a GPU node (in this case, with 12 cores per task and 2 GPUs)
srun --x11 -N 1 -c 12 -p gpu -C gpup100 --gres=gpu:2 --pty /bin/bash
Load the module
module load gcc/6.3.0 openmpi/2.0.1
module load gromacs
List all the GROMACS commands
gmx_mpi help commands
output:
:-) GROMACS - gmx help, VERSION 5.1 (-:
GROMACS is written by:
Emile Apol Rossen Apostolov Herman J.C. Berendsen Par Bjelkmar
..
Available commands:
anadock Cluster structures from Autodock runs
anaeig Analyze eigenvectors/normal modes
analyze Analyze data sets
angle Calculate distributions and correlations for angles and
dihedrals
...
Try to generate all the files in tutorial [2]. The set of commands you will use is:
gmx_mpi pdb2gmx -f 1AKI.pdb -o 1AKI_processed.gro -water spce
(enter '15' when prompted)
gmx_mpi editconf -f 1AKI_processed.gro -o 1AKI_newbox.gro -c -d 1.0 -bt cubic
gmx_mpi solvate -cp 1AKI_newbox.gro -cs spc216.gro -o 1AKI_solv.gro -p topol.top
gmx_mpi grompp -f ions.mdp -c 1AKI_solv.gro -p topol.top -o ions.tpr
gmx_mpi genion -s ions.tpr -o 1AKI_solv_ions.gro -p topol.top -pname NA -nname CL -nn 8 (enter '13' when prompted)
gmx_mpi grompp -f minim.mdp -c 1AKI_solv_ions.gro -p topol.top -o em.tpr
gmx_mpi mdrun -v -deffnm em
Batch Job
Copy the "lysozyme" directory from /usr/local/doc/GROMACS adopted from tutorial [2] and cd to it
cp -r /usr/local/doc/GROMACS/lysozyme .
cd lysozyme
You will see all the required files for Energy Minimization (EM), including the job file "job-parallel.slurm".
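The provided job-parallel.slurm is ready to use; if you need to write a similar job file yourself, a minimal sketch along these lines would work (the partition name, constraint, and resource counts are assumptions matching the interactive example above, not the exact contents of the provided file):
#!/bin/bash
#SBATCH -N 1
#SBATCH -c 12                  # cores for OpenMP threads
#SBATCH -p gpu                 # GPU partition (assumed name)
#SBATCH -C gpup100             # GPU type constraint (assumed)
#SBATCH --gres=gpu:2
module load gcc/6.3.0 openmpi/2.0.1
module load gromacs
# energy minimization; mdrun detects and uses the available GPUs
srun gmx_mpi mdrun -v -deffnm em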
Submit the job for EM:
sbatch job-parallel.slurm
You will see the output in slurm-<jobid>.out:
Running on 1 node with total 12 cores, 12 logical cores, 1 compatible GPU
Hardware detected on host gpu009t (the node of MPI rank 0):
CPU info:
...
GPU info:
Number of GPUs detected: 2
...
Reading file em.tpr, VERSION 5.1 (single precision)
Using 1 MPI process
Using 12 OpenMP threads
...
Steepest Descents:
Tolerance (Fmax) = 1.00000e+03
Number of steps = 50000
Step= 0, Dmax= 1.0e-02 nm, Epot= -4.52207e+05 Fmax= 2.48682e+05, atom= 710
...
writing lowest energy coordinates.
Steepest Descents converged to Fmax < 1000 in 402 steps
Potential Energy = -6.5317369e+05
Maximum force = 8.8877417e+02 on atom 1515
Norm of force = 3.2293442e+01
NOTE: The GPU has >25% less load than the CPU. This imbalance causes
performance loss.
NOTE: 23 % of the run time was spent in pair search,
you might want to increase nstlist (this has no effect on accuracy)
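If you see performance notes like the ones above, you can experiment with mdrun's tuning options, for example the split between MPI ranks and OpenMP threads (nstlist itself is set in the .mdp file). The command below is only an illustration, assuming a job allocation with two tasks and two GPUs, not a recommended setting:
srun -n 2 gmx_mpi mdrun -v -deffnm em -ntomp 6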
GROMACS with PLUMED
This tutorial [5] uses PLUMED to analyze molecular dynamics simulations on the fly, to analyze existing trajectories, and to perform enhanced sampling.
Copy the directory "gromacs-plumed" from /usr/local/doc/GROMACS and cd to it
cp -r /usr/local/doc/GROMACS/gromacs-plumed .
cd gromacs-plumed
You will find the job.slurm file.
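The provided job.slurm is ready to submit; if you need to recreate it, a minimal sketch would look something like this (the resource requests and prerequisite modules are assumptions; the mdrun command matches the log output shown below):
#!/bin/bash
#SBATCH -N 1
#SBATCH -c 12
# load the PLUMED-patched GROMACS build (check prerequisites with module spider gromacs-plumed)
module load gromacs-plumed
# run 10000 MD steps with the PLUMED plugin reading plumed.dat
srun gmx_mpi mdrun -s topolA.tpr -nsteps 10000 -plumed plumed.dat
Submit the job: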
sbatch job.slurm
You will get the output files along with the slurm log file
...
Running on 1 node with total 12 cores, 12 logical cores, 0 compatible GPUs
Hardware detected on host comp103t (the node of MPI rank 0):
..
GROMACS: gmx mdrun, VERSION 5.1.4
Executable: /usr/local/gromacs/5.1.4-plumed2/bin/gmx_mpi
Data prefix: /usr/local/gromacs/5.1.4-plumed2
Command line:
gmx_mpi mdrun -s topolA.tpr -nsteps 10000 -plumed plumed.dat
Back Off! I just backed up md.log to ./#md.log.1#
+++ Loading the PLUMED kernel runtime +++
+++ PLUMED_KERNEL="/usr/local/plumed2/2.3.2/lib/libplumedKernel.so" +++
+++ PLUMED kernel successfully loaded +++
...
starting mdrun 'alanine dipeptide in vacuum'
10000 steps, 20.0 ps.
Writing final coordinates.
Back Off! I just backed up confout.gro to ./#confout.gro.1#
Core t (s) Wall t (s) (%)
Time: 0.774 0.807 95.9
(ns/day) (hour/ns)
Performance: 2141.397 0.011
gcq#379: "It takes money to make money, they say" (Lou Reed)
Visualize the trajectory using VMD
Request a compute node
srun --x11 --pty bash
Load VMD module:
module load vmd
Run VMD
vmd confout.gro traj_comp.xtc
You should see the alanine dipeptide as shown in Fig. 1.
Managing Interrelated Jobs
Here we use the Replica Exchange tutorial by Mark Abraham [3] to apply GROMACS productivity features in the HPC context with the SLURM scheduler. The tutorial assumes a stand-alone machine rather than a cluster, so use the notes here to adapt the tutorial for the cluster environment.
Obtain a copy from /usr/local/doc/GROMACS/remd.tgz. Copy to your work area and extract the files using 'tar -xzvf remd.tgz'. Change directory to remd/stage1.
We will concentrate on two topics: using a shell script to launch jobs from multiple subdirectories, and using the mdrun -multidir flag. The latter runs multiple instances of the same simulation, allowing the simulation inputs to vary in each instance while keeping the files in separate directories. Before running the REMD simulation, each replica needs to be equilibrated at its own temperature. The grompp command prepares an input file and does not run across multiple nodes.
Equilibration
The main goal here is to be efficient while avoiding launching all the equilibration jobs on the login node. Examine the submission script equil.sh in the stage1 directory:
#!/bin/bash
#SBATCH -n 1                 # a single task is enough for grompp
#SBATCH -t 30:00             # 30-minute time limit
#SBATCH -o equil.out%j       # %j labels the output file with the job ID
echo $SLURM_JOB_NODELIST     # record which node the job ran on
srun gmx_mpi grompp -f equil -c confout
For this tutorial, the resource requirements are best left to the scheduler, as the jobs are very small and will complete within 30 seconds; hence equil.sh is very simple. The '%j' placeholder labels the output files with the job ID. The equil.sh script should be run from within each of the equil0/...equil3/ directories. The script mult_grompp_srun.sh adapts the shell for loop from the tutorial to copy equil.sh into each directory and submit it with sbatch; inside each job, srun runs the gmx_mpi grompp command:
#!/bin/bash
# copy the submission script into each equilibration directory and submit it
for dir in equil[0-3]; do
    cp equil.sh $dir;
    cd $dir;
    sbatch equil.sh;
    cd ..;
done
Run the script with './mult_grompp_srun.sh' and check that the jobs launch and run properly. The terminal output will list the job numbers; use squeue -u <caseid> to see that the submitted jobs are running. Refer to the tutorial to ensure that the proper outputs are generated from the jobs in each equil[0-3]/ directory.
REMD Simulations
Apply the same ideas as above to Part 2 of the REMD tutorial. Remember to avoid running jobs on the login node.
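For the production REMD stage, the -multidir and -replex mdrun options run the replicas together and attempt exchanges at a fixed interval. A minimal sketch of such a job is given below; the directory names, replica count, and exchange interval are assumptions, so adjust them to match the setup in Part 2 of the tutorial:
#!/bin/bash
#SBATCH -n 4                  # one MPI rank per replica
#SBATCH -t 1:00:00
# load an MPI-enabled GROMACS build
module load gcc/6.3.0 openmpi/2.0.1
module load gromacs
# run four replicas in their own directories, attempting exchanges every 100 steps
srun gmx_mpi mdrun -multidir sim0 sim1 sim2 sim3 -replex 100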
Refer to HPC Guide to Molecular Modeling and Visualization and HPC Software Guide for more information.
References:
[1] GROMACS Home
[2] GROMACS Tutorial
[3] GROMACS REMD (Replica Exchange) Tutorial