NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. Based on Charm++ parallel objects, NAMD scales to hundreds of cores for typical simulations and beyond 500,000 cores for the largest simulations. NAMD uses the popular molecular graphics program VMD for simulation setup and trajectory analysis, but is also file-compatible with AMBER, CHARMM, and X-PLOR. 

Tutorials by the developer show you how to use NAMD and VMD for biomolecular modeling. NAMD may be used interactively (principally for visual manipulation or analysis) or by job submission. Below, we provide an example script to run NAMD in batch mode on the cluster.

Important Notes

Installed Versions

All available versions of NAMD can be listed with the following command (the same approach works for other applications as well):

module spider namd

Output:

   namd/2.14-cuda

   namd/2.14

Load the desired version with:

module load namd/<version>
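For example, to load version 2.14 and confirm which binary will be picked up (a quick sanity check; module list and which are standard Lmod/shell commands, and the path reported depends on the cluster):

module load namd/2.14
module list
which namd2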

Running NAMD 

Copy the tar file "namd2_example_files.tar.gz" from /usr/local/doc/NAMD/ to your working directory:

cp /usr/local/doc/NAMD/namd2_example_files.tar.gz ./

Untar it and change into the directory "namd2_example_files/1-2-sphere":

tar xzvf namd2_example_files.tar.gz

cd namd2_example_files/1-2-sphere

The example comes from the NAMD tutorials (see link below) and performs a simulation to study the minimization and equilibration of ubiquitin in a water sphere placed in vacuum. The file ubq_ws_eq.conf provides the configuration details for the simulation and is described in detail here; a short illustrative excerpt is sketched below. For the purposes of this exercise, we focus on running this simulation in the SLURM environment.
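A NAMD configuration file is a plain-text list of keyword-value pairs. The following sketch is only illustrative (an incomplete excerpt with example values, modeled on the corresponding NAMD tutorial); consult the ubq_ws_eq.conf shipped with the example for the actual file names and settings.

structure          ubq_ws.psf               ;# protein in a water sphere (PSF)
coordinates        ubq_ws.pdb               ;# initial coordinates (PDB)
paraTypeCharmm     on
parameters         par_all27_prot_lipid.inp ;# CHARMM parameter file
set temperature    310
temperature        $temperature             ;# initial velocities at 310 K
outputName         ubq_ws_eq                ;# prefix for output files
minimize           100                      ;# energy minimization steps
reinitvels         $temperature
run                2500                     ;# equilibration steps

Create the file "namd_sim_example.slurm" with the following content: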

#!/bin/bash

#SBATCH -o namd2_test.o%j

#SBATCH -N 1 -n 1

# Load the NAMD module

module load namd

#

# Run NAMD

namd2 ubq_ws_eq.conf > ubq_ws_eq.log

Submit your job:

sbatch namd_sim_example.slurm
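While the job runs, its status can be checked and the NAMD log followed with standard SLURM and shell commands:

squeue -u $USER            # is the job pending or running?
tail -f ubq_ws_eq.log      # follow the NAMD log as it is written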

Output: ubq_ws_eq.log

...

...

Info: 1 NAMD  2.13  Linux-x86_64-TCP

...

...

ENERGY:    2600       218.4817       713.6095       308.3118        34.4095         -23039.0477      1774.9981         5.1648         0.0000      4177.0057         -15807.0667       301.0117    -19984.0723    -15791.6624       306.1430

WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 2600

WRITING COORDINATES TO OUTPUT FILE AT STEP 2600

CLOSING COORDINATE DCD FILE ubq_ws_eq.dcd

WRITING VELOCITIES TO OUTPUT FILE AT STEP 2600

====================================================

WallClock: 104.609947  CPUTime: 102.102859  Memory: 177.109375 MB

Program finished after 104.610902 seconds.

CPU Parallelization on a Single Node

Running NAMD on multiple CPU cores within a single node is done by passing the +p option to namd2, as in the script below (the charmrun launcher is typically only needed for runs that span multiple nodes).

#!/bin/bash

#SBATCH -N 1

#SBATCH -c 16

#SBATCH --mem=10gb

#SBATCH --time=15

#SBATCH -J NAMD

#SBATCH -o namd.log


module load namd/2.14

# Run in parallel

namd2 +p16 <NAMD config file>

+p16 specifies the number of cores to use; it should be less than or equal to the value given to the "-c" option.
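To keep the two values in sync automatically, the core count can be taken from the environment variable that SLURM sets when -c/--cpus-per-task is requested, rather than hard-coded (a minimal sketch):

# Use the core count granted by SLURM rather than a hard-coded value.
namd2 +p${SLURM_CPUS_PER_TASK} <NAMD config file>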

GPU Parallelization

The following example requests a GPU node with the gpu2v100 feature:

#!/bin/bash

#SBATCH -p gpu

#SBATCH -C gpu2v100

#SBATCH --gres=gpu:2

#SBATCH --mem=10gb

#SBATCH -N 1

#SBATCH -c 24

#SBATCH --time=10:00:00

#SBATCH -J NAMD_CUDA

#SBATCH -o namd_cuda.log


module load CUDA/11.6.0

module load namd/2.14-cuda

namd2 +p24 <NAMD config file> 

+p24 specifies the number of cores to use; it should match the value given to the "-c" option.
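As in the CPU case, the core count can be taken from SLURM instead of being hard-coded, and the GPUs can be listed explicitly with NAMD's +devices option if desired (a sketch; by default the CUDA build uses all GPUs that SLURM makes visible via --gres):

# Let SLURM supply the core count; optionally pin NAMD to specific GPU devices.
namd2 +p${SLURM_CPUS_PER_TASK} +devices 0,1 <NAMD config file>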

The "top" output (header and namd2 process line) on the GPU node where the job is running:

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
182309 stm       20   0   22.3g 533040 120200 R 779.1  0.3   0:31.05 namd2
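To reproduce this kind of check, first find the node assigned to the job, then inspect it there (assuming the cluster permits ssh to nodes where you have an active job; the node name is whatever the NODELIST column of squeue reports):

squeue -u $USER        # the NODELIST column shows the assigned node
ssh <node-name>        # log in to that node
top -u $USER           # CPU usage of your processes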

Output from the nvidia-smi utility on the GPU node where the job is running (use the -l flag to loop, e.g. -l 1 to refresh every second):

Thu Jan  3 10:06:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  On   | 00000000:03:00.0 Off |                    0 |
| N/A   28C    P0    45W / 250W |    369MiB / 12198MiB |     61%   E. Process |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  On   | 00000000:82:00.0 Off |                    0 |
| N/A   27C    P0    40W / 250W |    367MiB / 12198MiB |     49%   E. Process |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0    181840    C   namd2                                          359MiB |
|    1    181840    C   namd2                                          357MiB |
+-----------------------------------------------------------------------------+

Scaling Result

For the same problem (the ApoA1 benchmark input apoa1.namd, run for 10,000 steps), the time to finish the job decreases as more resources are used.
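One way to reproduce such a scaling test is to submit the same benchmark at several core counts and compare the reported wall-clock times. The sketch below assumes apoa1.namd and its input files are present in the submission directory and that the default partition and memory limits are adequate:

# Submit the ApoA1 benchmark with 1, 2, 4, 8 and 16 cores on a single node.
for n in 1 2 4 8 16; do
    sbatch -N 1 -c $n -J apoa1_$n -o apoa1_${n}.o%j \
        --wrap "module load namd/2.14 && namd2 +p$n apoa1.namd > apoa1_${n}cores.log"
done

# After the jobs finish, compare the timings reported at the end of each log:
grep WallClock apoa1_*cores.log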