LAMMPS
LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator) is a classical molecular dynamics program. Its capabilities include:
- distributed-memory message-passing parallelism (MPI)
- spatial decomposition of the simulation domain for parallelism
- easy extension with new features and functionality
- atomic, polymeric, biological, metallic, granular, or hybrid systems
- pairwise potentials: Lennard-Jones, Coulombic, Buckingham, Morse, Yukawa, frictional granular, tabulated, hybrid
- molecular potentials: bond, angle, dihedral, improper, class 2 (COMPASS)
- polymer potentials: all-atom, united-atom, bead-spring
- long-range Coulombics: Ewald and PPPM (similar to particle-mesh Ewald)
- CHARMM and AMBER force-field compatibility
- constant NVE, NVT, NPT integrators
- rRESPA hierarchical timestepping
- SHAKE bond and angle constraints
- parallel tempering (replica exchange)
- targeted molecular dynamics (TMD) constraints
- a variety of boundary conditions and constraints
Important Note
The name of the LAMMPS executable may differ between versions. Find it with the "module display lammps" command.
Installed Versions
All available versions of LAMMPS can be listed with the following command (the same approach works for other applications as well):
module spider LAMMPS
output:
LAMMPS: LAMMPS/23Jun2022-foss-2021b-kokkos-CUDA-11.4.1
-----------------------------------------------------------------------------------------------------------------
Description:
LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively ..
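Once you have identified a version, load its module before running. The sketch below uses the module name from the example output above; substitute whatever "module spider" lists on your system:

```shell
# Load the LAMMPS module found via "module spider" (version name is the
# example from above; use the one reported on your system).
module load LAMMPS/23Jun2022-foss-2021b-kokkos-CUDA-11.4.1

# Show what the module sets up, including the name of the executable.
module display lammps
```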
Running LAMMPS in HPC
Copy the job files ("*.slurm") and the LAMMPS input file ("in.lj") from /usr/local/doc/LAMMPS to your working directory:
cp /usr/local/doc/LAMMPS/* .
Serial Job
Submit your serial job:
sbatch lammps.slurm
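If you need to write lammps.slurm yourself, a minimal serial script might look like the sketch below. The module name, executable name (lmp), and time limit are assumptions; verify them with "module display lammps" and your site's policies:

```shell
#!/bin/bash
#SBATCH --job-name=lammps-serial
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Module and executable names are assumptions; check "module display lammps".
module load LAMMPS/23Jun2022-foss-2021b-kokkos-CUDA-11.4.1

# Run LAMMPS serially on the copied input file.
lmp -in in.lj
```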
Find the output at slurm-<jobid>.out
cat slurm-<jobid>.out
output:
...
Performance: 10167.140 tau/day, 23.535 timesteps/s
81.6% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 723.2 | 723.2 | 723.2 | 0.0 | 85.10
Neigh | 88.351 | 88.351 | 88.351 | 0.0 | 10.40
Comm | 13.44 | 13.44 | 13.44 | 0.0 | 1.58
Output | 0.052433 | 0.052433 | 0.052433 | 0.0 | 0.01
Modify | 18.884 | 18.884 | 18.884 | 0.0 | 2.22
Other | | 5.865 | | | 0.69
Nlocal: 32000 ave 32000 max 32000 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 18872 ave 18872 max 18872 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 1.19999e+06 ave 1.19999e+06 max 1.19999e+06 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 1199991
Ave neighs/atom = 37.4997
Neighbor list builds = 1000
Dangerous builds not checked
Total wall time: 0:14:09
Multi-node/Multi-core Parallel Job
Submit your parallel job:
sbatch mpi-lammps.slurm
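If you need to write mpi-lammps.slurm yourself, a sketch matching the 4 MPI tasks x 4 OpenMP threads shown in the output below might look like this. The module name, executable name (lmp), and launcher (srun) are assumptions; adjust to your site:

```shell
#!/bin/bash
#SBATCH --job-name=lammps-mpi
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=4
#SBATCH --time=01:00:00

# Module and executable names are assumptions; check "module display lammps".
module load LAMMPS/23Jun2022-foss-2021b-kokkos-CUDA-11.4.1

# One OpenMP thread per allocated CPU core in each MPI task.
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# Launch 4 MPI tasks via the scheduler.
srun lmp -in in.lj
```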
Find the output at slurm-<jobid>.out
Performance: 170137.133 tau/day, 393.836 timesteps/s
383.7% CPU use with 4 MPI tasks x 4 OpenMP threads
....
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 24.732 | 25.055 | 25.813 | 8.9 | 49.34
Neigh | 3.9396 | 4.0063 | 4.1408 | 4.0 | 7.89
Comm | 18.723 | 19.751 | 20.74 | 19.7 | 38.89
Output | 0.038683 | 0.055631 | 0.069498 | 5.1 | 0.11
Modify | 1.2665 | 1.8835 | 3.0945 | 53.6 | 3.71
Other | | 0.03051 | | | 0.06
Nlocal: 8000 ave 8038 max 7964 min
Histogram: 1 0 0 1 0 1 0 0 0 1
Nghost: 8590 ave 8604 max 8573 min
Histogram: 1 0 1 0 0 0 0 0 0 2
Neighs: 300139 ave 302472 max 298510 min
Histogram: 2 0 0 0 0 0 1 0 0 1
Total # of neighbors = 1200557
Ave neighs/atom = 37.5174
Neighbor list builds = 1000
Dangerous builds not checked
Total wall time: 0:00:50
GPU Job
In /usr/local/doc/LAMMPS, mentioned in the section above, you will also find the input file (in-gpu.lj) and the job file (gpu-lammps.slurm) for a GPU job.
Now submit the GPU job:
sbatch gpu-lammps.slurm
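If you need to write gpu-lammps.slurm yourself, a sketch is shown below. The module name, executable name (lmp), and GPU request syntax are assumptions for your site; the -sf gpu and -pk gpu 1 flags enable the LAMMPS GPU package on one device:

```shell
#!/bin/bash
#SBATCH --job-name=lammps-gpu
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

# Module and executable names are assumptions; check "module display lammps".
module load LAMMPS/23Jun2022-foss-2021b-kokkos-CUDA-11.4.1

# -sf gpu applies the GPU suffix to supported styles;
# -pk gpu 1 tells the GPU package to use one device.
srun lmp -sf gpu -pk gpu 1 -in in-gpu.lj
```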
Find the output at slurm-<jobid>.out
...
- Using acceleration for lj/cut:
- with 1 proc(s) per device.
--------------------------------------------------------------------------
Device 0: Tesla K40m, 15 CUs, 11/11 GB, 0.74 GHZ (Mixed Precision)
--------------------------------------------------------------------------
...
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 16.118 | 16.118 | 16.118 | 0.0 | 64.31
Neigh | 0.00025239 | 0.00025239 | 0.00025239 | 0.0 | 0.00
Comm | 2.0493 | 2.0493 | 2.0493 | 0.0 | 8.18
Output | 0.014197 | 0.014197 | 0.014197 | 0.0 | 0.06
Modify | 6.1671 | 6.1671 | 6.1671 | 0.0 | 24.61
Other | | 0.7154 | | | 2.85
Nlocal: 32000 ave 32000 max 32000 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 18751 ave 18751 max 18751 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 0 ave 0 max 0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 0
Ave neighs/atom = 0
Neighbor list builds = 1000
Dangerous builds not checked
Total wall time: 0:00:25
Note: compare the wall times of your MPI and GPU jobs against the serial run.
Refer to HPC Guide to Molecular Modeling and Visualization and HPC Software Guide for more information.
References:
Instructions for writing LAMMPS input files can be found on the project website.
LAMMPS Documentation - https://lammps.sandia.gov/doc/Manual.html