Bucket Sampling

Introduction

In this tutorial we will learn to use bucket sampling (BS) for efficient sampling of high dimensional free energy landscape. This tutorial follows the method used in the paper: Javed, R., Kapakayala, A. B., & Nair, N. N., Buckets instead of umbrellas for enhanced sampling and free energy calculations. Journal of Chemical Theory and Computation, 20(19), 2024 . It is recommended to read this paper for the details of parameters used for the simulation.

We study the free energy surface of alanine dipeptide in vacuo as a function of two backbone dihedral angles (Φ, Ψ) using BS. Here, we are assuming that users are familiar with GROMACS and can run MD simulation using GROMACS-PLUMED interface. The first thing we need to do is build an initial structure using ff14SB force-field and relax this at 300K so that we can use this as the input for our simulation. What we do is:

Minimize (2000 steps)
Heat (optional for small system)
Relax (300K, 100ps)

This tutorial consist of two parts:

Part 1: BS Simulation
Part 2: Reweighting

1. BS Simulation

a) Determination of parameter λ

Before starting the BS simulations, it is essential to estimate the collective variable (CV) diffusion length, λ, along which the bucket bias is to be applied. This step ensures that overly rigid wall potential do not adversely affect the simulation or introduce artifacts.

Steps for computing λ:

Preliminary Metadynamics Simulations:
Conduct short, independent metadynamics (MTD) simulations starting from diverse initial configurations. These simulations help capture the natural diffusion behavior of the system across the chosen CV.
Diffusion Analysis:
Use the script diffusion.f90 to calculate the diffusion coefficient from each independent simulation.
Averaging: Compute the quantity λ value over all MTD runs. This averaged quantity, λ, provides a reliable estimate of the natural fluctuation scale of the CV and informs the appropriate width of the bucket without imposing overly stiff boundaries.

For the alanine dipeptide in water simulation, we get λ=0.4 rad; and α is 0.5. Thus, we eliminate 0.2 rad from the right and left region for considering any possibility of boundary effect. Here is the reference HILLS file for one of the simulation performed.

b) Running the simulation

We are going to run 10 ns long BS simulation for all windows i.e. apply bucket bias along Φ and MTD bias along Φ and Ψ.

The following input files are required for running BS simulation for each umbrella window:

mds.mdp
plumed.dat

To run simulations in each of the windows, the user must specify a value of ξ for each bucket. The corresponding PLUMED input file for any given window is shown below; replace XX with the appropriate wall positions. For reference, see Table 1 for different values of ξ.

Here, we will run the simulation for six windows. The full range for our interest is from -3.14 rad to 3.14 rad along Φ. Thus, we divide the full region into 6 buckets each spanning a region of 1 rad. These are:

Table 1: Details of the BS bias window along the ϕ coordinate in the case of alanine dipeptide simulation.

The main output files of these simulations which are required for analysis are:

COLVAR : Collective Variables files for different windows
HILLS : List of added bias is stored in this file for different windows

2. Reweighting

The second part of our calculation is to use probability generated by BS simulation to get the free energy surface. But in our simulation we have applied MTD bias in all the 6 windows which will give biased probability distribution. We need to get unbiased probability distribution by reweighting and then patching all the windows using mean force based reweighting technique (Pal et al., 2021).

Following instructions are given to reweight and generate free energy surface:

Compile and execute the below codes code to get the unbiased (from MTD) from each of the windows.

ct_factor.f90

vbias.f90
probability.f90

Compile and execute the above codes using the commands below:

gfortran ct_factor.f90 -o ct.x
mpif90 vbias.f90 -o bias.x
gfortran probability.f90 -o prob.x
./ct.x
mpirun -n 1 ./bias.x
./prob.x

Once the probabilities of different windows are generated, we need to fit the free energies (obtained from unbiased simulation using F(s)=-kBT ln P(s)) using interpolation using the python script given below for each of the windows:

python fit_spline.py

NOTE: For the reconstruction part, only the η region for each of the window needs to be taken.

Once the different biasing windows have been fitted using the above script, the resulting mean force data from each window should be carefully compiled into a single file named mean_force.in. This step is critical to ensure the accuracy and reliability of BS simulation.

After obtaining the combined mean forces of all the windows, one can get the final free energy using the code reweight_MF.f90

The above code can be compiled and executed using the script run.sh and one can obtain the final free energy surface.

run.sh has the following information:

-T: (value in K) temperature of physical system. (Default: 300 K)

-dT: (value in K) ΔT parameter used in WT-MTD. (Default:1500 K)

-tmin: step number in COLVAR file, from which reweighting should start (Default:1)

-tmax : step number in COLVAR file, at which reweighting should stop (Default: total number of steps in COLVAR file)

-grid: <gridmin1> <gridmax1> <gridsize1> of CV1,<gridmin2> <gridmax2> <gridsize2> of CV2,……

-pfrqMD: Number of steps after which COLVAR file is updates

-dtMTD: Number of steps used to update bias in WTMTD

-nr: Number of buckets used in the simulation

Google Sites

Report abuse