Umbrella Sampling

Umbrella sampling is a computational technique for studying rare events and calculating free energies. It samples the free energy landscape of a system along a chosen reaction coordinate, particularly where energy barriers separate different states. The central idea is to bias the sampling so that the system can cross these barriers and explore regions it would otherwise rarely visit. In pDynamo, the bias is a harmonic restraint potential applied to the atoms that define the reaction coordinate, which can be expressed in terms of distances, angles, or dihedrals. Sampling these otherwise hard-to-reach regions makes it possible to calculate free energy differences between states and to characterize rare events.
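As a minimal sketch of the harmonic restraint described above, the bias energy and the corresponding force along the reaction coordinate can be written as two small functions. The function names and the 1/2 prefactor are assumptions made for illustration; some codes define the harmonic term without the 1/2, so the convention used by pDynamo should be checked before comparing force constants.

```python
def harmonic_bias(xi, xi0, k):
    """Harmonic umbrella energy for a reaction-coordinate value xi.

    Assumed form: U = 0.5 * k * (xi - xi0)**2, where xi0 is the window
    reference value and k is the force constant.
    """
    return 0.5 * k * (xi - xi0) ** 2


def harmonic_bias_force(xi, xi0, k):
    """Force along the reaction coordinate, i.e. -dU/dxi for the form above."""
    return -k * (xi - xi0)


# The bias vanishes at the reference value and grows quadratically away from it.
energy = harmonic_bias(2.0, 1.5, 100.0)
force = harmonic_bias_force(2.0, 1.5, 100.0)
```

The force always points back toward the reference value, which is what keeps each window localized around its own region of the reaction coordinate.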

The umbrella sampling simulation (in either one or two dimensions) has its own icon on the main toolbar.

In EasyHybrid, an umbrella sampling calculation can also be set up from the Umbrella Sampling window, located at:

Main Menu > Simulate > Umbrella Sampling

An overview of the window for calculating umbrella sampling can be seen in Figure 1.

Figure 1: Window for calculating umbrella sampling in EasyHybrid.

Input Setup

The user can choose between two types of data entry:

1 - From a coordinate in EasyHybrid's memory. In this mode the windows can only be sampled sequentially: the input for each window is the final coordinate obtained in the previous window (see Figure 2 - left).

2 - From a set of frames (a trajectory) already obtained by some sampling method. In this case, each frame is used as the input for one sampling window, and the windows are executed independently. This option allows the windows to be sampled in parallel on the CPU (see Figure 2 - right).
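The difference between the two input modes can be sketched as follows. This is only an illustration and not EasyHybrid's implementation: `sample_window` is a hypothetical placeholder that stands in for the sampling of one window, the coordinates are reduced to a single number, and a thread pool stands in for whatever parallel mechanism is actually used.

```python
from concurrent.futures import ThreadPoolExecutor


def sample_window(start_coordinate, center):
    """Hypothetical stand-in for one window: move the start halfway to the center."""
    return start_coordinate + 0.5 * (center - start_coordinate)


def run_sequential(initial_coordinate, centers):
    """Mode 1: each window starts from the final coordinate of the previous one."""
    finals = []
    current = initial_coordinate
    for center in centers:
        current = sample_window(current, center)
        finals.append(current)
    return finals


def run_independent(frames, centers, max_workers=4):
    """Mode 2: each frame seeds its own window, so windows can run concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(sample_window, frames, centers))


sequential_finals = run_sequential(1.0, [1.0, 2.0])
independent_finals = run_independent([1.0, 2.0], [1.0, 2.0])
```

In the sequential case no window can start before the previous one has finished, whereas in the frame-based case no window depends on any other, which is what makes parallel execution possible.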

Reaction Coordinate

The reaction coordinate can be defined using distance criteria, chosen in the "Coordinate Type" combobox. In Figure 2, a simple distance criterion was used, which requires selecting only two atoms. EasyHybrid also allows a second reaction coordinate to be defined, which results in sampling in two dimensions. In both cases, the following parameters must be defined:

Step Size: The distance increment applied to the reaction coordinate from one window to the next. When the input is a set of frames (a procedure performed in parallel), this parameter is ignored, and the positions of the reaction coordinate atoms in each frame serve as the reference for the restraint potentials.

Number of Steps: The number of windows that will be sampled. When the input is a set of frames (a procedure performed in parallel), this parameter is ignored, and the number of windows is set to the number of frames in the input path.

Force Constant: The force constant of the harmonic restraint potential. Larger values hold the reaction coordinate more tightly around the reference value of each window.

Initial Distance: The initial distance imposed on the atoms of the reaction coordinate (its meaning depends on how the reaction coordinate is defined). When the input is a set of frames (a procedure performed in parallel), this parameter is ignored, and the initial distance is taken from the coordinates of the atoms in the input frames.
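The way these parameters determine the restraint reference values can be sketched as follows. The function names are hypothetical, and the sketch assumes that the first window is centered on the initial distance and that "Number of Steps" counts that first window; the exact convention used by EasyHybrid is not stated here. For frame-based input, the reference is instead measured on each frame, illustrated with a simple two-atom distance.

```python
import math


def window_centers(initial_distance, step_size, number_of_steps):
    """Reference values for sequential input (assumed: first window at initial_distance)."""
    return [initial_distance + i * step_size for i in range(number_of_steps)]


def window_grid(centers_rc1, centers_rc2):
    """All (rc1, rc2) reference pairs for sampling in two dimensions."""
    return [(a, b) for a in centers_rc1 for b in centers_rc2]


def centers_from_frames(frames, atom_i, atom_j):
    """Reference values for frame input: the atom_i to atom_j distance in each frame."""
    return [math.dist(frame[atom_i], frame[atom_j]) for frame in frames]


centers = window_centers(1.0, 0.5, 4)
grid = window_grid(centers, window_centers(2.0, 0.5, 3))
frames = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
          [(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)]]
frame_centers = centers_from_frames(frames, 0, 1)
```

With two reaction coordinates the number of windows is the product of the two step counts, so the cost of a two-dimensional scan grows quickly with the resolution chosen for each coordinate.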

Figure 2: Input types and reaction coordinate definition. Left: sequential input, where the windows are sampled one after another. Right: parallel input, where a set of frames (a trajectory) obtained by some sampling method is used as input. The figure also shows the definition of the reaction coordinate, the criterion used to follow the progress of the reaction, which is chosen in the "Coordinate Type" combobox; distance criteria are commonly used.

Sampling

The sampling of each window is divided into 3 steps:

1 - Geometry optimization. This step is optional; to enable it, tick the corresponding checkbox shown in Figure 3.

2 - Equilibration sampling by molecular dynamics. The integration algorithms available are the same as those used for conventional molecular dynamics sampling.

3 - Data collection sampling by molecular dynamics.
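The three steps above can be illustrated with a toy one-dimensional model. This is a sketch under stated assumptions, not the EasyHybrid or pDynamo procedure: the system is a hypothetical double-well potential, overdamped Langevin dynamics stands in for molecular dynamics, steepest descent stands in for the geometry optimizer, and all function names and parameter values are invented for the example.

```python
import math
import random


def total_gradient(x, center, k):
    """Gradient of a toy double well V(x) = (x**2 - 1)**2 plus the harmonic bias."""
    return 4.0 * x * (x * x - 1.0) + k * (x - center)


def minimize(x, center, k, steps=50, rate=0.01):
    """Step 1 (optional): steepest-descent relaxation on the biased potential."""
    for _ in range(steps):
        x -= rate * total_gradient(x, center, k)
    return x


def propagate(x, center, k, steps, rng, dt=0.005, kT=1.0):
    """Overdamped Langevin steps; returns the final x and every visited value."""
    noise = math.sqrt(2.0 * kT * dt)
    visited = []
    for _ in range(steps):
        x += -total_gradient(x, center, k) * dt + noise * rng.gauss(0.0, 1.0)
        visited.append(x)
    return x, visited


def sample_window(x_start, center, k, optimize=True,
                  n_equil=500, n_collect=2000, seed=0):
    """Run optimization, equilibration, and data collection for one window."""
    rng = random.Random(seed)
    x = minimize(x_start, center, k) if optimize else x_start
    x, _ = propagate(x, center, k, n_equil, rng)           # step 2: discarded
    x, samples = propagate(x, center, k, n_collect, rng)   # step 3: kept
    return x, samples


# Restrain the toy system at the barrier top (x = 0), starting from a minimum.
final_x, samples = sample_window(-1.0, 0.0, 50.0)
mean_x = sum(samples) / len(samples)
```

Only the values recorded in the data collection step are used for analysis; the equilibration step exists so that the system has adapted to the window's restraint before anything is recorded.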


Figure 3: Sampling setup.

The calculation generates the following output files and folders:

A log file: This file records the input parameters, settings, and options used for the calculation.

Equilibration sampling trajectories folder: This folder contains the trajectories generated during the equilibration phase. Each trajectory is stored as a separate folder containing coordinate files that record the system's coordinates at different time points during equilibration.

Data collection sample folder: This folder stores the data gathered during the data collection phase. It follows the same organizational structure as the equilibration folder, but coordinate files are not stored; it holds only the other data and results obtained during this phase.

Overall, the file structure facilitates the organization and management of the generated files, allowing for easy access to input parameters, equilibration trajectories, and collected data from the sampling phase.