Research

Development of Path Sampling and Enhanced Sampling Algorithms

During my Ph.D. with Prof. Ioan Andricioaei at UC Irvine, I led the development of the weighted ensemble milestoning (WEM) algorithm for calculating the free energy and kinetics of rare events in biomolecules from unbiased simulations. This approach combines two well-established path-sampling algorithms, weighted ensemble and milestoning, increasing their efficiency and mitigating their shortcomings. I applied this strategy to calculate the residence time and binding affinity of the 4-hydroxy-2-butanone (BUT) ligand bound to the FKBP protein. The predictions from <100 ns of WEM simulation agreed with 30 μs of unbiased simulation performed on the Anton supercomputer. I also introduced an improved version of this algorithm (M-WEM), which correctly predicted the millisecond-timescale residence time of the trypsin-benzamidine complex from nanosecond-timescale WEM simulations. Given the greater complexity of this system, as demonstrated in subsequent studies, the scalability of the algorithm is noteworthy. A recent review article has independently identified my M-WEM method as among the most efficient and accurate algorithms for predicting ligand binding kinetics.

As a postdoctoral researcher with Prof. Michele Parrinello, I developed an adaptive-bias enhanced sampling algorithm named OPES flooding (OPESf) for calculating the kinetics of rare events. This method is a significant improvement over the standard infrequent metadynamics method used for kinetic calculations, because the user can explicitly cap the maximum bias deposited and thereby avoid biasing the transition state. An open-source implementation of OPESf is available in the PLUMED software. Using OPESf, I calculated the kinetics of protein folding and unfolding, ligand-receptor unbinding, and enzyme catalysis. Others have used this method to study surface-catalyzed reactions. In a recent application, I combined OPESf with the data-mining algorithm dynamic time warping (DTW) to discover previously unknown pathways of protein-ligand unbinding and to calculate exit-path-dependent unbinding kinetics. This work automates the analysis of pathways and kinetics in computational drug design, a field still heavily reliant on binding free energies.
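
To illustrate how dynamic time warping can compare unbinding trajectories that follow the same route at different speeds, here is a minimal sketch of the textbook DTW distance. It is not the implementation used in the work above: the function name `dtw_distance`, the Euclidean frame-to-frame cost, and the idea of representing each trajectory as a time series of collective-variable values are assumptions made for illustration.

```python
import numpy as np


def dtw_distance(path_a, path_b):
    """Classic O(n*m) dynamic time warping distance between two time series.

    Each path is a sequence of frames; a frame may be a scalar or a vector
    of collective-variable values (hypothetical representation).
    """
    a = np.asarray(path_a, dtype=float).reshape(len(path_a), -1)
    b = np.asarray(path_b, dtype=float).reshape(len(path_b), -1)
    n, m = len(a), len(b)

    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # Euclidean frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # repeat frame of b
                                 cost[i, j - 1],      # repeat frame of a
                                 cost[i - 1, j - 1])  # advance both
    return float(cost[n, m])


# A trajectory and a slowed-down copy of it align with zero cost.
fast = [0.0, 1.0, 2.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
print(dtw_distance(fast, slow))
```

Pairwise distances of this kind could then be passed to any standard clustering routine to group trajectories by exit path; that grouping step is likewise an assumption and is not shown here.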

References:

Deep Learning for Collective Variable Discovery

Identifying optimal collective variables (CVs) is a key challenge in enhanced sampling simulation, particularly for complex systems in biophysics and materials science. Deep learning methods have recently been applied in various forms to address this problem. Their key advantage is that the CV can capture a non-linear dependence on a multidimensional feature space. One such approach is deep Targeted Discriminant Analysis (deep-TDA), in which short unbiased simulations are used to train a collective variable that can distinguish the various metastable basins. This technique has been particularly useful for chemical reactions and ligand-receptor binding problems. It has been noted in the literature that an optimal CV should distinguish the transition states as well as the metastable basins, in order to effectively drive the conformational transitions and accelerate the convergence of the free energy landscape. We improve upon deep-TDA by including information about the transition paths, so that the resulting CVs distinguish not only the metastable basins but also the transition states. We tested our method on the Müller-Brown model potential, the folding and unfolding of the chignolin mini-protein, and the binding of a small-molecule ligand to the calixarene host from the SAMPL5 challenge. We demonstrate that including transition path information in deep-learning-based CVs can significantly improve the accuracy and convergence speed of the free energy landscape. Our study may motivate future work on designing optimal CVs using transition state information, with potential applications in the computational investigation of biomolecular processes of pharmaceutical relevance and of chemical reactions involved in catalysis.
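
For readers unfamiliar with the Müller-Brown model system mentioned above, the following is a minimal sketch of that potential using its standard published parameters. The function name `muller_brown` and the probe coordinates are illustrative choices; the sketch only evaluates the energy surface and is not part of the CV-training method described here.

```python
import numpy as np

# Standard Müller-Brown parameters (four Gaussian-like terms).
A = np.array([-200.0, -100.0, -170.0, 15.0])
a = np.array([-1.0, -1.0, -6.5, 0.7])
b = np.array([0.0, 0.0, 11.0, 0.6])
c = np.array([-10.0, -10.0, -6.5, 0.7])
X0 = np.array([1.0, 0.0, -0.5, -1.0])
Y0 = np.array([0.0, 0.5, 1.5, 1.0])


def muller_brown(x, y):
    """Potential energy at the point (x, y), in the potential's own units."""
    dx = x - X0
    dy = y - Y0
    return float(np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)))


# Approximate locations of the three local minima (metastable basins).
basins = {
    "deep": (-0.558, 1.442),
    "intermediate": (-0.050, 0.467),
    "shallow": (0.623, 0.028),
}
for name, (x, y) in basins.items():
    print(name, round(muller_brown(x, y), 2))
```

Because the surface has three basins separated by two saddle points, it is a convenient two-dimensional test of whether a CV separates both the basins and the transition regions between them.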

References:

Studying the Role of Allosterism in Protein Conformational Change and Antigen-Antibody Recognition Using a Graph Network Model

During my Ph.D., I studied the role of allosteric communication in the dynamics of protein complexes, specifically those relevant to the coronavirus pandemic. Using time-lagged independent component analysis (TICA), a protein graph connectivity network, and mutual information, I identified allosteric regulation by distant residues that can affect the conformational change in the receptor binding domain of the SARS-CoV-2 spike protein, which facilitates the infection of human cells. Mutations at the two most prominent residues predicted by my model (D614G and A570D) were later found to create more infectious strains. Identifying such emerging mutants is an important step toward pre-designing vaccines for future outbreaks. I extended this graph network analysis to the dynamics of neutralizing antibodies complexed with the spike protein and performed a comparative study between the wild type and different variants of the SARS-CoV-2 virus. This study demonstrates a new way to study allosterism in protein conformational change and antibody recognition. It may facilitate the computer-aided rational design of monoclonal antibodies to combat infections from future strains of coronavirus.
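
As a rough illustration of how mutual information can be turned into a residue-residue graph, the sketch below estimates pairwise mutual information from binned time series and keeps only the strongly coupled pairs as edges. Everything in it is an assumption made for illustration: the histogram estimator, the function names `mutual_information` and `mi_graph`, the bin count, the threshold, and the synthetic data all stand in for the actual features and estimator used in the study, which are not specified here.

```python
import numpy as np


def mutual_information(x, y, bins=16):
    """Histogram estimate of the mutual information (in nats) of two series."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def mi_graph(features, threshold, bins=16):
    """Weighted adjacency matrix; columns of `features` are per-residue series."""
    n = features.shape[1]
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            mi = mutual_information(features[:, i], features[:, j], bins)
            if mi >= threshold:           # keep only strongly coupled pairs
                adj[i, j] = adj[j, i] = mi
    return adj


# Synthetic stand-in for per-residue observables: columns 0 and 1 are coupled,
# column 2 is independent noise.
rng = np.random.default_rng(0)
r0 = rng.normal(size=5000)
r1 = r0 + 0.1 * rng.normal(size=5000)
r2 = rng.normal(size=5000)
adj = mi_graph(np.column_stack([r0, r1, r2]), threshold=0.5)
print(adj)
```

In a graph built this way, an edge between spatially distant residues is a candidate for allosteric coupling, and standard graph measures such as centrality can then be used to rank residues.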

References: