Prof. Uwacu - Research blog

Week 2

Explored the AutoDock setup, which requires .pdbqt input files and a grid configuration. I also looked into using PyMOL and Chimera to clean the protein files and convert .pdb files to .pdbqt.
Compared docking methods: AutoDock uses a genetic algorithm, Vina uses gradient-based optimization, and DiffDock uses generative modeling that does not need a predefined binding site.
Interpreted output differences: AutoDock and Vina report binding energies, while DiffDock gives ligand pose predictions with confidence scores.
Read and summarized the ScanNet paper, which proposes a geometric deep learning model that predicts binding sites directly from 3D protein structures.
Learned that ScanNet outperforms traditional methods using interpretable patterns and attention-based graph neural networks, but I still need to understand details such as the spatio-chemical filters and attention pooling.
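
To make the grid configuration step more concrete, here is a minimal Python sketch of how a docking box could be derived from a set of atom coordinates (for example, the atoms of a known ligand). The function name grid_box and the padding value are my own assumptions for illustration; this is not how AutoDock itself computes its grid.

```python
def grid_box(coords, padding=5.0):
    """Return (center, size) of an axis-aligned box around coords.

    coords  : iterable of (x, y, z) tuples, e.g. ligand atom positions
    padding : extra margin added on every side (hypothetical default)
    """
    xs, ys, zs = zip(*coords)
    mins = (min(xs), min(ys), min(zs))
    maxs = (max(xs), max(ys), max(zs))
    # Box center is the midpoint of the extremes along each axis.
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    # Box edge length is the coordinate span plus padding on both sides.
    size = tuple((hi - lo) + 2 * padding for lo, hi in zip(mins, maxs))
    return center, size
```

The center and edge lengths returned here would still need to be translated into whatever grid parameters the docking tool expects.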

Week 3

AutoDock Progress Summary:

This week, I worked on documenting and testing protein-ligand docking using AutoDock. Here is what I have worked on:

Wrote step-by-step AutoDock documentation with screenshots.
Fixed the AutoDock code so that it takes the ADT (AutoDockTools) outputs and the docking runs complete properly.
Worked on protein 4fwb with ligand 3kp and calculated the binding energy with the software.
Parameter testing:
- Best result with 10 GA runs (GA = Genetic Algorithm; this parameter sets how many independent docking runs to perform)
- The medium setting for the number of energy evaluations performed best
Tested 2 additional protein-ligand pairs with full preprocessing.
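
When comparing parameter settings, the quantity I look at is the lowest binding energy across the independent GA runs. A minimal sketch of that bookkeeping is below; the function name summarize_runs is hypothetical, and the caller supplies the energies (I am not reproducing real results here).

```python
from statistics import mean


def summarize_runs(energies):
    """Summarize binding energies (one value per independent GA run).

    Lower (more negative) energy is treated as the better result.
    """
    best_index = min(range(len(energies)), key=lambda i: energies[i])
    return {
        "best_run": best_index + 1,        # 1-based run number
        "best_energy": energies[best_index],
        "mean_energy": mean(energies),
    }
```

Running this once per parameter setting gives one summary per setting, which makes it easier to compare, for example, different numbers of GA runs.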

Week 4

This week I worked on setting up the PPL library, learning more about the C++ language, and studying motion planning algorithms. Here is what I have worked on:

Set up and compiled the PPL library and Vizmo, created aliases for easier execution, and verified successful builds.
Completed tasks involving sampling and connection experiments; analyzed .stat and .map output files.
Explored how varying parameters (like k) affects performance and collision detection.
Practiced C++, reviewed motion planning concepts, and read about real-world applications.
Troubleshot build issues and learned how to manage the build configuration using CMakeLists.txt.
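
To show what the parameter k controls in the connection step, here is a minimal Python sketch of k-nearest-neighbor connection, assuming k is the number of nearest neighbors each sample tries to connect to. It is written in Python for brevity and is not PPL code; the function name k_nearest_edges is my own, and the edge validity (collision) check that a real planner performs is left out.

```python
import math


def k_nearest_edges(samples, k):
    """Return the set of undirected edges linking each sample to its k nearest neighbors.

    samples : list of coordinate tuples
    k       : number of neighbors each sample attempts to connect to
    """
    edges = set()
    for i, p in enumerate(samples):
        # Sort all other samples by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(samples)) if j != i),
            key=lambda j: math.dist(p, samples[j]),
        )
        for j in others[:k]:
            # Store each edge once, with the smaller index first.
            edges.add((min(i, j), max(i, j)))
    return edges
```

A larger k produces more candidate edges, which means more collision checks per sample but usually a better connected roadmap.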

Week 5

This week I worked on flexible ligand setup, protein-ligand modeling, and roadmap generation using the PPL library. Here is what I have worked on:

Cleaned the protein and ligand structures using Chimera and PyMOL; retrieved protein data from the Protein Data Bank and built a geometry model.
Modeled a flexible ligand and used the “AlwaysTrue” validity checker to generate a random roadmap ignoring collisions.
Visualized the generated roadmap using Vizmo and examined its structure.
Switched back to “pqp_solid” for accurate collision detection and added parameters to ignore adjacent link collisions.
Struggles: I attempted to generate an energetically valid roadmap using the BioPotential distance metric, but could not proceed because of unresolved issues with the metric. We need more direction on this.
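
The idea of ignoring adjacent link collisions can be sketched as follows. This is a simplified Python illustration that models each ligand link as a sphere; it is not how pqp_solid works internally, and the function name and the ignore_adjacent parameter are assumptions of mine.

```python
import math


def self_collides(centers, radius, ignore_adjacent=1):
    """Check a chain of equal-radius spheres for self-collision.

    centers         : list of (x, y, z) link centers, in chain order
    radius          : sphere radius used for every link
    ignore_adjacent : links this many steps apart (or fewer) are never tested,
                      since neighbors in a chain always touch at the joint
    """
    n = len(centers)
    for i in range(n):
        # Skip the next `ignore_adjacent` links after link i.
        for j in range(i + 1 + ignore_adjacent, n):
            if math.dist(centers[i], centers[j]) < 2 * radius:
                return True
    return False
```

Without the adjacency exemption, almost every configuration of a linked chain would be reported as colliding, because neighboring links overlap at their shared joint.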

Week 6

This week I focused on integrating a new energy-based distance metric into the motion planning framework and connecting it to our roadmap planner. Here is what I have worked on:

Integrated the BioPotential class into the distance metrics module by refactoring the class into .h and .cpp files and registering it in the build system.
Successfully compiled and tested the BioPotential metric with the PRM planner; however, I encountered issues when using it with RRT.
Investigated compatibility concerns with RRT, particularly whether it requires strictly Euclidean metrics.
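
One way to probe the compatibility question is to test whether a distance function behaves like a true metric on a few sample points. The sketch below is a generic Python check for symmetry and the triangle inequality; the function name metric_violations is hypothetical, and any energy-like function passed to it would be a stand-in, not the actual BioPotential class.

```python
import itertools


def metric_violations(dist, points, tol=1e-9):
    """Count violations of symmetry and the triangle inequality.

    dist   : function taking two points and returning a number
    points : list of sample points to test on
    """
    issues = {"symmetry": 0, "triangle": 0}
    for a, b in itertools.combinations(points, 2):
        if abs(dist(a, b) - dist(b, a)) > tol:
            issues["symmetry"] += 1
    for a, b, c in itertools.permutations(points, 3):
        # Going directly from a to c should never cost more than going via b.
        if dist(a, c) > dist(a, b) + dist(b, c) + tol:
            issues["triangle"] += 1
    return issues
```

If a function fails these checks, that does not by itself show it cannot be used with RRT, but it does mean that nearest-neighbor queries may behave differently than they would with a Euclidean distance.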

Reading Reflection – “Sampling-Based Motion Planning for Tracking Evolution of Dynamic Tunnels in Molecular Dynamics Simulations” by Vonasek et al.:

Problem: The paper addresses the challenge of tracking the evolution of internal tunnels in proteins as they undergo molecular dynamics simulations—an important task for understanding ligand binding and escape paths.
Proposed Solution: The authors present a modified RRT algorithm that operates in two phases:
- Phase 1: Blocking spheres are added near the protein surface to keep the tree confined within the internal void space, ensuring thorough exploration.
- Phase 2: Blocking spheres are removed, allowing the tree to exit and identify potential tunnel exits.
Validation: The method is applied frame-by-frame over a molecular simulation. Trees are reused between frames through a process of pruning invalid nodes, expanding new nodes, and merging components. A dynamic component graph tracks tunnel evolution across frames.
Takeaways: This approach shows how planning algorithms can adapt to time-varying biological environments, reusing prior knowledge and maintaining connectivity across frames—an important principle for motion planning in dynamic systems like proteins.
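
My reading of the two-phase idea can be sketched as a validity check that changes between phases. This is a minimal Python illustration based only on the summary above, not the authors' implementation; the sphere representation and the function name is_valid are assumptions.

```python
import math


def is_valid(point, atoms, blocking_spheres, phase):
    """Return True if point is a valid tree node in the given phase.

    atoms            : list of (center, radius) spheres for protein atoms
    blocking_spheres : list of (center, radius) spheres near the surface
    phase            : 1 = blocking spheres active, 2 = blocking spheres removed
    """
    def inside_any(spheres):
        return any(math.dist(point, center) < radius for center, radius in spheres)

    # Protein atoms are obstacles in both phases.
    if inside_any(atoms):
        return False
    # Blocking spheres only confine the tree during phase 1.
    if phase == 1 and inside_any(blocking_spheres):
        return False
    return True
```

A point near the surface that is rejected in phase 1 becomes valid in phase 2, which is what lets the tree grow out of the protein and locate tunnel exits.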

Week 7

This week, I focused on implementing a new strategy called BioPRM, a modified version of the basic PRM (Probabilistic Roadmap) planner that incorporates BioPotential as its distance metric.

Strategy Implementation:

Created BioPRM by cloning and modifying the existing PRM strategy.
Integrated the BioPotential distance function into the sampling process to bias roadmap construction toward low-energy regions.
Restricted the sampling to a bounded region near a binding site, using a user-defined center and radius provided via XML.
Made changes to write energy files that map the coordinates of each sample to its energy.
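
The bounded sampling step can be illustrated with a short Python sketch that draws positions uniformly inside a sphere given a center and radius. It uses rejection sampling and covers only the positional part of a configuration (no orientation or torsion angles); the function name sample_in_sphere is my own, and the actual BioPRM code is C++ inside PPL.

```python
import random


def sample_in_sphere(center, radius, rng=random):
    """Draw one point uniformly from the sphere of given center and radius.

    Uses rejection sampling: draw from the bounding cube and keep the
    first point that falls inside the sphere.
    """
    while True:
        offset = [rng.uniform(-radius, radius) for _ in range(3)]
        if sum(o * o for o in offset) <= radius * radius:
            return tuple(c + o for c, o in zip(center, offset))
```

Passing a seeded random.Random instance as rng makes the samples reproducible, which helps when comparing energy files between runs.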

Code Integration & XML Setup

Registered BioPRM in the motion planning strategy list and ensured proper compilation.
Set up XML tags to control the sampling region and enable energy recording for each sample.
Logged energy values of all configurations sampled near the binding site for later analysis.

Preliminary Results

Ran simulations with the new strategy and inspected the resulting sample energy files.
The recorded energies are very high, and this needs further investigation.

Reflection

This week's work helped me combine an energy-based distance metric with sampling-based planning. I also learned a lot about debugging, because my code did not compile at first.

Week 8

This week, I shifted focus from code development to data analysis and result visualization.

Simulation Output & Statistics

Ran the BioPRM strategy and successfully generated energy statistics files for sampled configurations.
Wrote Python scripts using pandas and matplotlib to:
- Compute descriptive statistics (mean, min, max, std)
- Visualize energy distributions with scatter plots, box plots, and comparison charts
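
The descriptive statistics step looks roughly like the sketch below. To keep it runnable without extra dependencies, it uses only the Python standard library rather than pandas, and the function name describe_energies is hypothetical; the std value is the sample standard deviation.

```python
import statistics


def describe_energies(energies):
    """Return count, mean, min, max, and sample standard deviation of energies."""
    return {
        "count": len(energies),
        "mean": statistics.mean(energies),
        "min": min(energies),
        "max": max(energies),
        "std": statistics.stdev(energies),  # sample std (n - 1 in the denominator)
    }
```

With pandas, calling describe() on the energy column gives a similar summary in one step.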

Comparison with AutoDock

Started comparing BioPRM’s high-energy samples with AutoDock docking poses.
It was somewhat difficult to make the comparison, because AutoDock reports low (negative) energies while our algorithm produces very high energies.

Poster Preparation

Began drafting my research poster, focusing on:
- Experimental section
- Visualizing roadmap samples and energy distributions
- Comparisons with AutoDock results
Collected key figures and wrote preliminary captions and bullet points.
