Computational Biology Software Development
BayFlux - Metabolic Modeling using Bayesian Statistics
Developing BayFlux-v2.0: Advanced Genome-Scale Metabolic Modeling Software
Designed and developed BayFlux-v2.0, a high-performance software platform that uses Bayesian inference to model genome-scale metabolic networks, with applications in biofuel optimization and industrial biotechnology.
Enhanced BayFlux to accurately analyze metabolic flux distributions, improving predictions and pathway optimization across diverse biological systems.
Extended BayFlux to support high-throughput simulations for species such as Pseudomonas putida, enabling cross-species flux analysis for industrial applications.
Algorithm Optimization: Multiple-Proposal Markov Chain Monte Carlo (MP-MCMC)
Implemented two novel Multiple-Proposal Markov Chain Monte Carlo (MP-MCMC) algorithms to parallelize the sampling process in BayFlux, significantly improving computational efficiency.
Achieved:
50× speedup via sparse matrix optimization
100× speedup using PyTorch-based CPU acceleration
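To illustrate why sparsity matters here, the following is a minimal pure-Python sketch, not BayFlux code. It assumes the matrix being optimized resembles a stoichiometric matrix, where each reaction touches only a few metabolites, so most entries are zero. The toy network and the helper names `to_sparse_rows` and `sparse_matvec` are hypothetical.

```python
def to_sparse_rows(dense):
    """Keep only the nonzero entries of each row as (column, value) pairs."""
    return [[(j, val) for j, val in enumerate(row) if val != 0] for row in dense]


def sparse_matvec(rows, v):
    """Multiply the sparse matrix by vector v, touching only nonzero entries."""
    return [sum(val * v[j] for j, val in row) for row in rows]


# Hypothetical toy network: uptake -> A, A -> B, B -> secretion.
# Rows are metabolites (A, B); columns are the three reactions.
S = [
    [1, -1, 0],
    [0, 1, -1],
]
S_sparse = to_sparse_rows(S)

# At steady state the net production of every metabolite is zero.
fluxes = [2.0, 2.0, 2.0]
residual = sparse_matvec(S_sparse, fluxes)
```

The cost of the product scales with the number of nonzero entries rather than with rows times columns. In practice a library such as `scipy.sparse` or PyTorch sparse tensors would take the place of these hand-written helpers.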
Currently optimizing BayFlux for GPU-based execution using CUDA and PyTorch to further scale performance on heterogeneous compute platforms.
Integrated advanced sampling techniques to improve convergence speed, accuracy, and robustness of the Bayesian inference process.
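The general multiple-proposal idea can be sketched as follows. This is a generic textbook-style illustration, not either of the BayFlux algorithms: each iteration draws several candidates around the current point, weights every point in the candidate set by its target density times the density of proposing all the other points from it, and then resamples from the weighted set. The one-dimensional standard normal target, the function names, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def log_target(x):
    # Hypothetical target: standard normal log-density, up to a constant.
    return -0.5 * x * x


def mp_mcmc(log_target, x0, n_iter=3000, n_prop=8, n_draws=8, step=1.0, seed=0):
    """Multiple-proposal sampler with Gaussian random-walk proposals (1-D)."""
    rng = random.Random(seed)
    x = float(x0)
    samples = []
    for _ in range(n_iter):
        # Candidate set: the current point plus n_prop proposals around it.
        pts = [x] + [x + step * rng.gauss(0.0, 1.0) for _ in range(n_prop)]
        # Weight of candidate j: target(x_j) times the density of proposing
        # every other candidate from x_j. The k == j term contributes zero,
        # and the Gaussian normalizing constants cancel across candidates.
        # The target evaluations in this loop are independent of each other,
        # which is the part that can be run in parallel.
        log_w = []
        for xj in pts:
            log_k = sum(-0.5 * ((xk - xj) / step) ** 2 for xk in pts)
            log_w.append(log_target(xj) + log_k)
        top = max(log_w)
        weights = [math.exp(lw - top) for lw in log_w]
        # Draw several samples from the weighted candidate set; the last
        # draw becomes the starting point of the next iteration.
        drawn = rng.choices(pts, weights=weights, k=n_draws)
        samples.extend(drawn)
        x = drawn[-1]
    return samples


samples = mp_mcmc(log_target, 0.0)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

For this target the sample mean and variance should land near 0 and 1. The sketch runs serially; the speedup comes from evaluating the candidates concurrently.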
Docker Containerization & HPC Deployment
Created a Docker-based deployment pipeline from scratch to streamline setup and ensure reproducibility across environments.
Optimized the container for deployment on high-performance computing (HPC) systems such as those at NERSC, using Shifter to run large-scale metabolic simulations with minimal overhead.
Simplified usability by packaging a pre-configured environment, reducing user setup time and compatibility issues.
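As a rough illustration of the Shifter workflow, the sketch below builds the two command lines typically involved: pulling a Docker image into Shifter's image store, then running a command inside it. The image name `example/bayflux:latest` and the script `run_simulation.py` are placeholders, not the actual BayFlux image or entry point.

```python
def shifter_pull_command(image):
    """Command that imports a Docker image into Shifter's image store."""
    return ["shifterimg", "pull", f"docker:{image}"]


def shifter_run_command(image, command):
    """Command that runs `command` inside the given image via Shifter."""
    return ["shifter", f"--image=docker:{image}"] + list(command)


# Placeholder image and script names, for illustration only.
IMAGE = "example/bayflux:latest"
pull_cmd = shifter_pull_command(IMAGE)
run_cmd = shifter_run_command(IMAGE, ["python", "run_simulation.py"])
print(" ".join(pull_cmd))
print(" ".join(run_cmd))
```

On a cluster these argument lists could be passed to `subprocess.run` or written into a batch script. Building them as lists avoids shell-quoting problems.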
Scalable Architecture: Parallel Computing on CPU & GPU Clusters
Optimized BayFlux for distributed computing architectures, supporting both multi-core AMD CPUs and NVIDIA GPUs, paving the way for scalable, high-throughput metabolic modeling across HPC systems.
Expanding BayFlux to New Species: Toward Automation
Extended BayFlux’s capabilities to support non-model organisms, with successful implementation in Pseudomonas putida.
Ongoing work includes automating the genome-scale atom transition mapping process, currently a manual bottleneck, to enable high-throughput simulation pipelines across new biological species.
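For readers unfamiliar with atom transitions, the following minimal sketch shows the kind of information a mapping encodes, using the letter notation common in 13C flux analysis. Each letter names one carbon atom, so `A (abc) -> B (ab) + C (c)` says the first two carbons of A end up in B and the third in C. The function name and the toy reaction are hypothetical and do not reflect BayFlux's internal format.

```python
def apply_transition(substrates, products, labels):
    """Propagate isotope labels through one reaction's atom transition.

    substrates, products: metabolite name -> atom letters, e.g. {"A": "abc"}
    labels: substrate name -> label string, "1" = labeled, "0" = unlabeled
    """
    atom_label = {}
    for met, atoms in substrates.items():
        for atom, lab in zip(atoms, labels[met]):
            atom_label[atom] = lab
    return {
        met: "".join(atom_label[atom] for atom in atoms)
        for met, atoms in products.items()
    }


# Toy reaction A (abc) -> B (ab) + C (c), with A labeled at its first carbon.
result = apply_transition({"A": "abc"}, {"B": "ab", "C": "c"}, {"A": "100"})
print(result)  # {'B': '10', 'C': '0'}
```

Writing such letter strings by hand for every reaction in a genome-scale model is the manual step that automation would replace.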
Our GitHub Repo: BayFlux