Research & Projects

Project 1

Machine Learning Methods for Predicting Essential Metabolic Genes from Genome-Scale Metabolic Network.

In this paper, we present a novel approach to predicting metabolic gene essentiality in the pathogen Plasmodium falciparum using machine learning and network analysis. We analyze its genome-scale metabolic model, iAM_Pf480 from the BiGG database, together with essentiality information from the OGEE database. The key innovation is the incorporation of Mass Flow Graphs, derived from the calculated flux vector of the metabolic network, into a network-based machine learning framework.
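
To illustrate what a Mass Flow Graph is, here is a minimal sketch of one common formulation of flux-dependent graphs, applied to a toy three-reaction chain. It is not the project's actual code: the function name `mass_flow_graph`, the toy stoichiometric matrix, and the flux vector are assumptions made for illustration. Each reaction is split into a forward and a reverse node, and an edge from reaction i to reaction j is weighted by the mass of metabolites produced by i and consumed by j.

```python
import numpy as np

def mass_flow_graph(S, v):
    """Reaction-to-reaction mass flow matrix for a flux vector v (illustrative sketch)."""
    S = np.asarray(S, dtype=float)   # metabolites x reactions
    v = np.asarray(v, dtype=float)   # one flux value per reaction

    # Unfold each reaction into forward and reverse directions.
    v2m = np.concatenate([np.maximum(v, 0.0), np.maximum(-v, 0.0)])
    S2m = np.hstack([S, -S])

    # Production and consumption stoichiometry (both non-negative).
    S_plus = np.maximum(S2m, 0.0)
    S_minus = np.maximum(-S2m, 0.0)

    V = np.diag(v2m)
    production = S_plus @ v2m        # total production flux per metabolite

    # Pseudo-inverse of the diagonal production matrix (zero stays zero).
    inv_production = np.zeros_like(production)
    nonzero = production > 0
    inv_production[nonzero] = 1.0 / production[nonzero]

    # Entry (i, j): mass flowing from reaction i to reaction j.
    return (S_plus @ V).T @ np.diag(inv_production) @ (S_minus @ V)

# Toy chain (hypothetical): R1: -> A, R2: A -> B, R3: B ->
S_toy = [[1, -1, 0],
         [0, 1, -1]]
v_toy = [1.0, 1.0, 1.0]
M = mass_flow_graph(S_toy, v_toy)
```

In this toy case the only non-zero entries of `M` are the edges R1 to R2 and R2 to R3. In a real workflow the flux vector would come from a flux balance analysis of the genome-scale model, and graph features computed on the resulting network could then be passed to a classifier.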

Objectives

Investigate how machine learning and network-based techniques can be applied to predict essential metabolic genes.

Results

Our proposed method achieved an accuracy of 0.85 and an AUROC of 0.7 in predicting gene essentiality. Notably, it predicted as essential nine genes that are recorded as non-essential in the OGEE database. These genes are candidate drug targets for malaria treatment, which highlights the practical implications of our research.
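
For readers unfamiliar with the two metrics, the sketch below shows how accuracy and AUROC can be computed from labels and classifier scores. The labels, scores, and the 0.5 decision threshold are toy values chosen for illustration; they are not data from the project.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def auroc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    y = np.asarray(y_true)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y == 1], s[y == 0]
    diff = pos[:, None] - neg[None, :]
    # Correctly ordered pairs count 1, tied pairs count 0.5.
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg)))

# Toy example (hypothetical): 1 = essential, 0 = non-essential.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = auroc(y_true, scores)
```

Accuracy depends on the chosen threshold, while AUROC summarizes ranking quality across all thresholds, which is why the two numbers are reported together.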

Project 2

RNA-Seq-Pop for RNA-Seq Analysis

RNA-Seq-Pop is a versatile computational pipeline designed by Dr. Sanjay Curtis Nagi to analyze Illumina RNA-Seq data from any organism. It not only performs core transcriptomic analyses, such as differential expression, but also identifies and evaluates genetic polymorphisms, providing population genomic insights.

Objectives

Our goal is to continuously improve RNA-Seq-Pop by developing and integrating scripts that enhance its RNA-Seq analysis performance.

Results

In recent updates, I collaborated with Dr. Nagi to incorporate a new aligner that demonstrates higher accuracy, giving more reliable analysis outcomes.

Documentation: https://sanjaynagi.github.io/rna-seq-pop/

GitHub: https://github.com/stephen-bin/rna-seq-pop/tree/replacing-hisat2-with-star/workflow

Project 3

Anopheles gambiae RNA-Seq Field Sample Analysis

This project uses RNA-Seq data from field samples to explore both genetic and transcriptomic characteristics. Our pipeline includes core analyses such as descriptive statistics, differential gene expression analysis, and gene ontology enrichment, with a special focus on identifying specific mutations and allele frequencies across samples.

Objectives

Our primary objectives are to conduct detailed descriptive analysis, identify differentially expressed genes, and perform gene ontology analysis to uncover functional insights. We aim to pinpoint notable mutations and track allele frequencies to support population-level genetic interpretations.
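
As a small illustration of what tracking allele frequencies involves, the sketch below estimates the frequency of an alternate allele at one site from read counts. The function name `allele_frequency`, the sample names, and the counts are hypothetical; this is not the pipeline's actual implementation.

```python
def allele_frequency(ref_count, alt_count):
    """Alternate-allele frequency from read counts; None if the site has no coverage."""
    total = ref_count + alt_count
    if total == 0:
        return None
    return alt_count / total

# Hypothetical (reference reads, alternate reads) at one variant site.
samples = {
    "sample_A": (30, 10),
    "sample_B": (0, 0),
    "sample_C": (5, 15),
}

freqs = {name: allele_frequency(ref, alt) for name, (ref, alt) in samples.items()}
```

Returning `None` for uncovered sites keeps missing data distinct from a true frequency of zero when comparing samples.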

Results

Our results are presented through informative visualizations, including Venn diagrams, volcano plots, and heatmaps, providing a clear view of the genetic landscape and expression changes across the samples.

Contact:

Social Media