Research 

[2] Estimating Species Trees from Gene Trees by Maximizing Triplet Consistency

Supervisor: Dr. Md Shamsuzzoha Bayzid

Members: Kowshika Sarkar, Trisha Das, Mazharul Islam

Paper link: https://www.biorxiv.org/content/10.1101/594911v2

Paper homepage and datasets: https://islamazhar.github.io/STELAR/

 

Abstract: Gene trees often differ from species trees, which complicates species tree estimation. In this project, we develop a statistically consistent method for estimating the true species tree from gene trees; that is, the probability of returning the true species tree converges to one as the amount of data increases. Our method is a summary method that estimates the species tree by maximizing triplet consistency.
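
To make the triplet consistency criterion concrete, here is a minimal brute-force sketch of the score for one candidate species tree. It is only an illustration of the objective, not the search procedure used in the paper. The nested-tuple tree representation and the function names are assumptions made for this example, and every gene tree is assumed to be rooted and to contain all taxa of the species tree.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Return the leaf sets of all internal nodes of a rooted tree."""
    if isinstance(tree, str):
        return set()
    result = {leaves(tree)}
    for child in tree:
        result |= clusters(child)
    return result


def triplet_topology(tree_clusters, a, b, c):
    """Return the pair grouped together against the third taxon, or None.

    The pair {x, y} is grouped against z when some internal node
    contains x and y but not z.
    """
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        if any(x in cl and y in cl and z not in cl for cl in tree_clusters):
            return frozenset((x, y))
    return None  # unresolved triplet (polytomy)


def triplet_consistency(species_tree, gene_trees):
    """Count (triplet, gene tree) pairs whose topology matches the species tree.

    Assumption: every gene tree contains all taxa of the species tree.
    """
    species_clusters = clusters(species_tree)
    gene_clusters = [clusters(g) for g in gene_trees]
    taxa = sorted(leaves(species_tree))
    score = 0
    for a, b, c in combinations(taxa, 3):
        expected = triplet_topology(species_clusters, a, b, c)
        if expected is None:
            continue
        for gc in gene_clusters:
            if triplet_topology(gc, a, b, c) == expected:
                score += 1
    return score


# Hypothetical toy input: one candidate species tree and three gene trees.
species = (("A", "B"), ("C", "D"))
genes = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    ((("A", "B"), "C"), "D"),
]
score = triplet_consistency(species, genes)
```

A method that maximizes triplet consistency looks for the species tree with the highest such score over the input gene trees. Enumerating all triplets as above takes time cubic in the number of taxa for each gene tree, so it is only practical for small examples.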

Probabilistic Methods for Filling Gaps in Genome Assemblies

Supervisors: Dr. Atif Hasan Rahman and Dr. Swakkhar Shatabda

Members: Mazharul Islam and Sumit Tarafdar

Project homepage: https://github.com/islamazhar/GapFiller

Abstract: Because genomes are very long, no current sequencing machine can read a whole genome at once. When the short reads are stitched together into an assembly, some gaps remain where the nucleotide sequence is unknown. To address this problem, we have designed a probabilistic method for filling the gaps which, unlike other methods, takes the gap length into account. Our experiments on both simulated and real data show that this novel approach can fill gaps of length < 350, converging quickly while remaining accurate. We are now working on filling gaps of length > 350.