Basics of biological data
Role of algorithms and data structures
High-throughput DNA/RNA sequencing
Data structures and algorithms warm-up
Z algorithm
Suffix arrays
Suffix tries and suffix trees
Burrows-Wheeler Transform, FM-Index
Hamming distance, edit distance, dynamic programming
Pairwise sequence alignment
Global vs. local alignment, scoring functions
k-mer indexes
Co-linear chaining
Sketching and sublinear data structures
de Bruijn graphs, overlap graphs
Haplotype assembly and phasing
Hidden Markov models, gene finding
General purpose compression: Lempel–Ziv
Domain-specific compression algorithms
The tree of life
Maximum parsimony, maximum likelihood algorithms
Variant calling & emerging deep learning-based methods
Recent advances in genome assembly
We were fortunate to hear cutting-edge bioinformatics research from the following presenters in the previous iterations of DS202 course.
Denis Thieffry (2025)
Jim Shaw (2024)
Mikko Rautiainen (2023)
Niranjan Nagarajan (2022)
Amatur Rahman (2022)
Mile Sikic (2021)
Ramesh Hariharan (2021)
Kishwar Shafin (2021)
2024
Linear time suffix tree construction using Ukkonen's algorithm
Algorithms for computing tumour phylogenies
Genotype imputation using hidden Markov Models
Transformer-based deep learning model to predict sequence consensus
Reproducible feature selection in deep neural networks
2023
Likelihood-based algorithms for de novo genome assembly
Greedy algorithm for aligning DNA sequences
Near-linear time edit distance algorithm
Using stochastic context-free grammar to predict RNA structure
Building phylogenetic trees using maximum likelihood
2022
Accelerating sequence alignment using parallel prefix algorithm
Using wheeler graph as succinct index for efficient pattern matching
Construction of string graphs using FM-index for de novo genome assembly
Gene finding using hidden Markov models
2021
Succinct representation of de Bruijn graphs
Minimizer-based k-mer sampling and its applications
Distributed-memory parallel algorithms for construction of suffix array and suffix tree
Optimal algorithms for sequence to pangenome graph alignment
Knowledge of basic data structures, algorithms, programming experience
DS-221 Intro to scalable systems (or) DS-201 Bioinformatics (or) E0-251 Data structures and algorithms (or) E0-225 Design and analysis of algorithms (or) consent from the Instructor