Weekly topics covered and references for further reading will be posted here.
Week 1
(Jan 6 - Jan 10)
(i) basics of molecular biology, high-throughput sequencing, big-data challenges, (ii) guest lecture by Prof. Denis Thieffry (iii) dynamic range-minimum queries
MBCT Chapter 1, MBCT 3.1
Week 2
(Jan 13 - Jan 17)
(i) Bitvector rank and select queries, (ii) exact matching, Z-algorithm (iii) k-mer index, (iv) Suffix-array
MBCT 3.2, Gusfield Chapter 1, MBCT 8.1, 8.2, Manber and Myers 1993
Week 3
(Jan 20 - Jan 24)
(i) k-mer index (ii) Suffix-array (iii) Construction of suffix arrays (iv) Prefix-doubling approach for suffix array construction
MBCT 8.1, 8.2, Manber and Myers 1993
Week 4
(Jan 27 - Jan 31)
(i) Suffix tries (ii) Suffix trees (iii) LCP array (iv) Kasai's algorithm for constructing LCP arrays
Gusfield Chapter 5, Kasai LCP construction algorithm
Week 5
(Feb 3 - Feb 7)
(i) Generalised suffix trees (ii) Applications
Highlight: All-pairs longest suffix-prefix overlap problem
MBCT Theorem 8.15, Gusfield Chapter 7
Week 6
(Feb 10 - Feb 14)
(i) Burrows-Wheeler transform (ii) LF-Mapping (iii) FM-index
MBCT 9.1, 9.2
Week 7
(Feb 17 - Feb 21)
(i) Demo: Burrows-Wheeler transform, IGV alignment visualisation (ii) Mid-term examination (iii) Approximate matching (iv) Edit distance
MBCT 6.1
Week 8
(Feb 24 - Feb 28)
(i) Project topics (ii) Mid-term examination solutions (iii) Score-based approximate matching (iv) Global alignment (v) Approximate occurrence of a pattern in a text (vi) Local alignment
Gusfield 11.2, 11.3, 11.6
Week 9
(Mar 3 - Mar 7)
(i) Affine gap costs in approximate alignment (ii) Statistical significance of alignments (iii) Karlin-Altschul theory (iv) Maximal Unique Matches
Aluru Chapter 15, Aluru Chapter 7-7, MBCT 8.4.2
Week 10
(Mar 10 - Mar 14)
(i) Co-linear chaining (ii) Sub-quadratic algorithms for finding best scoring chain (iii) Co-linear chaining with L1 gap cost
Aluru Chapter 15
Week 11
(Mar 17 - Mar 21)
(i) Genome assembly (ii) Lander-Waterman statistics for shotgun sequencing (iii) Shortest common superstring (iv) Greedy algorithm for SCS (v) Runtime analysis, approximation factor, and feasibility of Greedy SCS
Glenn Tesler's slides, Gusfield 16.15, 16.17, Theorem 2 in Bresler et al. 2013
Week 12
(Mar 24 - Mar 28)
(i) de Bruijn Graphs (ii) Genome assembly using de Bruijn graphs (iii) [Project] A deep reinforcement learning framework for haplotype assembly (iv) [Project] DNA language models for mutation classification: identifying benign and harmful variants