Schedule

Weekly Activity

Week 2 - Chose Topic 9

9. Motif discovery. Implement the "Gibbs sampling" algorithm for motif finding described in Lawrence et al., "Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment", Science, 262, 208-214, 1993 (also described in the slides). Run the program and collect experimental data. How would you improve its performance?

Week 3 - Site creation and topic search

Read Articles:

Resnik, P. and Hardisty, E. "Gibbs Sampling for the Uninitiated" (2010)

Lawrence et al. "Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple Alignment" (1993)

L. Martino, V. Elvira, G. Camps-Valls. "The Recycling Gibbs Sampler for Efficient Learning" (2017)

https://www.youtube.com/watch?v=BkUy3A99gno

Week 4 - Changed to Project 5

5. Suffix Arrays. Design a "suffix array" C++ class. Design an algorithm/program to find the maximal unique matches between two long strings (ideally chromosomes). Your implementation must be space-efficient. Collect data on time and space used for different input sizes.
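To make the task statement concrete, here is a minimal sketch of one way it could be approached. It is not the project's actual code. The names SuffixArray, Match and findMums, and the '#' separator, are assumptions chosen for illustration. The suffix array is built with a plain comparison sort (worst case roughly O(n^2 log n), not a linear-time construction), and the LCP array uses Kasai's algorithm. A maximal unique match is reported when two adjacent suffixes come from different strings, share a prefix that no neighbouring suffix shares, and cannot be extended to the left.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class: stores the text plus two int arrays (suffix array and LCP).
class SuffixArray {
public:
    explicit SuffixArray(std::string text)
        : text_(std::move(text)), sa_(text_.size()), lcp_(text_.size(), 0) {
        std::iota(sa_.begin(), sa_.end(), 0);
        // Simple construction: sort suffix start positions by comparing suffixes.
        std::sort(sa_.begin(), sa_.end(), [this](int x, int y) {
            return text_.compare(x, std::string::npos,
                                 text_, y, std::string::npos) < 0;
        });
        buildLcp();
    }
    const std::string& text() const { return text_; }
    const std::vector<int>& sa() const { return sa_; }
    // lcp()[i] = longest common prefix of suffixes sa()[i-1] and sa()[i]; lcp()[0] = 0.
    const std::vector<int>& lcp() const { return lcp_; }

private:
    void buildLcp() {  // Kasai's algorithm, linear time given the suffix array
        const int n = static_cast<int>(text_.size());
        std::vector<int> rank(n);
        for (int i = 0; i < n; ++i) rank[sa_[i]] = i;
        int h = 0;
        for (int i = 0; i < n; ++i) {
            if (rank[i] == 0) { h = 0; continue; }
            const int j = sa_[rank[i] - 1];
            while (i + h < n && j + h < n && text_[i + h] == text_[j + h]) ++h;
            lcp_[rank[i]] = h;
            if (h > 0) --h;
        }
    }
    std::string text_;
    std::vector<int> sa_;
    std::vector<int> lcp_;
};

struct Match { int posA; int posB; int length; };

// Assumes `sep` occurs in neither input string.
std::vector<Match> findMums(const std::string& a, const std::string& b, char sep = '#') {
    assert(a.find(sep) == std::string::npos && b.find(sep) == std::string::npos);
    const int na = static_cast<int>(a.size());
    const SuffixArray index(a + sep + b);
    const std::vector<int>& sa = index.sa();
    const std::vector<int>& lcp = index.lcp();
    const std::string& s = index.text();
    const int n = static_cast<int>(sa.size());
    std::vector<Match> out;
    for (int i = 1; i < n; ++i) {
        const int len = lcp[i];
        if (len == 0) continue;
        int p = sa[i - 1], q = sa[i];
        if (p > q) std::swap(p, q);
        if (!(p < na && q > na)) continue;              // need one suffix from each string
        if (lcp[i - 1] >= len) continue;                // shared prefix also occurs earlier
        if (i + 1 < n && lcp[i + 1] >= len) continue;   // shared prefix also occurs later
        if (p > 0 && s[p - 1] == s[q - 1]) continue;    // extendable to the left
        out.push_back(Match{p, q - na - 1, len});
    }
    return out;
}
```

For example, calling findMums("xabcy", "zabcw") reports a single match of length 3 starting at position 1 in both strings. Apart from the concatenated text, the index keeps only two int arrays, which is the main space advantage over a suffix tree; a faster construction would be needed for chromosome-sized inputs.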

Week 5 - Reading papers:

  • Space Efficient Suffix Trees. J. Ian Munro, Venkatesh Raman and S. Srinivasa Rao

  • Linear-size suffix tries. Maxime Crochemore, Chiara Epifanio, Roberto Grossi, Filippo Mignosi

  • Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Dan Gusfield

Week 6 - Started implementing

Beginning of project

Weeks 7, 8, 9 - Coding

Week 10 - Created a git repository and coded an O(n^2) solution