We will be discussing the problem of sequence mapping in the era of Next-Generation Sequencing and how the scale of the sequencing output makes the use of already discussed methods such as BLAT unusable. After going through the concept of genome/sequence indexing we will be focusing on the main sequence transformations that have been used in the recent past and those currently in use now for the mapping of NGS experiments. We will be covering:
1. Suffix tries/trees. Creation and Search
2. Suffix arrays. Creation, relationship with suffix tries and use in mapping
3. The Burrows-Wheeler Transform as the state of the art algorithm to map NGS sequences
Lecture 06: NGS Mapping Algorithms
Exercise #4
For your fourth exercise you are asked to work on the Burrows Wheeler Transformation in two distinct ways. Half of you (the first half of the list in alphabetical order) will implement a BWT to transform a small text in the english language. The text need not be more than 100-200 characters long and it should only contain upper case characters and the underscore "_" as word delimiter (in total 27 characters). The output should be the BWT of a given phrase. For the other half (see table below) the task will be to implement the inversal of the BWT in order to decode a message sent by their counterparts. The input will be a BWT and the output should be the natural language phrase (initial sequence).
The pairs will be filled in below
BWT Sender BWT Inversal