Project Title:  Comparison of Various Multiple Sequence Alignment Algorithms

Team Members Names:  Eduardo Buenviaje

Abstract:  This project will compare the methods and running times of various multiple sequence alignment (MSA) algorithms.  MSA algorithms typically align three or more biological sequences.  Several tools have been created and are available for use.  The project will compare several of these tools by analyzing their results on various data sets.  The results should span a spectrum from closely related to remotely related biological sequences.

Plan:
  1. What will you implement?
    • The project will implement one or more programs and tools that conduct multiple sequence alignments using various algorithms.  The MSAs will be run with nucleotide sequences.  The results will be collected and graphed.
  2. What methods are you going to compare and how will you get them?
  3. Which datasets are you going to use and where will you get them from (links if
    possible)?
    • Datasets will be in FASTA format, taken from the National Center for Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/
    • Two datasets will be used: one consisting of four known related organisms (e.g., human chromosome, mouse) and one consisting of three known unrelated organisms (e.g., E. coli, a plant).
  4. What kind of experiment will you run and what will you measure (e.g., time,
    score, p-value, etc.)?
    • The project will initially measure and compare the running time of the various algorithms and the score of the alignments they produce.
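
The measurements in the plan above could be collected with a small harness along the following lines. This is only a sketch under stated assumptions: the proposal does not say which score will be used, so a sum-of-pairs score is shown as one common choice, and the match, mismatch, and gap values are placeholders rather than values taken from any tool. The function names (read_fasta, sum_of_pairs, timed) are hypothetical helpers, not part of ClustalW or any other package.

```python
import time
from itertools import combinations


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {identifier: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The identifier is the first word after '>'.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def sum_of_pairs(aligned, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of an alignment given as equal-length strings.

    The scoring values are illustrative assumptions, not tool defaults.
    Columns where both sequences have a gap contribute nothing.
    """
    seqs = list(aligned)
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for a, b in combinations(seqs, 2):
        for x, y in zip(a, b):
            if x == "-" and y == "-":
                continue
            if x == "-" or y == "-":
                total += gap
            elif x == y:
                total += match
            else:
                total += mismatch
    return total


def timed(func, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

In use, each alignment tool would be wrapped in a function, passed to timed, and its aligned output (for example, read back with read_fasta) passed to sum_of_pairs, giving one (time, score) pair per tool and dataset for graphing.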

Workload Distribution:  100% - Buenviaje

List of Reference Papers: 
  • Xu Zhang, Tamer Kahveci
    A New Approach for Alignment of Multiple Proteins
  • Larkin et al.
    ClustalW and ClustalX version 2.0