Project Title: Comparison of Various Multiple Sequence Alignment Algorithms
Team Members Names: Eduardo Buenviaje
Abstract: This project will compare the methods and speed of various multiple sequence alignment (MSA) algorithms. MSA algorithms typically handle three or more biological sequences. Several tools have been created and are available for use. The project will compare several of these tools by analyzing their results on various data sets. The results should cover a spectrum ranging from closely related to remotely related biological sequences.
Plan:
- What will you implement?
- The project will implement one or more programs and tools that perform multiple sequence alignment using various algorithms. The MSAs will be run on nucleotide sequences. The results will be collected and graphed.
- What methods are you going to compare and how will you get them?
- The methods to be compared initially are the Clustal, PSAlign, CHAOS/DIALIGN, and MUSCLE algorithms. These will either be taken from existing code or tools, or implemented from the algorithms described in research papers.
- Which datasets are you going to use and where will you get them from (links if possible)?
- Datasets will be in FASTA format, taken from the National Center for Biotechnology Information (NCBI): http://www.ncbi.nlm.nih.gov/
- Two datasets will be used: one consisting of four organisms known to be related (e.g., human chromosome, mouse) and one consisting of three organisms known to be unrelated (e.g., E. coli, plant).
- What kind of experiment will you run and what will you measure (e.g., time, score, p-value, etc.)?
- The project will initially measure and compare time and score of the various algorithms.
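The two measurements named in the plan, time and score, could be collected along the following lines. This is only a minimal sketch, not the scoring used by Clustal, MUSCLE, or any other tool listed above: it assumes the aligned output is available as FASTA text, uses a simple sum-of-pairs score, and the helper names (read_fasta, sum_of_pairs, time_call) and the match/mismatch/gap values are hypothetical choices made for illustration.

```python
import time
from itertools import combinations


def read_fasta(text):
    """Parse FASTA text into {name: sequence}; the name is the first word after '>'."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def sum_of_pairs(aligned, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score over all pairs of aligned rows.

    The scoring values are illustrative assumptions, not taken from any tool.
    Columns where both rows have a gap contribute nothing.
    """
    rows = list(aligned)
    if len({len(row) for row in rows}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    score = 0
    for a, b in combinations(rows, 2):
        for x, y in zip(a, b):
            if x == "-" and y == "-":
                continue
            if x == "-" or y == "-":
                score += gap
            elif x == y:
                score += match
            else:
                score += mismatch
    return score


def time_call(func, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

In practice, time_call would wrap whatever function runs each aligner, and sum_of_pairs would be applied to the values of the dictionary returned by read_fasta, so that every algorithm is scored by the same rule.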
Workload Distribution: 100% - Buenviaje
List of Reference Papers:
- Xu Zhang, Tamer Kahveci. A New Approach for Alignment of Multiple Proteins.
- Larkin et al. ClustalW and ClustalX version 2.0.