Phylogeny

With advances in biology and computer science, the volume of available DNA sequence data has grown rapidly, giving rise to the analysis of phylogenetic trees with many taxa. Phylogeny is the study of the genetic relationships and evolutionary history of genes. Evolution results from changes in gene frequencies over time: some genes or alleles are lost, while others become fixed in a population. According to Darwin’s theory of evolution, all species descended from a common ancestor. Mechanisms such as mutation, gene duplication, genome reorganization, and genetic exchange have produced the biodiversity present today.

Different kinds of data can be used to study phylogeny. Comparing morphological characteristics between species is the classical approach. The wide availability of molecular data, such as nucleotide and amino acid sequences, has led biologists to question whether the molecular or the morphological approach is better suited to the study of evolution. However, molecular information is often unavailable for extinct species, whereas organisms such as viruses have no fossil record. Biologists therefore use both approaches, depending on the requirements of the study.

Existing methods for generating phylogenetic trees are very inefficient at updating a tree when new species are discovered. This research addresses an efficient mechanism for creating a phylogenetic tree that is based on indexing the genome sequences.

Maximum likelihood analysis is commonly considered the best approach to phylogenetic inference, but it is extremely computationally intensive. The availability of computing resources and the application of modern technologies are therefore key factors in determining whether such analyses can be used. This research addresses a parallel, GPU-accelerated implementation of maximum likelihood inference of phylogenetic trees.
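
To make the idea of a likelihood criterion concrete, the following minimal Python sketch evaluates the likelihood for the simplest possible case: two aligned sequences separated by a single branch, under the Jukes-Cantor (JC69) substitution model. The choice of JC69, the function names, and the example sequences are assumptions made for illustration only; they are not taken from the framework described here. Inference on a full tree must additionally sum over unobserved ancestral states and search over tree topologies, which is where the computational cost arises.

```python
import math


def count_differences(seq_a, seq_b):
    """Count mismatching sites between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def jc69_log_likelihood(d, n_sites, n_diff):
    """Log-likelihood of branch length d > 0 (expected substitutions per site)
    for a pairwise alignment under the JC69 model (illustrative assumption)."""
    # Probability that a site differs after evolutionary distance d.
    p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    n_same = n_sites - n_diff
    # Per site: 1/4 for the first base, then (1 - p) if the second base
    # matches, or p / 3 for one specific mismatching base.
    ll = n_sites * math.log(0.25) + n_same * math.log(1.0 - p)
    if n_diff:
        ll += n_diff * math.log(p / 3.0)
    return ll


def jc69_ml_distance(n_sites, n_diff):
    """Closed-form maximum likelihood estimate of the JC69 distance."""
    frac = n_diff / n_sites
    if frac >= 0.75:
        raise ValueError("sequences too divergent for a finite JC69 estimate")
    return -0.75 * math.log(1.0 - 4.0 * frac / 3.0)


if __name__ == "__main__":
    # Hypothetical toy alignment, not real data.
    a, b = "ACGTACGTAC", "ACGTTCGTAA"
    k = count_differences(a, b)
    d_hat = jc69_ml_distance(len(a), k)
    print("differences:", k)
    print("ML distance:", round(d_hat, 4))
    print("log-likelihood at ML distance:",
          round(jc69_log_likelihood(d_hat, len(a), k), 4))
```

For two sequences the maximum has a closed form, so no search is needed. For a tree with many taxa, the likelihood must be re-evaluated for every candidate topology and set of branch lengths. Because the per-site terms are independent, that repeated evaluation is a natural target for parallel hardware.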

Proper visualization of phylogenetic trees is also important. Most phylogenetic inference tools lack adequate visualization, so users cannot search the tree effectively. Many standalone tree viewers exist, however, with numerous features and techniques. Integrating these techniques into a phylogenetic inference framework would be beneficial and cost-effective. This research aims to develop a generic framework for phylogenetic inference that accepts multiple datasets of amino acid and nucleotide sequences, uses multiple GPU-compatible algorithms for the optimality criterion, and provides visualization through different techniques as required, including magnification.