Data Analysis ~ Computational biology ~ Bioinformatics ~ Biostatistics
I am a globally minded scientist who applies computational, omics, and molecular genetics approaches to identify and understand novel metabolic pathways and transcription patterns. My dream is to combine computational approaches with molecular biology to develop sustainable solutions for humanity.
This project was completed as part of my doctoral coursework. It is not part of the research completed for my dissertation.
Summary: Two datasets were analyzed. One contained FDR expression values for the genes of an organism; the other contained DNA shape predictions from DNAshapeR. Each dataset was projected with principal component analysis (PCA) into a space of lower dimension than the original data. Projecting before clustering improved the ability of k-means, DBSCAN, and Expectation Maximization (EM) with Bayesian Gaussian Mixture Models (BGMM) to detect clusters in both datasets, and it reduced run time by more than 50%.
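The workflow in the summary can be sketched roughly as follows. This is a minimal illustration, not the code used in the project: it assumes scikit-learn, uses synthetic blobs from make_blobs as a stand-in for the expression and DNA shape data, and the function name project_and_cluster, the number of components, and all clustering parameters (n_clusters, eps, min_samples) are hypothetical choices made for the example.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.mixture import BayesianGaussianMixture
from sklearn.preprocessing import StandardScaler


def project_and_cluster(X, n_components=2, n_clusters=3, seed=0):
    """Project X with PCA, then cluster the projection three ways.

    All parameter values are illustrative assumptions, not the
    settings used in the original project.
    """
    # Standardize features so no single feature dominates the components.
    X_scaled = StandardScaler().fit_transform(X)

    # Reduce to a lower-dimensional space before clustering.
    X_reduced = PCA(n_components=n_components, random_state=seed).fit_transform(X_scaled)

    labels = {
        "kmeans": KMeans(
            n_clusters=n_clusters, n_init=10, random_state=seed
        ).fit_predict(X_reduced),
        # DBSCAN labels noise points as -1.
        "dbscan": DBSCAN(eps=0.5, min_samples=5).fit_predict(X_reduced),
        # n_components is an upper bound on the number of mixture components.
        "bgmm": BayesianGaussianMixture(
            n_components=n_clusters, max_iter=500, random_state=seed
        ).fit_predict(X_reduced),
    }
    return X_reduced, labels


# Synthetic stand-in data: 300 samples, 20 features, 3 groups.
X, _ = make_blobs(n_samples=300, n_features=20, centers=3, random_state=0)
X_reduced, labels = project_and_cluster(X)
```

Running the same three algorithms directly on X and timing both versions would be one way to compare cluster quality and run time with and without the projection.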