Dr. Kevin Coombes (BMI) and Dr. Yan Zhang (BMI) will co-instruct a course in Autumn 2018. Check it out:
BMI 8130 - Analysis and Applications of Genome-Scale Data
The goal of this course is to introduce trainees to the fundamental algorithms needed to understand and analyze genome-scale expression data sets. The course will cover three major kinds of applications.
(1) Class Comparison seeks to describe which features differ between two or more known classes of patient samples (such as normal vs. tumor). The methodology includes (generalized) linear models, with careful attention to the issue of multiple comparisons.
(2) Class Discovery seeks to discover the inherent structure present in a data set. The methodology includes a wide variety of techniques for clustering samples (including K-means as well as various forms of hierarchical clustering) and for assessing the number of clusters and the robustness of cluster assignments. We also cover methods, such as principal components analysis, that help visualize the data.
(3) Class Prediction seeks to discover and validate models that can accurately predict the class or outcome of new samples. The methodology includes a wide variety of machine learning and statistical techniques for feature selection and model construction. We will also discuss methods for cross-validation and independent validation of predictive models.
The course will include an introduction to, and hands-on experience with, the R statistical software environment and the R packages that can be applied to these kinds of problems.