Teaching

BIO 792 Machine Learning and Modern Applied Statistics for Biology

This course is designed for those who aim for excellence in the treatment of data in biology and is ideal for PhD students early in their studies who want training they can apply to their dissertation work. Students will gain hands-on experience with numerous datasets, including single-cell RNA sequencing data, imaging data used for disease prediction, and microbiome data used for host trait prediction. By the end of this course, students will be able to build their own machine learning pipelines to analyze a variety of real biomedical datasets.

GN428 Introduction to Machine Learning in Biology

New techniques in genomics have revolutionized biology, but they generate large quantities of data, which makes extracting signal from noise challenging. This course will give students the basic skills to manipulate and integrate different types of biological datasets and to mine them using data analysis tools ranging from basic to state-of-the-art. Machine learning methods provide a framework for analyzing vast amounts of biological information and extracting meaningful signals. By the end of the semester, students will have been exposed to a variety of modern machine learning tools for classification and prediction. We will focus on exploring DNA data (with millions of variants), expression data (> 20,000 genes), and microbiome data (thousands of features), combined with various disease and experimental measurements. The course will cover the basics of loading and exploring datasets using visualization, followed by basic machine learning methods, including classification and regression algorithms.

BIO 592 079 Computational Environmental Sciences and Toxicology