Research Areas

I have worked across many areas of machine learning, from fundamental theory and methodology (e.g., generative modeling, signal processing) to real-world applications (e.g., medical imaging, computational genomics, drug discovery, computational neuroscience).  This page presents a sample of projects that I have led; the full list of my publications is here.

Generative Modeling

How can we design probabilistic models for complex data distributions?  How can we learn the parameters of these models at scale?  How can we theoretically guarantee that the learning process is both accurate and efficient?  How can we integrate the power of deep learning with traditional generative modeling approaches?  I study these problems in my recent papers: [ICML 2023], [AABI 2023].

Drug Discovery

Deep neural networks are effective tools for drug representation learning, yet they often struggle to disentangle meaningful biological signal from experimental noise.  In my [MLCB 2022] paper, I explore how to repurpose a common neural net layer for batch-effect correction, leading to dramatic performance boosts on drug-discovery tasks.
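The core idea behind batch-effect correction can be sketched in its simplest form: removing batch-specific shifts in feature mean and variance.  This is only an illustrative sketch, not the layer from the paper; the function name `per_batch_standardize` and its interface are my own assumptions.

```python
import numpy as np

def per_batch_standardize(features, batch_ids, eps=1e-6):
    """Center and scale each feature within its experimental batch,
    removing batch-specific shifts in mean and variance."""
    corrected = np.empty_like(features, dtype=float)
    for b in np.unique(batch_ids):
        mask = batch_ids == b
        mu = features[mask].mean(axis=0)
        sigma = features[mask].std(axis=0)
        corrected[mask] = (features[mask] - mu) / (sigma + eps)
    return corrected
```

After this transform, features from different experimental batches share the same first and second moments, so a downstream model cannot trivially separate compounds by which plate or day they were measured on.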

Computational Genomics

By understanding and analyzing genetic data, we have the potential to reveal the mechanisms underlying human disease.  This will ultimately enable us to build impactful systems for precision medicine.  I develop computational methods to analyze the human genome; ongoing projects include (1) developing efficient statistical methods for polygenic risk scoring to predict disease susceptibility from millions of genetic variants, and (2) building deep learning models that can capture the highly epistatic effects of multiple genetic perturbations to accurately predict single-cell gene expression.  
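At its most basic, a polygenic risk score is an additive model: a weighted sum of an individual's allele counts, with per-variant weights estimated from association studies.  The sketch below shows only this textbook form, not the efficient statistical methods from the ongoing project; the function name is hypothetical.

```python
import numpy as np

def polygenic_risk_scores(genotypes, effect_sizes):
    """Compute additive polygenic risk scores.

    genotypes: (n_individuals, n_variants) allele counts in {0, 1, 2}.
    effect_sizes: (n_variants,) per-variant weights, e.g. from a GWAS.
    Returns one risk score per individual.
    """
    return genotypes @ effect_sizes
```

The methodological challenge is in estimating the weights well (millions of correlated variants, limited samples), not in this final sum.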

Signal Processing

Fundamental signal processing concepts such as sparsity and dictionary learning provide a useful prior for explaining high-dimensional signals that occur in the natural world, such as images and audio.  In [IEEE ICASSP 2022a], [IEEE TSP 2022], I introduce a scalable method for Bayesian sparse signal recovery that is up to thousands of times faster than existing baselines.  In [IEEE ICASSP 2022b], I develop a novel signal-processing-inspired neural network architecture for clustering images with up to 50x fewer parameters than comparable approaches.
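To make the sparsity prior concrete, here is a standard baseline (not the Bayesian method from the papers above): iterative shrinkage-thresholding (ISTA) for the lasso problem, which recovers a sparse code x from measurements y = Ax.

```python
import numpy as np

def ista(A, y, lam, n_iters=200):
    """Iterative shrinkage-thresholding for
    min_x 0.5 * ||Ax - y||^2 + lam * ||x||_1."""
    # Step size 1/L, with L the Lipschitz constant of the gradient.
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)
        z = x - grad / L
        # Soft-thresholding enforces sparsity at every iteration.
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return x
```

Each iteration is one gradient step on the data-fit term followed by a soft-threshold that zeroes out small coefficients, which is what encodes the sparsity prior.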

Medical Imaging

The amount of time that a patient needs to spend in a medical scanner is directly proportional to how much data needs to be collected to create their image.  What if we could collect less data but still guarantee accurate imaging, thus saving time for the patient and resources for medical institutions?  I develop novel methods for MRI reconstruction in sparse data regimes: [ISMRM 2021], [ISMRM 2022].
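The problem setup can be simulated in a few lines: acquire only a subset of k-space (Fourier) samples and invert with a zero-filled FFT.  This is the naive baseline that reconstruction methods improve on, not the methods from the papers above; the function name is illustrative.

```python
import numpy as np

def zero_filled_recon(image, mask):
    """Simulate an undersampled MRI acquisition: keep only the k-space
    samples selected by `mask` (1 = sampled, 0 = skipped), then invert
    with a zero-filled inverse FFT."""
    kspace = np.fft.fft2(image)
    return np.abs(np.fft.ifft2(kspace * mask))
```

Skipping k-space lines shortens the scan in proportion to the undersampling factor, but the zero-filled inverse produces aliasing artifacts; the reconstruction problem is to remove those artifacts using prior structure in the image.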

Computational Neuroscience

The human brain is the most impressive form of natural intelligence, and neuroscience holds the key to unlocking its inner workings.  Machine learning can help neuroscientists extract patterns, trends, and insights from neural data at an unprecedented scale.  As an example, in [AISTATS 2019], I develop a machine learning model for automatically discovering meaningful clusters of neural firing sequences governed by common underlying dynamics. 

Natural Language Processing

Large language models and text-based encoders are potentially one of the most widely useful forms of generative AI.  However, they suffer from notable limitations, such as enormous computational costs and susceptibility to reproducing societal biases.  In [EMNLP 2020], I develop a reinforcement-learning-inspired compression technique for language models that substantially accelerates generation speed.  In [ICML DGAI 2023], we develop a method to detect, analyze, and explain the biases of text-to-image models.