Rishi Sonthalia

NEW WEBSITE: https://sites.google.com/bc.edu/rishi-sonthalia/about-me

New Papers

New preprint - On Regularization via Early Stopping for Least Squares Regression, R Sonthalia, J Lok, E Rebrova, arXiv preprint arXiv:2406.04425
New preprint - Discrete error dynamics of mini-batch gradient descent for least squares regression, J Lok, R Sonthalia, E Rebrova, arXiv preprint arXiv:2406.03696
New Paper at TMLR - Double Descent and Overfitting under Noisy Inputs and Distribution Shift for Linear Denoisers with Chinmaya Kausik and Kashvi Srivastava
New Paper at AISTATS - Near-Interpolators: Rapid Norm Growth and the Trade-Off between Interpolation and Generalization with Yutong Wang and Wei Hu

Travel and Talks

June - Rutgers for DIMACS workshop on Random Matrix Theory and Optimization of Neural Networks
July 7th - July 13th - London for LogML https://www.logml.ai
July 25th - July 28th - Vienna for ICML
September - December - Los Angeles for IPAM program on Mathematics of Intelligence
October 21 - October 25 - Atlanta for SIAM conference on Mathematics of Data Science
December 16 - London for a session on “over-parameterization in ML” at the CFE-CMStatistics conference
January 2025 - Seattle for JMM, organizing a session on Geometry and Combinatorics for Machine Learning with Eli Grigsby and Kathryn Lindsey.

About Me

Hi! I am an Assistant Professor in the Math department at Boston College. Before this, I was a Hedrick Assistant Adjunct Professor at UCLA under Andrea Bertozzi, Jacob Foster, and Guido Montufar. I obtained my Ph.D. in Applied and Interdisciplinary Mathematics from the University of Michigan, where my advisors were Anna C. Gilbert and Raj Rao Nadakuditi. I won the Peter Smereka Award for the best applied math thesis. I did my undergraduate studies at Carnegie Mellon University, obtaining a B.S. with double majors in discrete math and computer science.

Understanding the mathematical foundations of machine learning algorithms is crucial. My work has contributed to a better understanding of data's intrinsic geometric and probabilistic structure. This understanding has been applied to design better machine learning algorithms. My current area of focus is the mathematical underpinnings of geometric deep learning, optimization, and generalization.

Current Projects

Generalization - I am interested in understanding the generalization properties of machine learning models. I explore such questions using tools from random matrix theory. While most of my prior work has focused on linear models [1, 2, 3, 4], I am interested in developing tools to analyze non-linear models.
Inductive Bias, Regularization, and Optimization - I am very interested in understanding the inductive bias of various regularization techniques [5] as well as optimization techniques [6]. Currently, I am interested in understanding the effects of batching [7] and the role of the loss landscape [8] and flatness.
Using Hyperbolic Geometry for Machine Learning - Large parts of machine learning are about learning parameterized functions. This is traditionally done by fitting the function to some data. Classically, we cared about minimizing the loss of the function on this data. However, in the modern regime, multiple global minima exist due to overparameterization. The bias of a method towards picking a specific global minimum is known as the implicit bias of the method. I am interested in the interplay between implicit bias and geometry.

Here, geometry can play a role in a variety of ways. First, we could be looking at certain subspaces of functions, in which case the geometry of the subspace is important. Second, we could care about maps that factor through different manifolds, in which case the geometry of the manifold is important. Third, we could restrict our parameters to live in certain spaces, in which case the geometry of the parameter space is important. I am interested in understanding how the geometry affects the inductive bias of machine learning methods.

Generative Modelling - Generative models allow us to sample from probability distributions. I am interested in using Random Matrix Theory to improve these methods.
Algebraic Structure of Graphical Models - Many probabilistic models, such as graphical models, have algebraic structure; for example, they model a semi-algebraic subset of the probability simplex. I am interested in understanding this structure.
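
As a small illustration of the implicit bias described in the hyperbolic geometry item above, here is a minimal sketch, assuming plain gradient descent on an overparameterized least-squares problem started from zero. The setup and the names `gradient_descent_least_squares`, `w_gd`, `w_min_norm`, and `w_other` are illustrative assumptions, not taken from the papers listed on this page. With more parameters than data points there are many weight vectors that fit the data exactly, and gradient descent from zero converges to the one with the smallest Euclidean norm.

```python
import numpy as np


def gradient_descent_least_squares(X, y, steps=5000):
    """Run gradient descent on 0.5 * ||X w - y||^2, starting from w = 0."""
    n, d = X.shape
    w = np.zeros(d)
    # Step size 1/L, where L is the largest eigenvalue of X^T X
    # (the squared spectral norm of X).
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y)
    return w


rng = np.random.default_rng(0)
n, d = 20, 50  # overparameterized: more parameters than data points
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Solution found by gradient descent from zero initialization.
w_gd = gradient_descent_least_squares(X, y)

# Minimum-norm interpolating solution, computed via the pseudoinverse.
X_pinv = np.linalg.pinv(X)
w_min_norm = X_pinv @ y

# A different global minimum: add a component from the null space of X.
null_projector = np.eye(d) - X_pinv @ X
w_other = w_min_norm + null_projector @ rng.standard_normal(d)

print("GD fits the data:        ", np.allclose(X @ w_gd, y, atol=1e-6))
print("other solution fits data:", np.allclose(X @ w_other, y, atol=1e-6))
print("GD equals min-norm:      ", np.allclose(w_gd, w_min_norm, atol=1e-6))
print("norms (min-norm, other): ", np.linalg.norm(w_min_norm), np.linalg.norm(w_other))
```

Both `w_gd` and `w_other` are global minima of the loss, but the iterates of gradient descent started at zero stay in the row space of `X`, which is why they reach the minimum-norm solution. Changing the initialization, the optimizer, or the geometry of the parameter space would in general change which minimum is selected.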

See publications for prior projects.

If you have any questions or ideas you want to discuss, or if you want to discuss math and computer science, ways to contact me can be found under the contact me tab.
