Research and Publications

Research Interests 

Having moved to Machine Learning after a PhD in Stochastic Analysis, I initially approached the field from a theoretical perspective: much of my research focuses on Statistical Learning Theory, which, in simple terms, concerns itself with estimating the number of i.i.d. observations required to train a machine learning model to a given accuracy. I maintain a strong interest in understanding why neural networks behave the way they do and in theoretically explaining their surprising generalization abilities. I have proved the first norm-based generalization bounds for CNNs that take the convolutional structure into account, as well as bounds with improved dependence on the number of classes/labels in various multi-class and multi-label settings. In many cases, theoretical study yields insights into the workings of machine learning methods, or even theoretically motivated algorithmic improvements. Thus, I see a natural connection between the applied and theoretical parts of my work.
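As an illustration of the kind of statement this theory produces (a generic textbook-style bound, not a result from any specific paper of mine):

```latex
% With probability at least 1 - \delta over an i.i.d. sample of size n,
% every hypothesis h in a class \mathcal{H} satisfies a bound of the form
R(h) \;\le\; \widehat{R}_n(h)
  \;+\; O\!\left(\sqrt{\frac{\mathrm{comp}(\mathcal{H}) + \log(1/\delta)}{n}}\right),
% where R is the true risk, \widehat{R}_n the empirical risk, and
% comp(\mathcal{H}) a capacity measure of the class -- for CNNs, e.g.,
% a norm-based complexity that accounts for the convolutional structure.
```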

In the last couple of years, I have developed a strong interest in Recommender Systems and Matrix Completion, which offer a rare and impressive combination of fascinating unsolved mathematical problems and rich application areas. Matrix completion is the problem of completing a partially observed matrix assuming some kind of (typically low-rank) structure. Such methods can be applied in any field where the observed variable (the value of the matrix at a given entry) depends on a combination of two variables (the row and the column), each chosen from a moderate-sized finite set, but observing all of the combinations would be prohibitively expensive or impractical. Recommender Systems are a natural application, where the rows and columns traditionally correspond to users (people) and items (movies, songs, books, wines, doctors, job advertisements to respond to, etc.). However, such methods have also been successfully applied to drug interaction prediction, the prediction of activity coefficients of solvents in chemical engineering, and even personalized medicine. There are many optimization methods that indirectly constrain the space in which the ground truth matrix is assumed to lie (explicit rank restriction, nuclear norm regularization, max norm regularization, to name but a few), and each comes with its own challenges regarding sample complexity, optimization guarantees, and unique areas of practical applicability.
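To make the setup concrete, here is a minimal sketch of nuclear-norm-style matrix completion on synthetic data, using an iterative impute-and-soft-threshold scheme in the spirit of SoftImpute (the matrix sizes, observation rate, and shrinkage parameter below are illustrative choices, not values from any of my papers):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic rank-2 ground truth: each entry depends on a (row, column) pair.
n, m, rank = 50, 40, 2
M = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, m))

# Observe ~30% of the entries uniformly at random.
mask = rng.random((n, m)) < 0.3

def soft_impute(M_obs, mask, tau=5.0, iters=200):
    """Complete a matrix by alternating imputation with SVD soft-thresholding,
    which implicitly penalizes the nuclear norm of the estimate."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s = np.maximum(s - tau, 0.0)       # shrink singular values by tau
        Z = (U * s) @ Vt                   # current low-rank estimate
        X = np.where(mask, M_obs, Z)       # keep observed entries fixed
    return Z

M_hat = soft_impute(np.where(mask, M, 0.0), mask)
rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
```

The soft-thresholding step is the proximal operator of the nuclear norm, so the iteration drives the estimate toward a low-rank matrix consistent with the observed entries.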

In my NeurIPS paper "Fine-grained Generalization Analysis of Inductive Matrix Completion", I provided distribution-free sample complexity guarantees for Inductive Matrix Completion (IMC) which bridge the gap between the theoretical study of matrix completion and IMC. I also proposed a modified regularizer which brings the rate down to the one known for the uniform sampling case. I am particularly excited about such cases where the statistical analysis of a problem leads to theoretically motivated new algorithms. My contribution "Orthogonal Inductive Matrix Completion" contains a nuclear-norm-type algorithm for the subspace-constrained case, showing that homogeneous Inductive Matrix Completion can be optimized with an iterative imputation procedure. In another TNNLS paper, "Uncertainty-adjusted recommendation via matrix factorization with weighted losses", we propose an analogue of the weighted trace norm targeted at situations where the loss function is weighted by user- or item-dependent uncertainty factors. We prove that, compared to the standard trace norm, our novel regularizer exhibits advantages in terms of the weighted generalization error, which we also observe on practical recommender systems datasets.
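The basic IMC model posits M ≈ X A Yᵀ, where X and Y are known side-information (feature) matrices and only the small core A is unknown; this is why IMC can enjoy a much lower sample complexity than plain matrix completion. A minimal sketch under that model (an illustrative toy setup, not code from the papers above) fits the core by least squares on the observed entries, each of which gives one linear equation in vec(A):

```python
import numpy as np

rng = np.random.default_rng(1)

# Inductive matrix completion: M = X @ A @ Y.T with known features X, Y
# and an unknown low-rank core A of size d1 x d2 (only d1*d2 parameters).
n, m, d1, d2, r = 60, 50, 8, 6, 2
X = rng.normal(size=(n, d1))       # row-side (e.g. user) features
Y = rng.normal(size=(m, d2))       # column-side (e.g. item) features
A_true = rng.normal(size=(d1, r)) @ rng.normal(size=(r, d2))
M = X @ A_true @ Y.T
mask = rng.random((n, m)) < 0.3    # observe ~30% of entries

# Each observed entry M[i, j] = x_i^T A y_j is linear in vec(A).
rows, cols = np.nonzero(mask)
design = np.einsum('ki,kj->kij', X[rows], Y[cols]).reshape(len(rows), -1)
vecA, *_ = np.linalg.lstsq(design, M[rows, cols], rcond=None)
A_hat = vecA.reshape(d1, d2)

rel_err = np.linalg.norm(X @ A_hat @ Y.T - M) / np.linalg.norm(M)
```

Because the noiseless system has far more equations than the d1*d2 unknowns, the core (and hence the full matrix) is recovered essentially exactly here; the interesting theoretical questions arise under noise, non-uniform sampling, and rank or norm constraints on A.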

I am also working on other low-rank methods: the paper "Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric Density Estimation" contains some of the first and most comprehensive theoretical results for low-rank density estimation via low-rank histograms, including both bias and variance analysis.
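The idea behind low-rank histograms can be sketched in a few lines: when a bivariate density is (close to) a small mixture of product densities, its cell-probability matrix is (close to) low-rank, and truncating the SVD of the empirical histogram reduces variance. A toy illustration, with all distributions and sizes chosen for demonstration only:

```python
import numpy as np

rng = np.random.default_rng(2)

# 2D density that is a rank-2 mixture of product densities on [0, 1]^2.
n = 20000
comp = rng.random(n) < 0.5
x = np.where(comp, rng.beta(2, 5, n), rng.beta(5, 2, n))
y = np.where(comp, rng.beta(5, 2, n), rng.beta(2, 5, n))

bins = 30
H, _, _ = np.histogram2d(x, y, bins=bins, range=[[0, 1], [0, 1]])
P = H / n                               # empirical cell probabilities

# Low-rank step: project the histogram onto rank-2 matrices via truncated
# SVD, then clip and renormalize to get a valid probability table.
U, s, Vt = np.linalg.svd(P, full_matrices=False)
P_lr = (U[:, :2] * s[:2]) @ Vt[:2]
P_lr = np.maximum(P_lr, 0.0)
P_lr /= P_lr.sum()
```

The bias/variance trade-off studied in the paper concerns exactly this kind of estimator: truncation adds bias when the density is not exactly low-rank, but it can sharply reduce the variance of the raw histogram.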


For a more detailed description of each paper, see my research statement on the SMU website: Research Statement.

Publications 

In Recommender Systems and Matrix Completion

In Broader Statistical Learning Theory and Extreme Multiclass Classification

In Computer Vision and Other Applications

* = Equal Contribution