I am a Research Assistant Professor at the Toyota Technological Institute at Chicago. Previously, I was a postdoctoral researcher at UC San Diego. During Spring 2015, I was a visiting scholar at the Simons Institute at UC Berkeley. I also spent two years in Europe, as a postdoctoral researcher at the Microsoft Research – Inria Joint Centre in Paris, and as an ERCIM Marie Curie fellow at Université Paris-Sud. I received my PhD in Electrical Engineering and Computer Science from MIT, while working at the Laboratory for Information and Decision Systems (LIDS). My research interests are broadly in statistics, information theory, machine learning, and their applications. In particular, I am interested in two extreme problems in statistics and machine learning: data scarcity and large datasets.


In situations with less data than traditionally assumed, one has to impose structural assumptions in order to make statistical inference tasks well-posed. I have spent some time studying rare probability estimation problems, in settings where outcomes are discrete. A natural structure in this case is a tail characterization of the probability distribution. This notion is prevalent in continuous problems, notably in tail probability estimation, but I have shown that it is just as important in the discrete context. I continue to explore the implications and applications of this perspective for data compression with large alphabets and for learning in data-starved regimes. In particular, I have been trying to combine such distributional structures with classical latent low-dimensional structures, such as low matrix rank. Applications include natural language modeling, climate forecasting, and prediction of rare but impactful societal events.


As for large datasets, not only are efficient algorithms important for performing classical learning, but the additional data can also be leveraged to improve performance. For example, selecting part of the data instead of the whole can help alleviate the computational difficulty. Such pruning, however, induces artificial data scarcity, and it becomes important to understand the resulting trade-offs between data size and computation. Some of my work in this area shows that if data is summarized strategically, we can have genuine computation-statistics trade-offs: as more data becomes available, we can compute faster.