I am a researcher at Google Research India working on various practical challenges in the design of machine learning systems -- robustness, concept drift, cost-efficiency, human-AI interaction, etc. I also work on cognition-inspired learning systems, including meta-learning, continual learning, and robust vision.

Previously, I led applied scientist teams at Microsoft Bing Ads, supporting large-scale ML models in production. I received a Ph.D. from the University of Washington, and worked in various research capacities at UW, UC San Diego, Microsoft Research, Fraunhofer Institute, and Lucent Bell Labs.

For updated details, please see my Google Scholar and LinkedIn pages.

Recent papers & news


Learned temporal reweighting

We propose a temporal reweighting approach for training models under slow concept drift. A meta-model scores each instance, as a function of its content and age, according to the value it provides for future predictions. We outperform a range of other robust reweighting schemes by up to 8% relative on a longitudinal dataset (9 years), and on a range of other nonstationary learning benchmarks. To our knowledge, this is the first proposal to leverage instance characteristics and data age for forward transfer.

Instance-conditional timescales of decay for nonstationary learning

N. Jain, P. Shenoy. AAAI 2024.
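To make the idea concrete, here is a minimal sketch of instance-and-age-conditioned loss reweighting. The linear scorer, its parameters, and the softmax normalization are illustrative assumptions, not the paper's implementation; in the paper the scorer is itself meta-learned against performance on future data.

```python
import numpy as np

def softmax_weights(scores):
    """Normalize per-instance scores into training weights that sum to 1."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def meta_weighted_loss(per_example_losses, instance_feats, ages, meta_params):
    """Weight each example's loss by a score of its features and age.

    meta_params: (W, b, decay) -- a toy linear scorer standing in for the
    learned meta-model; `decay` penalizes older instances.
    """
    W, b, decay = meta_params
    # score = linear function of instance features minus an age penalty
    scores = instance_feats @ W + b - decay * ages
    weights = softmax_weights(scores)
    # weighted average of the base model's per-example losses
    return float(np.sum(weights * per_example_losses))
```

The key design choice is that the weight depends jointly on the instance and its age, rather than applying a single global decay schedule to all data of a given timestamp.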

Early readouts debias distillation

We improve accuracy and across-group fairness of student models in distillation. We show that early readouts (linear decoding from earlier layers of the network) indicate featural bias through overconfident errors on underrepresented instances. By reweighting the teacher loss as a function of early-layer error confidence, we show gains not only in worst-group accuracy but also in overall accuracy over other distillation approaches on fairness benchmark datasets.

Using early readouts to mediate featural bias in distillation

R. Tiwari, D. Sivasubramanian, A. Reddy, G. Ramakrishnan, P. Shenoy. WACV 2024.
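The reweighting signal above can be sketched roughly as follows. This is a simplified illustration under my own assumptions (a single early-layer probe, a `1 + gamma * confidence` weighting rule), not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def early_readout_weights(early_logits, labels, gamma=1.0):
    """Upweight instances where an early-layer linear readout is
    confidently wrong -- a signal of reliance on spurious features.

    early_logits: (n, k) logits from a linear probe on an early layer.
    Returns per-instance weights for the distillation (teacher) loss.
    """
    probs = softmax(early_logits)
    confidence = probs.max(axis=-1)
    wrong = probs.argmax(axis=-1) != labels
    # confidently-wrong instances get weight > 1; others stay at 1
    return 1.0 + gamma * confidence * wrong
```

The intuition: underrepresented instances that contradict a spurious feature are exactly the ones an early layer gets confidently wrong, so upweighting them steers the student away from the shortcut.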

Debiasing with a feature sieve

We propose a feature sieve -- a novel method for automatically mediating between potential predictive features in a deep network based on their generalization capability. Our method identifies and suppresses features with spurious label correlations, without access to definitions or other characterizations of potential features. We report significant gains (up to 11% relative) on real-world datasets with spurious feature-label correlations such as BAR, NICO, CelebA, and Imagenet-9/Imagenet-A.

Overcoming simplicity bias in deep networks using a feature sieve

R. Tiwari, P. Shenoy. ICML 2023.
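One ingredient of the sieve -- erasing easily-decodable features from lower layers -- can be illustrated with a "forgetting" objective: train an auxiliary classifier on an early layer, then push its predictions toward the uniform distribution. The function below is my own minimal rendering of that idea, not the paper's code.

```python
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def forgetting_loss(aux_logits):
    """Cross-entropy of aux-classifier predictions against the uniform
    distribution over k classes; minimized (at log k) when the early
    layer carries no label signal the aux classifier can exploit.
    """
    k = aux_logits.shape[-1]
    return float(-(log_softmax(aux_logits) / k).sum(axis=-1).mean())
```

Alternating this forgetting step with ordinary supervised training lets simple, spuriously-correlated features be suppressed early while the rest of the network relearns more general ones.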