I am a researcher at Google Research India working on various practical challenges in the design of machine learning systems -- robustness, concept drift, cost-efficiency, human-AI interaction, etc. I also work on cognition-inspired learning systems, including meta-learning, continual learning, and robust vision.
Previously, I led applied scientist teams at Microsoft Bing Ads, supporting large-scale ML models in production. I received a Ph.D. from the University of Washington, and worked in various research capacities at UW, UC San Diego, Microsoft Research, Fraunhofer Institute, and Lucent Bell Labs.
For updated details, please see my Google Scholar and LinkedIn pages.
Recent papers & news
Google Research blog posts about our work on spurious features & simplicity bias, and on reweighting for nonstationary learning.
CVPR 2024. Improving Generalization via Meta-Learning on Hard Samples. N. Jain, A.S. Suggala, P. Shenoy.
ICLR 2024. Learning model uncertainty as variance-minimizing instance weights. N. Jain, K. Shanmugam, P. Shenoy.
AAAI 2024. Instance-conditional timescales of decay for nonstationary learning. N. Jain, P. Shenoy.
WACV 2024. Using early readouts to mediate featural bias in distillation. R. Tiwari, D. Sivasubramanian, A. Reddy, G. Ramakrishnan, P. Shenoy.
ICML 2023. Overcoming simplicity bias in deep networks using a feature sieve. R. Tiwari, P. Shenoy.
Learned temporal reweighting
We propose a temporal reweighting approach for training models under slow concept drift. A meta-model scores each instance, according to its content and its age, by the value it provides for future predictions. We outperform a range of other robust reweighting schemes by up to 8% relative on a longitudinal dataset spanning 9 years, as well as on a range of other nonstationary learning benchmarks. To our knowledge, this is the first proposal to leverage instance characteristics and data age for forward transfer.
Instance-conditional timescales of decay for nonstationary learning
N. Jain, P. Shenoy. AAAI 2024.
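The core idea of instance-conditional decay can be sketched as follows. This is an illustrative toy, not the paper's implementation: a hypothetical linear scorer stands in for the meta-model, mapping instance features to a positive decay timescale, and each instance's training weight decays exponentially with its age at that timescale.

```python
import numpy as np

rng = np.random.default_rng(0)

def instance_timescale(features, w):
    """Map instance features to a positive decay timescale via softplus.
    (The linear scorer `w` is a stand-in for a learned meta-model.)"""
    return np.log1p(np.exp(features @ w)) + 1e-3

def decay_weights(features, ages, w):
    """Per-instance training weights: exp(-age / tau(x)), normalized to sum to 1."""
    tau = instance_timescale(features, w)
    weights = np.exp(-ages / tau)
    return weights / weights.sum()

features = rng.normal(size=(5, 3))
ages = np.array([0.0, 1.0, 2.0, 3.0, 4.0])  # e.g. years since collection
w = rng.normal(size=3)

weights = decay_weights(features, ages, w)
print(weights)
```

In the paper the timescales are learned end-to-end to maximize future predictive value; the sketch only shows how instance-conditional timescales turn data age into per-example loss weights.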
Early readouts debias distillation
We improve accuracy and across-group fairness of student models in distillation. We show that early readouts (linear decoding from earlier layers of the network) indicate featural bias through overconfident errors on underrepresented instances. By reweighting teacher loss as a function of early-layer error confidence, we show gains not only in worst-group accuracy but also overall accuracy over other distillation approaches on fairness benchmark datasets.
Using early readouts to mediate featural bias in distillation
R. Tiwari, D. Sivasubramanian, A. Reddy, G. Ramakrishnan, P. Shenoy. WACV 2024.
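The reweighting signal can be illustrated with a small sketch. This is an assumed simplification, not the paper's code: given logits from a linear "early readout" probe attached to an intermediate layer, we downweight the teacher loss on examples where the probe makes a confident error, taken as a sign of reliance on a biased, early-learned feature.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def early_readout_weights(early_logits, labels):
    """Per-example weights in (0, 1]: low when the early readout is
    confidently wrong, 1.0 when it is correct."""
    probs = softmax(early_logits)
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    wrong = preds != labels
    # Confidently-wrong examples get weight (1 - confidence).
    return np.where(wrong, 1.0 - conf, 1.0)

early_logits = np.array([[4.0, 0.0],   # confident, correct
                         [0.0, 4.0],   # confident, wrong
                         [0.1, 0.0]])  # uncertain, correct
labels = np.array([0, 0, 0])
print(early_readout_weights(early_logits, labels))
```

The confidently-wrong example receives a near-zero weight, so the teacher's (possibly biased) signal on it contributes little to the student's distillation loss.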
Debiasing with a feature sieve
We propose a feature sieve -- a novel method for automatically mediating between potential predictive features in a deep network based on their generalization capability. Our method identifies and suppresses features with spurious label correlations, without access to definitions or other characterizations of potential features. We report significant gains (up to 11% relative) on real-world datasets with spurious feature-label correlations such as BAR, NICO, CelebA, and Imagenet-9/Imagenet-A.
Overcoming simplicity bias in deep networks using a feature sieve
R. Tiwari, P. Shenoy. ICML 2023.
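A rough geometric intuition for the "sieve" step can be sketched as follows. This is an illustrative stand-in, not the paper's method (which alternates auxiliary-classifier training with a forgetting loss at intermediate layers): we suppress the feature direction a hypothetical auxiliary linear probe exploits at an early layer, by projecting the intermediate representations onto its orthogonal complement.

```python
import numpy as np

def suppress_direction(H, w):
    """Remove the component of each row of H along direction w,
    i.e. project the representations onto w's orthogonal complement."""
    u = w / np.linalg.norm(w)
    return H - np.outer(H @ u, u)

H = np.array([[1.0, 2.0],
              [3.0, 4.0]])      # intermediate-layer features, one row per example
w = np.array([1.0, 0.0])        # direction a probe uses for a spurious feature

H_sieved = suppress_direction(H, w)
print(H_sieved)  # first coordinate zeroed out; second preserved
```

After the projection, no linear probe can recover the suppressed direction from `H_sieved`, forcing later layers to rely on the remaining features; the actual sieve achieves a similar effect via learned erasure rather than an explicit projection.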