I am a researcher at Google DeepMind India, interested in architectural and algorithmic advances for making foundation models efficient (training, inference, model size, etc.) and effective (quality, elastic compute, reasoning, etc.).
In recent research, I addressed various practical challenges in the design of machine learning systems -- robustness, concept drift, cost-efficiency, human-AI interaction, etc. I also worked on cognition-inspired learning systems, including meta-learning, continual learning, and robust vision. In a previous role, I led applied scientist teams at Microsoft Bing Ads, building and supporting large-scale production models of user behavior, including click & conversion prediction, user preference models, and personalization.
I received a Ph.D. from the University of Washington, and have worked in various research capacities at UW, UC San Diego, Microsoft Research, the Fraunhofer Institute, and Lucent Bell Labs.
For updated details, please see my Google Scholar and LinkedIn pages.
Recent papers & news
ICML 2025. Masked Generative Nested Transformers with Decode Time Scaling. S. Goyal, et al.
ICLR Workshops 2025. Universal Model Routing for Efficient LLM Inference. W. Jitkrittum, et al.
Google Research blog posts about our work on spurious features & simplicity bias, and on reweighting for nonstationary learning.
CVPR 2024. Improving Generalization via Meta-Learning on Hard Samples. N. Jain, A.S. Suggala, P. Shenoy.
ICLR 2024. Learning model uncertainty as variance-minimizing instance weights. N. Jain, K. Shanmugham, P. Shenoy.