I am an Applied Scientist at Amazon Annapurna Labs, Cupertino, where I develop algorithms for resource-efficient training and inference of Large Language Models. Before that, I received my PhD in Electrical Engineering from Stanford University, where I was fortunate to be co-advised by Mert Pilanci and Andrea Goldsmith. My research interests broadly lie in identifying and addressing problems in machine learning over resource-constrained edge devices. I enjoy understanding problems in this domain from a theoretical perspective, investigating notions of optimality, and designing theoretically grounded practical algorithms.
I completed my Bachelor's and Master's degrees at the Indian Institute of Technology (IIT) Kharagpur, with a major in Electronics and Electrical Communications Engineering, a minor in Computer Science, and a micro-specialization in Embedded Wireless Systems. During my time there, I worked with my advisor Mrityunjoy Chakraborty on MIMO radar imaging, for which I received the Best Undergraduate Thesis Award. I was also the recipient of the Prime Minister of India Gold Medal for being the class valedictorian.
Email: rajarshisaha95 [at] gmail [dot] com
[May 2026] Will be attending AISTATS-2026 to present our work, Demystifying Transition Matching: When and Why It Can Beat Flow Matching. Come say Hi! [arXiv]
[Jan 2026] Presenting a tutorial on Inference Optimization for Generative AI at AAAI-26, highlighting system–algorithm co-design techniques across LLMs and diffusion models. [Tutorial Website]
[Dec 2025] Giving a talk at the E&ECE Dept., IIT Kharagpur, titled "Matrix Compression via Randomized Low-Rank and Low-Precision Factorization with Applications in Compressing Large Language Models".