I am an Applied Scientist at Amazon Web Services (AWS) AI Research, Santa Clara, where I work on developing algorithms for resource-efficient training and inference for Large Language Models. Prior to that, I graduated with a PhD in Electrical Engineering from Stanford University, where I was fortunate to be co-advised by Mert Pilanci and Andrea Goldsmith. My research interests broadly lie in identifying and addressing problems related to machine learning over resource-constrained edge devices. I enjoy understanding problems in this domain from a theoretical perspective, investigating notions of optimality, and designing theoretically-backed practical algorithms.
I completed my Bachelor's and Master's degrees at the Indian Institute of Technology (IIT) Kharagpur, with a major in Electronics and Electrical Communications Engineering, a minor in Computer Science, and a micro-specialization in Embedded Wireless Systems. During my time there, I worked with my advisor Mrityunjoy Chakraborty on MIMO radar imaging; this work received the Best Undergraduate Thesis Award. I was also the recipient of the Prime Minister of India Gold Medal for being the class valedictorian.
Email: rajsaha [at] stanford [dot] edu
[Jul 2025] Attending ICML in Vancouver. Come say hi at our poster session!
[Apr 2025] Presented a talk at PORTAL (Stanford Center for Portable Accelerated Learning) titled "Compressing LLMs for Efficient Inference: Recent Advances in Low-Rank Methods, Quantization, and Structured Sparsity" [Link].
[Nov 2024] Released our latest work on compressing LLMs using low-precision and low-rank decomposition! Check out our new algorithm CALDERA here. Featured on Princeton Engineering News and Stanford EE News.