I'm currently a senior research manager at Apple, leading a team of engineers and scientists working to understand and improve the reasoning and planning capabilities of large language models (LLMs). My goal is to close the gap between frontier models and the genuine reasoning (general intelligence) that humans possess. I also work on optimizing LLMs for efficient on-device inference. More broadly, I like to understand and demystify how large vision and language models work and learn, in order to find more accurate and efficient pre-training and fine-tuning architectures, algorithms, and strategies.

Before that, I was a senior research scientist at DeepMind. On the research side, I worked on continual and lifelong learning, multitask and transfer learning, understanding the training dynamics of deep neural networks, and reinforcement learning, areas aligned with DeepMind's mission toward artificial general intelligence. On the applied side, I worked on applications of machine learning, e.g., using recommendation/predictive models, meta-learning, causal inference, and reinforcement learning to improve Google products such as YouTube, Cloud, and Sales.

I received my Ph.D. in Computational Science and Engineering from the Georgia Institute of Technology, under the supervision of Hongyuan Zha and Le Song. My research focused on modeling and optimization of sequential event data, stochastic point processes, and dynamics of and on networks. During my Ph.D., I interned at DeepMind, Microsoft Research, the Max Planck Institute for Software Systems, and Google, working on predicting and leveraging health data, information reliability, and analyzing Google Maps local listings data. I received my M.Sc. in Artificial Intelligence from the Computer Engineering Department at Sharif University of Technology and my B.Sc. in Software Engineering from the same university, in 2011 and 2009, respectively.

Google Scholar LinkedIn Twitter