Useful and Reliable AI Agents

Speaker bios

Panel 1: Infrastructure for AI agents

Harrison Chase is the CEO and co-founder of LangChain, a company formed around the open-source Python/TypeScript packages that aim to make it easy to develop language model applications. Prior to starting LangChain, he led the ML team at Robust Intelligence (an MLOps company focused on testing and validation of machine learning models), led the entity linking team at Kensho (a fintech startup), and studied stats and CS at Harvard.

Omar Khattab is a graduating CS Ph.D. candidate at Stanford, a Research Scientist at Databricks, and an incoming Assistant Professor at MIT EECS (Fall 2025), whose work spans Natural Language Processing (NLP), Information Retrieval (IR), and Machine Learning (ML) Systems. Omar's research creates models, algorithms, and programming abstractions for building reliable, transparent, and scalable NLP systems. He is the author of the ColBERT retrieval model, which has helped shape the modern landscape of IR, and the creator of the DSPy framework for building and optimizing language model programs. His lines of work on ColBERT and DSPy form the basis of influential open-source projects, together exceeding a million downloads per month, and have sparked applications at dozens of companies and startups. Omar’s Ph.D. has been supported by the Eltoukhy Family Graduate Fellowship and the Apple Scholars in AI/ML PhD Fellowship.

Shreya Shankar is a PhD student in computer science at UC Berkeley, advised by Dr. Aditya Parameswaran. Her research addresses data challenges in production ML pipelines through a human-centered lens, focusing on data quality, observability, and more recently, leveraging large language models for data preprocessing. Shreya's work has appeared in top data management and HCI venues, including SIGMOD, VLDB, CIDR, CSCW, and UIST. She is a recipient of the NDSEG Fellowship and co-organizes the DEEM workshop at SIGMOD, which focuses on data management in end-to-end machine learning. Prior to her PhD, Shreya worked as an ML engineer and completed her undergraduate degree in computer science at Stanford University.

Panel 2: Evaluating real-world use

Jelena Luketina is a Research Scientist at the AI Safety Institute (AISI), a directorate of the UK Department for Science, Innovation and Technology with the mission of equipping governments with an empirical understanding of the safety of advanced AI systems. At AISI, she works on building the internal agent infrastructure, as well as designing and conducting pre- and post-deployment agent evaluations of LLMs. Jelena holds a PhD from the University of Oxford, where her research focused on transfer in sequential decision-making and deep reinforcement learning. During her PhD, she interned at Google DeepMind and Meta.

Karthik R. Narasimhan is an associate professor in Computer Science at Princeton and head of research at Sierra. His research spans the areas of natural language processing and reinforcement learning, with the goal of building intelligent agents that learn to operate in the world through both their own experience (“doing things”) and leveraging existing human knowledge (“reading about things”). Karthik received his PhD from MIT in 2017, and spent a year as a visiting research scientist at OpenAI in 2017-18 contributing to the first GPT language model. His research has been recognized by the NSF CAREER, an NAE Grainger Foundation grant, a Google Research Scholar Award, an Amazon research award, a Bell Labs runner-up prize, and outstanding paper awards at EMNLP (2015, 2016) and NeurIPS (2022).

Hailey Schoelkopf is a Research Scientist at EleutherAI, a non-profit research lab focused on enabling open science on large-scale AI models. Her research has focused on building reproducible infrastructure for empowering open science on large-scale models, with core interests in language model evaluation and systems optimizations in model pretraining. She is currently a maintainer of the LM Evaluation Harness library for standardized evaluations, and has previously worked on other projects across the LM development cycle such as pretraining Pythia, the first fully transparent language models, and training Llemma, a base model for mathematics.

Panel 3: Ensuring reliability

Mehak Aggarwal is co-founder of a Series A startup, Sybill, which is building an AI assistant to help sales reps manage deals from discovery to close. Prior to starting Sybill, she was a research fellow at Harvard Medical School, where she led AI development efforts for projects aimed at bringing healthcare to underserved communities. She has over half a dozen publications in deep learning and completed her undergraduate degree at the Indian Institute of Technology Delhi.

Iason Gabriel is a Staff Research Scientist at Google DeepMind, where he works in the Ethics Research Team. His work focuses on the ethics of artificial intelligence, including questions about AI value alignment, distributive justice, language ethics and human rights. More generally, he is interested in AI and human values, and in ensuring that technology works well for the benefit of all. He has contributed to several projects that promote responsible innovation in AI, including the creation of the ethics review process at NeurIPS. Before joining DeepMind, he taught moral and political philosophy at Oxford University, and worked for the United Nations Development Program in Lebanon and Sudan.

Azalia Mirhoseini is an Assistant Professor in the Computer Science Department at Stanford University. Professor Mirhoseini's research interest is in developing capable, reliable, and efficient AI systems for solving high-impact, real-world problems. Her work includes generalized learning-based methods for decision-making problems in systems and chip design, self-improving AI models through interactions with the world, and scalable deep learning optimization. Prior to Stanford, she spent several years in industry AI labs, including Anthropic and Google Brain. At Anthropic, she worked on advancing the capabilities and reliability of large language models. At Google Brain, she co-founded the ML for Systems team, with a focus on automating and optimizing computer systems and chip design. She received her BSc degree in Electrical Engineering from Sharif University of Technology and her PhD in Electrical and Computer Engineering from Rice University. Her work has been recognized through the MIT Technology Review’s 35 Under 35 Award, the Best ECE Thesis Award at Rice University, publications in flagship venues such as Nature, and coverage by various media outlets, including MIT Technology Review, IEEE Spectrum, The Verge, The Times, ZDNet, VentureBeat, and WIRED.