Combining Theory and Benchmarks: Towards A Virtuous Cycle to Understand and Guarantee Foundation Model Performance
ICML 2026
Seoul, South Korea
July 10, 2026
Keynote Speakers
Program Manager, Information Innovation Office
DARPA
Topic
TBD
Bio
Dr. Patrick Shafto joined DARPA in September 2023 to develop, execute, and transition programs in artificial intelligence (AI), mathematics, machine learning, and human-machine symbiosis. He is a professor of mathematics and computer science at Rutgers University, and for the two years before joining DARPA, he was a member of the School of Mathematics at the Institute for Advanced Study in Princeton. His research focuses on the mathematical foundations of learning agents, bridging mathematics, machine learning, AI, and cognitive science. He has published more than 100 papers on mathematical, computational, and empirical perspectives on learning. He also co-founded Redpoll, a startup focused on human-centered AI, and served as its chief scientist from 2019 to 2023.
Associate Professor
MIT
Topic
TBD
Bio
Tamara Broderick is an Associate Professor in Electrical Engineering and Computer Science at MIT, where she is a member of LIDS and IDSS. Her research focuses on the foundations of Bayesian inference, uncertainty quantification, and scalable, interpretable machine learning. She develops methods to rigorously characterize uncertainty and improve the reliability of data-driven decisions in complex models. Her contributions have been recognized with the NSF CAREER Award, the ONR Young Investigator Award, and the COPSS Emerging Leader Award. Tamara earned her Ph.D. in Statistics from UC Berkeley and her A.B. in Mathematics from Princeton University.
Professor
Princeton
Topic
TBD
Bio
Mengdi Wang is Co-Director of Princeton AI for Accelerated Invention and a Professor in the Department of Electrical and Computer Engineering and the Center for Statistics and Machine Learning at Princeton University. Her research focuses on machine learning, reinforcement learning, generative AI, large language models, and AI for science. Mengdi received her PhD in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology in 2013, where she was affiliated with the Laboratory for Information and Decision Systems and advised by Dimitri P. Bertsekas. She served as a Program Chair for ICLR 2023 and serves as Senior AC for NeurIPS, ICML, and COLT, and as associate editor for Harvard Data Science Review and Operations Research. Her research is supported by NSF, AFOSR, NIH, ONR, Google, Microsoft C3.ai, FinUP, RVAC Medicines, MURI, and GenMab.
UK AI Security Institute
Topic
TBD
Bio
Cozmin Ududec leads the Science of Evaluation team at the UK AI Security Institute in London. His work focuses on methods for evaluating frontier AI systems and making stronger empirical claims from evaluation results, including evaluation validity, log analysis, inference scaling, LLM personas, and changes in propensities and capabilities over long-horizon tasks. He joined AISI early in its life and previously co-led its pre-deployment testing programme. He holds a PhD in physics from the University of Waterloo and the Perimeter Institute.