Date: June 1, 2026
Speaker: Benjamin Van Roy
Stanford University
Because human preferences are too complex to codify, AIs operate with misspecified objectives. Optimizing such objectives often produces undesirable outcomes; this phenomenon is known as reward hacking. Such outcomes are not necessarily catastrophic. Indeed, most examples of reward hacking in previous literature are benign. And typically, objectives can be modified to resolve the issue.
We study the prospect of catastrophic outcomes induced by AIs operating in complex environments. We argue that, when capabilities are sufficiently advanced, pursuing a fixed consequentialist objective tends to result in catastrophic outcomes. We formalize this by establishing conditions that provably lead to such outcomes. Under these conditions, simple or random behavior is safe. Catastrophic risk arises due to extraordinary competence rather than incompetence.
With a fixed consequentialist objective, avoiding catastrophe requires constraining AI capabilities. In fact, constraining capabilities by the right amount not only averts catastrophe but yields valuable outcomes. Our results apply to any objective produced by modern industrial AI development pipelines.
Benjamin Van Roy is a Professor at Stanford University, where he has served on the faculty since 1998. His research focuses on reinforcement learning and alignment. Beyond academia, he founded the Efficient Agent Team at DeepMind (acquired by Google) and Enuvis (acquired by SiRF/Qualcomm). He has also led research programs at Morgan Stanley and Unica (acquired by IBM). He received the SB in Computer Science and Engineering and the SM and PhD in Electrical Engineering and Computer Science, all from MIT, where his doctoral research was advised by John N. Tsitsiklis.
He is a Fellow of INFORMS and IEEE and has served on the editorial boards of Machine Learning; Mathematics of Operations Research, for which he edited the Learning Theory Area; Operations Research, for which he edited the Financial Engineering Area; the INFORMS Journal on Optimization; and Foundations and Trends in Machine Learning. He has received the MIT George C. Newton Undergraduate Laboratory Project Award, the MIT Morris J. Levin Memorial Master's Thesis Award, the MIT George M. Sprowls Doctoral Dissertation Award, the National Science Foundation CAREER Award, the Stanford Tau Beta Pi Award for Excellence in Undergraduate Teaching, the Management Science and Engineering Department's Graduate Teaching Award, the INFORMS Frederick W. Lanchester Prize, and the INFORMS Philip McCord Morse Lectureship Award.
He has graduated dozens of doctoral students, who have gone on to careers in academia (Carnegie Mellon, Columbia, Cornell, MIT, Northwestern, Rice, Stanford, USC), technology (Adobe, Amazon, DeepMind, Meta, Microsoft, Netflix, OpenAI, Spotify, Tesla, xAI), and finance (Citadel, DE Shaw, Goldman Sachs, Jane Street, Morgan Stanley, Two Sigma).
Date: postponed
Speaker: Alan Scheller-Wolf
Carnegie Mellon University
We study the asymptotic response time tail in the M/G/n multi-server queue with heavy-tailed (regularly varying) job sizes, a setting representative of modern computing workloads. For single-server systems, tail optimization is well understood: under heavy-tailed job sizes, policies such as SRPT that strictly prioritize short jobs are strongly tail optimal, and giving any priority to large jobs is harmful. For multi-server systems, the question has been almost entirely open.
This paper gives the first strongly tail-optimal scheduling policies for the M/G/n queue with heavy-tailed job sizes. Our central finding is that the multi-server case is intrinsically different from the single-server case: Giving a small amount of "sympathy" to large jobs is essential for strong tail optimality. We establish strong (or arbitrarily close to strong) tail optimality across the full stability region, both with and without knowledge of job sizes.
Joint work with Zhouzi Li and Mor Harchol-Balter.
Alan Scheller-Wolf is the Richard M. Cyert Professor of Operations Management at the Tepper School of Business, Carnegie Mellon University. He is currently head of the doctoral program at Tepper; he has previously served as the Senior Associate Dean of Faculty. His research focuses on stochastic processes and queueing models with applications in supply chains, renewable energy, healthcare, service systems, computer science, sustainability, and child welfare. He has published extensively on topics such as assemble-to-order systems, redundancy in server farms, electricity markets with negative prices, and the design of sustainable operations.
Professor Scheller-Wolf has been a member of several major INFORMS prize and governance committees, including the Dantzig Prize Committee, IMPACT Prize Committee, Nicholson Prize Committee, and the M&SOM Journal Review Committee. He has also served on the editorial boards of Management Science, Operations Research, M&SOM, and QUESTA.
His work is frequently motivated by real-world problems and has included collaborations with organizations such as Caterpillar, John Deere, Allegheny County Department of Children, Youth and Families, and the American Red Cross, among others, on issues ranging from logistics and capacity planning to humanitarian operations. At Carnegie Mellon, he teaches courses in sustainable operations, dynamic programming, and Six Sigma, and advises doctoral and undergraduate students in operations management and computer science.
Professor Scheller-Wolf holds a PhD and MPhil in Operations Research from Columbia University and dual undergraduate degrees (BA in Art History and BS in Mathematics and Computational Science) from Stanford University. He served in the US Peace Corps in Serowe, Botswana.