As foundation models continue to scale, the dominant bottleneck is shifting from compute to memory—capacity, bandwidth, and data movement. This trend is especially acute for emerging agentic AI systems, where long contexts, persistent state, and multi-step reasoning place unprecedented pressure on the memory hierarchy. This talk argues that breaking the memory wall requires a coordinated effort across the entire stack, and offers a unified perspective spanning algorithms, systems, architecture, and silicon.
The talk first considers how memory demand can be reduced at its source, through algorithm- and system-level techniques that reshape how models store and move data during inference. It then turns to memory-centric accelerator design, showing how such techniques can be co-designed with the underlying hardware rather than treated in isolation. Finally, it looks toward silicon realization and the future of memory-centric accelerators purpose-built for foundation models and agent systems. The unifying thread—from memory compression to scheduling to memory-centric architecture and specialized chips—points toward a practical path for sustaining the continued scaling of foundation models.
Yiran Chen is the John Cocke Distinguished Professor of Electrical and Computer Engineering at Duke University. He serves as the Principal Investigator and Director of the NSF AI Institute for Edge Computing Leveraging Next Generation Networks (Athena), the Director of the Institute for AI Engineering (IAIE), and the Co-Director of the Duke Center for Computational Evolutionary Intelligence (DCEI). His research group focuses on innovations in emerging memory and storage systems, machine learning and neuromorphic computing, and edge AI. Dr. Chen has authored over 700 publications and holds 96 U.S. patents. His work has received widespread recognition, including two Test-of-Time Awards and 15 Best Paper/Poster Awards. He is the recipient of the IEEE Circuits and Systems Society’s Charles A. Desoer Technical Achievement Award and the IEEE Computer Society’s Edward J. McCluskey Technical Achievement Award. He also serves as the inaugural Editor-in-Chief of the IEEE Transactions on Circuits and Systems for Artificial Intelligence (TCASAI) and the founding Chair of the IEEE Circuits and Systems Society’s Machine Learning Circuits and Systems (MLCAS) Technical Committee. Dr. Chen is a Fellow of the AAAS, ACM, IEEE, and NAI, and a member of the European Academy of Sciences and Arts.