13:30 - 13:40 Welcome Message
13:40 - 14:30 Keynote: Automating Performance Optimization of Data Flow Within HPC Workflows
— Nathan R. Tallent (Pacific Northwest National Laboratory)
14:30 - 15:00 Paper Talk: EmuCSD: A Scalable Framework For Emulating Computational Storage Devices
— Saleh AlSaleh, Wahid Uz Zaman, Mahmut Taylan Kandemir (The Pennsylvania State University)
15:00 - 15:30 Coffee Break
15:30 - 16:10 Invited Talk: TBA
— Xiaoyang Lu (Illinois Institute of Technology)
16:10 - 16:40 Paper Talk: NZFS: A Null File System with Zero-Copy I/O Support for Application Benchmarking
— Shingo Hattori, Osamu Tatebe (University of Tsukuba)
16:40 - 17:10 Paper Talk: Chasing the Rabbit: A Systematic Exploration of Software-Defined HPC Storage
— Hariharan Devarajan (LLNL), Brian Behlendorf (LLNL), Blake Devcich (HPE), Dean Roehrich (HPE),
17:10 - 17:20 Closing Remarks
Invited Talk Abstract — Xiaoyang Lu
Modern AI, scientific computing, and data-intensive applications are increasingly constrained by the cost of moving data across deep and heterogeneous memory hierarchies. As a result, system performance depends not only on computational capability but also on where data resides, how efficiently it moves, and whether data-access latency can be reduced, overlapped, or hidden. This talk presents a data-centric computing perspective on these challenges, showing how analytical modeling of data movement, concurrency-aware memory optimization, overlap of data movement with computation, and memory-centric architecture design can work together to improve performance for memory-bound AI and HPC workloads. The broader goal is to motivate future systems in which architectures, runtimes, compilers, and emerging memory/storage technologies are co-designed around data movement as a first-class optimization target.
Speaker Bio
Xiaoyang Lu is a Research Assistant Professor in the Department of Computer Science at Illinois Institute of Technology. His research interests include memory performance modeling, memory-centric computer architecture, data movement optimization, cache and prefetching systems, processing-in-memory architectures, and hardware/software co-design for AI and data-intensive workloads.