Workshop Keynote: Exponential Technologies: High-Performance Computing as a Foundation for AI and Digital Economies (Slides)
Speaker: Horst Simon
The exponential progress of high-performance computing (HPC) underpins the recent breakthroughs in Artificial Intelligence—from training large language models to advances in generative and reinforcement learning. This talk outlines the evolution of HPC, highlighting milestones like the Exascale Computing Project and the TOP500 rankings, and explores the convergence of compute, data, and algorithms.
We also consider the growing computational demands of Bitcoin mining and other blockchain technologies, drawing parallels with AI in their reliance on massive, energy-intensive infrastructure. Projections suggest that future growth in both domains will stretch current capabilities and sustainability limits.
Addressing these challenges will require continued innovation in HPC architectures, energy efficiency, and system integration. In the Exponential Age, HPC is not only the engine behind scientific discovery but also a strategic asset for emerging digital economies.
Title: Benchmarking for HPC, AI and Workflows (Slides)
Speaker: Kalyan Kumaran
The Argonne Leadership Computing Facility (ALCF) has a well-established practice of creating tailored benchmark suites for each of its system procurements. These benchmarks are designed to represent the range of scientific applications typically run on ALCF systems, while also aligning with the goals and priorities of upcoming projects. In this talk, I’ll share the current progress on the benchmark suite for ALCF-4, codenamed Helios. This set of benchmarks reflects key areas of interest for the system—artificial intelligence (AI), modeling and simulation (ModSim), and scientific workflows. I'll discuss the selection process, the types of applications being considered, and how these benchmarks help guide the design and evaluation of the new system.
Title: The JUPITER Benchmark Suite: Details, Experiences, and Future Plans
Speaker: Andreas Herten
To procure the first exascale supercomputer in Europe, JUPITER, a comprehensive benchmark suite was developed, incorporating a diverse set of real-world applications and synthetic benchmarks. The JUPITER Benchmark Suite was published in detail, with all included benchmarks released as open-source software. The talk will present an overview of the suite, including design decisions and reference results, share experiences from its creation and from the time since then, and outline future plans for benchmarking the HPC system at JSC.
Title: Milabench: An AI benchmark suite (Slides)
Speaker: Xavier Bouthillier (Mila, University of Montreal)
AI workloads, particularly those driven by deep learning, are introducing novel usage patterns to high-performance computing (HPC) systems that are not comprehensively captured by standard HPC benchmarks. As one of the largest academic research centers dedicated to deep learning, Mila identified the need to develop and maintain a custom benchmarking suite to address the diverse requirements of its community of over 1,000 researchers. Last year, the benchmarking suite was updated using a novel methodology that leverages LLMs to analyze the whole corpus of 867 papers published in 2023. This analysis gave us a better understanding of the research landscape at Mila and allowed us to ensure broad, proportional coverage of the institute's active research domains. In this talk, I will present the design process behind the latest version of Milabench and the resulting 42 benchmarks. This will cover how we used LLMs to analyze the papers, how we evaluated the reliability of this method, and how we derived our coverage targets from the analysis.
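As a hedged illustration of the kind of pipeline the abstract describes, the sketch below classifies paper abstracts with an LLM and turns the resulting domain frequencies into per-domain benchmark counts. It is not Mila's actual methodology: the client library, model name, and domain taxonomy are assumptions made for illustration.

```python
# Illustrative sketch only; not Milabench's actual analysis pipeline.
# Assumes an OpenAI-compatible client; model and taxonomy are hypothetical.
from collections import Counter
from openai import OpenAI

DOMAINS = ["nlp", "computer-vision", "reinforcement-learning", "graphs", "audio"]

client = OpenAI()

def classify(abstract: str) -> str:
    """Ask the LLM to map one paper abstract to a single research domain."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model would do
        messages=[
            {"role": "system",
             "content": f"Classify the paper into exactly one of: {', '.join(DOMAINS)}. "
                        "Reply with the label only."},
            {"role": "user", "content": abstract},
        ],
    )
    return resp.choices[0].message.content.strip().lower()

def coverage_targets(abstracts: list[str], n_benchmarks: int = 42) -> dict[str, int]:
    """Turn domain frequencies across the corpus into per-domain benchmark counts."""
    counts = Counter(classify(a) for a in abstracts)
    total = sum(counts.values())
    return {d: round(n_benchmarks * c / total) for d, c in counts.items()}
```

In practice, the reliability of such labels would need to be checked against a human-annotated sample, which is the kind of evaluation the talk describes.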
Title: Benchmarking with Workflows and Workflow Management Systems (Slides)
Speaker: Ewa Deelman (University of Southern California)
Traditional benchmarking strategies in high-performance computing (HPC) focus primarily on system-centric metrics such as floating-point operations per second (FLOPS), memory bandwidth, and interconnect performance. While these metrics remain essential, they do not capture the full complexity of modern scientific discovery pipelines. Increasingly, researchers rely on a wide array of interconnected resources—including instruments, sensors, clouds, and distributed data repositories—to execute end-to-end workflows. These workflows span multiple stages of data acquisition, preprocessing, simulation, analysis, and postprocessing, demanding coordination and fault tolerance across heterogeneous environments.
In this talk, I will argue that workflow-centric benchmarking is necessary to address the evolving landscape of computational science. By moving beyond traditional benchmarks, we can better evaluate how well entire workflows perform under realistic conditions, including data movement, dynamic resource allocation, and complex dependency management. I will illustrate these concepts through experiences with the Pegasus Workflow Management System, which has been applied to scientific workflows from diverse domains—ranging from gravitational-wave physics to bioinformatics. Through concrete examples, I will show how workflow-level metrics and benchmarks can more accurately reflect real-world performance, guiding system architects, software developers, and domain scientists in designing and deploying future-generation HPC and distributed computing platforms.
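As a concrete, hedged sketch of what a workflow-centric benchmark target can look like, the fragment below declares a two-stage pipeline with the Pegasus 5.x Python API (Pegasus.api). The transformation and file names are hypothetical, and a real setup would also configure transformation, replica, and site catalogs before planning.

```python
# Minimal sketch using the Pegasus 5.x Python API; names are hypothetical
# and catalog/site configuration is omitted for brevity.
from Pegasus.api import Workflow, Job, File

raw = File("raw.dat")        # e.g., data staged from an instrument or repository
clean = File("clean.dat")    # intermediate product of preprocessing
result = File("result.dat")  # final analysis output

preprocess = Job("preprocess").add_inputs(raw).add_outputs(clean)
analyze = Job("analyze").add_inputs(clean).add_outputs(result)

wf = Workflow("benchmark-pipeline")
wf.add_jobs(preprocess, analyze)  # dependency inferred from the shared file clean.dat
wf.write("workflow.yml")          # abstract workflow; planning maps it onto resources
```

Benchmarking at this level would measure not only each job's runtime but also data staging, scheduling, and recovery overheads across the whole graph.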
Title: Benchmarking the Whole Workflow (Slides)
Speaker: Mike Ringenberg (Microsoft)
Differences between how we benchmark systems and how we actually use them have persisted throughout the history of HPC & AI (and computing in general). For example, artificial intelligence and modeling-and-simulation workflows often involve multiple complex data preparation steps. These can include data discovery, ingestion, cleaning, formatting, de-duplication, and other preparatory activities that ensure the model or simulation has the data it needs in a format it can work with. However, existing HPC & AI benchmarks typically focus only on the training, inference, or simulation steps of the workflow. In this talk, we explore some of these gaps, ask why they have persisted, and invite discussion on future directions and efforts to address them.
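To make the gap concrete, here is a toy-scale sketch of a harness that times data preparation alongside the compute step, so the preparatory stages listed above actually appear in the reported figure of merit; the stage functions are hypothetical stand-ins.

```python
# Illustrative only: a toy end-to-end timing harness that includes data
# preparation, not just the compute kernel most benchmarks measure.
import hashlib
import time

def deduplicate(records: list[str]) -> list[str]:
    """Drop exact duplicates by content hash, a common data-prep step."""
    seen, out = set(), []
    for r in records:
        h = hashlib.sha256(r.encode()).hexdigest()
        if h not in seen:
            seen.add(h)
            out.append(r)
    return out

def benchmark_workflow(records, train):
    """Time every stage so data prep shows up in the figure of merit."""
    timings = {}
    t0 = time.perf_counter()
    clean = deduplicate(records)
    timings["data_prep"] = time.perf_counter() - t0
    t0 = time.perf_counter()
    train(clean)  # the step existing benchmarks typically measure alone
    timings["training"] = time.perf_counter() - t0
    return timings
```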
Title: In an era of rapid AI & HPC innovations, how should benchmarks evolve? (Slides)
Speaker: Harun Bayraktar and John Gunnels (NVIDIA)
The goal of a benchmark is to measure system impact. With the increasing integration of AI into scientific workflows (e.g., the AlphaFold protein structure database) in both supercomputing centers and large cloud datacenters, an unprecedented opportunity has arrived for long-established benchmarks to evolve while preserving their historical provenance. Providing a figure of merit that reflects a broad swath of tomorrow's most consequential workloads requires a focus on widespread trends in both software and hardware, while giving due consideration to the practical concerns involved. To signal the scientific impact of a system, demonstrating its value with respect to the next AlphaFold as well as its capacity to handle the prolonged stress and computational demands of time-tested algorithms of continuing utility, a benchmark must be able to exercise a broad range of computational engines. Floating-point emulation provides one way in which abundant resources designed for reduced-precision computing can be creatively and effectively utilized in service of tasks that require high levels of accuracy. In this talk, we will demonstrate how this coupling not only offers the potential for unprecedented performance and power efficiency but also opens avenues for further innovation in areas such as data compression and finer-grained mixed-precision algorithms that will impact a wide range of computing platforms.
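As a rough illustration of the splitting idea behind floating-point emulation, the sketch below decomposes float64 operands into float32 pieces and rebuilds a higher-accuracy dot product from low-precision partial products. Production schemes (e.g., Ozaki-style approaches) rely on error-free transformations and exact accumulation; this toy version only conveys the decomposition, not NVIDIA's implementation.

```python
# Toy sketch of precision emulation: a float64 value is represented as a
# float32 head plus a float32 tail, so low-precision multiply units can
# contribute to a higher-accuracy result.
import numpy as np

def split(x: np.ndarray):
    """Split float64 values into a float32 head and a float32 tail."""
    hi = x.astype(np.float32)
    lo = (x - hi.astype(np.float64)).astype(np.float32)
    return hi, lo

def emulated_dot(a: np.ndarray, b: np.ndarray) -> float:
    """Approximate a float64 dot product from four float32 partial products."""
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    # Each partial product runs at the low precision the hardware is built for;
    # the four partial results are then combined at higher precision.
    parts = (np.dot(a_hi, b_hi), np.dot(a_hi, b_lo),
             np.dot(a_lo, b_hi), np.dot(a_lo, b_lo))
    return float(sum(np.float64(p) for p in parts))
```

On hardware with abundant reduced-precision throughput, several cheap partial products can outrun a single native high-precision operation, which is the trade-off the abstract alludes to.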