14th International Workshop on Runtime and Operating Systems for Supercomputers
ROSS @ SC25
Note: Accepted papers will appear in the SC25 workshop proceedings
[10/16/2025] - Tentative program published below
[9/9/2025] - Paper notifications sent to authors
[8/7/2025] - Final deadline extension: August 15, 2025 (AoE)
[4/28/2025] - The ROSS Workshop will be held on Sunday, November 16, 2025
[3/18/2025] - The ROSS Workshop has been accepted as a half-day workshop to be included as part of the SC25 program!
ROSS is a workshop aimed at identifying looming problems and discussing promising research solutions in the area of runtime and operating systems for extreme-scale supercomputer systems. Specifically, ROSS focuses on principles and techniques to design, implement, optimize, or operate runtime and operating systems for extreme-scale supercomputing and cloud environments. In addition to typical workshop publications, we encourage novel and possibly immature ideas, provided that they are interesting and on-topic. Well-argued position papers are also welcome.
This workshop will consider all aspects of OS and runtime systems at extreme-scale including, but not limited to:
OS and runtime system scalability on many-node and multi/many-core systems;
Management of heterogeneous and reconfigurable compute resources, including FPGAs, GPUs, etc;
Management of emerging post-Moore computing architectures, including quantum, neuromorphic, etc;
Distributed/hybrid/partitioned OSs and runtime systems for supercomputing;
Analysis and prevention of system noise and performance variability;
Runtime and operating systems for resource disaggregation;
Modeling and performance analysis of runtime systems;
The use of machine learning and AI techniques in the autotuning of system software and resource management;
OS and runtime considerations for large-volume, high-performance I/O;
Memory management and emerging memory technologies;
OS and runtime aspects of HPC in the cloud, convergence of supercomputing and cloud environments;
Infrastructure for cloud functions and serverless computing in the context of HPC;
OS and runtime support for emerging workloads such as on-demand or persistent service use cases;
Virtualization in HPC, including virtual machines and application containers;
The role of the OS and runtime system in minimizing power usage and maximizing energy efficiency;
OS and runtime impacts of security and trust for HPC.
Paper submissions open: July 1, 2025
Paper submission closes: August 15, 2025 (AoE) (extended from August 8, 2025)
Author notification: September 9, 2025
Camera-ready papers: September 22, 2025
Submissions for workshop papers must be single-blind, at least five (5) pages in length, and no more than eleven (11) pages including all text, appendices, figures, and references. Accepted papers that meet these requirements will be published.
Papers must be submitted via https://submissions.supercomputing.org/ and must conform to the requirements of the ACM proceedings template (two-column, US letter). LaTeX users should use the “sigconf” option; Word authors should use the “Interim Layout”.
Templates can be found here: https://www.acm.org/publications/proceedings-template
For the camera-ready version submission, carefully follow the instructions sent to authors. This will entail the following activities:
1) Signing the copyright transfer form to ACM – check e-mail – Due September 12 (hard deadline)
2) Updating paper metadata (title, author info) in Linklings – Due September 22
3) Submitting the camera-ready version via the ACM TAPS system – Due September 22 (hard deadline)
When: November 16, 2025
Where: SC25 St. Louis, Missouri
Venue: America's Center Convention Complex
Registration: Registration opens July 9, 2025
Submissions: submissions.supercomputing.org/?page=Submit&id=SCWorkshopROSSSubmission&site=sc25
Questions: email kbferre@sandia.gov
ROSS 2025 will be held Sunday, November 16, 2025, from 2pm - 5:30pm in room #265 of the America's Center Convention Complex. The tentative schedule is provided below. All times are Central Standard Time (UTC-6), the local time zone in St. Louis, MO, USA.
[2:00pm] Opening Remarks
[2:05pm - 3:00pm] Featured Speaker
Title: State-of-the-Art Communication Software for Supercomputers and its Applications by Dr. Jeff Hammond (NVIDIA)
Abstract: This talk will focus on high-performance communication software for GPU supercomputers. It will explain NCCL and NVSHMEM, including their historical context from MPI and SHMEM. The functionality and performance will be demonstrated through an example from linear algebra. Real world results from both scientific and commercial AI use cases will be described.
Bio: Jeff Hammond is a Principal Engineer at NVIDIA in the data center software organization, focused on GPU communications. He has extensive experience with the design and use of parallel programming models and scientific applications. Jeff’s most notable achievements include the MPI-5 Application Binary Interface standard, development of the MPI-3 one-sided communication software ecosystem, and contributions to the NWChem quantum chemistry project. He received a PhD in Chemistry from the University of Chicago in 2009.
[3:00pm - 3:30pm] SC25 Afternoon Break
[3:30pm - 4:00pm]
Extending the C++ Execution Control Library to Support Dynamic Parallel Runtime Systems
Ian Henriksen, Jan Ciesko and Stephen L. Olivier (Sandia National Laboratories)
[4:00pm - 4:30pm]
Assessing Page Reclamation Mechanisms for Linux
Shaochang Liu and Jie Ren (College of William & Mary)
[4:30pm - 5:00pm]
Reproducible Performance Evaluation of OpenMP and SYCL Workloads under Noise Injection
Christoffer Persson, Mathias Pretot, Minyu Cui, and Miquel Pericas (Chalmers University of Technology and University of Gothenburg)
[5:00pm - 5:30pm] Invited Talk
Title: Multikernel: Kernel-to-Kernel Isolation with Elastic Resource Management, by Cong Wang (Multikernel Technologies)
Abstract: Modern HPC and AI workloads challenge fundamental limitations in traditional operating system designs. Shared kernels introduce unpredictable performance interference, virtualization imposes unacceptable overhead, and monolithic architectures waste resources on unnecessary features.
Multikernel architectures address these challenges by providing each application with a dedicated, customized kernel instance. Combined with elastic resource management, this approach delivers predictable performance through kernel-level isolation, near-native performance without hypervisor overhead, and automatic workload-specific optimization for both HPC and AI.
This talk examines why current hardware capabilities and workload demands create the right conditions to fundamentally rethink OS design, and presents the architectural principles and system design of our multikernel proposal.
Bio: Cong Wang is the Founder and CEO at Multikernel Technologies. He is a distinguished Linux kernel developer with 17 years of experience and has been a Linux kernel maintainer of the networking traffic control subsystem since 2017. With over 1000 commits to the Linux kernel project, he possesses a deep understanding of cloud computing. Prior to founding Multikernel Technologies Inc., he led a software engineering team at ByteDance.
Kurt Ferreira - Sandia National Laboratories
Balazs Gerofi - Intel Corporation, USA
Torsten Hoefler - ETH Zürich
Jack Lange - Oak Ridge National Laboratory
Patrick Bridges - University of New Mexico
Miquel Pericas - Chalmers University of Technology; University of Gothenburg, Sweden
Kevin Pedretti - Sandia National Laboratories
Brian Kocoloski - Advanced Micro Devices, Inc. (AMD)
Bronis R. de Supinski - Lawrence Livermore National Laboratory (LLNL)
Scott Levy - Sandia National Laboratories
Bernd Mohr - Jülich Supercomputing Centre (JSC)
Nicholas Gordon - Barkhausen Institute