ESSA 2022 : 3rd Workshop on Extreme-Scale Storage and Analysis

Held in conjunction with IEEE IPDPS 2022 - June 3rd, 2022

Program

10:00-10:10 EDT / 16:00-16:10 CEST
Welcome

10:10-10:50 EDT / 16:10-16:50 CEST
Keynote (Chair: Osamu Tatebe, University of Tsukuba)
Keep Your Composure: HPC, Data Services, and the Mochi Project
Rob Ross (Argonne National Laboratory)

10:50-11:20 EDT / 16:50-17:20 CEST
Invited talk (Chair: Gabriel Antoniu, Inria)
DAOS: Nextgen Storage Stack for HPC and AI
Johann Lombardi (Intel)

11:20-11:50 EDT / 17:20-17:50 CEST
Technical regular paper talk (Chair: Gabriel Antoniu, Inria)
Caching Support for CHFS Node-local Persistent Memory File System
Osamu Tatebe (University of Tsukuba), Hiroki Ohtsuji (Fujitsu Limited)

11:50-12:10 EDT / 17:50-18:10 CEST
Technical short paper talk (Chair: Gabriel Antoniu, Inria)
A Locality-aware Cooperative Distributed Memory Caching for Parallel Data Analytic Applications
Chia-Ting Hung, Jerry Chou (National Tsing Hua University), Ming-Hung Chen, I-Hsin Chung (IBM T.J. Watson)

12:10-12:40 EDT / 18:10-18:40 CEST
Invited talk (Chair: Murali Emani, ANL)
The Curious Incident of the Data in the Scientific Workflow
Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory)

12:40-13:10 EDT / 18:40-19:10 CEST
Technical regular paper talk (Chair: Murali Emani, ANL)
Modeling Power Consumption of Lossy Compressed I/O for Exascale HPC System
Grant Wilkins, Jon C. Calhoun (Clemson University)

Keynote

Robert B. Ross (Argonne National Laboratory)

Robert Ross is a Senior Computer Scientist at Argonne National Laboratory and a Senior Fellow at the Northwestern-Argonne Institute for Science and Engineering. He is the Director of the DOE SciDAC RAPIDS Institute for Computer Science, Data, and Artificial Intelligence. Rob’s research interests are in system software for high performance computing systems, in particular distributed storage systems and libraries for I/O and message passing. Rob received his Ph.D. in Computer Engineering from Clemson University in 2000. Rob was a recipient of the 2004 Presidential Early Career Award for Scientists and Engineers and the 2020 Ernest Orlando Lawrence Award, and he was named an ACM Fellow in 2021.

Invited Speakers

Lavanya Ramakrishnan (Lawrence Berkeley National Laboratory)

Lavanya Ramakrishnan is a Senior Scientist and Division Deputy in the Scientific Data Division at Lawrence Berkeley National Laboratory. Her research interests are in building software tools for computational and data-intensive science, with a focus on workflow, resource, and data management. More recently, she has been using user research methods to understand, verify, and validate the contexts of use and the social challenges that often shape tool design and development. She currently leads several project teams that bring together social scientists, software engineers, and computer scientists.

Johann Lombardi (Intel Corporation)

Johann Lombardi is a senior principal engineer in the Super Computing Group (SCG) at Intel. He is the lead architect of the Distributed Asynchronous Object Store (DAOS). Prior to DAOS, Johann led the sustaining team responsible for worldwide Lustre filesystem support for five years at Cluster File Systems, Sun, Oracle, and Whamcloud. He then transitioned to research programs (Fast Forward, ESSIO, and Path Forward) at Intel to lead the development of a next-generation storage stack for Exascale HPC, Big Data, and AI, which resulted in DAOS.

Workshop Overview

Advances in storage are becoming increasingly critical because workloads on high performance computing (HPC) and cloud systems are producing and consuming more data than ever before, and data volumes promise only to grow in the coming years. At the same time, the last decades have seen relatively few changes in the structure of parallel file systems (e.g., Lustre, GPFS) and limited interaction between their evolution and that of I/O support systems that take advantage of hierarchical storage layers (e.g., node-local burst buffers). Recently, however, the community has seen a large uptick in innovation in storage systems and I/O support software, for several reasons:

  • Technology: The availability of an increasing number of persistent solid-state storage technologies that can replace either memory or disk is creating new opportunities for the structure of storage systems.

  • Performance requirements: Disk-based parallel file systems cannot satisfy the performance needs of high-end systems. However, it is not clear how solid-state storage can best be used to achieve the needed performance, so new approaches for using solid-state storage in HPC systems are being designed and evaluated.

  • Application evolution: Data analysis applications, including graph analytics and machine learning, are becoming increasingly important both for scientific computing and for commercial computing. I/O is often a major bottleneck for such applications, both in cloud and HPC environments, especially when fast turnaround or the integration of heavy computation and analysis is required.

  • Infrastructure evolution: In the future, HPC technology will no longer be deployed only in dedicated supercomputing centers. “Embedded HPC”, “HPC in the box”, “HPC in the loop”, “HPC in the cloud”, “HPC as a service”, and “near-real-time simulation” are concepts requiring new small-scale deployment environments for HPC. Creating a “continuum” of computing will require a federation of systems and functions with consistent mechanisms for managing I/O, storage, and data processing across all participating systems.

  • Virtualization and disaggregation: As virtualization and disaggregation become broadly used in cloud and HPC computing, virtualized storage is gaining importance, and efforts will be needed to understand its implications for performance.

The goal of the ESSA Workshop is to bring together expert researchers and developers in data-related areas (storage, I/O, processing, and analysis) on extreme-scale infrastructures (HPC systems, clouds, edge systems, or hybrid combinations of these) to discuss advances and possible solutions to the new challenges we face.

Topics

  • Extreme-scale storage systems (on high-end HPC infrastructures, clouds, or hybrid combinations of them)

  • Extreme-scale parallel and distributed storage architectures

  • The synergy between different storage models (POSIX file system, object storage, key-value, row-oriented, and column-oriented databases)

  • Structures and interfaces for leveraging persistent solid-state storage and storage-class memory

  • High-performance I/O libraries and services

  • I/O performance in extreme-scale systems and applications (HPC/cloud/edge)

  • Storage and data processing architectures and systems for hybrid HPC/cloud/edge infrastructures, in support of complex workflows potentially combining simulation and analytics

  • Integrating computation into the memory and storage hierarchy to facilitate in-situ and in-transit data processing

  • I/O characterization and data processing techniques for application workloads relying on extreme-scale parallel/distributed machine-learning/deep learning

  • Tools and techniques for managing data movement among compute and data intensive components

  • Data reduction and compression

  • Failure and recovery of extreme-scale storage systems

  • Benchmarks and performance tools for extreme-scale I/O

  • Language and library support for data-centric computing

  • Storage virtualization and disaggregation

  • Ephemeral storage media and consistency optimizations

  • Storage architectures and systems for scalable stream-based processing

  • Case studies of I/O services and data processing architectures in support of various application domains (bioinformatics, scientific simulations, large observatories, experimental facilities, etc.)

Chairs

Workshop Chairs

Chair: Osamu Tatebe, University of Tsukuba, Japan
Co-Chair: Gabriel Antoniu, Inria, France

Program Chairs

Chair: Kento Sato, RIKEN, Japan
Co-Chair: Murali Emani, Argonne National Laboratory, USA

Steering Committee

Gabriel Antoniu, Inria, Rennes, France
Franck Cappello, Argonne National Laboratory, USA
Toni Cortés, Barcelona Supercomputing Center, Spain
Kathryn Mohror, Lawrence Livermore National Laboratory, USA
Kento Sato, RIKEN, Japan
Marc Snir, University of Illinois at Urbana-Champaign, USA
Weikuan Yu, Florida State University, USA