RECUP: Scalable Metadata and Provenance for Reproducible Hybrid Workflows
We develop methods to enable the reproducibility of performance and scientific results for numerical and data-intensive simulations orchestrated by workflows that run on high-performance systems.
What (meta)data is relevant for reproducibility?
How can we capture, curate, fuse, and index the relevant (meta)data with minimal overhead at scale?
How can we compare two repeated runs both in terms of (meta)data and intermediate results to study different types of reproducibility?
How can we identify the root causes of runs that are not reproducible, both from the perspective of performance and results?
Application performance data and metadata are extracted at runtime to enable a unified, FAIR-enabled metadata format that captures task details (dependencies, execution order, performance metrics, inputs and outputs, etc.). Darshan will be employed for workflow I/O instrumentation and Mochi for high-volume data aggregation. Radical manages resource allocation for HPC applications. Metadata and pointers to data are saved into an RO-Crate profile. Very low overhead checkpointing (VeLOC) captures application execution, with fast lineage comparison based on scalable hashing, as a step towards a reproducibility framework.
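The idea of hash-based lineage comparison can be illustrated with a minimal Python sketch. It is not the project's implementation and does not use the VeLOC API: the function names (`chunk_hashes`, `differing_chunks`, `first_divergence`), the fixed 4096-byte chunk size, the use of SHA-256, and the representation of a run as an ordered mapping from task name to the bytes of its intermediate result are all assumptions made for illustration.

```python
import hashlib
from itertools import zip_longest
from typing import Dict, List, Optional


def chunk_hashes(data: bytes, chunk_size: int = 4096) -> List[str]:
    """Hash fixed-size chunks of one intermediate result (hypothetical scheme)."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def differing_chunks(a: bytes, b: bytes, chunk_size: int = 4096) -> List[int]:
    """Return the indices of chunks whose hashes differ between two results."""
    hashes_a = chunk_hashes(a, chunk_size)
    hashes_b = chunk_hashes(b, chunk_size)
    # zip_longest pads the shorter list with None, so a length mismatch
    # is reported as differing chunks at the tail.
    return [
        i for i, (x, y) in enumerate(zip_longest(hashes_a, hashes_b)) if x != y
    ]


def first_divergence(
    run_a: Dict[str, bytes], run_b: Dict[str, bytes]
) -> Optional[str]:
    """Return the first task (in run_a's order) whose result differs in run_b.

    Assumes each dict lists tasks in execution order; returns None when
    every task in run_a has a byte-identical result in run_b.
    """
    for task, result_a in run_a.items():
        result_b = run_b.get(task)
        if result_b is None or differing_chunks(result_a, result_b):
            return task
    return None


# Two hypothetical runs: the second diverges in the latter half of "simulate".
run_a = {"preprocess": b"a" * 8192, "simulate": b"b" * 8192}
run_b = {"preprocess": b"a" * 8192, "simulate": b"b" * 4096 + b"c" * 4096}

diverged = first_divergence(run_a, run_b)
print(diverged, differing_chunks(run_a["simulate"], run_b["simulate"]))
```

Comparing hashes rather than raw data means only the digests need to be stored and exchanged between runs. This sketch detects exact byte-level differences only; results that differ within a floating-point tolerance would need a different comparison.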
Contact: Line Pouchard, PI
E-mail: lcpouch@sandia.gov
Pre-Sandia website: linepouchard.github.io/profile/index.html