EE HPC WG HPC System Power & Energy Measurement Methodology for Benchmarking

Sustainably supporting science through committed community action

LINK TO MOST CURRENT DOCUMENT:

Energy Efficient High Performance Computing Power Measurement Methodology

https://drive.google.com/file/d/1eeNx9yBWUdtYerTwqk3PeJ4u1wamW3bU/view?usp=sharing

The Power Measurement Methodology Team of the Energy Efficient HPC Working Group (EE HPC WG) has formulated and recommends a methodology for measuring, recording, and reporting the power used by a high performance computing (HPC) system while running a workload. This document is part of a collaborative effort between the Green500, the Top500, and the EE HPC WG. Although the methodology is intended to be generally applicable to benchmarking a variety of workloads, the initial focus is on High Performance LINPACK (HPL).

The document defines four aspects of a power measurement and three quality levels. All four aspects have requirements that become increasingly stringent at higher quality levels.

The four aspects are as follows:

1. Granularity, time span, and type of raw measurement

2. Machine fraction instrumented

3. Subsystems included in instrumented power

4. Location of measurement in the power distribution network and accuracy of power meters

The quality ratings are as follows:

• Adequate, called Level 1 (L1)

• Moderate, called Level 2 (L2)

• Best, called Level 3 (L3)

For a submission to be granted a given quality level, it must satisfy the requirements of all four aspects at that quality level or higher.
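
The rule above amounts to taking the lowest level achieved across the four aspects. The following is a minimal Python sketch of that reading, assuming each aspect has already been assessed on its own. The names `Level`, `ASPECTS`, and `overall_level`, the short aspect keys, and the use of `None` for an aspect that does not meet even Level 1 are all hypothetical and are not taken from the methodology document.

```python
from enum import IntEnum
from typing import Mapping, Optional


class Level(IntEnum):
    """Quality levels named in the methodology."""
    L1 = 1  # Adequate
    L2 = 2  # Moderate
    L3 = 3  # Best


# Hypothetical short keys for the four aspects listed above.
ASPECTS = (
    "granularity_timespan_type",
    "machine_fraction",
    "subsystems_included",
    "measurement_location_accuracy",
)


def overall_level(aspect_levels: Mapping[str, Optional[Level]]) -> Optional[Level]:
    """Return the highest level that every aspect satisfies.

    An aspect mapped to None is assumed (for this sketch) to fall short
    of Level 1, in which case no quality level is granted.
    """
    missing = [a for a in ASPECTS if a not in aspect_levels]
    if missing:
        raise ValueError(f"missing aspect assessments: {missing}")
    levels = [aspect_levels[a] for a in ASPECTS]
    if any(level is None for level in levels):
        return None
    # A level is granted only if all four aspects reach it, so the
    # weakest aspect determines the result.
    return min(levels)


# Illustrative values only, not a real submission.
submission = {
    "granularity_timespan_type": Level.L3,
    "machine_fraction": Level.L2,
    "subsystems_included": Level.L3,
    "measurement_location_accuracy": Level.L2,
}
print(overall_level(submission).name)  # L2
```

In this illustrative case two aspects reach Level 3, but the submission as a whole would only qualify for Level 2 because the other two aspects stop there.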

The Team has collected and analyzed data to support the creation of this methodology. Some of this work has been published; the publications are listed below.

TITLE: A power measurement methodology for large scale, high performance computing

https://dl.acm.org/doi/10.1145/2568088.2576795

ABSTRACT: Improvement in the energy efficiency of supercomputers can be accelerated by improving the quality and comparability of efficiency measurements. The ability to generate accurate measurements at extreme scale is just now emerging. The realization of system-level measurement capabilities can be accelerated with a commonly adopted and high quality measurement methodology for use while running a workload, typically a benchmark. This paper describes a methodology that has been developed collaboratively through the Energy Efficient HPC Working Group to support architectural analysis and comparative measurements for rankings, such as the Top500 and Green500. To support measurements with varying amounts of effort and equipment required we present three distinct levels of measurement, which provide increasing levels of accuracy. Level 1 is similar to the Green500 run rules today, a single average power measurement extrapolated from a subset of a machine. Level 2 is more comprehensive, but still widely achievable. Level 3 is the most rigorous of the three methodologies but is only possible at a few sites. However, the Level 3 methodology generates a high quality result that exposes details that the other methodologies may miss. In addition, we present case studies from the Leibniz Supercomputing Centre (LRZ), Argonne National Laboratory (ANL) and Calcul Québec Université Laval that explore the benefits and difficulties of gathering high quality, system-level measurements on large-scale machines.

AUTHORS: Scogland, Thomas; Steffen, Craig; Wilde, Torsten; Parent, Florent; Coghlan, Susan; Bates, Natalie; Feng, Wu; Strohmaier, Erich

TITLE: Node Variability in Large-Scale Power Measurements: Perspectives from the Green500, Top500 and EE HPC WG

https://dl.acm.org/doi/10.1145/2807591.2807653

ABSTRACT: The last decade has seen power consumption move from an afterthought to the foremost design constraint of new supercomputers. Measuring the power of a supercomputer can be a daunting proposition, and as a result, many published measurements are extrapolated. This paper explores the validity of these extrapolations in the context of inter-node power variability and power variations over time within a run. We characterize power variability across nodes in systems at eight supercomputer centers across the globe. This characterization shows that the current requirement for measurements submitted to the Green500 and others is insufficient, allowing variations of up to 20% due to measurement timing and a further 10-15% due to insufficient sample sizes. This paper proposes new power and energy measurement requirements for supercomputers, some of which have been accepted for use by the Green500 and Top500, to ensure consistent accuracy.

AUTHORS: Scogland, Thomas; Azose, Jonathan; Rohr, David; Rivoire, Suzanne; Bates, Natalie; Hackenberg, Daniel