HPCC Benchmark

HPCC (the HPC Challenge benchmark suite) was developed to study future petascale computing systems, and is intended to provide a realistic measurement of modern computing workloads. HPCC is made up of seven common computational kernels: STREAM, HPL, DGEMM (matrix multiply), PTRANS (parallel matrix transpose), FFT, RandomAccess, and b_eff (bandwidth/latency tests). Taken together, the kernels are chosen to cover memory access patterns with both high and low spatial locality and both high and low temporal locality. The tests are scalable, and can be run on a wide range of platforms, from single processors to the largest parallel supercomputers.

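The HPCC benchmarks test three particular regimes: local (a single processor), embarrassingly parallel (every processor computes, but none communicates with the others), and global (all processors compute and exchange data with each other). STREAM measures a processor's sustainable memory bandwidth. HPL is the LINPACK TPP (Toward Peak Performance) benchmark, which measures the floating-point rate achieved when solving a dense linear system of equations. DGEMM measures the floating-point rate of double-precision matrix-matrix multiplication. FFT measures the floating-point rate of a double-precision complex one-dimensional discrete Fourier transform. RandomAccess measures the rate of random updates of memory. PTRANS measures the rate of transfer of very large arrays of data between the memories of the processors. b_eff measures the latency and bandwidth of increasingly complex communication patterns.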
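To make two of these measurements concrete, the following is a minimal Python sketch of a RandomAccess-style update loop and a STREAM-style triad. It is an illustration only, not the HPCC source: the function names (next_random, random_access, stream_triad) and the seed value of 1 are assumptions made for this example, and the shift-register update with feedback constant 0x7 is assumed to follow the reference RandomAccess kernel. Timings from interpreted Python say nothing about real memory performance; the point is what gets counted.

```python
import time
from array import array

POLY = 0x7                 # feedback constant (assumed to match the reference kernel)
MASK64 = (1 << 64) - 1     # emulate 64-bit unsigned arithmetic


def next_random(ran):
    """Advance the 64-bit shift-register sequence by one step."""
    high_bit_set = ran >> 63
    return ((ran << 1) & MASK64) ^ (POLY if high_bit_set else 0)


def random_access(table, n_updates, seed=1):
    """XOR-update pseudo-random table locations; seed=1 is illustrative."""
    mask = len(table) - 1  # table length must be a power of two
    ran = seed
    for _ in range(n_updates):
        ran = next_random(ran)
        table[ran & mask] ^= ran   # low spatial and low temporal locality
    return table


def stream_triad(a, b, c, scalar):
    """Run a[i] = b[i] + scalar * c[i] and return bytes moved per second."""
    n = len(a)
    start = time.perf_counter()
    for i in range(n):
        a[i] = b[i] + scalar * c[i]
    elapsed = time.perf_counter() - start
    bytes_moved = 3 * 8 * n        # two 8-byte reads and one 8-byte write per element
    return bytes_moved / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    table = list(range(1 << 12))
    random_access(table, 4 * len(table))
    # XOR updates are self-inverse: replaying the same sequence restores the table.
    random_access(table, 4 * len(table))
    print("table restored:", table == list(range(1 << 12)))

    n = 100_000
    a = array("d", [0.0] * n)
    b = array("d", [1.0] * n)
    c = array("d", [2.0] * n)
    print("triad rate (bytes/s, not a real measurement):", stream_triad(a, b, c, 3.0))
```

The RandomAccess figure is reported as updates per second (GUPS, giga-updates per second), while the STREAM figure is bytes moved divided by elapsed time. The sketch also shows why the two kernels sit at opposite ends of the locality space: the triad walks memory sequentially, whereas each table update lands at an unrelated address.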

Each benchmark is run in two modes: base and optimized. The base run allows no modifications to the benchmark source code, but does allow generally available optimized libraries to be used. The optimized run allows significant changes to the source code, including alternative programming languages and libraries that are specifically targeted at the platform being tested.

The team results of the HPCC portion of the Cluster Competition will be announced on Tuesday when the TOP500 committee meets with the public to announce the new TOP500 list. Cluster Competition Teams are encouraged to be present during this presentation.

A C compiler and an implementation of MPI are required to run the benchmark suite.

More information on HPCC can be found at:

"Introduction to the HPC Challenge Benchmark Suite," by Dongarra and Luszczek: http://icl.cs.utk.edu/projectsfiles/hpcc/pubs/sc06_hpcc.pdf