Lishan Yang

Accurate and Fast Estimation of Input-Dependent General-Purpose Graphics Processing Unit Resilience

Computer Science | William & Mary

Co-Authors: B. Nie, A. Jog

Advisor: Evgenia Smirni

Abstract

As Graphics Processing Units (GPUs) become a de facto solution for accelerating a wide range of applications, their reliable operation is increasingly important. One of the major challenges in the domain of GPU reliability is accurately measuring the error resilience of GPGPU applications. This challenge stems from the fact that a typical GPGPU application spawns a very large number of threads and uses a large amount of the potentially unreliable compute and memory resources available on the GPU. Since the number of possible fault locations can be in the billions, evaluating every fault and examining its effect on application error resilience is impractical. Even worse, application error resilience is also input-dependent. In this work, for the first time, we deeply analyze the impact of different inputs on application error resilience and show that analyzing a small fraction of the input is sufficient to develop an accurate resilience model for a larger input. The key insight is the observation that error resilience is mostly determined by the mix of dynamic instructions rather than by the data they process. Therefore, as long as a new input does not change the sequence of dynamic instructions, the application error resilience can be predicted accurately. In cases where a different input affects the branch outcomes, only dynamic instruction profiling is required to estimate the new application error resilience. Overall, our new input-aware resilience estimation mechanism reduces the overall sampling time by 57.6% while remaining highly accurate.
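The key insight above can be illustrated with a minimal sketch. It assumes, purely for illustration, that resilience is summarized as the probability that a fault is masked, that this probability has been measured per instruction class on a small input, and that the estimate for a new input is the average of those rates weighted by the new input's dynamic instruction counts. The function name, the instruction classes, and all numbers are hypothetical and are not taken from the work itself.

```python
def estimate_resilience(masked_rate_by_class, dynamic_counts):
    """Weighted average of per-class masked rates, weighted by dynamic instruction counts.

    masked_rate_by_class: hypothetical per-instruction-class masking probabilities,
        e.g. obtained from fault injection on a small input.
    dynamic_counts: dynamic instruction counts per class, obtained by profiling
        the new input (no fault injection needed).
    """
    total = sum(dynamic_counts.values())
    if total == 0:
        raise ValueError("dynamic_counts must contain at least one instruction")
    missing = set(dynamic_counts) - set(masked_rate_by_class)
    if missing:
        raise KeyError(f"no masked rate measured for classes: {sorted(missing)}")
    weighted = sum(masked_rate_by_class[c] * n for c, n in dynamic_counts.items())
    return weighted / total


# Illustrative numbers only (assumptions, not measured results).
rates = {"alu": 0.5, "load": 0.25, "branch": 0.75}

# Small input used for fault injection.
small_input = {"alu": 600, "load": 300, "branch": 100}

# Larger input with the same instruction mix: the estimate is unchanged.
large_same_mix = {c: 10 * n for c, n in small_input.items()}

# Input that changes branch outcomes and therefore the mix: only profiling
# is redone, and the estimate shifts with the new counts.
different_mix = {"alu": 500, "load": 500}

est_small = estimate_resilience(rates, small_input)
est_large = estimate_resilience(rates, large_same_mix)
est_diff = estimate_resilience(rates, different_mix)
```

Under these assumptions, an input that only scales the instruction counts leaves the estimate unchanged, while an input that shifts the mix needs only a new profile to update it.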

Bio

Lishan Yang is a fourth-year Ph.D. candidate in the Computer Science Department at William & Mary, under the supervision of Prof. Evgenia Smirni. Her research interests include GPU architecture, reliability analysis, and performance analysis. Before coming to William & Mary, she received her bachelor’s degree in Computer Science from the University of Science and Technology of China in 2016.
