PhD Research

RT-CUDA: A Software Tool for CUDA Code Restructuring

Recent developments in Graphics Processing Units (GPUs) have opened a new challenge: harnessing their computing power as a general-purpose computing paradigm through CUDA parallel programming. However, porting applications to CUDA remains difficult for average programmers. We have developed a restructuring software compiler (RT-CUDA) that applies the best possible kernel optimizations to bridge the gap between high-level languages and the machine-dependent CUDA environment. RT-CUDA is built on a set of compiler optimizations. It takes a C-like program and converts it into an optimized CUDA kernel, guided by user directives supplied in a configuration file. While invoking external libraries is not possible with the commercial OpenACC compiler, RT-CUDA allows transparent invocation of highly optimized external math libraries such as cuSPARSE and cuBLAS. To do so, RT-CUDA provides interfacing APIs, error-handling interpretation, and user-transparent programming. This enables the efficient design of linear algebra solvers (LAS). RT-CUDA has been evaluated on a Tesla K20c GPU with a variety of basic linear algebra operators (M+, MM, MV, VV, etc.) as well as with solvers for systems of linear equations such as Jacobi and Conjugate Gradient. We obtained significant speedup over other compilers such as OpenACC and GPGPU compilers. RT-CUDA facilitates the design of efficient parallel software for parallel simulators (reservoir simulators, molecular dynamics, etc.), which are critical for the oil and gas industry in KSA. We expect RT-CUDA to be needed by many KSA industries that run science and engineering simulations on massively parallel computers such as NVIDIA GPUs.
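
To make the solver structure concrete, the following is a minimal sequential sketch in plain C of a Conjugate Gradient solver built from the MV (matrix-vector product) and VV (vector-vector dot product) operators named above. It is not RT-CUDA input syntax, RT-CUDA output, or any RT-CUDA API: the names mv, vv, cg_solve and CG_MAX_N are hypothetical and chosen only for illustration, and the matrix is assumed to be dense, row-major, symmetric and positive-definite.

```c
#include <assert.h>
#include <stddef.h>

#define CG_MAX_N 256 /* hypothetical size cap so work vectors fit on the stack */

/* MV operator: y = A * x, with A dense, row-major, n x n. */
void mv(size_t n, const double *A, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i) {
        double sum = 0.0;
        for (size_t j = 0; j < n; ++j)
            sum += A[i * n + j] * x[j];
        y[i] = sum;
    }
}

/* VV operator: dot product of two length-n vectors. */
double vv(size_t n, const double *a, const double *b)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += a[i] * b[i];
    return sum;
}

/* Solve A x = b for a symmetric positive-definite A.
 * x holds the initial guess on entry and the approximate solution on exit.
 * Stops when the residual 2-norm drops to tol or below, or after max_iter
 * iterations. Returns the number of iterations performed. */
int cg_solve(size_t n, const double *A, const double *b, double *x,
             double tol, int max_iter)
{
    double r[CG_MAX_N], p[CG_MAX_N], Ap[CG_MAX_N];
    assert(n > 0 && n <= CG_MAX_N);

    /* Initial residual r = b - A x, and first search direction p = r. */
    mv(n, A, x, Ap);
    for (size_t i = 0; i < n; ++i) {
        r[i] = b[i] - Ap[i];
        p[i] = r[i];
    }
    double rr = vv(n, r, r);

    int k = 0;
    /* Compare squared norms to avoid a square root per iteration. */
    while (k < max_iter && rr > tol * tol) {
        mv(n, A, p, Ap);
        double alpha = rr / vv(n, p, Ap); /* step length along p */
        for (size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        double rr_new = vv(n, r, r);
        double beta = rr_new / rr; /* weight of the previous direction */
        for (size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * p[i];
        rr = rr_new;
        ++k;
    }
    return k;
}
```

Each iteration consists of one MV call, two VV calls and a few vector updates, so in a GPU setting these operators are the natural candidates for optimized kernels or library calls. A caller would pass a zero-initialized x together with A and b, for example cg_solve(2, A, b, x, 1e-10, 100) for a 2 x 2 system.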

Code Snippet of MV using the cuBLAS library (left) and the RT-CUDA API (right)

Input Files for RT-CUDA Conjugate Gradient Implementation

Output Files for RT-CUDA Conjugate Gradient Implementation