Eduardo Ponce Mojica
Polytechnic University of Puerto Rico
Major: Electrical Engineering
Faculty Advisor: Houssain Kettani, Ph.D.
Program: Research Alliance in Math and Science
Asynchronous Computing using CUDA on a Tesla C2050 GPU
The drive to design parallel supercomputers that achieve exascale performance is pushing architects to reconsider fully distributed memory frameworks; cloud computing faces a similar issue. To date, most distributed computing schemes have relied primarily on synchronization and checkpoint-restart mechanisms to guarantee the correctness of solutions. With the exponential growth in the number of system components, such approaches become unsustainable. An alternative is asynchronous computing, which fully exploits each processing node. The advantage is a reduction in processor idle time, albeit at the expense of modifying algorithms to overcome the computational chaos that arises from out-of-phase task components. Similar synchronization delays occur in emerging, massively parallel multithreaded graphics processing units (GPUs). The primary objective of this work is to explore the performance of asynchronous methods on the NVIDIA Tesla C2050 GPU, which includes 448 processing cores. The corresponding algorithms will be programmed in CUDA Fortran. The specific iterative scheme to be considered is the conjugate gradients algorithm. Developing and demonstrating an asynchronous methodology may allow future parallel supercomputers to fully achieve their expected potential and, moreover, aid in the discovery and understanding of solutions to complex problems that are current main research topics.
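The idea of an asynchronous iterative scheme can be illustrated with a minimal sketch. This is a hypothetical serial simulation, not the project's CUDA Fortran code: it applies asynchronous (chaotic) Jacobi relaxation to a small diagonally dominant system Ax = b, updating one randomly chosen component at a time with whatever neighbor values are currently available, mimicking out-of-phase updates across processors. For a diagonally dominant matrix the iteration still converges despite the lack of synchronization.

```python
import random

def async_jacobi(A, b, sweeps=200, seed=0):
    """Asynchronous Jacobi relaxation: components update out of phase,
    each using the most recently available (possibly 'stale') values."""
    n = len(b)
    x = [0.0] * n
    rng = random.Random(seed)
    for _ in range(sweeps * n):
        i = rng.randrange(n)  # a random component updates next
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x

# Small diagonally dominant test system with exact solution (1, 1, 1).
A = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
b = [5.0, 6.0, 5.0]
x = async_jacobi(A, b)
```

On a GPU, each component update would be carried out by a different thread without a global barrier between sweeps; the serial random-order loop above stands in for that interleaving.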
Jacob Barhen, Ph.D.
Computing and Computational Science Directorate
Oak Ridge National Laboratory