Welcome to our website!
Ongoing projects on automatic program optimization and
parallel programming notations are:
- Autoparallelization: We have a long line of research on
autoparallelization. Our most recent work focuses on vectorization. In (Ren, Wu, and Padua 2005) we study vectorization of multimedia
applications. In a forthcoming paper (Maleki, Garzaran, and Padua 2011) we present an
evaluation of the IBM, Intel, and GNU C vectorizers using synthetic loops and
code snippets from real applications. This last paper is the result of our collaboration with IBM in the context of the BlueWaters project. A complete list of our publications
on autoparallelization can be found here.
- Compilation of explicitly parallel programs: Practically all
the optimization techniques developed since the 1950s target sequential
programs. As parallel programming becomes more common, optimization for explicitly parallel programs will grow in importance. In (Ren, Wu, and Padua
2006) we discuss optimization of vector programs. In forthcoming papers
(Virlet, Zhou, Garzaran, Giacalone, Kuhn, and Padua 2011), (Zhou, Garzaran, Giacalone, Kuhn, and Padua 2011) we present optimization techniques
to reduce power consumption. This last paper is the result of our collaboration with Intel in the context of the Universal Parallel Computing Research Center (UPCRC). A complete list of our publications on compiler techniques for explicitly parallel programs can be found here.
- Programming languages: We are developing a parallel programming notation, Hierarchically Tiled Arrays (HTAs), with a convenient representation of locality. Programs in this notation resemble conventional sequential programs, with parallelism confined within powerful operators. More information can be found on the HTA project website. A complete list of our publications on parallel languages, including earlier work on Cedar Fortran, can be found here.
- Autotuning: We are building autotuning libraries for linear algebra problems (LU decomposition, triangular Sylvester equations, QR decomposition). We exploit basic linear algebra properties to apply a divide-and-conquer approach that subdivides the problems to improve data locality and generates a parallel scheme to solve them. By studying these three test cases, we hope to identify the key elements necessary to automatically build autotuning systems for a larger class of linear algebra problems. In the past, we have worked on several autotuning projects. Our publications in this area can be found here.
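
To give a flavor of the divide-and-conquer idea mentioned under autotuning, here is a minimal sketch on a simpler problem than the ones listed above: solving a lower-triangular system L x = b by recursively splitting it into blocks. This is an illustration only, not the libraries' actual implementation. The function names and the `cutoff` parameter are hypothetical; `cutoff` stands in for the kind of block-size parameter an autotuner might search over.

```python
def _solve(L, x, lo, hi, cutoff):
    """Solve the diagonal block L[lo:hi, lo:hi] in place on x[lo:hi].

    On entry, x[lo:hi] holds the right-hand side with the contributions
    of all unknowns before index lo already subtracted.
    """
    n = hi - lo
    if n <= cutoff:
        # Base case: plain forward substitution on a small block.
        for i in range(lo, hi):
            s = x[i]
            for j in range(lo, i):
                s -= L[i][j] * x[j]
            x[i] = s / L[i][i]
        return
    mid = lo + n // 2
    # Solve the top-left block: L11 x1 = b1.
    _solve(L, x, lo, mid, cutoff)
    # Update the remaining right-hand side: b2 <- b2 - L21 x1.
    for i in range(mid, hi):
        s = 0.0
        for j in range(lo, mid):
            s += L[i][j] * x[j]
        x[i] -= s
    # Solve the bottom-right block: L22 x2 = b2.
    _solve(L, x, mid, hi, cutoff)


def solve_lower_recursive(L, b, cutoff=2):
    """Return x with L x = b, where L is a lower-triangular list of lists.

    cutoff (assumed >= 1) is the block size below which recursion stops.
    """
    x = [float(v) for v in b]
    _solve(L, x, 0, len(x), cutoff)
    return x


if __name__ == "__main__":
    L = [[2, 0, 0, 0],
         [1, 3, 0, 0],
         [4, 5, 6, 0],
         [7, 8, 9, 10]]
    b = [2, 7, 32, 90]
    print(solve_lower_recursive(L, b, cutoff=2))
```

Each level of the recursion works on smaller blocks, which is what improves data locality, and the update of the right-hand side is a block matrix-vector product that could be computed in parallel. The best value of `cutoff` depends on the machine, which is why such parameters are natural targets for autotuning.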