Welcome to the GPU assignment of 5kk73 Embedded Computer Architecture! The purpose of this assignment is to become familiar with modern GPU architecture and its programming model. Given the availability of hardware and documentation, this assignment supports NVIDIA GPUs and the CUDA language.

If you have a PC with an ATI GPU, you can choose to do the assignment using the ATI GPU and the OpenCL language. However, we cannot provide supporting materials or code samples for ATI GPUs and OpenCL. You will have to set up the ATI Stream SDK environment on your machine and study the OpenCL code samples on your own, using resources outside this website. The learning materials page contains a few pointers to the ATI GPU and the OpenCL language. If you find it difficult to go through these procedures on your own, we recommend using the server with the NVIDIA GPU.

To use the CUDA PC we provide, you can get an account at the secretary's office (PT 9.24). The user name starts with "5kk73gpu". Make sure you get the right one! If no one is there, you can also come to PT 9.19 for the account.

Please back up important data on the CUDA PC frequently, because:
  • We do not keep backups of this PC
  • The accounts and data will be deleted some time after the course finishes
If you are new to CUDA, go to the preparation page to see how to get started. Also, since we use CMake as the build tool for the code in this assignment, you might want to take a look at the CMake page.

Some learning materials are available here. After studying these materials, you should understand the essentials of GPU programming using CUDA and be able to write simple GPU programs.

When porting serial code to the GPU, you may find that the performance is not as good as you expected. That is because the architecture and programming model of the GPU are very different from those of a common CPU, and it takes extra effort to write a good GPU program. The example page shows several important optimization techniques for GPU programming. Please read it carefully.

  • (Dec. 18) The deadline is extended to Friday Dec. 23, 24:00h sharp.
  • (Dec. 6) Test images and a small change to the SIFT code are available.
  • (Nov. 30) A small remark on the intermediate results: you do NOT have to copy the result of a GPU kernel back to the system memory unless you need to use it on the CPU. You only need to make sure the final result is available in the system memory.
  • (Nov. 23) The SIFT reference implementation is available. Please visit the assignment page for more information.
  • site setup
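
The Nov. 30 remark on intermediate results can be illustrated with a minimal sketch. It is not taken from the SIFT reference code: the kernels scaleKernel and offsetKernel and the host function runPipeline are hypothetical names chosen for illustration, and the two-stage computation (multiply by 2, then add 1) is just a stand-in for any chain of kernels. The point it shows is that the intermediate buffer stays in device memory between the two kernel launches, and only the final result is copied back to the system memory.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical stage 1: scale every element by a constant factor.
__global__ void scaleKernel(const float* in, float* tmp, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tmp[i] = in[i] * factor;
}

// Hypothetical stage 2: add an offset to the intermediate result.
__global__ void offsetKernel(const float* tmp, float* out, int n, float offset) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tmp[i] + offset;
}

// Computes hostOut[i] = hostIn[i] * 2 + 1 on the GPU (assumes n > 0).
// Returns true if all CUDA calls succeeded.
bool runPipeline(const float* hostIn, float* hostOut, int n) {
    size_t bytes = (size_t)n * sizeof(float);
    float *dIn = NULL, *dTmp = NULL, *dOut = NULL;

    bool ok = cudaMalloc((void**)&dIn, bytes) == cudaSuccess
           && cudaMalloc((void**)&dTmp, bytes) == cudaSuccess
           && cudaMalloc((void**)&dOut, bytes) == cudaSuccess
           && cudaMemcpy(dIn, hostIn, bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 256;
        int blocks = (n + threads - 1) / threads;

        scaleKernel<<<blocks, threads>>>(dIn, dTmp, n, 2.0f);
        // No copy back here: dTmp is only needed by the next kernel,
        // so it stays in device memory. Kernels launched on the same
        // (default) stream run in launch order.
        offsetKernel<<<blocks, threads>>>(dTmp, dOut, n, 1.0f);

        // Only the final result is copied to the system memory.
        // This blocking copy also waits for both kernels to finish.
        ok = cudaMemcpy(hostOut, dOut, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts NULL, so this is safe even if an allocation failed.
    cudaFree(dIn);
    cudaFree(dTmp);
    cudaFree(dOut);
    return ok;
}
```

Calling runPipeline from a host main function with an input array and an output array of the same length is enough to try it out; the file can be compiled with nvcc. Copying dTmp back to the host between the two launches would give the same final result, but with an extra transfer over the PCIe bus that the CPU never uses.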

Guidelines for the assignment

This is an individual assignment. The steps are:
  1. Set up the CUDA environment, and try some examples to get familiar with the programming model.
  2. Study the CUDA learning materials carefully, read the example page, and try the examples yourself to understand how to write an efficient GPU application.
  3. Finish the assignment on the assignment page, and hand in a small report.
  4. If you run into problems, the known issues page may help.
The deadline for this assignment is Dec. 23, 24:00h, 2011.

If you have any questions, please feel free to contact Dongrui She (PT 9.19, e-mail: d.she _at_) or Zhenyu Ye (e-mail: _at_).