Multicore Programming (CPH351)
Course Description
Parallel computing is a form of computation that uses multiple computing resources to solve a problem. Parallel computing is now widely employed in many fields (e.g., simulation, graphics, AI, etc.) to improve performance in terms of speed or accuracy. Among the various parallel computing architectures, multi-core CPUs and GPUs are the most commonly used computing resources. This course aims to convey the power of parallel computing and to teach basic programming skills for developing parallel programs. During the 16-week course, students will learn:
The concept of parallel computing and how to design parallel algorithms
OpenMP - A programming interface for utilizing multi-core CPUs (MIMD architecture)
CUDA - A programming interface for utilizing an NVIDIA GPU (many-core SIMT architecture)
Students will also take part in two team projects whose goal is to improve the performance of an application using multi-core CPUs and/or GPUs.
Class (Spring 2019) - Class overview (PDF)
CPH351 (Tue. 09:00~11:00, Thu. 11:00~13:00) / #125, 2nd Eng. Building
Instructor: Duksu Kim (bluekds at koreatech.ac.kr / #435, 2nd Eng. Building)
Office hour: Thu. 14:00~16:00
Teaching Assistant: Young-Gyu Kim (caca209 at koreatech.ac.kr / #331, 2nd Eng. Building)
Tutoring program
Tutor: Hyuck Chan Kown (gurckscks at koreatech.ac.kr)
Time/Location: Wed. 19:00~21:00, #246, 2nd Eng. Building
Office hour: Thu. 19:00~21:00 (#408) and Thu. 14:00~16:00
Course git repository [link]
Prerequisite/Requirements
(Required) C Programming
(Recommended) System Programming, Data structure
(Required) PC or Laptop with a multi-core CPU
(Recommended) PC or Laptop with a NVIDIA GPU
We can lend you a development kit (e.g., a Jetson kit) for CUDA if you need one
However, you will need to provide your own monitor, keyboard, and mouse to use it.
Textbooks
[Main textbook] Lecture notes on this page
(OpenMP) Using OpenMP, B. Chapman et al. (The MIT Press) [link]
(OpenMP) An Introduction to Parallel Programming, Peter Pacheco (MK) [Eng] [Kor]
(CUDA) CUDA C Programming guide (NVIDIA) [link]
(CUDA) Professional CUDA C Programming, John Cheng, Max Grossman, Ty McKercher (Wrox) [link]
Setup CUDA Dev. environments
Windows Dev. environments [Kor]
Linux(Ubuntu) Dev. environments on Jetson Kit [Kor]
Troubleshooting
Q. My laptop has an Nvidia GPU, but CUDA does not work properly.
A. Check whether your laptop uses a hybrid GPU system (e.g., Intel HD Graphics + Nvidia GPU).
In that case, disabling the Intel GPU in your OS's device manager may fix the problem.
Lecture Notes and Videos (in Korean)
Each week's lecture slides will be uploaded at the beginning of the week, and the lecture video will be uploaded at the end of the week
Some of the figures and sample code come from the reference textbooks.
Lecture 1. Why Parallel Computing?
What is parallel computing, and why use it?
Why parallel programming?
Contents (3/7)
Lecture 2. Introduction to Parallel Computing
Flynn’s Taxonomy (SIMD and MIMD)
Nondeterminism
Performance of Parallel Computing
Parallel Program Design
Contents (3/12)
Lecture 3. OpenMP Overview
What is OpenMP?
Hello OpenMP!
Contents (3/14)
Lecture 4. Introduction to OpenMP (1/3)
Parallel construct
Work-sharing construct
Scope of Variables
Contents (3/19)
Lecture 5. Introduction to OpenMP (2/3)
Synchronization construct
Barrier
Critical section
Locks
Contents (3/26)
Lecture 6. Introduction to OpenMP (3/3)
Reduction clause
Scheduling clauses
Nested parallelism
Contents (4/2)
Video
Midterm Exam (4/9) - Good Luck :)
Lecture 8. CUDA Thread and Execution Model
Hello CUDA!
Basic workflow of CUDA Program
CUDA Thread Hierarchy & Organizing threads
CUDA Execution Model
Contents (4/23)
Lab. 4 (Kor)
Video: (Lab. 4-1) / (Lab. 4-2)
Video
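A minimal "Hello CUDA!" sketch in the spirit of Lecture 8 (illustrative, not from the course slides; `helloKernel` is a made-up name). It shows the basic workflow of launching a grid of threads and computing a global thread index from the block and thread IDs:

```cuda
#include <cstdio>

// Kernel: runs on the GPU; each thread computes its global index
__global__ void helloKernel(void) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    printf("Hello CUDA from thread %d\n", tid);
}

int main(void) {
    // Launch 2 blocks of 4 threads each (8 threads total)
    helloKernel<<<2, 4>>>();
    cudaDeviceSynchronize();  // wait for the kernel (and its printf) to finish
    return 0;
}
```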
Midterm (OpenMP) Project Presentation (4/30, 5/02)
Lecture 10. Maximizing Memory Throughput
Maximizing memory throughput
Global memory
Aligned memory access
Coalesced memory access
Shared memory
Bank conflict
Contents (5/14)
Slides (Eng/Kor)
No commentary for this Lab.
Video
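The difference between coalesced and strided global-memory access from Lecture 10 can be sketched with two kernels (illustrative only; host-side setup is omitted):

```cuda
// Coalesced: consecutive threads in a warp read consecutive
// addresses, so the warp's loads combine into few transactions.
__global__ void copyCoalesced(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Strided: thread i touches element (i * stride) % n, scattering a
// warp's accesses across many cache lines; throughput drops.
__global__ void copyStrided(const float* in, float* out, int n, int stride) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = (i * stride) % n;
    if (i < n) out[j] = in[j];
}
```

The same idea underlies shared-memory bank conflicts: performance depends on which addresses the threads of a warp touch together.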
Lecture 11. Synchronization & Concurrent execution
Synchronization
CUDA stream
Concurrent execution
Hiding data transfer overhead
CUDA Event
Contents (5/21)
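Hiding data-transfer overhead with CUDA streams, as covered in Lecture 11, can be sketched as follows (an illustrative program, not from the course materials): each chunk's copy-in, kernel, and copy-out are issued into its own stream, so one chunk's transfers can overlap another chunk's computation.

```cuda
#include <cstdio>

__global__ void scaleKernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

int main(void) {
    const int n = 1 << 20, nStreams = 2, chunk = n / nStreams;
    float *h_in, *h_out, *d_in, *d_out;

    // Pinned host memory is required for truly asynchronous copies
    cudaMallocHost(&h_in,  n * sizeof(float));
    cudaMallocHost(&h_out, n * sizeof(float));
    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    for (int i = 0; i < n; i++) h_in[i] = (float)i;

    cudaStream_t streams[nStreams];
    for (int s = 0; s < nStreams; s++) cudaStreamCreate(&streams[s]);

    // Issue each chunk's H2D copy, kernel, and D2H copy into its own
    // stream so copies of one chunk overlap computation of another
    for (int s = 0; s < nStreams; s++) {
        int off = s * chunk;
        cudaMemcpyAsync(d_in + off, h_in + off, chunk * sizeof(float),
                        cudaMemcpyHostToDevice, streams[s]);
        scaleKernel<<<(chunk + 255) / 256, 256, 0, streams[s]>>>(
            d_in + off, d_out + off, chunk);
        cudaMemcpyAsync(h_out + off, d_out + off, chunk * sizeof(float),
                        cudaMemcpyDeviceToHost, streams[s]);
    }
    cudaDeviceSynchronize();

    printf("h_out[10] = %f\n", h_out[10]);
    for (int s = 0; s < nStreams; s++) cudaStreamDestroy(streams[s]);
    cudaFreeHost(h_in); cudaFreeHost(h_out);
    cudaFree(d_in); cudaFree(d_out);
    return 0;
}
```

A CUDA event recorded in one stream can also be used to time or order work against another stream.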
Lecture 12. Get More Power!
Multi-GPUs
Heterogeneous Computing
Contents (5/28)
Slides (Eng)
Video
Extra. Nvidia Nsight
CUDA Debugging
Kernel Performance Profiling
Contents
Slides (Eng)
The Final Exam (6/11) - Good Luck :)
Preparing the final project
The Final (CUDA) Project Presentation (6/18, 6/20)
Final Project Team Evaluation Results [Link]
HPC Lab. is actively recruiting self-motivated M.S. and Ph.D. students
If you are interested in pursuing research in the following fields as a student member,
High Performance Computing, GPGPU, Heterogeneous Parallel Computing for
Visualization, Computer graphics, VR/AR
Machine learning
Other interesting research topics
please contact Prof. Duksu Kim (bluekds (at) koreatech.ac.kr).
Please check this document for more details.