The tutorial material covers three main topics:
Introduction to CUDA Python with Numba
Begin working with the Numba compiler and CUDA programming in Python
Use Numba decorators to GPU-accelerate numerical Python functions
Optimize host-to-device and device-to-host memory transfers
CUDA kernels in Python with Numba
Learn CUDA’s parallel thread hierarchy and how it broadens the range of problems that can be parallelized
Launch massively parallel custom CUDA kernels on the GPU
Utilize CUDA atomic operations to avoid race conditions during parallel execution
Multi-dimensional grids and shared memory for CUDA Python with Numba
Implement GPU-accelerated Monte Carlo methods
Learn multidimensional grid creation and how to work in parallel on 2D matrices
Leverage on-device shared memory to promote memory coalescing while reshaping 2D matrices
(Note: because this is only a half-day tutorial, the third topic is unlikely to be covered during the session. The material will, however, remain available for self-study afterward.)
The content for this tutorial has been developed by the NVIDIA Deep Learning Institute.
Fact Sheet: Fundamentals of Accelerated Computing with CUDA Python
Jupyter Notebook (topic 1)
Slides (topic 2)
Jupyter Notebook (topic 3)