National Center for Theoretical Sciences (NCTS), Mathematics Division: Short Course on High-Performance Computing

2018 NCTS Summer Course: Introduction to Parallel Computing (II)

Advanced Course on Multi-Threaded Parallel Programming using OpenMP/OpenACC for Multicore/Manycore Systems

Monday, July 16 to Thursday, July 19, 2018 (9:00-17:00)

Room 440, Astronomy-Mathematics Building, National Taiwan University


Live Stream / Recordings (two switchable views)

  • 2018/02/16 Live Stream Part 2: TBA
  • 2018/07/16 Live Stream Part 1: TBA



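Rapidly accumulating volumes of data and fast-advancing supercomputers are major trends, now and for the future. High-speed computing capability is needed to solve larger and more complex problems, and it greatly strengthens competitiveness in frontier research and industrial applications. In this four-day intensive course, we introduce the MPI and OpenMP parallel computing environments, explain how to solve dense-matrix eigenvalue problems in such an environment, and parallelize the finite-volume method and iterative methods for large linear systems in order to solve the three-dimensional Poisson equation. The course includes hands-on practice on a state-of-the-art supercomputer. (Note: accounts must be applied for and may be used only after approval under the relevant regulations.) This short course offers a rare opportunity to work with one of the world's most advanced high-speed parallel computing environments. Faculty members, master's and doctoral students, and undergraduates are welcome to register. The course is taught in English.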


To make full use of modern supercomputer systems with multicore/manycore architectures, hybrid parallel programming that combines message passing and multithreading is essential. While MPI is widely used for message passing, OpenMP for CPUs and OpenACC for GPUs are the most popular approaches to multithreading on multicore/manycore clusters. In this 4-day course, we focus on optimizing single-node performance using OpenMP and OpenACC on CPUs and GPUs. We “parallelize” a finite-volume method (FVM) code with Krylov iterative solvers for Poisson’s equation on the Reedbush supercomputer at the University of Tokyo, which has a peak performance of 1.93 PF (http://www.cc.u-tokyo.ac.jp/system/reedbush/index-e.html) and consists of recent CPUs (Intel Xeon E5-2695 v4 (Broadwell-EP)) and GPUs (NVIDIA Tesla P100 (Pascal)).
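As a point of reference for the Krylov solvers mentioned above, the following is a minimal serial sketch of one such method: conjugate gradient with a point Jacobi (diagonal) preconditioner. It is not the course code. It assumes a 1D Poisson problem with unit cell size and zero Dirichlet boundaries, written in plain Python rather than C or Fortran, and all function names are hypothetical.

```python
def apply_poisson_1d(x):
    """Return y = A x for the 1D Poisson matrix (diagonal 2, off-diagonals -1)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0        # zero Dirichlet boundary
        right = x[i + 1] if i < n - 1 else 0.0   # zero Dirichlet boundary
        y[i] = 2.0 * x[i] - left - right
    return y


def dot(a, b):
    """Inner product of two equally long lists."""
    return sum(p * q for p, q in zip(a, b))


def jacobi_pcg(b, tol=1e-10, max_iter=1000):
    """Solve A x = b by CG preconditioned with the inverse diagonal of A."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                       # residual for the initial guess x = 0
    inv_diag = [0.5] * n              # point Jacobi: 1 / diag(A), diag(A) = 2
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for it in range(1, max_iter + 1):
        q = apply_poisson_1d(p)
        alpha = rz / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        if dot(r, r) ** 0.5 <= tol * b_norm:     # relative residual test
            return x, it
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


solution, iterations = jacobi_pcg([1.0] * 8)
print(iterations, solution)
```

For this right-hand side the exact discrete solution is x_i = i(n+1-i)/2 for i = 1..n, so the printed values can be checked by hand. Every loop in this sketch (matrix-vector product, dot product, vector update) is free of loop-carried dependencies, which is why this variant is comparatively easy to multithread.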

It is assumed that all participants have already completed “邁向Petascale高速計算” (Toward Petascale High-Speed Computing, February 2016) or the NCTS Mathematics Division short course on High-Performance Computing (國家理論中心數學組「高效能計算」短期課程, February 2017 or February 2018), or that they have equivalent knowledge of and experience in numerical analysis and parallel programming with MPI and OpenMP.

In the winter schools of 2016, 2017 and 2018, the target application was a 3D FVM code for Poisson’s equation solved by the Conjugate Gradient (CG) iterative method with a very simple point Jacobi preconditioner. This time the target is the same FVM code, but the linear equations are solved by ICCG (the CG iterative method with Incomplete Cholesky preconditioning), which is more complicated, more powerful, and widely used in practical applications. Because ICCG includes “data dependency”, where data could be written to and read from the same memory location simultaneously, parallelization using OpenMP/OpenACC is not straightforward. A certain kind of reordering is needed in order to extract parallelism. In this 4-day course, lectures and exercises on the following topics will be provided:

  • Overview of Finite-Volume Method (FVM)
  • Krylov Iterative Methods, Preconditioning
  • Implementation of the Program
  • Introduction to OpenMP/OpenACC
  • Reordering/Coloring Method
  • Parallel FVM by OpenMP/OpenACC
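
The data dependency and the reordering idea described above can be illustrated with a small sketch. It is an assumption-laden toy, not the course code: a Gauss-Seidel sweep on a 2D structured grid (5-point stencil, unit cell size, zero Dirichlet boundaries) stands in for the forward substitution of ICCG, which has the same kind of dependency on previously updated neighbours, and a two-colour (red-black) ordering stands in for the reordering/coloring methods covered in the course. All function names are hypothetical.

```python
def neighbors(i, j, nx, ny):
    """Face neighbours of cell (i, j) on an nx-by-ny structured grid."""
    cand = [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
    return [(a, b) for a, b in cand if 0 <= a < nx and 0 <= b < ny]


def red_black_groups(nx, ny):
    """Split cells into two colours so that no two neighbours share a colour."""
    groups = [[], []]
    for i in range(nx):
        for j in range(ny):
            groups[(i + j) % 2].append((i, j))
    return groups


def sweep(u, f, order, nx, ny):
    """One in-place Gauss-Seidel sweep, visiting cells in the given order."""
    for i, j in order:
        s = sum(u[a][b] for a, b in neighbors(i, j, nx, ny))
        u[i][j] = (f[i][j] + s) / 4.0   # reads neighbours that may already be updated
    return u


def run(order, nx, ny):
    """Apply one sweep to u = 0 with right-hand side f = 1."""
    u = [[0.0] * ny for _ in range(nx)]
    f = [[1.0] * ny for _ in range(nx)]
    return sweep(u, f, order, nx, ny)


nx, ny = 4, 4
natural = [(i, j) for i in range(nx) for j in range(ny)]
red, black = red_black_groups(nx, ny)

# Natural ordering: the result changes with the visiting order (data dependency).
natural_differs = run(natural, nx, ny) != run(natural[::-1], nx, ny)

# Coloured ordering: within one colour the visiting order does not matter,
# so the cells of one colour could be updated concurrently.
colored_same = run(red + black, nx, ny) == run(red[::-1] + black[::-1], nx, ny)

print(natural_differs, colored_same)
```

Both flags print as True. In an OpenMP version, the loop over the cells of one colour is the part that would carry a `parallel for` directive, with the colours themselves still processed one after another.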

Prerequisites

  • Completion of one of the following three short courses:
    • Toward Petascale High-Speed Computing (邁向Petascale高速計算, February 2016) https://sites.google.com/site/school4scicomp/previous/2016-a-winter/
    • NCTS Mathematics Division Short Course on High-Performance Computing (國家理論中心數學組「高效能計算」短期課程, February 2017) https://sites.google.com/site/school4scicomp/2017-b-spring/
    • NCTS Mathematics Division Short Course on High-Performance Computing (國家理論中心數學組「高效能計算」短期課程, February 2018) https://sites.google.com/site/school4scicomp/2018-b-nk/
    • (or equivalent knowledge of and experience in numerical analysis (FVM, preconditioned iterative solvers) and parallel programming using OpenMP and MPI)
  • Experience with Unix/Linux (vi or emacs)
  • Fundamental numerical algorithms (Gaussian Elimination, LU Factorization, Jacobi/Gauss-Seidel/SOR Iterative Solvers, Conjugate Gradient Method (CG))
  • Experience with SSH public key authentication

Course Handouts

Preparation

  • Windows
    • Cygwin with gcc/gfortran and OpenSSH
      • Please make sure to install gcc (C) or gfortran (Fortran) in “Devel”, and OpenSSH in “Net”
    • ParaView
  • MacOS, UNIX/Linux
    • ParaView
  • Cygwin: https://www.cygwin.com/
  • ParaView: http://www.paraview.org

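Instructors

  • Professor Kengo Nakajima (中島 研吾, Professor, Supercomputing Research Division, Information Technology Center, The University of Tokyo)

  • Professor Tetsuya Hoshino (星野 哲也, Assistant Professor, Supercomputing Research Division, Information Technology Center, The University of Tokyo)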

Schedule and Contents

July 16, 2018 (M)

  • Introduction
  • Finite-Volume Method (FVM)
  • Login to Reedbush System
  • OpenMP
  • Reordering (1/2)

July 17, 2018 (T)

  • Reordering (2/2)
  • Parallel FVM by OpenMP

July 18, 2018 (W)

  • Introduction to GPU Programming
  • OpenACC (1/2)

July 19, 2018 (Th)

  • OpenACC (2/2)
  • Parallel FVM by OpenACC
  • Exercises



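Co-organizers: Information Technology Center, The University of Tokyo; Department of Mathematics and Institute of Applied Mathematical Sciences, National Taiwan University; GPU and High-Performance Computing Activity Group, Taiwan Society for Industrial and Applied Mathematics

Organizers: Weichung Wang 王偉仲 (Institute of Applied Mathematical Sciences, National Taiwan University), Wen-Wei Lin 林文偉 (Department of Applied Mathematics, National Chiao Tung University), Yu-Chen Shu 舒宇宸 (Department of Mathematics, National Cheng Kung University), Tsung-Ming Huang 黃聰明 (Department of Mathematics, National Taiwan Normal University)

Contact: Ms. 游墨霏 (02-3366-8814, murphyyu@ncts.ntu.edu.tw)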