Gwangsun Kim
Assistant Professor @ POSTECH
About me
I am an assistant professor at the Department of Computer Science and Engineering, POSTECH. Before joining POSTECH, I worked at Arm on improving Arm processor IPs for server systems. I earned my Ph.D. (2016) and M.S. (2012) degrees in Computer Science from KAIST under Prof. John Kim, and my B.S. degree (2010) from POSTECH in Computer Science and Engineering and Electronic and Electrical Engineering (double major).
I am looking for ambitious graduate students and undergraduate interns who want to do innovative research on computer systems (see Research Areas and On-going projects below). If you are interested in working with me, please contact me at g.kim at postech dot ac dot kr.
Employment
Assistant Professor, POSTECH, Nov. 2018 - Present
Senior Research Engineer, Arm Inc., Mar. 2018 - Oct. 2018
Senior Performance Engineer, Arm Inc., Sep. 2016 - Mar. 2018
Research Intern, NVIDIA, Jun. 2015 - Sep. 2015
Research Intern, Samsung Electronics, Jul. 2014 - Sep. 2014
Research Areas
I am interested in various topics in computer architecture and its interaction with other layers of the computer system, including algorithms, operating systems, and programming models. Below are some of the topics I have been working on:
Domain-specific Accelerators for Machine Learning
Near-Data Processing / Processing-In-Memory
HW/SW Co-design
Memory systems with Storage-Class Memory
Massively parallel architectures (e.g., GPU)
Large-scale systems (Data centers and supercomputers)
Interconnection networks
On-going Projects
Near-data processing (NDP) in memory expander for high-throughput DNN training in Multi-GPU systems
While the compute performance of GPUs has improved significantly, the memory system and interconnect have lagged far behind. As a result, GPUs spend a significant fraction of time on memory- and communication-bound operations during DNN training. Meanwhile, recent memory-semantic interconnects (e.g., NVLink and Compute Express Link) create opportunities to introduce memory expanders into a system and to offload computation to the memory expander's controller. This work proposes an NDP architecture with compiler support that offloads memory- and communication-bound operations to memory expanders to substantially improve DNN training performance.
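As a rough illustration of the offloading idea (not our actual compiler pass), the Python sketch below classifies operators as NDP offload candidates by arithmetic intensity; the threshold, operator fields, and example operator names are hypothetical.

```python
# Illustrative sketch only: one way a compiler could tag memory-/communication-
# bound DNN operators as candidates for offloading to an NDP-capable memory
# expander. The threshold and operator representation are assumptions.

OFFLOAD_INTENSITY_THRESHOLD = 10.0  # FLOPs per byte; assumed cutoff

def arithmetic_intensity(op):
    """Ratio of compute to memory traffic for one operator."""
    return op["flops"] / op["bytes_moved"]

def classify_ops(op_graph):
    """Split operators into GPU-resident ops and NDP offload candidates."""
    gpu_ops, ndp_ops = [], []
    for op in op_graph:
        # Communication-bound collectives (e.g., gradient all-reduce) and
        # low-intensity element-wise ops are offload candidates.
        if op["kind"] == "collective" or arithmetic_intensity(op) < OFFLOAD_INTENSITY_THRESHOLD:
            ndp_ops.append(op)
        else:
            gpu_ops.append(op)
    return gpu_ops, ndp_ops

# Example: a GEMM stays on the GPU, a gradient all-reduce goes to the expander.
ops = [
    {"name": "gemm_fwd", "kind": "compute", "flops": 2e12, "bytes_moved": 4e9},
    {"name": "grad_allreduce", "kind": "collective", "flops": 1e9, "bytes_moved": 4e9},
]
print(classify_ops(ops))
```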
Heterogeneous 3D-stacked memory with Storage Class Memory (SCM) and DRAM for GPUs
SCM is a class of emerging memory devices that provide byte-addressability, non-volatility, and higher capacity per cost than DRAM. However, because SCM also has lower bandwidth and higher latency than DRAM, the memory system must be carefully designed to incorporate it while still achieving good performance. This work proposes a memory hierarchy with an efficient DRAM caching architecture for a 3D-stacked memory that combines SCM and DRAM.
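The minimal Python sketch below models the basic idea of serving reads from a DRAM cache placed in front of SCM within the stack; the direct-mapped organization, block size, and capacity are assumptions for illustration only, not the proposed caching architecture.

```python
# Minimal, illustrative model of a DRAM cache in front of SCM inside a
# heterogeneous memory stack. Block size, number of sets, and the
# direct-mapped policy are assumptions.

BLOCK_SIZE = 64          # bytes per cache block (assumed)
NUM_SETS = 1024          # number of direct-mapped DRAM cache sets (assumed)

dram_cache = {}          # set index -> (tag, data)

def read(addr, scm_read):
    """Serve a read from the DRAM cache if possible, else fetch from SCM."""
    block = addr // BLOCK_SIZE
    set_idx, tag = block % NUM_SETS, block // NUM_SETS
    entry = dram_cache.get(set_idx)
    if entry is not None and entry[0] == tag:
        return entry[1]                   # DRAM hit: low latency, high bandwidth
    data = scm_read(block)                # DRAM miss: slower SCM access
    dram_cache[set_idx] = (tag, data)     # fill (evicting the previous block)
    return data

# Example backing-store accessor standing in for the SCM dies in the stack.
scm = {}
print(read(0x1000, lambda blk: scm.get(blk, b"\x00" * BLOCK_SIZE)))
```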
Scalable Neural Processing Unit (NPU) system architecture
Datacenters that serve a massive volume of machine learning service requests require NPUs that scale well at the chip level (with many NPU cores), package level (with multiple dies in a package), node level (with multiple NPU cards), and rack level (with multiple NPU nodes in a rack). This project explores software/hardware co-design approaches to enable a highly scalable NPU system for large-scale deep learning.
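The toy sketch below only illustrates how work might be divided across this hierarchy; the fan-outs and the even-split policy are assumptions, not the project's design.

```python
# Illustrative sketch: recursively splitting a batch of inference requests
# across the node/card/die/core hierarchy of one rack. Topology numbers and
# the even-split policy are assumed for illustration.

HIERARCHY = [("node", 4), ("card", 8), ("die", 2), ("core", 16)]  # per rack (assumed)

def partition(batch, levels):
    """Evenly split work at each level of the NPU system hierarchy."""
    if not levels:
        return batch                     # work assigned to one NPU core
    name, fanout = levels[0]
    share = batch / fanout               # even split; a real system load-balances
    return {f"{name}{i}": partition(share, levels[1:]) for i in range(fanout)}

plan = partition(4096, HIERARCHY)                 # 4096 requests for one rack
print(plan["node0"]["card0"]["die0"]["core0"])    # requests per core: 4.0
```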
Architecture for accelerating large-scale Graph Neural Network (GNN) training
Training GNNs is challenging due to the intrinsic irregularity of graphs and the diversity of GNN architectures (GCN, GAT, etc.). Large-scale GNN training is even more challenging because the huge volume of node and edge feature data is distributed across multiple compute nodes, placing a heavy burden on the memory system and the interconnect. In this project, we are architecting a scalable, specialized accelerator system for training large-scale GNNs.
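The toy Python sketch below illustrates why the neighbor-aggregation step stresses the interconnect when features reside on remote partitions; it is not our accelerator design, and the partitioning and remote-fetch interface are hypothetical.

```python
# Toy sketch of mean aggregation in GCN-style training on a graph whose node
# features are partitioned across compute nodes. It only illustrates why
# remote feature fetches dominate, not the accelerator architecture.
import numpy as np

def aggregate(node, neighbors, local_features, fetch_remote):
    """Average the features of a node's neighbors, fetching remote ones."""
    feats = []
    for v in neighbors[node]:
        if v in local_features:
            feats.append(local_features[v])      # cheap local memory access
        else:
            feats.append(fetch_remote(v))        # expensive inter-node transfer
    return np.mean(feats, axis=0)

# Example: node 0's neighbors are split between this partition and a remote one.
local = {1: np.ones(4)}
remote = {2: np.full(4, 3.0)}
print(aggregate(0, {0: [1, 2]}, local, lambda v: remote[v]))
```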
Lossless tensor compression for high-performance DNN inference/training
DNN inference and training require very high memory capacity and bandwidth. Meanwhile, the inherent redundancy and sparsity in DNN tensors pose an opportunity to significantly reduce tensor sizes and thereby increase the effective memory capacity and bandwidth. In this project, we are developing an effective hardware-based lossless tensor compression algorithm to improve overall system performance and energy efficiency for DNN inference and training.
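As one concrete (and purely hypothetical) example of how sparsity enables lossless compression, the sketch below uses a simple presence-bitmap scheme; the algorithm being developed in this project is not necessarily this scheme.

```python
# Hedged sketch of one simple lossless scheme (zero-value compression with a
# presence bitmap) that exploits tensor sparsity.

def compress(values):
    """Keep only non-zero elements plus a bitmap marking their positions."""
    bitmap = [v != 0 for v in values]
    nonzeros = [v for v in values if v != 0]
    return bitmap, nonzeros

def decompress(bitmap, nonzeros):
    """Reconstruct the original tensor block exactly (lossless)."""
    it = iter(nonzeros)
    return [next(it) if present else 0 for present in bitmap]

block = [0, 0, 3, 0, 7, 0, 0, 1]
bitmap, nz = compress(block)
assert decompress(bitmap, nz) == block   # round-trip is exact
# Stored data: an 8-bit bitmap plus 3 values instead of 8 values.
```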
Students
Ph.D./Integrated MS-PhD Students
Hyungkyu Ham
Jeongmin Hong
M.S. Students
Wonhyuk Yang
Geonwoo Park
Yunseon Shin
Researcher
Jinhoon Bae
Undergraduate Intern
Okkyun Woo
Alumni
Junkyung Choi (M.S., 2021)
Junho Lee (M.S., 2022)
Hyunuk Cho (M.S., 2023)
Publications
2022
Overcoming Memory Capacity Wall of GPUs with Heterogeneous Memory Stack
Jeongmin Hong*, Sungjun Cho*, Gwangsun Kim
IEEE Computer Architecture Letters [*: Equal contribution]
[ Paper ]
Dynamic Global Adaptive Routing in High-radix Networks
Hans Kasan, Gwangsun Kim, Yung Yi, John Kim
The 49th International Symposium on Computer Architecture (ISCA) (Accept. rate: 16.8%)
[ Paper ]
2021
Near-Data Processing in Memory Expander for DNN Acceleration on GPUs
Hyungkyu Ham*, Hyunuk Cho*, Minjae Kim, Jueon Park, Jeongmin Hong, Hyojin Sung, Eunhyeok Park, Euicheol Lim, Gwangsun Kim
IEEE Computer Architecture Letters [*: Equal contribution]
[ Paper ]
2018
TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks
2017
Toward Standardized Near-Data Processing with Unrestricted Data Placement for GPUs
History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers
Wonjun Song, Gwangsun Kim, Hyungjoon Jung, Jongwook Chung, Jung Ho Ahn, Jae W. Lee, and John Kim
The 22nd ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (Accept. rate: 17.4%)
[ Paper ]
2016
Contention-based Congestion Management in Large-Scale Networks
Automatically Exploiting Implicit Pipeline Parallelism from Multiple Dependent Kernels for GPUs
Accelerating Linked-list Traversal through Near-Data Processing
High-Throughput System Design with Memory Networks
Gwangsun Kim
Ph.D. Thesis, School of Computing, KAIST
[ Paper ]
Transparent Offloading and Mapping (TOM): Enabling Programmer-Transparent Near-Data Processing in GPU Systems
iPAWS : Instruction-Issue Pattern-based Adaptive Warp Scheduling for GPGPUs
Minseok Lee, Gwangsun Kim, John Kim, Woong Seo, Yeongon Cho, and Soojung Ryu
The 22nd IEEE International Symposium on High Performance Computer Architecture (HPCA) (Accept. rate: 22.1%)
[ Paper ]
Design and Analysis of Hybrid Flow Control for Hierarchical Ring Network-on-Chip
Hanjoon Kim, Gwangsun Kim, Hwasoo Yeo, John Kim, and Seungryoul Maeng
IEEE Transactions on Computers, vol. 65, no. 2, pp. 480-494, 1 Feb. 2016
[ Paper ]
2015
Overcoming Far-end Congestion in Large-Scale Networks
Jongmin Won, Gwangsun Kim, John Kim, Ted Jiang, Mike Parker, and Steve Scott
The 21st IEEE International Symposium on High Performance Computer Architecture (HPCA) (Accept. rate: 22.1%)
[ Paper ]
2014
Multi-GPU System Design with Memory Networks
Memory Network: Enabling Technology for Scalable Near-Data Computing
Transportation-Network Inspired Network-on-Chip
Hanjoon Kim, Gwangsun Kim, Hwasoo Yeo, Seungryoul Maeng, and John Kim
The 20th International Symposium on High Performance Computer Architecture (HPCA) (Accept. rate: 25.6%)
[ Paper ]
Low-overhead Network-on-Chip Support for Location-oblivious Task Placement
Gwangsun Kim, Michael M. Lee, John Kim, Jae W. Lee, Dennis Abts, and Michael Marty
IEEE Transactions on Computers, vol. 63, no. 6, pp. 1487-1500, June 2014
[ Paper ]
2013
Memory-centric System Interconnect Design with Hybrid Memory Cubes
2012
Scalable On-chip Network in Power Constrained Manycore Processors
Hanjoon Kim, Gwangsun Kim, and John Kim
The 3rd International Green Computing Conference (IGCC)
[ Paper ]
Teaching
CSED503: Advanced Computer Architecture, POSTECH, Fall 2020, Fall 2021
CSED311: Computer Architecture, POSTECH, Spring 2019, 2020, 2021, 2022, 2023
CSED490V: Parallel Architecture and Programming, POSTECH, Fall 2019, Fall 2022
CSED499: Research Project, POSTECH, Spring 2019
CSED199: Freshman Research Participation, Fall 2021
Contact
Email: g.kim at postech dot ac dot kr
Phone: +82-54-279-2260
Office: POSTECH PIAI (Institute of Artificial Intelligence) #433, 77 Cheongam-ro, Nam-gu, Pohang, Gyeongbuk, Korea 37673