Research
Efficient Memory Management for High-Performance Computing
Because scaling DRAM density has become increasingly difficult, a new class of memory devices (e.g., CXL memory expanders) has received much attention as a way to bridge the capacity and performance gap between DRAM and SSDs. Although such new memory can provide abundant capacity, its performance is not comparable to that of conventional DRAM. As a result, we expect that future large-memory systems will take the form of a tiered memory architecture. In this study, we revisit the design and implementation of memory management in state-of-the-art Linux to achieve high performance. [Slides]
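The tiering idea can be pictured with a toy simulation: pages in the slow tier are promoted to the fast tier once they are accessed often enough, and the coldest fast-tier page is demoted to make room. The promotion threshold, tier capacity, and the `TieredMemory` class below are illustrative assumptions, not the actual Linux policy.

```python
# A minimal sketch of hotness-based page placement between a fast tier
# (DRAM) and a slow tier (e.g., CXL memory). All parameters are made up.

PROMOTE_THRESHOLD = 3   # accesses before a slow-tier page is promoted
FAST_CAPACITY = 2       # number of pages the fast tier can hold

class TieredMemory:
    def __init__(self):
        self.fast = []            # pages resident in the fast tier (LRU order)
        self.slow = {}            # page -> access count in the slow tier

    def access(self, page):
        if page in self.fast:                      # fast-tier hit: refresh LRU
            self.fast.remove(page)
            self.fast.append(page)
            return "fast"
        self.slow[page] = self.slow.get(page, 0) + 1
        if self.slow[page] >= PROMOTE_THRESHOLD:   # page is hot: promote it
            if len(self.fast) >= FAST_CAPACITY:    # demote the coldest page
                victim = self.fast.pop(0)
                self.slow[victim] = 0
            self.fast.append(page)
            del self.slow[page]
        return "slow"

mem = TieredMemory()
for _ in range(3):
    mem.access("A")      # the third access promotes "A" to the fast tier
print(mem.fast)          # ['A']
```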
Keywords: Memory Management, Operating Systems, Linux Kernel Programming
Systems for Artificial Intelligence (Large Language Models)
A variety of deep learning (DL) based services, including image classification, natural language processing, and recommendation, are widely deployed in the data centers of companies such as Facebook, Google, Microsoft, Alibaba, and Netflix. There have been significant efforts to optimize model serving systems. In this study, we focus on the impact of scheduling queries across heterogeneous systems, which are equipped with CPUs, GPUs, and customized accelerators, to maximize latency-bounded throughput.
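As a rough illustration of latency-bounded scheduling, the sketch below greedily places each query on the heterogeneous device that would finish it earliest, and admits it only if the latency SLO still holds. The device list, per-query service times, and SLO value are made-up parameters, not measurements from a real serving system, which would also have to model batching and queueing.

```python
# A toy sketch of SLO-aware query scheduling across heterogeneous devices.

SLO_MS = 50.0

# assumed per-query service time (ms) on each device type
SERVICE_MS = {"cpu": 30.0, "gpu": 5.0, "accel": 3.0}

def schedule(queries, devices):
    """Greedily place each query on the device that finishes it
    earliest; skip placements that would violate the SLO."""
    busy_until = {d: 0.0 for d in devices}
    placement = {}
    for q in queries:
        best = min(devices, key=lambda d: busy_until[d] + SERVICE_MS[d])
        finish = busy_until[best] + SERVICE_MS[best]
        if finish <= SLO_MS:          # admit only if the latency bound holds
            busy_until[best] = finish
            placement[q] = best
    return placement

placement = schedule([f"q{i}" for i in range(8)], ["cpu", "gpu", "accel"])
print(len(placement))                 # all 8 queries admitted within the SLO
```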
Also, as modern machine learning models grow much larger and more complex, ML training systems require a large amount of memory as well as heavy compute capability. However, scaling GPU memory capacity has been limited. As a result, it is challenging to train large machine learning models on a single GPU. In this project, we study how operating systems or machine learning frameworks (e.g., PyTorch or TensorFlow) should manage unified memory across GPUs for emerging applications. [Slides]
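One way a framework can cope with limited GPU memory is to keep only the working set on the device and spill the rest to host memory. The sketch below shows such an LRU-style offload manager in plain Python; the capacity, tensor sizes, and the `OffloadManager` class itself are hypothetical simplifications, not the actual mechanism in PyTorch or TensorFlow.

```python
# A minimal sketch of framework-level tensor offloading when a model's
# working set exceeds GPU memory. Sizes are in arbitrary units.

from collections import OrderedDict

GPU_CAPACITY = 4

class OffloadManager:
    def __init__(self):
        self.on_gpu = OrderedDict()   # tensor name -> size, in LRU order
        self.on_host = {}             # tensors evicted to host memory
        self.used = 0

    def fetch(self, name, size):
        """Ensure a tensor is GPU-resident, evicting LRU tensors if needed."""
        if name in self.on_gpu:
            self.on_gpu.move_to_end(name)          # hit: refresh LRU position
            return
        while self.used + size > GPU_CAPACITY:     # make room on the device
            victim, vsize = self.on_gpu.popitem(last=False)
            self.on_host[victim] = vsize
            self.used -= vsize
        self.on_gpu[name] = size
        self.on_host.pop(name, None)
        self.used += size

mgr = OffloadManager()
for layer in ["w0", "w1", "w2"]:
    mgr.fetch(layer, 2)       # fetching the third weight evicts the LRU one
print(list(mgr.on_gpu), list(mgr.on_host))   # ['w1', 'w2'] ['w0']
```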
Keywords: Model Serving, Resource Scheduling, Unified Memory, ML Frameworks
Datacenter & Cloud Computing
Today, virtualization is a key enabling technology for cloud computing because it allows flexible resource management by allocating virtual machines, instead of physical systems, to cloud users. In addition, by consolidating underutilized systems onto fewer servers, system virtualization can improve resource efficiency and reduce energy consumption. In this project, we introduce hardware and software techniques for efficient CPU and memory virtualization.
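The cost of memory virtualization can be seen with a standard back-of-the-envelope calculation: under hardware nested paging, each level of the guest page walk must itself be translated through the host page table, so a worst-case two-dimensional walk over g guest levels and h host levels takes (g + 1) * (h + 1) - 1 memory references.

```python
# Worst-case memory references for a two-dimensional (nested) page walk.
# With 4-level guest and 4-level host page tables, this is the commonly
# cited 24 references, versus 4 for a native 4-level walk.

def nested_walk_refs(guest_levels, host_levels):
    return (guest_levels + 1) * (host_levels + 1) - 1

print(nested_walk_refs(4, 4))   # 24
```

Setting the host levels to zero recovers the native walk cost (4 references for 4-level paging), which is why reducing this 2D walk overhead is a central theme in address-translation research.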
Keywords: Hypervisor, Scheduling, Address Translation, Migration