Computer Systems Reading Group

Our group discusses papers on topics in computer systems and architecture. The group is open to everyone at KU who is interested in computer systems research! E-mail us (jsahn@csl.korea.ac.kr) if you would like to join.

- [Dec 19, 2025] LoL-PIM: Long-Context LLM Decoding with Scalable DRAM-PIM System, HPCA 2026
  - Presented by Yumin Lee
- [Nov 21, 2025] Mooncake: Trading More Storage for Less Computation — A KVCache-centric Architecture for Serving LLM Chatbot, FAST 2025
  - Presented by Soonjae Hwang
- [Aug 08, 2025] Page Migration for Hardware Memory Disaggregation Across a Network, ICS 2025
  - Presented by Eunseok Song
- [Jul 25, 2025] Accelerating LLMs using an Efficient GEMM Library and Target-Aware Optimizations on Real-World PIM Devices, CGO 2025
  - Presented by Seunggon Jeon
- [Jun 20, 2025] M5: Mastering Page Migration and Memory Management for CXL-based Tiered Memory Systems, ASPLOS 2025
  - Presented by Jongho Baik
- [May 30, 2025] Contiguitas: The Pursuit of Physical Memory Contiguity in Datacenters, ISCA 2023
  - Presented by Eunseok Song
- [May 23, 2025] Telescope: Telemetry for Gargantuan Memory Footprint Applications, ATC 2024
  - Presented by Seungsu Baek
- [May 09, 2025] Instruction-Aware Cooperative TLB and Cache Replacement Policies, ASPLOS 2025
  - Presented by Youngjoon Cheon

- [Aug 29, 2024] S-LoRA: Serving Thousands of Concurrent LoRA Adapters, MLSys 2024
  - Presented by Seunggon Jeon
- [Jul 23, 2024] Managing Memory Tiers with CXL in Virtualized Environments, OSDI 2024
  - Presented by Jongho Baik
- [Jul 04, 2024] GMLake: Efficient and Transparent GPU Memory Defragmentation for Large-scale DNN Training with Virtual Memory Stitching, ASPLOS 2024
  - Presented by Jinwoo Jeong
- [Jun 11, 2024] MATRYOSHKA: Non-Exclusive Memory Tiering via Transactional Page Migration, OSDI 2024
  - Presented by Jonghyeon Kim
- [May 30, 2024] IDYLL: Enhancing Page Translation in Multi-GPUs via Light Weight PTE Invalidations, MICRO 2023
  - Presented by Seungsu Baek
- [May 14, 2024] GRIT: Enhancing Multi-GPU Performance with Fine-Grained Dynamic Page Placement, HPCA 2024
  - Presented by Youngjoon Cheon

- [Sep 22, 2023] SHEPHERD: Serving DNNs in the wild, NSDI 2023
  - Presented by Seungsu Baek
- [Aug 31, 2023] Johnny Cache: the End of DRAM Cache Conflicts (in Tiered Main Memory Systems), OSDI 2023
  - Presented by Jonghyeon Kim
- [Aug 16, 2023] TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory, MICRO 2021
  - Presented by Youngjoon Cheon
- [Aug 03, 2023] FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU, ICML 2023
  - Presented by Woohyung Choi
- [Jul 27, 2023] Mobius: Fine Tuning Large-Scale Models on Commodity GPU Servers, ASPLOS 2023
  - Presented by Jinwoo Jeong
- [Jul 20, 2023] AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving, OSDI 2023
  - Presented by Seungsu Baek
- [May 19, 2023] DiLOS: Do Not Trade Compatibility for Performance in Memory Disaggregation, EuroSys 2023
  - Presented by Youngjoon Cheon
- [Apr 06, 2023] Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression, ASPLOS 2023
  - Presented by Jiho Park
- [Mar 16, 2023] DeepUM: Tensor Migration and Prefetching in Unified Memory, ASPLOS 2023
  - Presented by Woohyung Choi
- [Feb 23, 2023] FLAT: An Optimized Dataflow for Mitigating Attention Bottlenecks, ASPLOS 2023
  - Presented by Jinwoo Jeong
- [Feb 02, 2023] Trident: Harnessing Micro-architectural Resources for All Page Sizes in x86 Processors, MICRO 2021
  - Presented by Jonghyeon Kim
- [Jan 19, 2023] Enable Simultaneous DNN Services Based on Deterministic Operator Overlap and Precise Latency Prediction, SC 2021
  - Presented by Seungsu Baek
- [Jan 05, 2023] From Cloud Computing to Sky Computing, HotOS 2021
  - Presented by Jeongseob Ahn

- [Sep 28, 2022] Orca: A Distributed Serving System for Transformer-Based Generative Models, OSDI 2022
  - Presented by Jinwoo Jeong
- [Aug 19, 2022] Ribbon: Cost-Effective and QoS-Aware Deep Learning Model Inference Using a Diverse Pool of Cloud Computing Instances, SC 2021
  - Presented by Seungsu Baek
- [Aug 03, 2022] TMO: Transparent Memory Offloading in Datacenters, ASPLOS 2022
  - Presented by Jonghyeon Kim
- [Mar 16, 2022] HeMem: Scalable Tiered Memory Management for Big Data Applications and Real NVM, SOSP 2021
  - Presented by Suhyun Kim
- [Mar 02, 2022] Check-N-Run: a Checkpointing System for Training Deep Learning Recommendation Models, NSDI 2022
  - Presented by Seungsu Baek
- [Feb 23, 2022] Layerweaver: Maximizing Resource Utilization of Neural Processing Units via Layer-Wise Scheduling, HPCA 2021
  - Presented by Jinwoo Jeong
- [Feb 16, 2022] Don’t shoot down TLB shootdowns!, EuroSys 2020
  - Presented by Jonghyeon Kim
- [Jan 19, 2022] MobiSR: Efficient On-Device Super-Resolution through Heterogeneous Mobile Processors, MobiCom 2019
  - Presented by Jungmo Ahn

- [Apr 06, 2021] Serving DNNs like Clockwork: Performance Predictability from the Bottom Up, OSDI 2020
  - Presented by Jinwoo Jeong
- [Mar 23, 2021] Persistent State Machines for Recoverable In-memory Storage Systems with NVRam, OSDI 2020
  - Presented by Minjong Ha
- [Mar 09, 2021] A Comprehensive Analysis of Superpage Management Mechanisms and Policies, ATC 2020
  - Presented by Wonkyo Choe
- [Feb 19, 2021] Balancing Efficiency and Fairness in Heterogeneous GPU Clusters for Deep Learning, EuroSys 2020
  - Presented by Taeklim Kim
- [Feb 05, 2021] HawkEye: Efficient Fine-grained OS Support for Huge Pages, ASPLOS 2019
  - Presented by Jonghyeon Kim
- [Jan 29, 2021] Fast RDMA-based Ordered Key-Value Store using Remote Learned Cache, OSDI 2020
  - Presented by Minjong Ha
- [Jan 22, 2021] AntMan: Dynamic Scaling on GPU Clusters for Deep Learning, OSDI 2020
  - Presented by Jinwoo Jeong

- [Nov 04, 2020] Towards Real-time Cooperative Deep Inference over the Cloud and Edge End Devices, Ubicomp 2020
  - Presented by Jungmo Ahn
- [Oct 07, 2020] Effectively Prefetching Remote Memory with Leap, ATC 2020
  - Presented by Minjong Ha
- [Sep 16, 2020] Optimizing the TLB Shootdown Algorithm with Page Access Tracking, ATC 2017
  - Presented by Jonghyeon Kim
- [Jul 29, 2020] Pipelined Data-Parallel CPU/GPU Scheduling for Multi-DNN Real-Time Inference, RTSS 2019
  - Presented by Jungmo Ahn
- [Jul 15, 2020] Capuchin: Tensor-based GPU Memory Management for Deep Learning, ASPLOS 2020
  - Presented by Jinwoo Jeong
- [Jun 17, 2020] HotRing: A Hotspot-Aware In-Memory Key-Value Store, FAST 2020
  - Presented by Minjong Ha
- [Jun 03, 2020] Enhancing and Exploiting Contiguity for Fast Memory Virtualization, ISCA 2020
  - Presented by Jonghyeon Kim
- [May 13, 2020] Towards Efficient NVDIMM-based Heterogeneous Storage Hierarchy Management for Big Data Workloads, MICRO 2019
  - Presented by Wonkyo Choe
- [Feb 19, 2020] Mitosis: Transparently Self-Replicating Page-Tables for Large-Memory Machines, ASPLOS 2020
  - Presented by Wonkyo Choe
- [Feb 12, 2020] DRAGON: Breaking GPU Memory Capacity Limits with Direct NVM Access, SC 2018
  - Presented by Taeklim Kim
- [Jan 29, 2020] Mosaic: A GPU Memory Manager with Application-Transparent Support for Multiple Page Sizes, MICRO 2017
  - Presented by Jinwoo Jeong
- [Jan 22, 2020] Unfair Scheduling Patterns in NUMA Architectures, PACT 2019
  - Presented by Jonghyeon Kim
- [Jan 15, 2020] Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge, ASPLOS 2017
  - Presented by Jungmo Ahn
- [Jan 08, 2020] Making Huge Pages Actually Useful, ASPLOS 2018
  - Presented by Wonkyo Choe

- [Dec 03, 2019] Baymax: QoS Awareness and Increased Utilization for Non-Preemptive Accelerators in Warehouse Scale Computers, ASPLOS 2016
  - Presented by Jinwoo Hwang
- [Nov 12, 2019] Interplay between hardware prefetcher and page eviction policy in CPU-GPU unified virtual memory, ISCA 2019
  - Presented by Jinwoo Jeong
- [Oct 08, 2019] A Framework for Memory Oversubscription Management in Graphics Processing Units, ASPLOS 2019
  - Presented by Taeklim Kim
- [Sep 17, 2019] Thermostat: Application-transparent Page Management for Two-tiered Main Memory, ASPLOS 2017
  - Presented by Jonghyeon Kim
- [Sep 04, 2019] Software-Defined Far Memory in Warehouse-Scale Computers, ASPLOS 2019
  - Presented by Jinwoo Hwang
- [Aug 28, 2019] Combining HW/SW Mechanisms to Improve NUMA Performance of Multi-GPU Systems, MICRO 2018
  - Presented by Wonkyo Choe
- [Aug 07, 2019] Gandiva: Introspective Cluster Scheduling for Deep Learning, OSDI 2018
  - Presented by Jinwoo Jeong
- [Jul 31, 2019] Reducing DRAM Footprint with NVM in Facebook, EuroSys 2018
  - Presented by Jiwon Jeon
- [Jul 25, 2019] vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design, MICRO 2016
  - Presented by Taeklim Kim
- [Jul 17, 2019] Coordinated and Efficient Huge Page Management with Ingens, OSDI 2016
  - Presented by Jonghyeon Kim
- [Jul 03, 2019] Janus: Optimizing Memory and Storage Support for Non-Volatile Memory Systems, ISCA 2019
  - Presented by Wonkyo Choe
- [Jun 12, 2019] Introduction to Blockchain and HyperLedger
  - Presented by Jinwoo Hwang
- [May 15, 2019] PageSeer: Using Page Walks to Trigger Page Swaps in Hybrid Memory Systems, HPCA 2019
  - Presented by Taeklim Kim
- [Apr 10, 2019] Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization, ATC 2017
  - Presented by Jinwoo Jeong
- [Apr 03, 2019] Nimble Page Management for Tiered Memory Systems, ASPLOS 2019
  - Presented by Jiwon Jeon
- [Mar 12, 2019] HeteroOS - OS Design for Heterogeneous Memory Management in Datacenter, ISCA 2017
  - Presented by Jonghyeon Kim
- [Feb 28, 2019] Reducing the harmful effects of last-level cache polluters with an OS-level, software-only pollute buffer, MICRO 2008
  - Presented by Wonkyo Choe
- [Feb 14, 2019] PageForge: A Near-Memory Content-Aware Page-Merging Architecture, MICRO 2017
  - Presented by Taeklim Kim
- [Jan 24, 2019] Inter-Core Cooperative TLB Prefetchers for Chip Multiprocessors, ASPLOS 2010
  - Presented by Jonghyeon Kim
- [Jan 10, 2019] Translation Caching: Skip, Don’t Walk (the Page Table), ISCA 2010
  - Presented by Wonkyo Choe
