K. Kim, J. Kim, J. J. Kim, D. Li, and Y. Kim,
“FLEXLLM: Flexible and Cost-Efficient LLM Serving with Heterogeneous GPUs.”
In Proceedings of the IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), 2025.
K. Kim, J. Kim, H. Chung, M.-H. Cha, H.-Y. Kim, and Y. Kim,
“Cost-Efficient LLM Serving in the Cloud: VM Selection with KV Cache Offloading.”
In Proceedings of the IEEE International Conference on Cloud Computing (CLOUD), 2025.
H. Lee, K. Kim, J. Kim, J. So, M.-H. Cha, H.-Y. Kim, J. J. Kim, and Y. Kim,
“Shared Disk KV Cache Management for Efficient Multi-Instance Inference in RAG-Powered LLMs.”
In Proceedings of the IEEE International Conference on Cloud Computing (CLOUD), 2025.
Y. Kim, K. Kim, Y. Cho, J. Kim, A. Khan, K.-D. Kang, B.-S. An, M.-H. Cha, H.-Y. Kim, and Y. Kim,
“DeepVM: Integrating Spot and On-Demand VMs for Cost-Efficient Deep Learning Clusters in the Cloud.”
In Proceedings of the IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2024.
Y. Cho, Y. Kim, K. Kim, J. Kim, H.-Y. Kim, and Y. Kim,
“Optimizing Multi-level Checkpointing for Distributed Deep Learning Workloads on Cloud Spot VM Clusters.”
IEEE Access, vol. 12, pp. 116891–116904, Aug. 2024.
J. Kim, K. Kim, Y. Kim, M.-H. Cha, H.-Y. Kim, J. J. Kim, and Y. Kim,
"Resolving I/O Bottlenecks through Partition Prediction and Preloading in Large-Scale Disk-based GNN Training."
Korea Software Congress (KSC 2024), Yeosu Expo Convention Center, Korea, Dec. 2024.
H. Lee, K. Kim, J. So, M.-H. Cha, H.-Y. Kim, and Y. Kim,
"High-Performance LLM RAG System Utilizing Disk-Based KV Cache in Vector Database."
Korea Software Congress (KSC 2024), Yeosu Expo Convention Center, Korea, Dec. 2024.