Jongwoo Ko

I am a Ph.D. student at KAIST AI and a member of OSI Lab (Advisor: Se-Young Yun, KAIST). My current research focuses on efficient Transformer models [C6, C7, C8, P2], particularly generative language models such as T5 and LLaMA. I am also interested in efficient Vision Transformer and multi-modal models [P1]. My goal is to enhance the efficiency of large Transformer models.

Previously, my research interests revolved around developing new algorithms that address real-world challenges in the machine learning pipeline, such as label noise [C1, C3, W4] and class imbalance [C5], while providing statistical or mathematical guarantees. I received my master's degree from the Department of Industrial and Systems Engineering at KAIST under the supervision of Prof. Heeyoung Kim.

Contact me: jongwoo [dot] ko [at] kaist [dot] ac [dot] kr [CV / Scholar / Github / LinkedIn]

Preprints 🗒️

(P: Preprint, *: Equal Contribution, ^: Equal Advising)

[P2] DistiLLM: Towards Streamlined Distillation for Large Language Models 

[P1] Improving Adaptability and Generalizability of Efficient Transfer Learning for Vision-Language Models

Publications 📑

(J: Journal, C: Conference, W: Workshop, *: Equal Contribution, ^: Equal Advising)

2024

[C9] Fine-tuning Pre-trained Models for Robustness Under Noisy Labels

2023

[C8] NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models

[C7] Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding

[W4] Efficient Utilization of Pre-trained Model for Learning with Noisy Labels

[C6] Revisiting Intermediate Layer Distillation for Compressing Language Models: An Overfitting Perspective 

[C5/W2/W3] CUDA: Curriculum of Data Augmentation for Long-tailed Recognition

[C4] Self-Contrastive Learning: Single-viewed Supervised Contrastive Framework using Sub-network

[C3/W1] A Gift from Label Smoothing: Robust Training with Adaptive Label Smoothing via Auxiliary Classifier under Label Noise

2022

[C2] Synergy with Translation Artifacts for Training and Inference in Multilingual Tasks

[J2] Deep Learning-Based Cataract Detection and Grading from Slit-Lamp and Retro-Illumination Photographs: Model Development and Validation Study

2021

[C1] FINE Samples for Learning with Noisy Labels

[J1] Deep Gaussian Process Models for Integrating Multifidelity Experiments with Non-stationary Relationships

Experience 🌏

Applied Scientist Intern @Amazon.com Services LLC

Invited Talk 📢

ASG (Applied Science Group) Research Talk @Microsoft

Code Implementations 🖥️

Pytorch-MiniLM

Awards & Honors 🏆

Silver Prize, 30th Samsung Humantech Paper Awards (2024)

Winner, Qualcomm Innovation Fellowship Korea (2022)

Editor's Choice for Featured Article, IISE Transactions (2022)

Education 🧑‍🎓

Korea Advanced Institute of Science and Technology (KAIST), Seoul, Korea, Mar. 2020 - Present

Ph.D. in the Kim Jaechul Graduate School of Artificial Intelligence (Advisor: Se-Young Yun)

Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, Mar. 2018 - Feb. 2020

M.S. in the Department of Industrial and Systems Engineering (Advisor: Heeyoung Kim)

Thesis: Deep Gaussian Process Models for Integrating Multifidelity Experiments with Non-stationary Relationships

Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea, Mar. 2014 - Feb. 2018

B.S. in the Department of Industrial and Systems Engineering (Magna Cum Laude)