Homepage of Dong-Jin Kim

Dong-Jin Kim (김동진)

Assistant Professor

Hanyang University (Multimodal AI Lab)

Email: djdkim [a] hanyang [d] ac [d] kr

Office: Room 512, Fusion Technology Center (FTC), 222 Wangsimni-ro, Seongdong-gu, Seoul, South Korea (ZIP: 04763)

Phone: (+82)-2-2220-2384

[CV] [LinkedIn] [Google Scholar]

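The Multimodal AI Lab at Hanyang University is recruiting passionate graduate students (Ph.D. and integrated M.S./Ph.D. applicants preferred). Undergraduate research experience is required.

If you are interested, please email the professor (1) your CV, (2) your academic transcript, and (3) your research portfolio.

Experience in research and programming is required, and high scores on English tests (TEPS, TOEIC, etc.) are helpful.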

I am an assistant professor in the Department of Data Science at Hanyang University. In 2022, I was a Postdoctoral Scholar at the International Computer Science Institute (ICSI) at UC Berkeley under the supervision of Prof. Stella Yu. I received my B.S., M.S., and Ph.D. degrees from the School of Electrical Engineering (EE) at KAIST (Korea Advanced Institute of Science and Technology), South Korea, in 2015, 2017, and 2021, respectively, advised by Prof. In So Kweon. From June 2019 to November 2019, I was a student intern in the Visual Computing Group at Microsoft Research Asia (MSRA), working with researchers Xiao Sun and Steve Lin. I received the Silver Prize at the Samsung Humantech Paper Awards and the Qualcomm Innovation Award as first author.

Research Interests

Scene Understanding
Language and Vision
Generative Models
Data Issues in Deep Learning

News

Feb. 2026. Four papers (3 main, 1 findings) accepted to CVPR 2026.

Sep. 2025. One paper accepted to NeurIPS 2025.

Sep. 2025. One paper accepted to EMNLP 2025.

Jul. 2025. Three papers accepted to ACM MM 2025.

Research Experiences

Assistant Professor (Sep 2022 ~ Present)

Department of Data Science, Hanyang University

Postdoctoral Scholar (Jan 2022 ~ Aug 2022)

EECS Department, UC Berkeley (Supervisor: Stella Yu)

Research Intern (Jun 2019 ~ Nov 2019)

Visual Computing Group, Microsoft Research Asia (Mentors: Xiao Sun and Steve Lin)

Research Assistant (Mar 2015 ~ Aug 2021)

Electrical Engineering, KAIST (Supervisor: In So Kweon)

Education

Ph.D. (Mar 2017 ~ Aug 2021)

Electrical Engineering, KAIST (Advisor: In So Kweon)

Dissertation: High-level Scene Understanding with Relational and Linguistic Priors

M.S. (Mar 2015 ~ Feb 2017)

Electrical Engineering, KAIST (Advisor: In So Kweon)

Thesis: Disjoint Multi-task Learning between Heterogeneous Action and Caption Data

B.S. (Mar 2011 ~ Feb 2015)

Electrical Engineering, KAIST

Selected Publications

"Cap4Bridge: Caption-Guided Cross-Modal Contextualization with Stochastic Augmentation for Text-Video Retrieval"

MinJu Jeon, HyunGee Kim, Si-Woo Kim, Youngtaek Oh, Soeun Lee, Dong-Jin Kim

IEEE Access, 2026. (Impact Factor 3.6)

[PDF]

"Adaptive Auxiliary Prompt Blending for Target-Faithful Diffusion Generation"

Kwanyoung Lee, SeungJu Cha, Yebin Ahn, Hyunwoo Oh, Sungho Koh, Dong-Jin Kim

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026. (25.42% accept rate)

[PDF] [code]

Received Honorable Mention, 32nd Samsung Humantech Paper Awards

"SAIL: Similarity-Aware Guidance and Inter-Caption Augmentation-based Learning for Weakly-Supervised Dense Video Captioning"

Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, MinJu Jeon, HyunGee Kim, Dong-Jin Kim

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026. (25.42% accept rate)

[PDF] [code]

"Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning"

Seunghee Choi, MinJu Jeon, Hyunwoo Oh, Jihwan Lee, Dong-Jin Kim

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026. (25.42% accept rate)

[PDF] [code]

"ADAPT: Attention Driven Adaptive Prompt Scheduling and InTerpolating Orthogonal Complements for Rare Concepts Generation"

Kwanyoung Lee, Hyunwoo Oh, SeungJu Cha, Sungho Koh, Dong-Jin Kim

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Findings, 2026.

[PDF] [code]

"ADuLTS: Appearance Descriptions under Long-Tailed Scenarios with diverse synthesized images"

SeungJu Cha, Seunghee Choi, Kwanyoung Lee, Dong-Jin Kim

Computer Vision and Image Understanding (CVIU), 2026. (Impact Factor 3.5)

[PDF]

"Combining near-infrared spectroscopic signatures and physical traits based on machine vision to enhance accuracy in identification of the geographical origins of agricultural products"

{Seongsoo Jeong‡, Seung-hee Choi‡}, Hyunwoo Oh, Haejin Kim, Han-sub Chang, Jisook Song, Ho jin Kim, {Dong-Jin Kim*, Hoeil Chung*} (* Co-corresponding authors) (‡ Co-first authors)

Microchemical Journal, 2025. (Impact Factor 5.1)

[PDF]

"ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion"

Sungho Koh, SeungJu Cha, Hyunwoo Oh, Kwanyoung Lee, Dong-Jin Kim

Neural Information Processing Systems (NeurIPS), 2025. (24.52% accept rate)

[PDF]

"Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning"

MinJu Jeon, Si-Woo Kim, Ye-Chan Kim, HyunGee Kim, Dong-Jin Kim

International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025. (long, main) (22.16% accept rate)

[PDF]

Also presented at "Workshop on Multi-Modal Reasoning for Agentic Intelligence" in conjunction with ICCV 2025.

"FlawMatch: Conditional Defect Image Generation via Flow Matching for Improved Surface Defect Classification"

Hyunwoo Oh, Seunghee Choi, Jinho Baek, {Dong-Jin Kim*, Junegak Joung*} (* Co-corresponding authors)

Advanced Engineering Informatics (AEI), 2025. (Impact Factor 9.9)

[PDF]

"SIDA: Synthetic Image Driven Zero-shot Domain Adaptation"

Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, Taewhan Kim, Dong-Jin Kim

ACM International Conference on Multimedia (MM), 2025. (23.45% accept rate) (Oral)

[PDF]

Also presented at "Workshop on Curated Data for Efficient Learning" in conjunction with ICCV 2025.

"SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning"

Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, Dong-Jin Kim

ACM International Conference on Multimedia (MM), 2025. (23.45% accept rate) (Oral)

[PDF]

Also presented at "Workshop on Curated Data for Efficient Learning" in conjunction with ICCV 2025.

"CatchPhrase: EXPrompt-Guided Encoder Adaptation for Audio-to-Image Generation"

Hyunwoo Oh, SeungJu Cha, Kwanyoung Lee, Si-Woo Kim, Dong-Jin Kim

ACM International Conference on Multimedia (MM), 2025. (23.45% accept rate) (Oral)

[PDF]

"VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness"

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim, Hyunwoo Oh, Dong-Jin Kim

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025. (22.1% accept rate)

[PDF] [code]

Also presented at "Workshop on AI for Creative Visual Content Generation Editing and Understanding" in conjunction with CVPR 2025.

"ViPCap: Retrieval Text-based Visual Prompts for Lightweight Image Captioning"

Taewhan Kim, Soeun Lee, Si-Woo Kim, Dong-Jin Kim

AAAI Conference on Artificial Intelligence (AAAI), 2025. (23.4% accept rate)

[PDF]

Also presented at "Workshop on Adaptive Foundation Models" in conjunction with NeurIPS 2024.

"IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning"

{Soeun Lee*, Si-Woo Kim*}, Taewhan Kim, Dong-Jin Kim (* Co-first authors)

International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. (long, main) (20.8% accept rate)

[PDF]

Also presented at "Workshop on Adaptive Foundation Models" and "Workshop on Video-Language Models" in conjunction with NeurIPS 2024.

"Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality"

Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim

International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. (long, main) (20.8% accept rate)

[PDF]

"Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data"

Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, and In So Kweon.

IEEE Access, 2024. (Impact Factor 3.4)

[PDF]

"Empirical study on using Adapters for debiased Visual Question Answering"

Jae Won Cho, Dawit Mureja Argaw, Youngtaek Oh, Dong-Jin Kim, In So Kweon

Computer Vision and Image Understanding (CVIU), 2023. (Impact Factor 4.5)

[PDF]

"Counterfactual Mix-Up for Visual Question Answering"

{Jae Won Cho*, Dong-Jin Kim*}, Yunjae Jung, In So Kweon (* Co-first authors)

IEEE Access, 2023. (Impact Factor 3.9)

[PDF]

"Technical Report of NICE Challenge at CVPR 2023: Retrieval-based Data Discovery and Fusion for Zero-shot Image Captioning"

Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim

preprint, 2023.

[PDF] [code]

2nd place in the NICE Challenge at CVPR 2023

"Local Pseudo-Attributes for Long-Tailed Recognition"

Dong-Jin Kim, Tsung-Wei Ke, Stella X. Yu

Pattern Recognition Letters (PRL), 2023. (Impact Factor 5.1)

[PDF]

Also presented at the "Self-Supervised Learning: Theory and Practice" workshop in conjunction with NeurIPS 2022.

"Modeling Semantic Correlation and Hierarchy for Real-world Wildlife Recognition"

Dong-Jin Kim, Zhongqi Miao, Yunhui Guo, Stella X. Yu

IEEE Signal Processing Letters (SPL), 2023. (Impact Factor 3.201)

[PDF]

Also presented at "Workshop on Human in the Loop Learning" in conjunction with NeurIPS 2022.

"Generative Bias for Robust Visual Question Answering"

Jae Won Cho, Dong-Jin Kim, Hyeonggon Ryu, and In So Kweon

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023. (25.78% accept rate)

[PDF] [code]

Received Bronze Prize, 28th Samsung Humantech Paper Awards (Top 2.8%)
Received Excellent Paper Award, IW-FCV 2023
Also presented at "Workshop on Open-Domain Reasoning Under Multi-Modal Settings" in conjunction with CVPR 2023.

"Self-Sufficient Framework for Continuous Sign Language Recognition"

YeongJun Jang, Youngtaek Oh, Jae Won Cho, Myungchul Kim, Dong-Jin Kim, In So Kweon, and Joon Son Chung

International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023. (Oral) (Top 3% recognition)

[PDF] [Project]

"Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition"

YeongJun Jang, Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, Joon Son Chung, and In So Kweon

British Machine Vision Conference (BMVC), 2022.

[PDF] [Project] [code]

"DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning"

Youngtaek Oh, Dong-Jin Kim, and In So Kweon.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (25.3% accept rate)

[PDF] [Project] [code]

Also presented at "Workshop on Learning with Limited Labelled Data for Image and Video Understanding" in conjunction with CVPR 2022.

"MCDAL: Maximum Classifier Discrepancy for Active Learning"

{Jae Won Cho*, Dong-Jin Kim*}, Yunjae Jung, and In So Kweon (* Co-first authors)

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022. (Impact Factor 14.255)

[PDF] [arXiv] [code]

Also presented at "The Workshop on Fine-Grained Visual Categorization" in conjunction with CVPR 2022.

"Dense Relational Image Captioning via Multi-task Triple-Stream Networks"

Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, and In So Kweon.

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. (Impact Factor 24.314)

[PDF] [arXiv] [Project] [Dataset] [code]

Received Qualcomm Innovation Award 2019.

"Single-Modal Entropy based Active Learning for Visual Question Answering"

{Dong-Jin Kim*, Jae Won Cho*}, Jinsoo Choi, Yunjae Jung, and In So Kweon (* Co-first authors)

British Machine Vision Conference (BMVC), 2021.

[PDF]

"ACP++: Action Co-occurrence Priors for Human-Object Interaction Detection"

Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, and In So Kweon

IEEE Transactions on Image Processing (TIP), 2021. (Impact Factor 10.856)

[PDF] [arXiv] [Project] [code]

"LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation"

Inkyu Shin, Dong-Jin Kim, Jae Won Cho, Sanghyun Woo, KwanYong Park, and In So Kweon

IEEE International Conference on Computer Vision (ICCV), 2021. (Oral) (3% accept rate)

[PDF]

Winner of Qualcomm Innovation Fellowship 2021.

"Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation"

Jae Won Cho, Dong-Jin Kim, Yunjae Jung, Jinsoo Choi, and In So Kweon

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Multimodal Learning and Applications Workshop, 2021.

[PDF]

Also presented at "Visual Question Answering Workshop" and "VizWiz Grand Challenge Workshop" in conjunction with CVPR 2021.

"Detecting Human-Object Interactions with Action Co-occurrence Priors"

Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, and In So Kweon

European Conference on Computer Vision (ECCV), 2020. (27% accept rate)

[PDF] [Project] [code] [Slides] [Video] [Poster]

Received Silver Prize, 26th Samsung Humantech Paper Awards (Top 1.6%)
Also presented at "The 2nd workshop on Video Turing Test: Toward Human-Level Video Story Understanding" in conjunction with ECCV 2020.

"Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach"

Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, and In So Kweon.

International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019. (23.8% accept rate)

[PDF] [Project] [Slides] [Poster]

Also presented at "Language&Vision" and "Visual Question Answering and Dialog" Workshops in conjunction with CVPR 2019, and "CLVL: 3rd Workshop on Closing the Loop Between Vision and Language" in conjunction with ICCV 2019.

"Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning"

Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, and In So Kweon.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. (25.2% accept rate)

[PDF] [Project] [Dataset] [code] [Slides] [Poster]

Extension of this work received Qualcomm Innovation Award 2019.
Also presented at "Language&Vision" and "Visual Question Answering and Dialog" Workshops in conjunction with CVPR 2019.

"Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks"

Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, Youngjin Yoon, and In So Kweon.

IEEE Winter Conference on Applications of Computer Vision (WACV), 2018. (Oral)

[PDF]

Reviewer Experiences

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
IEEE International Conference on Computer Vision (ICCV)
European Conference on Computer Vision (ECCV)
Conference on Neural Information Processing Systems (NeurIPS)
International Conference on Machine Learning (ICML)
International Conference on Learning Representations (ICLR)
AAAI Conference on Artificial Intelligence (AAAI)

Teaching Experiences

at Hanyang University

Instructor for "Linear Algebra" (Fall 2022, Fall 2023, Fall 2024, Fall 2025, Spring 2026)
Instructor for "Computer Vision" (Spring 2023, Spring 2024, Spring 2025)
Instructor for "Deep Learning Basics" (Spring 2023, Spring 2024, Spring 2025)
Instructor for "Advanced Computer Vision Applications" (Spring 2023, Spring 2024, Spring 2025, Spring 2026)

at KAIST

TA for "Circuit Theory" (Spring 2016)
TA for "Introduction to Electronics Design Lab" (Fall 2016)
TA for "Electronics Design Lab" (Spring 2017)
TA for "My Life and Career in EE 2" (Fall 2017)
TA for "Advanced Topics in Deep Learning for Robotics and Vision" (Spring 2018, Spring 2019)
TA for "Computer Vision" (Fall 2018)
TA for "Multiple View Geometry" (Spring 2020)
Tutor for "Programming Structure for Electrical Engineering" (Fall 2015, Spring 2017)
Tutor for "Signals and Systems" (Fall 2017)
Tutor for "Circuit Theory" (Fall 2018)
