Assistant Professor
Department of Computer Science (Joint Appointment)
Hanyang University (Multimodal AI Lab)
Email: djdkim [a] hanyang [d] ac [d] kr
Office: Room 512, Fusion Technology Center (FTC), 222 Wangsimni-ro, Seongdong-gu, Seoul, South Korea (ZIP: 04763)
Phone: (+82)-2-2220-2384
[CV] [LinkedIn] [Google Scholar]
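The Multimodal AI Lab at Hanyang University is recruiting passionate graduate students (Ph.D. and integrated M.S./Ph.D. applicants preferred); undergraduate research experience is required.
If you are interested, please email the professor (1) your CV, (2) your academic transcript, and (3) your research portfolio.
Research and programming experience is required, and a high score on an English test (TEPS, TOEIC, etc.) is a plus.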
I am an assistant professor in the Department of Data Science at Hanyang University. In 2022, I was a Postdoctoral Scholar at the International Computer Science Institute (ICSI) at UC Berkeley under the supervision of Prof. Stella Yu. I received my B.S., M.S., and Ph.D. degrees from the School of Electrical Engineering (EE) at KAIST (Korea Advanced Institute of Science and Technology), South Korea, in 2015, 2017, and 2021, respectively, where I was advised by Prof. In So Kweon. From June 2019 to November 2019, I was a student intern in the Visual Computing Group at Microsoft Research Asia (MSRA), working with researchers Xiao Sun and Steve Lin. I received the Silver Prize at the Samsung Humantech Paper Awards and the Qualcomm Innovation Award as the first author.
Scene Understanding
Language and Vision
Generative Models
Data Issues in Deep Learning
Sep. 2025. One paper accepted in EMNLP 2025.
Jul. 2025. Three papers accepted in ACM MM 2025.
Feb. 2025. One paper accepted in CVPR 2025.
Dec. 2024. One paper accepted in AAAI 2025.
Department of Data Science, Hanyang University
EECS Department, UC Berkeley (Supervisor: Stella Yu)
Visual Computing Group, Microsoft Research Asia (Mentor: Xiao Sun and Steve Lin)
Electrical Engineering, KAIST (Supervisor: In So Kweon)
Electrical Engineering, KAIST (Advisor: In So Kweon)
Dissertation: High-level Scene Understanding with Relational and Linguistic Priors
Electrical Engineering, KAIST (Advisor: In So Kweon)
Thesis: Disjoint Multi-task Learning between Heterogeneous Action and Caption Data
Electrical Engineering, KAIST
"Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning"
MinJu Jeon, Si-Woo Kim, Ye-Chan Kim, HyunGee Kim, Dong-Jin Kim
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025. (long, main) (22.16% accept rate)
[PDF]
"FlawMatch: Conditional Defect Image Generation via Flow Matching for Improved Surface Defect Classification"
Hyunwoo Oh, Seunghee Choi, Jinho Baek, {Dong-Jin Kim*, Junegak Joung*} (* Co-corresponding authors)
Advanced Engineering Informatics (AEI), 2025. (Impact Factor 9.9)
[PDF]
"SIDA: Synthetic Image Driven Zero-shot Domain Adaptation"
Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, Taewhan Kim, Dong-Jin Kim
ACM International Conference on Multimedia (MM), 2025. (??% accept rate)
[PDF]
"SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning"
Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, Dong-Jin Kim
ACM International Conference on Multimedia (MM), 2025. (??% accept rate)
[PDF]
"CatchPhrase: EXPrompt-Guided Encoder Adaptation for Audio-to-Image Generation"
Hyunwoo Oh, SeungJu Cha, Kwanyoung Lee, Si-Woo Kim, Dong-Jin Kim
ACM International Conference on Multimedia (MM), 2025. (??% accept rate)
[PDF]
"VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness"
SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim, Hyunwoo Oh, Dong-Jin Kim
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2025. (22.1% accept rate)
Also presented at "Workshop on AI for Creative Visual Content Generation Editing and Understanding" in conjunction with CVPR 2025.
"ViPCap: Retrieval Text-based Visual Prompts for Lightweight Image Captioning"
Taewhan Kim, Soeun Lee, Si-Woo Kim, Dong-Jin Kim
AAAI Conference on Artificial Intelligence (AAAI), 2025. (23.4% accept rate)
[PDF]
Also presented at "Workshop on Adaptive Foundation Models" in conjunction with NeurIPS 2024.
"IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning"
{Soeun Lee*, Si-Woo Kim*}, Taewhan Kim, Dong-Jin Kim (* Co-first authors)
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. (long, main) (20.8% accept rate)
[PDF]
Also presented at "Workshop on Adaptive Foundation Models" and "Workshop on Video-Language Models" in conjunction with NeurIPS 2024.
"Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality"
Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. (long, main) (20.8% accept rate)
[PDF]
"Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data"
Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, and In So Kweon.
IEEE Access, 2024. (Impact Factor 3.4)
[PDF]
"Empirical study on using Adapters for debiased Visual Question Answering"
Jae Won Cho, Dawit Mureja Argaw, Youngtaek Oh, Dong-Jin Kim, In So Kweon
Computer Vision and Image Understanding (CVIU), 2023. (Impact Factor 4.5)
[PDF]
"Counterfactual Mix-Up for Visual Question Answering"
{Jae Won Cho*, Dong-Jin Kim*}, Yunjae Jung, In So Kweon (* Co-first authors)
IEEE Access, 2023. (Impact Factor 3.9)
[PDF]
"Local Pseudo-Attributes for Long-Tailed Recognition"
Dong-Jin Kim, Tsung-Wei Ke, Stella X. Yu
Pattern Recognition Letters (PRL), 2023. (Impact Factor 5.1)
Also presented at the "Self-Supervised Learning: Theory and Practice" workshop in conjunction with NeurIPS 2022.
"Modeling Semantic Correlation and Hierarchy for Real-world Wildlife Recognition"
Dong-Jin Kim, Zhongqi Miao, Yunhui Guo, Stella X. Yu
IEEE Signal Processing Letters (SPL), 2023. (Impact Factor 3.201)
[PDF]
Also presented at "Workshop on Human in the Loop Learning" in conjunction with NeurIPS 2022.
"Generative Bias for Robust Visual Question Answering"
Jae Won Cho, Dong-Jin Kim, Hyeonggon Ryu, and In So Kweon
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2023. (25.78% accept rate)
Received Bronze Prize, 28th Samsung Humantech Paper Awards (Top 2.8%)
Received Excellent Paper Award, IW-FCV 2023
Also presented at "Workshop on Open-Domain Reasoning Under Multi-Modal Settings" in conjunction with CVPR 2023
"DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning"
Youngtaek Oh, Dong-Jin Kim, and In So Kweon.
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (25.3% accept rate)
Also presented at "Workshop on Learning with Limited Labelled Data for Image and Video Understanding" in conjunction with CVPR 2022.
"MCDAL: Maximum Classifier Discrepancy for Active Learning"
{Jae Won Cho*, Dong-Jin Kim*}, Yunjae Jung, and In So Kweon (* Co-first authors)
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022. (Impact Factor 14.255)
Also presented at "The Workshop on Fine-Grained Visual Categorization" in conjunction with CVPR 2022.
"Dense Relational Image Captioning via Multi-task Triple-Stream Networks"
Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, and In So Kweon.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. (Impact Factor 24.314)
[PDF][arXiv] [Project] [Dataset] [code]
Received Qualcomm Innovation Award 2019.
"Single-Modal Entropy based Active Learning for Visual Question Answering"
{Dong-Jin Kim*, Jae Won Cho*}, Jinsoo Choi, Yunjae Jung, and In So Kweon (* Co-first authors)
British Machine Vision Conference (BMVC), 2021.
[PDF]
"LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation"
Inkyu Shin, Dong-Jin Kim, Jae Won Cho, Sanghyun Woo, KwanYong Park, and In So Kweon
IEEE International Conference on Computer Vision (ICCV), 2021. (Oral) (3% accept rate)
[PDF]
Winner of Qualcomm Innovation Fellowship 2021.
"Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation"
Jae Won Cho, Dong-Jin Kim, Yunjae Jung, Jinsoo Choi, and In So Kweon
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) Multimodal Learning and Applications Workshop, 2021.
[PDF]
Also presented at "Visual Question Answering Workshop" and "VizWiz Grand Challenge Workshop" in conjunction with CVPR 2021.
"Detecting Human-Object Interactions with Action Co-occurrence Priors"
Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, and In So Kweon.
European Conference on Computer Vision (ECCV), 2020. (27% accept rate)
[PDF] [Project] [code] [Slides] [Video] [Poster]
Received Silver Prize, 26th Samsung Humantech Paper Awards (Top 1.6%)
Also presented at "The 2nd workshop on Video Turing Test: Toward Human-Level Video Story Understanding" in conjunction with ECCV 2020.
"Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach"
Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, and In So Kweon.
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019. (23.8% accept rate)
[PDF] [Project] [Slides] [Poster]
Also presented at "Language&Vision" and "Visual Question Answering and Dialog" Workshops in conjunction with CVPR 2019, and "CLVL: 3rd Workshop on Closing the Loop Between Vision and Language" in conjunction with ICCV 2019.
"Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning"
Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, and In So Kweon.
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2019. (25.2% accept rate)
[PDF] [Project] [Dataset] [code] [Slides] [Poster]
Extension of this work received Qualcomm Innovation Award 2019.
Also presented at "Language&Vision" and "Visual Question Answering and Dialog" Workshops in conjunction with CVPR 2019.
"Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks"
Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, Youngjin Yoon, and In So Kweon.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2018. (Oral)
[PDF]
Excellent Paper Award. 2023.
"Generative Bias for Robust Visual Question Answering"
IW-FCV 2023.
CVPR Doctoral Consortium. Jun 2021
IEEE CVPR 2021.
Silver Prize, 26th Samsung Humantech Paper Awards (Top 1.6%), 2020 [certificate]
"Detecting Human-Object Interactions with Action Co-occurrence Priors"
Samsung Electronics Co., Ltd.
Qualcomm Innovation Award, 2019 [certificate]
"Dense Relational Image Captioning via Multi-task Triple-Stream Networks"
Qualcomm Inc.
Certificate
International Computer Vision Summer School (ICVSS 2018) [certificate]
Sicily, Italy
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2020, 2021, 2022, 2023, 2024, 2025
IEEE International Conference on Computer Vision (ICCV) 2021, 2023
European Conference on Computer Vision (ECCV) 2022, 2024
Conference on Neural Information Processing Systems (NeurIPS) 2020, 2021, 2022, 2023, 2024, 2025
International Conference on Machine Learning (ICML) 2021, 2022, 2023, 2024, 2025
International Conference on Learning Representations (ICLR) 2022, 2023, 2024, 2025
Annual Meeting of the Association for Computational Linguistics (ACL) 2022
International Conference on Empirical Methods in Natural Language Processing (EMNLP) 2023
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (2021, 2023)
International Journal of Computer Vision (IJCV) (2023)
IEEE Transactions on Image Processing (TIP) (2020, 2022)
IEEE Transactions on Cybernetics (Cybernetics) (2022)
IEEE Transactions on Multimedia (TMM) (2023)
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) (2020, 2023)
IEEE Sensors Journal (Sensors) (2022)
AAAI Conference on Artificial Intelligence (AAAI) 2021, 2022, 2023, 2024, 2025
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2023, 2024
Winter Conference on Applications of Computer Vision (WACV) 2021, 2022, 2023, 2024
Asian Conference on Computer Vision (ACCV) 2020
at Hanyang University
Instructor for "Linear Algebra" (Fall 2022, Fall 2023, Fall 2024)
Instructor for "Computer Vision" (Spring 2023, Spring 2024)
Instructor for "Deep Learning Basics" (Spring 2023, Spring 2024)
Instructor for "Advanced Computer Vision Applications" (Spring 2023, Spring 2024)
at KAIST
TA for "Circuit Theory" (Spring 2016)
TA for "Introduction to Electronics Design Lab" (Fall 2016)
TA for "Electronics Design Lab" (Spring 2017)
TA for "My Life and Career in EE 2" (Fall 2017)
TA for "Advanced Topics in Deep Learning for Robotics and Vision" (Spring 2018, Spring 2019)
TA for "Computer Vision" (Fall 2018)
TA for "Multiple View Geometry" (Spring 2020)
Tutor for "Programming Structure for Electrical Engineering" (Fall 2015, Spring 2017)
Tutor for "Signals and Systems" (Fall 2017)
Tutor for "Circuit Theory" (Fall 2018)