"ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion"
Sungho Koh, SeungJu Cha, Hyunwoo Oh, Kwanyoung Lee, Dong-Jin Kim
Neural Information Processing Systems (NeurIPS), 2025. (24.52% accept rate)
[PDF]
"Sali4Vid: Saliency-Aware Video Reweighting and Adaptive Caption Retrieval for Dense Video Captioning"
MinJu Jeon, Si-Woo Kim, Ye-Chan Kim, HyunGee Kim, Dong-Jin Kim
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025. (long, main) (22.16% accept rate)
[PDF]
"FlawMatch: Conditional Defect Image Generation via Flow Matching for Improved Surface Defect Classification"
Hyunwoo Oh, Seunghee Choi, Jinho Baek, {Dong-Jin Kim*, Junegak Joung*} (* Co-corresponding authors)
Advanced Engineering Informatics (AEI), 2025. (Impact Factor 9.9)
[PDF]
"SIDA: Synthetic Image Driven Zero-shot Domain Adaptation"
Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, Taewhan Kim, Dong-Jin Kim
ACM International Conference on Multimedia (MM), 2025. (??% accept rate)
[PDF]
Also presented at "Workshop on Curated Data for Efficient Learning" in conjunction with ICCV 2025.
"SynC: Synthetic Image Caption Dataset Refinement with One-to-many Mapping for Zero-shot Image Captioning"
Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Soeun Lee, Taewhan Kim, Dong-Jin Kim
ACM International Conference on Multimedia (MM), 2025. (??% accept rate)
[PDF]
Also presented at "Workshop on Curated Data for Efficient Learning" in conjunction with ICCV 2025.
"CatchPhrase: EXPrompt-Guided Encoder Adaptation for Audio-to-Image Generation"
Hyunwoo Oh, SeungJu Cha, Kwanyoung Lee, Si-Woo Kim, Dong-Jin Kim
ACM International Conference on Multimedia (MM), 2025. (??% accept rate)
[PDF]
"VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness"
SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim, Hyunwoo Oh, Dong-Jin Kim
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2025. (22.1% accept rate)
Also presented at "Workshop on AI for Creative Visual Content Generation Editing and Understanding" in conjunction with CVPR 2025.
"ViPCap: Retrieval Text-based Visual Prompts for Lightweight Image Captioning"
Taewhan Kim, Soeun Lee, Si-Woo Kim, Dong-Jin Kim
AAAI Conference on Artificial Intelligence (AAAI), 2025. (23.4% accept rate)
[PDF]
Also presented at "Workshop on Adaptive Foundation Models" in conjunction with NeurIPS 2024.
"IFCap: Image-like Retrieval and Frequency-based Entity Filtering for Zero-shot Captioning"
{Soeun Lee*, Si-Woo Kim*}, Taewhan Kim, Dong-Jin Kim (* Co-first authors)
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. (long, main) (20.8% accept rate)
Also presented at "Workshop on Adaptive Foundation Models" and "Workshop on Video-Language Models" in conjunction with NeurIPS 2024.
"Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality"
Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024. (long, main) (20.8% accept rate)
[PDF]
"Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data"
Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, and In So Kweon.
IEEE Access, 2024. (Impact Factor 3.4)
[PDF]
"Empirical study on using Adapters for debiased Visual Question Answering"
Jae Won Cho, Dawit Mureja Argaw, Youngtaek Oh, Dong-Jin Kim, In So Kweon
Computer Vision and Image Understanding (CVIU), 2023. (Impact Factor 4.5)
[PDF]
"Counterfactual Mix-Up for Visual Question Answering"
{Jae Won Cho*, Dong-Jin Kim*}, Yunjae Jung, In So Kweon (* Co-first authors)
IEEE Access, 2023. (Impact Factor 3.9)
[PDF]
"Local Pseudo-Attributes for Long-Tailed Recognition"
Dong-Jin Kim, Tsung-Wei Ke, Stella X. Yu
Pattern Recognition Letters (PRL), 2023. (Impact Factor 5.1)
Also presented at the "Self-Supervised Learning: Theory and Practice" workshop in conjunction with NeurIPS 2022.
"Modeling Semantic Correlation and Hierarchy for Real-world Wildlife Recognition"
Dong-Jin Kim, Zhongqi Miao, Yunhui Guo, Stella X. Yu
IEEE Signal Processing Letters (SPL), 2023. (Impact Factor 3.201)
[PDF]
Also presented at "Workshop on Human in the Loop Learning" in conjunction with NeurIPS 2022.
"Generative Bias for Robust Visual Question Answering"
Jae Won Cho, Dong-Jin Kim, Hyeonggon Ryu, and In So Kweon
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2023. (25.78% accept rate)
Received Bronze Prize, 28th Samsung Humantech Paper Awards (Top 2.8%)
Received Excellent Paper Award, IW-FCV 2023
Also presented at "Workshop on Open-Domain Reasoning Under Multi-Modal Settings" in conjunction with CVPR 2023.
"DASO: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning"
Youngtaek Oh, Dong-Jin Kim, and In So Kweon.
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2022. (25.3% accept rate)
Also presented at "Workshop on Learning with Limited Labelled Data for Image and Video Understanding" in conjunction with CVPR 2022.
"MCDAL: Maximum Classifier Discrepancy for Active Learning"
{Jae Won Cho*, Dong-Jin Kim*}, Yunjae Jung, and In So Kweon (* Co-first authors)
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022. (Impact Factor 14.255)
Also presented at "The Workshop on Fine-Grained Visual Categorization" in conjunction with CVPR 2022.
"Dense Relational Image Captioning via Multi-task Triple-Stream Networks"
Dong-Jin Kim, Tae-Hyun Oh, Jinsoo Choi, and In So Kweon.
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022. (Impact Factor 24.314)
[PDF] [arXiv] [Project] [Dataset] [code]
Received Qualcomm Innovation Award 2019.
"Single-Modal Entropy based Active Learning for Visual Question Answering"
{Dong-Jin Kim*, Jae Won Cho*}, Jinsoo Choi, Yunjae Jung, and In So Kweon (* Co-first authors)
British Machine Vision Conference (BMVC), 2021.
[PDF]
"LabOR: Labeling Only if Required for Domain Adaptive Semantic Segmentation"
Inkyu Shin, Dong-Jin Kim, Jae Won Cho, Sanghyun Woo, KwanYong Park, and In So Kweon
IEEE International Conference on Computer Vision (ICCV), 2021. (Oral) (3% accept rate)
[PDF]
Winner of Qualcomm Innovation Fellowship 2021.
"Dealing with Missing Modalities in the Visual Question Answer-Difference Prediction Task through Knowledge Distillation"
Jae Won Cho, Dong-Jin Kim, Yunjae Jung, Jinsoo Choi, and In So Kweon
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) Multimodal Learning and Applications Workshop, 2021.
[PDF]
Also presented at "Visual Question Answering Workshop" and "VizWiz Grand Challenge Workshop" in conjunction with CVPR 2021.
"Detecting Human-Object Interactions with Action Co-occurrence Priors"
Dong-Jin Kim, Xiao Sun, Jinsoo Choi, Stephen Lin, and In So Kweon
European Conference on Computer Vision (ECCV), 2020. (27% accept rate)
[PDF] [Project] [code] [Slides] [Video] [Poster]
Received Silver Prize, 26th Samsung Humantech Paper Awards (Top 1.6%)
Also presented at "The 2nd workshop on Video Turing Test: Toward Human-Level Video Story Understanding" in conjunction with ECCV 2020.
"Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach"
Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, and In So Kweon.
International Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019. (23.8% accept rate)
[PDF] [Project] [Slides] [Poster]
Also presented at "Language&Vision" and "Visual Question Answering and Dialog" Workshops in conjunction with CVPR 2019, and at "CLVL: 3rd Workshop on Closing the Loop Between Vision and Language" in conjunction with ICCV 2019.
"Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning"
Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, and In So Kweon.
IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2019. (25.2% accept rate)
[PDF] [Project] [Dataset] [code] [Slides] [Poster]
Extension of this work received Qualcomm Innovation Award 2019.
Also presented at "Language&Vision" and "Visual Question Answering and Dialog" Workshops in conjunction with CVPR 2019.
"Disjoint Multi-task Learning between Heterogeneous Human-centric Tasks"
Dong-Jin Kim, Jinsoo Choi, Tae-Hyun Oh, Youngjin Yoon, and In So Kweon.
IEEE Winter Conference on Applications of Computer Vision (WACV), 2018. (Oral)
[PDF]
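Summer Annual Conference of the Institute of Electronics and Information Engineers (IEIE), 2025.
"Improving Zero-shot Classification Performance via Class Token Refinement" Seunghee Choi, Hyunwoo Oh, Dong-Jin Kim
"A Study on Enhancing Zero-shot Image Captioning via Refinement of Synthetic Image Caption Datasets" Si-Woo Kim, MinJu Jeon, Ye-Chan Kim, Dong-Jin Kim (Oral) Excellent Paper Award
"A Dual-Modality Enhancement Method for Reducing the Modality Gap in Text-Video Retrieval" MinJu Jeon, HyunGee Kim, Si-Woo Kim, Dong-Jin Kim
"An Efficient Zero-shot Domain Adaptation Method Using Synthetic Images" Ye-Chan Kim, Si-Woo Kim, SeungJu Cha, Dong-Jin Kim
"ReST: Lightweight Image Captioning Using Tag Refinement" HyunGee Kim, MinJu Jeon, Soeun Lee, Dong-Jin Kim
"Training-free Style Transfer in Diffusion Models Using Style-specific Initial Latent Codes and Frequency Filtering" Yebin Ahn (안예빈), SeungJu Cha, Dong-Jin Kim
"Video Temporal Reasoning Using Context-based Text Queries" Jimin Han (한지민), MinJu Jeon, Dong-Jin Kim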
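Workshop on Image Processing and Image Understanding (IPIU), 2025.
"Global and Local Context Retrieval and an Object Counter for Zero-shot Captioning" Soeun Lee, Si-Woo Kim, Taewhan Kim, Dong-Jin Kim Excellent Poster Presentation Award
"A Study on Long-tailed Classification Using Adaptive Feature Augmentation" Seunghee Choi, Hyunwoo Oh, Dong-Jin Kim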
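Summer Annual Conference of the Institute of Electronics and Information Engineers (IEIE), 2024.
"Prompting an Image Retrieval Module Using Tag Information" MinJu Jeon, HyunGee Kim, Soeun Lee, Dong-Jin Kim Excellent Student Paper Award
"A Study on Enhancing Text-only Image Captioning via Retrieval Augmentation and a Fusion Module" Soeun Lee, Taewhan Kim, Si-Woo Kim, Dong-Jin Kim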
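Workshop on Image Processing and Image Understanding (IPIU), 2024.
"Improving Audio-based Image Manipulation via an Improved Diffusion Model Loss Function and Multi-modal Multi-attribute Conditions" Kwanyoung Lee, SeungJu Cha, Seunghee Choi, Hyunwoo Oh, Dong-Jin Kim (Oral) Excellent Paper Award, Encouragement Prize
"Addressing Imbalanced Dataset Classification Using Image Generation and Supervised Contrastive Learning" SeungJu Cha, Seunghee Choi, Kwanyoung Lee, Dong-Jin Kim
"An Efficient Image Retrieval Methodology Using Adapters in Image Captioning" MinJu Jeon, Si-Woo Kim, Soeun Lee, Dong-Jin Kim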