Jeong Hun Yeo

Ph.D. Candidate

Integrated Vision & Language Lab.

Korea Advanced Institute of Science and Technology (KAIST)

e-mail: sedne246@kaist.ac.kr / [Google Scholar] / [LinkedIn]

Education 

Ph.D. in Electrical Engineering (advisor: Prof. Yong Man Ro)

M.S. in Electrical Engineering (advisor: Prof. Yong Man Ro)

B.S. in Electrical Engineering

Publications 

Preprints

Jeong Hun Yeo, Chae Won Kim, Hyunjun Kim, Hyeongseop Rha, Seunghee Han, Wen-Huang Cheng, Yong Man Ro

arXiv 2024, Under review [Paper][Code]


International Journal

Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, and Yong Man Ro 

IEEE Transactions on Multimedia (TMM), 2024 [Paper]


International Conference 

1. Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing

Jeong Hun Yeo*, Seunghee Han*, Minsu Kim, and Yong Man Ro (* Co-First Authors)

Findings of Empirical Methods in Natural Language Processing (EMNLP), 2024 [Paper][Code]

2. Efficient Training for Multilingual Visual Speech Recognition: Pre-training with Discretized Visual Speech Representation 

Minsu Kim*, Jeong Hun Yeo*, Se Jin Park, Hyeongseop Rha, and Yong Man Ro (* Co-First Authors)

ACM International Conference on Multimedia (ACM MM), 2024 [Paper][Code]

3. Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation  

Se Jin Park*, Chae Won Kim*, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeong Hun Yeo, and Yong Man Ro (* Co-First Authors)

Annual Meeting of the Association for Computational Linguistics (ACL), 2024, Oral Presentation [Paper | Data | Demo]

4. Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper

Jeong Hun Yeo*, Minsu Kim*, Shinji Watanabe, and Yong Man Ro (* Co-First Authors)

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [Paper][Code]

5. Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens

Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, and Yong Man Ro

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [Paper][Code]

6. Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, and Yong Man Ro (* Co-First Authors)

IEEE/CVF International Conference on Computer Vision (ICCV), 2023 [Paper][Code]

7. Multi-Temporal Lip-Audio Memory for Visual Speech Recognition

Jeong Hun Yeo, Minsu Kim, and Yong Man Ro

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 [Paper]

8. Distinguishing Homophenes using Multi-head Visual-audio Memory for Lip Reading

Minsu Kim, Jeong Hun Yeo, and Yong Man Ro

AAAI Conference on Artificial Intelligence (AAAI), 2022 [Paper] [Code]


Professional Activities 

Teaching 

Teaching Assistant