Jeong Hun Yeo
Ph.D. Candidate
Integrated Vision & Language Lab.
Korea Advanced Institute of Science and Technology (KAIST)
e-mail: sedne246@kaist.ac.kr / [Google Scholar] / [LinkedIn]
Education
Korea Advanced Institute of Science and Technology (KAIST), South Korea (2022 - Present)
Ph.D. in Electrical Engineering (advisor: Prof. Yong Man Ro)
Korea Advanced Institute of Science and Technology (KAIST), South Korea (2020 - 2022)
M.S. in Electrical Engineering (advisor: Prof. Yong Man Ro)
Korea Advanced Institute of Science and Technology (KAIST), South Korea (2014 - 2020)
B.S. in Electrical Engineering
Publications
Preprints
Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
Jeong Hun Yeo*, Seunghee Han*, Minsu Kim, and Yong Man Ro (* Co-First Authors)
arXiv 2024, Under review [Paper] [Code]
Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units
Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, Se Jin Park, and Yong Man Ro (* Co-First Authors)
arXiv 2024, Under review [Paper]
International Journal
AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, and Yong Man Ro
IEEE Transactions on Multimedia (TMM), 2024 [Paper]
International Conference
1. Visual Speech Recognition for Languages with Limited Labeled Data using Automatic Labels from Whisper
Jeong Hun Yeo*, Minsu Kim*, Shinji Watanabe, and Yong Man Ro (* Co-First Authors)
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [Paper]
2. Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, and Yong Man Ro
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024 [Paper]
3. Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, and Yong Man Ro (* Co-First Authors)
IEEE/CVF International Conference on Computer Vision (ICCV), 2023 [Paper]
4. Multi-Temporal Lip-Audio Memory for Visual Speech Recognition
Jeong Hun Yeo, Minsu Kim, and Yong Man Ro
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023 [Paper]
5. Distinguishing Homophenes using Multi-head Visual-audio Memory for Lip Reading
Minsu Kim, Jeong Hun Yeo, and Yong Man Ro
AAAI Conference on Artificial Intelligence (AAAI), 2022 [Paper] [Code]
Professional Activities
Teaching
EE837 Special Topics in Signal Processing: Multimedia Processing and Learning, KAIST (2023)
Teaching Assistant