Hyunjun Kim

Scholar

Education

Ph.D. Candidate, Mar. 2020 - Present

Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea

- Advisor: Prof. Yong Man Ro

B.S., Mar. 2014 - Feb. 2020

Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea

Research Interests

- Machine Learning

- Deep Learning

- Computer Vision

- Object Detection

- Multimodal Learning

- Multimodal Large Language Model

My current work explores methods for extending the capabilities of Large Language Models (LLMs) to multimodal applications.

Specifically, my research aims to enhance the robustness of Multimodal Large Language Models (MLLMs) and to improve their object detection performance in challenging environments.

Publications

1. ReFoCUS: Reinforcement-guided Frame Optimization for Contextual Understanding

Hosu Lee*, Junho Kim*, Hyunjun Kim, Yong Man Ro (* Equal Contribution)

Under Review

2. DIP-R1: Deep Inspection and Perception with RL Looking Through and Understanding Complex Scenes

Sungjune Park*, Hyunjun Kim*, Junho Kim, Seongho Kim, and Yong Man Ro (* Equal Contribution)

Under Review

3. Language-guided Learning for Object Detection Tackling Multiple Variations in Aerial Images

Sungjune Park, Hyunjun Kim, Beomchan Park, and Yong Man Ro

Under Review

4. Look Every Frame All at Once: Video-Mamba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Hosu Lee*, Junho Kim*, Hyunjun Kim, Yong Man Ro (* Equal Contribution)

Under Review

5. SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Junho Kim*, Hyunjun Kim*, Hosu Lee, and Yong Man Ro (* Equal Contribution)

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025

6. Personalized lip reading: Adapting to your unique lip movements with vision and language

Jeong Hun Yeo, Chae Won Kim, Hyunjun Kim, Hyeongseop Rha, Seunghee Han, Wen-Huang Cheng, Yong Man Ro

Proceedings of the AAAI Conference on Artificial Intelligence 39 (9), 9472-9480, 2025

7. CODE: Contrasting self-generated description to combat hallucination in large multi-modal models

Junho Kim*, Hyunjun Kim*, Hosu Lee, and Yong Man Ro (* Equal Contribution)

Advances in Neural Information Processing Systems (NeurIPS), 2024

8. Weather-Aware Drone-View Object Detection via Environmental Context Understanding

Hyunjun Kim, Dahye Lee, Sungjune Park, and Yong Man Ro

IEEE International Conference on Image Processing (ICIP), 2024

9. Integrating Language-Derived Appearance Elements with Visual Cues in Pedestrian Detection

Sungjune Park*, Hyunjun Kim*, and Yong Man Ro (* Equal Contribution)

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2024

10. Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank

Sungjune Park*, Hyunjun Kim*, and Yong Man Ro (* Equal Contribution)

Pattern Recognition (PR), 2024

11. Speaker-adaptive lip reading with user-dependent padding

Minsu Kim, Hyunjun Kim, and Yong Man Ro

European Conference on Computer Vision (ECCV), 2022
