Research Objective
My main research direction is efficient streaming perception for embodied AI and robotics. Unlike traditional models that operate on complete observations, embodied agents must make real-time, sequential decisions from partial, streaming inputs. This mirrors how humans process visual information continuously rather than in isolated snapshots.
Building on my previous research, I am now focusing on integrating video perception models with Vision-Language Models (VLMs) and extending them to robotics, in active collaboration with Professor Youngwoon Lee.
You can access my full research statement here.