@ Inha University
Notice
We are looking for self-motivated and passionate students. If you are interested in joining us, please visit the [Join us] page for more information.
Our research focuses primarily on understanding and interactively fusing multimodal information from diverse sources such as images, videos, text, and audio. Specifically, our research topics span various problems in Computer Vision, Natural Language Processing, and Signal Processing, including (but not limited to):
Vision & Language
Scene Understanding (e.g., detection, segmentation)
Cross-modal Retrieval (e.g., text-to-image, audio-to-video)
Video Understanding
Human Behavior Analysis
Knowledge Distillation
Large-scale Foundation Models (e.g., LLMs, LVMs)
Generative Models
Model Robustness
Semi- / Weakly-supervised Learning
Model Debiasing & Fairness
News!
[2024.09] The Multimodal AI Lab has been launched by Prof. Pilhyeon Lee.