Our lab focuses on advancing machine learning, particularly in the areas of Multimodal Learning, Self-supervised Learning, and Social Artificial Intelligence. Our interests are not limited to these areas, however; we welcome a wide range of machine learning topics that help machines achieve human-level intelligence and beyond. Please refer to our publications for further details.
We build machines that can perceive, reason, and generate by leveraging multiple modalities, such as vision, language, audio, and other sensory signals. By learning the interactions among these modalities, we address complex challenges in real-world environments.
Visual-Language learning / Audio-Visual learning / Multimodal foundation models
We explore self-supervised learning paradigms that enable machines to learn meaningful feature representations from weakly labeled or unlabeled data. By reducing reliance on costly annotations and leveraging large-scale unlabeled data, we develop more efficient and scalable learning systems.
Weakly labeled data / Unlabeled data / Representation learning
We develop machines with social intelligence, enabling them to recognize, interpret, and appropriately respond to human social behaviors. By modeling social dynamics in both verbal and non-verbal communication, we aim to create agents that interact seamlessly with humans across diverse social contexts.
Social perception / Social reasoning / Social AI agents