The Multimedia and Multimodal Intelligence Lab (MMI Lab) focuses on research in multimedia analysis, artificial intelligence, cross-modal learning, and deep learning. Our goal is to develop intelligent systems capable of understanding, learning, and generating semantically meaningful and expressive multimodal content. The modalities we study include, but are not limited to, music, audio, images, video, and text.

A primary focus of our laboratory is music-centric artificial intelligence, including music information retrieval, expressive music performance generation, and music-driven multimodal generation models. In addition, we actively explore emerging topics such as Multimodal Large Language Models (Multimodal LLMs), Generative AI, Affective Computing, and Human-Computer Interaction.