The Multimodal Artificial Intelligence Lab conducts advanced research at the intersection of perception, reasoning, and autonomous action across multiple sensory modalities. We develop intelligent systems that integrate and interpret data from diverse sources to operate effectively in complex, dynamic environments.
Our work includes:
Real-time multimodal sensor fusion and computer vision for robust perception
Advanced acoustic and audio signal processing for accurate modeling, detection, and tracking
Development and fine-tuning of private large language models (LLMs) and agentic AI systems for autonomous planning, adaptive decision-making, and dialogue management
Implementation of retrieval-augmented generation (RAG) and long-term memory architectures to enhance AI capabilities
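To make the retrieval-augmented generation (RAG) item above concrete, the minimal sketch below shows the basic RAG loop: retrieve passages relevant to a query, then assemble them into a prompt for a language model. It is purely illustrative and not the lab's implementation; the `notes` store, the `retrieve` and `build_prompt` helpers, and the word-overlap scoring are hypothetical stand-ins for an embedding-based vector store and a private LLM back end.

```python
# Illustrative RAG sketch (hypothetical names; not the lab's actual system).
from collections import Counter

# Toy in-memory "knowledge base": in practice this would be a vector store
# holding embedded documents or long-term memory entries.
notes = [
    "Sensor fusion combines camera, lidar, and audio streams into one state estimate.",
    "Acoustic tracking estimates a sound source's bearing from microphone arrays.",
    "Agentic planners decompose a user goal into tool calls and intermediate steps.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank stored notes by word overlap with the query (a stand-in for embedding similarity)."""
    q_words = Counter(query.lower().split())
    scored = [(sum((Counter(n.lower().split()) & q_words).values()), n) for n in notes]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [n for score, n in scored[:k] if score > 0]

def build_prompt(query: str) -> str:
    """Prepend retrieved context to the user query before sending it to a language model."""
    context = "\n".join(f"- {passage}" for passage in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using the context above."

if __name__ == "__main__":
    # The assembled prompt would be passed to an LLM; that call is omitted
    # here because no specific model or API is described on this page.
    print(build_prompt("How does acoustic tracking work?"))
```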
By combining state-of-the-art techniques in vision, signal processing, language modeling, and cross-modal fusion, we build AI systems that are context-aware, continuously adaptive, and capable of intelligent interaction with their environment.
We build AI that doesn’t just analyze the world but engages with it intelligently.
We are located at: Room 213, Pangborn Hall, College of Engineering, Physics, and Computing, The Catholic University of America, Washington, DC 20064