The Multimodal Artificial Intelligence Lab conducts advanced research at the intersection of perception, reasoning, and autonomous action across multiple sensory modalities. We develop intelligent systems that integrate and interpret data from diverse sources to operate effectively in complex, dynamic environments.
Our work includes:
Real-time multimodal sensor fusion and computer vision for robust perception
Advanced acoustic and audio signal processing for modeling, detecting, and tracking sound sources
Development and fine-tuning of private large language models (LLMs) and agentic AI systems for autonomous planning, adaptive decision-making, and dialogue management
Implementation of retrieval-augmented generation (RAG) and long-term memory architectures that ground model outputs in stored, retrievable knowledge (a minimal sketch of the RAG pattern follows this list)
Integration of these vision, signal-processing, and language-modeling capabilities into context-aware AI robots
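As a rough illustration of the RAG pattern named above, the sketch below shows the basic retrieve-then-generate loop over an in-memory vector store. The `embed` and `generate` functions are hypothetical placeholders (not the lab's models), and the store stands in for a real vector database; the example only demonstrates the data flow, not retrieval quality.

```python
import numpy as np

# Hypothetical stand-ins for an embedding model and an LLM.
def embed(text: str) -> np.ndarray:
    """Placeholder embedding: a pseudo-random unit vector keyed on the text.
    Not a real embedding, so retrieval here is structurally correct but not meaningful."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

def generate(prompt: str) -> str:
    """Placeholder for an LLM call conditioned on the augmented prompt."""
    return f"[LLM answer conditioned on a prompt of {len(prompt)} characters]"

class SimpleRAG:
    """Minimal retrieve-then-generate loop with an in-memory vector store."""

    def __init__(self):
        self.docs: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, doc: str) -> None:
        # Store the document and its embedding (the "long-term memory").
        self.docs.append(doc)
        self.vectors.append(embed(doc))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Rank stored documents by cosine similarity to the query embedding.
        q = embed(query)
        scores = [float(q @ v) for v in self.vectors]
        top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        return [self.docs[i] for i in top]

    def answer(self, query: str) -> str:
        # Augment the prompt with retrieved context before generation.
        context = "\n".join(self.retrieve(query))
        return generate(f"Context:\n{context}\n\nQuestion: {query}")

rag = SimpleRAG()
rag.add("Pangborn Hall houses several engineering laboratories.")
rag.add("Sensor fusion combines measurements from multiple modalities.")
print(rag.answer("Where are the engineering labs?"))
```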
By combining state-of-the-art techniques in vision, signal processing, language modeling, and cross-modal fusion, we build AI systems that are context-aware, continuously adaptive, and capable of intelligent interaction with their environment.
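As a toy illustration of cross-modal fusion, the snippet below fuses two noisy estimates of the same quantity, such as a target bearing reported by a camera and by a microphone array, using inverse-variance weighting (the scalar, static-state special case of a Kalman update). The sensor values and noise levels are invented for the example and do not reflect any particular pipeline.

```python
import numpy as np

def fuse(estimates: np.ndarray, variances: np.ndarray) -> tuple[float, float]:
    """Fuse independent noisy estimates of one quantity by inverse-variance weighting.

    Returns the fused estimate and its variance; the fused variance is never
    larger than the smallest input variance, so adding modalities only helps.
    """
    weights = 1.0 / variances                     # more certain sensors get more weight
    fused_var = 1.0 / weights.sum()
    fused_est = fused_var * (weights * estimates).sum()
    return float(fused_est), float(fused_var)

# Illustrative numbers only: bearing to a target (degrees) from two modalities.
camera_bearing, camera_var = 31.0, 4.0            # vision: lower noise
audio_bearing, audio_var = 35.0, 9.0              # acoustics: higher noise

bearing, var = fuse(np.array([camera_bearing, audio_bearing]),
                    np.array([camera_var, audio_var]))
print(f"fused bearing ~ {bearing:.1f} deg (variance {var:.2f})")
```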
We build AI that doesn’t just analyze the world but engages with it intelligently.
Commitment to Ethics
Our research is guided by the following principles:
1. Fairness and Bias Mitigation
2. Transparency and Explainability
3. Accountability and Governance
4. Privacy and Data Protection
5. Safety and Reliability
6. Human-Centered Values
7. Societal and Environmental Impact
8. Global and Cultural Considerations
We are located at: Room 213, Pangborn Hall, College of Engineering, Physics, and Computing, The Catholic University of America, Washington, DC 20064