Artificial Intelligence (AI) refers to the capability of machines to perform tasks that typically require human intelligence, such as learning, reasoning, perception, interaction, and creative content generation. From personalized virtual assistants to self-driving vehicles, AI is reshaping how we live and work.
AI Technology Landscape
What is Multimodal AI?
Multimodal AI refers to machine learning models capable of processing and integrating information from multiple modalities, or types of data. These modalities can include text, images, audio, video, and other forms of sensory input.
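One common way to integrate modalities is late fusion: each modality is encoded separately and the resulting feature vectors are combined into a single joint representation. The sketch below is a minimal, hypothetical illustration of that idea; the toy vectors stand in for the outputs of real text and image encoders, which are not specified here.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit length so no modality dominates the fusion."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def fuse_modalities(text_emb, image_emb):
    """Late fusion: normalize each modality's embedding, then concatenate them."""
    return l2_normalize(text_emb) + l2_normalize(image_emb)

# Toy embeddings standing in for encoder outputs (purely illustrative values).
text_emb = [3.0, 4.0]
image_emb = [0.0, 5.0, 12.0]

joint = fuse_modalities(text_emb, image_emb)
print(len(joint))  # 5
```

A downstream classifier or retrieval model would then operate on the joint vector; richer fusion schemes (cross-attention, early fusion) follow the same principle of mapping modalities into a shared space.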
At the Multimodal AI Lab, we advance AI through three core research themes, each focusing on a unique modality or functional approach:
Spatial Intelligence and Generative Network (SIGN): Computer Vision
Image processing and computer vision
Detection and tracking
Classification and recognition
Waveform Analysis and Vocal Engineering (WAVE): Acoustic Signal Processing
Underwater acoustic signal processing
Speech processing
Seismic signal processing
Audio and waveform signal processing
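A basic building block shared by these waveform tasks is frequency analysis: decomposing a signal into frequency components and identifying the strongest one. As a minimal, hypothetical sketch (a naive DFT on a synthetic tone; real pipelines would use an FFT library):

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Naive DFT: return the frequency (Hz) of the strongest non-DC bin."""
    n = len(samples)
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2):
        # Correlate the signal with a complex sinusoid at bin k.
        s = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(s) > best_mag:
            best_k, best_mag = k, abs(s)
    return best_k * sample_rate / n

# 64 samples of a 1 kHz tone sampled at 8 kHz (bin resolution: 125 Hz).
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(64)]
print(dominant_frequency(tone, rate))  # 1000.0
```

The same spectral view underlies speech features, sonar returns, and seismic traces alike, which is why these tasks sit together under one theme.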
Agentic Understanding and Retrieval Architecture (AURA): Large Language Model
Large Language Models
LLM-based virtual assistant
Agentic AI
Personalized human-like agents
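At its core, an agentic system runs a loop: a planner (in practice an LLM) chooses an action, a tool executes it, and the observation is fed back until the goal is met. The sketch below is a toy illustration of that loop with a rule-based stub standing in for the LLM; the planner, tool names, and `run_agent` helper are all hypothetical.

```python
def calculator(expr: str) -> str:
    """Toy tool: evaluate an arithmetic expression (demo only, not hardened)."""
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def stub_planner(goal, observations):
    """Stand-in for an LLM call: pick a tool, or answer once a result is observed."""
    if observations:
        return ("final_answer", observations[-1])
    return ("calculator", goal)

def run_agent(goal, max_steps=5):
    """Agent loop: plan -> act -> observe, repeated until a final answer."""
    observations = []
    for _ in range(max_steps):
        action, arg = stub_planner(goal, observations)
        if action == "final_answer":
            return arg
        observations.append(TOOLS[action](arg))
    return None

print(run_agent("6 * 7"))  # 42
```

Replacing the stub with an actual LLM call, and the single tool with retrieval, memory, and user-profile tools, yields the personalized assistants described above.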
AI is revolutionizing our world, and Multimodal AI brings us closer to truly intelligent, human-like systems. At our lab, SIGN, WAVE, and AURA represent how we fuse visual, auditory, and cognitive modalities into impactful, robust, and trustworthy AI applications. We’re building the future, one that’s more perceptive, adaptive, and responsibly intelligent.