Motivation
Multimodal Artificial Intelligence (AI) is an emerging field in high-performance computational sciences that integrates multiple data streams—such as text, images, videos, audio, and numerical data—to improve information extraction and inference accuracy while reducing bias. By synthesizing diverse data sources, multimodal AI provides a more comprehensive representation of complex physical, medical, and societal processes.
In mission-critical domains like healthcare, multimodal AI has the potential to revolutionize medical analytics, improving disease detection, prediction, diagnosis, risk stratification, referrals, and clinical decision-making. Because modern healthcare systems generate vast and diverse datasets—including medical reports, clinical notes, radiology images, physician dictations, patient audio recordings, physiological signals, and genomic data—AI models must effectively process and integrate these different data types, much as the human brain synthesizes multiple sensory inputs for decision-making.
Despite these advantages, multimodal AI faces challenges in aligning data across modalities, which requires well-annotated datasets and advanced embedding techniques. Additionally, integrating domain-specific medical knowledge is crucial for producing clinically meaningful interpretations.
This workshop will showcase the latest advancements in multimodal AI-driven biomedical research and healthcare applications, addressing key challenges, emerging solutions, and future opportunities.