Workshop Scope
Vision-based AI is reshaping digital health by pairing state-of-the-art imaging hardware with modern machine-learning techniques. By applying deep neural networks to high-resolution modalities such as MRI, CT, and retinal photography, these systems can detect subtle signs of disease before they become clinically apparent. This capability improves diagnostic precision, catching tumors, vascular abnormalities, and degenerative changes at earlier stages, and it relieves clinicians of repetitive image-review tasks, streamlining workflows and freeing more time for patient interaction.
At the same time, combining vision-based AI with large language models (LLMs) is giving rise to multimodal “Vision + LLM” platforms. These hybrid systems fuse visual understanding with natural language processing, enabling automated drafting of detailed clinical reports, context-aware retrieval of related literature, and question-answering interfaces for medical teams. By uniting pixel-level insights with conversational language capabilities, Vision + LLM solutions strengthen decision-making and documentation and pave the way for more holistic, personalized care pathways, turning raw imaging data into actionable knowledge across research, diagnosis, and treatment.
Key Themes:
Early, Precise Diagnosis: AI detects subtle anomalies in X-rays, MRIs, retinal scans, and more—catching disease sooner.
Automated Workflows: Machine-driven image review and report drafting free clinicians to focus on complex care.
Multimodal AI (Vision + LLM): Integrating visual data interpretation with language models to generate richer clinical insights and narratives.
Surgical & Monitoring Support: Real-time imaging guidance in the OR, plus facial-recognition and video-based patient tracking for safety.
Accelerated Research: High-speed, unbiased processing of vast imaging datasets streamlines discovery and treatment development.
Call for Papers
We invite paper submissions on topics including, but not limited to:
Vision LLM for Healthcare
Improving diagnostic accuracy in clinical settings
Improving treatment planning in clinical settings
Interpreting medical images
Generating detailed clinical reports
Medical Image Analysis and Diagnostics
Detecting abnormalities
Improving diagnostic precision
Supporting treatment planning
Real-Time 3D Reconstruction for Medical Endoscopy
Challenges in Endoscopic 3D Reconstruction
Methodological Spectrum
Adaptations for Endoscope Types
Benchmarking & Datasets
Applications in Digital Health
Identifying cancerous lesions at early stages
Wound monitoring through sequential image analysis to track healing progress
Gait analysis via video-based methods to assess patient mobility
Remote patient monitoring using live video streams to continuously track vital signs and patient behaviors
Paper formatting: Papers must be a minimum of 4 pages and may not exceed 8 pages, including all figures and tables, formatted in the official ICCV style. Additional pages are permitted only for references. Please download the ICCV 2025 Author Kit for detailed formatting instructions.
Please follow the ICCV 2025 Author Guidelines and submit your paper through the VADH 2025 submission portal on OpenReview.
Accepted papers will be published in the ICCV 2025 Workshop Proceedings following the ICCV 2025 publication guidelines.
Important Dates
Paper Submission: June 27
Notification to authors: July 9
Camera-ready submission: August 18, 11:59 PM, Pacific Daylight Time
Workshop: October 19
Organizers
Hui Zhang, PhD, Executive Director, Eli Lilly & Co.
Bojian Ho, PhD, Researcher, University of Pennsylvania
Yuanfang Guan, PhD, Professor, University of Michigan
Guangchen Ruan, PhD, Advisor, Engineering, Eli Lilly & Co.