Workshop Scope
Vision-based AI is reshaping digital health by pairing state-of-the-art imaging hardware with modern machine-learning techniques. By applying deep neural networks to high-resolution modalities such as MRI, CT, and retinal photography, these systems can detect subtle signs of disease before they become clinically apparent. This capability improves diagnostic precision, catching tumors, vascular abnormalities, and degenerative changes at earlier stages, and it relieves clinicians of repetitive image-review tasks, streamlining workflows and freeing more time for patient interaction.
At the same time, combining vision-based AI with large language models (LLMs) is giving rise to multimodal “Vision + LLM” platforms. These hybrid systems fuse visual understanding with natural language processing, enabling automated drafting of detailed clinical reports, context-aware retrieval of related literature, and question-answering interfaces for medical teams. By uniting pixel-level insights with conversational language capabilities, Vision + LLM solutions strengthen decision-making and documentation and pave the way for more holistic, personalized care pathways, turning raw imaging data into actionable knowledge across research, diagnosis, and treatment.
Key Themes:
Early, Precise Diagnosis: AI detects subtle anomalies in X-rays, MRIs, retinal scans, and more—catching disease sooner.
Automated Workflows: Machine-driven image review and report drafting free clinicians to focus on complex care.
Multimodal AI (Vision + LLM): Integrating visual data interpretation with language models to generate richer clinical insights and narratives.
Surgical & Monitoring Support: Real-time imaging guidance in the OR, plus facial-recognition and video-based patient tracking for safety.
Accelerated Research: High-speed, unbiased processing of vast imaging datasets streamlines discovery and treatment development.
Call for Papers
We invite paper submissions on topics including, but not limited to:
Vision LLM for Healthcare
Improving diagnostic accuracy in clinical settings
Improving treatment planning in clinical settings
Interpreting medical images
Generating detailed clinical reports
Medical Image Analysis and Diagnostics
Detecting abnormalities
Improving diagnostic precision
Supporting treatment planning
Real-Time 3D Reconstruction for Medical Endoscopy
Challenges in Endoscopic 3D Reconstruction
Methodological Spectrum
Adaptations for Endoscope Types
Benchmarking & Datasets
Applications in Digital Health
Identifying cancerous lesions at early stages
Wound monitoring through sequential image analysis to track healing progress
Gait analysis via video-based methods to assess patient mobility
Remote patient monitoring using live video streams to continuously track vital signs and patient behaviors
Paper formatting: Papers must be a minimum of 4 pages and may not exceed 8 pages, including all figures and tables, formatted in the official ICCV style. Additional pages are permitted only for references. Please download the ICCV 2025 Author Kit for detailed formatting instructions.
Please follow the ICCV 2025 Author Guidelines and submit your paper through the VADH 2025 submission portal on OpenReview.
Accepted papers will be published in the ICCV 2025 Workshop Proceedings following the ICCV 2025 publication guidelines.
Important Dates
Paper Submission: June 27
Notification to authors: July 9
Camera-ready submission: August 18, 11:59 PM, Pacific Daylight Time
Workshop: October 19
Organizers
Hui Zhang, PhD, Executive Director, Eli Lilly & Co.
Bojian Ho, PhD, Researcher, University of Pennsylvania
Yuanfang Guan, PhD, Professor, University of Michigan
Guangchen Ruan, PhD, Advisor, Engineering, Eli Lilly & Co.