MICCAI 2024
Date: October 6, 2024
Diamant Room at Palmeraie Palace
8:00AM - 12:30PM
Generative AI and large-scale self-supervised foundation models are poised to have a profound impact on human decision making across occupations. Healthcare is one such area, where these models have the capacity to affect patients, clinicians, and other care providers. This tutorial, structured as a combination of lectures and demonstrations, provides participants with a comprehensive guide to harnessing vision and large language models in the healthcare domain. We then describe methodologies tailored to clinical tasks and present application examples from various imaging domains that reflect research interests across the wider MICCAI community.
Medical imaging has always been a challenging field in which to test ideas developed in AI. The latest work in AI centers on foundation models for language, vision, and other modalities. Building healthcare-specific foundation models is relevant to our community: past experience shows that standard deep learning models still require substantial adaptation before they become useful for medical imaging. Learning these techniques in a timely fashion will help MICCAI community members not only accelerate their adoption in our field but also advance the science of AI by articulating the requirements such systems must meet.
This is an emerging topic, with few systematic courses offered at universities. The tutorial presented here is modeled after the speakers' respective courses at their institutions, including CS277/BIODS271 at Stanford University.
The detailed breakdown of topics over the 4-hour tutorial window, including a 30-minute coffee break, is as follows:
1. Introduction to Foundation Models -- 8:00 to 8:40 AM
a. Evolution of machine learning models
b. Definition of Foundation models
c. What makes a model foundational?
d. Examples of foundation models
e. Frameworks: Self-supervised learning, contrastive learning, masked auto-encoders
2. Vision-language models -- 8:40 to 9:20 AM
a. Zero-shot inference for classification (see the first sketch following this schedule)
b. Vision-language models for medical imaging (e.g., embedding domain knowledge)
c. Captioning
3. Fine-tuning foundation models -- 9:20 to 10:00 AM
a. Prompt learning
b. Adapters
c. Linear-probing baselines
d. Parameter-efficient fine-tuning (e.g., low-rank approximation; see the second sketch following this schedule)
e. Transductive inference for VLMs
Coffee break: 10:00 to 10:30 AM
4. Foundation models for segmentation -- 10:30 to 11:10 AM
a. SAM and other foundation models for medical imaging (see the third sketch following this schedule)
b. Generalist vs. domain-specialized?
c. Fine-tuning for segmentation: spatial adapters, parameter-efficient fine-tuning, constrained transductive inference
5. Overview of Vision and Language Models (vLLMs) -- 11:10 to 11:50 AM
a. Expanding Large Language Models to Vision: LLaVA
b. vLLMs in Medicine
c. vLLMs in Pathology
d. vLLMs in Radiology
6. Enhancing vLLM utilization -- 11:50 AM to 12:30 PM
a. Prompting for chest X-ray report generation and disease diagnosis
b. Instruction tuning
c. Retrieval-augmented generation (see the final sketch following this schedule)
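The short Python sketches below give a flavour of the hands-on demonstrations. They are illustrative only: checkpoint names, file paths, class prompts, and coordinates are placeholders rather than the tutorial's actual materials.

The first sketch shows zero-shot classification with a CLIP-style vision-language model (Part 2a): class names are wrapped in text prompts, and the image is assigned to the prompt with the highest image-text similarity.

```python
# Zero-shot classification with a CLIP-style vision-language model.
# The checkpoint, prompts, and image path are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_name = "openai/clip-vit-base-patch32"   # assumed general-domain checkpoint
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(model_name)

# Class names are wrapped in natural-language prompts (simple prompt engineering).
class_names = ["normal chest X-ray", "chest X-ray with pneumonia"]
prompts = [f"a photo of a {c}" for c in class_names]

image = Image.open("example_cxr.png")         # hypothetical input image
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)
    # logits_per_image holds image-text similarities scaled by the learned temperature.
    probs = outputs.logits_per_image.softmax(dim=-1)

for name, p in zip(class_names, probs[0].tolist()):
    print(f"{name}: {p:.3f}")
```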
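The second sketch illustrates parameter-efficient fine-tuning by low-rank approximation (Part 3d): the pretrained weights are frozen and only a small low-rank update is trained. This is a plain-PyTorch illustration of the idea, not the specific recipe used in the tutorial.

```python
# LoRA-style low-rank adaptation of a frozen linear layer (illustrative sketch).
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():       # freeze the pretrained weights
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Frozen path plus trainable low-rank correction B @ A.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Toy usage: wrap one projection layer of a (hypothetical) frozen backbone.
layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")    # only the low-rank factors are trained
```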
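The third sketch shows promptable segmentation with SAM (Part 4a), using Meta's segment_anything package and a single positive click as the prompt. The checkpoint file, image path, and click coordinates are placeholders; as discussed in Part 4c, out-of-the-box SAM usually needs domain adaptation to perform well on medical images.

```python
# Point-prompted segmentation with SAM (illustrative; paths and coordinates are placeholders).
import numpy as np
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")  # assumed local weights
predictor = SamPredictor(sam)

image = np.array(Image.open("example_slice.png").convert("RGB"))
predictor.set_image(image)

# One positive click (label 1) placed roughly inside the structure of interest.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[120, 160]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]           # keep the highest-scoring candidate mask
print("mask shape:", best_mask.shape, "score:", scores.max())
```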
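The final sketch outlines retrieval-augmented generation (Part 6c): candidate report snippets are ranked by embedding similarity to a query, and the top matches are pasted into the prompt that would then be passed to a vision-language LLM. The embedder, toy corpus, and prompt wording are assumptions made for illustration.

```python
# Retrieval-augmented prompt construction (illustrative; corpus and query are toy examples).
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")   # assumed general-purpose embedder

corpus = [
    "No focal consolidation, effusion, or pneumothorax.",
    "Right lower lobe opacity concerning for pneumonia.",
    "Cardiomegaly with mild pulmonary vascular congestion.",
]                                                    # toy knowledge base of prior report sentences
corpus_emb = encoder.encode(corpus, convert_to_tensor=True)

query = "findings suggestive of pneumonia"
query_emb = encoder.encode(query, convert_to_tensor=True)

# Rank snippets by cosine similarity and keep the top two as context.
scores = util.cos_sim(query_emb, corpus_emb)[0]
top_idx = scores.argsort(descending=True)[:2]
context = "\n".join(corpus[int(i)] for i in top_idx)

prompt = (
    "Use the retrieved context to draft the impression of a chest X-ray report.\n"
    f"Context:\n{context}\n\nImpression:"
)
print(prompt)   # this prompt would then be passed to the generator model
```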
Familiarity with machine learning principles at a graduate level is expected of the participants.
To become familiar with the latest foundation models and learn how they are relevant to multimodal medical imaging research.
To gain hands-on experience in using these models for standard tasks in healthcare.
Building a clear understanding of the main strengths and weaknesses of several state-of-the-art approaches and learning how to apply them to a range of medical imaging problems.
Acquiring basic knowledge of how to implement some of these solutions in a case-study example.
Ismail Ben Ayed
Full Professor at ÉTS Montréal
Jose Dolz
Associate Professor at ÉTS Montréal
Julio Silva Rodriguez
Post-doc at ÉTS Montréal
Tanveer Syeda-Mahmood
IBM Fellow, Chief Scientist
Akshay Chaudhari
Assistant Professor at Stanford
Yuyin Zhou
Assistant Professor at University of California, Santa Cruz (UCSC)
Yunsoo Kim
PhD student at UCL
Honghan Wu
Associate Professor at UCL
Jinge Wu
PhD student at UCL
Ismail Ben Ayed
Full Professor at ÉTS Montréal
Jose Dolz
Associate Professor at ÉTS Montréal
Julio Silva Rodriguez
Post-doc at ÉTS Montréal
Yuyin Zhou
Assistant Professor at University of California, Santa Cruz (UCSC)
Tanveer Syeda-Mahmood
IBM Fellow, Chief Scientist
Akshay Chaudhari
Assistant Professor at Stanford
James Zou
Associate Professor at Stanford
Yusuf Abdulle
Research Assistant at UCL
Yunsoo Kim
PhD student at UCL
Honghan Wu
Associate Professor at UCL
Jinge Wu
PhD student at UCL