We invite submissions to the MIV workshop focusing on mechanistic interpretability and understanding of vision models. Papers can be submitted to either a proceedings or a non-proceedings track.
Proceedings Track Instructions: We welcome original submissions to the proceedings track; accepted papers will be published in the CVPR Workshops Proceedings. All accepted papers will be presented as posters, and a selected group will be presented as 10-minute spotlight talks. Submitted papers must follow the CVPR 2025 submission format (template link) and are limited to 4 pages plus an appendix.
Use the Proceedings Track submission link to submit your manuscript.
Non-Proceedings Track Instructions: For the non-proceedings track, we welcome previously published work as well as papers being presented at CVPR. All accepted papers will be presented as posters, and a selected group will be presented as 10-minute spotlight talks. Submitted papers must follow the CVPR 2025 submission format (template link) and may be up to 8 pages plus an appendix.
Use the Non-Proceedings Track submission link to submit your manuscript.
Submission Instructions: We will use OpenReview to manage submissions, following a double-blind review process. All submissions must be anonymized. There will be no rebuttal phase. All accepted papers will be presented as posters at the workshop, and a selected set of submissions from both the proceedings and non-proceedings tracks will be chosen for spotlight presentations. At least one author must be physically present to give a spotlight talk.
Important Dates:
March 10th AOE (extended from March 1st) - Paper submission deadline on OpenReview:
Proceedings Track submission link,
Non-Proceedings Track submission link
March 28th (extended from March 21st) - Paper Acceptance Notification.
April 4th AOE - Camera-Ready Deadline.
Areas of interest include but are not limited to:
Visualizing and Understanding Internal Components of Vision and Multimodal Models:
This involves developing methods for visualizing the internal units of vision models, such as neurons and attention heads.
Scaling and Automating Interpretability Methods:
How can we scale interpretability methods to larger models and beyond toy datasets for practical applications? This includes developing toolkits and interfaces for practitioners.
Evaluating Interpretability Methods:
This involves developing benchmarks and comparing interpretability methods.
Model Editing and Debiasing:
After developing methods for visualizing and understanding the internals of vision models, how can we causally intervene to change model behavior, making it safer, less biased, and better suited to specific tasks?
Identifying Failure Modes and Correcting Them:
Can we visualize the internals of models to find shortcomings of algorithms or architectures? How can we use these findings to improve design choices?
Emergent Behavior in Vision and Multimodal Models:
Using interpretability techniques, what intriguing properties of large vision and multimodal models can we discover? Examples include the entanglement of visual and language concepts in CLIP and controllable linear subspaces in diffusion models.
Representation Similarity and Universality:
Several works have found that representations learned by different model architectures, trained on different datasets, tasks, and modalities, converge. How can we characterize the similarity of these different models?
Understanding Vision Models with Language:
How can we develop methods to use language representations to explain visual representations?
In-Context Learning:
Language models have shown impressive in-context learning and zero-shot capabilities. How can we elicit similar behavior from vision models?
Understanding the Role of Data in Model Behavior:
What role does the training data play in the algorithm a model learns? What biases and other properties can we extract from datasets?