ELVM
Efficient Large Vision Models

CVPR Workshop (2nd Edition)

Overview

Large Vision Models (LVMs) have reshaped computer vision, demonstrating strong generalization across tasks such as recognition, segmentation, generative modeling, and multimodal reasoning. However, their growing scale comes with substantial computational costs—training requires massive infrastructure, inference remains slow and energy-intensive, and deployment on resource-constrained devices is challenging. While scaling up models has driven impressive performance gains, efficiency has not kept pace. This raises a critical question: Can we achieve similar capabilities with significantly lower computational overhead?

This workshop focuses on the core principles of efficiency in large-scale vision models. True efficiency comes not from simply reducing FLOPs or memory consumption, but from rethinking computation itself. How do we minimize redundant operations in generative models without compromising quality? Can autoregressive decoding and diffusion sampling be accelerated through parallelization? What are the trade-offs between compression, quantization, and expressivity? We seek to advance new directions in compact model representations, adaptive computation, parallel decoding, and structured sparsity—approaches that go beyond incremental optimizations and redefine how LVMs operate.

We invite researchers working on fast and scalable vision architectures, low-cost inference, and efficient generative models to share their insights. Whether through sampling acceleration, efficient transformers, new architectural paradigms, or theoretical limits of model compression, this workshop provides a platform to discuss how LVMs can be optimized for both performance and practicality.


Join us in shaping the next generation of vision models—where efficiency is not just a constraint, but a driving force for innovation.

Sponsors & Partners

Please contact us if you are interested in sponsoring this event!