Foundation Models for Vision Applications (FMVA) 2026

Workshop held in conjunction with the International Conference on Pattern Recognition (ICPR) 2026

Palais des congrès de Lyon, Lyon, France.

August 22, 2026 (08:00 - 14:00)

The Workshop

In recent years, large-scale visual-language models have seen an explosive rise in capability. As general-purpose models, they offer exceptional performance on a range of downstream tasks they were not explicitly trained for. By necessity, the full capability of an architecture is often expressed only in terms of zero-shot performance on benchmark dataset tasks, such as ImageNet classification or ADE20K segmentation. The performance, generality, comprehension and prompt sensitivity of these architectures in specific vision fields, such as biometrics or medical imaging, remain open to further exploration.

Visual-language models have the potential to revolutionise these areas through direct application, novel systems design and explainability, leading to insights into future model development and a more comprehensive understanding of the tasks they are applied to. This workshop aims to provide a platform for discussing the effective utilisation of such architectures across all fields of computer vision, each with its own requirements.

Topics

The workshop topics include (but are not limited to):

Dataset development and curation
Metrics and benchmarking methodologies (performance, robustness, fairness, etc.)
Innovative applications and methodological advances for exploiting visual–language models
Multimodal learning and representation (for example, synergy between vision and language)
Biometric analysis and human–computer interaction
Biomedical imaging and bioinformatics
Vehicular traffic perception and analysis
Image, speech, and video processing
Explainable and privacy-preserving artificial intelligence

