January 10, 2021

Visual-Textual Image Understanding and Retrieval (VTIUR)

Organized in conjunction with ICPR2020, the 25th International Conference on Pattern Recognition, Milan, Italy

The aim of the Visual-Textual Image Understanding and Retrieval (VTIUR) workshop is to gather researchers and practitioners interested in building models capable of understanding visual data, such as images and videos, in order to build intelligent solutions to Computer Vision tasks, such as image analysis, classification, and retrieval.

Visual data may be enhanced with textual information. This pairing raises several challenges at the interface of Computer Vision and Natural Language Processing, where multi-modal tasks such as visual question answering and visual retrieval take place. Recent advances in these tasks have been made possible by Artificial Intelligence techniques that exploit this supplementary source of knowledge. In particular, because understanding how visual and textual information intertwine requires attending to both, Deep Learning techniques make it possible to design complex neural networks that automatically learn hierarchical representations of the available data while also taking its multi-modal nature into account. Furthermore, given the opportunities from both a research and a production point of view, the VTIUR workshop aims to gather people from both academia and industry, in order to stimulate the sharing of recent trends, novel ideas, and applications, and to open up promising new directions of research to be explored in the future.

VTIUR is a joint workshop on:


The workshop will be held virtually on January 10, 2021.

Keynote speakers

Alberto Del Bimbo

Keynote: Garment Recommendation from fashion collections using Memory Augmented Neural Networks

Arnold Smeulders

Keynote: At the end of content-based image search