VisionDocs

1st Workshop on Computer Vision Systems for Document Analysis and Recognition

Overview

In this age of progressive digitalization, the ability to analyze documents in an automated way is gaining increasing importance in our everyday lives. The impact of Document Analysis is growing both in industrial and cultural settings, leading to the need for AI systems capable of analyzing highly diverse documents, characterized by differences in languages, geographical and temporal origins, as well as varied visual appearances, writing styles, and layouts across different domains.

While significant progress has been made in recent years, much of the research still focuses on a narrow subset of documents and tasks, leaving many areas of Document Analysis under-explored.

This workshop aims to encourage the development of new strategies to address the limitations of current systems, such as handling low-data environments, adapting to document classes with highly heterogeneous visual characteristics, and integrating multi-modal inputs for improved performance.

Call for Papers

Research papers are solicited in, but not limited to, the following topic areas:

Document image processing
Physical and logical layout analysis
Text and symbol recognition
Handwriting recognition
Document analysis systems
Document layout analysis
Document classification
Multimedia document analysis
Recognition of tables and formulas
Document forensics and provenance
Medical document analysis
Indexing and retrieval of documents
Document synthesis
Extracting document semantics
Graphics recognition
Structured document generation
Historical document analysis
Document summarization and translation
Document analysis for social good
Multi-modal document analysis
Datasets and benchmarks for document analysis

Keynote Speakers

Vincent Christlein

Leads the Computer Vision group at the Pattern Recognition Lab, Friedrich-Alexander University of Erlangen-Nürnberg (FAU), Germany. He received his diploma and Dr.-Ing. degrees from FAU in 2012 and 2018, respectively. His primary research focus is document analysis, including writer identification and handwriting imitation, alongside environmental projects such as glacier front segmentation and bird detection. In the field of document analysis, his work has earned recognition through several international competition wins and multiple awards.

Rita Cucchiara

Full Professor at the University of Modena and Reggio Emilia in the Department of Computer Engineering and Science since 2005. She directs the AImage laboratory, which brings together more than 40 researchers working in artificial intelligence fields such as computer vision, pattern recognition, deep learning, and multimedia, as well as in application areas such as surveillance, automotive, document analysis, and human-robot interaction. Since 2015 she has been an Advisory Board Member of the Computer Vision Foundation (CVF), as PC of ICCV 2017 and GC of CVPR 2024, and she is on the Governing Board of the European Computer Vision Alliance as GC of ECCV 2022. She has been president of the Italian association CVPL, has published more than 400 articles, and has been a keynote speaker at numerous scientific conferences. In the field of document analysis, her work has resulted in many publications; she has also organized workshops on the topic and created several challenges.

Silvia Cascianelli

Assistant Professor at the University of Modena and Reggio Emilia, Italy. She received her European PhD degree in Industrial and Information Engineering from the University of Perugia in 2019. In 2018, she was a Visiting Researcher at Queen Mary University of London, working on Natural Language Processing and Computer Vision approaches for robotic applications. She has co-authored more than 50 scientific papers in journals and international conferences and serves as AE for the IEEE RA-L, Diversity Chair for ACM-MM 2022, and AC for ECCV and BMVC 2024 and CVPR 2025. Her research interests concern Document AI, Generative AI, Multimedia, and Deep Learning for Digital Humanities and Cultural Heritage.

Submission

We invite researchers to submit their original and unpublished work related to the workshop's theme. Authors can submit either full papers (max 8 pages + references) or short papers (max 4 pages + references), following the WACV 2025 formatting guidelines. Accepted full papers will be published in the WACV 2025 workshop proceedings.

All submissions should be prepared for double-blind review and use the standard WACV 2025 main conference template.

Accepted papers will be presented during the workshop as oral presentations or posters.

Submission site: https://cmt3.research.microsoft.com/VISIONDOCS2025

Important Dates

Paper submissions: 9 December, 2024 23:59 PST (extended from 6 December, 2024 23:59 PST)
Author Notification: 19 December, 2024 23:59 PST
Camera-ready: 15 January, 2025 23:59 PST (extended from 10 January, 2025 23:59 PST)
Workshop date: 4 March, 2025 afternoon

Partial financial support was received from the Strategic Departmental Plan on Artificial Intelligence, DMIF, University of Udine.
