The workshop is intended for researchers and practitioners from academia and industry around the world who work in document and textual analysis.
Topics covered
This workshop invites high-quality submissions on topics including, but not limited to, the following:
Multimodal Knowledge Graph Construction from documents
Vision-Language Models for Document Understanding
Joint Entity and Relation Extraction from visual and textual content
Structured Document Understanding with LLMs
Multimodal Document Representation Learning
Graph-Based Spatial and Semantic Reasoning in documents
Integration of Knowledge Graphs and Vision Transformers
Multimodal Invoice and Form Analysis
Cross-modal Retrieval in Document Collections
Benchmarks and Datasets for Multimodal Document Understanding
Note: Topics that are purely vision-based (e.g., OCR, table detection, handwriting recognition) or purely NLP-based are better suited to other ICDAR workshops. VINALDO focuses on their intersection.
Relevance to the ICDAR conference and social impact
Robust reading underpins many real-world applications, so encouraging research on this topic is important. The VINALDO workshop will be developed in collaboration with researchers from academia and industry. It addresses information extraction and retrieval from scanned and non-scanned documents using a combination of machine vision and NLP approaches. The half-day workshop builds on the first and second editions of the VINALDO and GLESDO workshops, which we organized at ICDAR’21, ICDAR’23 and ICDAR’24. Participants from both academia and industry will benefit from the speakers' expertise and discover a wide range of research work and applications.
This workshop aims to enrich the ICDAR conference topics by adding an NLP perspective to computer vision methods for analyzing scanned documents. It lets researchers and industry practitioners explore the advantages and disadvantages of using NLP approaches in document image applications.