Workshop Date: October 20th, 2025, afternoon (1-5pm)
Workshop Location: Honolulu Convention Center, 308 A
1:00-1:15pm Opening Remarks🤗
1:15-1:45pm Invited Talk: Simone Marinai - "Will Large Language Models Make Document Analysis a Solved Problem? Insights from Table and Graphics Understanding"
1:45-2:15pm Invited Talk: Andreas Fischer - "Explainable Document Analysis for Domain Experts: Combining Deep Learning with Rule-Based Methods"
2:15-2:45pm Invited Talk: Marcus Liwicki - "Does our research make ourselves obsolete? Or are we wanted more than ever before?"
2:45-3:00pm Oral Presentation and Best Paper Award! 🏆
3:00-4:00pm Coffee Break and Poster Session (Exhibit Hall II)☕
4:00-4:30pm Invited Talk: Christopher Tensmeyer - "Document Analysis and Document Generation: Two Sides of the Same Coin"
4:30-5:00pm Invited Talk: Naoto Inoue - "Graphic Design Generation Through Document Analysis"
5:00pm Closing Remarks 👋
(listed in random order)
Full Papers (in conference proceedings):
TAP-VL: Text Layout Aware Pretraining for Enriched Vision-Language Models
Jonathan Fhima, Elad Ben Avraham, Oren Nuriel, Yair Kittenplon, Roy Ganz, Aviad Aberdam, Ron Litman
A Survey on Reading Order, Table of Contents, and Structure Extraction in Document Analysis
Simone Giovannini
Structure-aware Contrastive Learning for Diagram Understanding of Multimodal Models
Hiroshi Sasaki
Improved Information Extraction by Leveraging Multi-Hypothesis OCR at Inference Time
Arthur Hemmer, Mickael Coustaty, Nicola Bartolo, Jean-Marc Ogier
DocSemi: Efficient Document Layout Analysis with Guided Queries
Tahira Shehzadi, Ifza Ifza, Didier Stricker, Muhammad Zeshan Afzal
DIVE-Doc: Downscaling foundational Image Visual Encoder into hierarchical architecture for DocVQA
Rayane Bencharef, Abderrahmane Rahiche, Mohamed Cheriet
PRISM: Pruning for Rank-adaptive Interpretable Segmentation Model with Application to Historical Document Multiband Images
Kilian Declercq, Abderrahmane Rahiche, Mohamed Cheriet
Deep Learning-Based Intrusion Detection Systems for Phishing Email Detection: A Short Survey
Axel De Nardin, Silvia Zottin, Claudio Piciarelli, Gian Luca Foresti
Scanned documents forensics: detecting inserted characters through noise and chromatic artifacts
Marina Gardella, Julieta Umpierrez, Antoine Tadros, Seginus Mowlavi, Natalia Bottaioli, Diego Belzarena, Gabriele Facciolo, Roy He, Jean-Michel Morel, Rafael Grompone von Gioi
Text Image Generation for Low-Resource Languages with Dual Translation Learning
Chihiro Noguchi, Shun Fukuda, Shoichiro Mihara, Masao Yamanaka
ZOD: Zero-shot and Out-of-Distribution Detection Dataset for Document Images
Sheikh Talha Uddin, Sankalp Sinha, Shino Sam, Didier Stricker, Muhammad Zeshan Afzal
CTC Transcription Alignment of the Bullinger Letters: Automatic Improvement of Annotation Quality
Marco Peer, Anna Scius-Bertrand, Andreas Fischer
Describe Anything Model for Visual Question Answering on Text-rich Images
Yen-Linh Vu, Duong-Dinh-Thang, Truong-Binh Duong, Anh-Khoi Nguyen, Thanh-Huy Nguyen, Le Thien Phuc Nguyen, Jianhua Xing, Xingjian Li, Tianyang Wang, Ulas Bagci, Min Xu
ChemMiner: A Large Language Model Agent System for Chemical Literature Data Mining
Kexin Chen, Yuyang Du, Junyou Li, Hanqun Cao, Menghao Guo, Xilin Dang, Lanqing Li, Jiezhong Qiu, Guangyong Chen, Pheng-Ann Heng
Quo Vadis Handwritten Text Generation for Handwritten Text Recognition?
Vittorio Pippi, Konstantina Nikolaidou, Silvia Cascianelli, George Retsinas, Giorgos Sfikas, Rita Cucchiara, Marcus Liwicki
CoSMo: A Multimodal Transformer for Page Stream Segmentation in Comic Books
Marc Serra Ortega, Emanuele Vivoli, Artemis Llabrés, Dimosthenis Karatzas
Short Papers:
Towards Reliable and Interpretable Document Question Answering via VLMs
Alessio Chen, Simone Giovannini, Andrea Gemelli, Fabio Coppini, Simone Marinai
Copy-move detection in scanned documents
Marina Gardella, Julieta Umpierrez, Pablo Muse, Rafael Grompone von Gioi, Jean-Michel Morel
EmptyT: Empty Template Recovery from Handwritten-Filled Forms
Natalia Bottaioli, Yanhao Li, Gabriele Facciolo, Jean-Michel Morel
Evaluating Medical School Personal Statements with SCRIBE: A Fine-Tuned Transformer System for Structured Feedback
Cole Krudwig, Faith Kurtyka, Sara Avila, Sean Dore, George (Guy) McHendry, Steven Fernandes
Theatre Chapbooks At Scale: A Statistical Comparative Analysis of Typography
Seginus Mowlavi, Diego Belzarena, Paula Casariego Castinera, Alejandra Ulla Lorenzo, Gregory Randall, Jean-Michel Morel
Demos:
Interactive Document Image Forensics in IPOL
Marina Gardella, Julieta Umpierrez, Antoine Tadros, Seginus Mowlavi, Natalia Bottaioli, Diego Belzarena, Gabriele Facciolo, Roy Y. He, Jean-Michel Morel, Rafael Grompone von Gioi
Evaluating Medical School Personal Statements with SCRIBE: A Fine-Tuned Transformer System for Structured Feedback
Cole Krudwig, Faith Kurtyka, Sara Avila, Sean Dore, George (Guy) McHendry, Steven Fernandes