Workshop Date: March 6, 2026, afternoon (1:00-5:00pm)
Workshop Location: AZ Ballroom 3-4
Portal Zoom Link: https://reurl.cc/nq8aXd
Poster Boards: 80-91
Tentative program
1:00-1:05pm Opening Remarks🤗
1:05-1:35pm Invited Talk: Mickael Coustaty - "Toward a trustable document analysis system?"
1:35-3:00pm Oral Presentations:
An Empirical Study of Siamese Vision Transformers for Scribe Re-Identification
Alessio Fagioli, Carmine Fabbri, Luigi Cinque, Emanuela Colombi, Gian Luca Foresti
Cross-Lingual Transfer for Complex Scripts: A Benchmark on End-to-End Khmer Scene Text Spotting
Vannkinh Nom, Saly Keo, Souhail Bakkali, Muhammad Muzzamil Luqman, Mickael Coustaty, Jean-Marc Ogier
Adaptive Multi-Scale Feature Fusion for Paragraph-Level Handwritten Text Recognition via Spatial Attention
Baha Edine Harrath, Mohammed Hamdan, Mohamed Cheriet
Gram-Schmidt Feature Reduction for Disentangled Writer Identification in Degraded Historical Documents: Separating Stylistic Signal from Artifact Noise
Kaveh Safavigerdini, Bahram Yaghooti, Amir Erfan Zareei Shams Abadi, Bruno Sinopoli, Kannappan Palaniappan
Advanced Document Understanding via Synthetic Instruction Tuning
Yanlin Zhou, Fuxiao Liu
3:00-3:45pm Coffee Break and Poster Session☕
3:45-4:15pm Invited Talk: George Retsinas - "HTG: Beyond Visual Quality - Diffusion Models, Guidance, and Evaluation"
4:15-4:50pm Oral Presentations:
SynthForm: Towards a DLA-free E2E Form understanding model
Andre Fu, Egil Karlsen
Pseudo Contrastive Learning for Diagram Comprehension in Multimodal Models
Hiroshi Sasaki
SAVIOR: Sample-efficient Adaptation of Vision-Language Models for OCR Representation
Akshata A Bhat, Sharath Naganna, Saiful Haq, Niyati Chhaya, Prashant Khatri, Krishna Chaitanya Reddy Tamataam, Neha Arun
4:50-5:00pm Best paper award! 🏆
5:00pm Closing Remarks 👋
(listed in random order)
Full Papers (in conference proceedings):
🏆SynthForm: Towards a DLA-free E2E Form understanding model
Andre Fu, Egil Karlsen
SAVIOR: Sample-efficient Adaptation of Vision-Language Models for OCR Representation
Akshata A Bhat, Sharath Naganna, Saiful Haq, Niyati Chhaya, Prashant Khatri, Krishna Chaitanya Reddy Tamataam, Neha Arun
Pseudo Contrastive Learning for Diagram Comprehension in Multimodal Models
Hiroshi Sasaki
Cross-Lingual Transfer for Complex Scripts: A Benchmark on End-to-End Khmer Scene Text Spotting
Vannkinh Nom, Saly Keo, Souhail Bakkali, Muhammad Muzzamil Luqman, Mickael Coustaty, Jean-Marc Ogier
An Empirical Study of Siamese Vision Transformers for Scribe Re-Identification
Alessio Fagioli, Carmine Fabbri, Luigi Cinque, Emanuela Colombi, Gian Luca Foresti
Adaptive Multi-Scale Feature Fusion for Paragraph-Level Handwritten Text Recognition via Spatial Attention
Baha Edine Harrath, Mohammed Hamdan, Mohamed Cheriet
Gram-Schmidt Feature Reduction for Disentangled Writer Identification in Degraded Historical Documents: Separating Stylistic Signal from Artifact Noise
Kaveh Safavigerdini, Bahram Yaghooti, Amir Erfan Zareei Shams Abadi, Bruno Sinopoli, Kannappan Palaniappan
Short Papers:
Advanced Document Understanding via Synthetic Instruction Tuning
Yanlin Zhou, Fuxiao Liu
Multimodal Narrative Synthesis in Complex Documents via Omni-Parser Transformer
Tiger Yu
An Autoencoder-Based Orthogonal Nonnegative Matrix Factorization Framework for Blind Multispectral Image Decomposition
Thomas Olive, Abderrahmane Rahiche, Mohamed Cheriet