Call For Papers

Document Intelligence (DI 2019) Workshop

at NeurIPS 2019


Business documents are central to the operation of business. Such documents include sales agreements, vendor contracts, mortgage terms, loan applications, purchase orders, invoices, financial statements, employment agreements and many more. The information in such business documents is presented in natural language, and can be organized in a variety of ways, from straight text to multi-column formats and a wide variety of tables. Understanding these documents is challenging because of inconsistent formats, poor-quality scans and OCR, internal cross-references, and complex document structure. Furthermore, these documents often reflect complex legal agreements and reference, explicitly or implicitly, regulations, legislation, case law and standard business practices.

The ability to read, understand and interpret business documents, collectively referred to as “Document Intelligence”, is a critical and challenging application of artificial intelligence (AI) in business. While a variety of research has advanced the fundamentals of document understanding, most of it has focused on documents found on the web, which fail to capture the complexity of analysis and the types of understanding needed across business documents. Realizing the vision of Document Intelligence remains a research challenge that requires a multi-disciplinary perspective spanning not only natural language processing and understanding, but also computer vision, knowledge representation and reasoning, information retrieval, and more -- all of which have been profoundly impacted and advanced by neural network-based approaches and deep learning in the last few years. The topics of interest for the workshop include, but are not limited to, the following:

  • Document modeling and representations
  • Document structure and layout learning
  • Cleansing and image enhancement techniques for scanned documents
  • Information extraction from text and semi-structured documents
  • Linguistic analysis of document content
  • Natural language reasoning and inference
  • Question answering on business documents
  • Semantic understanding of document content
  • Document search and clustering
  • Handwriting recognition in business documents
  • Table identification and extraction from business documents
  • Chart learning and understanding
  • Domain-specific document understanding
  • Knowledge representation for business documents
  • Multi-lingual document understanding methods and frameworks
  • Integrated syntax and semantic approaches for document understanding
  • Transfer learning methods for business document reading and understanding

In addition to invited talks and open discussions on topics related to Document Intelligence, the workshop program will include a poster session, which provides an opportunity to present peer-reviewed work on topics related to Document Intelligence.


We are soliciting submissions of short research, report and vision papers in PDF format, following the NeurIPS 2019 style file, for presentation at the workshop's Poster Session, with the following guidelines:

  • 2-page limit: Abstracts of extended work from papers previously published at top-tier venues, with a focus on aspects related to Document Intelligence; descriptions of datasets for Document Intelligence research; position and vision papers; and papers describing industry, scientific or theoretical challenges.
  • 4-page limit: Original research contributions, or abstracts of papers previously submitted to other venues that are not currently under review elsewhere and not yet published. The research contributions may discuss technical challenges of reading and interpreting business documents and present research results.

The page limits include references and any appendices. The review process is double-blind. The submitted contributions will be peer-reviewed by the Technical Program Committee, and preference will be given to high-quality, original work relevant to the Document Intelligence topics. It is expected that one of the authors of each accepted contribution will register and attend the workshop to present the work, in the form of a poster, in the workshop's Poster Session. Accepted contributions will be made publicly available as non-archival reports, allowing future submissions to archival conferences or journals.

Important Dates

Paper Submission Deadline: September 9, 2019. Extended: Friday, September 13th, 23:59:59 AoE (Saturday, September 14th, 11:59:59 GMT).

Paper Notification Date: October 1, 2019

Workshop Date: December 14, 2019

Contact Information:


Workshop Organizing Committee

Tania Bedrax Weiss (Google)

Paul Bennett (Microsoft)

Nigel Duffy (EY)

Rama Akkiraju (IBM)

Program Committee Chair

Hamid Motahari (EY)

Program Committee Members

Payal Bajaj (Microsoft)

Mohit Bansal (UNC Chapel Hill)

Ken Barker (IBM)

Douglas Burdick (IBM Research)

Jamie Callan (Carnegie Mellon University)

Laura Chiticariu (IBM Watson)

Laura Dietz (University of New Hampshire)

James Fan (Google)

Alicia Fornes (Universitat Autònoma de Barcelona)

Dan Goldwasser (Purdue University)

DooSoon Kim (Adobe)

Yunyao Li (IBM Research)

Ryan McDonald (Google)

Erik T. Mueller (Capital One)

Ashok Popat (Google)

Martin Schall (University of Applied Sciences Constance, Germany)

Vibha Sinha (Facebook)

Luchen Tan (

Dan Tecuci (EY)

Shivakumar Vaithyanathan (Adobe)

Michael Witbrock (The University of Auckland)

Peter Yeh (Nuance)

Richard Zanibbi (Rochester Institute of Technology)