News Update: The winners of the CVPR 2024 iteration of the challenge are announced here. Congratulations to all the teams!
Multimodal Foundation Models (MMFMs) have shown unprecedented performance on many computer vision tasks. However, on specialized tasks such as document understanding, their performance is still underwhelming. To evaluate and improve these strong multimodal models on document image understanding, we harness a large amount of publicly available and privately gathered data (listed in the image above) and propose a challenge. Below, we list all the important details. The challenge runs in two separate phases.
We will award $10K in prizes to the top teams.
For the first phase, we build a comprehensive data suite from publicly available datasets: DocVQA, FUNSD, IconQA, InfographicVQA, TabFact, TextbookVQA, WebSRC, WildReceipt, and WTQ. All of these datasets align with the challenge's goal of document image understanding in specific domains such as tables, receipts, infographics, and document figures. The collection consists of a train and a test set and can be downloaded from the MMFM Data Collection.
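As an illustration, here is a minimal sketch of how the combined collection might be iterated over once downloaded and extracted. The directory layout, file names, and annotation schema below (one sub-folder per source dataset, an `annotations.json` file, an `images/` directory) are assumptions for illustration only, not the official data format:

```python
import json
from pathlib import Path

# Assumed layout (not official): one sub-folder per source dataset
# (e.g. docvqa/, funsd/, ...), each containing an annotations.json
# file and an images/ directory.
DATA_ROOT = Path("mmfm_collection/train")

def iter_samples(root: Path):
    """Yield (image_path, question, answer) triples from every sub-dataset."""
    for dataset_dir in sorted(root.iterdir()):
        ann_file = dataset_dir / "annotations.json"
        if not ann_file.is_file():
            continue
        for rec in json.loads(ann_file.read_text()):
            image_path = dataset_dir / "images" / rec["image"]
            yield image_path, rec["question"], rec["answer"]

# Inspect the first sample of the (hypothetical) unpacked collection.
for image_path, question, answer in iter_samples(DATA_ROOT):
    print(image_path, question, answer)
    break
```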
For the second phase, an alien test set will be released. The intent of this set is to prevent participants from overfitting to the publicly available datasets. It will contain data drawn from a distribution similar to the phase 1 collection, but none of it will come from publicly available sources. The input data will be released on May 20, 2024.
Phase 1: Train and test sets are provided. We encourage participants to download the data from the MMFM Data Collection.
Phase 2: An alien test set will be released after phase 1. Participants will be required to submit their results on this alien test set.
Rules: Please read the full rules here: Challenge Phases and General Rules.
The challenge winner prizes are awarded in collaboration with: