ICDAR 2024 Competition on Artistic Text Recognition

Update (March 31, 2024)

The images of Test_B are available at Google Drive. The submission channel of Test_B is open at CodaLab.
The final ranking is based on the results of Test_B. The competition will end at 23:59:59 UTC on March 31st.
When the competition ends, the top three teams are expected to submit a technical report and their source code (including checkpoints) within one week. If any cheating is discovered, the corresponding team will be disqualified from the competition.

News (Important)

The images of Test_A are available at Google Drive. The submission channel of Test_A is open at CodaLab.
The competition schedule has been extended: the first phase now ends at 24:00 on March 30.
The submission format has been changed from json to txt. Details can be found in the Task section.

Introduction

Artistic text is widely used in advertisements, slogans, exhibitions, decorations, magazines, and books. However, artistic text recognition remains an overlooked and extremely challenging task despite its importance and practical value in many applications. The main challenges include the varied appearances produced by specially designed fonts and effects, the complex connections and overlaps between characters, and severe interference from background patterns. We therefore organize this competition and invite participants to address these challenges. We hope that the dataset and task will greatly promote research in text recognition.

Quick start guide

Download the training dataset via this link.
Register to participate by sending an email to xdxie@hust.edu.cn.
Submit your entry via the competition server CodaLab-ICDAR24-WordArt.

Registration

We use CodaLab for submissions. The submission site will be ready before March 10. To register, please send an email to xdxie@hust.edu.cn; submissions from unregistered teams are invalid.

Please include the following information in your email:

Names of participants and the corresponding usernames in CodaLab
The username in CodaLab for submission
Affiliation of participants
Team Name

Dataset

This dataset is based on our previously released artistic text recognition dataset WordArt (Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition, ECCV 2022). We therefore name the previous dataset WordArt-V1 and the current dataset WordArt-V1.5. WordArt-V1 comprises 4,805 artistic text images for training and 1,511 images for testing. For WordArt-V1.5, we annotated more artistic text images from more types of scenes: we added 1,195 artistic text images to the training set and constructed a new testing set of 6,000 new images. These images come from posters, greeting cards, covers, billboards, handwriting, etc. As a result, WordArt-V1.5 contains a total of 12,000 images, with 6,000 for training and 6,000 for testing. Representative examples from the dataset are shown in Fig. 1.

Specifically, the dataset is divided into a training set and two testing sets (Test_A and Test_B). The training set consists of 6,000 images with annotations, which can be downloaded from this link. Test_A and Test_B each consist of 3,000 images, whose annotations will not be released. During the competition, only the testing images are provided. Participants are required to submit their results on Test_A and Test_B in the specified format. Using two testing sets helps prevent models from overfitting to a single testing set. Test_A will be available first, and the top-ranked teams on its leaderboard will be selected for evaluation on Test_B. The final ranking is based on the results of Test_B.

Each image in the dataset is annotated with a string. Annotations for the images are stored in a txt file with the following format:

train_image/1157.png EASTER

train_image/1209.png Birthday

……
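The annotation format above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function names are hypothetical, and it assumes that the image path contains no whitespace, that the path and label are separated by whitespace, and that the file is UTF-8 encoded.

```python
from pathlib import Path


def parse_annotation_line(line: str) -> tuple[str, str]:
    """Split one annotation line into (image_path, label).

    Assumption: the path has no whitespace, so everything after the
    first run of whitespace is treated as the label.
    """
    path, label = line.strip().split(maxsplit=1)
    return path, label


def load_annotations(txt_path: str) -> dict[str, str]:
    """Read an annotation txt file into a {image_path: label} mapping."""
    annotations: dict[str, str] = {}
    text = Path(txt_path).read_text(encoding="utf-8")  # encoding is assumed
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines between entries
        path, label = parse_annotation_line(line)
        annotations[path] = label
    return annotations
```

A line that lacks a label raises a ValueError in this sketch, which makes malformed entries visible early rather than silently dropping them.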

Fig. 1. Some examples from our dataset.

Task

The aim of this task is to accurately recognize text content from artistic text images.

Input: Artistic text images such as those shown in Fig. 1.

Output: The text strings corresponding to the images.

Evaluation Protocols: Word recognition accuracy (WRA) is the only evaluation metric for artistic text recognition. It is defined as WRA = Wr / W, where W is the total number of words and Wr is the number of correctly recognized words, ignoring case and symbols.
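For local validation, WRA can be approximated as follows. This is a sketch under stated assumptions, not the official scoring script: "ignoring case and symbols" is interpreted here as lowercasing both strings and keeping only alphanumeric characters, and the function names are hypothetical.

```python
def normalize(text: str) -> str:
    """Lowercase and drop non-alphanumeric characters.

    Assumption: this is one plausible reading of "ignoring case and
    symbols"; the official evaluation may normalize differently.
    """
    return "".join(ch for ch in text.lower() if ch.isalnum())


def word_recognition_accuracy(predictions: list[str], ground_truths: list[str]) -> float:
    """Return WRA = Wr / W over paired predictions and ground truths."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not ground_truths:
        raise ValueError("at least one word is required")
    # Wr: number of words whose normalized forms match exactly
    correct = sum(
        normalize(pred) == normalize(gt)
        for pred, gt in zip(predictions, ground_truths)
    )
    return correct / len(ground_truths)


preds = ["Easter!", "birthday", "helo"]
gts = ["EASTER", "Birthday", "hello"]
print(word_recognition_accuracy(preds, gts))  # 2 of 3 words match
```

Under this normalization, "Easter!" and "EASTER" count as a match, while a single wrong or missing character makes the whole word incorrect.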

Submission Format: Participants must submit a single txt file (answer.txt) containing the results for all test images. This txt file should be compressed directly into a zip file before uploading to CodaLab. The result format in the txt file is:

test_image\new0.png sarcasvh

test_image\new1.png harrington

……
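The submission file can be produced with the Python standard library. The sketch below is illustrative only: the function name and the archive name answer.zip are assumptions, and the backslash in the image paths simply mirrors the example lines above. It places answer.txt at the root of the archive, which is how "directly compressed" is read here.

```python
import zipfile
from pathlib import Path


def write_submission(results: dict[str, str], out_dir: str = ".") -> Path:
    """Write answer.txt and zip it; returns the path to the zip file.

    `results` maps image names exactly as they should appear in the
    file (e.g. "test_image\\new0.png") to predicted strings.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    txt_path = out / "answer.txt"
    lines = [f"{image} {text}" for image, text in results.items()]
    txt_path.write_text("\n".join(lines) + "\n", encoding="utf-8")

    zip_path = out / "answer.zip"  # archive name is an assumption
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps answer.txt at the archive root, not in a subfolder
        zf.write(txt_path, arcname="answer.txt")
    return zip_path


if __name__ == "__main__":
    demo = {
        "test_image\\new0.png": "sarcasvh",
        "test_image\\new1.png": "harrington",
    }
    print(write_submission(demo, "submission_demo"))
```

Before uploading, it is worth checking that the number of lines in answer.txt equals the number of test images.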

Competition Schedule

January 10, 2024: Competition websites are live. Training images and ground truth are available.
March 10, 2024: The images of Test_A are available. The submission channel of Test_A is open.
March 30, 2024: The submission channel of Test_A is closed. The top-10 teams advance to the finals of Test_B.
March 31, 2024: The images of Test_B are available. The submission channel of Test_B is open.
March 31, 2024: The submission channel of Test_B is closed. The final ranking is based on the results of Test_B.
April 1, 2024: Results are announced online.

Competition Rules

The competition is open for participants from both industry and academia. Below we describe the tentative rules for our competition:

Participants can use publicly released datasets (synthetic or real) to train models, but cannot use private data in any phase of the competition. Participants cannot use publicly available pre-trained models.
Each participating team can include at most 5 people from one or more affiliations.
Each person, including mentors, can participate on only one team. No exceptions.
Any cheating, including submitting manually labeled results or registering with false information, will result in disqualification.

Leaderboard

See the competition page for the leaderboard: CodaLab-ICDAR24-WordArt.

Contact

Please contact xdxie@hust.edu.cn or lingerdeng2023@163.com if you have further questions.

Organizers

Xudong Xie, Huazhong University of Science and Technology, xdxie@hust.edu.cn

Ling'er Deng, Huazhong University of Science and Technology, lingerdeng2023@163.com

Zhifei Zhang, Adobe Research, zzhang@adobe.com

Zhaowen Wang, Adobe Research, zhawang@adobe.com

Yuliang Liu, Huazhong University of Science and Technology, ylliu@hust.edu.cn