WINNERS - TOP 5 TEAMS (SCOREBOARD)
Rank 1:
Team Name: Pre-Par-e
Affiliation: Vellore Institute of Technology, Chennai
Score: 77.26
Rank 2:
Team Name: Tokenwise
Affiliation: IGCAR, HBNI, Chennai
Score: 53.11
Rank 3:
Team Name: Ezhuthani
Affiliation: Velammal College of Engineering and Technology, Madurai; Thiagarajar College of Engineering, Madurai
Score: 50.08
Rank 4:
Team Name: Inkognito
Affiliation: Madras Institute of Technology, Chennai
Score: 42.45
Rank 5:
Team Name: ElecrtoNexus
Affiliation: Mar Baselios College of Engineering and Technology, Thiruvananthapuram, Kerala; APJ Abdul Kalam Technological University, Thiruvananthapuram, Kerala
Score: 34.74
In India, a vast number of official documents, especially application forms, are still filled out by hand. Manually processing these forms is time-consuming, error-prone, and inefficient, highlighting the need for an intelligent system that can accurately extract, interpret, and validate handwritten data. These handwritten forms, often written in English but exhibiting diverse handwriting styles and ink types, pose a significant challenge. The aim of the DeHaDo-AI Challenge is to develop robust AI models capable of accurately recognizing handwritten English text from scanned application forms filled in by Indian citizens. The challenge focuses on handling diverse handwriting styles, varying image quality, and missing or incomplete fields. Through this initiative, researchers, AI practitioners, and developers will have the opportunity to create state-of-the-art models that enhance handwritten character recognition accuracy, field validation, and automation. These models will ultimately reduce human effort and improve data integrity in real-world applications.
The DeHaDo-AI Challenge supports the development of AI-driven solutions for handwritten application forms in real-time settings. The main significance of this challenge is outlined below:
- Boosting Documentation in Public Sector Workflows: Large organizations handle millions of handwritten documents annually, including recruitment applications, legal agreements, financial applications, and HR records. Automating this process through AI-driven solutions can significantly reduce processing time, enhance accuracy, and improve workflow efficiency across global enterprises.
- Enhancing ICR Accuracy: Unlike printed text recognition, handwritten text poses challenges due to variations in handwriting styles, ink quality, and document conditions. This challenge fosters innovation in ICR (Intelligent Character Recognition) and NLP (Natural Language Processing) models to improve accuracy, making document processing more efficient for enterprises and large-scale operations.
- Ensuring Data Completeness & Validation: Many real-world applications require form completeness verification to detect missing or incorrect entries. DeHaDo-AI focuses on automated field validation, ensuring compliance with corporate and regulatory standards to reduce errors in MNC operations, financial records, and customer data processing.
By addressing these challenges, the DeHaDo-AI Challenge encourages researchers, AI practitioners, and industry experts to develop cutting-edge solutions that will shape the future of handwritten document processing in both corporate and public sectors.
Teams should consist of a maximum of four members each, with one member designated as the team lead for communication purposes:
- The dataset will be accessible only to teams that have completed the registration process.
- Participants are required to submit their algorithm's source code, written in Python and adequately commented. Teams must also provide a comprehensive written summary of their approach and algorithms. In addition, participants must disclose the inference time of their code, which is used as an evaluation metric, and provide details of the system specifications on which their code was developed (a minimal timing sketch is given after this list).
- Fair practice is essential. Violations may lead to the disqualification of the entire team.
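Because inference time must be reported, teams need a consistent way to measure it. The snippet below is a minimal, non-authoritative sketch of timing a placeholder run_inference function over a folder of scanned forms; the function name, image extension, and folder path are illustrative assumptions only.

# Minimal sketch for measuring average inference time per scanned form.
# run_inference is a placeholder for the team's own prediction function.
import time
from pathlib import Path

def run_inference(image_path: str) -> list[dict]:
    """Placeholder: return [{"text": ..., "bbox": [...]}, ...] for one image."""
    return []  # replace with the team's actual prediction code

def average_inference_time(image_dir: str) -> float:
    """Return the mean wall-clock inference time (seconds) per image."""
    image_paths = sorted(Path(image_dir).glob("*.png"))  # adjust the extension as needed
    start = time.perf_counter()
    for path in image_paths:
        run_inference(str(path))
    elapsed = time.perf_counter() - start
    return elapsed / max(len(image_paths), 1)

if __name__ == "__main__":
    print(f"Average inference time per image: {average_inference_time('data/sample_input'):.3f} s")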
Sample input image (the scanned form image is not reproduced in this text version)
Sample predicted output (JSON):
[
  {
    "text": "Guardian Fire Security Service",
    "bbox": [274, 67, 503, 86]
  },
  {
    "text": "M. Deepika",
    "bbox": [274, 94, 352, 113]
  },
  {
    "text": "Murugan",
    "bbox": [274, 119, 344, 138]
  }
]
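As an illustration only, the following sketch shows how a list of recognized text and bounding-box pairs for one image could be serialized into the JSON format above, with one output file per input image; the function name, output folder, and file-naming scheme are assumptions rather than part of the official specification.

# Minimal sketch: write one JSON prediction file per image in the format shown above.
# The predictions list is a placeholder for a model's actual output.
import json
from pathlib import Path

def save_predictions(image_name: str, predictions: list[dict], out_dir: str = "outputs") -> None:
    """predictions: [{"text": str, "bbox": [x1, y1, x2, y2]}, ...]"""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    out_path = Path(out_dir) / f"{Path(image_name).stem}.json"
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(predictions, f, ensure_ascii=False, indent=4)

# Example usage with values mirroring the sample above:
save_predictions("form_001.png", [
    {"text": "Guardian Fire Security Service", "bbox": [274, 67, 503, 86]},
    {"text": "M. Deepika", "bbox": [274, 94, 352, 113]},
])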
The DeHaDo-AI Challenge dataset consists of scanned handwritten application forms with varying handwriting styles, image qualities, and structural layouts. The dataset is designed to test the robustness of AI models in text recognition, field validation, and completeness verification.
Participants must not use any paid APIs for handwritten text recognition.
Participants are required to train their own models using deep learning, large language models (LLMs), or similar approaches.
Model inference must be run in Python (a minimal loading-and-inference sketch using PyTorch is given below). You may use any of the following libraries for training the model:
- PyTorch
- TensorFlow
- Keras
- Hugging Face Transformers
- JAX
- FastAI
- ONNX (for deployment)
The results must be reproducible.
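To make the reproducibility requirement concrete, here is a minimal, non-authoritative sketch of seeding the random number generators and running a trained PyTorch model on one preprocessed form image. The FormRecognizer class, the checkpoint path model/handwritten_model.pth, and the grayscale preprocessing are placeholders; teams would substitute their own architecture and pipeline (any of the listed libraries may be used instead).

# Minimal sketch: deterministic loading and single-image inference with PyTorch.
# FormRecognizer and the checkpoint path are placeholders for a team's own model.
import random

import numpy as np
import torch
from PIL import Image
from torchvision import transforms

def set_seed(seed: int = 42) -> None:
    """Fix random seeds so that reported results can be reproduced."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

class FormRecognizer(torch.nn.Module):
    """Placeholder architecture; replace with the actual model definition."""
    def __init__(self):
        super().__init__()
        self.backbone = torch.nn.Identity()

    def forward(self, x):
        return self.backbone(x)

def load_model(checkpoint_path: str = "model/handwritten_model.pth") -> torch.nn.Module:
    """Load trained weights (a state dict) into the model and switch to eval mode."""
    model = FormRecognizer()
    model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    model.eval()
    return model

def predict(model: torch.nn.Module, image_path: str) -> torch.Tensor:
    """Preprocess one scanned form and run a forward pass without gradients."""
    preprocess = transforms.Compose([transforms.Grayscale(), transforms.ToTensor()])
    image = preprocess(Image.open(image_path)).unsqueeze(0)  # add a batch dimension
    with torch.no_grad():
        return model(image)

if __name__ == "__main__":
    set_seed(42)
    model = load_model()
    output = predict(model, "data/sample_input/form_001.png")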
Submission Requirements: Participants must provide the trained model and the testing script to allow for reproduction of results.
Code and Model Ownership: The submitted code and model will be owned by the organizers.
Evaluation Results: Validation and test phase results will be shared with participants via email; there will be no public leaderboard or online result checking.
Transparency: These guidelines help ensure a fair and consistent evaluation process for all participants.
Each participating team must submit its solution in the format below, emailed as a zipped file to ncvpripg2025dehadoai@gmail.com with the following specifications:
The final submission should include:
- The Python inference script
- The trained model file(s)
- Any required utility/helper scripts
- A clear README with instructions for running the code
Naming format: TeamName_DeHaDo-AI_Challenge.zip
Submission through the designated portal (to be announced).
Recognition Output Format: The text extracted from handwritten forms should be converted into a structured JSON format that includes both the recognized text and the corresponding coordinates. Each image should have an individual JSON output file.
Model Architecture & Code: The AI model implementation, including training scripts and the inference pipeline. Teams retain ownership of their code, and evaluation results will only be used with explicit consent.
Technical Report: A document outlining the approach, methodology, and performance analysis.
Evaluation Metrics Report: A summary of text recognition accuracy, field validation performance, and computational efficiency (an illustrative accuracy-metric sketch follows this list).
Executable Demo (Optional): A working prototype or API demonstrating real-time handwritten form processing.
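The official accuracy metric is defined by the organizers; purely as an illustration, the sketch below computes character error rate (CER), a common measure of handwritten text recognition accuracy, using a self-contained edit-distance implementation. Treat the metric choice and the example strings as assumptions, not the challenge's scoring formula.

# Illustrative only: character error rate (CER) between predicted and reference text.
# The official evaluation metric may differ; this is a common reference implementation.
def levenshtein(a: str, b: str) -> int:
    """Edit distance between strings a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def cer(predicted: str, reference: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not reference:
        return 0.0 if not predicted else 1.0
    return levenshtein(predicted, reference) / len(reference)

print(cer("M. Deepika", "M. Deepika"))           # 0.0 (exact match)
print(round(cer("M. Deepka", "M. Deepika"), 3))  # one missing character -> 0.1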
*Note on the model/ folder
We have provided a model/ folder containing sample trained model files to help participants test their inference pipeline. You can use any one of the provided models for initial testing; this is sufficient to verify your code structure and functionality.
During final submission, you must replace the sample model with your own trained model, ensuring that the code can reproduce your submitted results using this model.
Sample Folder Structure for Submission:
submission/
│
├── model/
│   ├── handwritten_model.pth       # Sample trained model (PyTorch format)
│   ├── handwritten_model.h5        # OR TensorFlow/Keras format (if applicable)
│   └── handwritten_model.onnx      # Optional: exported ONNX model
│
├── src/
│   ├── inference.py                # Main Python script for running inference
│   ├── utils.py                    # Helper functions (e.g., image preprocessing)
│   └── model_architecture.py       # Model architecture definition
│
├── data/
│   └── sample_input/               # Optional: sample test data
│
├── README.md                       # Instructions to reproduce the results
└── requirements.txt                # Python dependencies
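To show how the pieces in this tree might fit together, below is a minimal, hypothetical skeleton of src/inference.py that walks an input folder and writes one JSON prediction file per image. The helper functions, default paths, and command-line flags are illustrative assumptions, not part of the official template.

# Hypothetical skeleton of src/inference.py: batch inference over a folder of scanned
# forms, writing one JSON file per image. load_model/predict_fields are placeholders
# for code that would normally live in utils.py and model_architecture.py.
import argparse
import json
from pathlib import Path

def load_model(model_path: str):
    """Placeholder: load and return the trained model."""
    return None  # replace with actual model loading

def predict_fields(model, image_path: Path) -> list[dict]:
    """Placeholder: return [{"text": ..., "bbox": [x1, y1, x2, y2]}, ...] for one image."""
    return []  # replace with actual recognition output

def main() -> None:
    parser = argparse.ArgumentParser(description="DeHaDo-AI inference (illustrative skeleton)")
    parser.add_argument("--input", default="data/sample_input", help="folder of scanned forms")
    parser.add_argument("--output", default="outputs", help="folder for per-image JSON files")
    parser.add_argument("--model", default="model/handwritten_model.pth", help="trained model file")
    args = parser.parse_args()

    model = load_model(args.model)
    out_dir = Path(args.output)
    out_dir.mkdir(parents=True, exist_ok=True)

    for image_path in sorted(Path(args.input).glob("*.*")):
        predictions = predict_fields(model, image_path)
        with open(out_dir / f"{image_path.stem}.json", "w", encoding="utf-8") as f:
            json.dump(predictions, f, ensure_ascii=False, indent=4)

if __name__ == "__main__":
    main()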
Participants may use the provided fact sheet template to submit their results during the final phase of the competition.
The top five teams will be invited to present their solutions in a dedicated session at NCVPRIPG 2025 and will also have the opportunity to contribute to a paper summarizing the challenge outcomes, which will be submitted to the NCVPRIPG 2025 proceedings.
Winner: ₹20,000
First Runner-up: ₹15,000
Second Runner-up: ₹10,000
Collaboration on writing the summary paper
VIT Chennai
Anna University Chennai
TCE Madurai
Couger Inc., Japan
VIT Chennai
Anna University, MIT Campus, Chennai
VIT Chennai
VIT Chennai