Datathon'23 is now underway. Check it out here
Winners of Datathon'22
1st: Team Krispy (Kristopher Paul)
2nd: Team Comrades (K Vikas Mahender, Mukund Verma T)
3rd: Team AINE (Gunda Venkata Shanmukha Sainath)
4th: Team Cake (Bhomic Kaushik, Arnav Kumar Jain, Kinshuk Vasisht)
4th: Team noobz_ (Kshitij Mohan, Palak Goel, Madhav Mathur)
Welcome to Datathon@IndoML 2022. As in previous years, the Datathon will be held in conjunction with IndoML 2022. We invite participation from students as well as early-career professionals, with up to 1 lakh in prize money to be won. Selected teams will also be invited to IndoML 2022 to present their solutions to leading machine learning researchers from around the world, from both industry and academia.
Registration is now open!
Join this group for competition announcements and to ask questions
Given a set of grayscale document images, the task is to classify each image into one of 16 classes (document types). The training dataset consists of 16,000 images, with 1,000 images belonging to each class. Example images are provided below for some classes. The dataset was collected from the RVL-CDIP dataset [1].
Image: https://www.cs.cmu.edu/~aharley/rvl-cdip/
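Because the training set described above is balanced (1,000 images for each of the 16 classes), a team may want to hold out a validation subset that keeps the same balance. The following is a minimal sketch of one way to do that using only the Python standard library. The function name `stratified_split`, the 10% validation fraction, and the placeholder label list are illustrative assumptions, not part of the competition materials; the actual label file format is not specified here.

```python
import random
from collections import defaultdict

NUM_CLASSES = 16         # from the task description
IMAGES_PER_CLASS = 1000  # from the task description


def stratified_split(labels, val_fraction=0.1, seed=0):
    """Split sample indices into train/validation lists, class by class.

    Hypothetical helper: each class contributes the same fraction of its
    samples to the validation list, so class proportions are preserved.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_val = int(round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return train_idx, val_idx


# Placeholder labels standing in for the real annotations (an assumption).
labels = [c for c in range(NUM_CLASSES) for _ in range(IMAGES_PER_CLASS)]
train_idx, val_idx = stratified_split(labels, val_fraction=0.1)
print(len(train_idx), len(val_idx))
```

The returned index lists can then be used to select image paths for whichever training framework a team prefers.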
05/08/22 10/08/22 -- Datathon starts (Registration opens & Dataset release)
12/08/22 19/08/22 -- First Ask Me Anything (AMA) session (Open Q&A with the organizers)
16/09/22 -- Second AMA session
07/10/22 09/10/22 -- Submission deadline (submission of code and a short report explaining the solution)
05/11/22 -- Declaration of top-10 teams
15/12/22 - 17/12/22 -- IndoML 2022; the top-10 teams will be invited to IndoML 2022 to present their work before the judges, and the final results will be declared
To be eligible for the prizes, teams must submit their code, implementation details, and a 1-page report (format to be provided) explaining the solution.
The submitted models will be evaluated on a private, held-out test set, which will be made public after the competition ends. During the competition, a test set will be released on Kaggle so that participating teams can evaluate themselves and tune their models.
Note: Highest accuracy is not the only criterion on which the winners will be decided. Teams will be judged on overall performance, the innovativeness of the proposed solution, and any new findings. The final decision on the winners will be made by the judges at IndoML 2022. Prizes will be awarded in multiple categories.
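Since accuracy is one of the criteria mentioned above, teams may want to track it locally while tuning. The sketch below assumes plain top-1 accuracy (the fraction of images whose predicted class equals the true class); the exact metric implementation used by the organizers is not specified here. The function names and the toy label lists are illustrative only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for every class present in y_true."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in sorted(totals)}


# Toy example with three classes (made-up values, not competition data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(accuracy(y_true, y_pred))
print(per_class_accuracy(y_true, y_pred))
```

A per-class breakdown can show which document types a model confuses, which may be useful material for the report.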
Participants should work in teams of at most 3 members. A Google Form will be circulated for registration of the teams. Once a team registers on Kaggle, no further changes will be allowed.
Each team should have at least one person from an Indian University or an Indian research lab.
Submission of code/implementation details and report is mandatory to be considered for prizes.
The organizers will make the final decision on the prize money and on any modification of the evaluation criteria.
Registration form (All teams are requested to fill this form in order to be eligible for prizes)
[1] A. W. Harley, A. Ufkes, K. G. Derpanis, "Evaluation of Deep Convolutional Nets for Document Image Classification and Retrieval," in ICDAR, 2015.