Datathon@IndoML 2023

TLDR: Important dates and steps


Announcement:

Check this space for further updates!

Welcome to Datathon@IndoML 2023. As in previous years, the Datathon will be held in conjunction with IndoML 2023. We invite participation from students as well as early-career professionals. Selected teams will also be invited to IndoML 2023 to present their solutions to leading machine learning researchers from around the world, from both industry and academia.

Task

Intent detection is commonly treated as a classification task in conversational systems. In this year's Datathon@IndoML, we pose the 'intent recognition' task as a few-shot multi-class classification problem. The training data contains 150 classes, each with 15 utterances. A blind test set is available for testing your model; labels for the test utterances are not provided to participants. To evaluate your model's performance, please proceed to our Codalab page: go to the "Participate" tab, navigate to the "submit/view results" link, and submit to the "Final" phase page.
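As a starting point for the few-shot setting (150 intents, 15 utterances each), a very simple baseline is nearest-centroid classification over bag-of-words features. The sketch below is illustrative only and uses no external libraries; all names are our own, not part of the competition starter code.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Minimal whitespace tokenizer (a real system would do better)."""
    return text.lower().split()


class CentroidIntentClassifier:
    """Few-shot baseline: average the bag-of-words counts of each
    intent's utterances, then predict the intent whose centroid
    overlaps most with the input utterance."""

    def __init__(self):
        self.centroids = {}

    def fit(self, utterances, labels):
        sums = defaultdict(Counter)
        counts = Counter()
        for text, label in zip(utterances, labels):
            sums[label].update(tokenize(text))
            counts[label] += 1
        # Per-class mean token counts (the "centroid" of each intent).
        self.centroids = {
            label: {tok: c / counts[label] for tok, c in bag.items()}
            for label, bag in sums.items()
        }

    def predict(self, utterance):
        tokens = Counter(tokenize(utterance))

        def score(label):
            centroid = self.centroids[label]
            return sum(centroid.get(tok, 0.0) * n for tok, n in tokens.items())

        return max(self.centroids, key=score)
```

With only 15 utterances per class, even such a simple lexical-overlap baseline gives a sanity-check score before moving to fine-tuned neural models.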

Please note that you may skip Phase 1 (the development phase) and submit directly to the final phase from October 2.

Please go through this tutorial to learn how to fine-tune a pre-trained model on the training data.
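The tutorial covers fine-tuning proper; as a library-free stand-in, the sketch below trains only a linear classification head over fixed features, which mirrors the common first step of fine-tuning with a frozen pre-trained encoder. The hashing-trick "encoder" and all names here are illustrative assumptions, not the tutorial's actual code.

```python
import hashlib
from collections import defaultdict

DIM = 1 << 20  # hashed feature dimension


def featurize(text):
    """Fixed 'encoder': hash each token into a sparse vector.
    Stands in for frozen pre-trained embeddings."""
    vec = defaultdict(float)
    for tok in text.lower().split():
        idx = int(hashlib.md5(tok.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec


def train_head(data, labels, epochs=10):
    """Multi-class perceptron over the frozen features: analogous to
    fine-tuning only the classification head on top of an encoder."""
    weights = {label: defaultdict(float) for label in labels}
    for _ in range(epochs):
        for text, gold in data:
            feats = featurize(text)
            pred = max(
                weights,
                key=lambda L: sum(weights[L][i] * v for i, v in feats.items()),
            )
            if pred != gold:  # mistake-driven update
                for i, v in feats.items():
                    weights[gold][i] += v
                    weights[pred][i] -= v
    return weights


def predict(weights, text):
    feats = featurize(text)
    return max(
        weights,
        key=lambda L: sum(weights[L][i] * v for i, v in feats.items()),
    )
```

Fine-tuning the full encoder (rather than just the head) usually helps further, which is what the tutorial's pre-trained-model setup is for.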

Competition timeline

9/08/23 -- Datathon starts.

Aug-Sept -- Tutorials.

14/9/23 -- The surprise dataset will be released and the evaluation leaderboard will open.

September (2nd half) -- Ask Me Anything (AMA) session (Open QA with the organizers).

2/10/2023 -- New dataset will be released for final evaluation.

15/10/2023 -- Competition ends.

16/10/2023 -- Top teams will be announced.

22/10/2023 -- Code and Report submission for the top teams.

21/12/23 - 23/12/23 -- IndoML 2023; top teams will be invited to IndoML 2023 to present their work, and the final results will be declared.

Evaluation

To be eligible for the prizes, teams must submit their code, implementation details, and a 1-page report (format to be provided) explaining their solution.

During the competition, a surprise dataset will be released so that participating teams can evaluate themselves and fine-tune their models for the domain adaptation task. The submitted models will be tested on the new dataset for the final evaluation.

Note: The highest performance metric is not the only criterion for deciding the winners. Teams will be judged on overall performance, the innovativeness of the proposed solution, as well as new findings, if any. The final decision on the winners will be made by the judges at IndoML 2023. Prizes will be awarded in multiple categories.

Competition guidelines
