TLDR: Important dates and steps
Phase-1 Deadline: 9th September 2024, 11:59 PM IST
Starting Date: Phase-1 -> 5th August // Phase-2 -> 15th September
Register: https://forms.gle/cVzA1cwyzuvXtzh17
Join Discord Channel: https://discord.com/invite/V2W7gY8DRa
Download the Data: https://codalab.lisn.upsaclay.fr/competitions/19907 -> Participate -> Files -> Public Data
Submit your predictions at: https://codalab.lisn.upsaclay.fr/competitions/19907
Starting Kit (Classification-Based): https://tinyurl.com/ymem4sxk
Starting Kit (Generative-Based) & Submission Guidelines: https://youtu.be/V5-KZNYaAEY?si=kkXmHIOeFyeSWMTJ
First AMA Session: https://youtu.be/BWrDWUpj8j8
Competition Deadline: Phase-1 -> 9th September // Phase-2 -> 22nd October
Win prizes worth INR 1.5 Lakhs
Announcement:
The competition opens on 5th August 2024 at 12:00 Noon IST. The dataset has been released alongside Codalab.
First Tutorial (Generative Model Based): 13th August 2024 (link: https://youtu.be/V5-KZNYaAEY?si=kkXmHIOeFyeSWMTJ)
Second Tutorial (BERT Based Model): 20th August 2024 (link: https://youtu.be/iymEx8uDAUY?si=gWQysuuplm3CHn7p)
First AMA Session: 24th August 2024 (link: https://youtu.be/BWrDWUpj8j8?si=4kbYLp22y5RlmxtA)
Phase-1 Deadline: 9th September 2024 11:59 PM IST
Watch this space for further updates!
Welcome to Datathon@IndoML 2024, sponsored by NielsenIQ. As in previous years, the Datathon will be held in conjunction with IndoML 2024. We invite participation from students as well as early-career professionals. Selected teams will also be invited to IndoML 2024 to present their solutions to leading machine learning researchers from around the world, from both industry and academia.
Task: Attribute-Value Prediction From E-Commerce Product Descriptions
In recent years, e-commerce has seen tremendous growth, with major online retailers offering billions of products and shipping millions of packages daily. However, given the sheer volume of offerings, sellers find it extremely difficult to fill in extensive sets of product attributes, resulting in incomplete product profiles. E-commerce platforms, on the other hand, depend on such structured metadata, typically in the form of attribute-value pairs, for a deeper understanding of the products, for facilitating critical downstream applications such as search, product recommendation, and question answering, and for providing an enhanced customer experience.
Predicting attribute-value pairs from unstructured product descriptions is, therefore, a fundamental challenge for worldwide e-commerce catalogs such as Amazon, Walmart, and Alibaba. In this challenge, your task is to develop a model that automatically predicts attribute-value pairs for a given product description. Along with a short product title, you will be provided details about the store and manufacturer, which you may choose not to use. You will then be required to predict the values for 5 levels of categories (L0 to L4) and the brand for the given product, as detailed below. Please note that the values may or may not appear in the product title. Also, unseen values for the brand/categories might appear in the hidden test data.
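The task above can be read as a small input/output contract: a product title (plus optional store and manufacturer details) goes in, and six attribute values come out. The sketch below is only an illustration of that contract and is not taken from the official starting kits. The field names (`title`, `store`, `manufacturer`), the target keys (`L0` to `L4`, `brand`), the `TokenVoteBaseline` class, and the toy records are all assumptions made for this example; the actual dataset columns and submission format may differ.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Hypothetical target names: five category levels plus the brand.
TARGETS = ("L0", "L1", "L2", "L3", "L4", "brand")


@dataclass(frozen=True)
class Product:
    """Hypothetical input record; store and manufacturer are optional context."""
    title: str
    store: str = ""
    manufacturer: str = ""


class TokenVoteBaseline:
    """Naive baseline: title tokens vote for attribute values seen in training."""

    def __init__(self):
        # votes[target][token] counts the values that co-occurred with the token.
        self.votes = {t: defaultdict(Counter) for t in TARGETS}
        # fallback[target] counts all training values, used when no token matches.
        self.fallback = {t: Counter() for t in TARGETS}

    def fit(self, products, labels):
        for product, label in zip(products, labels):
            tokens = set(product.title.lower().split())
            for target in TARGETS:
                value = label[target]
                self.fallback[target][value] += 1
                for token in tokens:
                    self.votes[target][token][value] += 1
        return self

    def predict(self, product):
        tokens = set(product.title.lower().split())
        prediction = {}
        for target in TARGETS:
            tally = Counter()
            for token in tokens:
                tally.update(self.votes[target].get(token, {}))
            # Fall back to the most frequent training value if nothing matched.
            source = tally if tally else self.fallback[target]
            prediction[target] = source.most_common(1)[0][0]
        return prediction


# Toy records invented for this example only.
train_products = [
    Product("acme cola 330ml can"),
    Product("zest orange juice 1l"),
]
train_labels = [
    {"L0": "food", "L1": "beverages", "L2": "soft drinks",
     "L3": "cola", "L4": "cans", "brand": "acme"},
    {"L0": "food", "L1": "beverages", "L2": "juices",
     "L3": "orange juice", "L4": "cartons", "brand": "zest"},
]

model = TokenVoteBaseline().fit(train_products, train_labels)
print(model.predict(Product("acme cola 1l bottle")))
```

A baseline of this kind can only return values it saw during training, so it cannot handle the unseen brand/category values that may appear in the hidden test data. That limitation is one reason a generative approach may be worth considering alongside a classification-based one.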
Evaluation
The evaluation details will be updated soon.
Competition guidelines
Participants should work in teams of at most 3 members. A Google Form will be circulated for team registration.
Each team should have at least one person from an Indian University or an Indian research lab.
Please join Discord at https://discord.com/invite/V2W7gY8DRa to receive the latest updates, post your doubts, and engage in interactive discussions.
Teams must submit their code/implementation details and a report to be considered for prizes.
The organizers will make the final decision on the prize money and on any modification of the evaluation criteria.
Updated by: Shubhadip Nag (IIT-Kgp), Soumi Das (MPI-SWS)