MPaCT

MPaCT Challenge:

English-Marathi Parallel Corpus Creation for Machine Translation

http://www.cfilt.iitb.ac.in/

Contact us: pb@cse.iitb.ac.in https://www.cse.iitb.ac.in/~pb/

CFILT, IIT Bombay, presents the English-Marathi Parallel Corpus Creation for Machine Translation (MPaCT) challenge. This challenge is part of the National Language Translation Mission funded by MeitY. It aims to help and encourage the advancement of Machine Translation technology for Indian languages.

CHALLENGE OVERVIEW

Machine Translation (MT) is arguably one of the most widely used language technologies today, thanks to the popularity of the internet and the globalised economy. Over the last two decades, MT technology has taken significant strides forward due to the adoption and advancement of data-driven approaches to natural language processing. Data is now the key driver of progress in MT. Large volumes of training data, also called parallel corpora, are needed to train the Machine Learning models used in MT. Unfortunately, large parallel corpora are not available for many Indian languages, and this has been the major barrier to progress in MT technology for those languages. To address the data gap for the Marathi language, CFILT, IIT Bombay, is setting up the English-Marathi parallel corpus creation challenge and is opening it up to participants from industry and academic institutions. As part of this challenge, a two-pronged approach to creating a high-quality English-Marathi parallel corpus will be taken:

Translation: We will provide text documents in English, and participants are required to produce high-quality Marathi translations.
Community contribution: Participants are encouraged to contribute parallel data from any domain with the goal of collectively building a large multi-domain English-Marathi parallel corpus.

Submissions will be evaluated on the quality and throughput of the translations by a selection committee set up by CFILT, IIT Bombay, using well-established evaluation metrics, and participants will be ranked accordingly.

What you can gain by participating

A subset of top-ranking participants will be commissioned by CFILT, IIT Bombay, after the successful conclusion of the challenge to build a large English-Marathi parallel corpus for its MT system. They will be compensated for their services as per the norms of the Government of India. Terms and conditions apply.
Parallel data voluntarily contributed by participants during the challenge will be made available to all contributing participants. Terms and conditions apply.

Data Set

The data set to be provided to the participants of the MPaCT challenge comprises text documents in English on various subjects, drawn primarily from the education domain.

English documents: ~10,000 ~~sentences~~ words

Registered participants will be directly sent the link to the data set after verification.

We are thankful to IIT Madras and Prof. Prathap Haridoss for making available the data set for use in the challenge.

Registration

~~Enroll yourself by registering on this link: Register Now!~~
Only registered participants will get access to the data.

Submission

Use the submission portal to submit your work.

The submission portal will open on August 20, 2020 and close at midnight on August 28, 2020.
See the documentation for further instructions about formatting and the submission procedure.

Important Dates

~~Last date for registration: August 12, 2020~~
~~Meeting with registered participants: August 14, 2020; revised to August 16, 2020~~

Questions submitted by registered participants: questions

Minutes of the meeting: minutes

~~Release of data set: August 16, 2020; revised to August 20, 2020~~
~~Opening of submission portal: August 16, 2020; revised to August 20, 2020~~
~~Last date for submission: August 28, 2020~~
Announcement of results: ~~On or before September 15, 2020.~~ ~~Evaluation of submissions is currently on. Results will be announced shortly after the evaluation is complete.~~ Results have been communicated to the participants.

Terms and Conditions

Read the terms and conditions for participating in the challenge here.

The MPaCT challenge is now complete.

Registration Closed
