Call for Papers
The organisers are pleased to announce a new shared task, inviting participants to contribute novel systems for a Multilingual Lexical Simplification Pipeline. This task comprises lexical complexity prediction and lexical simplification, uniting these two core simplification tasks into a single pipeline. We invite participants to develop new lexical simplification systems for these two tasks in a variety of high- and low-resource languages (listed below).
This shared task will be hosted at the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024), which will be co-located with the 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024) in Mexico City on June 21-22, 2024.
Lexical complexity prediction was previously explored in the LCP 2021 shared task, hosted as part of SemEval 2021 (Shardlow et al., 2021). The task requires participants to judge the difficulty of a given target word within its context on a continuous scale from 0 (easy to understand) to 1 (hard to understand).
Lexical simplification has also recently been explored at the TSAR 2022 shared task (Saggion et al., 2022), hosted as part of the Text Simplification, Accessibility and Readability Workshop at EMNLP 2022. In this task, systems must provide easier-to-understand alternatives for a given identified complex word in its context.
The lexical simplification pipeline unites these two tasks. Given a sentence with a marked target word, the system must first predict the complexity of that target and then provide potential simpler alternatives for it, or return the target itself if no simpler alternative can be found. By developing systems that jointly perform these tasks, participants will create a working lexical simplification pipeline that can be applied in settings such as education to improve the readability of texts for learners, with applications beyond the scope of the task.
Languages
We will provide evaluation data for the following languages:
English (en)
Spanish (es)
French (fr)
Brazilian Portuguese (pt-br)
Bengali (bn)
Sinhala (si)
Filipino (fil)
Japanese (ja)
Italian (it)
Catalan (ca)
German (de)
Participants are free to submit to one or more languages. We strongly encourage submissions from multilingual systems that are capable of handling the languages we have released, as well as further languages beyond the scope of the task. We will provide a separate ranking for multilingual systems that participate in all languages.
Dataset Format
There is now a glut of available resources for simplification tasks such as lexical complexity prediction and lexical simplification. As such, each language will be provided with an unlabelled test set only, comprising 570 instances. Labelled trial data will also be released, comprising 30 instances per language, for the purpose of calibrating systems for the evaluation phase. We will not release new training data for this task. Participants are encouraged to make use of the many existing resources for lexical complexity prediction and lexical simplification to train their systems. A list of available resources will be hosted on the shared task website.
Each data instance in the trial data will comprise the following fields: language, target, begin, end, context, complexity, substitutions. These are described below:
Language: The language code for this instance
Target: The identified (whole-word) target to be evaluated
Begin: The begin-offset in Unicode code points of the target in the context
End: The end-offset in Unicode code points of the target in the context
Context: The context in which this target appeared. Typically, but not necessarily, bounded by the enclosing sentence.
Complexity: A complexity score bounded in the range 0-1 derived from asking 10 annotators to judge the target in its context on a scale of 1 (easy) to 5 (difficult).
Substitutions: A list of no more than 10 substitutions ranked by frequency of suggestion by the annotators.
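To illustrate how the complexity field relates to the annotator judgements, the sketch below maps the mean of 1-5 ratings onto the 0-1 scale with a linear min-max transform. This aggregation is an assumption on our part (the task description does not fix the exact formula), and the helper name is illustrative:

```python
def normalise_complexity(ratings):
    """Map mean of 1 (easy) to 5 (difficult) ratings onto the 0-1 scale.

    Assumes a linear min-max transform; the official aggregation
    may differ.
    """
    mean = sum(ratings) / len(ratings)
    return (mean - 1) / 4

# Ten annotators who all answer 1 yield complexity 0.0;
# ten annotators who all answer 5 yield 1.0.
```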
Each data instance in the test data will comprise the following fields: language, target, begin, end, context. Participant systems will provide the ‘complexity’ and ‘substitutions’ fields in the same format as the trial data.
Evaluation
For Lexical Complexity Prediction, we will evaluate using:
Root Mean Squared Error (RMSE), calculated between the system outputs for lexical complexity and the values returned by the annotators. See Shardlow et al. (2021) for details.
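For reference, RMSE over a set of predictions can be sketched as follows. This is a minimal stand-in to show the computation, not the official evaluation script:

```python
import math

def rmse(predictions, gold):
    """Root Mean Squared Error between predicted and gold complexity scores."""
    assert len(predictions) == len(gold)
    total = sum((p - g) ** 2 for p, g in zip(predictions, gold))
    return math.sqrt(total / len(gold))

# Perfect predictions give 0.0; scores off by a constant d give d.
```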
For Lexical Simplification, we will use two metrics defined in Saggion et al. (2022) as follows:
MAP@k compares a ranked list of system-generated substitutes against the set of gold-standard substitutes, taking into account the position of the relevant substitutes among the first k generated candidates.
Accuracy@k@top1, which is the percentage of instances where at least one of the top-k ranked substitutes matches the most frequently suggested substitute in the gold data.
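The per-instance versions of these metrics can be sketched as below. This is a simplified illustration under our own reading of Saggion et al. (2022); the official TSAR evaluation script remains the definitive implementation:

```python
def average_precision_at_k(candidates, gold, k):
    """AP@k for one instance: `candidates` is a ranked list of
    system substitutes, `gold` the set of acceptable substitutes."""
    if not gold:
        return 0.0
    hits, score = 0, 0.0
    for i, cand in enumerate(candidates[:k]):
        if cand in gold:
            hits += 1
            score += hits / (i + 1)  # precision at this rank
    return score / min(len(gold), k)

def accuracy_at_k_top1(candidates, gold_top1, k):
    """1 if the most frequently suggested gold substitute appears
    among the top-k candidates, else 0."""
    return int(gold_top1 in candidates[:k])
```

Corpus-level MAP@k and Accuracy@k@top1 are then the means of these values over all instances.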
We will also provide Human End-to-End Evaluation for:
Simplicity,
Fluency and
Meaning Preservation.
Human evaluation will take place for the top 5 ranking systems according to an average of the automated metrics. Availability of human evaluation for some languages will depend on the recruitment of evaluators from the task participants.
Paper Submission
Participants who submit systems are invited to also submit a paper of up to four pages in length, following the same template as submissions to BEA. Participant papers should focus on a description of the system architecture that has been used to address the task, as well as appropriate evaluation and critical discussion of that architecture on data sources from within and outside of the task. Papers should include a minimal related work section, referencing architecture-specific works. Participants will be given the automated results from the system evaluation prior to the paper submission deadline, and may choose to include these with their own interpretation in the system paper. Participants will also be invited to assess system outputs as part of a human evaluation campaign for high-performing systems, the results of which will be presented as part of the shared task description paper at the BEA workshop.
Participant Registration
Interested parties can register at any time prior to the Final Submission deadline via our participant registration Google Form.
Registered Participants will be the first to receive reminders and updates when each phase of the data is released.
Further information will be released through the MLSP shared task website.
Timeline
Fri Feb 16, 2024 Trial Data Release
Fri Mar 15, 2024 Test Data Release
Tue Mar 26, 2024 Final Submissions
Fri Apr 12, 2024 System Papers Due
Fri Jun 21, 2024 BEA Workshop
Organisers
Matthew Shardlow, Manchester Metropolitan University
Marcos Zampieri, George Mason University
Kai North, George Mason University
Fernando Alva-Manchego, Cardiff University
Thomas François, UCLouvain
Remi Cardon, UCLouvain
Nishat Raihan, George Mason University
Tharindu Ranasinghe, Aston University
Joseph Imperial, University of Bath, National University Philippines
Riza Batista-Navarro, University of Manchester
Adam Nohejl, Nara Institute of Science and Technology
Akio Hayakawa, Pompeu Fabra University
Yusuke Ide, Nara Institute of Science and Technology
Laura Occhipinti, University of Bologna
Horacio Saggion, Pompeu Fabra University
Anna Hülsing, University of Hildesheim
Andrea Horbach, University of Hildesheim
Stefan Bott, Pompeu Fabra University
Saul Calderon Ramirez, Tecnológico de Costa Rica
Nelson Pérez Rojas, Tecnológico de Costa Rica
Martin Solis Salazar, Tecnológico de Costa Rica
References
Saggion, H., Štajner, S., Ferrés, D., Sheang, K.C., Shardlow, M., North, K. and Zampieri, M., 2022, December. Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification. In Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) (pp. 271-283).
Shardlow, M., Evans, R., Paetzold, G. and Zampieri, M., 2021, August. SemEval-2021 Task 1: Lexical Complexity Prediction. In Proceedings of the 15th International Workshop on Semantic Evaluation (SemEval-2021) (pp. 1-16).