The NivELE task focuses on the development and evaluation of Natural Language Processing (NLP) models for the automatic classification of written texts produced by students of Spanish as a Foreign Language (ELE). At the beginning of each academic term, language centers must place large numbers of students into levels within very tight deadlines (Cantero et al., 2025a). This task is essential, as the correct placement of students directly impacts their academic success. However, assessing written expression is the most arduous part of the process and consumes the most faculty time, as texts must be leveled one by one.
This shared evaluation task was created to streamline teaching efforts and provide more objective placement through automation. It consists of identifying which level of the Common European Framework of Reference for Languages (CEFR)—A1, A2, B1, B2, C1, or C2—a text written by an ELE student belongs to.
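As a rough illustration of the task format, a minimal baseline might treat it as six-way text classification. The sketch below uses character n-gram TF-IDF features with logistic regression; the file name and column names (nivele_train.csv, text, level) are hypothetical placeholders and do not reflect the official data release.

```python
# Minimal baseline sketch: six-way CEFR classification (A1-C2).
# File name and column names are hypothetical, not the official format.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

df = pd.read_csv("nivele_train.csv")  # hypothetical: columns "text" and "level"
X_train, X_dev, y_train, y_dev = train_test_split(
    df["text"], df["level"], test_size=0.2, stratify=df["level"], random_state=42
)

# Character n-grams capture spelling and morphological patterns in learner texts.
baseline = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4), min_df=2),
    LogisticRegression(max_iter=1000),
)
baseline.fit(X_train, y_train)
print(classification_report(y_dev, baseline.predict(X_dev), labels=LEVELS))
```

Character n-grams are a common choice for learner texts because they pick up orthographic and morphological cues that tend to correlate with proficiency level, though participants are of course free to use any approach.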
The novelty of this proposal is threefold:
To the best of our knowledge, it is the first task to propose automatic multi-class leveling specifically for ELE within the framework of IberLEF.
It integrates deep linguistic knowledge of pedagogy and certification with advanced NLP techniques.
It focuses on indicators of grammatical, lexical, and discursive complexity rather than just the overall quality of the text.
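To make the third point concrete, the toy sketch below computes a few indicators of that kind over a single text. These particular features (type-token ratio, mean sentence length, connective density) are illustrative assumptions, not the official feature set of the task.

```python
# Toy complexity indicators; the features below are illustrative examples,
# not the organizers' feature set.
import re
from statistics import mean

# A handful of Spanish discourse connectives, as a crude proxy for
# discursive complexity.
CONNECTIVES = {"sin embargo", "por lo tanto", "aunque", "además", "es decir"}

def complexity_features(text: str) -> dict:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"\w+", text.lower())
    lowered = text.lower()
    return {
        # Lexical: vocabulary variety and average word length.
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        "mean_word_length": mean(len(t) for t in tokens) if tokens else 0.0,
        # Grammatical (rough proxy): average sentence length in tokens.
        "mean_sentence_length": len(tokens) / len(sentences) if sentences else 0.0,
        # Discursive: connective occurrences per 100 tokens.
        "connective_density": (
            100 * sum(lowered.count(c) for c in CONNECTIVES) / len(tokens)
            if tokens else 0.0
        ),
    }

print(complexity_features(
    "Aunque estudio mucho, no hablo bien. Sin embargo, sigo practicando."
))
```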
If you want to participate in the NivELE@IberLEF2026 shared task, please fill out this form. Once you are registered, you can ask any questions through the shared task's Google Group, NivELE@IberLEF2026.
Participants will be required to submit their runs and are asked to describe their systems in paper submissions. We encourage participating teams to highlight the actual contribution of their systems, identifying successful approaches as well as failed attempts, and reporting findings on how to move toward more performant solutions. This description must include the following details:
Architecture: modules, components, data flow…
Additional data used for training (if any): augmented data, additional datasets…
Additional technologies employed (if any): existing OCR systems along with selection criteria…
Pre-trained models used (if any): source of the model, selection criteria…
Experiments conducted and training parameters: configuration, hyperparameters used…
Analysis of results: findings from results, ranking according to different metrics, interpretation, and validation…
Error analysis: a study of failed predictions and their characterization, possible improvements, and lessons learned…
This information is the minimum required for submission approval; in other words, it is mandatory.
If you have any specific question about the NivELE 2026 task, please ask it through the Google Group NivELE@IberLEF2026.
For any other questions that do not directly concern the shared task, please contact any of the organizers.
This work is funded by the Ministerio para la Transformación Digital y de la Función Pública and the Plan de Recuperación, Transformación y Resiliencia, funded by the European Union (NextGenerationEU), within the framework of the project Desarrollo Modelos ALIA. It has also been partially supported by Project CONSENSO (PID2021-122263OB-C21), funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR; Project ROMANET (CERV-2024-CHAR-LITI-101215052), funded by the European Union under the Citizens, Equality, Rights and Values programme; Project HEART-NLP-UJA (PID2024-156263OB-C21); and Project VERITAS-H (AIA2025-163322-C64), funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU.