The task is to classify a text written by a student of ELE (Spanish as a Foreign Language) into its corresponding level of the CEFR (Common European Framework of Reference for Languages).
The possible categories for each text are:
A1, A2, B1, B2, C1
Participants submit their entries through the established evaluation platform (CodaLab), which will feature a real-time leaderboard. The number of submissions per team will be capped to encourage model robustness.
Primary Metric: Macro-averaged F1 score. This is the arithmetic mean of the per-class F1 scores, so every level carries equal weight regardless of how many texts it contains. A model biased toward the predominant class is penalized accordingly.
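The macro-averaged F1 described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the official scorer: the function names `per_class_f1` and `macro_f1` are hypothetical, and assigning an F1 of 0 to a level that has no gold and no predicted texts is an assumption, since the task description does not state how that case is handled.

```python
# The five CEFR levels listed in the task description.
LEVELS = ["A1", "A2", "B1", "B2", "C1"]


def per_class_f1(gold, pred, labels=LEVELS):
    """Return a dict mapping each label to its F1 score."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    scores = {}
    for label in labels:
        pairs = list(zip(gold, pred))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # Assumption: a label with no gold and no predicted items scores 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred, labels=LEVELS):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(gold, pred, labels)
    return sum(scores.values()) / len(labels)


# A majority-class predictor on a sample dominated by B1 texts.
gold = ["B1"] * 8 + ["A1", "C1"]
pred = ["B1"] * 10
majority_score = macro_f1(gold, pred)
```

In this toy sample the majority-class predictor gets 8 of 10 texts right, yet only B1 contributes a non-zero F1, so `majority_score` stays well below that 80% figure. This is the penalty on class-biased models that the primary metric is meant to impose.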
Secondary Metrics: Macro-averaged Precision and Recall, to analyze the trade-off between how often a predicted level is correct (precision) and how many of the texts at each level the model recovers (recall). Accuracy will serve as a general indicator of the overall proportion of correct predictions.
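The secondary metrics can be computed along the same lines. The sketch below is again hypothetical rather than the organizers' scorer: `secondary_metrics` is an illustrative name, and scoring a level as 0 when it is never predicted (precision) or never appears in the gold data (recall) is an assumed convention.

```python
def secondary_metrics(gold, pred, labels=("A1", "A2", "B1", "B2", "C1")):
    """Return macro precision, macro recall, and accuracy as a dict."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    precisions, recalls = [], []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        predicted = sum(1 for p in pred if p == label)  # times label was output
        actual = sum(1 for g in gold if g == label)     # times label is correct
        # Assumption: undefined ratios (zero denominator) are scored as 0.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return {
        "macro_precision": sum(precisions) / len(labels),
        "macro_recall": sum(recalls) / len(labels),
        "accuracy": correct / len(gold),
    }


# One B2 text is mislabeled as B1; everything else is correct.
gold = ["A1", "A2", "B1", "B2", "C1", "B1"]
pred = ["A1", "A2", "B1", "B1", "C1", "B1"]
metrics = secondary_metrics(gold, pred)
```

Here the single error lowers B1 precision (a B2 text was wrongly labeled B1) and drops B2 recall to zero, while accuracy only registers one wrong text out of six. The three values therefore describe different aspects of the same mistake.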