Evaluation
The competition on Codabench is now available.
The evaluation of the MiSonGyny task will focus on two main subtasks, each with its own criteria. For both subtasks, the macro-F1 score will be used as the primary evaluation metric.
What metrics will be reported?
Precision: To provide a clear view of the model's ability to avoid false positives.
Recall: To evaluate the model's ability to find all actual positive cases.
Macro-F1 score: The unweighted mean of the per-class F1 scores, so every class counts equally regardless of how often it occurs.
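The three metrics above can be sketched in a few lines of plain Python. This is only an illustration, not the official scoring script: the function names `per_class_scores` and `macro_f1` are hypothetical, the 0/1 labels in the usage note are placeholders rather than the task's actual label set, and scoring a class with no predictions or no gold instances as 0.0 is an assumption.

```python
from typing import Dict, Hashable, Sequence, Tuple


def per_class_scores(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> Dict[Hashable, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f1)} for every label seen."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    scores = {}
    for label in set(y_true) | set(y_pred):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Assumption: undefined ratios (zero denominator) are scored as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)
```

For example, with gold labels `[1, 1, 0, 0]` and predictions `[1, 0, 0, 0]`, class 1 has precision 1.0 and recall 0.5, class 0 has precision 2/3 and recall 1.0, and the macro-F1 is the mean of the two per-class F1 scores (2/3 and 0.8).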