Data and evaluation

Evaluation

For the Sentiment Analysis Track, we split the data into training and test partitions. Participants develop their methods on the training partition; the test partition is then used to evaluate the submitted systems and to determine the winner of the challenge. For the Thematic Unsupervised Classification track, there is only a test partition.

For type prediction, there are 3 classes (Attractive, Hotel, and Restaurant), so we apply the macro F-measure, as Equation (2) indicates.
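As an illustration of the macro F-measure over the three type classes, here is a minimal sketch in Python. The function name `macro_f1`, the toy labels, and the convention of scoring a class as 0 when it has no true or predicted instances are assumptions made for this example; this is not the organizers' official scorer.

```python
def macro_f1(gold, pred, labels):
    """Unweighted mean of the per-class F-measure over `labels`."""
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted instances scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


TYPES = ["Attractive", "Hotel", "Restaurant"]

# Toy data, invented for the example.
gold = ["Hotel", "Hotel", "Restaurant", "Attractive"]
pred = ["Hotel", "Restaurant", "Restaurant", "Attractive"]

score = macro_f1(gold, pred, TYPES)
print(round(score, 4))
```

Every class contributes equally to the mean, regardless of how many instances it has.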


For the new sub-task, country classification, the evaluation follows the same idea as the type prediction measure. Equation (3) shows the country classification measure.


The final measure for this task is a weighted average of the 3 sub-tasks. Polarity is given twice the weight of each of the other two sub-tasks, as shown in Equation (4).
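A minimal sketch of this combination, assuming the weights 2, 1, and 1 are normalized by their sum (4). The function name, argument names, and example scores are hypothetical.

```python
def final_score(polarity, type_score, country):
    """Weighted average of the three sub-task scores.

    Assumption: polarity counts twice and the total is divided by the
    sum of the weights (2 + 1 + 1 = 4).
    """
    return (2 * polarity + type_score + country) / 4


# Invented sub-task scores for illustration.
example = final_score(polarity=0.6, type_score=0.9, country=0.8)
print(round(example, 3))
```

Under this reading, a gain of 0.01 in polarity raises the final score as much as a gain of 0.02 in either of the other two sub-tasks.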

Sentiment analysis evaluation


For this edition, we propose to give more weight to minority classes. In the Rest-Mex sentiment analysis collection, the minority classes are the ones with the most negative polarities. Therefore, the results of the polarity classification are evaluated with Equation (1).


Here k is a participating system, C = {1, 2, 3, 4, 5} is the set of classes, Tc is the total number of instances in the collection, and Tci is the total number of instances in class i. Finally, Fi(k) is the F-measure value for class i obtained by system k. With this measure, correctly classified instances of class 1 will have more importance than instances of class 2, which in turn will have more importance than instances of class 3, and so on.
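The following sketch shows one way a measure built from these quantities can favor minority classes. The weight 1 - Tci/Tc and the normalization by the sum of the weights are assumptions made for illustration, since the equation itself is not reproduced in this text. The class counts and F-measure values are invented.

```python
def weighted_polarity_score(f_by_class, count_by_class):
    """Average of per-class F-measures, weighted toward small classes.

    Assumption: class i gets weight 1 - Tci/Tc, and the weighted sum is
    divided by the sum of the weights.
    """
    total = sum(count_by_class.values())  # Tc
    weights = {c: 1 - n / total for c, n in count_by_class.items()}
    norm = sum(weights.values())
    return sum(weights[c] * f_by_class[c] for c in weights) / norm


# Invented class sizes: negative polarities are the minority classes.
counts = {1: 10, 2: 10, 3: 20, 4: 20, 5: 40}
# Invented per-class F-measures for a hypothetical system k.
f_scores = {1: 0.5, 2: 0.5, 3: 0.5, 4: 0.5, 5: 1.0}

res_p = weighted_polarity_score(f_scores, counts)
print(round(res_p, 3))
```

With these toy numbers the plain macro average would be 0.6. The weighted score is lower because the system's only strong class is the majority class.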



Thematic Unsupervised Classification


To evaluate each system in the unsupervised classification task, an alignment must first be done. Given the Gold Standard, the output of each system k must be renumbered so that the themes correspond. This is necessary because the only restriction on the participating teams is that they must form 4 groups from the news items shared in the competition.


This means that a system's labels do not necessarily coincide with those of the corresponding groups in the Gold Standard. For this reason, each system's output will be relabeled: each group produced by system k receives the Gold Standard label with which it shares the most instances.


Once the alignment is done, the output will be evaluated with a macro F-measure, as shown in Equation (5).
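The relabeling step can be sketched as follows. The function name `align_clusters` and the toy labels are hypothetical. The tie-breaking (the first Gold Standard label encountered wins) and the possibility that two system groups receive the same Gold Standard label follow from this simple rule and are assumptions, since the text does not say how such cases are handled.

```python
from collections import Counter, defaultdict


def align_clusters(gold, pred):
    """Relabel each predicted group with its most-overlapping gold label."""
    overlap = defaultdict(Counter)
    for g, p in zip(gold, pred):
        overlap[p][g] += 1  # count instances shared by group p and theme g
    # Ties are broken by first occurrence (assumption).
    mapping = {p: counts.most_common(1)[0][0] for p, counts in overlap.items()}
    return [mapping[p] for p in pred], mapping


# Toy example: 8 news items, 4 gold themes, arbitrary system group ids.
gold = [1, 1, 2, 2, 3, 3, 4, 4]
pred = [3, 3, 1, 1, 4, 4, 2, 1]

aligned, mapping = align_clusters(gold, pred)
print(mapping)
print(aligned)
```

After relabeling, `aligned` can be compared against `gold` with a macro F-measure over the 4 themes.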

Data

To access the data, you must register your team. Soon you will receive the data collection link.

Evaluation Rules

Runs for Track 1 will be received from 13th April, 0:01 until 4th May, 23:59 (UTC-06:00)

Runs for Track 2 will be received from 13th April, 0:01 until 4th May, 23:59 (UTC-06:00)


Participants are allowed to submit several runs for each track.

Output Submission

Submissions must be formatted as described below and sent via email to: miguel.alvarez@cimat.mx

Your software must output a corresponding txt file for each task of the dataset. The file must contain one line per classified instance. Each line looks like this:

"TaskName"\t"IdentifierOfAnInstance"\t"Class"\n

It is important to respect the format, including the " character, \t (tab) and \n (Linux newline). The naming of the output files is up to you; we recommend using the author's name and a run identifier as the filename, with "txt" as the extension.

For the Sentiment Analysis track, output lines look like this (the polarity, type, and country labels follow the instance identifier):

     "sentiment"    "0"   "5"  "Hotel" "Mexico"

     "sentiment"    "1"   "2" "Hotel" "Cuba"

     "sentiment"    "2"   "4" "Attractive" "Cuba"

     "sentiment"    "3"   "1" "Hotel" "Colombia"

     "sentiment"    "4"   "3" "Restaurant" "Mexico"


For the Thematic Unsupervised Classification track, output lines look like this:


     "thematic"    "0"  "3"

     "thematic"    "1"  "2"

     "thematic"    "2"  "1"

     "thematic"    "3"  "4"

     "thematic"    "4"  "1"



Note that instance numbering starts at 0.

A submission failing the format checking will be considered null.
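To reduce the risk of a rejected run, the format can be produced and checked programmatically. The sketch below is one possible approach; the helper names (`format_line`, `is_valid_line`, `write_run`) and the regular expression are hypothetical. The organizers' own format checker is not described here, so it may apply different or stricter rules.

```python
import re

# One or more quoted fields separated by tabs (assumed reading of the format).
FIELDS_RE = re.compile(r'"[^"\t]+"(\t"[^"\t]+")+')


def format_line(task, instance_id, *labels):
    """Build one output line: quoted fields, tab-separated, ending in \\n."""
    fields = [task, str(instance_id), *map(str, labels)]
    return "\t".join(f'"{field}"' for field in fields) + "\n"


def is_valid_line(line):
    """Check quoting, tab separators, and a bare \\n line ending."""
    return line.endswith("\n") and FIELDS_RE.fullmatch(line[:-1]) is not None


def write_run(path, rows):
    """Write rows to `path`; newline="\\n" keeps Linux line endings on any OS."""
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for row in rows:
            handle.write(format_line(*row))


sentiment_line = format_line("sentiment", 0, 5, "Hotel", "Mexico")
thematic_line = format_line("thematic", 0, 3)
print(repr(sentiment_line))
print(repr(thematic_line))
```

A call such as `write_run("author_run1.txt", [("thematic", 0, 3), ("thematic", 1, 2)])` would then write one line per instance; the filename is only an example.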

Paper submission

Participants of the tasks will be given the opportunity to write a paper that describes their system, resources used, results, and analysis that will be part of the official IberLEF-2023 proceedings. 


Here are some important considerations for the article: