Data and evaluation
Evaluation
For the Sentiment Analysis Track, we split the data into training and testing partitions. Participants will use the training partition to develop their methods; the test partition will then be used to evaluate the submitted systems and determine the winner of the challenge. For the Thematic task, only a test partition is provided.
For Type prediction there are three classes (Attractive, Hotel, and Restaurant). For this reason, we apply the Macro F-measure, as Equation (2) indicates.
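Macro F-measure averages the per-class F1 scores, giving every class equal weight regardless of its size. A minimal sketch of the standard computation (the helper names are illustrative, not from the task description):

```python
def f1_per_class(gold, pred, cls):
    """F1 for a single class: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f(gold, pred, classes):
    """Macro F-measure: unweighted mean of the per-class F1 scores."""
    return sum(f1_per_class(gold, pred, c) for c in classes) / len(classes)

gold = ["Hotel", "Hotel", "Restaurant", "Attractive"]
pred = ["Hotel", "Restaurant", "Restaurant", "Attractive"]
print(macro_f(gold, pred, ["Attractive", "Hotel", "Restaurant"]))  # 0.777...
```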
The evaluation of the new sub-task, country classification, follows the same idea as the type prediction measure. Equation (3) shows the country classification measure.
The final measure for this task is the weighted average of the three sub-tasks. Since polarity is considered more important than the other two sub-tasks, it is given twice their weight, as we can see in Equation (4).
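Under this scheme, a final score consistent with the description (polarity counted twice, the other two sub-tasks once each) would be computed as below. The exact form is given in Equation (4), so treat this as an assumed reading:

```python
def final_score(sent_f, type_f, country_f):
    """Weighted average of the three sub-task scores.

    Polarity (sentiment) counts twice; type and country once each,
    normalised so a perfect system scores 1.0 (assumed reading of Eq. 4).
    """
    return (2 * sent_f + type_f + country_f) / 4

print(final_score(0.8, 0.6, 0.6))  # 0.7
```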
Sentiment Analysis evaluation
For this edition, we propose to give more weight to minority classes. For the sentiment analysis collection of the Rest-Mex, the minority classes are the ones with the most negative polarities. Therefore, for this edition, to evaluate the result of the polarity classification, Equation (1) is proposed.
Here, k is a participating system, C = {1,2,3,4,5}, Tc is the total number of instances in the collection, and Tci is the total number of instances in class i. Finally, Fi(k) is the F-measure value for class i obtained by system k. With this measure, correctly classified instances of class 1 have more weight than instances of class 2, which in turn have more weight than class 3, and so on.
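The exact formula is Equation (1); one weighting scheme consistent with the description above (each class's F1 weighted by the proportion of instances *not* in that class, so rarer classes count more; this specific form is an assumption, not the official equation) can be sketched as:

```python
def weighted_sentiment_measure(f_scores, class_counts):
    """f_scores[i]: F1 of class i for system k; class_counts[i]: T_ci.

    Each class is weighted inversely to its frequency and the result is
    normalised to lie in [0, 1] (assumed form of Equation (1)).
    """
    total = sum(class_counts.values())  # T_c
    weights = {c: 1 - class_counts[c] / total for c in class_counts}
    return sum(weights[c] * f_scores[c] for c in f_scores) / sum(weights.values())

# Classes 1..5; class 1 (most negative) is the rarest, so its F1 weighs most.
counts = {1: 50, 2: 100, 3: 200, 4: 500, 5: 1150}
uniform = {c: 0.5 for c in counts}
print(weighted_sentiment_measure(uniform, counts))  # 0.5 when all F1 are equal
```

With this weighting, raising the F1 of class 1 improves the score more than the same F1 gain on the majority class 5, which matches the stated intent of favouring minority classes.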
Thematic Unsupervised Classification
To evaluate each system in the unsupervised classification task, an alignment must first be performed: given the Gold Standard, the output of each system k must be relabelled so that the themes correspond. This is necessary because the only restriction on the participating teams is that they must form 4 groups from the news shared in the competition.
This means the labels do not necessarily coincide with the groups expected in the Gold Standard. Therefore, each system's output is relabelled by assigning to each resulting group the Gold Standard label that shares the most instances with it.
Once the alignment is done, the result is evaluated with a macro F-measure, as shown in Equation (5).
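The alignment step described above can be sketched as follows: each system cluster id is mapped to the gold label it overlaps with most, and the relabelled output can then be scored with a macro F-measure. This is a minimal sketch; how ties or two clusters mapping to the same gold label are handled is not specified in the text.

```python
from collections import Counter

def align_labels(gold, system):
    """Map each system cluster id to the gold label sharing most instances."""
    mapping = {}
    for cluster in set(system):
        overlap = Counter(g for g, s in zip(gold, system) if s == cluster)
        mapping[cluster] = overlap.most_common(1)[0][0]
    return [mapping[s] for s in system]

gold   = [1, 1, 2, 2, 3, 3, 4, 4]
system = [3, 3, 1, 1, 4, 4, 2, 2]  # same grouping, different cluster ids
print(align_labels(gold, system))   # [1, 1, 2, 2, 3, 3, 4, 4]
```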
Data
To access the data, you must register your team. You will then receive the data collection link.
Evaluation Rules
Runs for Track 1 will be received from 13th April, 0:01 until 4th May, 23:59 (UTC-6).
Runs for Track 2 will be received from 13th April, 0:01 until 4th May, 23:59 (UTC-6).
Participants are allowed to submit several runs for each track.
Output Submission
Submissions must be formatted as described below and sent via email to the account: miguel.alvarez@cimat.mx
For each task of the dataset, your software has to output a corresponding txt file. The file must contain one line per classified instance. Each line looks like this:
"TaskName"\t"IdentifierOfAnInstance"\t"Class"\n
It is important to respect the format: the " character, \t (tab), and \n (Linux newline). The naming of the output files is up to you; we recommend using the author name and a run identifier as the filename, with "txt" as the extension.
For the Sentiment Analysis the possible labels are:
TaskName: sentiment
IdentifierOfAnInstance: NumberOfOpinion
where NumberOfOpinion is the line number of each opinion in the test file.
Classes: [1,5] '\t' [Attractive, Hotel, Restaurant] '\t' [Mexico, Cuba, Colombia]
Output example:
"sentiment" "0" "5" "Hotel" "Mexico"
"sentiment" "1" "2" "Hotel" "Cuba"
"sentiment" "2" "4" "Attractive" "Cuba"
"sentiment" "3" "1" "Hotel" "Colombia"
"sentiment" "4" "3" "Restaurant" "Mexico"
For the Thematic Unsupervised Classification the labels are:
TaskName: thematic
IdentifierOfAnInstance: NumberOfNews
where NumberOfNews is the line number of each news item in the test file.
Classes: [1,4]
Output example:
"thematic" "0" "3"
"thematic" "1" "2"
"thematic" "2" "1"
"thematic" "3" "4"
"thematic" "4" "1"
Notice that all instance numbers start at 0.
A submission failing the format checking will be considered null.
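Since a failed format check nullifies a submission, it may be worth validating files locally before emailing them. A hypothetical checker mirroring the two line formats described above could look like:

```python
import re

# Hypothetical local checks; the patterns follow the formats shown above.
SENTIMENT_RE = re.compile(
    r'^"sentiment"\t"\d+"\t"[1-5]"\t"(Attractive|Hotel|Restaurant)"'
    r'\t"(Mexico|Cuba|Colombia)"$')
THEMATIC_RE = re.compile(r'^"thematic"\t"\d+"\t"[1-4]"$')

def validate(path, pattern):
    """Return the (1-based) numbers of lines that fail the pattern."""
    with open(path) as fh:
        return [i for i, line in enumerate(fh.read().splitlines(), 1)
                if not pattern.match(line)]
```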
Paper submission
Participants of the tasks will be given the opportunity to write a paper that describes their system, resources used, results, and analysis that will be part of the official IberLEF-2023 proceedings.
Here are some important considerations for the article:
System description papers should be formatted according to the Springer Conference Proceedings style: https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines. Latex and Word templates can be found there.
The minimum length of a regular paper should be 5 pages. There is no maximum page limit.
Papers must be written in English.
Each paper must include the following copyright footnote on its first page: {\let\thefootnote\relax\footnotetext{Copyright \textcopyright\ 2023 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). IberLEF 2023, September 2023, Spain.}}
Eliminate any page numbering, and make sure there are no headers or footers, except the mandatory copyright notice as a footnote on the first page.
Authors should be described with their name and their full affiliation (university and country). Names must be complete (no initials), e.g. “Soto Pérez” instead of “S. Pérez”.
Titles of papers should be in emphatic capital English notation, i.e., "Filling an Author Agreement by Autocompletion" rather than "Filling an author agreement by autocompletion".
At least one author of each paper must sign the CEUR copyright agreement. Instructions and templates can be found at http://ceur-ws.org/HOWTOSUBMIT.html. The signed form must be sent along with the paper to the task organizers.