Participants can submit three different runs for each subtask.
Format: each run is a single output file, with one line per document in the following format:
<document_id> tab <gender_prediction> tab <age_prediction> tab <topic_prediction>
Where,
<document_id> is the instance id as provided in the test files
<gender_prediction> is one of "F" or "M"
<age_prediction> is one of "0-19", "20-29", "30-39", "40-49", "50-100"
<topic_prediction> is one of:
ANIME,
AUTO-MOTO,
BIKES,
CELEBRITIES,
ENTERTAINMENT,
MEDICINE-AESTHETICS,
METAL-DETECTING,
NATURE,
SMOKE,
SPORTS,
TECHNOLOGY
For SUBTASK 1, all three dimensions must be predicted by the participant.
For SUBTASK 2:
SUBTASK 2a: Only gender must be predicted (two settings: one in-domain, one out-domain). Other fields must have an "X" value.
SUBTASK 2b: Only age must be predicted (two settings: one in-domain, one out-domain). Other fields must have an "X" value.
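The rules above can be checked mechanically before submitting. The following is only a minimal sketch, not an official validator: the function name check_line is hypothetical, and it assumes that age labels are written with a hyphen (as in the example output lines, e.g. 40-49) and that the columns are separated by a single tab character.

```python
GENDERS = {"F", "M"}
# Assumption: age labels use the hyphenated form shown in the example lines.
AGES = {"0-19", "20-29", "30-39", "40-49", "50-100"}
TOPICS = {
    "ANIME", "AUTO-MOTO", "BIKES", "CELEBRITIES", "ENTERTAINMENT",
    "MEDICINE-AESTHETICS", "METAL-DETECTING", "NATURE", "SMOKE",
    "SPORTS", "TECHNOLOGY",
}

# Allowed values for the (gender, age, topic) columns in each subtask.
ALLOWED = {
    "1": (GENDERS, AGES, TOPICS),
    "2a": (GENDERS, {"X"}, {"X"}),
    "2b": ({"X"}, AGES, {"X"}),
}


def check_line(line, subtask):
    """Return a list of problems found in one output line (empty if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        return [f"expected 4 tab-separated fields, got {len(fields)}"]
    doc_id, *labels = fields
    problems = []
    if not doc_id:
        problems.append("empty document id")
    names = ("gender", "age", "topic")
    for name, value, allowed in zip(names, labels, ALLOWED[subtask]):
        if value not in allowed:
            problems.append(f"invalid {name} value: {value!r}")
    return problems
```

An empty list means the line respects the format for that subtask; a line that uses spaces instead of tabs, or a label outside the lists above, is reported.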
For example, for the following instances in a test file:
<doc id="2246" topic="?" age="?" gender="?" >
<post>
Ah .... e che voi sappiate che cosa comporta se non la portiamo??
</post>
<post>
il mio meccanico dice di evitare
</post>
</doc>
If you are participating in SUBTASK 1, you will have to produce a line like the following (fields separated by tabs), assuming your system has predicted male as gender, 40-49 as age range and AUTO-MOTO as topic:
2246 M 40-49 AUTO-MOTO
If you are participating in SUBTASK 2a, you will have to produce a line like the following:
2246 M X X
If you are participating in SUBTASK 2b, you will have to produce a line like the following:
2246 X 40-49 X
Ideally, your output file will preserve the order of the documents as provided in the test files.
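One possible way to produce such a file while keeping the test-file order is sketched below. This is an illustration under stated assumptions, not a required implementation: the names format_rows and write_run are hypothetical, doc_ids is assumed to be the list of document ids in the order they appear in the test file, and predictions is assumed to be a dictionary mapping each id to the labels your system produced.

```python
COLUMNS = ("gender", "age", "topic")

# Columns that each subtask asks for; every other column is written as "X".
WANTED = {
    "1": ("gender", "age", "topic"),
    "2a": ("gender",),
    "2b": ("age",),
}


def format_rows(doc_ids, predictions, subtask):
    """Build the tab-separated output lines, one per id, in the order of doc_ids."""
    wanted = WANTED[subtask]
    rows = []
    for doc_id in doc_ids:
        pred = predictions[doc_id]
        labels = [pred[col] if col in wanted else "X" for col in COLUMNS]
        rows.append("\t".join([str(doc_id), *labels]))
    return rows


def write_run(path, doc_ids, predictions, subtask):
    """Write one run file with one line per document."""
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        for row in format_rows(doc_ids, predictions, subtask):
            out.write(row + "\n")
```

Iterating over doc_ids rather than over the predictions dictionary is what keeps the output aligned with the test file, whatever order the system produced its predictions in.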
File names: each team has to make sure the submitted files obey the following standardised format:
<team-name>_<subtask>_<run-number>
<team-name> is your team name, for instance "UniABC"
<subtask> is one of 1, 2a, 2b
<run-number> is one of 1, 2, 3
For example, team "UniABC" can submit three runs for subtask 1 and name the files:
UniABC_1_1
UniABC_1_2
UniABC_1_3
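The naming scheme can also be generated and checked in code. The sketch below is only an illustration: run_filename and parse_run_filename are hypothetical helper names, and it assumes that file names carry no extension, as in the examples above.

```python
import re

SUBTASKS = ("1", "2a", "2b")
RUN_NUMBERS = (1, 2, 3)

# <team-name>_<subtask>_<run-number>; the team name itself may contain underscores.
RUN_NAME = re.compile(r"(?P<team>.+)_(?P<subtask>1|2a|2b)_(?P<run>[123])")


def run_filename(team, subtask, run):
    """Build a file name following the <team-name>_<subtask>_<run-number> scheme."""
    if not team:
        raise ValueError("team name must not be empty")
    if subtask not in SUBTASKS:
        raise ValueError(f"unknown subtask: {subtask!r}")
    if run not in RUN_NUMBERS:
        raise ValueError(f"run number must be 1, 2 or 3, got {run!r}")
    return f"{team}_{subtask}_{run}"


def parse_run_filename(name):
    """Return (team, subtask, run) if the name follows the scheme, else None."""
    match = RUN_NAME.fullmatch(name)
    if match is None:
        return None
    return match["team"], match["subtask"], int(match["run"])
```

For instance, run_filename("UniABC", "1", 3) gives the third file name in the list above, and parse_run_filename rejects a name whose subtask or run number is outside the allowed values.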