The Cruciverb-IT training data is available at the following link: https://huggingface.co/datasets/cruciverb-it/evalita2026
Both datasets (task 1 and task 2) can be downloaded from the "Files and versions" tab of the Hugging Face repository.
For the proposed task, we rely on both the ItaCW crossword dataset (Zeinalipour et al., 2023) and a collection of additional clue-solution pairs collected from the web. After removing duplicates, the overall dataset contains approximately 410,000 clue-answer pairs covering various clue types, including wordplay, cryptic clues, named-entity initials, fill-in-the-blank clues and so on. For the first task, the dataset will be divided into training (90%), validation (5%) and test (5%) sets, resulting in approximately 370,000 training examples and approximately 20,000 examples each for validation and testing. Each split will be released as a .csv file containing three columns: clue, answer and answer_length, with the answer column omitted in the test set. The data will be structured as follows:
clue,answer,answer_length
La lettera di Napoleone,enne,4
Si calzano più che altro in estate,espadrilles,11
Per i napoletani vale un momento,mo,2
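As a minimal sketch of how the three-column format above could be read and sanity-checked, the snippet below parses the example rows with Python's standard csv module and verifies that answer_length matches the answer whenever the answer column is present. The function name read_pairs and the in-memory SAMPLE string are illustrative assumptions; a real run would open the downloaded .csv file instead.

```python
import csv
import io

# Sample rows copied from the example above. The real file name is not
# assumed here; replace the StringIO with open(path, newline="", encoding="utf-8").
SAMPLE = """clue,answer,answer_length
La lettera di Napoleone,enne,4
Si calzano più che altro in estate,espadrilles,11
Per i napoletani vale un momento,mo,2
"""


def read_pairs(handle):
    """Yield (clue, answer, answer_length) tuples (hypothetical helper).

    The answer is None when the file has no answer column (test split).
    """
    for row in csv.DictReader(handle):
        length = int(row["answer_length"])
        answer = row.get("answer")  # column is absent in the test split
        if answer is not None and len(answer) != length:
            raise ValueError(f"length mismatch for clue {row['clue']!r}")
        yield row["clue"], answer, length


pairs = list(read_pairs(io.StringIO(SAMPLE)))
```

Using csv.DictReader rather than splitting on commas keeps the sketch safe for clues that contain quoted commas.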
For the second task, we automatically generate crossword grids with a constraint-driven, search-based construction algorithm that populates a predefined crossword layout with valid words. The words for the training, validation and test grids are drawn from the answers in the corresponding split described above. Specifically, we first generate several empty square grids by placing black squares at random while keeping the layout symmetric; we then fill each grid with the construction algorithm. Lastly, we collect the corresponding clue for each word in the grid, thereby obtaining several complete and plausible crosswords. To account for different levels of complexity, we generate crosswords of five sizes: 5x5, 7x7, 9x9, 11x11 and 13x13, with 15%, 16%, 22%, 27% and 27% black squares, respectively. For the training set we generate 500 crossword grids: 300 (5x5), 150 (7x7), 25 (9x9), 15 (11x11) and 10 (13x13). For the validation set we generate 50 crossword grids, 10 for each size.
Each empty crossword grid is represented as a matrix, i.e. a list of lists, where each square is either blank (denoted by a whitespace ' ') or black (denoted by a dot '.').
Empty grids will be represented as follows:
[
['.', ' ', ' ', ' ', ' '],
[' ', ' ', ' ', ' ', ' '],
[' ', ' ', ' ', ' ', ' '],
[' ', ' ', ' ', ' ', ' '],
[' ', ' ', ' ', ' ', '.']
]
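The layout step described earlier (random black squares under a symmetry constraint) could be sketched roughly as follows, producing a grid in the list-of-lists format shown above. This is not the organizers' generator: the text does not say which kind of symmetry is used, so 180-degree rotational symmetry is assumed here, and the function name random_symmetric_layout and its parameters are hypothetical. The sketch also does not check that the white squares form valid word slots.

```python
import random


def random_symmetric_layout(size, black_ratio, seed=None):
    """Return a size x size grid with roughly black_ratio black squares.

    Assumption: 180-degree rotational symmetry, i.e. square (r, c) is black
    exactly when square (size-1-r, size-1-c) is black.
    """
    rng = random.Random(seed)
    grid = [[" "] * size for _ in range(size)]
    target = round(size * size * black_ratio)

    cells = [(r, c) for r in range(size) for c in range(size)]
    rng.shuffle(cells)

    placed = 0
    for r, c in cells:
        if placed >= target:
            break
        if grid[r][c] == ".":
            continue  # already blackened as the mirror of an earlier pick
        mirror_r, mirror_c = size - 1 - r, size - 1 - c
        grid[r][c] = "."
        grid[mirror_r][mirror_c] = "."
        # The centre square of an odd-sized grid is its own mirror image.
        placed += 1 if (r, c) == (mirror_r, mirror_c) else 2
    return grid


layout = random_symmetric_layout(5, 0.15, seed=0)
```

Because squares are blackened in mirrored pairs, the final count can exceed the target by one.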
The clues for a given grid are provided as a list of dictionaries with the following keys:
"clue": the clue
"row": row index of the first square of the answer (0-based)
"col": column index of the first square of the answer (0-based)
"direction": the direction in which the answer should be placed, either "A" (Across) or "D" (Down)
"target": the answer to the clue
"length": the number of characters in the answer
Clues are formatted as follows (the "target" key will be omitted in the test set):
[
{
"target": "AUGE",
"clue": "Il momento di maggior successo",
"row": 0,
"col": 1,
"direction": "A",
"length": 4
},
{
"target": "LUNAR",
"clue": "La L di Lem.",
"row": 1,
"col": 0,
"direction": "A",
"length": 5
},
{
"target": "IRIDI",
"clue": "Arcobaleni poetici",
"row": 2,
"col": 0,
"direction": "A",
"length": 5
},
{
"target": "GATES",
"clue": "Il Bill della Microsoft",
"row": 3,
"col": 0,
"direction": "A",
"length": 5
},
{
"target": "IRIS",
"clue": "La Blond di un film interpretato dalla Gerini",
"row": 4,
"col": 0,
"direction": "A",
"length": 4
},
{
"target": "LIGI",
"clue": "Scrupolosi",
"row": 1,
"col": 0,
"direction": "D",
"length": 4
},
{
"target": "AURAR",
"clue": "Moneta islandese",
"row": 0,
"col": 1,
"direction": "D",
"length": 5
},
{
"target": "UNITI",
"clue": "Lo sono gli Emirati Arabi",
"row": 0,
"col": 2,
"direction": "D",
"length": 5
},
{
"target": "GADES",
"clue": "Un Antonio del flamenco",
"row": 0,
"col": 3,
"direction": "D",
"length": 5
},
{
"target": "ERIS",
"clue": "Dea greca",
"row": 0,
"col": 4,
"direction": "D",
"length": 4
}
]
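To illustrate how the "row", "col", "direction" and "length" fields relate to the empty grid, the sketch below enumerates every maximal run of white squares and returns its start position, direction and length. The helper name find_slots and the minimum slot length of 2 are assumptions made for illustration, not part of the task definition.

```python
def find_slots(grid, min_length=2):
    """Map (row, col, direction) to slot length for each run of white squares.

    Hypothetical helper: min_length=2 is an assumption, chosen so that
    isolated single squares are not reported as answers.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    slots = {}
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] == ".":
                continue
            # An Across slot starts at the left border or right after a black square.
            if c == 0 or grid[r][c - 1] == ".":
                length = 0
                while c + length < n_cols and grid[r][c + length] != ".":
                    length += 1
                if length >= min_length:
                    slots[(r, c, "A")] = length
            # A Down slot starts at the top border or right below a black square.
            if r == 0 or grid[r - 1][c] == ".":
                length = 0
                while r + length < n_rows and grid[r + length][c] != ".":
                    length += 1
                if length >= min_length:
                    slots[(r, c, "D")] = length
    return slots


# The 5x5 empty grid from the example above.
empty = [
    [".", " ", " ", " ", " "],
    [" ", " ", " ", " ", " "],
    [" ", " ", " ", " ", " "],
    [" ", " ", " ", " ", " "],
    [" ", " ", " ", " ", "."],
]
slots = find_slots(empty)
```

For the example grid, the ten slots found this way correspond one-to-one to the ten clue dictionaries listed above.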
Lastly, the filled grid should be formatted as follows:
[
['.', 'A', 'U', 'G', 'E'],
['L', 'U', 'N', 'A', 'R'],
['I', 'R', 'I', 'D', 'I'],
['G', 'A', 'T', 'E', 'S'],
['I', 'R', 'I', 'S', '.']
]
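As a rough sketch of how the clue list and the empty grid combine into the filled grid, the snippet below writes each "target" into a copy of the empty grid, starting at ("row", "col") and moving along "direction", and raises an error if two answers disagree on a shared square. The helper name fill_grid is hypothetical, and the "clue" text is left out of the dictionaries because it is not needed for placement.

```python
def fill_grid(empty_grid, clues):
    """Write every clue's target into a copy of the empty grid (hypothetical helper)."""
    grid = [row[:] for row in empty_grid]
    for clue in clues:
        word = clue["target"]
        if len(word) != clue["length"]:
            raise ValueError(f"{word!r} does not have length {clue['length']}")
        # Across answers advance along the columns, Down answers along the rows.
        d_row, d_col = (0, 1) if clue["direction"] == "A" else (1, 0)
        for i, letter in enumerate(word):
            r, c = clue["row"] + i * d_row, clue["col"] + i * d_col
            # A square must be blank or already hold the same letter.
            if grid[r][c] not in (" ", letter):
                raise ValueError(f"conflict at ({r}, {c}) while placing {word!r}")
            grid[r][c] = letter
    return grid


# Targets and positions copied from the example clue list above.
entries = [
    ("AUGE", 0, 1, "A"), ("LUNAR", 1, 0, "A"), ("IRIDI", 2, 0, "A"),
    ("GATES", 3, 0, "A"), ("IRIS", 4, 0, "A"), ("LIGI", 1, 0, "D"),
    ("AURAR", 0, 1, "D"), ("UNITI", 0, 2, "D"), ("GADES", 0, 3, "D"),
    ("ERIS", 0, 4, "D"),
]
clues = [
    {"target": t, "row": r, "col": c, "direction": d, "length": len(t)}
    for t, r, c, d in entries
]
empty = [
    [".", " ", " ", " ", " "],
    [" ", " ", " ", " ", " "],
    [" ", " ", " ", " ", " "],
    [" ", " ", " ", " ", " "],
    [" ", " ", " ", " ", "."],
]
filled = fill_grid(empty, clues)
```

The same conflict check can serve as a quick consistency test for a predicted grid before submission.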