The dataset for Task A is designed for Arabic automated essay scoring (AES) and includes essays written by native Arabic-speaking first-year university students under test-like conditions. It contains unique essay IDs, prompt IDs, the full text of each essay, and holistic scores, which reflect the overall quality of the essays.
The dataset for Task A includes three files:
TAQEEM2025_TaskA_train_prompts.json: This file contains the writing prompts provided to students.
Each entry includes:
prompt_id (Integer): Unique identifier for each prompt.
prompt_text (String): The actual prompt text given to the students.
prompt_type (String): The type of the writing task ("persuasive" or "explanatory").
Example:
{
"prompt_id": 1,
"prompt_text": " ... باتَ اِهْتمام وحماس المراهقين",
"prompt_type": "explanatory"
}
TAQEEM2025_TaskA_train_essays.json: This file contains the full text of student essays written in response to a specific prompt.
Each entry includes:
prompt_id (Integer): Indicates which prompt the essay responds to (e.g., 1).
essay_id (String): A six-digit unique identifier for the essay (e.g., "010210").
essay (String): The full essay text.
Example:
{
"prompt_id": 1,
"essay_id": "010210",
"essay": "... الصحة والجسم السليم من نعم الله على الإنسان"
}
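The two JSON files are linked through prompt_id, so a common first step is to load both and attach each prompt to its essays. The following is a minimal sketch of that join, not an official loader. It assumes each file holds a top-level JSON array of objects shaped like the examples above, which the description does not state explicitly. The helper names load_json_records and attach_prompts are hypothetical.

```python
import json
from pathlib import Path


def load_json_records(path):
    """Load a JSON file that is ASSUMED to hold a top-level list of objects."""
    with Path(path).open(encoding="utf-8") as handle:
        records = json.load(handle)
    if not isinstance(records, list):
        raise ValueError(f"{path}: expected a top-level JSON array")
    return records


def attach_prompts(essays, prompts):
    """Return copies of the essay records with prompt_text and prompt_type added."""
    prompts_by_id = {prompt["prompt_id"]: prompt for prompt in prompts}
    joined = []
    for essay in essays:
        # Raises KeyError if an essay refers to a prompt_id that is not in the prompts file.
        prompt = prompts_by_id[essay["prompt_id"]]
        joined.append({
            **essay,
            "prompt_text": prompt["prompt_text"],
            "prompt_type": prompt["prompt_type"],
        })
    return joined
```

With the files in the working directory, calling attach_prompts(load_json_records("TAQEEM2025_TaskA_train_essays.json"), load_json_records("TAQEEM2025_TaskA_train_prompts.json")) would give one record per essay. Opening the files as UTF-8 keeps the Arabic text intact.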
TAQEEM2025_TaskB_train_human_scores.csv: This file contains the holistic scores assigned by the human raters to each essay in CSV format.
Each row includes the following columns:
prompt_id (Integer): The ID of the prompt associated with the essay.
essay_id (String): The unique identifier of the essay, matching the essay_id in the essays file.
holistic (Integer): The overall holistic score assigned to the essay (range: 0–32).
Example:
prompt_id,essay_id,holistic
1,010210,24
**Note**
You must read the essay_id as a String to preserve leading zeros.
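The leading-zero issue arises when a CSV reader infers essay_id as a number. The sketch below shows one way to avoid it with Python's standard csv module, which returns every field as a string unless it is converted explicitly. The function name read_scores is hypothetical, and the sample row is the one from the example above.

```python
import csv
import io


def read_scores(csv_file):
    """Parse score rows from an open text file, keeping essay_id as a string."""
    rows = []
    for row in csv.DictReader(csv_file):
        rows.append({
            "prompt_id": int(row["prompt_id"]),
            "essay_id": row["essay_id"],  # left as str so "010210" keeps its leading zero
            "holistic": int(row["holistic"]),
        })
    return rows


sample = io.StringIO("prompt_id,essay_id,holistic\n1,010210,24\n")
scores = read_scores(sample)
print(scores[0]["essay_id"])  # 010210
print(int("010210"))          # 10210: the leading zero is lost once parsed as a number
```

For the real file, pass a handle opened with open(path, newline="", encoding="utf-8"). In pandas, the equivalent is to pass dtype={"essay_id": str} to read_csv.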
The training and dev sets for Task A will be released on June 10, 2025. They are available only to registered teams.
We will release the test set on July 20, 2025.
We thank Qatar University for supporting the dataset collection and annotation, and the Ministry of Education and Higher Education (MoE) in Qatar for facilitating data collection from male and female high schools across Qatar.