Task A: Holistic Scoring

In TAQEEM2025, we propose two sub-tasks: Task A for Holistic Scoring and Task B for Trait-specific Scoring. We describe and formally define Task A below.

Task Definition

Task A is defined as follows:

Given a set of source prompts, the aim is to train a holistic scoring model using those prompts to score essays written for an unseen target prompt. The model should produce a single holistic score that reflects the overall quality of each essay.
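Because the target prompt is unseen during training, one way to approximate this setting locally is to hold out one source prompt at a time and validate on it. The sketch below illustrates that idea only; it is not part of the official task setup. The essay records and the field names `prompt_id`, `text`, and `score` are hypothetical and do not reflect the actual dataset format.

```python
def leave_one_prompt_out(essays):
    """Yield (held_out_prompt, train_essays, dev_essays) folds.

    `essays` is a list of dicts with a hypothetical "prompt_id" key.
    Each fold treats one source prompt as a stand-in for the unseen
    target prompt and trains on all remaining source prompts.
    """
    prompt_ids = sorted({essay["prompt_id"] for essay in essays})
    for held_out in prompt_ids:
        train = [e for e in essays if e["prompt_id"] != held_out]
        dev = [e for e in essays if e["prompt_id"] == held_out]
        yield held_out, train, dev


# Toy records, invented purely for illustration.
toy_essays = [
    {"prompt_id": 1, "text": "essay a", "score": 3},
    {"prompt_id": 1, "text": "essay b", "score": 4},
    {"prompt_id": 2, "text": "essay c", "score": 2},
]

for held_out, train, dev in leave_one_prompt_out(toy_essays):
    print(held_out, len(train), len(dev))
```

Scores on the held-out prompt then give a rough estimate of how a model might behave on a prompt it has never seen.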

Evaluation Measures

The primary evaluation metric for this task is the Quadratic Weighted Kappa (QWK), a standard performance metric in automated essay scoring (AES) that quantifies agreement between human-assigned scores and system predictions. The Root Mean Squared Error (RMSE) will also be reported for a more comprehensive analysis of model performance.

Task A will be assessed based on the average QWK of the holistic score across the test prompts.
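The measures above can be sketched in a few lines of Python. This is a minimal illustration, not the official scorer: it assumes integer scores, and the score range passed as `min_score` and `max_score`, the function names, and the toy prompt identifiers are all assumptions made for the example.

```python
import math


def quadratic_weighted_kappa(human, system, min_score, max_score):
    """QWK between two equal-length lists of integer scores."""
    n = max_score - min_score + 1
    if n < 2:
        raise ValueError("score range must contain at least two values")
    # Observed agreement matrix: rows are human scores, columns system scores.
    observed = [[0] * n for _ in range(n)]
    for h, s in zip(human, system):
        observed[h - min_score][s - min_score] += 1
    total = len(human)
    hist_h = [sum(row) for row in observed]
    hist_s = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    num = 0.0
    den = 0.0
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement weight
            expected = hist_h[i] * hist_s[j] / total  # chance-level count
            num += weight * observed[i][j]
            den += weight * expected
    # den == 0 only when both raters use one identical score throughout;
    # returning 1.0 in that case is a convention assumed here.
    return 1.0 - num / den if den else 1.0


def rmse(human, system):
    """Root mean squared error between human and system scores."""
    return math.sqrt(sum((h - s) ** 2 for h, s in zip(human, system)) / len(human))


def average_qwk(per_prompt, min_score, max_score):
    """Mean QWK over test prompts; per_prompt maps prompt id -> (human, system)."""
    values = [
        quadratic_weighted_kappa(h, s, min_score, max_score)
        for h, s in per_prompt.values()
    ]
    return sum(values) / len(values)


# Toy scores, invented for illustration only.
toy = {
    "prompt_x": ([1, 2, 3], [1, 2, 3]),  # perfect agreement
    "prompt_y": ([1, 2, 3], [3, 2, 1]),  # reversed ranking
}
print(average_qwk(toy, min_score=1, max_score=3))
```

Averaging per prompt, rather than pooling all essays into one QWK computation, gives every test prompt equal weight regardless of how many essays it contains.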

Registration

You can find detailed information about Task A registration here.

Dataset

Detailed information about the dataset is here.
