PROPOR'24 Competition on Automatic Essay Scoring of Portuguese Narrative Essays
Overview
The goal of this competition is to develop computer systems that automatically evaluate essays. Such systems can assist teachers in the classroom by enhancing formative feedback strategies, allowing them to focus on the specific areas of writing in which their students need to improve. The emphasis is on narrative essays written in Portuguese by students in the Brazilian basic education system. Participants are invited to develop a computational system that estimates a grade for an input essay in each competency of interest, following the grading rubric described below.
Task Description
All essays used in this competition were manually digitized and anonymized. Afterward, the essays were analyzed by two human evaluators who assessed different aspects of the essay based on a correction rubric described below. This rubric provides instructive guidance for educators to consider four required competencies:
Formal Register: Appropriate use of the Portuguese language. Aspects such as misspelled words, incorrect nominal/verbal agreement and nominal/verbal regency, and inappropriate punctuation are considered.
Thematic Coherence: Adequate understanding of the text production proposal and its development associated with knowledge from different areas, according to the requested proposal, i.e., the plausibility of the text developed concerning the motivating text.
Textual Typology: Conformity of the text production proposal regarding a Narrative textual typology, articulating ideas, facts, and information in a sequenced and logical way, presenting the constituent elements of this type of textual structure: narrator, place/space, temporal organization, multiple or single characters performing actions.
Textual Cohesion: Correct use of linguistic mechanisms to interconnect text elements, such as words, sentences, and paragraphs.
Each dimension was assessed using integer levels ranging from 1 to 5, with higher levels indicating better text quality and language proficiency and lower levels demonstrating a lack of proficiency.
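Because each competency is graded on integer levels from 1 to 5, a system that produces real-valued scores needs to map them onto that scale before submission. The snippet below is a minimal sketch of one possible mapping (round half up, then clamp), assuming a hypothetical model that outputs real-valued scores; the function name `to_level` is illustrative and not part of the competition material.

```python
import math

# Valid grade range stated in the rubric.
MIN_LEVEL, MAX_LEVEL = 1, 5


def to_level(score):
    """Round a real-valued score half up, then clamp it to the 1-5 scale."""
    # math.floor(x + 0.5) avoids the round-half-to-even behavior of round().
    rounded = math.floor(score + 0.5)
    return max(MIN_LEVEL, min(MAX_LEVEL, rounded))


# Hypothetical raw model outputs, for illustration only.
levels = [to_level(s) for s in (0.2, 2.5, 3.49, 4.5, 6.1)]
print(levels)
```

Other mappings (for example, treating the task as five-class classification) are equally valid; this one is shown only because it is short.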
Essay Example
The essays are stored in a comma-separated values (CSV) file, and each essay has the following fields: id, student essay text, prompt (motivating text), and the grades for the four competencies (formal_register, thematic_coherence, narrative_rhetorical_structure, cohesion). Below is an example of an essay extracted from the training corpus with its motivating text and rubric grades.
ID: 2
Student essay: "Chovia muito naquele dia, com trovões muito altos, vindos do céu. E depois que a chuva passou encontrei no quintal da minha cada uma pedra muito gigante brilhante não acreditei no que estava vendo, estava encantada com aquilo peguei e fui lá para dentro sem acreditar e guardei em um cofre, sem saber se realmente era real"
Prompt (Motivating text): "Chovia muito naquele dia, com trovões muito altos, vindos do céu. E depois que a chuva passou, encontrei no quintal da minha casa uma pedra muito brilhante."
Grades:
Formal Register: 4
Thematic Coherence: 3
Textual Typology: 3
Textual Cohesion: 4
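A record like the one above could be read with Python's standard csv module, as in the following sketch. The four grade column names come from the field list above; the column names "id", "essay", and "prompt", the `load_essays` helper, and the abridged sample row are assumptions made for illustration and may differ from the released files.

```python
import csv
import io

# Grade columns as named in the field list above.
GRADE_COLUMNS = [
    "formal_register",
    "thematic_coherence",
    "narrative_rhetorical_structure",
    "cohesion",
]

# Hypothetical file content: "id", "essay", and "prompt" are assumed
# column names, and the essay text is abridged.
SAMPLE_CSV = (
    "id,essay,prompt,formal_register,thematic_coherence,"
    "narrative_rhetorical_structure,cohesion\n"
    '2,"Chovia muito naquele dia, com trovões muito altos...",'
    '"Chovia muito naquele dia, com trovões muito altos, vindos do céu.",4,3,3,4\n'
)


def load_essays(handle):
    """Parse essays from an open CSV file, converting grades to integers."""
    essays = []
    for row in csv.DictReader(handle):
        for column in GRADE_COLUMNS:
            grade = int(row[column])
            if not 1 <= grade <= 5:
                raise ValueError(f"{column} out of range: {grade}")
            row[column] = grade
        essays.append(row)
    return essays


essays = load_essays(io.StringIO(SAMPLE_CSV))
print(essays[0]["id"], [essays[0][c] for c in GRADE_COLUMNS])
```

With a real file, `load_essays(open(path, newline="", encoding="utf-8"))` would replace the in-memory sample, assuming the file is UTF-8 encoded.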
Call for competitors
This PROPOR 2024 shared task focuses on the challenges of automatically scoring narrative essays written in Portuguese by primary school students.
This shared task will quantitatively assess automatic essay scoring solutions using 1,235 essays written in Brazilian Portuguese.
60% of essays will be available for training, 10% for validation, and 30% for evaluation.
At the end of the competition, the test dataset and the program used to compute the evaluation measures will be made publicly available.
The competition will be hosted on the Kaggle platform, and the access link will be sent to registered teams.
The submitted systems will be evaluated using the weighted F1-score and Cohen's kappa coefficient with linear weighting.
The submitted solutions will also be compared with the performance of some baseline techniques.
A report on the competition will be published in the PROPOR 2024 conference proceedings.
NOTE: You may participate in this contest even if you do not plan to attend the PROPOR 2024 conference.
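For local validation, the two evaluation measures named above can be approximated with the standard textbook definitions. The sketch below is a dependency-free illustration, not the official evaluation program: the function names, the toy grade lists, and the handling of the degenerate constant-rating case are assumptions, and the official scorer may aggregate the four competencies differently.

```python
from collections import Counter

LEVELS = [1, 2, 3, 4, 5]


def weighted_f1(y_true, y_pred, labels=LEVELS):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += support[label] / len(y_true) * f1
    return score


def linear_weighted_kappa(y_true, y_pred, labels=LEVELS):
    """Cohen's kappa with disagreement weights |i - j| between levels."""
    n = len(y_true)
    index = {label: i for i, label in enumerate(labels)}
    observed = Counter((index[t], index[p]) for t, p in zip(y_true, y_pred))
    true_counts = Counter(index[t] for t in y_true)
    pred_counts = Counter(index[p] for p in y_pred)
    observed_dis = 0.0
    expected_dis = 0.0
    for i in range(len(labels)):
        for j in range(len(labels)):
            weight = abs(i - j)
            observed_dis += weight * observed[(i, j)] / n
            expected_dis += weight * true_counts[i] * pred_counts[j] / (n * n)
    if expected_dis == 0:
        # Assumption: both raters constant and identical counts as agreement.
        return 1.0
    return 1.0 - observed_dis / expected_dis


# Toy grades for one competency, for illustration only.
gold = [4, 3, 3, 4, 5, 2]
pred = [4, 3, 2, 4, 4, 2]
f1 = weighted_f1(gold, pred)
kappa = linear_weighted_kappa(gold, pred)
print(round(f1, 4), round(kappa, 4))
```

Linear weighting penalizes a prediction in proportion to its distance from the reference level, so predicting 4 for a level-5 essay costs less than predicting 2.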
Register to participate in the competition
Register your team by filling out this form: https://forms.gle/y9ZKSzLNTN7nxuzg7
Important Dates
Nov. 27, 2023 Release of the training and validation dataset
Nov. 27, 2023 Competition opens to the participants
Jan. 20, 2024 Deadline for entering the competition.
Feb. 01, 2024 Paper submission to the PROPOR 2024 Program Committee describing the contest and the obtained results.
Mar. 14, 2024 Final contest results to be announced at the PROPOR 2024 conference.
Organizers
Hilário Tomaz Alves de Oliveira