Graphically Modeling Text Coherence for Automated Essay Scoring
Alex Meng, Amy Weng, Kevin Yin
GitHub Repository: https://github.com/KevinPHX/CS333AES
We performed a replication study of Stab and Gurevych’s 2017 paper, “Parsing Argumentation Structures in Persuasive Essays,” and applied their proposed system to 400 persuasive essays from the Automated Student Assessment Prize (ASAP), a dataset that carries no annotations of argumentation structure. Our task is to construct, for each essay, a directed graph of claims and premises, and then to score the essay on persuasiveness, which Wambsganss et al. (2020) define as the proportion of claims in the essay that are supported by premises. Ultimately, we investigate whether essays that human evaluators score higher necessarily exhibit more comprehensive argumentation structures.
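To make the scoring criterion concrete, the sketch below computes persuasiveness in the sense of Wambsganss et al. (2020) from a directed argument graph: the fraction of claims that receive at least one supporting premise. The dictionary-and-edge-list representation is purely illustrative and is not necessarily the data structure used in our pipeline.

```python
from typing import Dict, List, Tuple

def persuasiveness(components: Dict[str, str],
                   relations: List[Tuple[str, str]]) -> float:
    """Proportion of claims supported by at least one premise.

    components: maps component id -> type ("MajorClaim", "Claim", "Premise")
    relations:  directed (source_id, target_id) edges, e.g. a premise
                pointing to the claim it supports
    This is a hypothetical representation of the argument graph.
    """
    claims = {cid for cid, ctype in components.items() if ctype == "Claim"}
    if not claims:
        return 0.0
    supported = {tgt for src, tgt in relations
                 if components.get(src) == "Premise" and tgt in claims}
    return len(supported) / len(claims)

# Example: two claims, only one backed by a premise -> persuasiveness = 0.5
comps = {"c1": "Claim", "c2": "Claim", "p1": "Premise"}
rels = [("p1", "c1")]
print(persuasiveness(comps, rels))  # 0.5
```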
The data we use comprise the 402 essays of the Argument Annotated Essays (AAE) dataset released by Stab and Gurevych alongside their paper, together with the first 400 essays from the second set of the ASAP dataset. Of the AAE essays, 80 form our test set for evaluating the models we implemented, and the remaining 322 serve as training data. In Set 2, tenth graders were asked to articulate an argument regarding censorship in libraries and were graded on two domains: writing applications and language conventions. Since we are modeling the microstructure of arguments, we are concerned only with the first domain (D1), which encompasses evaluations of content, style, organization, and voice. Because we predict component types and relations from a variety of linguistic features of the essay itself, such as discourse markers and essay structure, the tree-construction process inherently captures aspects of organization that human evaluators consider when rating the essays.
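A minimal sketch of how the data described above might be assembled is shown below; the file name, column labels, and the particular 322/80 assignment are assumptions about the public releases, not necessarily the layout used in our repository.

```python
import pandas as pd

# AAE corpus: 402 annotated essays, split 322 training / 80 test.
# The released corpus ships its own train/test assignment; this slice is
# only a placeholder for that split.
aae_ids = [f"essay{i:03d}" for i in range(1, 403)]
train_ids, test_ids = aae_ids[:322], aae_ids[322:]

# ASAP essays: keep the first 400 essays of Set 2 with their domain-1 (D1) scores.
# File name and columns follow the public Kaggle release; adjust if your copy differs.
asap = pd.read_csv("training_set_rel3.tsv", sep="\t", encoding="latin-1")
set2 = asap[asap["essay_set"] == 2].head(400)[["essay_id", "essay", "domain1_score"]]
```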