Computational Stylistics

Coalition Abstract

The written substrate of literary works is clearly the basis of all things literary. However, in the empirical study of literature, the textual base has long been secondary to the study of readerly and socio-economic dimensions. What is more, following the formalist and structuralist traditions, empirical textual studies have as a rule focused on features of “deviance”, or foregrounding. To recalibrate this situation, the computational stylistics coalition will concentrate on the textual substrate of literary discourse – in formal detail, as well as in quantitative breadth.

Following ideas by Genette (1993) and Hamburger (1993), we will examine textually the dimension of style as well as that of fictional world making. Our aim is to give a new text-based account of what makes literary texts literary at two levels: (a) inter-discourse (literary/non-literary), in comparison with non-literary works, and (b) intra-literary discourse, in diachronic and synchronic comparison with other (types of) literary texts. We ask whether literature, seen generally, is really typically marked by features that “stick out”, or whether other, more backgrounded features may be indicative. As this research requires large and varied data sets, as well as reliable and automated means of textual analysis, we will tap into the computational assets available within the paradigms of “Digital Humanities” and “Digital Stylistics”, using large samples of digitized literary texts and text-mining techniques for their analysis.*

Inductive Stock Taking

At the highest level, our computational stylistic research aims to contribute to a descriptive account of the full variety of genres and text types – throughout literary history up to the present, including canonical as well as popular literary production. It pursues the following open questions about textual indicators of fictionality/literariness:

· Intra-discourse distribution: How is language used in specific literary genres across variables such as time, language, nation, and culture?

· Inter-discourse distribution: Do literary texts show features, and ensembles of textual features, that are typically literary? If so, what are they, and are they stable across time and genres? Do they vary across cultures?

Literary Features Hunt

Given the long and relevant tradition of analyzing rhetorical features of style in literature, we also zoom in on features and patterns of deviance that are salient and attract readers’ attention, for example creative metaphorical language use and recurrence of stylistic features:

· Assessment: What textual features have been shown to systematically attract the attention of (today’s) readers or transport them into fictional worlds?

· Operationalization: Which of these features may be analyzed by means of computational stylistics, i.e. through formalization and quantification? What tools may be used to this end, and to what extent do these have to be adapted?

· Analysis: How are these features distributed across genres, epochs, cultures, and so on?

Naturally, the latter type of computational analysis may grow out of reader response studies, guide the construction of stimuli, and generally feed back into reader response studies.
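To make the step from operationalization to distributional analysis concrete, the following is a minimal sketch in Python, not a description of the coalition's actual tooling. It assumes that a feature can be approximated by a closed word list (here, hypothetically, first-person pronouns) and reports its rate per 1,000 tokens for each group label, such as a genre. The function names, the tokenizer, and the toy corpus are all illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumption: a deliberately simple tokenizer (runs of letters, with an
# optional ASCII-apostrophe part). Real studies would use a proper NLP pipeline.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def rate_per_thousand(tokens, feature_words):
    """Occurrences of the feature per 1,000 tokens (0.0 for empty input)."""
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in feature_words)
    return 1000.0 * hits / len(tokens)


def rates_by_group(corpus, feature_words):
    """corpus: iterable of (group_label, text) pairs, e.g. labelled by genre.

    Texts sharing a label are pooled before the rate is computed.
    """
    pooled = defaultdict(list)
    for label, text in corpus:
        pooled[label].extend(tokenize(text))
    return {label: rate_per_thousand(toks, feature_words)
            for label, toks in pooled.items()}


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    first_person = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
    toy_corpus = [
        ("novel", "I saw her and I knew"),
        ("news", "The council met on Monday"),
    ]
    for label, rate in rates_by_group(toy_corpus, first_person).items():
        print(f"{label}: {rate:.1f} per 1,000 tokens")
```

Normalizing by text length keeps groups of different sizes comparable. A word list is only the simplest possible operationalization; features such as creative metaphor would require annotation or more elaborate detection.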

---------------------

* Including but not limited to: frequency analyses of features and collocations, keyness, quantitative content analysis, stylometry, topic modeling, sentiment analysis, and semantic word embeddings. The methodology combines manual annotation, automatic detection, and machine learning with (descriptive, multivariate, inferential) statistics.
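As an illustration of one of the techniques named above, keyness is often computed with the log-likelihood (G2) statistic, which compares a word's frequency in a target corpus with its frequency in a reference corpus. The Python sketch below shows the common two-term form of that statistic. It is one possible choice, not necessarily the measure the coalition uses, and the function names and the signed-score convention (negative for words underused in the target) are assumptions made for this example.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Two-term G2 for one word.

    a, b: frequency of the word in the target / reference corpus
    c, d: total token counts of the target / reference corpus
    """
    if c <= 0 or d <= 0:
        raise ValueError("both corpora must be non-empty")
    total = c + d
    e1 = c * (a + b) / total  # expected frequency in the target corpus
    e2 = d * (a + b) / total  # expected frequency in the reference corpus
    g2 = 0.0
    if a > 0:                 # convention: 0 * ln(0) counts as 0
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2.0 * g2


def keyness(target_tokens, reference_tokens):
    """Rank the words of the target corpus by signed G2, highest first.

    Assumption of this sketch: the score is negated when the word is
    relatively less frequent in the target than in the reference corpus.
    """
    t, r = Counter(target_tokens), Counter(reference_tokens)
    c, d = len(target_tokens), len(reference_tokens)
    scores = {}
    for word, a in t.items():
        b = r.get(word, 0)
        g2 = log_likelihood(a, b, c, d)
        scores[word] = g2 if a / c > b / d else -g2
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy token lists, for illustration only.
    literary = ["sea", "sea", "the"]
    reference = ["the", "the", "market"]
    for word, score in keyness(literary, reference):
        print(f"{word}\t{score:.2f}")
```

On real corpora the token lists would come from a tokenizer applied to the literary and non-literary samples, and the resulting scores would normally be reported alongside an effect-size measure, since G2 grows with corpus size.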

References

Genette, G. (1993). Fiction and Diction (C. Porter, Trans.). Ithaca: Cornell University Press.

Hamburger, K. (1993). The Logic of Literature (M. J. Rose, Trans.; 2nd rev. ed.). Bloomington: Indiana University Press.

Authors: J. Berenike Herrmann, Arthur Jacobs, Andrew Piper