Reference: Dixon, D. H. (2024). Introducing the Single Player Offline Game Corpus (SPOC): A corpus of seven registers from digital role-playing games. Corpora, 19(1), 107-122. https://doi.org/10.3366/cor.2024.0300
Download a sample of texts from SPOC here (five texts from each register in each game).
Get the whole corpus: The SPOC is freely available to researchers and educators who sign this waiver and email it to ddixon49@gsu.edu.
Once the signed waiver has been received, the corpus will be sent to you by email as a .zip file. By signing the waiver, you agree to use the corpus only for research or educational purposes and not to send or post any copies of it.
Abstract
This paper describes the compilation and design of the Single Player Offline Game Corpus (SPOC), which is being made freely available for research and educational purposes. The SPOC was compiled by extracting the localization files from the digital directories of four popular commercial digital role-playing games: Divinity: Original Sin II, Fallout 4, The Elder Scrolls V: Skyrim, and The Witcher 3: Wild Hunt. The 3.7-million-word corpus contains more than 30,000 texts and is unique from other game corpora in that it has the following three characteristics: (1) the texts are categorized into seven registers using Biber and Conrad's (2019) register framework, (2) texts are systematically parsed into the smallest meaningful units of observation, and (3) all texts were compiled from the data files of the games themselves. Nearly all language use in the four games is accounted for and parsed into register categories based on their underlying situational characteristics, in particular the communicative purposes and the associated contexts in which the texts appear in the games.