Abstract for ISLE 4 (the 4th conference of the International Society for the Linguistics of English), 18-21 September 2016, Poznan, Poland.
Anni Sairio, Samuli Kaislaniemi, Tanja Säily, and Terttu Nevalainen
University of Helsinki
The Corpus of Early English Correspondence (CEEC) has been compiled from edited personal letters in order to facilitate sociolinguistic research into the history of English; the CEEC family of five corpora spans the period from 1400 to 1800 and comprises 5.1 million words. According to Nevalainen and Raumolin-Brunberg (2003: 44), CEEC is a reliable tool for research on grammar, lexis and pragmatics, but “not necessarily for orthography and phonology, which should be studied from the most scrupulous editions and original manuscripts”, and this has been accepted as a limitation on the use of this material. However, CEEC has not been systematically examined for (a) the orthographical research opportunities it may nevertheless provide and (b) the types of editorial interference in the corpora. The CEEC family does not include editions with modernized spelling, but what exactly have the editors chosen to do? What do the editors state as their principles, and what have they actually done with the letters? And what kind of orthographical research might be carried out using CEEC?
This paper presents the work of the ERRATAS project, which charts editorial conventions in the roughly 200 collections of letters in the CEEC, starting from the seventeenth-century texts. The features charted include, for example, how spelling, capitalization and word divisions are retained, whether (and in what way) abbreviations are expanded, and whether the entire letter text is reproduced. The goal is to code the CEEC letter collections for orthographical reliability in order to make this material accessible for orthographical analysis. In addition, the work will contribute to creating standard practices for compiling manuscript-based editions.
The ERRATAS project is part of the multidisciplinary project Interfacing Structured and Unstructured Data in Sociolinguistic Research on Language Change (STRATAS), which aims to further historical sociolinguistic research by addressing social meaning in language change and by developing new digital tools for exploring linguistic data.
References
Nevalainen, Terttu & Helena Raumolin-Brunberg. 2003. Historical sociolinguistics: language change in Tudor and Stuart England. London: Pearson Education.