ERRATAS: Charting editorial interference and orthographical reliability in editions of English historical letters

Abstract for Working from manuscript sources: Colloquium of the VARIANTTI-network (Finnish network on textual criticism and scholarly editing), 4 November, Turku, Finland.

Samuli Kaislaniemi (with Anni Sairio, Tanja Säily & Terttu Nevalainen)

University of Helsinki

The quickest way to compile a corpus of historical manuscript texts is to use texts already published in editions. The results of such “philological outsourcing” may be suitable for research on content, grammar, lexis and pragmatics, but not necessarily for research on orthography, punctuation and layout, even when the editions used claim to have retained the original manuscript spelling. This compromise has nonetheless been accepted in order to create corpora such as the 5.1-million-word Corpus of Early English Correspondence (CEEC), which consists of 12,000 English personal letters spanning the years 1400–1800.

Compilers of historical corpora have been careful to acknowledge limitations in edition-based corpora, but the use of editions in fact does not preclude using the resulting corpus for orthographical research. Yet until now there has been no easy way to determine the philological reliability of the text in an edition. What do editors state as their practices, and what have they actually done with their sources?

This paper presents the work of the ERRATAS project, which charts editorial practices in the roughly 200 editions of letters used in the CEEC. The features charted include, among others, how spelling, capitalization and word divisions are retained, and whether abbreviations are expanded and marked. The goal is to create a typology of editorial reliability, assigning each edition a rating of orthographical reliability, and thus to make this material accessible for orthographical analysis. This will greatly increase our understanding of the philological reliability of existing editions of historical manuscripts and allow for new openings in historical linguistic research.
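
To make the idea of charting features and assigning a rating concrete, the following is a minimal sketch in Python. The features are those named above, but the record type `EditionPractices`, the function `reliability_score` and the simple count-based scoring are hypothetical illustrations, not the typology or rating scheme developed by the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditionPractices:
    """Hypothetical record of the editorial practices charted for one edition."""
    title: str
    spelling_retained: bool
    capitalization_retained: bool
    word_division_retained: bool
    abbreviations_expanded: bool
    expansions_marked: bool


def reliability_score(p: EditionPractices) -> int:
    """Count the features whose manuscript form is kept or recoverable (0 to 4).

    Assumption for illustration only: every feature carries equal weight.
    """
    # An expanded abbreviation remains recoverable if the expansion is marked.
    abbreviations_recoverable = (not p.abbreviations_expanded) or p.expansions_marked
    return sum([
        p.spelling_retained,
        p.capitalization_retained,
        p.word_division_retained,
        abbreviations_recoverable,
    ])


# Invented example edition, not taken from the CEEC.
example = EditionPractices(
    title="Hypothetical edition",
    spelling_retained=True,
    capitalization_retained=False,
    word_division_retained=True,
    abbreviations_expanded=True,
    expansions_marked=True,
)
print(example.title, reliability_score(example))
```

In a sketch like this, an edition that silently expands abbreviations scores lower than one that marks its expansions, since only the latter allows the manuscript form to be reconstructed.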

ERRATAS is part of the multidisciplinary project Interfacing Structured and Unstructured Data in Sociolinguistic Research on Language Change (STRATAS), which aims to further historical sociolinguistic research by addressing social meaning in language change and by developing new digital tools for exploring linguistic data.