Download Windows and Mac Versions: https://github.com/ashleigh-cox/Vocabulary-List-Generator
Cox, A., Dixon, D. H., & Dixon, T. (2025). Vocabulary List Generator: A digital tool to generate frequency-based word lists adjusted for dispersion. Research Methods in Applied Linguistics, 4(1). https://doi.org/10.1016/j.rmal.2025.100180
Abstract
In second and foreign language teaching, helping students learn new vocabulary is an important goal, and many researchers aim to help teachers determine which vocabulary items should be prioritized by making predictions about the words that specific learner populations are likely to need. Such predictions can be made by examining the words that frequently appear in corpora that are representative of the learners’ target language use domains. While early vocabulary lists tended to consider only frequency and range when ranking important words in a domain, more recently, researchers have argued for more robust measures of dispersion, including evenness (i.e., how evenly spread a word is across texts) and pervasiveness (i.e., the proportion of texts using a word, which is similar to range) (Egbert & Burch, 2023). However, the current tools and methods that are commonly used to form corpus-based vocabulary lists often do not include these dispersion measures, especially the measure of evenness. To make it easier for researchers and teachers to inform language learning targets and goals with vocabulary lists that consider both frequency and recommended dispersion measures, this report describes and demonstrates a new digital tool, the Vocabulary List Generator. The tool was piloted with a corpus of psychology journal articles, and the vocabulary list generated from the corpus demonstrates its use and output.
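To make the three measures named in the abstract concrete, the following is a minimal Python sketch and not the Vocabulary List Generator's own code. It computes a word's raw frequency, its pervasiveness (the proportion of texts containing it), and one possible evenness-style measure. The abstract does not say which evenness formula the tool uses, so Gries's deviation of proportions (DP) is swapped in here as an assumption; DP is 0 for a perfectly even spread and approaches 1 as a word concentrates in fewer texts. The function name `word_stats` and the toy corpus are hypothetical.

```python
from collections import Counter


def word_stats(texts, word):
    """Return (frequency, pervasiveness, dp) for `word`.

    texts: a list of tokenized texts (each a list of word strings).
    dp is Gries's deviation of proportions, used here as an assumed
    stand-in for evenness; it is None when the word never occurs.
    """
    counts = [Counter(tokens)[word] for tokens in texts]  # hits per text
    sizes = [len(tokens) for tokens in texts]             # tokens per text
    frequency = sum(counts)
    corpus_size = sum(sizes)

    # Pervasiveness: share of texts in which the word appears at least once.
    pervasiveness = sum(1 for c in counts if c > 0) / len(texts)

    if frequency == 0:
        return 0, 0.0, None

    # DP: half the summed absolute difference between the observed share of
    # the word's occurrences in each text and that text's share of the corpus.
    dp = 0.5 * sum(
        abs(c / frequency - s / corpus_size) for c, s in zip(counts, sizes)
    )
    return frequency, pervasiveness, dp


# Toy corpus (hypothetical): four equally sized "texts".
texts = [
    ["memory", "task", "memory", "test"],
    ["memory", "and", "attention", "task"],
    ["attention", "and", "the", "brain"],
    ["the", "brain", "and", "memory"],
]

print(word_stats(texts, "memory"))
```

A list builder could then rank words on frequency while filtering or weighting by the two dispersion values, so that a word concentrated in only a few texts does not outrank one that is spread across the corpus.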
Download (Mac Version): https://github.com/danielhdixon/sampling_investigator
A Priori n size analysis and sample adequacy