Publications

On this page, you can browse the MACAWS team's publications. To look up the abstract of a specific publication, click on the links provided below.

Teaching with learner corpus data

Sommer-Farias, B., Vinokurova, V., Gorlova, A., & Centanin-Bertho, M. (2023). Teaching with learner corpus data. FLTMAG. https://fltmag.com/teaching-learner-corpus-data/

This article discusses the use of learner corpora in language teaching. It outlines the argument for using learner corpora in the classroom, discusses key terms in corpus-based teaching and strategies for using learner corpora, and presents a sample lesson from a Russian language classroom based on data from the MACAWS corpus.

Multilingual learner corpus for less commonly taught languages

Sommer-Farias, B., Novikov, A., Picoral, A., Bertho, M., & Staples, S. (2022). Multilingual learner corpus for less commonly taught languages. International Journal of Learner Corpus Research, 8(2), 261-282.

This article provides a detailed account of the framework and research and pedagogical applications of the [our learner corpus]. [Our learner corpus] is a monitor learner corpus of written and oral assignments on various topics from Foreign Language (FL) learners. Currently, the corpus contains 124,054 words in Russian and 536,168 in Portuguese, but it is updated each semester as new texts are added. The online interface allows teachers, students and researchers to search for words and phrases and access metadata on students, courses and assignments. Our novel interactive Data-driven Learning (iDDL) tool allows embedding of concordance lines into websites and Learning Management Systems (LMS), facilitating student interaction with concordance lines. We also have an offline version of the corpus that is available upon request.

Keywords: multilingual, Less Commonly Taught Languages (LCTL), interactive Data-driven Learning (iDDL)

Learner corpus as a medium for tasks

Novikov, A., & Vinokurova, V. (2022). Learner corpus as a medium for tasks. In W. Martelle & S. V. Nuss (Eds.), Teaching Russian through task: Task-based/supported instruction of Russian as a foreign language. Routledge.

This chapter argues for the use of a learner corpus in task-based teaching of Russian. First, the chapter provides the definitions of tasks and discusses texts in task-based teaching. Second, we elaborate on focused tasks in light of the principles of language awareness and introduce Data-Driven Learning (DDL). Finally, we describe two types of focused tasks, namely structure-trapping tasks and DDL tasks, and explain how a learner corpus can be seen as a medium between the more traditional structure-trapping tasks and innovative DDL tasks.

Syntactic and morphological complexity measures as markers of L2 development in Russian

Novikov, A. (2021). Syntactic and morphological complexity measures as markers of L2 development in Russian [Doctoral dissertation, University of Arizona].

Within second language acquisition research, L2 development has been traditionally analyzed through the dimensions of Complexity, Accuracy and Fluency (CAF) (Larsen-Freeman, 2009; Ortega, 2003; Skehan, 2009). Complexity within the CAF framework has gained the most attention and has often been examined through measures associated with clausal length (Bulté & Housen, 2012). In contrast, the present study examines complexity through the register-functional framework (Biber, Gray & Poonpon, 2011; Biber, Gray & Staples, 2016). The fundamental principle of the register-functional framework is that complexity is situation-dependent, meaning that complexity depends on the situational characteristics of texts such as communicative purposes of texts and their production circumstances. Thus, instead of relying on particular measures of complexity such as complexity indices or T-unit measures, the investigation of complexity within the register-functional framework begins with a linguistic description of texts. The present study builds on the foundation of the register-functional framework and adds to the body of L2 development research by being the first of its kind in L2 Russian. Previous studies that investigated L2 complexity in Russian are rather few (e.g., Henry, 1996; Kisselev & Alsufieva, 2017) but are also limited in that they 1) only study writing; 2) use either omnibus complexity measures or a rather limited set of measures; 3) investigate high levels of proficiency. The overall goal of this study is to provide a comprehensive description of syntactic and morphological L2 development at lower levels (e.g., beginner to intermediate) across speech and writing in Russian. To address these research gaps, the present study examines L2 development through morphological and syntactic complexity measures in L2 Russian. The study uses a corpus of written and spoken texts produced by learners across four program levels (i.e. the first two years of Russian). 
First, the study examines individual measures of morphological and syntactic complexity and interprets the findings in light of the curriculum progression and assignment effects. Second, the study performs a Multidimensional Analysis (MD) in order to group the individual measures of complexity into dimensions of complexity that are interpreted functionally. The results of the study show that both syntactic and morphological complexity measures behave differently across program levels. For example, while adverbial if- and when-clauses increase with program level, adverbial because-clauses decline. Similarly, while post-modifying nouns increase, attributive adjectives decrease. In terms of morphological complexity, the measures that have clear increasing trends across program levels are genitive nouns and adjectives, instrumental nouns and adjectives, dative nouns, and past perfective and imperfective verbs. The MD yielded two dimensions of complexity: 1) Narrative vs. Non-narrative/Descriptive, and 2) Informational vs. Personal. The narrative side of Dimension 1 includes perfective and imperfective past verbs, while the non-narrative side includes 3rd person plural verbs and attributive adjectives. The informational side of Dimension 2 is represented by complexity measures such as prepositional adjectives, genitive singular nouns, genitive adjectives and attributive adjectives. In contrast, the personal side is characterized by such measures as 1st person present tense verbs, accusative nouns and non-finite complement clauses. These dimensions of complexity showed significant differences between program levels. Significant interactions between program level and mode were also demonstrated, pointing to differences between speech and writing with regard to these dimensions across program levels.
Although the complexity measures included in these dimensions are very specific to Russian, these two dimensions have been consistently identified in other MD studies.

To view this dissertation, click here

The acquisition of preposition + article contractions in L3 Portuguese among different L1-speaking learners: A variationist approach

Picoral, A., & Carvalho, A. (2020). The acquisition of preposition + article contractions in L3 Portuguese among different L1-speaking learners: A variationist approach. Languages, 5(4), 45-62.

This paper sheds light on the paths of third language (L3) acquisition of Portuguese by Spanish–English speakers whose first language is Spanish (L1 Spanish), English (L1 English), or both in the case of heritage speakers of Spanish (HL). Specifically, it looks at the gradual acquisition of a categorical rule in Portuguese, where some prepositions are invariably contracted with the determiner that follows them. Based on a subcorpus of MACAWS, comprising 1910 written assignments by Portuguese L3 learners, we extracted 21,879 tokens in obligatory contraction contexts and submitted them to a multivariate analysis. This analysis allowed for the investigation of the impact of linguistic (type of preposition and definite article number and gender) and extra-linguistic factors (course level and learner’s language background), with logistic regression modeling with sum contrasts and individual as a random effect. While results point to some clear similarities across the three language groups—all learners acquired the contractions in a u-shaped progression and used more contractions with the a preposition and fewer with the por preposition—participants acquire contractions at a higher rate when the article is singular than when it is plural, and in the case of HL speakers, more so when the article is masculine than when it is feminine. These results confirm the facilitatory role of a previously acquired language (i.e., Spanish) that is typologically similar to the target language (i.e., Portuguese) in transfer patterns during L3 acquisition.

You can access the full article at the Languages webpage. We welcome your comments and questions!

L3 Portuguese by Spanish-English bilinguals: Copula construction use and acquisition in corpus data

Picoral, A. (2020). L3 Portuguese by Spanish-English bilinguals: Copula construction use and acquisition in corpus data [Doctoral dissertation, University of Arizona].

Previous research on third language (L3) acquisition has shown that the source language for transfer to the L3 can be either an L1, an L2, or both (Bardel & Falk, 2007; Flynn et al., 2004; Rothman, 2014). It has been hypothesized that either typological similarities between previously acquired languages and the target language (Rothman, 2010), or the language status (L1 vs. L2) of previously acquired languages (Bardel & Falk, 2007) determine cross-linguistic influence. This dissertation investigates the acquisition of copula structures in L3 Portuguese by three groups of adult Spanish-English bilinguals: L1 English L2 Spanish, L1 Spanish L2 English, and L1 Spanish/English (i.e., heritage speakers of Spanish for the purposes of this dissertation). Language use by both native speakers (L1 Spanish, L1 English, and L1 Portuguese) and learners (L3 Portuguese) is analyzed using word embeddings and logistic regression modeling. The goal of these methods is to reveal patterns of copula use and acquisition. Copula constructions were chosen because they allow for the combined investigation of form, syntactic frame, and concept/meaning, as proposed by third language acquisition scholars. The main goal of this dissertation is both to shed light on transfer patterns from previously acquired languages (i.e., Spanish and English) to L3 Portuguese and to establish L3 Portuguese developmental patterns across bilingual groups. Results show evidence of L3 Portuguese development for all three groups of Spanish-English bilinguals. However, transfer patterns from Spanish and English onto L3 Portuguese are not the same across all groups, varying in degree depending on the copula construction. These results conflict with the Typological Primacy Model, which predicts that L3 acquisition in adulthood starts off from a wholesale transfer of the pre-acquired language system that is most typologically similar to the target language (Rothman, 2014).
This dissertation offers support instead to L3 acquisition models that take into consideration structural characteristics of individual constructions, and how similar or different these are between source and target languages, including models such as the Parasitic Model (Hall et al., 2009).