
Description/documentation of Indigenous languages of the Americas. 

My commitment to the promotion of intensive fieldwork-based linguistics and documentary linguistics has led me to carry out careful description and documentation of the Choguita variety of Rarámuri (Tarahumara), an endangered, previously undescribed Uto-Aztecan language spoken in northern Mexico. With its highly agglutinating morphology and complex morphophonological processes, Choguita Rarámuri provides a unique opportunity to explore critical questions about the nature of the phonology-morphology interface, as well as the interplay between processing and distributional factors in agglutinating morphological systems. The description and documentation of Choguita Rarámuri also has an impact on community initiatives to reverse the contraction of Rarámuri. Language documentation that I have carried out in the past decade together with community members has produced a representative sample of video and audio recordings of a wide range of speech genres, currently housed at the Endangered Languages Archive (ELAR) at the School of Oriental and African Studies, University of London.

The specific goals of my long-term, ongoing project on this language are: 1) to assist community members in language documentation, mobilization of documentation products for immediate use in the community and training of younger speakers in documentation practices; 2) to develop theoretically informed publications; and 3) to produce a reference grammar of the language. This project was funded by the National Science Foundation Documenting Endangered Languages Program (DEL) (2012-2018). Previously, it was funded by the Hans Rausing Endangered Languages Project, hosted at the School of Oriental and African Studies (SOAS) of the University of London (2006-2010). Products of documentation are archived at ELAR and at UC Berkeley's Survey of California and Other Indian Languages. They are also deposited in the community, in response to native speakers' interest in having an archive of audio and video recordings that will serve as community heritage for future generations (details about the Choguita Rarámuri materials deposited at ELAR can be found here). This project builds on work started through the project Choguita Raramuri (Tarahumara) documentation and description.

My dissertation research is based on my fieldwork on Choguita Rarámuri, which I began in 2003. Before that, I worked with a speaker of Ojachichi Rarámuri (between 2000 and 2002) for my undergraduate honors thesis. I have worked with speakers of Mayo (Taracahitan; Uto-Aztecan), Huastec Nahuatl (Aztecan; Uto-Aztecan), Yucatec Maya (Yucatecan; Mayan), Popti’ (Kanjobalan; Mayan), Ixpantepec Nieves Mixtec (Oto-Manguean) and Ja'a Kumiai (Yuman) for different projects. I have also worked with published sources on Guarijío (Taracahitan; Uto-Aztecan) and several Tepiman languages (Uto-Aztecan) for projects on synchronic and diachronic aspects of the prosodic morphology of these languages.

My PhD dissertation provides a detailed description and analysis of the phonology and morphology of Choguita Rarámuri, a previously undocumented Uto-Aztecan language (PDF). I am currently completing a reference grammar of this language.

Prosodic documentation and prosodic typology

Choguita Rarámuri and related languages exhibit both phonetic and phonological properties of stress systems. Stress distribution is largely determined by the lexical stress makeup of roots and the morphological constructions in which they appear. One of the most typologically unusual features of the stress systems of these languages is that they possess an initial three-syllable stress window, a pattern that has only been documented in a handful of languages of the world (Caballero 2011). In addition to stress, Choguita Rarámuri also has contrastive tone in stressed syllables (Caballero & Carroll 2015). While the development of tonal contrasts has been documented for a number of Uto-Aztecan languages, no variety of Rarámuri had been described as featuring a tonal contrast. In addition to using F0 to mark lexical and morphological distinctions, Choguita Rarámuri exploits F0 intonationally. I am investigating the distributional and phonetic properties of word prosodic phenomena in Choguita Rarámuri as well as the tonal and non-tonal phenomena involved in its intonational encoding. More recently, I have been investigating grammatical tone in this language (Caballero 2018a, 2018b, Caballero & Austin to appear) and seek to contribute to a better understanding of grammatical tone patterns cross-linguistically and their analysis in phonological and morphological theory. More generally, this project seeks to make a significant empirical contribution to the development of word prosodic typology and to the documentation of prosody of endangered and understudied languages.

Complex morphology in agglutinating languages. 

In collaboration with my colleague Vsevolod Kapatsinski (University of Oregon), I seek to understand the role that processability and distributional factors play in constraining the morphological complexity of lesser-studied languages with agglutinating morphologies. In previous work (Caballero 2008, 2010), I have shown that speakers of Choguita Rarámuri treat subparts of complex words as undecomposable wholes and reuse them to derive more complex word forms even if they contain smaller parts that appear to be inconsistent with the meaning of the resulting word. Word structures themselves may be constrained by pressures to allow for reliable mapping between sounds and forms (Hay & Baayen 2005) as well as other processing pressures. All of these tendencies are probabilistic in nature and combine in parallel to jointly determine the form of a word. Using Choguita Rarámuri as a case study, we investigate the interplay between psycholinguistic and distributional factors in complex morphological systems. For this purpose, we combine documentation-based field research with quantitative corpus-based research and experimental data. In our paper ‘Perceptual functionality of morphological redundancy in Choguita Rarámuri (Tarahumara)’ (Caballero & Kapatsinski 2015), we report the results of our first perception experiment, designed to test the perceptual functionality of multiple exponence (ME) in Choguita Rarámuri. Our results suggest a mechanism of pragmatic inference at play in morphological processing, whereby listeners expect the speaker to produce as little as possible while successfully transmitting the intended information. Our work in progress seeks to elucidate morphological processing in this language using a Naive Discriminative Learning approach (Baayen et al. 2011). Our most recent results are reported in our forthcoming paper ‘How agglutinative? Searching for cues to meaning in Choguita Rarámuri (Tarahumara) using an amorphous model’ (Caballero & Kapatsinski to appear).

Typology of affix order and multiple exponence. 

Affix ordering and exponence are topics that lie at the core of morphological theory and form ideal testing grounds for determining the nature of the interface between different components of grammar. Affix order has been claimed to be driven by semantic factors, syntactic scope, psycholinguistic processing-based factors, prosodic factors, subcategorization frames, and morphological templates. These factors are not necessarily mutually exclusive, but the precise nature and limits of their interaction remain a topic of active research. Furthermore, growing documentation of lesser-known languages reveals patterns that challenge previous assumptions of possible affix order systems. In previous work (Caballero 2008, 2010), I document a new case of variable affix order in Choguita Rarámuri where alternative orders are determined by scope, templatic constraints, phonological subcategorization and phonological conditions on stem shape. 

Multiple (or extended) exponence (ME), the one-to-many mapping between a morphological category and its formal expression, is another important area of my research program. ME challenges widely held principles of economy and structural complexity, as well as incremental morphological theories. Despite its critical theoretical ramifications and the increasing number of documented cases, there is still no clear sense of the possible range of variation in ME patterns cross-linguistically. To fill this gap, I have investigated, together with Alice C. Harris (UMass Amherst), the possible parameters of variation in documented patterns of ME. We surveyed 95 cases of ME in languages belonging to 25 language families. We show that ME is more common and less constrained than commonly believed and that there are virtually no constraints on the types of ME in terms of form or meaning properties (Caballero & Harris 2012); attempts in various frameworks to constrain it thus face a number of obstacles. This study represents, to the best of our knowledge, the first typological characterization of ME.

I have also developed formal tools to account for cross-linguistic ME patterns in joint work with Sharon Inkelas (UC Berkeley). In our paper ‘Word construction: tracing an optimal path through the lexicon’ (Caballero & Inkelas 2013), we propose Optimal Construction Morphology (OCM), a cyclic, optimizing production model of morphology that builds on Construction Morphology and provides a unified account of attested patterns of blocking and semantically superfluous morphology. We hypothesize that ME arises cross-linguistically through the cyclic optimization of word structure along scales of meaning strength and structural well-formedness. Under this account, ME is generated through independently needed mechanisms, the same ones required to generate blocking effects. Our work in progress (Caballero & Inkelas 2018) explores how the formal tools of OCM can derive the full range of ME patterns attested to date, as documented in Harris (2017).