Laura Horton

Lexical diversity: describing the lexicons of young sign systems

Homesign systems are manual-gestural systems developed by deaf individuals who do not have access to spoken language or to linguistic input in the form of an established sign language. These systems provide a window into the earliest stages of language emergence. A central question about this process concerns the similarity and convergence of different homesign systems created within the same community. However, quantifying similarity across a constellation of homesign systems is a particular challenge in the study of young sign systems. Studies documenting young sign languages, rural sign languages, and homesign systems have emphasized that there is considerable variation between signers, even in languages with a longer history of use (de Vos & Nyst, 2018; Sandler et al., 2011). These studies often compare sign forms directly. For example, Sandler et al. (2011) compare sign forms from signers of Al-Sayyid Bedouin Sign Language (ABSL) and note several issues, including significant variation in form between signers, even those from the same family, and the fact that signers often produce multiple signs to describe a single photo.

In this study, I describe an alternative strategy for evaluating similarity between lexicons that does not involve comparing sign forms directly, but rather considers the distribution of signs within a given signer’s lexicon. I present a measure that I call lexical diversity to characterize the frequency of sign forms within homesigner lexicons. This measure draws on one of the most durable findings from statistical linguistics: the frequency distribution of words. Across diverse corpora, words seem to follow a fairly simple distribution, termed Zipf’s law (Zipf, 1936; Mandelbrot, 1953). This distribution is characterized by a small set of high-frequency words that account for the majority of all words produced, and a large set of low-frequency words that are produced rarely (Piantadosi, 2014).

The lexical diversity measure draws on two aspects of a Zipfian distribution: (1) the size of the set of signs that are least frequent, those signs that are produced only once, and (2) the sign that is produced most often. These two indices correspond to two concepts from Kirby et al. (2015): expressivity and compressibility. Within the lexical diversity measure, signs that are produced only once are maximally expressive, since they have only one referent. However, signs that are repeated across multiple referents may provide some degree of compressibility in the lexicon and may reflect emergent structure in the system, in the form of classifiers or compounds that reflect categories in the lexicon.

I apply the lexical diversity measure to a dataset of signs elicited from ten child homesigners and their communication partners, including deaf adult relatives and peers. All participants live in Nebaj, a town in the central highlands of Guatemala. There is no standard sign language in use in Nebaj, so deaf people in the community develop homesign systems. Many of the deaf people in this sample are in contact with each other. Several families have multiple generations of deafness, so some deaf children have a deaf adult relative as a communicative model. There is also a local school for special education, where deaf students attend class together. Each participant described a set of 62 photos using their homesign system. The set of signs that each homesigner produced is treated as a mini-lexicon, and signs were glossed based on their iconic form properties, with a “conceptual component” (similar to Richie et al., 2014). Some conceptual components were repeated for multiple photos, such as a sign that iconically resembled driving a car, glossed DRIVE, that many signers produced for several different vehicles in the set of photos, while some conceptual components were produced for only one photo (see Fig. 1).

The lexical diversity index consists of two values for each signer’s lexicon: the proportion of signs that were produced only once and the proportion accounted for by the most-repeated sign. I plot these values and find that the distributional patterns of the lexical diversity measure correspond to shared socio-communicative experiences (see Fig. 2). Homesigners who interact with other homesigners at school (their peers) have a larger proportion of signs used for only one photo. Homesigners with deaf family members, who interact with deaf adult homesigners, have a balanced proportion of signs used for only one photo and repeated signs. These systems share properties with what Kirby et al. (2015) describe as structured languages: languages that have some compressibility (in the form of repeated signs) while maximizing expressivity (signs for only one photo).
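The two proportions that make up the index can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name `lexical_diversity` is hypothetical, and the sample reuses the counts reported in the Figure 1 caption (13 sign tokens, 8 unique signs, STEER produced three times, four signs produced once). The glosses SIGN_A, SIGN_B and SIGN_C are placeholders for the three remaining signs, which the caption does not name.

```python
from collections import Counter


def lexical_diversity(glosses):
    """Return (hapax_proportion, most_frequent_proportion) for one session.

    `glosses` is a list with one gloss per sign-photo pairing, so a sign
    produced for three photos appears three times.
    """
    counts = Counter(glosses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no signs to score")
    # Signs produced for exactly one photo (hapax legomena).
    hapax = sum(1 for n in counts.values() if n == 1)
    # Number of photos described with the single most repeated sign.
    most_frequent = max(counts.values())
    return hapax / total, most_frequent / total


# Counts taken from the Figure 1 caption; SIGN_A/B/C are placeholder
# glosses (an assumption) for the three signs the caption does not name.
sample = (
    ["STEER"] * 3
    + ["LONG", "ROUND", "SPICY", "SNIFF"]
    + ["SIGN_A"] * 2
    + ["SIGN_B"] * 2
    + ["SIGN_C"] * 2
)

hapax_prop, top_prop = lexical_diversity(sample)
print(round(hapax_prop, 3), round(top_prop, 3))  # 0.308 0.231
```

Each signer's session then yields one point, with the hapax proportion on the horizontal axis and the most-frequent-sign proportion on the vertical axis, as in Figure 2.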

Figure 1. Sample lexical richness score. In the left column, a subset of signs produced by one participant during one session are illustrated with their gloss to the left. The stimulus photos that the signer described are pictured in the right column. If a sign was produced for a photo, the two are connected by a solid line. Some signs were produced for more than one photo, e.g., STEER, which was produced for three different photos. Since some signs were produced for multiple items, there were 8 unique signs, but 13 signs total in the calculations for the Hapax Legomena and Most Frequent Sign. The most frequent sign, STEER, constitutes 3/13 or .231 of the total signs in this sample set. The four signs produced for only one photo (Hapax Legomena), LONG, ROUND, SPICY and SNIFF, are each 1/13 or .077. Added together, the set of signs produced only once is 4/13 or .308.

Figure 2. Lexical richness scores for signers from different communicative ecologies. The horizontal axis (Hapax signs) shows the signs that were produced only once as a proportion of all the signs an individual signer produced in a session. A higher value on this axis suggests that a signer produces more unique – expressive – signs. The vertical axis (Most Frequent Sign) shows the most frequent (repeated) sign as a proportion of all of the signs that a signer produced. A higher value on this axis indicates that a signer repeats the same sign for many referents (possible compressibility). Different types of communicative ecology cluster in different regions, with signers from peer ecologies having a higher proportion of hapax signs, while signers from some family ecologies have a higher proportion of frequent, repeated signs.

References

de Vos, C. & Nyst, V. (2018). The Time Depth and Typology of Rural Sign Languages. Sign Language Studies, 18(4), 477–487.

Kirby, S., Tamariz, M., Cornish, H., & Smith, K. (2015). Compression and communication in the cultural evolution of linguistic structure. Cognition, 141, 87–102.

Mandelbrot, B. (1953). An informational theory of the statistical structure of language. Communication Theory, 486–502.

Piantadosi, S. (2014). Zipf’s word frequency law in natural language: A critical review and future directions. Psychonomic Bulletin & Review, 21, 1112–1130.

Richie, R., Yang, C. & Coppola, M. (2014). Modeling the Emergence of Lexicons in Homesign Systems. Topics in Cognitive Science, 6, 183–195.

Sandler, W., Aronoff, M., Meir, I., & Padden, C. (2011). The Gradual Emergence of Phonological Form in a New Language. Natural Language and Linguistic Theory, 29(2), 503–543.

Zipf, G. (1936). The Psychobiology of Language. London: Routledge.
