How is meaning structured in the mind? While decades of research on spoken languages have shown that words are organized into highly efficient “small-world” semantic networks, much less is known about how sign languages organize conceptual knowledge. We constructed large-scale semantic networks for American Sign Language (ASL) using a free sign association task, in which Deaf signers produced the first three signs that came to mind in response to a cue sign. Data were collected using a custom-built platform for sign language responses and annotation, resulting in over 100,000 cue–response pairs from 45 Deaf ASL signers. We analyzed the structure of the ASL semantic lexicon using network metrics such as clustering, average shortest path length, and modularity, and compared these properties to spoken English using data from the Small World of Words (SWOW) project. While both networks exhibited small-world characteristics, the ASL network was more modular and sparsely connected, indicating a more compartmentalized structure. Measures of semantic density and response diversity were positively related to lexical frequency, and phonological neighborhood density was also positively correlated with semantic density. These findings suggest that while signed and spoken languages share core organizing principles, language modality may shape how meaning is structured and accessed.
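The three metrics named above can be sketched in plain Python on a toy undirected association graph. The edge list and community partition below are illustrative stand-ins, not actual ASL or SWOW data:

```python
# Toy sketch of the network metrics used in the analysis above.
from collections import deque
from itertools import combinations

# Illustrative cue-response pairs; the real network has ~100,000 pairs.
edges = [("MILK", "COW"), ("MILK", "DRINK"), ("COW", "FARM"),
         ("COW", "DRINK"), ("DRINK", "WATER"), ("WATER", "RAIN"),
         ("RAIN", "CLOUD"), ("CLOUD", "SKY")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def avg_clustering(adj):
    """Mean local clustering: the share of each node's neighbour pairs
    that are themselves linked."""
    coeffs = []
    for nbrs in adj.values():
        if len(nbrs) < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2 * links / (len(nbrs) * (len(nbrs) - 1)))
    return sum(coeffs) / len(coeffs)

def avg_shortest_path(adj):
    """Mean BFS distance over all reachable node pairs."""
    total = npairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        npairs += len(dist) - 1
    return total / npairs

def modularity(adj, communities):
    """Newman modularity Q of a given node partition."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    q = 0.0
    for comm in communities:
        intra = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        deg = sum(len(adj[u]) for u in comm)
        q += intra / m - (deg / (2 * m)) ** 2
    return q

parts = [{"MILK", "COW", "DRINK", "FARM"},
         {"WATER", "RAIN", "CLOUD", "SKY"}]
print(f"clustering={avg_clustering(adj):.3f}")      # local clustering
print(f"path length={avg_shortest_path(adj):.3f}")  # average path length
print(f"modularity={modularity(adj, parts):.3f}")   # community structure
```

A "small-world" network combines high clustering with short average paths; higher modularity, as reported for the ASL network, means edges concentrate within communities rather than between them.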
Highly accurate and precise models of sub-lexical compositionality with high coverage over the lexicon could be valuable tools for studying processes like acquisition, lexicalization, and specific cases of comprehension (such as neologisms and classifier predicates). However, characterizing the systematic relationships between form and meaning in sign language lexicons can be challenging, not simply because annotations from humans are expensive to obtain, but also because those relationships are often probabilistic and vary across individuals and contexts. This type of pattern recognition is well within the scope of machine learning (ML) methodology, and for American Sign Language (ASL), we now have sufficient data to empirically test where and to what extent ML methods can learn compositionality as a task unto itself. This seminar will introduce several open-source tools designed for linguists and cognitive scientists interested in studying compositionality at scale, with emphasis on models that automatically identify the phonological and lexical-semantic features of isolated sign productions, especially signs that the ML model has never seen before. Experimental results support the hypothesis that machine learning models can internalize certain form-meaning relationships, and that this ability can be helpful in (a) predicting the average age of acquisition, (b) predicting free associations from 41 early-acquisition deaf signers, (c) approximating the meaning of unseen signs (in isolation), and (d) reproducing the process of new sign creation. Future work will attempt to improve the operationalization of lexical semantics and apply these tools to sentence-level data.
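As an illustration of the kind of form-to-meaning prediction described above, here is a minimal, hypothetical sketch (not the seminar's actual models, and with made-up feature values): an unseen sign's meaning is approximated by averaging the semantic vectors of its phonologically nearest neighbours.

```python
# Hypothetical toy data: "phon" stands in for phonological features
# (e.g. handshape/location/movement codes), "sem" for semantic vectors.
lexicon = {
    "MILK":  {"phon": (1, 0, 2), "sem": (0.9, 0.1)},
    "COW":   {"phon": (1, 1, 2), "sem": (0.8, 0.2)},
    "IDEA":  {"phon": (4, 3, 0), "sem": (0.1, 0.9)},
    "THINK": {"phon": (4, 3, 1), "sem": (0.2, 0.8)},
}

def predict_meaning(phon, lexicon, k=2):
    """Average the semantic vectors of the k phonologically closest signs."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    nearest = sorted(lexicon.values(),
                     key=lambda e: dist(e["phon"], phon))[:k]
    dims = zip(*(e["sem"] for e in nearest))
    return tuple(sum(d) / len(nearest) for d in dims)

# An unseen sign whose form resembles MILK/COW lands near their meanings.
print(predict_meaning((1, 0, 1), lexicon))
```

This captures only the intuition that phonologically similar signs can have related meanings; the actual models discussed learn such mappings from data rather than relying on a fixed neighbourhood rule.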
Word associations are a powerful and unique tool to directly measure how meaning is organised in the mental lexicon. Several recent studies have shown that this method can be scaled efficiently to study diverse dimensions of lexical processing and semantic cognition across multiple languages, language variants, and language modalities. At the same time, massive associative models trained on large samples of human language, such as Large Language Models, have become very successful at capturing the formal aspects of language and at predicting semantic cognition as well.
In this talk, I assess the opportunities and limitations of both approaches. I will first give an overview of word association research developments that came out of the Small World of Words project and explain how word associations are measured, what aspects of our language and lived experience they capture, and what this implies for how meaning is organised in the lexicon. The second part of the talk focuses on new ways of quantifying semantic and conceptual diversity within and between languages, and how these insights might extend to comparisons between monolinguals and bilinguals, and between spoken and sign languages.
In the last part, I will discuss new directions and ongoing work using large language models to generate and annotate associations, creating rich contextualised knowledge graphs that approximate cultural schemata.