Monojit Choudhury,
Professor of Natural Language Processing,
Mohamed bin Zayed University of Artificial Intelligence,
Abu Dhabi, United Arab Emirates
Abstract:
The Curious Case of Honorifics in South Asian Languages and Their Treatment in LLMs
Many South Asian languages exhibit honorific distinctions in second- and third-person pronouns, along with corresponding agreement patterns. For example, Hindi encodes three levels of formality: formal (aap), neutral (tum), and familiar (tu). The appropriate choice among these forms is governed by a complex interplay of socio-cultural conventions, interpersonal relationships, speaker attitudes, and contextual factors, which vary across languages and regions. In this talk, I will present two complementary lines of inquiry: (1) how large language models (LLMs) can be used as tools to conduct large-scale analyses of honorific usage from Wikipedia data, and what such analyses reveal about underlying socio-cultural conventions; and (2) how honorific systems themselves can serve as a lens for examining the socio-cultural and pragmatic understanding of South Asian languages exhibited by LLMs. The findings reveal several universal patterns across the languages studied, alongside striking differences.
Usman Naseem,
Assistant Professor,
School of Computing at Macquarie University,
Sydney, Australia
Abstract:
From Cultural Blind Spots to Cultural Alignment in Large Language Models
Cultural alignment in Large Language Models (LLMs) is essential for producing contextually aware, respectful, and trustworthy outputs. Without it, models risk generating stereotyped, insensitive, or misleading responses that fail to reflect the diversity of cultural norms and communication practices. Language carries meaning beyond words, shaped by shared expectations about politeness, intent, and context, which vary across communities and situations. When these factors are not taken into account, models may produce responses that are fluent but inappropriate or misunderstood in context. In this talk, I will present two complementary perspectives: (1) how current alignment approaches overlook these aspects of meaning, leading to recurring patterns of misinterpretation; and (2) how examining such failures helps reveal deeper gaps in how LLMs understand human communication. The discussion highlights consistent issues, including the over-reliance on dominant norms and uneven handling of culturally sensitive expressions, and points toward the need for alignment approaches that better account for the diversity of real-world communication.