Day 1| September 11, Thursday
08:30 – 09:00
Registration on-site & Breakfast
09:00 – 09:15
Opening Remarks & Welcome
Jae Yeoul Jun (Vice President)
Heejung Cha (Dean of College of Humanities)
09:15 – 10:05
“Learning Variability Network Exchange (LEVANTE): A global framework for measuring children’s learning variability through collaborative data sharing” *remote presentation
Despite the ubiquity of variation in child development within individuals, across groups, and across tasks, timescales, and contexts, dominant methods in developmental science and education research still favor group averages, short snapshots of time, and single environments. The Learning Variability Network Exchange (LEVANTE) is a framework designed to enable coordinated data collection by diverse individual research teams worldwide, with the goal of measuring children’s variability within and across individuals, groups, and cultures. The measure set developed for LEVANTE aims to capture variability in learning outcomes (literacy and numeracy) as well as core cognitive constructs (language, reasoning, executive function, spatial cognition) and social constructs (social cognition, caregiver engagement, relations with peers), with each measure selected based on length, psychometric properties, cross-cultural applicability, and age span. LEVANTE will yield a large, open access longitudinal dataset for long-term research use. I'll talk about the initial construction and piloting of the LEVANTE measures, and will end by discussing some connections to AI benchmarking.
Michael C. Frank
(Stanford University)
📌Keynote Speaker
Early Language Acquisition (Moderator: Rajalakshmi Madhavan)
10:05 – 10:30
“Examining the different uses of normative language during caregiver-child interactions in China and the US” *remote presentation
Along children’s developmental trajectory, social interactions with caregivers facilitate the development of their moral reasoning (Mammen & Paulus, 2023). 
In this process, normative language, which passes down across generations the value judgement regarding the right and wrong of individual actions in accordance with social conventions or formalized rules, may play a particularly important role (Mammen & Paulus, 2023; Paulus, 2020). Moreover, given the vast cross-cultural differences in the nature of social rules and caregiver-child communications, distinct expressions and enforcement of norms and conventions may lead to different uses of normative language in caregiver-child conversations (Mammen & Paulus, 2023). The current study focuses on two cultures that emphasize different beliefs and values, namely China and the United States, to probe whether there is a difference in the use of normative language in caregiver-child naturalistic interactions. Specifically, I analyzed naturalistic caregiver–child interactions from Chinese and American families involving children aged 4 to 6 years, drawing data from previously established CHILDES corpora in both cultures (e.g. Brown, 1973; Luo et al., 2012; MacWhinney, 2000; Warren-Leubecker & Bohannon, 1984). The preliminary results suggest that Chinese caregivers resort to parental authority constantly to reinforce children’s politeness in constructing positive interpersonal relationships. Meanwhile, American caregivers recognize their children’s equal status as conversational partners and explain in detail the moral rules and social conventions governing individual behaviors. This study contributes to the in-depth exploration of how caregiver-child interactions in China and the US respectively promote children’s moral development through the different uses of normative language.
References
Brown, R. (1973). A first language: The early stages. Cambridge, MA: Harvard University Press.
Luo, Y., Snow, C.E., & Chang, C. (2012). Mother-child talk during joint book reading in low-income American and Taiwanese families. First Language, 32 (4), 494-511. https://doi.org/10.1177/0142723711422631
MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Hillsdale, NJ: Erlbaum.
Mammen, M., & Paulus, M. (2023). The communicative nature of moral development: A theoretical framework on the emergence of moral reasoning in social interactions. Cognitive Development, 66, 101336. https://doi.org/10.1016/j.cogdev.2023.101336
Paulus, M. (2020). How do young children become moral agents? A developmental perspective. In J. Decety (Ed.), The social brain: A developmental perspective (pp. 161–177). The MIT Press. https://doi.org/10.7551/mitpress/11970.003.0012
Warren-Leubecker, A., & Bohannon, J. N. (1984). Intonation patterns in child-directed speech: Mother-father speech. Child Development, 55, 1379–1385.
Jiayi Song & Jinming Chen (Harvard Graduate School of Education)
10:30 – 10:55
“Birth to Three Language Acquisition: Influences of Ambient Language in Montessori Setting” *remote presentation
There is an extensive literature on topics related to language acquisition, including acquisition through direct instruction and Montessori education, especially in preschool; however, there is a lack of research in infant and toddler Montessori classrooms. Most of the empirical data regarding language acquisition has focused on the child’s acquisition of vocabulary through direct instruction, rather than the capability to learn from overhearing a third party in a naturalistic setting. The purpose of the intervention study was to add to the limited empirical research on language acquisition in infant and toddler Montessori environments. More specifically, the intervention assessed whether infants and toddlers could indirectly acquire new vocabulary through the Absorbent Mind from teachers’ and peers’ ambient dialogue during the Montessori three-period lesson. The research utilized a descriptive, correlational pre-and-post quasi-experimental design to assess and analyze vocabulary and ambient language. Data collection occurred in three Association Montessori Internationale (AMI) and American Montessori Society (AMS) infant and toddler mixed-age programs for a total of six classrooms throughout New York State and Maryland. The Language ENvironment Analysis (LENA) system was used to analyze audio recordings. Legal nonsense words were used to identify nonsense objects during the three-period lesson, the prescribed Montessori language presentation. Transcriptions of audio recordings quantified vocabulary acquisition and ambient language, including the nonsense words. Paired t-tests and ANCOVA were used to analyze children’s acquired vocabulary. A fidelity scale was developed to analyze the extent to which Montessori-trained teachers adhered to the three-period lesson intervention. The findings demonstrated that the youngest children, whether verbal or non-verbal, were able to acquire vocabulary from the classroom that was only ever overheard. 
The findings provide opportunities to improve infant and toddler teachers' classroom practice related to language acquisition. Additionally, the findings demonstrated the reliability of using LENA to measure ambient language in classroom settings. Lastly, the study demonstrates the need for replication in older age groups and in multilingual settings.
Claudine Campanelli (CUNY)
10:55 – 11:15 Coffee Break
11:15 – 11:40
“Learning with Less: What Child-Directed Speech Reveals about Efficient Language Learning”
Infants acquire language with remarkable efficiency, raising a fundamental question: does this success stem from unusually supportive input or from powerful learning mechanisms? This talk examines the input side, focusing on the properties of child-directed speech (CDS) in Korean. Across three studies, I show that apparent CDS advantages in word segmentation and phonological clarity can be traced to underlying structural and distributional characteristics, such as utterance length, lexical diversity, and lexical frequency, rather than to register itself. These findings suggest that many of the benefits attributed to CDS arise indirectly as byproducts of input structure, challenging the view that they are the result of direct adaptations for teaching. More broadly, the results highlight the importance of examining the structural properties of input to understand how CDS supports early language acquisition.
Eon-Suk Ko (Chosun University)
11:40 – 12:05
“Caregiver Contingent Feedback and Conversational Initiative in Early Vocabulary Development”
This study examines how maternal conversational initiative and children’s language status influence maternal responsiveness in early development. Using daylong naturalistic recordings from 141 Korean mother–child dyads (children aged 7–30 months), we explored whether caregivers who initiate more conversations also respond more promptly and frequently to their children’s speech-like vocalizations. Children were grouped as either at-risk or typical based on scores from the Korean MacArthur–Bates Communicative Development Inventory (K-MBCDI). Key measures included the adult-initiated conversational block (AICF), the speech response ratio (proportion of speech-like child vocalizations receiving maternal responses within 1 second), and the contingency ratio (selectivity of maternal response toward speech vs. non-speech). Results from mixed-effects regression showed that mothers who initiated more conversations tended to respond more frequently to their children’s speech-like vocalizations, with a stronger and more consistent effect in the typical group. While both AICF and child language group (typical vs. at-risk) significantly predicted the speech response ratio, the contingency ratio did not differ significantly between groups. These findings underscore the value of frequent and immediate maternal responsiveness—rather than selective feedback—as a support for early language development. Future work should explore longitudinal data and multimodal interaction features to further clarify causal pathways and individual differences in caregiver–child interaction patterns.
Jongmin Jung (Chosun University)
12:05 – 12:30
“Convergence Between Eye-Tracking-Based Word Recognition and Parental Report in Assessing Language Development in 14-Month-Old Korean Infants”
abstract
Jun Ho Chai (Sunway University)
12:30 – 13:30 Lunch
Home Language Environment (Moderator: Ioana Buhnila)
13:30 – 13:55
“The role of caregiver-child interactions and children’s interests in early vocabulary development”
The ‘active child’ perspective of children’s development suggests that children are active learners, who choose what, when, and from whom to learn, which also subsequently boosts learning from these scenarios. Recent research also demonstrates that children’s individual interests in certain natural object categories shape novel word learning within these categories. On the other hand, the ‘pedagogical parent’ also is an optimal source of information, who carefully selects and tailors the information they provide to the children to maximise their learning during interactions. In this work, I investigate how the active child and the pedagogical parent jointly influence children’s language learning in their formative years. Firstly, I explore how parents and children modify their verbal and non-verbal behaviours in naturalistic interactions with novel and familiar objects, and whether children’s selective visual attention modulates their learning from the interactions. Secondly, I investigate caregiver-child interactive behaviours during shared book reading, and whether children’s individual interests boost caregiver-child interaction quality and subsequent learning from the book reading scenario. Finally, I present a longitudinal analysis of how children’s emerging interests develop into sustained individual interests, and whether these emerging and sustained interests shape their vocabulary size across early development. With this, I aim to provide a more comprehensive understanding of the dynamics between children’s information seeking behaviour and parents’ information providing behaviour, and how these come together to shape early vocabulary development.
Rajalakshmi Madhavan (University of York)
13:55 – 14:20
“ManyBabies-AtHome Word Recognition: A remote, cross-linguistic study”
The majority of knowledge produced in developmental science comes from WEIRD countries (Western, Educated, Industrialized, Rich, Democratic; Singh et al., 2021), a bias which threatens to limit our understanding of the factors influencing early development. One promising solution to improve diversity, moving data collection to online, remote options, has so far not lived up to its promise, as the majority of online, remote studies are centered in the United States (Zaadnoordijk & Cusack, 2022). The ManyBabies-AtHome (MBAH) project aims to produce a resource-friendly, open-source, and accessible approach to make it possible for online studies to live up to their promise of increasing diversity (Zaadnoordijk et al., 2021). As a part of MBAH, the current study aims to investigate and establish best practices in the study of infants’ word recognition development using an online, remote version of the Looking-While-Listening paradigm (LWL; Fernald, Zangl, Portillo, & Marchman, 2008). Cross-linguistic studies using this paradigm are rare (e.g. Ramon-Casas et al., 2009), and experimental design and analytic decisions vary considerably between studies (Zettersten et al., 2021), rendering comparisons difficult or even impossible.
We will discuss our project goals as well as the state of the project. Our goals are four-fold: 1) Societal: Attract a more diverse set of researchers and study populations using a linguistically-inclusive and resource-friendly approach; 2) Theoretical: Investigate the development of infant word recognition abilities in a linguistically diverse sample; 3) Methodological: Determine the comparability of results produced in-lab with online, remote implementations of the LWL paradigm; and 4) Scientific: Generate an extensive dataset to train automatic eye gaze algorithms and for future researchers to plan their studies.
Katie Von Holzen (Technical University of Braunschweig)
14:20 – 14:45
“Home speech environment of Japanese infants: evidence from long-format recordings”
Spoken language acquisition is one of the most remarkable achievements of infancy and paves the way for later literacy, language, social, and academic skills. However, how infants achieve this, and with such exceptional speed, is not well understood. The role of infants’ home speech environment in language development has been documented; however, how this environment develops over the first years of life is still not well documented. This project examined the composition of infants’ early speech environment and its effects on infants’ language outcomes by using a unique dataset of large-scale naturalistic language environment data obtained in home settings. Data were collected from 30 infants longitudinally every three months from 6 to 18 months of age, allowing us to assess the evolution of input composition over time. At each timepoint, infants’ home speech environment was recorded for two days in a row via a wearable audio recorder. In the analyses reported here, we focus on the quantity of speech input and infant vocalizations. The quantity of adult speech that infants are exposed to was quantified as the frequency of adult speech per recording hour at infants’ ages of 6, 9, 12, 15, and 18 months using the Voice Type Classifier (VTC; Lavechin et al., 2020). The results demonstrated a significant effect of Age (F(4, 118.20) = 6.34, p < 0.001). The quantity of adult speech input was higher at 9 (M = 16.4, SD = 7.08) compared to 6 (M = 11, SD = 5.16), 15 (M = 11.1, SD = 6.57), and 18 months (M = 11, SD = 6.22). One aspect explaining these results might be infants’ motor behavior, as measured via an accelerometer integrated in the audio recorders. Infants showed the lowest amount of activity at 6 months, mostly spending time in a lying position, with an increase in crawling and upright positions at 9 and 12 months, and the highest amount of upright motor activity at 15 and 18 months. 
It is possible that at 9 and 12 months infants are mostly near their parents and are consequently exposed to the highest amount of speech input, whereas at 15 and 18 months high mobility might keep them at greater distances from the sources of speech input as they explore their environment. Regarding the amount of infants’ vocalization, the results demonstrated stability in the frequency of infants’ vocalizations per recording hour, with marginally significant age differences (F(4, 125.4) = 2.49, p = 0.05). The quantity of infant vocalizations was lowest at 18 months (M = 17.2, SD = 6.93), with similar quantities at 6 (M = 18.8, SD = 8.41), 9 (M = 19.3, SD = 6.56), 12 (M = 18.4, SD = 6.93), and 15 months (M = 18.8, SD = 6.40). However, post hoc tests did not reach statistical significance. Whereas there were no differences in the quantity of infants’ vocalizations across ages, it is possible that what changes with age is the quality of infants’ vocalizations, which will be explored in the next step.
Irena Lovčević (WPI-IRCN (International Research Center for Neurointelligence), The University of Tokyo)
14:45 – 15:10
"Bilingual Parenting Practices among Swahili-Speaking Families in Tanzania: Usage-Based Theory of Language Acquisition Perspective" *remote presentation
In Tanzania, Swahili functions as the national and cultural lingua franca and English operates as the official language of formal education and global integration. Among Swahili-speaking families, the phenomenon of bilingual parenting, where both Swahili and English are used within the household, presents a rich site for exploring how children acquire and use multiple languages. However, while bilingualism in Tanzanian schools and public discourse has been widely studied, limited attention has been paid to the home domain, where parents play a critical role in structuring linguistic exposure and shaping children's bilingual competence. This study seeks to examine how Swahili-speaking parents in urban Tanzania use Swahili and English in everyday interactions with their children, focusing on the frequency, context, and function of language use from a Usage-Based Linguistic perspective. The study was guided by the Usage-Based Theory of Language Acquisition (Tomasello, 2003), which posits that children acquire language not through abstract rule learning but by attending to repeated, meaningful linguistic input in context. The theory emphasises the importance of frequency, communicative function, social interaction, and intention-reading in developing linguistic schemas. Within this framework, the research aimed to explore how bilingual Swahili-speaking parents model, reinforce, and differentiate language use in real-life family routines and how these usage patterns influence the bilingual development of their children. Using a qualitative exploratory design, data were collected from 15 bilingual households in Dar es Salaam (Ilala and Kinondoni districts) through semi-structured interviews, participant observation, and audio-recorded naturalistic conversations. The data were analysed thematically with a focus on identifying recurring linguistic patterns, usage contexts, and frequency of exposure to each language. 
The findings reveal that Swahili and English serve functionally distinct roles in the household. Swahili was predominantly used in emotionally charged and culturally embedded contexts such as caregiving, greetings, discipline, storytelling, and social bonding. English, by contrast, was used in academic, instructional, and aspirational contexts, such as during homework, praise, commands, and future-oriented discourse. The frequency of each language in specific contexts led children to develop domain-specific competence, confirming Tomasello’s assertion that repeated exposure in meaningful contexts shapes linguistic development. Children demonstrated greater fluency in the language tied to the domain of use, employing Swahili for emotional negotiation and social rituals, and English for formal learning and tasks. These findings suggest that bilingualism in Tanzanian households is contextually and socially motivated, with language choice reflecting not only parental preference but also cultural expectations, educational aspirations, and functional differentiation. The implications are twofold: first, policies promoting bilingual education should recognise and integrate the role of the family in language development; second, parents should be made aware of the powerful cognitive and developmental impacts of patterned language use at home. For future research, it is recommended that similar studies be conducted in rural contexts, among lower-income families, and across different Tanzanian ethnic groups to explore how socioeconomic, regional, and cultural factors influence usage-based bilingual parenting practices. Longitudinal studies would also help track the developmental trajectory of bilingual competence over time within these home environments.
Okoa Dani Simile (University of Dar es Salaam)
15:10 – 15:30 Coffee Break
15:30 – 16:10
✨Lightning Talks
⚡“Pronunciation Instruction in Japanese and South Korean Universities: A Data-Driven Study of Beliefs, Training, and Engagement in Spanish and Portuguese Classrooms” *remote presentation
This study investigates pronunciation instruction in the teaching and learning of Spanish and Portuguese at universities in Japan and South Korea—two underexplored contexts in applied linguistics. Adopting a data-driven, comparative perspective, we collected and analyzed responses from 120 participants (29 professors and 91 students), offering insights into how institutional and sociocultural factors shape attitudes, practices, and perceived challenges in pronunciation instruction.
The mixed-methods design centered on structured surveys administered in participants’ preferred languages (Spanish, Portuguese, Japanese, or Korean), with items tailored to explore three key domains: (1) the importance attributed to pronunciation, (2) teacher training and preparedness, and (3) student engagement with pronunciation-focused instruction. Quantitative analysis of Likert-scale items was conducted using non-parametric statistical methods (Kruskal-Wallis and Mann-Whitney U tests), with Bonferroni corrections applied for multiple comparisons.
Our findings reveal significant cross-country differences. Professors in South Korea consistently attributed greater importance to pronunciation in their teaching compared to those in Japan, regardless of the language taught. This suggests a stronger institutional emphasis on oral proficiency in the South Korean context. While both professors and students recognized pronunciation as crucial for communicative competence, Japanese students studying Portuguese reported significantly lower self-perceived preparedness and found it more difficult to distinguish L2 pronunciation patterns from their L1. These differences suggest varying levels of instructional support and exposure across contexts.
Training emerged as a critical factor. Many instructors, despite their interest in pronunciation activities, reported limited formal preparation in phonetics or phonology. Students' perceptions of instructional adequacy in this area were mixed, often reflecting broader structural gaps in curriculum design. Notably, both groups emphasized the need for more integrated and systematic pronunciation instruction, supported by culturally responsive pedagogy and practical training resources.
Underlying many of these issues is the persistent influence of native-speakerism, a linguistic ideology that idealizes native-like pronunciation and shapes hiring practices, curriculum design, and classroom expectations. This belief system not only marginalizes non-native teachers but also limits students' engagement by promoting unrealistic goals. Our findings highlight how such ideologies manifest differently in each national context and impact the perceived value and delivery of pronunciation instruction.
By providing a comparative, empirically grounded view of pronunciation instruction in higher education, this study contributes to a better understanding of how cognitive, institutional, and sociocultural variables interact across learning environments. We advocate for an intelligibility-oriented approach that challenges native-speaker norms and supports learners' communicative needs through enhanced teacher training, diversified materials, and the integration of technology to promote autonomy. These findings hold implications for language education policy and pedagogy across multilingual, global contexts.
María Teresa Martínez-García & Alexandre Ferreira Martins* (University of Valladolid & Hankuk University of Foreign Studies*)
⚡"The Cognitive and the Social Nature of Literacy: Predictors of Reading and Writing Abilities in Adult Emergent L2 Literates with History of Migration" *remote presentation
Functional literacy (the practical ability to read and write) is one of the three core components of overall literacy in models such as PIAAC (2020) – an international framework that evaluates the competencies adults need to function socially and achieve personal and professional goals in developed countries (Kyröläinen & Kuperman 2021). Individual literacy models vary as to the degree of significance that they attribute to social vs. cognitive predictors. While purely technical literacy models (Perfetti & Stafura 2014; Kim 2020 among many others) are structured around cognitively mediated component skills (e.g., phonological recoding, morphological awareness, visual word recognition), still other models define literacy as a purely social phenomenon. For example, Kyröläinen and Kuperman (2021) stress the role of social environment that promotes the culture of literacy (e.g., through print-related social activities, socially determined incentives as well as individual motivation). The cognitive skills which are a prerequisite for functional literacy development are mediated by environmental factors. Previous research shows that learners with extended formal education typically demonstrate more advanced metacognitive abilities (Arsyad & Villia 2022; Edeleva et al., in prep).
At the same time, methods that are currently widely used to investigate and assess literacy development in adult emergent L2 literates tend to rely on assumptions related to linguistic distance between L1 and L2 and interpretations of emergent error patterns (cf. word dictations). They do not adequately account for individual cognitive and psychological profiles as well as environmental factors. In the current study, we draw on the assumption that functional literacy is determined by a constellation of proximal and distal factors which are interrelated. However, different factors (cognitive vs. biographic) may not be equally informative regarding various literacy outcomes.
We will report preliminary results of the study of 47 adult emergent multiliterates living in Germany who attended a literacy-enhanced language integration course at the time of the study. Their literacy skills were evaluated through two different tasks: natural text reading and a spelling dictation (cf. Do Manh et al., 2021). Additionally, the participants performed a non-word repetition task and were screened for their educational and social background.
We ask how informative participants’ biographic (e.g., time spent in the host country, years of schooling), psychological (e.g., self-reported language proficiency), and cognitive (e.g., phonological awareness, working memory) characteristics are for predicting formal indicators of L2 German reading and spelling skills. Specifically, we hypothesise that biographic and psychological factors are more strongly associated with more authentic tasks (e.g., natural text reading), while cognitive predictors are more informative for technical tasks (e.g., word dictation).
Our findings contribute to a better understanding of how literacy screenings can operate on the attributes of the reader or language user rather than on immanent characteristics of texts and how these attributes correlate with formal measures of language acquisition.
Julia Edeleva
(BTU Cottbus-Senftenberg)
⚡"Algorithmic Pruning Theory (APT): A Neurocognitive Model for Understanding the Influence of Algorithm-Driven Digital Platforms on Human Cognitive Development" *remote presentation
On modern digital platforms, algorithmically personalized streams of media dictate the daily feed of information we reach through social media. Children and adolescents are immersed in these algorithmically curated content feeds, which machine learning systems shape according to what they watch and are interested in. Unlike traditional media, these platforms do not merely present content; they actively adapt to each user’s behavior, optimizing for engagement in ways that fundamentally alter cognitive development. Yet developmental science lacks a comprehensive framework to understand their long-term impact on the brain. Existing discussions about screen time have not fully captured the influence of algorithmic feeds. Standard models treat all digital exposure as uniform. They tend to ignore how recommendation systems function as neural sculptors, reinforcing specific cognitive pathways through carefully timed rewards and stimuli. Existing models do not recognise that algorithmic feeds disproportionately shape attention, memory, and decision-making and widen cognitive disparities across populations. Emerging observations, including shortened attention spans, reduced memory retention, and heightened impulsivity in frequent users, point to profound shifts in neurocognitive development that existing theories seem unable to fully explain. This paper introduces “Algorithmic Pruning Theory (APT)”, a model that explains how algorithmically curated content feeds act as “artificial architects” in human neural development. 
APT details how engagement-driven algorithms reinforce reactive cognition while weakening deeper cognitive control. By bridging developmental neuroscience, computational psychology, and behavioral science, APT offers the first structured approach to understanding, and ultimately guiding, how algorithmic environments shape the minds of children and adolescents. The theory is not only about screen time; it is also about how algorithm-curated feeds actively rewire the developing brain. This paper proposes Algorithmic Pruning Theory (APT), a neurocognitive framework explaining how personalized algorithmic feeds modify developing neural architecture through engagement-driven reinforcement mechanisms. It does so by identifying and characterizing three core neurocognitive pruning processes: attentional pruning (strengthening of reactive attention networks), memory reorganization (a shift toward shallow encoding), and reward recalibration (dopaminergic system hypersensitivity). Existing research highlights algorithmically curated feeds as powerful shapers of cognitive development, yet gaps remain in understanding their neural mechanisms. Existing evidence indicates that these platforms reinforce rapid attention shifts and reduce deep memory encoding while amplifying impulsive behaviors through their engagement-driven design, making children and adolescents particularly vulnerable because their brains are still maturing. APT presents a neurocomputational model identifying how algorithmically curated content feeds structurally modify developing brains through four synergistic mechanisms of artificial neural selection. 
These are selective pathway reinforcement; stimulus-driven attention networks (temporoparietal junction, inferior frontal gyrus) through micro-scheduling of novel stimuli (mean latency = 27 s between content shifts); reactive memory systems (posterior hippocampus) via content volatility (68% of recommended videos diverge topically from the initial seed content); and dopaminergic prediction circuits (ventral tegmental area → nucleus accumbens) through variable-ratio reinforcement schedules. Furthermore, APT describes three specific mechanisms (attentional pruning, memory reorganization, and reward recalibration) through which these platforms modify developing human brains. The theory spotlights critical vulnerabilities, accounting for differential effects across content types during sensitive developmental periods. This paradigm shift calls for renewed examination of the cognitive impacts of digital environments, moving beyond passive consumption models to recognize algorithms as active neural sculptors in child and adolescent development. As future directions, the paper identifies validating APT’s mechanisms through longitudinal neuroimaging, developing standardized algorithmic calculation index metrics, examining cross-cultural differences in algorithmic pruning, testing age-specific vulnerability windows, and exploring content-type moderation effects. In conclusion, this theory lays critical foundations for examining the developmental impacts of exposure to algorithmic digital environments. It calls for urgent research on how personalized algorithms may be rewriting the fundamentals of human cognitive architecture during the sensitive windows of neural plasticity in childhood and adolescence.
Keywords: Algorithmic feeds, Neurocognitive development, Attentional reinforcement, Adolescent cognitive development, Memory.
H. G Madhushani Dominicta Rathnayake (The Open University of Sri Lanka)
16:15 – 17:00 Discussion Session I & Announcements
Day 2| September 12, Friday
08:30 – 09:00
Registration on-site & Breakfast
Language and Cognitive Health Across the Lifespan (Moderator: Eon-Suk Ko)
09:00 – 09:10
Welcoming remarks for the second day
Heesook Kang
(Chosun University)
09:10 – 09:35
“Converging multi-modal evidences for the primary prevention of Alzheimer's dementia”
I direct the Genome Intelligence Mining Lab, where we focus on uncovering the hidden codes and mechanisms embedded in human genetic information. Our overarching goal is to transform these insights into strategies for understanding complex diseases and improving human health.
In this presentation, I will highlight our efforts to develop predictive and preventive approaches for Alzheimer’s disease (AD). Specifically, we have pursued two complementary directions: (1) building flexible risk-prediction models that integrate multi-modal data, and (2) evaluating the strengths and limitations of each modality to optimize their effective use. We established a genetic risk prediction framework that incorporates common genetic variants (single nucleotide polymorphisms/variants) and identifies AD-associated loci. This genetic risk is dynamically updated by integrating additional evidence from other modalities—such as neuroimaging, plasma biomarkers, and cognitive measures—when available.
Validation using data from the Gwangju Alzheimer’s & Related Dementias (GARD) study showed that this multi-modal framework substantially improves prediction of AD, as measured by amyloid positivity and early cognitive decline, compared to single-modality strategies. These results illustrate how mining genetic information and combining it with multi-modal data can open new avenues for primary and secondary prevention of late-onset Alzheimer’s disease (LOAD).
Jungsoo Gim (Chosun University)
09:35 – 10:00
“Speech abnormality patterns throughout the life span” *remote presentation
Speech production is a complex, planned activity that engages multiple brain areas. Therefore, speech holds great potential as a screening and monitoring tool for individuals at high likelihood of developing brain- and cognition-related conditions. Utilizing scalable, automated tools to assess cognitive and social functions, I present unique speech and language signatures of various clinical conditions, including neurodevelopmental conditions (e.g., autism), psychosis (e.g., schizophrenia), and neurodegenerative disorders (e.g., Alzheimer’s disease). Findings suggest that automated speech and language measures can provide objective, non-invasive and sensitive markers for screening and monitoring individuals with various clinical conditions.
Sunghye Cho (University of Pennsylvania)
10:00 – 10:40
"Toward Predictive and Preventive Strategies for Alzheimer’s Disease: Findings from the GARD Cohort Study"
Kun Ho Lee (Chosun University)
📌 Keynote Speaker
10:40 – 11:00 Coffee Break
Speech Production and Perception (Moderator: Jihye Suh)
11:00 – 11:25
“Individual Differences in Categorical Perception: Exploring Links to Executive Functions and Language Experience” *remote presentation
Categorical perception of speech (CPS) refers to the processing mechanism by which listeners perceive continuous speech signals as discrete phonetic categories, showing greater sensitivity to differences between categories than within categories. Previous studies have shown that more robust CPS—characterized by reduced sensitivity to within-category variation—reflects more established phonetic representations in both first and second language learners, as well as in individuals with dyslexia. More recently, however, research has reported a range of individual differences in how discretely listeners categorize speech sounds, with some individuals showing gradient perception by more effectively assessing redundant acoustic cues. This talk explores why some listeners exhibit gradient rather than categorical perception, and what functional advantages this perceptual style might offer. In particular, it focuses on how individual differences in executive functions—such as working memory, inhibitory control, and cognitive flexibility—as well as factors like age (children vs. adults) and language experience (first vs. second language), may be linked to the ability to make use of fine-grained acoustic information when making phonetic category decisions.
Eun Jong Kong (Korea Aerospace University)
11:25 – 11:50
“How Heritage Speakers Sound: Taking Both Holistic and Segmental Approaches”
A heritage speaker is defined as an individual raised in a home where a language other than the majority language of the society is spoken (Valdés, 2001). My work has aimed to address a gap in the heritage phonology literature by employing both top-down and bottom-up approaches. Kim et al. (2025) adopted a top-down approach, conducting an accent rating task using resynthesized stimuli in which either segmental or prosodic information was removed. Our results show that, between segments and prosody, heritage accent is more pronounced in prosody than in segments, whereas segments play a larger role in the perception of L2 foreign accent.
In Kim (2025), a bottom-up approach was adopted to examine how allomorphs (i.e., different forms of the same morpheme) and sound systems interact in heritage grammar. Specifically, the study analyzed (i) plural morphemes (i.e., -s and -es) and (ii) definite/indefinite articles (i.e., el/la for Spanish; the and a/an for English) produced by heritage speakers of Spanish at both phonetic and phonological levels. A significant contribution of this study was the comparison of Spanish heritage speakers with (non-heritage) early bilinguals in Mexico. This comparison was intended to distinguish cross-linguistic influence at the individual level from cross-linguistic influence at the community level. The results show that cross-linguistic influence at the individual level is more pronounced at the phonetic than at the phonological level. Acoustic analyses reveal that bilinguals often produce intermediate values between those of Spanish and English monolinguals, with heritage speakers aligning more closely with English monolinguals and early bilinguals in Mexico aligning with Spanish monolinguals. Furthermore, several acoustic measurements provide evidence of cross-linguistic influence at the societal level, suggesting that divergence in heritage bilinguals cannot be attributed solely to being early bilinguals.
Joo Kyeong Kim (UCLA)
11:50 – 12:20
✨Lightning Talks
⚡“Generative AI for Thematic Coding in Qualitative Linguistic Analysis” *remote presentation
This study (Sun et al., 2025) examines how generative artificial intelligence (AI), specifically ChatGPT, can augment qualitative methods in linguistic analysis. We focus on developing and testing a systematic approach for integrating AI into qualitative coding workflows, particularly for large sets of open-ended survey or interview data. Using a dataset of naturally occurring narrative responses from a hospitality context as an example, we compared two analytic processes: (1) human coding guided by Grounded Theory and the Qualitative Content Analysis method, and (2) AI-augmented coding using purpose-built prompts to elicit thematic and linguistic categorization from ChatGPT.
Our methodological framework was designed to assess both the efficiency and the trustworthiness of AI-assisted coding. We evaluated the degree of overlap between AI-generated themes and those identified by human coders, as well as the distinct contributions each approach offered. While ChatGPT’s thematic outputs aligned closely with most human-coded categories, notable divergences emerged, revealing areas where human interpretive judgment captured interactional and sociocultural dimensions that AI overlooked. These differences underscore the complementary—not substitutive—role of AI in qualitative linguistic analysis.
The methodological contribution of this work lies in operationalizing prompt engineering for qualitative coding, establishing a protocol for comparing AI-generated and human-generated codes, and identifying procedures for using AI output to triangulate and refine human-led analyses. Beyond efficiency gains, the approach offers a replicable pathway for researchers to integrate AI tools into early-stage coding, pattern recognition, and theme validation, especially when working with large, text-rich datasets in linguistics, sociolinguistics, and discourse studies. By presenting prompt design strategies tailored for thematic and linguistic coding, and by proposing a comparative analysis protocol to assess convergence and divergence between AI and human coding, this study illustrates how AI outputs can be leveraged to both streamline and strengthen qualitative interpretations.
By foregrounding the analytical process rather than the specific content domain, this research provides a practical model for incorporating AI into the analysis of naturalistic language data, from long-form recordings to open-ended elicitation tasks. The findings demonstrate that generative AI can surface preliminary thematic structures rapidly, enabling human researchers to focus on interpretive depth, contextual nuance, and theoretical framing. As AI tools continue to evolve, embedding them within established qualitative frameworks will be essential for ensuring rigor, transparency, and methodological accountability in linguistic analysis.
Hala Sun
(Michigan State University)
⚡"Who Knows Best? Accuracy of Self- and Caregiver Assessments of Language Abilities in Monolingual and Bilingual Caregiver-Child Dyads" *remote presentation
Self-assessments and caregiver reports are widely used in both research and clinical settings to evaluate children’s language abilities. These subjective measures are low-cost, easy to administer, and capture perspectives on children’s language abilities in naturalistic environments, making them appealing alternatives to objective assessments. However, their concurrent validity with objective measures varies. Many studies report alignment between caregiver reports and children’s performance on objective language assessments (Miller et al., 2017; Prathanee et al., 2012), while others find little correspondence (Hus et al., 2011; Selin et al., 2018). Self-assessments more consistently show moderate correlations with objective measures (Delgado et al., 1999; Gollan et al., 2011), and developmental evidence indicates children’s awareness of their own and others’ language skills emerges early and becomes more accurate with age (Clark, 1978; Kaderavek et al., 2004). Language background (i.e., monolingualism or bilingualism) may also influence the accuracy of both self-assessments and caregiver reports. Some research suggests both self-assessments and caregiver reports tend to overestimate bilingual children’s language abilities (Lindman, 1977; Lust et al., 2016; Zeyl, 2021), though certain caregiver-report tools have demonstrated validity (Marchman, 2002). However, it remains unclear whether self-assessments or caregiver reports align more closely with standardized assessment results, and whether this differs by language background.
To address these questions, this study directly compared self- and caregiver ratings of the child’s language ability to standardized assessment outcomes in both monolingual and bilingual children, using data from 30 caregiver-child dyads (14 monolingual English, 16 bilingual), including neurotypical children (n = 24) and children with language impairment (n = 6), aged 9–12 years (M = 11;0, SD = 1;2). Children and caregivers each independently completed a questionnaire about the child’s language abilities. The questionnaires consisted of several 4-point Likert-scale items targeting various domains of language, including receptive and expressive language (e.g., “Does your child need repetition?”/“Do you need to hear something multiple times to understand it?”, “Does your child forget words they know?”/“How often do you use new words you just learned?”). Children’s language abilities were also objectively measured with the Clinical Evaluation of Language Fundamentals–Fourth Edition (CELF-4), a standardized assessment that evaluates core aspects of receptive and expressive language.
CELF-4 Core Language scores ranged from 40 to 132 (M = 99.70, SD = 22.98), indicating a broad range of abilities with an average overall score. Mean summed questionnaire scores were 6.53 (SD = 1.50) for self-ratings and 8.07 (SD = 1.78) for caregiver reports. Regression analyses revealed that both self- and caregiver ratings significantly predicted CELF-4 scores (β = 5.48, SE = 1.87, p = .005), with no effects of language background, respondent type, or their interaction.
These findings suggest that, for school-aged children, both self-assessments and caregiver reports can provide accurate evaluation of language abilities relative to standardized assessments, regardless of language background. This challenges previous evidence suggesting that these measures may inaccurately reflect bilingual children’s language abilities, and underscores the potential of subjective measures as reliable, accessible tools for evaluating language skills across diverse populations.
Ashlie Pankonin (San Diego State University)
12:20 – 13:15 Lunch
Computational Methods in Psycholinguistics (Moderator: Joo Kyeong Kim)
13:15 – 13:40
“Topological data analysis of music graphs and its applications”
Some semantic time series data exhibit unique topological characteristics that can help to understand hidden geometric and topological structures within the data. These features provide valuable insights into the underlying semantics. In this talk, we consider music data as an example of semantic time series data and transform it into a network represented as a graph. We then apply topological data analysis (TDA), particularly persistent homology, to find interesting structural patterns such as one-dimensional cycles. We further quantify how these cycles are interconnected within the music considered. We demonstrate the effectiveness of this method and show that TDA with persistent homology not only helps to understand the semantic structure of music but also helps machine learning tasks such as classification and automatic composition. We will also try to present examples of applying TDA to corpus data. This talk is designed to be accessible to audiences without prior knowledge of persistent homology, with explanations provided through examples to facilitate practical application in their own research.
Jae-Hun Jung (POSTECH)
13:40 – 14:05
"ToMCLIP: Topological Alignment for Multilingual CLIP" *remote presentation
CLIP (Contrastive Language-Image Pretraining) has shown strong performance in vision-language tasks by aligning image and text representations through contrastive learning. However, its cross-modal capabilities are biased toward English due to the lack of high-quality multilingual multimodal data. Although recent multilingual extensions of CLIP (MCLIP) have attempted to bridge this gap through knowledge distillation and continual learning, they primarily focus on instance-level alignment and fail to preserve the global structure of the embedding space. In this paper, we identify topological inconsistency as a fundamental challenge in multilingual representation learning. We propose ToMCLIP, a topology-aware training framework that aligns the latent spaces of CLIP and MCLIP text encoders using tools from topological data analysis. To ensure scalability, we construct sparse graphs from point clouds to efficiently approximate topological features. Experimental results demonstrate improved structural coherence of multilingual representations, especially for languages that differ significantly from English.
Junwon You (POSTECH)
14:05 – 14:30
"Using iCatcher+ for Automated Gaze Analysis of Korean Infants"
Jiho Lee (POSTECH)
14:30 – 14:55
“Linguistic Analysis of Biomedical Texts with Natural Language Processing Methods and Large Language Models”
Large Language Models (LLMs) are becoming more performant in different tasks, such as summarization, translation, and question answering. However, their performance in the medical domain is still not reliable enough for them to be used as plug-and-play tools in healthcare. The complexity of medical knowledge and the technical language of research papers make using LLMs in healthcare a challenge, especially in a multilingual context. Multilingual LLMs are not equally performant in all languages, with lower performance in languages other than English. We present multilingual LLMs’ performance in question-answering (QA) tasks in two languages, English and French. We used Natural Language Processing (NLP) metrics to evaluate the models' capabilities in answering medical research questions of different levels of complexity.
Ioana Buhnila (University of Lorraine – CNRS)
14:55 – 15:15 Coffee Break
Cross-Linguistic Developmental Patterns (Moderator: Jun Ho Chai)
15:15 – 15:40
“English-speaking Children's Acquisition of Comparatives: Based on the CHILDES database”
There are two types of English comparative constructions: synthetic comparatives, which use the -er suffix as in ‘bigger’, and periphrastic comparatives, which use 'more' as in ‘more interesting’ (Quirk et al., 1985). This study investigated the developmental acquisition and processing of comparative constructions among English-speaking children aged 1 to 7, using data from the CHILDES database. The research addressed three primary objectives: (1) to examine how 3-year-olds and 5-7-year-olds utilize synthetic versus periphrastic comparatives, (2) to identify linguistic and frequency factors that influence accurate comparative usage, and (3) to determine correlations between speaker type (child vs. caregiver), age, and comparative error patterns. The analysis examined both production patterns and error types in relation to caregiver input frequency and various linguistic factors. Results indicated that children begin producing comparatives between ages 2 and 3, with caregiver frequency of synthetic comparatives significantly influencing children's synthetic comparative production. However, 4-year-old children demonstrated systematic overgeneralization of periphrastic comparatives during rule acquisition phases. Critical linguistic factors affecting accurate usage included syllable count, syntactic function, presence of premodifiers, and adjective semantic classification. Children preferentially employed periphrastic comparatives with monosyllabic adjectives in predicative positions, while favoring synthetic comparatives when premodifiers were present. Four-year-olds produced significantly more periphrastic comparative errors through overgeneralization, while children generated substantially more double comparatives than caregivers. These empirical findings provide robust support for usage-based theory by demonstrating the complex interaction between linguistic features and input frequency in the acquisition of comparative constructions.
Jihye Suh (Chosun University)
15:40 – 16:05
“Morphological Development in Korean Infants based on longitudinal data”
Korean has many postpositions that mark case, information structure, or other special meanings after a noun phrase. The most frequently used one is un/nun, an auxiliary clitic that can attach to a constituent in any position in the sentence and that carries topic, contrast, and egophoric meanings. When Korean infants acquire the language, they must find and distinguish noun-un/nun phrases from their alternatives, such as bare nouns or noun-i/ka phrases, which have different meanings in competing contexts. This presentation will examine the morphological developments that reveal information structure and egophoric meanings based on un/nun in the spontaneous speech of Korean children from TalkBank.
The topic constructions ‘ike mwe-ya? (this is what?)’ and ‘ike X’, with the bare noun ‘ike’, were first used by infants at age 1:7–2:4; they then extended the noun phrase by adding the topic marker nun, as in ‘ike-nun mwe-ya?’ or ‘ike-nun X’, at age 1:10–2:9. The contrastive marker nun appeared in contrast constructions such as ‘emma-nun X, appa-nun Y (mother X, father Y)’ at age 1:7–2:3. The nominative focal marker ka, in turn, developed when children answered the question ‘nwuka V? (who does V?)’ with the form N-ka, around age 1:8–2:0. The egophoric marker nun was used to express volition or the speaker’s own information in constructions like ‘na/[name]-nun ... V-ul keya/ulkey (I will V)’ at age 2:0–2:4. In sum, the topic construction with a bare noun was used first, and it was linked to the contrast construction with the clitic nun after a noun. The noun in the topic construction then took on the clitic nun, which developed in competition with the focal construction using the nominative marker ka in different contexts. Lastly, the egophoric clitic nun developed to display the speaker’s authorized information, such as volition.
Haegwon Jeong (Chosun University)
16:05 – 17:00 Discussion Session II & Closing Remarks
The above schedule may be changed depending on the situation.