Chosun Center for Data Science in Humanities Annual Workshop on
Language Acquisition through the Lens of Data
June 25, 2024 (Tuesday)
Sponsored by: Chosun University Institute of Humanities
English-Korean translation captions will be provided.
9:10 Registration
9:25 Welcoming remarks
Morning session: Thematic session on eye-tracking research with children
9:30 - 10:20: Martin Zettersten (Princeton University) Peekbank: Exploring children’s word recognition through an open, large-scale repository for developmental eye-tracking data
10:20 - 11:05: Jun Ho Chai (Chosun University) Accessing and analyzing Peekbank's eye-tracking datasets
11:05 - 11:15: Break
11:15 - 11:45: Eon-Suk Ko (Chosun University) Hearing Shapes, Seeing Sounds: Early Word Learning Through Sound Symbolism in Korean Infants
11:45 - 12:15: Jun Ho Chai (Chosun University) Investigating the Convergence of Child Language Assessment Measures with 14-month-old Korean Infants
Afternoon session: General session
13:30 - 14:20: Hyun-joo Song (Yonsei University) Infants’ sensitivity to others’ psychological states when interpreting linguistic information
14:20 - 14:50: Jongmin Jung (Chosun University) Shape bias in vocabulary development of young Korean children
14:50 - 15:20: Jinyoung Jo (UCLA) Remote collection of language samples from three-year-olds
15:20 - 15:35: Break
15:35 - 16:05: Seoran Kim (Yonsei University) Korean children's use of linguistic context in verb learning
16:05 - 16:35: Xiaoqiao Wang (Yonsei University) Brain Structures associated with foreign language experiences
16:35 - 17:05: Margarethe McDonald (University of Kansas) Distributional Learning in bilingual and monolingual infants
17:05 - 17:10: Closing remarks
Martin Zettersten
(Princeton University)
Peekbank: Exploring children’s word recognition through an open, large-scale repository for developmental eye-tracking data
What can we learn from analyzing eye-tracking data from thousands of infants? We will present results from a large-scale team science project that aggregates eye-tracking data on infant word recognition across many studies to investigate the development of word recognition (Zettersten et al., 2021). The talk will focus on three main themes from ongoing work: (1) Aggregating existing data can be powerful for modeling development: by combining data from many experiments, we can overcome the limitations of small, isolated studies to model gradual, item-independent changes in online word processing ability. (2) Large-scale databases can help researchers make more informed design and modeling decisions: using the Peekbank database, we can systematically explore the impact of a variety of design and modeling decisions, including the effect of selecting longer vs. shorter analysis time windows on establishing reliability; the consequences of choosing more inclusive or stricter criteria for participant exclusions; and the effects of repeating trials on infants' looking patterns. (3) Big data is sometimes not enough: even in the large Peekbank database, word-specific developmental trajectories remain difficult to capture due to high variability within items and idiosyncrasies across individual datasets. Together, these results will highlight opportunities and limitations of current big data approaches to infant language development.
Hyun-joo Song (Yonsei University)
Infants’ sensitivity to others’ psychological states when interpreting linguistic information
To understand others' words and sentences, children have to figure out which object is under discussion, perhaps by considering what the speaker sees, wants, and believes. In this talk, I will present my experiments on how infants reason about these psychological processes and integrate these types of information when interpreting linguistic information. First, consider the following situation: An actor utters a word and then reaches for an object; later the actor utters a different word and reaches for a different object. Adults can easily interpret the change in word as a cue to a change in goal. I found that 12-month-olds, but not younger infants, realized that a change in word signals a possible change in an actor's goal object (Jin & Song, 2017; Song, Baillargeon, & Fisher, 2014). Second, I found that infants expect some linguistic information to update others' false beliefs. In one experiment, actor1 hid a toy in a box and left; actor2 then moved the toy to a new hiding location. When actor1 returned, she searched for her toy in its old hiding location (old-location event) or the new one (new-location event). Infants who watched the new-location event looked reliably longer than those who watched the old-location event. When actor2 had informed actor1 about the new location with explicit verbal information ("The ball is in the cup!"), the infants changed their predictions: they now expected actor1 to reach for the new location (Song, Onishi, Baillargeon, & Fisher, 2008). But they did not expect uninformative or ambiguous utterances (e.g., "I like the cup!" or "The ball and the cup!") to change actor1's beliefs (Song et al., 2008; Jin et al., 2019). Third, 19-month-old English-acquiring infants, but not 14-month-olds, are sensitive to the distinction between indefinite and definite articles when identifying the referents of others' words (Choi, Song, & Luo, 2018): infants expect a speaker to refer to an object visible to the speaker when she says "Give me the ball," but not when she says "Give me a ball." I am currently extending these investigations to Korean-acquiring infants' use of language-specific units when interpreting others' speech.
Jun Ho Chai
(Chosun University)
Accessing and Analyzing Peekbank's Eye-tracking Datasets
The Peekbank project provides an open repository of eye-tracking datasets on children's word recognition, enabling researchers to explore language development through big data. This demo will showcase two key tools that make Peekbank accessible to a broad audience: (1) The Peekbank Shiny app, an interactive web interface for visualizing looking-while-listening data without needing to write code. It allows users to rapidly explore patterns and generate plots, making it easy for researchers, teachers, and students to engage with the data regardless of their computational background; (2) The peekbankr package in R, which enables more flexible and customizable analyses of the Peekbank database. It provides functions for downloading specific data subsets and computing common eye-tracking measures. Examples will demonstrate how to use peekbankr to test hypotheses about the development of word recognition, such as examining age-related changes in speed and accuracy. These tools expand Peekbank's accessibility to a wide audience and provide complementary methods for exploring and analyzing word recognition development.
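To give a flavor of the demo, the short sketch below pulls looking data from Peekbank and computes the proportion of target looking by age bin. It is a minimal illustration only: the function names (get_aoi_timepoints, get_administrations) and column names (t_norm, aoi, administration_id, age) reflect my reading of the peekbankr interface and should be checked against the current package documentation; the time window and binning are arbitrary choices for illustration.

library(peekbankr)   # assumed interface; see the peekbankr documentation
library(dplyr)

# Download looking data and session metadata (in practice, restrict the
# download to the datasets of interest rather than the full database).
aoi_data <- get_aoi_timepoints()    # frame-by-frame looks coded as target/distractor/other
admins   <- get_administrations()   # one row per testing session, including age in months

# Proportion of target looking in a 300-2000 ms window after target word onset,
# aggregated into 6-month age bins.
accuracy_by_age <- aoi_data %>%
  filter(t_norm >= 300, t_norm <= 2000) %>%
  left_join(admins, by = "administration_id") %>%
  mutate(age_bin = floor(age / 6) * 6) %>%
  group_by(age_bin) %>%
  summarise(prop_target = mean(aoi == "target", na.rm = TRUE),
            n_sessions  = n_distinct(administration_id),
            .groups = "drop")

print(accuracy_by_age)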
Eon-Suk Ko
(Chosun University)
Hearing Shapes, Seeing Sounds: Early Word Learning Through Sound Symbolism in Korean Infants
Sound symbolism is posited to facilitate early word learning, yet its applicability across diverse languages and cultures remains underexplored. This study investigated the universality of the bouba-kiki effect in 64 Korean infants, distributed across two age groups (14 and 28 months). Using the Looking-while-Listening paradigm, we examined whether infants learned word-object mappings better in a condition where sound symbolism was matched (e.g., [buba] with a round object versus [kiki] with a spiky object) than in a mismatched condition. Eye gaze data revealed gender and side biases. We analyzed the proportion of gaze switches over time, defined as switches of gaze from the image fixated at stimulus onset to the other image on the screen. The 14-month-olds demonstrated significantly more distractor-to-target gaze switches in both the filler and matched conditions, yet this pattern was not evident in the mismatched condition. In contrast, the 28-month-olds showed increased gaze switches across all conditions. These findings suggest that sound symbolism facilitates word learning in infants as young as 14 months and that the ability to learn arbitrary mappings between sound and form improves with age.
Jun Ho Chai
(Chosun University)
Investigating the Convergence of Child Language Assessment Measures with 14-month-old Korean Infants
This study examined the convergent validity between parental reports and direct measures of word comprehension in 13- to 14-month-old Korean infants. Eye-tracking assessed infants' recognition of 40 words from the Korean MacArthur-Bates Communicative Development Inventories (MB-CDI). Parents completed the full MB-CDI and a short form containing only the target words. Results showed moderate correlations between eye-tracking measures and the short MB-CDI, but not with the full MB-CDI or its percentile scores. Parents over-reported comprehension relative to infants' performance. For unrecognized words, parent-infant alignment was higher for later-acquired words. The short MB-CDI converged better with eye-tracking than the full MB-CDI did. The findings suggest that parental reports, while valuable, have limitations for measuring comprehension in infants with developing lexical-semantic representations. Multiple converging measures, including eye-tracking, provide a more reliable picture of early word knowledge. The results highlight the need to consider the constraints of parental reports when assessing language in infancy.
Jongmin Jung
(Chosun University)
Shape bias in vocabulary development of young Korean children
We examined the role of shape-biased vocabulary, an underexplored predictor in Korean vocabulary development. As children acquire around 50 object nouns, they may utilize the shape of referents to generalize labels. Using the Korean version of the MacArthur-Bates Communicative Development Inventory, we assessed expressive vocabulary in 1,575 children (803 boys, aged 18-36 months). Employing Samuelson and Smith's (1999) approach, we identified 146 shape-biased items. The study revealed that the proportion of shape-biased words to object nouns significantly predicted children's percentile scores. Furthermore, there was a noteworthy interaction between object noun size and the proportion of shape-biased words. The results suggest that the shape of referents plays a crucial role in the early vocabulary development of Korean children. Importantly, the impact of shape-biased word proportion is moderated by object noun size, highlighting the nuanced relationship between these factors. Overall, this study contributes valuable insights into the significance of shape-biased words as predictors of children's vocabulary outcomes.
Jinyoung Jo
(UCLA)
Remote collection of language samples from three-year-olds
We characterized language samples collected remotely from typically-developing three-year-olds by comparing them against independent language samples collected in person from age-matched peers with and without language delays. Forty-eight typically-developing, English-learning three-year-olds were administered a picture description task via Zoom. The in-person comparison groups were two sets of independent language samples from age-matched typically-developing as well as language-delayed children available on the Child Language Data Exchange System. The findings show that although language samples collected remotely from three-year-olds yield numerically dissimilar lexical and grammatical measures compared to samples collected in person, they still consistently distinguish toddlers with and without language delays.
Seoran Kim (Yonsei University)
Korean children's use of linguistic context in verb learning
Korean 24-month-olds learn verbs better in sparse linguistic contexts with omitted arguments, in contrast to English-acquiring peers who favor rich contexts including noun arguments (Arunachalam & Waxman, 2011; Arunachalam et al., 2013). This suggests that the optimal context for verb learning may vary across languages. However, even in the "ideal" sparse context, Korean toddlers performed at chance levels. The current research examined when and how Korean children demonstrate verb learning above chance levels by testing 2- and 3-year-olds. The participants were first taught novel verbs by viewing scenes described by sentences that either mentioned both the subject and object (rich context) or omitted them (sparse context). Unlike in prior research, a sentence introducing the event participants preceded the critical sentence, providing discourse support to aid sentence comprehension. The children were then asked to identify which of two scenes depicted the novel verb: one scene showed the familiar action on a different object, and the other showed a different action on the familiar object. Preliminary results revealed that participants successfully associated novel verbs with familiar actions in both sparse and rich contexts. However, post-hoc analyses showed that 2-year-olds performed at chance levels in all contexts, while 3-year-olds performed above chance levels in both contexts. Thus, by the age of 3, Korean children are able to learn verbs in both rich and sparse contexts, at least with discourse support.
Xiaoqiao Wang (Yonsei University)
Brain structures associated with foreign language experiences
Previous research has demonstrated that grey matter volume (GMV) and white matter integrity adaptively restructure in response to bilingual experiences. However, most studies focus on bilingual individuals who acquired more than one language naturally, leaving brain structure changes in those with limited foreign language exposure poorly understood. The present study investigated whether and how brain structure changes in long-term foreign language learners correlate with foreign language proficiency and usage experiences in a mostly monolingual environment. Thirty-two Korean native speakers who had learned English as their first foreign language, mostly in formal educational settings since childhood, participated in our study. Correlation analyses between their brain structure and English usage experiences revealed that both GMV and white matter integrity in regions involved in language control and general executive functions were correlated with English proficiency and daily usage experiences. These findings suggest that even limited foreign-language exposure in daily life can influence brain structure.
Margarethe McDonald
(University of Kansas)
Distributional Learning in bilingual and monolingual infants
Distributional learning has been proposed as a mechanism by which infants learn the native phonemes of the language(s) to which they are exposed. When hearing two speech streams, bilingual infants may find other strategies more useful and rely on distributional learning less than monolingual infants. A series of studies examined how bilingual language experience affects the application of distributional learning to novel phoneme distributions. Monolingual and bilingual infants between 6 and 8 months old performed a distributional learning task using velar consonant stimuli grouped into one of three distributions based on voice onset time. Performance after exposure to a unimodal distribution was compared to performance after a bimodal (Experiment 1) and a trimodal (Experiment 2) distribution of the same voice onset time cue. Results indicated that monolingual and bilingual infants performed similarly on all tasks, and infants were able to learn both bimodal and trimodal phoneme distributions. These results point to the universality of the distributional learning mechanism, but future research would need to test the two groups and distributions for equivalence of performance.