Learning in humans and machines

2nd POSTECH MINDS-Chosun CDS Joint Workshop

2024.12.12. (Thursday) - 12.13. (Friday) | Agenal Hall, 2F Lahan Hotel, Gyeongju

Workshop Rationale

Building on the success of the 2023 conference on music research collaboratively organized by POSTECH MINDS, Chosun CDS, and KIAS, the 2024 joint workshop by POSTECH MINDS and Chosun CDS broadens its focus to foster meaningful dialogue and collaboration between the humanities and sciences. This interdisciplinary platform encourages participants to exchange ideas, share methodologies, and explore collaborative opportunities. Rather than focusing on narrow topics, the workshop promotes open-ended discussions that inspire innovation and cross-disciplinary connections. The goal is to create a dynamic environment for creative thinking and partnerships that advance both research and practical applications.

Organizers

Jae-Hun Jung (POSTECH MINDS; Mathematics; Graduate School of AI)
Eon-Suk Ko (Chosun University CDS; Department of English Language and Literature)

Program

DAY 1 | DECEMBER 12, Thursday

TIME

PROGRAM

SPEAKER

9:30 - 10:00

Registration

10:00 - 10:15

Bridging Developmental Science and AI: Exploring Learning Mechanisms in Infants and Machines

This workshop highlights the fascinating parallels and distinctions between the learning mechanisms of infants and artificial intelligence (AI). Infants effortlessly acquire language—grasping word meanings and grammar with limited exposure—while AI achieves remarkable feats through extensive data and computational power. Yet, AI lacks many of the innate learning abilities that infants naturally develop. Understanding these mechanisms in infants offers the potential to enhance AI's efficiency and adaptability, while AI technologies, in turn, provide tools to advance our understanding of how infants learn. The workshop serves as a collaborative platform for developmental scientists and AI researchers to exchange insights and methodologies. Key questions to be addressed include: How can AI technologies deepen our understanding of infant language development? What lessons from infants’ learning processes can inspire more effective AI systems? And how can these reciprocal insights propel advancements in both fields? By fostering interdisciplinary dialogue, this workshop aims to break down barriers, stimulate innovation, and drive progress in understanding learning across biological and artificial systems.

Eon-Suk Ko
(Chosun University)

10:15 - 10:55

Whisper & ChatGPT API

Enter Abstract here

전성휴 연구원
(POSTECH)

10:55 - 11:05

Coffee Break

11:05 - 11:45

Advancing the Analysis of Infants' Eye-Gaze Data with AI

The integration of artificial intelligence in analyzing infant eye-gaze behavior presents transformative opportunities for advancing our understanding of early language and cognitive development. This talk highlights three critical challenges in infant assessment methodology and new avenues for research involving AI. First, we examine the domain adaptation challenge in applying iCatcher+, an existing system trained with Western children, to data from Korean infants. We consider whether training the algorithm on Korean data might improve its performance and what this could reveal about adapting AI systems to different cultural and linguistic contexts. Second, we explore how techniques like Random Forest (RF) models could be used to refine traditional approaches, such as gaze proportion analysis, and discuss how incorporating additional multimodal features, like facial expressions, might enhance our ability to understand infant behavior more comprehensively. Finally, we discuss the potential of developing automated systems for scoring infant gaze behavior that could leverage domain-specific large language models (LLMs). Together, these systems aim to address fundamental challenges in developmental science, from annotating large datasets to capturing nuanced patterns. This approach not only helps resolve long standing analytical challenges in infant behavior research but also offers novel insights into cross-cultural and developmental processes.

Jun-Ho Chae
(Chosun University)

11:45 - 12:25

Understanding and Applications of Foundation Models: Overcoming Data Scarcity and Expanding Utilization

This presentation introduces the concept and utilization of foundation models, highlighting the importance of self-supervised learning and transfer learning in addressing data scarcity. It examines the development of Large Language Models and explores the training methods and application potential of generative AI like ChatGPT. Additionally, the talk discusses extended applications of language foundation models, showcasing their capabilities across various domains.

김태형 교수
(Seoul National University)

12:25 - 2:00

Lunch Break

2:30 - 3:00

Using AI for Understanding Multi-Modal Cues in Infant Language Acquisition: Cross-Cultural Perspectives

The application of computer vision AI to video data analysis offers unprecedented opportunities to investigate how infants utilize multi-modal cues—such as visual objects and auditory labels—in language acquisition. Building upon research into co-occurrence patterns between objects and labels, we aim to automate the measurement of these associations, transitioning from labor-intensive manual processes to efficient, scalable AI-driven methods. This automation enables the quantification of image-sound co-occurrence patterns across diverse linguistic and cultural contexts. A compelling area of investigation is the early acquisition of words like "uh-oh" and "bye-bye," which lack stable image-sound pairings yet are learned rapidly by infants. This phenomenon suggests that infants may rely on a combination of social, contextual, and prosodic cues to grasp the meanings of such words since infants' early vocabularies include a significant number of non-noun words, learned through rich, multimodal interactions with caregivers. Cross-linguistic differences, such as the frequent omission of nouns in Korean compared to English, may alter the patterns of cross-situational co-occurrences available to infants. These differences could influence not only their word learning strategies but also broader cognitive frameworks. By leveraging AI-driven analysis, we hope to deepen our understanding of how infants integrate multi-modal cues during language acquisition, explore cross-cultural differences, and investigate how these patterns may shape cognitive and linguistic development.

Jongmin Jung
(Chosun University)

3:00 - 3:30

Weighted Penalty Fourier Regularization for Domain Generalization

Deep learning models have achieved remarkable success in computer vision; however, they still suffer from performance degradation under domain shift, where distributional changes between training and testing data lead to reduced robustness. To address this challenge, various approaches within domain generalization have been explored. In this study, we observe that domain generalization improves when datasets are low-pass filtered, suggesting that the low-frequency features of image data play a critical role in enhancing domain-invariant learning—a concept we summarize as ``Seeing Less Can Sometimes Reveal More." Following this intuition, we introduce Weighted Penalty Fourier Regularization (WPFR), an efficient and effective method for obtaining CNN filters that generalize well across domains. WPFR applies a selective penalty in the frequency domain to suppress high-frequency components, guiding convolutional filters to prioritize low-frequency features that are more likely to generalize across different domains. We further support this approach through a mathematical analysis, establishing the theoretical validity of WPFR. Extensive experiments validate our hypothesis and theory, demonstrating that WPFR enhances model performance across a range of domain generalization tasks. Additionally, by incorporating WPFR into state-of-the-art single-source domain generalization models, we show that our technique seamlessly boosts performance, underscoring its practical value. Our findings highlight the importance of frequency-aware regularization for domain generalization, contributing an adaptable and theoretically sound tool for robust model design in domains susceptible to shift.

Jiho Lee
(POSTECH)

3:30 - 3:40

Coffee Break

3:40 - 3:50

Exploring the Potential of AI in Analyzing Long-Form Recordings for Infant Language Development Research

This talk explores how AI technologies could address key challenges in studying infant language development using naturalistic, long-form recordings. First, we examine the potential of AI to automatically diarize speech in recordings such as LENA, differentiating between infant-directed speech (CDS) and adult-directed speech (ADS). Research shows that CDS plays a critical role in fostering infants’ attention and facilitating word segmentation, while ADS may provide exposure to complex linguistic structures. Understanding the distribution and role of these speech types requires robust labeling tools for large-scale recordings. Next, we discuss the possibility of labeling music in everyday recordings to examine its impact on language development. Preliminary evidence suggests that musical exposure may enhance word learning, but studying this in ecological settings requires automated tools to identify music type, duration, and its interaction with linguistic input. Similarly, coding shared book reading in everyday life, an activity widely recognized as beneficial for language acquisition, could provide insights into the frequency and quality of such interactions. We also explore whether Whisper, an AI tool for long-form transcription, could be used to generate detailed speech corpora from these recordings. For example, these transcripts could help investigate whether exposure to positive words in CDS correlates with improved vocabulary learning. Additionally, we consider how network theory could be applied to predict the next words likely to emerge in a child’s lexicon. By modeling word acquisition patterns using CDS corpora or existing CDI data, we may uncover strategies to support language interventions. Finally, we touch on a distinct but related challenge: using AI to analyze tune-text alignment in Korean songs. This involves extracting melodic and linguistic information to understand how melody and text interact, an area with implications for music and language learning research. By leveraging these possibilities, AI could help address longstanding questions in language development while enabling new, ecologically valid approaches to understanding the linguistic and cognitive environments of young children.

Eon-Suk Ko
(Chosun University)

3:50 - 4:20

Graph network

Enter Abstract

Junwon You
(POSTECH)

4:20 - 5:30

Open discussion & Collaboration

Enter Abstract

Jae-Hun Jung

(POSTECH)

5:30 - 7:30

Dinner