3rd Workshop on Language Learning


19th August 2019, Engineer's House Conference Centre, Oslo, Norway

Important Dates and Place

Poster abstract submission deadline: June 30th 2019

Poster abstract acceptance notification: 1-2 days after submission

Workshop: 19th August 2019

The workshop is part of ICDL-EPIROB 2019 held in Oslo, Norway.


Children acquire language in interactions with caregivers and peers in a socio-cultural environment. When children start to talk, their visual perception, body movement, navigation and object manipulation are already reaching some level of competence. Together with developing auditory control and initial schemas for social interactions (games), communication arises gradually, embedded in the social-interactionist environment. Even though there are various efforts in developmental robotics to model communication, the emergence of symbolic communication is still an unsolved problem. We lack convincing theories and implementations that show how cooperation and interaction knowledge could emerge in long-term experiments with populations of robotic agents.

The workshop will address the following issues:

  • role of context in language acquisition (cultural, social, interactional)
  • role of everyday interactions in language acquisition
  • empirical paradigms for studying language acquisition and emergence
  • longitudinal studies of language acquisition
  • language grounding / embodiment / enaction
  • modeling acquisition of syntax and/or semantics
  • models of sentence and interaction processing
  • modeling with artificial neural networks
  • human-robot interaction (especially linked with grounding, ...)
  • models of language emergence using Reinforcement Learning
  • robot language acquisition

Target Audience

Addressing the emergence of communication requires combining and integrating knowledge from diverse disciplines: developmental psychology, robotics, artificial language evolution, complex systems science, computational linguistics and machine learning. The goal of the workshop is to bring together researchers from these areas in order to discuss current findings from experimental studies and to transfer hypothetical insights into potential mechanisms for modelling approaches.


09:00 - 09:45 Invited talk Pragmatic Frames: Collaborative and multimodal endeavor of language learning by Prof Katharina Rohlfing

09:45 - 10:30 Invited talk Language as a part of action by Prof Joanna Rączaszek-Leonardi

10:30 - 11:00 Coffee break

11:00 - 11:45 Invited talk Language Grounding in Robot Behavior by Deep Learning by Prof Tetsuya Ogata

11:45 - 12:10 Invited talk How to ground sensorimotor sequence of symbols? From robot learning languages to songbirds by Dr Xavier Hinaut

12:10 - 12:30 Contributed talk Individual differences in the multimodal effects of parent speech in parent-infant interactions by Sara Schroer

12:30 - 13:30 Lunch

Invited Speakers

Pragmatic Frames: Collaborative and multimodal endeavor of language learning

Prof Katharina Rohlfing (U Paderborn, DE)

So far, research has presented the problem of learning a word mostly in an intrapersonal way: a child has to map a word onto a concept. In this presentation, I will present an alternative to this approach: word learning is not only a matter of the learner. Instead, it is a joint and collaborative endeavor, with words contributing to specific action goals, especially in early development. This view requires a change not only in how word learning is conceptualized theoretically but also in methods. Departing from the theory summarized in Rohlfing et al. (2016) under the conception of Pragmatic Frames, I will exemplify the methodological challenge using turn-taking (Rohlfing et al., 2019), which so far has been investigated mostly as a unimodal phenomenon but should be considered a multimodal one. Drawing on an analysis of a corpus of mother-child dyads with Cross Recurrence Quantification Analysis and frequent pattern mining, I will present solutions to the assessment of human sequential behavior with respect to the questions of (i) how multimodal turn-taking spreads across different modalities and (ii) how it is co-constructed with a partner.

Language Grounding in Robot Behavior by Deep Learning

Prof Tetsuya Ogata (Waseda University, JP)

Understanding natural language helps robots to respond robustly to human requirements and to work together with humans. However, it is not enough for robots to communicate with us only in a verbal manner. Robots should use language in a form that is grounded in the real world. Arbitrarily designing a mapping between language, which is a discrete system, and its referents in the real world, which is a continuous and dynamical system, is notoriously difficult for intelligent machines; this is known as the symbol grounding problem. In my talk, I will introduce a framework using multiple recurrent neural networks that attains the following two capabilities related to language grounding: (1) generation of robot behavioral sequences in response to linguistic instructions and (2) generation of linguistic descriptions given robot behavioral sequences. The first capability is clearly a requirement of service robots, and the second capability is also required in order to interpret service robot behavior.

Language as a part of action

Prof Joanna Rączaszek-Leonardi (U Warsaw, PL)

The embodied and situated perspective on cognition allows us to see more clearly the process of language learning as a process of becoming a skillful participant in interactions. Being physically present in interactions, language is a part of co-action from the earliest moments. Such a perspective enables us to appreciate the continuity from early language use as direct, physical interaction control to a progressive ungrounding into more sophisticated constraints, while retaining the controlling powers.

I will survey the multiple ways in which language allows for mutual interaction control and point to the structure of the environment, enacted around the infants (the ‘social physics’), which helps them enter such participatory sense-making. We will trace how the early coordination through coupling participants in common rhythms, marking the event structures and providing mutual interactive affordances is later enriched and complexified by additional constraints, coming from culturally selected utterance structures. Hopefully, the integrative role of language will come to the fore, i.e., the fact that every use of language collapses into the “here and now” the evolutionary process of functional control selection with developmental experience and current coordinative demands.

How to ground sensorimotor sequence of symbols? From robot learning languages to songbirds.

Dr Xavier Hinaut (INRIA, FR)

Could robots learn different languages with the same core mechanisms? How do children associate the structure of a sentence with its meaning? How can we enable robots to ground the meaning of a sentence? How do songbirds learn to imitate their tutor and learn to sing? How does the brain process, learn, encode and ground sequences of symbols with explicit or implicit rules (e.g. human language, birdsong, hierarchical sequences of actions)?

One of my goals is to find generic artificial neural substrates that can learn the syntax of several languages, and to model the learning of sensorimotor sequences (e.g. songbird vocal learning). For this purpose, we use models based on Recurrent Neural Networks (RNN), and in particular the Reservoir Computing paradigm, along with human-robot interaction and electrophysiological recordings, in order to model syntax learning in humans and songbirds.

Moreover, we study how such RNNs handle working memory mechanisms, which are crucial to learn rules with long-term dependencies (e.g. arithmetic operations, sentences, songs).


Chen Yu (Indiana University, US)

David Crandall (Indiana University, US)

Kaya de Barbaro (U Texas, US)

Malte Schilling (CITEC Bielefeld, DE)

Matthias Scheutz (Tufts University, US)

Michael Spranger (Sony Computer Science Laboratories Inc, JP)

Tadahiro Taniguchi (Ritsumeikan University, Kyoto, JP)

Xavier Hinaut (INRIA, Bordeaux, FR)

Corresponding organizers: Michael Spranger (michael dot spranger at gmail dot com) and Chen Yu (chenyu at indiana dot edu)