Benchmarks for agent-based models of language emergence

There is now a substantial set of agent-based models that exhibit some form of language emergence for specific aspects of language, for example color terms, spatial expressions, agreement, proper names, tense-aspect systems, constituent structure, and vowel inventories, among others. But research remains fragmented. The importance of many results is not yet fully understood and open problems are not clearly defined. The data on which the experiments are based are generally not released and there is no independent way to validate the results. Current research in Machine Learning and AI shows that well-defined benchmarks are a powerful way to focus and push forward research. They make it possible to compare different approaches and to gauge better whether significant advances are being made. This symposium applies this methodology to agent-based models of language emergence. It discusses general criteria for benchmarks in this domain and some specific examples.

What are valuable benchmarks for language emergence?

Luc Steels, ICREA, UPF Barcelona, Spain

An agent-based model of language emergence proposes an environment, a population of agents with a specific cognitive architecture, and a script for a language game that achieves a particular communicative goal. The model is tested by letting the agents play language games in order to see whether the population indeed collectively self-organizes a communication system adapted to the environment and to the communicative goal. To be of interest and relevant to the study of human language emergence, the communication system should exhibit some key characteristic of human language, such as the expression of rich conceptualizations or hierarchical syntax.
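The three ingredients named above (an environment, a population of agents, and a language game script) can be illustrated with a minimal naming game. The sketch below is not one of the models or benchmarks discussed in this symposium: the function name, the parameter values, the made-up word forms, and the assumption that the hearer learns the topic through pointing are all hypothetical choices made for illustration only.

```python
import random


def play_naming_games(num_agents=10, objects=("obj-a", "obj-b", "obj-c"),
                      num_games=5000, seed=0):
    """Run a minimal naming game and return (lexicons, outcomes).

    Illustrative assumptions: the environment is a fixed set of objects,
    each agent stores a set of candidate words per object, and the hearer
    always learns which object was meant (as if the speaker pointed at it).
    """
    rng = random.Random(seed)
    # One lexicon per agent: object -> set of words the agent associates with it.
    lexicons = [{obj: set() for obj in objects} for _ in range(num_agents)]
    outcomes = []  # True for a communicative success, False for a failure

    for _ in range(num_games):
        # Script step 1: pick two different agents as speaker and hearer.
        speaker, hearer = rng.sample(range(num_agents), 2)
        # Script step 2: the speaker picks a topic from the environment.
        topic = rng.choice(objects)
        words = lexicons[speaker][topic]
        if not words:
            # The speaker has no word for the topic yet and invents one.
            words.add("w%06d" % rng.randrange(10**6))
        # Sorting makes the choice reproducible for a given seed.
        word = rng.choice(sorted(words))

        if word in lexicons[hearer][topic]:
            # Success: both agents align by dropping all competing words.
            lexicons[speaker][topic] = {word}
            lexicons[hearer][topic] = {word}
            outcomes.append(True)
        else:
            # Failure: the hearer adopts the word for the indicated topic.
            lexicons[hearer][topic].add(word)
            outcomes.append(False)

    return lexicons, outcomes


if __name__ == "__main__":
    lexicons, outcomes = play_naming_games()
    recent = outcomes[-500:]
    print("success rate over the last 500 games:", sum(recent) / len(recent))
    print("lexicon of agent 0:", lexicons[0])
```

Tracking communicative success over time, as the usage lines do, is one simple way to check whether the population has self-organized a shared vocabulary. A benchmark in the sense discussed here would additionally fix the environment data and the evaluation measures so that different models can be compared.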

Benchmarks for the emergence of grounded spatial language

Michael Spranger, Sony Computer Science Laboratory, Tokyo, Japan

This talk introduces concrete benchmarks for studying the emergence of spatial language, i.e. spatial terms (e.g. "left", "next-to", "on top of"), spatial expressions (e.g. "left of the box"), and spatial perspective (e.g. "behind the box from your perspective", "to your right"). The benchmarks include data drawn from physical robot interaction in a blocks world environment. They allow us to study the emergence of perceptual categories, grounded spatial relations, item-based syntax, argument structure, and compositional procedural semantics. I will also introduce agent-based models that successfully handle the benchmarks and introduce an additional benchmark involving events to study temporal language constructs.

Benchmarks for exploring the relationship between theory of mind and emergent communication

Jakob Foerster, Oxford University, UK

Humans routinely make inferences about the state of mind, motivations, and beliefs of others when observing their actions. This ability, typically referred to as "theory of mind", is believed to play an important role in the discovery of communication protocols and the learning of language. In this talk we present a benchmark which is well suited for exploring this connection and for measuring our progress as a field. While benchmarks have driven progress in machine learning and artificial intelligence over the last few decades, the field of emergent communication and language is in urgent need of more standardized testbeds for measuring progress. I believe that the benchmark presented in this talk will be part of the solution.

Benchmarks for studying the emergence of syntactic structure

Freek Van de Velde, Linguistics Department, University of Louvain-KUL, Belgium

Can we use insights and data from historical linguistics to create benchmarks for agent-based models of language emergence? Here I will focus on the emergence of noun-phrase constituency in West-Germanic and Romance using historical demographic databases we have collected for a burgeoning project on demography-driven morphosyntactic change. I will show how these data allow us to investigate the dynamic nature of hierarchical structure in language and the trade-off between morphological complexity and syntactic/lexical constructions. I will also introduce reliable metrics for morpho-syntactic complexity to track the extent to which agent-based models reproduce phenomena observed in the emergence and evolution of grammar.

Benchmarks for studying the effect of perceptual constraints and socio-cultural transmission on the emergence of color terms

Tao Gong, ETS, New Jersey, USA

The origins of color terms and color expressions have been an important target for agent-based models, because it is relatively easy to collect sensory data and extensive surveys of color word usage are available, starting with the seminal work of Berlin and Kay. Here I focus on agent-based models that examine the impact of two factors influencing color language: perceptual constraints and the socio-cultural environment in which agents can communicate using color language. In this way, we can evaluate the relative importance of individual perceptual constraints and socio-cultural transmission in shaping the color categorization patterns evident in the World Color Survey.

Luc Steels is ICREA Research Professor in Artificial Intelligence at the Institute for Evolutionary Biology at the Universitat Pompeu Fabra in Barcelona, Spain. His main research interests include agent-based modeling of language emergence and evolution. With his team he has done a large series of experiments on the emergence of lexicons of perceptually grounded categories, in particular color, emergence of spatial terms and perspective, action terms, case grammar, agreement systems, and constituent structure. This work has resulted in a large number of publications (H-index = 69) and more than 15 edited volumes including the Talking Heads Experiment, Design Patterns and Computational Issues in Fluid Construction Grammar (the world’s most advanced computational platform for exploring constructional language processing), and Experiments in Cultural Language Evolution. He is the founding director of the Sony Computer Science Laboratory in Paris, co-founder of the European AI Association as well as co-founder and former president of the Belgian AI Association and the Evolutionary Linguistics Association.

Michael Spranger is a researcher at Sony Computer Science Laboratories Inc. based in Tokyo, Japan. His main focus is to build creative, intelligent systems that act autonomously in the real world, and learn and evolve communication systems similar in complexity to human language. His PhD thesis (ECCAI honorable mention award) has been published as an open access book.
