Leveraging Human Feedback to Evolve and Discover Novel Emergent Behaviors in Robot Swarms







Connor Mattson and Daniel S. Brown

Kahlert School of Computing, University of Utah

Abstract

Robot swarms often exhibit emergent behaviors that are fascinating to observe; however, it is often difficult to predict what swarm behaviors can emerge under a given set of agent capabilities. We seek to efficiently leverage human input to automatically discover a taxonomy of collective behaviors that can emerge from a particular multi-agent system, without requiring the human to know beforehand what behaviors are interesting or even possible. Our proposed approach adapts to user preferences by learning a similarity space over swarm collective behaviors using self-supervised learning and human-in-the-loop queries. We combine our learned similarity metric with novelty search and clustering to explore and categorize the space of possible swarm behaviors. We also propose several general-purpose heuristics that improve the efficiency of our novelty search by prioritizing robot controllers that are likely to lead to interesting emergent behaviors. We test our approach in simulation on two robot capability models and show that our methods consistently discover a richer set of emergent behaviors than prior work.

Emergent Behavior Examples: Single-Sensor Model

[Figure panels: Cyclic Pursuit, Aggregation, Milling, Wall Following, Dispersal, Random]

Emergent Behavior Examples: Two-Sensor Model

[Figure panels: Nested Cycle, Concave Path]

Novelty Search

We evolve over our behavior space for 100 generations with a population of 100 per generation. For each experiment, we report the number of distinct emergent behaviors returned by 𝑘-medoids, averaged over the final 10 generations of evolution. This averaging helps capture the impact of the behavior space shifting between generations.

[Figure: evolutionary search over the homogeneous controller space (left), with the novelty search objective of diversifying the behavior space (right).]
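The novelty objective described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each controller has already been mapped to a behavior vector, scores novelty as the mean Euclidean distance to the k nearest neighbors in the current population plus an archive of past behaviors, and uses a hypothetical function name (`novelty_scores`) and a hypothetical default of k = 15. In the approach described here, distances would be taken in the learned similarity space rather than in a raw feature space.

```python
import numpy as np


def novelty_scores(behaviors: np.ndarray, archive: np.ndarray, k: int = 15) -> np.ndarray:
    """Mean distance from each behavior vector to its k nearest neighbors.

    behaviors: (n, d) array, one behavior vector per controller in the population.
    archive:   (m, d) array of previously seen behavior vectors (m may be 0).
    k:         number of neighbors to average over (assumed value, not from the source).
    """
    # Neighbors are drawn from the current population and the archive together.
    pool = np.vstack([behaviors, archive]) if len(archive) else behaviors

    # Pairwise Euclidean distances, shape (n, n + m).
    dists = np.linalg.norm(behaviors[:, None, :] - pool[None, :, :], axis=-1)

    # An individual must not count itself as a neighbor; the population
    # occupies the first n rows of the pool, so mask that diagonal.
    n = len(behaviors)
    dists[np.arange(n), np.arange(n)] = np.inf

    k = min(k, pool.shape[0] - 1)
    if k < 1:
        raise ValueError("need at least two behavior vectors to score novelty")

    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)


# Toy usage: the isolated third point is the most novel.
toy_population = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
toy_archive = np.empty((0, 2))
toy_scores = novelty_scores(toy_population, toy_archive, k=1)
most_novel = int(np.argmax(toy_scores))
```

Selecting on this score instead of a task-specific fitness rewards controllers whose behavior lies far from anything seen so far, which is what pushes the search to spread across the behavior space.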

Results

Evolving a set of emergent behaviors with our methods, we rediscover all known collective behaviors for Brown et al.’s original single-sensor model. We also present two new behaviors, Nested Cycles and Concave Paths, that are possible given a two-sensor robot model. To the best of our knowledge, neither behavior was previously known to be possible on robots of this simple design.