Emergent Graphical Conventions in a Visual Communication Game

Shuwen Qiu*1, Sirui Xie*1, Lifeng Fan2,

Tao Gao3,4, Jungseock Joo3, Song-Chun Zhu1,2,4,5, Yixin Zhu5

1UCLA Department of Computer Science

2Beijing Institute for General Artificial Intelligence (BIGAI)

3UCLA Department of Communication, 4UCLA Department of Statistics

5Institute for Artificial Intelligence, Peking University

* Equal contribution


Humans communicate with graphical sketches as well as symbolic languages. Recent studies of emergent communication, however, focus primarily on symbolic languages: their settings overlook the graphical sketches present in human communication, and they do not account for the evolution process through which symbolic sign systems emerge from the trade-off between iconicity and symbolicity. In this work, we take the very first step to model and simulate such an evolution process via two neural agents playing a visual communication game, in which the sender communicates with the receiver by sketching on a canvas. We devise a novel reinforcement learning method such that the agents evolve jointly towards successful communication and abstract graphical conventions. To inspect the emerged conventions, we carefully define three key properties -- iconicity, symbolicity, and semanticity -- and design evaluation methods accordingly. Our experimental results under different controls are consistent with observations from studies of human graphical conventions. Of note, we find that evolved sketches can preserve the continuum of semantics under proper environmental pressures. More interestingly, co-evolved agents can switch between conventionalized and iconic communication based on their familiarity with referents. We hope the present research can pave the way for studying emergent communication with the unexplored modality of sketches.

In an iterative sketch communication game, players first need to ground sketches to referents. Over repeated rounds, the drawer (Alice) gradually simplifies the drawing but keeps the most salient parts of the target concept (for example, the rooster's crown). This evolution process enables the viewer (Bob) to promptly distinguish the target (rooster) from the distractors (bird, cup, rabbit, and sheep). We aim to model and simulate this evolution process of graphical conventions.
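To make the structure of one game round concrete, the following is a minimal toy sketch in Python. It is not the neural agents or the reinforcement learning method of this work: the hand-written feature vectors, the functions `draw_sketch`, `choose_referent`, and `play_round`, and the `stroke_cost` penalty are all hypothetical names and assumptions introduced purely for illustration. The drawer keeps only the most salient features of the target, and the viewer picks the candidate that best matches the resulting sketch.

```python
# Toy referential game. All features and values below are made up for illustration.
# Feature order (assumed): crown, beak, wings, long ears, handle, wool
CANDIDATES = [
    [1.0, 0.6, 0.5, 0.0, 0.0, 0.0],  # rooster (target)
    [0.0, 0.7, 0.9, 0.0, 0.0, 0.0],  # bird
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],  # cup
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.2],  # rabbit
    [0.0, 0.0, 0.0, 0.2, 0.0, 1.0],  # sheep
]


def draw_sketch(target, num_strokes):
    """Hypothetical drawer: keep the `num_strokes` most salient features, zero the rest."""
    ranked = sorted(range(len(target)), key=lambda i: abs(target[i]), reverse=True)
    keep = set(ranked[:num_strokes])
    return [v if i in keep else 0.0 for i, v in enumerate(target)]


def choose_referent(sketch, candidates):
    """Hypothetical viewer: return the index of the candidate with the largest dot product."""
    scores = [sum(s * c for s, c in zip(sketch, cand)) for cand in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def play_round(candidates, target_index, num_strokes, stroke_cost=0.01):
    """One round: the drawer sketches, the viewer guesses, both share one reward.

    The reward is 1 for a correct guess minus an assumed per-stroke cost,
    so shorter sketches that still communicate are preferred.
    """
    sketch = draw_sketch(candidates[target_index], num_strokes)
    guess = choose_referent(sketch, candidates)
    reward = (1.0 if guess == target_index else 0.0) - stroke_cost * num_strokes
    return guess, reward
```

In this toy setting, `play_round(CANDIDATES, 0, num_strokes=1)` draws only the crown and still identifies the rooster, earning a higher reward than the three-stroke sketch. This mirrors, in a very simplified way, the pressure toward abstract yet discriminative drawings described above.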

Sketch evolution of rabbit and giraffe through game iterations. For each example, the sketches from left to right show how the final-step canvas changes from iteration 0 to 30,000. For the rabbit, the early strokes may depict instances from different perspectives; over the iterations, they converge to highlight the rabbit's long ears. For the giraffe, the agents gradually learn to emphasize the long neck.