SocialAI 0.1: Towards a Benchmark for Socio-Cognitive Abilities in Deep Reinforcement Learning Agents

Anonymous Authors

Abstract

Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI. This problem has motivated many research directions on embodied language use. Current approaches focus on language as a communication tool in very simplified and non-diverse social situations: the "naturalness" of language is reduced to the concept of high vocabulary size and variability. In this paper, we argue that aiming towards human-level AI requires a broader set of key social skills: 1) language use in complex and variable social contexts; 2) beyond language, complex embodied communication in multimodal settings within constantly evolving social worlds. In this work, we explain how concepts from cognitive sciences could help AI to draw a roadmap towards human-like intelligence, with a focus on its social dimensions. We then study the limits of a recent SOTA Deep RL approach when tested on a first grid-world environment from the upcoming SocialAI, a benchmark to assess the social skills of Deep RL agents.

TalkItOut environment

The agent has to find out which door is the correct one by asking the true guide. To find out which guide is the true one, the agent has to ask the wizard. Once it knows the correct door, the agent has to stand in front of it and utter "Open sesame".

The environment seems to be a complex challenge for DRL learners, especially because of the social skills it requires, i.e. handling multimodal interactions with multiple NPCs and inferring the ill intentions of the false guide.
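
The task rules described above can be illustrated with a minimal Python sketch. It is not the benchmark's implementation: the class name `TalkItOutSketch`, the two door names, the guide identifiers, and the assumption that the liar always names a wrong door are all hypothetical choices made for illustration. It only shows the dependency chain wizard -> true guide -> correct door -> "Open sesame".

```python
import random
from dataclasses import dataclass

# Hypothetical names: the source does not specify door or guide identifiers.
DOORS = ("door_a", "door_b")
GUIDES = ("guide_a", "guide_b")


@dataclass
class TalkItOutSketch:
    """Toy model of the task logic only (no grid, no movement)."""
    correct_door: str
    true_guide: str

    @classmethod
    def sample(cls, rng: random.Random) -> "TalkItOutSketch":
        # Each episode hides a correct door and a truthful guide.
        return cls(correct_door=rng.choice(DOORS), true_guide=rng.choice(GUIDES))

    def ask_wizard(self) -> str:
        # The wizard reveals which guide tells the truth.
        return self.true_guide

    def ask_guide(self, guide: str) -> str:
        # The true guide names the correct door.
        if guide == self.true_guide:
            return self.correct_door
        # Assumption: the liar always names a wrong door.
        return next(d for d in DOORS if d != self.correct_door)

    def utter(self, door_in_front: str, utterance: str) -> bool:
        # Success requires the right door and the right words.
        return door_in_front == self.correct_door and utterance == "Open sesame"


def scripted_solution(env: TalkItOutSketch) -> bool:
    """Follow the intended chain: wizard, then true guide, then door."""
    guide = env.ask_wizard()
    door = env.ask_guide(guide)
    return env.utter(door, "Open sesame")
```

Under these assumptions, an agent that skips the wizard and trusts an arbitrary guide is misled whenever it happens to pick the liar, which is the inference the environment is meant to test.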

Full environment

(The Wizard, Truth speaking guide, Liar Guide)

MH-BabyAI-EB

The agent talks with all the NPCs because of the verbal episodic exploration bonus, but is unable to infer the relevant meaning from their utterances.
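
The source does not give the formula for the verbal episodic exploration bonus. As a rough sketch of one plausible form, the Python snippet below rewards the first occurrence of each NPC utterance within an episode; the class name `EpisodicUtteranceBonus`, the `scale` parameter, and the first-occurrence rule are assumptions, not the authors' definition.

```python
from typing import Optional, Set


class EpisodicUtteranceBonus:
    """Hypothetical intrinsic reward: pay once per new utterance per episode."""

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale          # assumed bonus magnitude
        self.seen: Set[str] = set() # utterances heard so far in this episode

    def reset(self) -> None:
        # Call at the start of every episode so the bonus is episodic.
        self.seen.clear()

    def __call__(self, utterance: Optional[str]) -> float:
        # No bonus when nothing was heard or the utterance is a repeat.
        if utterance is None or utterance in self.seen:
            return 0.0
        self.seen.add(utterance)
        return self.scale
```

A bonus of this shape would push the agent to elicit an utterance from every NPC, which is consistent with the behaviour described above, but it carries no signal about which utterance is trustworthy.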

No Liar NPC environment

(The Wizard, Truth speaking guide)

MH-BabyAI-EB

The agent solves the task. It talks with both NPCs because of the verbal episodic exploration bonus.

MH-BabyAI

The agent lacks the bias to talk to NPCs and is therefore unable to solve the task. The agent is stuck in the local optimum of going to the nearest door.


MH-BabyAI

The agent lacks the bias to talk to NPCs and is therefore unable to solve the task. The agent is stuck in the local optimum of going to the nearest door.