Multimodal Imitation Learning in Multi-Agent Environments (MIMIc)

New Investigator Award Funded by the Engineering and Physical Sciences Research Council of the UK under the Artificial Intelligence Theme (EP/T000783/1)

We focus on applications that require an autonomous agent (e.g. a robot or a driverless car) to interact with multiple intelligent agents in its environment to accomplish a task; such settings are known as Multi-Agent Environments (MAEs). These applications require an agent to anticipate the behaviour of other agents and to select the most appropriate course of action. Equipping agents with this autonomous decision-making capability is known as policy learning. Compared to policy learning in single-agent domains (e.g. teaching a robot to walk or a computer to play a video game), recent progress on policy learning in MAEs has been modest, for several reasons: 1) the actions of other agents make the environment dynamic; 2) multi-agent policy learning suffers from a theoretical limitation known as the curse of dimensionality (CoD), since the joint state-action space grows exponentially with the number of agents; 3) utility functions that capture agent objectives are difficult to define; and 4) there is a significant lack of adequate multi-agent datasets to support meaningful research. This project proposes to undertake research into policy learning in MAEs by addressing the above limitations.
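To make the curse of dimensionality concrete: if each of N agents chooses among |A| discrete actions, the joint action space a learner must reason over contains |A|^N combinations. The minimal Python sketch below illustrates this growth (the figure of 18 actions per agent is an arbitrary choice for the example, not a property of any dataset used in the project):

```python
def joint_action_space_size(num_agents: int, num_actions: int) -> int:
    # Each agent independently picks one of num_actions discrete actions,
    # so the joint action space is the Cartesian product of all choices.
    return num_actions ** num_agents

for n in (1, 2, 5, 11, 22):  # 22 agents = two full football teams
    print(f"{n:2d} agents -> {joint_action_space_size(n, 18):,} joint actions")
```

Even at 11 agents the joint action space already exceeds 10^13 combinations, which is why naive single-agent methods do not scale to MAEs.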


Our unique approach to policy learning in MAEs is motivated by how humans thrive in similar settings. Firstly, we perceive the world through multiple senses (vision, audition, touch), enabling a rich perception of our surroundings. Secondly, when acting in an MAE, humans do not attend to all stimuli but only to the key ones: when a football player attacks the ball, for example, the player attends only to the teammates capable of creating a goal and to the key defenders. Finally, humans productively acquire new skills by observing experts; the learning paradigm we employ, known as imitation learning, is an emerging methodology that learns in the same way. Accordingly, we propose to learn realistic policies in MAEs through imitation learning, leveraging multimodal data fusion and selective-attention modelling. Multimodal data fusion allows us to capture the high-dimensional context of the real world, while selective-attention modelling mitigates the curse of dimensionality. An elite football club has provided us with a unique multimodal multi-agent dataset and access to state-of-the-art data-capture facilities, making this ambitious research project possible.
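As a rough sketch of the kind of architecture this combination suggests (module names, dimensions, the additive fusion, and the use of PyTorch's built-in multi-head attention are our illustrative assumptions, not the project's published design): each modality is encoded separately, the fused per-agent embeddings are pooled through an attention layer so that only relevant agents dominate, and the policy head is trained by behavioural cloning on expert trajectories.

```python
import torch
import torch.nn as nn

class MultimodalAttentionPolicy(nn.Module):
    """Illustrative sketch: fuse per-modality features, attend selectively
    over agents, and imitate expert actions via behavioural cloning."""

    def __init__(self, vision_dim, track_dim, embed_dim, n_actions):
        super().__init__()
        # One encoder per modality (e.g. video features, player tracking).
        self.vision_enc = nn.Linear(vision_dim, embed_dim)
        self.track_enc = nn.Linear(track_dim, embed_dim)
        # Selective attention: the ego agent's embedding queries all agents,
        # so only the relevant ones contribute to the decision context.
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        self.policy_head = nn.Linear(embed_dim, n_actions)

    def forward(self, vision, tracking):
        # vision: (batch, n_agents, vision_dim); tracking: (batch, n_agents, track_dim)
        fused = self.vision_enc(vision) + self.track_enc(tracking)  # simple additive fusion
        ego = fused[:, :1, :]                    # ego agent as the attention query
        ctx, _ = self.attn(ego, fused, fused)    # attend over all agents
        return self.policy_head(ctx.squeeze(1))  # action logits for the ego agent

# Behavioural-cloning step on expert demonstrations (shapes are hypothetical).
policy = MultimodalAttentionPolicy(vision_dim=128, track_dim=4, embed_dim=64, n_actions=18)
vision = torch.randn(32, 22, 128)    # 22 players, assumed visual features
tracking = torch.randn(32, 22, 4)    # e.g. (x, y, vx, vy) per player
expert_actions = torch.randint(0, 18, (32,))
loss = nn.functional.cross_entropy(policy(vision, tracking), expert_actions)
loss.backward()
```

The key point of the design is that attention pools a variable number of agents into a fixed-size context vector, so the policy's input dimension no longer grows with the number of agents.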

Real-World Multi-Agent Systems: Examples

- Sports analytics
- Driverless vehicle control
- Urban planning: flooding / traffic / pollution
- Multiplayer video gaming

The project outputs will be validated subjectively as a tool for answering "what-if" questions about game play in football, helping coaching staff visualize speculative game strategies, and as a computational benchmark for quantifying the cognitive skills of football players. The planned impact activities will ensure the project leaves a legacy in AI development that benefits UK PLC through significant contributions to multiple high-growth areas, such as driverless vehicles, video gaming, and assistive robots.