Auto-Encoding Adversarial Imitation Learning

Kaifeng Zhang, Rui Zhao, Ziming Zhang & Yang Gao

AAMAS'2024


Overview

To the best of our knowledge, this is the first auto-encoder based reward function formulation for adversarial imitation learning. This form of reward function provides more informative learning signals to the agent and is also capable of denoising the expert demonstrations. Empirically, our method achieves 16.4% and 47.2% relative improvement overall compared to the best baselines, FAIRL and PWIL, on clean and noisy expert data, respectively.


Video Results


Experiments


Discussions


The encoding-decoding process is the major contributing factor of our AEAIL (see the empirical analysis for supporting evidence):

(i) auto-encoding offers a denser reward signal to the agent;

(ii) auto-encoding helps to denoise the expert data.
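The core idea above can be sketched in a few lines: the reward is derived from an auto-encoder's reconstruction error, so inputs that the auto-encoder (trained adversarially against expert data) reconstructs well receive higher reward. The architecture, reward scaling, and training procedure below are illustrative assumptions, not the exact formulation from the paper:

```python
import numpy as np

# Hypothetical single-hidden-layer auto-encoder over state-action vectors.
# AEAIL's actual network and reward transformation may differ; this sketch
# only shows the reconstruction-error-as-reward principle.
class AutoEncoder:
    def __init__(self, dim, hidden, seed=0):
        r = np.random.default_rng(seed)
        self.W1 = r.normal(0.0, 0.1, (dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = r.normal(0.0, 0.1, (hidden, dim))
        self.b2 = np.zeros(dim)

    def reconstruct(self, x):
        h = np.tanh(x @ self.W1 + self.b1)  # encode
        return h @ self.W2 + self.b2        # decode

def ae_reward(ae, x):
    """Reward as negative reconstruction error: the smaller the
    per-sample squared error, the more 'expert-like' the input."""
    err = np.sum((ae.reconstruct(x) - x) ** 2, axis=-1)
    return -err

ae = AutoEncoder(dim=4, hidden=8)
batch = np.random.default_rng(1).normal(size=(5, 4))
rewards = ae_reward(ae, batch)
print(rewards.shape)  # per-sample rewards: (5,)
```

Because the reconstruction error varies continuously with the input, this reward is denser than a binary discriminator output, which is the intuition behind point (i) above.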