Auto-Encoding Adversarial Imitation Learning
Kaifeng Zhang, Rui Zhao, Ziming Zhang & Yang Gao
AAMAS'2024
Overview
To the best of our knowledge, this is the first auto-encoder-based reward formulation for adversarial imitation learning. This form of reward provides the agent with a more informative learning signal and is also capable of denoising the expert demonstrations. Empirically, our method achieves 16.4% and 47.2% relative improvement overall compared to the best baselines, FAIRL and PWIL, on clean and noisy expert data, respectively.
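To give intuition for an auto-encoder-based reward, here is a minimal sketch (not the authors' code; the linear auto-encoder, the exponential shaping, and all function names are illustrative assumptions): the agent is rewarded for visiting states that the auto-encoder, fitted to expert data, can reconstruct well.

```python
import numpy as np

def make_linear_autoencoder(data, latent_dim=1):
    """Fit a linear (PCA-style) auto-encoder to expert data; returns encode/decode."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    basis = vt[:latent_dim]               # top principal directions as the code space
    encode = lambda x: (x - mean) @ basis.T
    decode = lambda z: z @ basis + mean
    return encode, decode

def reconstruction_reward(x, encode, decode, scale=1.0):
    """Dense reward: high when x is reconstructed well, i.e. lies near the expert manifold."""
    err = np.sum((x - decode(encode(x))) ** 2, axis=-1)
    return np.exp(-scale * err)           # bounded in (0, 1], smooth in the error

# Toy expert data on a 1-D manifold in a 2-D state space.
expert = np.stack([np.linspace(-1, 1, 50)] * 2, axis=1)
encode, decode = make_linear_autoencoder(expert)

on_manifold = reconstruction_reward(np.array([0.3, 0.3]), encode, decode)
off_manifold = reconstruction_reward(np.array([0.3, -0.9]), encode, decode)
```

Because the reconstruction error varies continuously with distance from the expert manifold, the resulting reward is smooth rather than a saturated binary discriminator output.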
Video Results
Experiments
Discussions
The encoding-decoding process is the major contributing factor in our AEAIL (see the empirical analysis for supporting evidence):
(i) auto-encoding offers a denser reward signal
(ii) auto-encoding helps to denoise the expert data
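The denoising effect in (ii) can be illustrated with a toy sketch (an assumption-laden stand-in, not the paper's architecture): a linear auto-encoder fitted to noisy expert states still recovers the underlying low-dimensional structure, so its reconstructions land closer to the clean trajectory than the noisy inputs do.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean expert states lie exactly on a 1-D manifold (the line y = x).
clean = np.stack([np.linspace(-1, 1, 100)] * 2, axis=1)
noisy = clean + rng.normal(scale=0.5, size=clean.shape)  # corrupted demonstrations

# Fit a linear auto-encoder (1-D bottleneck) to the *noisy* demonstrations.
mean = noisy.mean(axis=0)
_, _, vt = np.linalg.svd(noisy - mean, full_matrices=False)
basis = vt[:1]

# Encode then decode: the bottleneck discards the off-manifold noise component.
denoised = (noisy - mean) @ basis.T @ basis + mean

dist_before = np.mean(np.sum((noisy - clean) ** 2, axis=1))
dist_after = np.mean(np.sum((denoised - clean) ** 2, axis=1))
```

The bottleneck can only keep the noise component aligned with the dominant direction of the data, so roughly half of the isotropic noise is filtered out in this 2-D example.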