Meta-Reward-Net: Implicitly Differentiable Reward Learning for Preference-based Reinforcement Learning

Runze Liu, Fengshuo Bai, Yali Du, Yaodong Yang

NeurIPS 2022

[Paper] [Code]