Residual Q-Learning: Offline and Online Policy Customization without Value

Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS) 2023