Efficient Preference-Based RL Using Learned Dynamics Models

Yi Liu, Gaurav Datta, Ellen Novoseller, Daniel S. Brown
