Batch Policy Learning under Constraints

Hoang M. Le, Cameron Voloshin, Yisong Yue

California Institute of Technology