Monte Carlo Actor Critic for Sparse Reward Deep Reinforcement Learning from Suboptimal Demonstrations

Albert Wilcox, Ashwin Balakrishna, Jules Dedieu, Wyame Benslimane, Daniel Brown, Ken Goldberg