Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap

Mohammad Mehrabi, Stefan Wager