- Efficient Exploration through Bayesian Deep Q-Networks. Kamyar Azizzadenesheli, Emma Brunskill and Anima Anandkumar. [pdf] [code]
- When Simple Exploration is Sample Efficient: Identifying Sufficient Conditions for Random Exploration to Yield PAC RL Algorithms. Yao Liu and Emma Brunskill. [pdf]
- Count-Based Exploration with the Successor Representation. Marlos C. Machado, Marc G. Bellemare and Michael Bowling. [pdf]
- Randomized Value Functions via Multiplicative Normalizing Flows. Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau and Pascal Vincent. [pdf]
- InfoBot: Structured Exploration in Reinforcement Learning Using Information Bottleneck. Anirudh Goyal, Riashat Islam, Zafarali Ahmed, Doina Precup, Matthew Botvinick, Hugo Larochelle, Sergey Levine and Yoshua Bengio.
- On Oracle-Efficient PAC RL with Rich Observations. Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford and Robert Schapire. [pdf]
- Learning Linear Models with Delayed Bandit Feedback. Claire Vernade, Alexandra Carpentier, Giovanni Zappella, Beyza Ermis and Michael Brueckner. [pdf]
- Efficient Exploration in Two-player Games with a Powerful Opponent. Jialian Li, Tongzheng Ren, Hang Su, Jun Zhu and Dong Yan. [pdf]
- Meta-Reinforcement Learning of Structured Exploration Strategies. Abhishek Gupta, Russell Mendonca, Yuxuan Liu, Pieter Abbeel and Sergey Levine. [pdf]
- Span-Constrained Planning for (More) Efficient Exploration-Exploitation. Jian Qian, Matteo Pirotta, Ronan Fruit, Alessandro Lazaric and Ronald Ortner. [pdf]
- A Contextual Bandit Bake-off. Alberto Bietti, Alekh Agarwal and John Langford. [pdf]
- Exploration and Policy Generalization in Capacity-Limited Reinforcement Learning. Rachel Lerch and Chris Sims. [pdf]
- On-line Reinforcement Learning with Misspecified States. Ronan Fruit, Matteo Pirotta and Alessandro Lazaric. [pdf]
- A Note on K-learning. Brendan O'Donoghue.
- Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents. Edoardo Conti, Vashisht Madhavan, Felipe Such, Joel Lehman, Kenneth Stanley and Jeff Clune. [pdf]
- Diversity-Inducing Policy Gradient: Using MMD to find a set of policies that are diverse in terms of state-visitation. Muhammad Masood and Finale Doshi-Velez. [pdf] [code]
- Strategic Exploration in Object-Oriented Reinforcement Learning. Ramtin Keramati, Jay Whang, Patrick Cho and Emma Brunskill. [pdf]
- Approximate Exploration through State Abstraction. Adrien Ali Taiga, Aaron Courville and Marc Bellemare. [pdf]
- The Potential of the Return Distribution for Exploration in RL. Thomas Moerland, Joost Broekens and Catholijn Jonker. [pdf]
- Bayesian Inference with Anchored Ensembles of Neural Networks, and Application to Reinforcement Learning. Tim Pearce and Nicolas Anastassacos. [pdf] [code]
- Adaptive Learning with Unknown Information Flows. Yonatan Gur and Ahmadreza Momeni. [pdf]
- Hierarchy-Driven Exploration in Reinforcement Learning. Evan Liu, Ramtin Keramati, Kelvin Guu, Sudarshan Seshadri, Panupong Pasupat, Percy Liang and Emma Brunskill. [pdf]
- Deeper & Sparser Sampling. Divya Grover and Christos Dimitrakakis. [pdf] [code]
- Large-Scale Study of Curiosity-Driven Learning. Yuri Burda*, Harri Edwards*, Deepak Pathak*, Amos Storkey, Trevor Darrell and Alexei A. Efros. [pdf]
- Counting to Explore and Generalize in Text-based Games. Xingdi Yuan, Marc-Alexandre Côté, Alessandro Sordoni and Adam Trischler. [pdf]
- Goal-oriented Trajectories for Efficient Exploration. Fabio Pardo, Vitaly Levdik and Petar Kormushev. [pdf]
- Is Q-learning Provably Efficient? Chi Jin, Zeyuan Allen-Zhu, Sébastien Bubeck and Michael Jordan. [pdf]
- Directed Exploration in PAC Model-free Reinforcement Learning. Min-Hwan Oh and Garud Iyengar. [pdf]
- Depth and Nonlinearity Induce Implicit Exploration for RL. Justas Dauparas, Ryota Tomioka and Katja Hofmann. [pdf]
- Bounding Regret in Simulated Games. Steven Jecmen, Erik Brinkman and Arunesh Sinha. [pdf]