Accepted papers
Note: contributed talks are denoted by a star (*).
*High-Level Strategy Selection under Partial Observability in StarCraft: Brood War; Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier and Gabriel Synnaeve
*Joint Belief Tracking and Reward Optimization through Approximate Inference; Pavel Shvechikov, Alexander Grishin, Arseny Kuznetsov, Alexander Fritzler and Dmitry Vetrov
*Learning Dexterous In-Hand Manipulation; Marcin Andrychowicz, Bowen Baker, Maciej Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng and Wojciech Zaremba
*Differentiable Algorithm Networks: Learning Wrong Models for Wrong Algorithms; Peter Karkus, David Hsu, Leslie Pack Kaelbling and Tomas Lozano-Perez
Lazy-CFR: a fast regret minimization algorithm for extensive games with imperfect information; Yichi Zhou
ExIt-OOS: Towards Learning from Planning in Imperfect Information Games; Andy Kitchen and Michela Benedetti
Transfer in Model Based Reinforcement Learning as a Partial Observability Problem; Akshay Narayan and Tze-Yun Leong
Learning What to Remember with Online Policy Gradient Over a Reservoir; Kenny Young and Richard Sutton
Sim-to-Real Optimization of Very Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play; Yongxi Tan, Jin Yang, Xin Chen, Qitao Song, Yunjun Chen, Zhangxiang Ye and Zhenqiang Su
Learning Coordination in Adversarial Multi-Agent DQN with dec-POMDPs; Elhadji Amadou Oury Diallo and Toshiharu Sugawara
PF-LSTM: Belief State Particle Filter for LSTM; Xiao Ma, Peter Karkus, David Hsu and Wee Sun Lee
Influence of Recent History to System Controllability; Vladimír Petrík and Ville Kyrki
M^3RL: Mind-aware Multi-agent Management Reinforcement Learning; Tianmin Shu and Yuandong Tian
Imitation Learning via Bootstrapped Demonstrations in an Open World Video Game; Igor Borovikov and Ahmad Beirami
Training Agents to Play Modern Games: Challenges and Opportunities; Yunqi Zhao, Ahmad Beirami, Mohsen Sardari, Navid Aghdaie and Kazi Zaman
Explicit Sequence Proximity Models for Hidden State Identification; Torbjorn Dahl, Anil Kota, Sharath Chandra and Parag Khanna
Learning Minimal Sufficient Representations of Partially Observable Decision Processes; Tommaso Furlanello, Amy Zhang, Kamyar Azizzadenesheli, Anima Anandkumar, Zachary C. Lipton, Laurent Itti and Joelle Pineau
A Baseline of Discovery for General Value Function Networks under Partial Observability; Matthew Schlegel, Adam White and Martha White
Learning Internal State Models in Partially Observable Environments; Andrea Baisero and Christopher Amato
Neural belief states for partially observed domains; Pol Moreno, Jan Humplik, George Papamakarios, Bernardo Ávila Pires, Lars Buesing, Nicolas Heess and Théophane Weber
Bandits with sequentially observed rewards: a Bayesian generative Thompson sampling approach; Iñigo Urteaga and Chris Wiggins
Prediction-Constrained POMDPs; Joseph Futoma, Michael Hughes and Finale Doshi-Velez
MCTS on model-based Bayesian Reinforcement Learning for efficient learning in Partially Observable environments; Sammie Katt, Frans Oliehoek and Christopher Amato
Learning a System-ID Embedding Space for Domain Specialization with Deep Reinforcement Learning; James Preiss, Karol Hausman and Gaurav Sukhatme
Options and partial observability: regret bounds by analogy with semi-supervised learning; Nicholas Denis and Maia Fraser
Deep Counterfactual Regret Minimization; Noam Brown, Adam Lerer, Sam Gross and Tuomas Sandholm