Accepted papers

Note: contributed talks are denoted by a star (*).

*High-Level Strategy Selection under Partial Observability in StarCraft: Brood War; Jonas Gehring, Da Ju, Vegard Mella, Daniel Gant, Nicolas Usunier and Gabriel Synnaeve

*Joint Belief Tracking and Reward Optimization through Approximate Inference; Pavel Shvechikov, Alexander Grishin, Arseny Kuznetsov, Alexander Fritzler and Dmitry Vetrov

*Learning Dexterous In-Hand Manipulation; Marcin Andrychowicz, Bowen Baker, Maciej Chociej, Rafal Jozefowicz, Bob McGrew, Jakub Pachocki, Arthur Petron, Matthias Plappert, Glenn Powell, Alex Ray, Jonas Schneider, Szymon Sidor, Josh Tobin, Peter Welinder, Lilian Weng and Wojciech Zaremba

*Differentiable Algorithm Networks: Learning Wrong Models for Wrong Algorithms; Peter Karkus, David Hsu, Leslie Pack Kaelbling and Tomas Lozano-Perez

Lazy-CFR: a fast regret minimization algorithm for extensive games with imperfect information; Yichi Zhou

ExIt-OOS: Towards Learning from Planning in Imperfect Information Games; Andy Kitchen and Michela Benedetti

Transfer in Model Based Reinforcement Learning as a Partial Observability Problem; Akshay Narayan and Tze-Yun Leong

Learning What to Remember with Online Policy Gradient Over a Reservoir; Kenny Young and Richard Sutton

Sim-to-Real Optimization of Very Complex Real World Mobile Network with Imperfect Information via Deep Reinforcement Learning from Self-play; Yongxi Tan, Jin Yang, Xin Chen, Qitao Song, Yunjun Chen, Zhangxiang Ye and Zhenqiang Su

Learning Coordination in Adversarial Multi-Agent DQN with dec-POMDPs; Elhadji Amadou Oury Diallo and Toshiharu Sugawara

PF-LSTM: Belief State Particle Filter for LSTM; Xiao Ma, Peter Karkus, David Hsu and Wee Sun Lee

Influence of Recent History to System Controllability; Vladimír Petrík and Ville Kyrki

M^3RL: Mind-aware Multi-agent Management Reinforcement Learning; Tianmin Shu and Yuandong Tian

Imitation Learning via Bootstrapped Demonstrations in an Open World Video Game; Igor Borovikov and Ahmad Beirami

Training Agents to Play Modern Games: Challenges and Opportunities; Yunqi Zhao, Ahmad Beirami, Mohsen Sardari, Navid Aghdaie and Kazi Zaman

Explicit Sequence Proximity Models for Hidden State Identification; Torbjorn Dahl, Anil Kota, Sharath Chandra and Parag Khanna

Learning Minimal Sufficient Representations of Partially Observable Decision Processes; Tommaso Furlanello, Amy Zhang, Kamyar Azizzadenesheli, Anima Anandkumar, Zachary C. Lipton, Laurent Itti and Joelle Pineau

A Baseline of Discovery for General Value Function Networks under Partial Observability; Matthew Schlegel, Adam White and Martha White

Learning Internal State Models in Partially Observable Environments; Andrea Baisero and Christopher Amato

Neural belief states for partially observed domains; Pol Moreno, Jan Humplik, George Papamakarios, Bernardo Ávila Pires, Lars Buesing, Nicolas Heess and Théophane Weber

Bandits with sequentially observed rewards: a Bayesian generative Thompson sampling approach; Iñigo Urteaga and Chris Wiggins

Prediction-Constrained POMDPs; Joseph Futoma, Michael Hughes and Finale Doshi-Velez

MCTS on model-based Bayesian Reinforcement Learning for efficient learning in Partially Observable environments; Sammie Katt, Frans Oliehoek and Christopher Amato

Learning a System-ID Embedding Space for Domain Specialization with Deep Reinforcement Learning; James Preiss, Karol Hausman and Gaurav Sukhatme

Options and partial observability: regret bounds by analogy with semi-supervised learning; Nicholas Denis and Maia Fraser

Deep Counterfactual Regret Minimization; Noam Brown, Adam Lerer, Sam Gross and Tuomas Sandholm