Accepted Papers
Toward Complex and Structured Goals in Reinforcement Learning. Guy Davidson, Todd M. Gureckis.
The Big World Hypothesis and its Ramifications for Artificial Intelligence. Khurram Javed, Richard S. Sutton.
The Need for a Big World Simulator: A Scientific Challenge for Continual Learning. Saurabh Kumar, Hong Jun Jeon, Alex Lewandowski, Benjamin Van Roy.
Satisficing Exploration for Deep Reinforcement Learning. Dilip Arumugam, Saurabh Kumar, Ramki Gummadi, Benjamin Van Roy.
Reward is (almost) enough. Arjun Jagota, Sonali Parbhoo.
Generalized Hyperbolic Discounting for Delay-Sensitive Reinforcement Learning. Raja Farrukh Ali, John Woods, Robert A Southern, Travis Smith, Vahid Behzadan, William Hsu.
Time is of the Essence: Why Decision-Time Planning Costs Matter. Kevin A. Wang, Jerry Xia, Stephen Chung, Jennifer Wang, Francisco Piedrahita Velez, Hung-Jen Wang, Amy Greenwald.
Investigating Biologically-Inspired Approaches for Continual Reinforcement Learning. Olya Mastikhina, Golnaz Mesbahi, Martha White.
RL in context - towards a framing that enables cybernetics-style questions. Vincent Létourneau, Maia Fraser.
Hyperbolic Discounting in Multi-Agent Reinforcement Learning. Raja Farrukh Ali, John Woods, Esmaeil Seraj, Kevin Duong, Vahid Behzadan, William Hsu.
Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models. Matthew Riemer, Gopeshh Subbaraj, Glen Berseth, Irina Rish.
Exploration Unbound. Dilip Arumugam, Wanqiao Xu, Benjamin Van Roy.
Milnor-Myerson Games and The Principles of Artificial Principal-Agent Problems. Manfred Diaz, Joel Z Leibo, Liam Paull.
Pick up the PACE: A Parameter-Free Optimizer for Lifelong Reinforcement Learning. Aneesh Muppidi, Zhiyu Zhang, Heng Yang.
MOSEAC: Streamlined Variable Time Step Reinforcement Learning. Dong Wang, Giovanni Beltrame.
Can we hop in general? A discussion of benchmark selection and design using the Hopper environment. Claas Voelcker, Marcel Hussing, Eric Eaton.
How to Specify Reinforcement Learning Objectives. W. Bradley Knox, James MacGlashan.
A Method for Evaluating Hyperparameter Sensitivity in Reinforcement Learning. Jacob Adkins, Michael Bowling, Adam White.
Learning telic-controllable state representations. Nadav Amir, Stas Tiomkin, Angela Langdon.
AI Alignment with Changing and Influenceable Reward Functions. Micah Carroll, Davis Foote, Anand Siththaranjan, Stuart Russell, Anca Dragan.
Replacing Implicit Regression with Classification in Policy Gradient Reinforcement Learning. Josiah P. Hanna, Brahma S Pavse, Abhinav Narayan Harish.
Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy. Cameron Allen, Aaron Kirtland, Ruo Yu Tao, Sam Lobel, Daniel Scott, Nicholas Petrocelli, Omer Gottesman, Ronald Parr, Michael L. Littman, George Konidaris.
ODGR: Online Dynamic Goal Recognition. Matan Shamir, Osher Elhadad, Matthew E. Taylor, Reuth Mirsky.
Characteristics of Effective Exploration for Transfer in Reinforcement Learning. Jonathan C Balloch, Rishav Bhagat, Geigh Zollicoffer, Ruoran Jia, Julia M. Kim, Mark Riedl.