Despite many exciting developments in the field, reinforcement learning has yet to be made actionable in large-scale real-life settings. What are the pitfalls and challenges preventing this in practice?
Call for Papers: 29 March 2024
Submission Deadline: 25 May 2024 (Anywhere on Earth)
Reviewing period: 25 May - 31 May 2024
Notification date (papers): 31 May 2024
Workshop Date: 9 August 2024
Reinforcement Learning (RL) faces several challenges in practical applications, which can limit its effectiveness in real-life settings. At this year’s “I Can’t Believe It’s Not Better” workshop, formally accepted as part of the program of this year’s RL Conference, we welcome papers that critically examine the optimism surrounding RL and showcase the numerous challenging and intriguing open questions within the field. The workshop specifically aims to highlight the challenges that have thus far prevented RL from being actionable in large-scale real-life settings. We encourage contributions that shed light on aspects including (but not limited to):
Offline Settings or Limited Exploration: RL typically relies on exploration to learn optimal policies, but in offline settings the historical data is fixed and further interaction with the environment is not possible. Without exploration, RL agents struggle to discover optimal actions and may fail to generalise to unseen states.
Small Data Settings: RL algorithms often require large amounts of data to learn effectively, making them ill-suited for scenarios with limited data. In such settings, RL models may overfit to the available data or fail to learn meaningful patterns, leading to poor performance.
RL as Inference: RL can be framed as a problem of probabilistic inference, in which the agent infers which actions are best given observed states and rewards. However, this inference can be challenging, particularly in complex environments with high-dimensional state spaces or stochastic dynamics.
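As a brief illustration of the standard "control as inference" framing (notation introduced here only for intuition, not tied to any particular submission): introducing binary optimality variables $\mathcal{O}_t$ with $p(\mathcal{O}_t = 1 \mid s_t, a_t) \propto \exp\big(r(s_t, a_t)\big)$ turns "which actions were optimal?" into a posterior inference question, and the resulting maximum-entropy policy takes the form $\pi(a \mid s) \propto \exp\big(Q_{\mathrm{soft}}(s, a)\big)$; the difficulty of RL in this view is precisely the difficulty of approximate inference in a large, stochastic graphical model.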
Sparse Rewards in Long Horizons: RL struggles with sparse reward signals, especially in tasks with long time horizons where rewards are infrequent or delayed. In such cases, RL agents may struggle to learn effective policies due to the lack of feedback, leading to slow learning or suboptimal solutions.
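For intuition on how severe this can be (a standard worst-case argument, not a claim about any specific domain): in a task with horizon $H$, $|A|$ actions per step, and a reward obtained only by one particular action sequence, undirected exploration encounters that reward with probability $|A|^{-H}$ per episode, so the number of episodes before any learning signal appears grows exponentially with the horizon.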
Non-Unique Solutions in Inverse Problems: Inverse reinforcement learning, where the goal is to infer the underlying reward function from observed behaviour, can suffer from non-uniqueness of solutions. This ambiguity can make it difficult to accurately recover the true reward function, leading to uncertainty in learned policies.
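One classical way to see this ambiguity (the well-known potential-based shaping result, stated here only as illustration): for any potential function $\Phi$ over states, the shaped reward $R'(s, a, s') = R(s, a, s') + \gamma\,\Phi(s') - \Phi(s)$ induces exactly the same set of optimal policies as $R$, so expert behaviour alone can never distinguish $R$ from its shaped variants, nor from any positive rescaling of it.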
Sample Complexity Issues with Model-Based RL: Model-based RL approaches often require significant amounts of data to learn accurate models of the environment dynamics. High sample complexity can hinder the practical applicability of these methods, especially in real-world settings where data collection may be expensive or time-consuming.
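As a rough reference point (a well-known information-theoretic result, included only for intuition): even in the tabular setting with access to a generative model, learning an $\varepsilon$-optimal policy requires on the order of $|S|\,|A| / \big((1-\gamma)^{3} \varepsilon^{2}\big)$ samples up to logarithmic factors, and matters only get harder once function approximation and genuine exploration are involved.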
Distributional Shift: RL algorithms are sensitive to changes in the distribution of states or rewards between training and deployment environments. Distributional shift can arise due to changes in the environment dynamics or the introduction of new tasks, leading to poor generalisation and degraded performance of RL agents.
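To make this sensitivity concrete (a simple bound with notation introduced only for illustration): if a value estimate $\hat{V}$ is accurate under the training state distribution $d_{\mathrm{train}}$ but the agent is evaluated under a shifted distribution $d_{\mathrm{test}}$, its expected value under the two distributions can differ by up to $2\,\|\hat{V}\|_{\infty}\, D_{\mathrm{TV}}(d_{\mathrm{train}}, d_{\mathrm{test}})$, so even modest shifts are amplified when values are large or horizons are long.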
Addressing these challenges requires the development of novel algorithms and techniques that can effectively handle limited data, sparse rewards, uncertainty, and distributional shifts, thus making RL more robust and applicable to a wider range of real-world problems.
University of Texas
Université Laval
DeepMind
Imperial College London
Emory University
University of the Witwatersrand