Robust and Uncertainty-Aware Reinforcement Learning
Abstract: Reinforcement learning algorithms enable the autonomous acquisition of complex skills, but typically do so at the cost of considerable sample complexity and a risky exploration phase. When learning in the real world, these challenges can preclude application to safety-critical domains, such as autonomous driving and robotic flight. In this talk, I will discuss how uncertainty-aware neural network models can enable deep reinforcement learning algorithms that recognize unfamiliar situations and revert automatically to a safer course of action, and how we can design maximum entropy reinforcement learning algorithms that automatically optimize for robustness. However, beyond building RL algorithms that are robust and aware of uncertainty, the need for safety also requires us to build algorithms that adapt quickly: it is too much to hope that we can build machines that never make mistakes, but perhaps we can build machines that make mistakes rarely and learn from them quickly and efficiently. To that end, I will also discuss our recent work on meta-learning, and how it can enable machines that quickly repair their mistakes when they inevitably happen.
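As a concrete illustration of the fallback idea mentioned in this abstract, the minimal sketch below shows how an agent might use the disagreement of a bootstrapped model ensemble as a novelty signal and revert to a conservative controller when that signal is high. This is a hypothetical sketch, not the speaker's implementation; the class, parameter names, and threshold are all assumptions.

```python
import numpy as np

class UncertaintyAwarePolicy:
    """Hypothetical sketch: an ensemble of learned models whose disagreement on
    the current state serves as a novelty signal; when disagreement is large,
    the agent reverts to a known-safe fallback policy."""

    def __init__(self, ensemble, learned_policy, safe_policy, threshold=0.1):
        self.ensemble = ensemble            # list of models mapping state -> action-value vector
        self.learned_policy = learned_policy
        self.safe_policy = safe_policy
        self.threshold = threshold          # illustrative value, would be tuned in practice

    def act(self, state):
        # Epistemic uncertainty proxy: variance across ensemble predictions.
        preds = np.stack([model(state) for model in self.ensemble])  # (n_models, n_actions)
        disagreement = preds.std(axis=0).mean()
        if disagreement > self.threshold:
            # Unfamiliar situation: fall back to the conservative controller.
            return self.safe_policy(state)
        return self.learned_policy(state)
```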
TBD
Learning Humans’ Preferences: A Human-Centered Approach for Safe Interactions
Abstract: Recent developments in artificial intelligence (AI) have enabled us to build AI agents and robots capable of performing complex tasks, including many that involve interaction with humans. In these tasks, it is desirable for robots to build predictive and robust models of humans’ behaviors and preferences: a robot manipulator collaborating with a human needs to predict her future trajectories, and humans riding in self-driving cars may have preferences for how cautiously the car should drive. In this talk, we will first discuss efficient active learning techniques for learning humans’ preferences by eliciting comparisons from a mixed set of humans, and then analyze the generalizability and robustness of such models for safe and seamless interaction with AI agents.
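To make the comparison-based approach in this abstract concrete, here is a minimal sketch of learning a reward model from pairwise preferences and actively choosing the next query. It assumes a Bradley-Terry-style likelihood over a linear reward model; all function and variable names are hypothetical, and this is not necessarily the specific algorithm presented in the talk.

```python
import numpy as np

def preference_likelihood(w, phi_a, phi_b):
    """Probability that a human prefers trajectory A over B, given trajectory
    feature vectors phi_a, phi_b and reward weights w (Bradley-Terry model)."""
    return 1.0 / (1.0 + np.exp(-(w @ (phi_a - phi_b))))

def update_weights(w, phi_a, phi_b, preferred_a, lr=0.1):
    """One gradient step on the log-likelihood of an observed comparison."""
    p = preference_likelihood(w, phi_a, phi_b)
    grad = ((1.0 if preferred_a else 0.0) - p) * (phi_a - phi_b)
    return w + lr * grad

def most_informative_query(w_samples, candidate_pairs):
    """Active learning heuristic: pick the pair whose outcome the current
    posterior samples are most uncertain about (mean probability near 0.5)."""
    def ambiguity(pair):
        phi_a, phi_b = pair
        probs = [preference_likelihood(w, phi_a, phi_b) for w in w_samples]
        return abs(np.mean(probs) - 0.5)
    return min(candidate_pairs, key=ambiguity)
```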
Reinforcement Learning for Robotics: Provable Safety and Performance Guarantees by Combining Models and Data
Abstract: In contrast to computers and smartphones, the promise of robotics is to design devices that can physically interact with the world. This interaction may be as simple as moving on roads or in the air, or as complex as physically collaborating with humans. Envisioning robots that work in human-centered, interactive environments challenges current robot algorithm design, which has largely been based on a priori knowledge about the system and its environment. In this talk, I will show how we combine models and data to achieve provably safe, high-performance robot behavior in the presence of uncertainties and unknown effects. Our work focuses on learning robot control and decision-making strategies, and enables robots to adapt their behavior as they move in the real world. I will highlight how we use prior knowledge to (i) appropriately place the learning module in the overall closed-loop system architecture (including choosing its inputs and outputs, and building hierarchical solutions with fast adaptation at the low level and transferable policy learning at the high level), (ii) enable safe online learning (i.e., allow the robot to move in the real world and start gathering data), (iii) design data-efficient algorithms, and (iv) provide provable performance and safety guarantees for the learning. We demonstrate our algorithms on self-flying and self-driving vehicles, as well as on mobile manipulators. More information and videos at: www.dynsyslab.org.
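One common way to "combine models and data", in the spirit of this abstract, is to pair a known nominal model with a residual learned online and to act cautiously where the data are sparse. The sketch below illustrates that idea only; the class, the ridge-regression residual, the distance-based uncertainty proxy, and the safe-set check are all assumptions, not the lab's actual algorithms.

```python
import numpy as np

class ResidualDynamicsLearner:
    """Hypothetical sketch: a known nominal model f_nominal plus a residual
    model fit online from observed transitions."""

    def __init__(self, f_nominal):
        self.f_nominal = f_nominal
        self.X, self.Y = [], []            # (state, action) inputs and residual targets

    def observe(self, state, action, next_state):
        x = np.concatenate([state, action])
        self.X.append(x)
        self.Y.append(next_state - self.f_nominal(state, action))

    def predict(self, state, action, ridge=1e-3):
        x = np.concatenate([state, action])
        if not self.X:
            return self.f_nominal(state, action), np.inf   # no data yet: maximal caution
        X, Y = np.array(self.X), np.array(self.Y)
        W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
        residual = x @ W
        # Crude uncertainty proxy: distance to the nearest training input.
        uncertainty = np.min(np.linalg.norm(X - x, axis=1))
        return self.f_nominal(state, action) + residual, uncertainty

def is_safe(predicted_state, uncertainty, safe_radius=1.0, margin=0.5):
    """Only act if the predicted state stays inside the safe set with a margin
    that grows with model uncertainty (all numbers are illustrative)."""
    return np.linalg.norm(predicted_state) + margin * uncertainty <= safe_radius
```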
A New Type of Safe Reinforcement Learning
Abstract: After reviewing some of the different forms of "safety" provided by various reinforcement learning algorithms, I will discuss the safety properties that are necessary to enable real high-risk RL applications. I will then provide a brief introduction to a new class of "safe" reinforcement learning algorithms that have these properties, and discuss how such algorithms can be created. I will conclude with a discussion of the many remaining challenges that must be overcome before RL can be applied responsibly to an even wider range of high-risk, real-world problems.
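One family of algorithms with this flavor certifies a candidate policy from previously collected data before deployment: estimate its performance off-policy, compute a confidence lower bound, and only deploy if the bound beats a baseline. The sketch below illustrates that pattern under simplifying assumptions (per-episode importance sampling, a Student's t lower bound); the function names are hypothetical and this is not necessarily the specific class of algorithms covered in the talk.

```python
import numpy as np
from scipy import stats

def importance_weighted_returns(episodes, candidate_policy, behavior_policy):
    """Per-episode importance-sampling estimates of the candidate policy's return,
    computed from data collected under the behavior policy.  Each episode is a
    (states, actions, rewards) tuple; policies return action probabilities."""
    estimates = []
    for states, actions, rewards in episodes:
        ratio = 1.0
        for s, a in zip(states, actions):
            ratio *= candidate_policy(a, s) / behavior_policy(a, s)
        estimates.append(ratio * sum(rewards))
    return np.array(estimates)

def is_safe_to_deploy(episodes, candidate_policy, behavior_policy,
                      baseline_performance, confidence=0.95):
    """Approve the candidate only if a one-sided confidence lower bound on its
    estimated performance exceeds the baseline's performance."""
    g = importance_weighted_returns(episodes, candidate_policy, behavior_policy)
    n = len(g)
    lower_bound = g.mean() - stats.t.ppf(confidence, n - 1) * g.std(ddof=1) / np.sqrt(n)
    return lower_bound >= baseline_performance
```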