Ensuring the offline reliability and online safety of reinforcement learning agents
Abstract: Reinforcement Learning (RL) agents can solve general problems with little to no knowledge of the underlying environment. These agents often learn through experience, using a trial-and-error strategy that can lead to practical innovations, but this randomized process may cause undesirable events. Safe RL studies how to make such agents more reliable and how to ensure they behave appropriately. We investigate these issues in online settings, where the agent interacts directly with the environment, and in offline settings, where the agent only has access to historical data. We develop new RL methods that exploit prior knowledge about the structure of the problem. In particular, we consider factored problems, where the dynamics of each state variable depend only on a small subset of variables. Exploiting this structure, we propose reliable offline algorithms that can improve the policy using less data, and online algorithms that comply with safety constraints while learning. Besides safety and reliability, we also touch on other issues preventing the deployment of RL to real-world tasks, such as partial observability, generalization, and high-dimensional data.
Bio: Thiago is a postdoctoral researcher in the Department of Software Science (SWS) at Radboud University Nijmegen, advised by Dr. Nils Jansen. Previously, he was a Ph.D. candidate in the Algorithmics Group at Delft University of Technology, advised by Dr. Matthijs Spaan. His research interests lie primarily in the automation of sequential decision-making, with a focus on reinforcement learning.
He obtained his M.Sc. degree in artificial intelligence from the Instituto de Matemática e Estatística at Universidade de São Paulo under the supervision of Prof. Leliane N. de Barros, and a bachelor's degree in computer science from the Departamento de Ciência da Computação at Universidade Federal de Lavras.