Closing the Loop on Runtime Monitors with Fallback-Safe MPC
Abstract
When we rely on deep-learned models for robotic perception, we must recognize that these models may behave unreliably on inputs dissimilar from the training data, compromising the closed-loop system’s safety. This raises fundamental questions on how we can assess confidence in perception systems and to what extent we can take safety-preserving actions when external environmental changes degrade our perception model’s performance.
Therefore, we present a framework to certify the safety of a perception-enabled system deployed in novel contexts. To do so, we leverage robust model predictive control (MPC) to control the system using the perception estimates while maintaining the feasibility of a safety-preserving fallback plan that does not rely on the perception system. In addition, we calibrate a runtime monitor using recently proposed conformal prediction techniques to certifiably detect when the perception system degrades beyond the tolerance of the MPC controller, resulting in an end-to-end safety assurance.
On a photo-realistic aircraft taxiing simulator, we show that this control framework and calibration technique allow us to certify the system’s safety in a novel deployment context with orders of magnitude fewer samples than would be required to retrain the perception network. Furthermore, we illustrate the safety-preserving behavior of the MPC on simulated examples of a quadrotor.
Framework Overview
Specify a bound on nominal, in-distribution, ML-based perception errors
Robustly control the system with respect to the specified perception error bound with robust output-feedback MPC
Calibrate an out-of-distribution (OOD) detector to trigger a safety-preserving fallback strategy when the chosen perception error bound is violated
We prove an end-to-end safety guarantee to avoid OOD failures!
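At runtime, the steps above amount to a switching rule between the nominal controller and the fallback. The sketch below illustrates one way such a switch could look. The class `FallbackSwitch`, its field names, the `>=` alarm convention, and the choice to latch onto the fallback once triggered are all assumptions made for this example, not the paper's implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class FallbackSwitch:
    """Hypothetical supervisor: nominal control until the monitor alarms, then fallback."""

    # Nominal policy: maps the perception-based state estimate to an input.
    nominal_policy: Callable[[Sequence[float]], List[float]]
    # Fallback policy: perception-free, indexed by the number of steps since the alarm.
    fallback_policy: Callable[[int], List[float]]
    # Alarm threshold on the OOD detector's anomaly score (assumed calibrated offline).
    threshold: float
    # -1 means the fallback has not been triggered yet.
    steps_in_fallback: int = -1

    def act(self, state_estimate: Sequence[float], anomaly_score: float) -> List[float]:
        # Trigger the fallback the first time the anomaly score reaches the threshold.
        if self.steps_in_fallback < 0 and anomaly_score >= self.threshold:
            self.steps_in_fallback = 0

        # Once triggered, stay in the fallback (latching is an assumption of this sketch).
        if self.steps_in_fallback >= 0:
            u = self.fallback_policy(self.steps_in_fallback)
            self.steps_in_fallback += 1
            return u

        # Otherwise trust the perception estimate and apply the nominal input.
        return self.nominal_policy(state_estimate)
```

In a full system, the nominal policy would be the robust output-feedback MPC and the fallback policy would replay the fallback plan that the MPC keeps feasible at every step.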
Falling back into a Safe Recovery Set
Problem: When ML perception fails, it becomes impossible to estimate the full state!
Insight: we can often identify safe recovery regions that we can make invariant without full knowledge of the state.
Example: delivery drones don't need obstacle detectors to avoid mid-air collisions when landed in a field.
Case Study: Autonomous Aircraft Taxiing
DNN Perception, trained on clear-sky, morning weather data
Nominal, in-distribution, behavior
1. ML perception is reliable in-distribution
2. Errors are tolerable, yet nonzero
Anomalous, OOD, behavior
1. ML perception is arbitrarily poor OOD
2. Total loss of state information
3. Yet, possible to detect!
DNN perception causes crash in snowy weather
Preserve Safety OOD by Triggering Fallback
End-to-end safety using FS-MPC
1. By calibrating an OOD detector to flag the violation of the error bound assumptions underpinning the nominal control design
2. and ensuring the persistent feasibility of the fallback strategy
3. we guarantee safety end-to-end
Fallback-Safe MPC Framework
1. Insight: We can often identify a safe recovery region, a subset of the state space that we can make invariant without knowledge of the full state, but relying only on the limited remaining reliable information once perception suffers from OOD failures. The safe recovery region formalizes the design of a fallback strategy.
2. Leverage robust output-feedback MPC to jointly a) synthesize a safety-preserving fallback policy and b) modify nominal decision making so that the fallback remains feasible
3. Calibrate a runtime monitor, built on an OOD detector, with conformal prediction to trigger the fallback when the perception error bounds are violated
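To make the calibration step concrete, here is a minimal sketch of a generic split-conformal threshold. It assumes that we have anomaly scores recorded in situations where the perception error bound was violated, that higher scores mean "more anomalous", and that a new fault score is exchangeable with the calibration scores. Under those assumptions, alarming whenever the score reaches the returned threshold misses a fault with probability at most `delta`. The function names and the exact calibration setup are illustrative assumptions and may differ from the procedure in the paper.

```python
import math
from typing import Sequence


def missed_detection_threshold(fault_scores: Sequence[float], delta: float) -> float:
    """Return an alarm threshold with missed-detection rate at most delta.

    fault_scores: anomaly scores observed when the error bound was violated
                  (hypothetical calibration data).
    delta:        tolerated probability of failing to raise an alarm.
    """
    n = len(fault_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")

    # Under exchangeability, a new fault score falls strictly below the k-th
    # smallest calibration score with probability at most k / (n + 1).
    k = min(math.floor((n + 1) * delta), n)
    if k == 0:
        # Too few samples for this delta: the only valid rule is to always alarm.
        return -math.inf
    return sorted(fault_scores)[k - 1]


def raises_alarm(anomaly_score: float, threshold: float) -> bool:
    """Alarm when the anomaly score reaches the calibrated threshold."""
    return anomaly_score >= threshold
```

The `-inf` branch reflects a general property of conformal calibration: with few samples relative to `1 / delta`, the only rule that still carries the guarantee is the trivially conservative one.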
Case Study: Vision-guided Drone Landing
Goal: Safely land at origin
Perception: Simulate imperfect vision system to estimate x-y coordinate
Fallback: Flying away at a safe altitude avoids crashes, even if x-y coordinate is no longer known
Conclusion: Reasoning about perception faults is necessary to avoid failures!
Fallback-Safe MPC maintains fallback to safely abort landing on perception failure.
A naive tube MPC that assumes perception estimates are always correct crashes badly when a perception fault occurs!
Caption: The icons show the orientation of the quadrotor. The safe recovery region is highlighted in green. The blue-dashed line indicates the state constraint. Left: In red, we plot the predicted reachable sets of the fallback strategy and in light blue, we plot the predicted nominal trajectories. Right: In light-blue, we plot the predicted reachable tube of a naive tube MPC.
Case Study: Vision-guided Drone Navigates Across a Road
Goal: Reach in-air goal location on other side of the road, while satisfying x-y position constraint
Perception: Simulate imperfect vision system to estimate x-y coordinate
Fallback: Crash land on either side of the road, but crucially, not on the road (near origin)
Conclusion (baselines in paper):
Modifying nominal operations to maintain fallback-safety is necessary for safe recovery!
We can readily identify safe recovery regions in applications!
The Fallback-Safe MPC modifies nominal inputs to maintain fallback feasibility with respect to the safe recovery regions. To safely cross the road, the drone slows down and descends until it realizes it can avoid failures while crossing.
Open-Sourced Simulator Based on X-Plane 11
Along with the paper, we open-source the simulator developed for this work. Based on the popular X-Plane 11 aircraft simulator, we present a convenient Python-based simulation platform to test and benchmark the performance of perception and control algorithms when they experience out-of-distribution (OOD) scenarios in closed loop. Our simulator offers photo-realistic graphics and accurate physics simulation. Currently, the simulator offers a single control task: vision-based autonomous taxiing.
Users can flexibly define different weather-based OOD scenarios that may degrade vision. Currently, we support:
cloud levels
time-of-day
additive image noise
snow
snowfall
rain
motion-blur
combinations of the above
several severity levels per corruption type
linearly increasing and decreasing severity throughout an episode
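As an illustration of what a linearly increasing or decreasing severity profile over an episode means, the sketch below interpolates a severity value per time step. The function `linear_severity` and its parameters are hypothetical and are not part of the simulator's API; the simulator's own parameter files define how severities are actually specified.

```python
from typing import List


def linear_severity(num_steps: int, start: float, end: float) -> List[float]:
    """Hypothetical helper: one severity value per step, ramping from start to end."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    if num_steps == 1:
        return [float(start)]
    # Evenly spaced interpolation that includes both endpoints.
    return [start + (end - start) * t / (num_steps - 1) for t in range(num_steps)]
```

Passing `start < end` gives a corruption that worsens during the episode, and `start > end` gives one that clears up.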
Features offered:
Conveniently interact with the simulator through the XPlaneBridge Python API, similar to the CARLA client.
Modular and standardized abstractions for perception and estimation to facilitate development of control/perception algorithms or use of existing systems
Specify and run thousands of simulations by modifying example yaml param files to sample environment variations
A single lightweight example script for sampling environments and running and recording experiments
Some utilities to analyze data and create videos of all episodes in an experiment
Track episode statistics for a series of experiments from anywhere through the web using Weights & Biases
To get started:
git clone https://github.com/StanfordASL/XPlane-ASL.git
Out-of-Distribution scenarios sampled from the X-Plane 11 simulator
Citation
@article{SinhaSchmerlingPavone2023,
title={Closing the Loop on Runtime Monitors with Fallback-Safe MPC},
author={Sinha, Rohan and Schmerling, Ed and Pavone, Marco},
journal={arXiv preprint arXiv:2309.08603},
year={2023}
}
Acknowledgements:
The NASA University Leadership Initiative (grant #80NSSC20M0163) provided funds to assist the authors with their research, but this article solely reflects the opinions and conclusions of its authors and not any NASA entity.