Closing the Loop on Runtime Monitors with Fallback-Safe MPC

Rohan Sinha, Edward Schmerling, Marco Pavone


Abstract

When we rely on deep-learned models for robotic perception, we must recognize that these models may behave unreliably on inputs dissimilar from the training data, compromising the closed-loop system’s safety. This raises fundamental questions on how we can assess confidence in perception systems and to what extent we can take safety-preserving actions when external environmental changes degrade our perception model’s performance.

Therefore, we present a framework to certify the safety of a perception-enabled system deployed in novel contexts. To do so, we leverage robust model predictive control (MPC) to control the system using the perception estimates while maintaining the feasibility of a safety-preserving fallback plan that does not rely on the perception system. In addition, we calibrate a runtime monitor using recently proposed conformal prediction techniques to certifiably detect when the perception system degrades beyond the tolerance of the MPC controller, resulting in an end-to-end safety assurance.

We show, on a photo-realistic aircraft taxiing simulator, that this control framework and calibration technique certify the system’s safety in a novel deployment context with orders of magnitude fewer samples than are required to retrain the perception network. Furthermore, we illustrate the safety-preserving behavior of the MPC on simulated examples of a quadrotor.

Framework Overview

Specify a bound on nominal, in-distribution, ML-based perception errors
Robustly control the system with respect to the specified perception error bound with robust output-feedback MPC
Calibrate an OOD detector to trigger a safety-preserving fallback strategy when the chosen perception error bound is violated

We prove an end-to-end safety guarantee to avoid OOD failures!

Falling back into a Safe Recovery Set

Problem: When ML perception fails, it becomes impossible to estimate the full state!
Insight: we can often identify safe recovery regions that we can make invariant without full knowledge of the state.
Example: delivery drones don't need obstacle detectors to avoid mid-air collisions when landed in a field.

Case Study: Autonomous Aircraft Taxiing

DNN Perception, trained on clear-sky, morning weather data

Nominal, in-distribution, behavior

1. ML perception is reliable in-distribution

2. Errors are tolerable, yet nonzero

Anomalous, OOD, behavior

1. ML perception is arbitrarily poor OOD

2. Total loss of state information

3. Yet, possible to detect!

DNN perception causes crash in snowy weather

Preserve Safety OOD by Triggering Fallback

End-to-end safety using FS-MPC

1. By calibrating an OOD detector to flag the violation of the error bound assumptions underpinning the nominal control design

2. and ensuring the persistent feasibility of the fallback strategy

3. we guarantee safety end-to-end
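
The switching logic implied by the three points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monitor is assumed to output a scalar anomaly score, and the fallback is assumed to stay active once triggered. The class name `FallbackSwitch`, its fields, and the latching behavior are hypothetical and not taken from the authors' code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FallbackSwitch:
    """Latching switch between a nominal and a fallback control input (hypothetical)."""

    threshold: float         # alarm threshold on the anomaly score (assumed calibrated offline)
    triggered: bool = False  # becomes True at the first alarm and stays True

    def select(self, anomaly_score: float, nominal_input: Any, fallback_input: Any) -> Any:
        # Raise the alarm when the score reaches the threshold.
        if anomaly_score >= self.threshold:
            self.triggered = True
        # After the first alarm, perception is no longer trusted, so the
        # fallback input is returned on every subsequent call.
        return fallback_input if self.triggered else nominal_input


switch = FallbackSwitch(threshold=0.5)
inputs = [switch.select(s, "nominal", "fallback") for s in (0.1, 0.9, 0.1)]
```

In this sketch the third call still returns the fallback input even though the score has dropped, which reflects the latching assumption.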

Fallback-Safe MPC Framework

1. Insight: We can often identify a safe recovery region, a subset of the state space that we can make invariant without knowledge of the full state, but relying only on the limited remaining reliable information once perception suffers from OOD failures. The safe recovery region formalizes the design of a fallback strategy.

2. Leverage robust output-feedback MPC to jointly a) synthesize a safety-preserving fallback policy and b) modify nominal decision making to ensure feasibility of the fallback

3. Calibrate a runtime monitor, based on an OOD detector, with conformal prediction to trigger the fallback when perception error bounds are violated
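
To make the calibration step concrete, here is a hedged Python sketch of one generic conformal-style way to pick an alarm threshold. It is not necessarily the procedure used in the paper. The assumptions are: we have anomaly scores recorded at moments when the perception error bound was violated in calibration episodes, a future violation produces a score exchangeable with them, and the monitor alarms when the score is at least the threshold. The function name and arguments are hypothetical.

```python
import math
from typing import Sequence


def conformal_alarm_threshold(fault_scores: Sequence[float], delta: float) -> float:
    """Return a threshold t such that, under exchangeability, a new fault score
    falls below t (a missed detection) with probability at most delta."""
    n = len(fault_scores)
    # Rank of the order statistic used as threshold: k = floor((n + 1) * delta).
    k = math.floor((n + 1) * delta)
    if k < 1:
        # Too few calibration samples for this delta: the only threshold with
        # the guarantee is one that always raises the alarm.
        return float("-inf")
    # k-th smallest calibration score (1-indexed).
    return sorted(fault_scores)[k - 1]


# 19 calibration scores and delta = 0.25 give k = 5, i.e. the 5th smallest score.
threshold = conformal_alarm_threshold(list(range(1, 20)), delta=0.25)
```

The miss probability bound follows because the rank of the new score among the n + 1 exchangeable scores is uniformly distributed, so it lies strictly below the k-th smallest calibration score with probability at most k / (n + 1).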

Case Study: Vision-guided Drone Landing

Goal: Safely land at the origin

Perception: A simulated imperfect vision system estimates the x-y coordinates

Fallback: Flying away at a safe altitude avoids crashes, even if the x-y coordinates are no longer known

Conclusion: Reasoning about perception faults is necessary to avoid failures!

Fallback-Safe MPC maintains a feasible fallback so that it can safely abort the landing when perception fails.

A naive tube MPC that assumes perception estimates are always correct crashes badly when a perception fault occurs!

Caption: The icons show the orientation of the quadrotor. The safe recovery region is highlighted in green. The blue-dashed line indicates the state constraint. Left: In red, we plot the predicted reachable sets of the fallback strategy and in light blue, we plot the predicted nominal trajectories. Right: In light-blue, we plot the predicted reachable tube of a naive tube MPC.
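
The idea of keeping an abort maneuver feasible during descent can be illustrated with a toy check in Python. This is a simplified sketch and not the paper's quadrotor model: it assumes a one-dimensional vertical double integrator with a constant maximum upward acceleration, and all names and numbers are hypothetical.

```python
def can_abort_descent(z: float, vz: float, a_up_max: float, z_min: float = 0.0) -> bool:
    """Toy test of whether a descending vehicle can stop before reaching z_min.

    z        : current altitude
    vz       : vertical velocity (negative when descending)
    a_up_max : maximum net upward acceleration available, must be positive
    z_min    : lowest admissible altitude
    """
    if a_up_max <= 0.0:
        raise ValueError("a_up_max must be positive")
    if vz >= 0.0:
        # Already climbing or hovering: only the current altitude matters.
        return z >= z_min
    # Altitude lost while braking at constant acceleration: v^2 / (2 a).
    braking_drop = vz * vz / (2.0 * a_up_max)
    return z - braking_drop >= z_min


high_ok = can_abort_descent(z=5.0, vz=-2.0, a_up_max=2.0)  # loses 1.0 while braking
low_ok = can_abort_descent(z=0.5, vz=-2.0, a_up_max=2.0)   # would dip below z_min
```

A controller that only commands states for which such a check holds will always retain a way to abort, which is the intuition behind constraining the nominal plan by fallback feasibility.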

Case Study: Vision-guided Drone Navigates Across a Road

Goal: Reach an in-air goal location on the other side of the road while satisfying the x-y position constraint

Perception: A simulated imperfect vision system estimates the x-y coordinates

Fallback: Crash-land on either side of the road but, crucially, not on the road (near the origin)

Conclusion (baselines in paper):

Modifying nominal operations to maintain fallback-safety is necessary for safe recovery!
We can readily identify safe recovery regions in applications!

The Fallback-Safe MPC modifies nominal inputs to maintain fallback feasibility with respect to the safe recovery regions. To safely cross the road, the drone slows down and descends until it realizes it can avoid failures while crossing.

Open-Sourced Simulator Based on X-Plane 11

Along with the paper, we open-source the simulator developed for this work. Based on the popular X-Plane 11 aircraft simulator, it is a convenient Python-based simulation platform for testing and benchmarking perception and control algorithms when they encounter out-of-distribution (OOD) scenarios in closed loop. The simulator offers photo-realistic graphics and accurate physics simulation. Currently, it offers a single control task: vision-based autonomous taxiing.

Users can flexibly define different weather-based OOD scenarios that may degrade vision. Currently, we support:

cloud levels
time-of-day
additive image noise
snow
snowfall
rain
motion-blur

combinations of the above
several severity levels per corruption type
linearly increasing and decreasing severity throughout an episode
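
As an illustration of what a linearly changing severity over an episode could look like, here is a small Python sketch. It does not use or reproduce the simulator's API; the function name, its arguments, and the severity scale are hypothetical.

```python
def linear_severity(step: int, n_steps: int, start: float, end: float) -> float:
    """Linearly interpolate a corruption severity from `start` to `end`.

    step    : current time step, 0 <= step < n_steps
    n_steps : total number of steps in the episode
    """
    if not 0 <= step < n_steps:
        raise ValueError("step must satisfy 0 <= step < n_steps")
    if n_steps == 1:
        return start
    fraction = step / (n_steps - 1)  # 0.0 at the first step, 1.0 at the last
    return start + fraction * (end - start)


# Severity ramping up from 0.0 to 1.0 over an 11-step episode.
ramp_up = [linear_severity(t, 11, 0.0, 1.0) for t in range(11)]
```

Swapping `start` and `end` gives a decreasing schedule.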

Features offered:

Conveniently interact with the simulator through the XPlaneBridge Python API, similar to the CARLA client.
Modular and standardized abstractions for perception and estimation to facilitate development of control/perception algorithms or use of existing systems
Specify and run thousands of simulations by modifying example YAML parameter files to sample environment variations
A single lightweight example script for sampling environments and running and recording experiments
Some utilities to analyze data and create videos of all episodes in an experiment
Track episode statistics for a series of experiments from anywhere on the web using Weights & Biases

To get started:

git clone https://github.com/StanfordASL/XPlane-ASL.git

Out-of-Distribution scenarios sampled from the X-Plane 11 simulator

Citation

@article{SinhaSchmerlingPavone2023,
  title={Closing the Loop on Runtime Monitors with Fallback-Safe MPC},
  author={Sinha, Rohan and Schmerling, Ed and Pavone, Marco},
  journal={arXiv preprint arXiv:2309.08603},
  year={2023}
}

Acknowledgements:

The NASA University Leadership Initiative (grant #80NSSC20M0163) provided funds to assist the authors with their research, but this article solely reflects the opinions and conclusions of its authors and not any NASA entity.
