Fast, Smooth and Safe: Implicit Control Barrier Functions using Reach-Avoid Differential Dynamic Programming

Athindran Ramesh Kumar, Kai-Chieh Hsu, Peter J. Ramadge, Jaime F. Fisac

Abstract

Safety is a central requirement for autonomous system operation across domains. Hamilton-Jacobi (HJ) reachability analysis can be used to construct "least-restrictive" safety filters that result in infrequent, but often extreme, control overrides. In contrast, control barrier function (CBF) methods apply smooth control corrections to guard the system against an often conservative safety boundary. This paper provides an online scheme to construct an implicit CBF through HJ reach-avoid differential dynamic programming in a receding-horizon framework, enabling smooth safety filtering with infinite-time safety guarantees. Simulations with the Dubins car and 5D bicycle dynamics demonstrate the scheme's ability to preserve safety smoothly without the conservativeness of handcrafted CBFs.

What does Reach-avoid CBF-DDP do?

The goal of this approach is to combine run-time safety guarantees with smoother safety filtering. We provide a novel construction of a control barrier function (CBF), approximated up to second order, in real time using reach-avoid Differential Dynamic Programming (DDP). The reach-avoid value function is shown to be a valid CBF by construction. The CBF in turn defines an Active Set Invariance Filter (ASIF), and a quadratic program is solved online to obtain the filtered safety control. Without interfering constraints, the filtered controls display better smoothness properties. With multiple interfering constraints from yaw and road boundaries, the controls are jerkier, but the resulting trajectories are safer and smoother. We use JAX to accelerate computation while running on a CPU.

At each control cycle, we re-solve for the locally optimal reach-avoid value function and the optimal safety controls using iLQR, which yields the optimal trajectory and controls. We then apply an ASIF based on a sequence of quadratically constrained quadratic programs to satisfy a CBF constraint. This filtering guarantees safe operation with continuous controls.
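As a rough illustration of the filtering step, the sketch below handles the simplest case: a single CBF constraint that has been linearized in the control, so that the program reduces to projecting the task control onto a half-space. This is a first-order simplification and not the quadratically constrained program described above. The function name `filter_control` and the arguments `grad` and `offset` are hypothetical; they stand for the coefficients of a constraint of the form grad . u + offset >= 0.

```python
def filter_control(u_task, grad, offset):
    """Return the control closest to u_task (Euclidean norm) that
    satisfies the linear constraint  grad . u + offset >= 0.

    Assumption: the CBF constraint has already been linearized in u;
    grad and offset are the resulting (hypothetical) coefficients.
    """
    slack = sum(g * u for g, u in zip(grad, u_task)) + offset
    if slack >= 0.0:
        # Task control already satisfies the constraint: no override.
        return list(u_task)
    norm_sq = sum(g * g for g in grad)
    if norm_sq == 0.0:
        raise ValueError("infeasible: zero gradient with negative offset")
    # Smallest correction along grad that brings the constraint to zero.
    scale = -slack / norm_sq
    return [u + scale * g for u, g in zip(u_task, grad)]


# A control that satisfies the constraint is returned unchanged.
unchanged = filter_control([1.0, 0.0], [0.0, 1.0], 0.5)
# A violating control is moved just far enough to satisfy it.
corrected = filter_control([0.0, -1.0], [0.0, 1.0], 0.5)
```

Because the correction is proportional to the constraint violation, the filtered control changes gradually as the constraint becomes active, which is the qualitative behavior the CBF filter aims for.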

For our reach-avoid setup, the failure set is the set of states inside obstacles and outside road boundaries. The target set is the set of all states from which applying maximum braking with zero changes in steering brings the car to a safe halt.
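The two sets can be sketched in code under some simplifying assumptions: circular obstacles, a straight road centered on y = 0, a state ordered as (x, y, v, psi, delta), and Euler integration. All names, the wheelbase, the braking limit, and the time step below are hypothetical choices for illustration, not values from the paper.

```python
import math


def bicycle_step(state, accel, steer_rate, dt=0.05, wheelbase=0.5):
    """One Euler step of a kinematic bicycle (assumed state ordering)."""
    x, y, v, psi, delta = state
    return (
        x + dt * v * math.cos(psi),
        y + dt * v * math.sin(psi),
        max(v + dt * accel, 0.0),  # assumption: the car never reverses
        psi + dt * v * math.tan(delta) / wheelbase,
        delta + dt * steer_rate,
    )


def failure_margin(state, obstacles, road_half_width):
    """Positive outside the failure set, non-positive inside it."""
    x, y = state[0], state[1]
    margin = road_half_width - abs(y)  # distance to the road boundary
    for ox, oy, radius in obstacles:
        margin = min(margin, math.hypot(x - ox, y - oy) - radius)
    return margin


def in_target_set(state, obstacles, road_half_width,
                  max_brake=1.0, dt=0.05, max_steps=1000):
    """True if maximum braking with the steering angle held fixed
    stops the car before it enters the failure set."""
    for _ in range(max_steps):
        if failure_margin(state, obstacles, road_half_width) <= 0.0:
            return False
        if state[2] <= 0.0:
            return True  # halted safely
        state = bicycle_step(state, -max_brake, 0.0, dt)
    return False


start = (0.0, 0.0, 1.0, 0.0, 0.0)
far_obstacle_ok = in_target_set(start, [(5.0, 0.0, 0.5)], 1.0)
near_obstacle_ok = in_target_set(start, [(0.4, 0.0, 0.2)], 1.0)
```

With these numbers, the car stops well short of the far obstacle, so the start state is in the target set; with the near obstacle it cannot stop in time.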

Results

We demonstrate our method on two benchmarks, described below.

Dubins car

The Dubins car has a 3D state and a 1D control. The only input to the system is steering, and the velocity is fixed at 0.7. On the right, we highlight the differences in trajectory and control smoothness between CBF-DDP and the least-restrictive filter (LR-DDP) as the car passes the edge of an obstacle.
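For reference, a minimal sketch of Dubins car dynamics is given below, assuming the state is (x, y, heading), the control is the turn rate, and a simple Euler discretization. Only the speed of 0.7 comes from the text; the time step and function name are hypothetical.

```python
import math


def dubins_step(state, turn_rate, dt=0.05, speed=0.7):
    """One Euler step of a Dubins car moving at constant speed.

    state: (x, y, theta) with theta the heading in radians.
    turn_rate: the single control input (steering).
    """
    x, y, theta = state
    return (
        x + dt * speed * math.cos(theta),
        y + dt * speed * math.sin(theta),
        theta + dt * turn_rate,
    )


# Roll out a constant left turn for a few steps.
state = (0.0, 0.0, 0.0)
trajectory = [state]
for _ in range(10):
    state = dubins_step(state, turn_rate=0.5)
    trajectory.append(state)
```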

Kinematic Bicycle Dynamics

Naive Task Policy

We first run the kinematic bicycle dynamics (5D state and 2D control) with a naive task policy that uses linear feedback with some clever tricks. The acceleration tracks a reference velocity of 0.9. The steering velocity is adjusted using linear feedback to reach the center-line of the track at a look-ahead distance of 4.0. The naive task policy requires a lot of manual tuning to enable the robot to finish the track. Further, with LR-DDP, an additional yaw constraint is necessary to prevent the robot from making U-turns. In contrast, CBF-DDP always completes the track and gives smooth trajectories. The smoothness of the controls is most noticeable without yaw constraints.
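A linear-feedback policy of this kind might look like the sketch below. Only the reference velocity (0.9) and the look-ahead distance (4.0) come from the text. The gains, the state ordering (x, y, v, psi, delta), the assumption that the center-line is the x-axis, and the function name are all hypothetical, and the additional tricks mentioned above are not reproduced.

```python
import math

V_REF = 0.9      # reference velocity (from the text)
LOOKAHEAD = 4.0  # look-ahead distance (from the text)


def naive_task_policy(state, k_v=1.0, k_psi=1.0, k_delta=2.0):
    """Return (acceleration, steering velocity) from linear feedback.

    Assumptions: the center-line is the x-axis (y = 0), and the
    gains k_v, k_psi, k_delta are illustrative placeholders.
    """
    x, y, v, psi, delta = state
    # Acceleration: proportional tracking of the reference velocity.
    accel = k_v * (V_REF - v)
    # Bearing to the center-line point LOOKAHEAD ahead of the car.
    bearing = math.atan2(-y, LOOKAHEAD)
    error = bearing - psi
    # Wrap the heading error to (-pi, pi].
    error = math.atan2(math.sin(error), math.cos(error))
    # Steering velocity: drive the steering angle toward a desired
    # angle proportional to the heading error.
    delta_des = k_psi * error
    steer_rate = k_delta * (delta_des - delta)
    return accel, steer_rate


# On the center-line at the reference speed: no correction needed.
on_track = naive_task_policy((0.0, 0.0, 0.9, 0.0, 0.0))
# Offset to the left and too slow: speed up and steer to the right.
off_track = naive_task_policy((0.0, 1.0, 0.5, 0.0, 0.0))
```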

No yaw constraint

Introduce yaw constraint - 0.5*pi

Tighten yaw constraint - 0.4*pi

Lagrange iLQR Task Policy

Here we use an improved task policy that maintains soft constraints (e.g., yaw direction, minimum velocity) so that the robot does not stop. The safety plan intervenes to prevent crashes with obstacles. Once the robot reaches a state from which the task policy is deemed safe, control returns to the task plan, enabling task completion. With this improved task policy, we do not need an explicit yaw constraint in our safety plan.

No yaw constraint

In the above figure, we note that with CBF-DDP, the value function decays gradually towards zero and rises again after the obstacle has been passed. This is reflected in the smoothness of the controls as well. The value function switches only when the obstacle that the robot is trying to avoid changes.

On the other hand, LR-DDP applies safe controls only at the zero-level set, and oscillations along this boundary can cause bigger jerks in the control.
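The contrast between the two filters can be seen in a toy example that is far simpler than the benchmarks above: a 1D system x' = u that must keep x > 0, with the hand-picked value function V(x) = x and a task control that drives straight at the boundary. The discrete-time CBF condition V(x_next) >= gamma * V(x), the value of gamma, and all function names are assumptions made for this sketch.

```python
def lr_filter(x, u_task=-1.0, u_safe=1.0, dt=0.1):
    """Least-restrictive filter: keep the task control unless the
    next state would leave the safe set, then switch to u_safe."""
    return u_task if x + dt * u_task > 0.0 else u_safe


def cbf_filter(x, u_task=-1.0, dt=0.1, gamma=0.8):
    """CBF filter: smallest change to u_task such that
    V(x + dt * u) >= gamma * V(x), with V(x) = x."""
    u_min = -(1.0 - gamma) * x / dt
    return max(u_task, u_min)


def simulate(filter_fn, x0=1.0, steps=60, dt=0.1):
    """Roll out the filtered system; return final state and controls."""
    x, controls = x0, []
    for _ in range(steps):
        u = filter_fn(x)
        controls.append(u)
        x += dt * u
    return x, controls


def max_jump(controls):
    """Largest change in control between consecutive steps."""
    return max(abs(b - a) for a, b in zip(controls, controls[1:]))


x_lr, u_lr = simulate(lr_filter)
x_cbf, u_cbf = simulate(cbf_filter)
lr_jump, cbf_jump = max_jump(u_lr), max_jump(u_cbf)
```

In this toy, both filters keep the state positive, but the least-restrictive controls flip between -1 and +1 once the state reaches the boundary, while the CBF controls taper toward zero as the state approaches it.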

Robustness tests

CBF-DDP passes through a difficult configuration of obstacles, both with and without yaw constraints.

CBF-DDP tested with a different wheelbase.

Authors

Athindran Ramesh Kumar (arkumar[at]princeton.edu), Kai-Chieh Hsu, Peter J. Ramadge, Jaime F. Fisac

All authors are with the Department of ECE at Princeton University.

Citation

A. Ramesh Kumar, K. -C. Hsu, P. J. Ramadge and J. F. Fisac, "Fast, Smooth, and Safe: Implicit Control Barrier Functions Through Reach-Avoid Differential Dynamic Programming," in IEEE Control Systems Letters, vol. 7, pp. 2994-2999, 2023, doi: 10.1109/LCSYS.2023.3292132.