Deep Equilibrium Model Predictive Control
Swaminathan Gurumurthy, Khai Nguyen, Arun Bishop, Zachary Manchester, and Zico Kolter
CoRL 2025, Korea
Incorporating task-specific priors within a policy or network architecture is crucial for enhancing safety and improving representation and generalization in robotic control problems. Differentiable Model Predictive Control (MPC) layers have proven effective for embedding these priors, such as constraints and cost functions, directly within the architecture, enabling end-to-end training. However, current methods often treat the solver and the neural network as separate, independent entities, leading to suboptimal integration. In this work, we propose a novel approach that co-develops the solver and the architecture, unifying the optimization solver and network inference problems. Specifically, we formulate this as a joint fixed-point problem over the coupled network outputs and the necessary conditions of the optimization problem. We solve this problem iteratively, alternating between network forward passes and optimization iterations. Through extensive ablations on various robotic control tasks, we demonstrate that our approach yields richer representations and more stable training, while naturally accommodating warm starts, a key requirement for MPC.
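The sketch below illustrates the alternating scheme described above: a network pass refines the optimization problem's parameters given the current trajectory iterate, and a solver pass refines the trajectory given those parameters, repeated until a joint fixed point. This is a minimal, hedged illustration, not the authors' implementation; `PolicyNet`, `mpc_iteration`, `deq_mpc_forward`, and all dimensions and iteration counts are illustrative assumptions, and the inner solver is replaced by a toy tracking step.

```python
# Minimal sketch of the DEQ-MPC idea: alternate between a network forward
# pass that refines the problem parameters and a few optimizer iterations on
# the resulting trajectory, treating the pair as a joint fixed point.
# All names and the toy inner solver are assumptions for illustration only.

import torch
import torch.nn as nn


class PolicyNet(nn.Module):
    """Hypothetical network mapping (observation, current trajectory iterate)
    to cost/reference parameters of the MPC problem."""

    def __init__(self, obs_dim, traj_dim, param_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + traj_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, param_dim),
        )

    def forward(self, obs, traj):
        return self.net(torch.cat([obs, traj.flatten(-2)], dim=-1))


def mpc_iteration(params, traj, step=0.1):
    """Stand-in for a few inner solver iterations (e.g. one augmented
    Lagrangian / Gauss-Newton step). Here: a gradient step on a toy
    reference-tracking cost, to keep the sketch self-contained."""
    ref = params[..., : traj.numel() // traj.shape[0]].reshape_as(traj)
    return traj - step * (traj - ref)


def deq_mpc_forward(net, obs, traj_init, num_outer_iters=10):
    """Joint fixed-point iteration: network pass -> solver pass, repeated."""
    traj = traj_init
    for _ in range(num_outer_iters):
        params = net(obs, traj)             # network refines problem parameters
        traj = mpc_iteration(params, traj)  # solver refines the trajectory
    return traj


if __name__ == "__main__":
    horizon, state_dim, obs_dim = 5, 4, 8
    net = PolicyNet(obs_dim, horizon * state_dim, horizon * state_dim)
    obs = torch.randn(1, obs_dim)
    traj0 = torch.zeros(1, horizon, state_dim)  # cold start (or previous solution)
    print(deq_mpc_forward(net, obs, traj0).shape)
```

Because the trajectory iterate is an explicit input to the network, the same loop accepts a warm start by simply passing the previous solution as `traj_init`.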
A real quadrotor is tasked with navigating through a cluttered environment filled with numerous virtual obstacles.
Real robot demonstration
Playback with obstacles
We present navigation playbacks (from the respective hardware runs) for four methods. Obstacles that turn red indicate collisions.
DEQ-MPC-DEQ succeeds
Diff-MPC-DEQ crashes early
DEQ-MPC-NN collides
Diff-MPC-NN collides
We evaluate DEQ-MPC variants and Diff-MPC baselines on a range of challenging tasks in both domains:
(1) simulation, as shown in Table 1 (pendulum, cartpole, quadrotor, quadrotor-pole, and quadrotor-pole with static/dynamic obstacles),
(2) the real world, as shown in Table 2 (Crazyflie quadrotor navigating through static obstacles).
We demonstrate DEQ-MPC’s enhanced representation capabilities. First, DEQ-MPC variants scale more effectively with dataset size and model capacity. Second, they show less performance degradation as constraint complexity increases.
Ablation studies: generalization, network capacity, constraint hardness, gradient niceness, parameter sensitivity, and warm-starting.
Our experimental results highlight several key advantages of DEQ-MPC over differentiable MPC layers. The performance gap between DEQ-MPC variants and Diff-MPC becomes increasingly apparent as task complexity increases, whether through harder constraints, longer planning horizons, or increased problem sensitivity. A particularly promising aspect of DEQ-MPC is its favorable scaling behavior. Unlike Diff-MPC variants, which show signs of performance saturation, DEQ-MPC models continue to improve with increasing dataset size and network capacity, suggesting potential for exploiting scaling laws in robotics applications. Furthermore, DEQ-MPC's effectiveness under warm starts, requiring fewer augmented Lagrangian iterations while maintaining performance, offers significant practical advantages for real-world deployment. This advantage was also evident in our hardware experiments, where DEQ-MPC methods demonstrated superior reliability. Interestingly, there are trade-offs even between the DEQ-MPC variants: while DEQ-MPC-NN performs slightly better on average in simulation, DEQ-MPC-DEQ remains stable across a wider range of conditions, suggesting a trade-off between performance and stability.
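To make the warm-starting point concrete, the sketch below builds on the `deq_mpc_forward` sketch above: in a receding-horizon loop, the previous solution (shifted forward by one step) initializes the next joint iteration, so only a handful of outer (augmented Lagrangian style) iterations are needed instead of a full cold-start solve. All names and iteration counts are illustrative assumptions, not the authors' implementation.

```python
# Hedged warm-starting illustration, reusing `deq_mpc_forward` and `PolicyNet`
# from the sketch above. The iteration counts (10 cold, 3 warm) are assumed
# for illustration only.

import torch


def shift_trajectory(traj):
    """Shift the previous plan forward one step, repeating the last knot point."""
    return torch.cat([traj[:, 1:], traj[:, -1:]], dim=1)


def closed_loop(net, observations, horizon=5, state_dim=4,
                cold_iters=10, warm_iters=3):
    """Run the joint network/solver iteration once per control step,
    warm-started from the shifted previous solution."""
    traj = torch.zeros(1, horizon, state_dim)                       # cold start once
    traj = deq_mpc_forward(net, observations[0], traj, cold_iters)
    for obs in observations[1:]:                                    # warm starts after
        traj = deq_mpc_forward(net, obs, shift_trajectory(traj), warm_iters)
    return traj
```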