Inverse optimal control: Theory and application

Overview


We study inverse optimal control (IOC) problems embedded in networked settings. In this framework, we depart from a given stabilizing control law with an associated control Lyapunov function and reverse-engineer the cost functional so that the controller is guaranteed to be optimal. Inverse optimal control thereby generates a whole family of optimal controllers, each corresponding to a different cost function, and provides analytically explicit, numerically feasible solutions in closed form. This approach circumvents the complexity of solving the partial differential equations that descend from dynamic programming and Bellman’s principle of optimality, even in the presence of disturbances in the dynamics and the cost. In networks, the controller obtained from inverse optimal control respects the topological structure (e.g., it is distributed) and is thus feasible for implementation. Its tuning is analogous to that of linear quadratic regulators.
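For a linear system, the reverse-engineering step can be sketched numerically: given a candidate control Lyapunov function V(x) = xᵀPx and an input penalty R, the feedback u = −R⁻¹BᵀPx is optimal for the state penalty Q read off from the algebraic Riccati equation. The system and matrices below are illustrative choices (a toy double integrator), not taken from the text.

```python
import numpy as np

# Hypothetical double-integrator example (not from the text): x_dot = A x + B u
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0.], [1.]])

# Candidate control Lyapunov function V(x) = x^T P x (P > 0) and input penalty R
P = np.array([[2., 1.], [1., 2.]])
R = np.array([[1.]])

# Inverse-optimal feedback u = -K x with K = R^{-1} B^T P
K = np.linalg.solve(R, B.T @ P)

# Reverse-engineer the state penalty from the algebraic Riccati equation:
#   A^T P + P A - P B R^{-1} B^T P + Q = 0
#   =>  Q = P B R^{-1} B^T P - A^T P - P A
Q = P @ B @ np.linalg.solve(R, B.T @ P) - A.T @ P - P @ A

print(np.linalg.eigvalsh(Q))         # [1., 2.]: Q > 0, so the cost is meaningful
print(np.linalg.eigvals(A - B @ K))  # real parts < 0: the closed loop is stable
```

By construction P solves the Riccati equation for this Q, so optimality comes for free; the only condition to verify is that the reconstructed Q is positive (semi)definite, which here it is.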


State and input constrained inverse optimal control

Flipping the order of optimal control synthesis, by starting from a stabilizing controller whose optimality is a byproduct of reverse-engineering the cost functional, motivates the study of constrained optimization problems with input and state constraints. Such formulations can draw inspiration from control strategies that handle constraints well, such as Model Predictive Control.

Tuning of discrete-time inverse optimal control: angular droop control

We study discrete-time angular droop control in power inverter networks and leverage second-order information for the tuning, thereby increasing the rate of convergence of the closed-loop angle trajectories towards the induced steady-state angle. In particular, we select the input penalty matrix, which serves as a tuning knob, to be a sufficiently small diagonal matrix so that the closed-loop system approximately implements Newton’s method. In this way, we ensure quadratic convergence, a feat that is not achievable with a constant, sufficiently large diagonal input penalty matrix.

This draws a link between second-order methods, known for their quadratic rate of convergence, and the tuning of inverse optimal stabilizing controllers for discrete-time integrator dynamics.
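A minimal numerical sketch of this tuning idea, under assumptions not stated in the text: scalar integrator dynamics θ⁺ = θ + u, a steady state induced as the minimizer of a convex function f, and a controller of the hypothetical form u = −(R + ∇²f(θ))⁻¹∇f(θ). As the input penalty R shrinks, the closed loop approaches Newton’s method and converges quadratically; a large constant R slows convergence to a linear rate.

```python
import numpy as np

# Hypothetical steady-state map (stand-in for the induced steady-state angle):
# theta* minimizes f(theta) = exp(theta) - 2*theta, i.e., theta* = log(2)
f_grad = lambda t: np.exp(t) - 2.0   # gradient of f
f_hess = lambda t: np.exp(t)         # Hessian of f (second-order information)
theta_star = np.log(2.0)

def run(R, steps=8, theta0=2.0):
    """Closed loop theta+ = theta + u with u = -(R + hess)^(-1) grad (assumed form)."""
    theta, errs = theta0, []
    for _ in range(steps):
        theta = theta - f_grad(theta) / (R + f_hess(theta))
        errs.append(abs(theta - theta_star))
    return errs

small_R = run(R=1e-8)  # input penalty ~ 0: approximately Newton's method
large_R = run(R=10.0)  # large constant input penalty: heavily damped steps
```

With a small penalty the tracking error collapses to numerical precision within a handful of steps, while the large constant penalty leaves a visible error after the same number of iterations, mirroring the quadratic-versus-linear convergence contrast described above.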