CONLab Research

Robust Learning-Based Control via Bootstrapped Multiplicative Noise

Keywords: Optimal control, robust control, adaptive control, reinforcement learning, system identification, stochastic parameters

Summary

We propose a robust adaptive control algorithm that explicitly accounts for inherent non-asymptotic uncertainties arising from models estimated with finite, noisy data. The algorithm has three components: (1) a least-squares nominal model estimator; (2) a bootstrap resampling method that quantifies non-asymptotic variance of the nominal model estimate; and (3) a non-conventional robust control design method using an optimal linear quadratic regulator (LQR) with multiplicative noise. A key advantage of the proposed approach is that the system identification and robust control design procedures both use stochastic uncertainty representations, so that the actual inherent statistical estimation uncertainty directly aligns with the uncertainty the robust controller is being designed against. Numerical experiments show significant improvements over the certainty-equivalent controller on both expected regret and measures of regret risk.
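To make components (1) and (2) concrete, here is a minimal sketch assuming transition data (x_t, u_t, x_{t+1}) collected from the system; the function names are illustrative rather than taken from the paper's code, and returning elementwise variances is a simplification of how the bootstrap statistics feed the multiplicative-noise LQR design in component (3).

```python
import numpy as np

def fit_model(Xt, Ut, Xnext):
    """Least-squares fit of x_{t+1} = A x_t + B u_t + w_t from transition data."""
    Z = np.hstack([Xt, Ut])                          # regressor rows [x_t, u_t]
    Theta, *_ = np.linalg.lstsq(Z, Xnext, rcond=None)
    n = Xt.shape[1]
    return Theta[:n].T, Theta[n:].T                  # A_hat, B_hat

def bootstrap_model_uncertainty(Xt, Ut, Xnext, n_boot=200, seed=0):
    """Resample transitions with replacement and refit the model; the spread of
    the refits quantifies non-asymptotic estimation uncertainty in (A, B)."""
    rng = np.random.default_rng(seed)
    T = Xt.shape[0]
    A_hat, B_hat = fit_model(Xt, Ut, Xnext)          # nominal model estimate
    dA = np.empty((n_boot,) + A_hat.shape)
    dB = np.empty((n_boot,) + B_hat.shape)
    for i in range(n_boot):
        idx = rng.integers(0, T, size=T)             # bootstrap resample of rows
        A_b, B_b = fit_model(Xt[idx], Ut[idx], Xnext[idx])
        dA[i], dB[i] = A_b - A_hat, B_b - B_hat
    return A_hat, B_hat, dA.var(axis=0), dB.var(axis=0)
```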

See the L4DC poster and slides.

Read on arXiv.

Control Design for Risk-Based Signal Temporal Logic Specifications

Keywords: Signal temporal logic, stochastic systems, constrained control, optimization

Summary

We present a framework for risk semantics on Signal Temporal Logic (STL) specifications for discrete-time linear dynamical systems with additive stochastic noise. Under our recursive risk semantics, risk constraints on STL formulas can be expressed in terms of risk constraints on atomic predicates, which can in turn be tightened into deterministic STL constraints on a related deterministic system. For affine predicates and the Distributionally Robust Value-at-Risk (DR-VaR) measure, we show how the STL risk constraint is reformulated into a deterministic STL constraint. We demonstrate the framework using a Model Predictive Control (MPC) design.
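To give a concrete sense of the tightening step for a single affine predicate a'x + b >= 0, the sketch below uses the standard moment-based DR-VaR reformulation (worst case over all distributions with a given mean and covariance); the paper's construction applies such tightenings recursively over full STL formulas, and the names here are illustrative.

```python
import numpy as np

def dr_var_margin(a, Sigma, eps):
    """Margin m such that  a @ x_mean + b >= m  certifies the risk constraint
    DR-VaR_eps( -(a @ x + b) ) <= 0 over all distributions of x sharing the
    given mean and covariance Sigma (moment-based ambiguity set)."""
    return np.sqrt((1.0 - eps) / eps) * np.sqrt(a @ Sigma @ a)

# The risk constraint on the stochastic system then becomes the deterministic
# STL constraint  a @ x_mean_t + b - dr_var_margin(a, Sigma_t, eps) >= 0  on the
# related deterministic (mean) system, which an MPC optimizer can enforce directly.
```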

Get paper on IEEE Xplore or arXiv.

Policy Iteration for Linear Quadratic Games with Stochastic Parameters

Benjamin Gravell, Karthik Ganapathy, Tyler H. Summers, IEEE L-CSS/CDC 2020 (accepted)

Keywords: Optimal control, reinforcement learning, dynamic programming, robust control

Summary

Robustness is a key challenge in the integration of learning and control. In machine learning and robotics, two common approaches to promote robustness are adversarial training and domain randomization. Both of these approaches have analogs in control theory: adversarial training relates to H-infinity control and dynamic game theory, while domain randomization relates to theory for systems with stochastic model parameters. We propose a stochastic dynamic game framework that integrates both of these approaches to modeling uncertainty and promoting robustness. We describe policy iteration algorithms in both model-based and model-free settings to compute equilibrium strategies and value functions. We present numerical experiments that illustrate their effectiveness and the value of combining uncertainty representations in our integrated framework. We also provide an open-source implementation of the algorithms to facilitate their wider use.
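As a flavor of the evaluate/improve structure, here is a minimal model-based policy iteration sketch for the single-player, deterministic-parameter special case (plain LQR). In the paper's setting, the evaluation step instead solves a generalized Lyapunov equation with multiplicative-noise terms and the improvement step updates both the minimizing and maximizing players' gains; the names below are illustrative and not taken from the Code Ocean implementation.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def policy_iteration_lqr(A, B, Q, R, K0, n_iter=50):
    """Model-based policy iteration for discrete-time LQR with policy u = K x.
    The initial gain K0 must be stabilizing."""
    K = K0
    for _ in range(n_iter):
        Acl = A + B @ K
        # Policy evaluation: P solves  P = Acl' P Acl + Q + K' R K
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        # Policy improvement: greedy gain with respect to the evaluated cost
        K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P
```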

Read the paper on IEEE Xplore.

Run the reproducible code on Code Ocean.

Robust Linear Quadratic Regulator: Exact Tractable Reformulation

Keywords: Optimal control, robust control, data-driven control, dynamic games, Riccati equations

Summary

We give novel characterizations of the uncertainty sets that arise in the robust linear quadratic regulator problem, develop Riccati equation-based solutions to optimal robust LQR problems over these sets, and give theoretical and empirical evidence that the resultant robust control law is a natural and computationally attractive alternative to the certainty-equivalent control law when the pair (A, B) is identified via l2-regularized linear least squares.
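The robust Riccati-based design itself is developed in the paper; as context, the sketch below only sets up the pipeline it plugs into, namely l2-regularized (ridge) least-squares identification of (A, B) followed by the certainty-equivalent LQR gain that the robust law is compared against. Function names and the regularization weight are illustrative.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def identify_ridge(Xt, Ut, Xnext, lam=1e-3):
    """(A, B) estimate via l2-regularized least squares on transitions
    x_{t+1} = A x_t + B u_t + w_t."""
    Z = np.hstack([Xt, Ut])                          # regressor rows [x_t, u_t]
    Theta = np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T @ Xnext)
    n = Xt.shape[1]
    return Theta[:n].T, Theta[n:].T                  # A_hat, B_hat

def certainty_equivalent_lqr_gain(A, B, Q, R):
    """Certainty-equivalent LQR gain for u = K x from the nominal Riccati solution."""
    P = solve_discrete_are(A, B, Q, R)
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
```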

Read the short version for CDC or the extended and updated version.

Learning Robust Control for LQR Systems with Multiplicative Noise via Policy Gradient

Keywords: Optimal control, robust control, reinforcement learning, policy gradient, nonconvex optimization, gradient domination, Polyak-Lojasiewicz inequality, concentration bounds

Summary

We show that the linear quadratic regulator with multiplicative noise (LQRm) objective is gradient dominated, so policy gradient methods converge globally to the optimal control policy, with polynomial dependence on problem parameters. The learned policy accounts for inherent parametric uncertainty in the system dynamics and thus improves stability robustness. Results are provided in both the model-known setting and the model-unknown setting, where samples of system trajectories are used to estimate policy gradients.
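A minimal sketch of the model-unknown case, assuming we can simulate finite-horizon rollouts of the multiplicative-noise dynamics: the two-point zeroth-order estimator below is one standard way to form policy gradient estimates from sampled costs, and the names, horizon truncation, and noise model are our illustrative choices rather than the paper's exact algorithm.

```python
import numpy as np

def rollout_cost(K, A, B, A_dirs, B_dirs, Q, R, x0, horizon, rng):
    """Finite-horizon cost of u = K x under multiplicative parameter noise:
    A_t = A + sum_i d_ti * A_dirs[i],  B_t = B + sum_j e_tj * B_dirs[j],
    where d_ti, e_tj are zero-mean, unit-variance scalars."""
    x, cost = x0.copy(), 0.0
    for _ in range(horizon):
        u = K @ x
        cost += x @ Q @ x + u @ R @ u
        At = A + sum(rng.standard_normal() * Ai for Ai in A_dirs)
        Bt = B + sum(rng.standard_normal() * Bj for Bj in B_dirs)
        x = At @ x + Bt @ u
    return cost

def two_point_policy_gradient(K, cost_fn, radius=0.05, n_samples=100, seed=0):
    """Zeroth-order (two-point) estimate of the gradient of the policy cost at K."""
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(K)
    for _ in range(n_samples):
        U = rng.standard_normal(K.shape)
        U /= np.linalg.norm(U)                       # random direction, unit norm
        delta = cost_fn(K + radius * U) - cost_fn(K - radius * U)
        grad += (delta / (2.0 * radius)) * U
    return (K.size / n_samples) * grad
```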

Read the paper on arXiv.
