Introduction to Online Nonstochastic Control
Graduate text in learning to control
Abstract:
This text presents an introduction to an emerging paradigm in control of dynamical systems and differentiable reinforcement learning called online nonstochastic control. The new approach applies techniques from online convex optimization and convex relaxations to obtain new methods with provable guarantees for classical settings in optimal and robust control.
The primary distinction between online nonstochastic control and other frameworks is the objective. In optimal control, robust control, and other control methodologies that assume stochastic noise, the goal is to perform comparably to an offline optimal strategy. In online nonstochastic control, both the cost functions and the perturbations from the assumed dynamical model are chosen by an adversary. Thus, the optimal policy is not defined a priori.
Rather, the target is to attain low regret against the best policy in hindsight from a benchmark class of policies.
This objective suggests the use of the decision making framework of online convex optimization as an algorithmic methodology. The resulting methods are based on iterative mathematical optimization algorithms, and are accompanied by finite-time regret and computational complexity guarantees.
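As a rough illustration of the regret objective described above, the quantity can be written as follows. The notation is an assumption made here for concreteness and is not fixed by the abstract: x_t is the state, u_t the control, w_t the adversarial perturbation, c_t the adversarially chosen cost, f the assumed dynamical model, Π the benchmark policy class, and T the horizon.

```latex
% Assumed notation (illustrative only): dynamics with an adversarial perturbation w_t
\[
  x_{t+1} = f(x_t, u_t) + w_t .
\]
% Regret of a controller \mathcal{A} against the best policy in hindsight from \Pi.
% (x_t^{\pi}, u_t^{\pi}) denotes the trajectory that policy \pi would have produced
% under the same perturbations w_1, \dots, w_T and the same costs c_1, \dots, c_T.
\[
  \mathrm{regret}_T(\mathcal{A})
  = \sum_{t=1}^{T} c_t(x_t, u_t)
  - \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr).
\]
```

Under this reading, "low regret" means that the regret grows sublinearly in T, so the average cost of the controller approaches that of the best benchmark policy in hindsight.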
Table of Contents
Background in Control and RL
Introduction
What is This Book About?
The Origins of Control
Formalization and Examples of a Control Problem
Simple Control Algorithms
Classical Theory: Optimal and Robust Control
The Need for a New Theory
Dynamical Systems
Examples of Dynamical Systems
Solution Concepts for Dynamical Systems
Intractability of Equilibrium, Stabilizability and Controllability
Markov Decision Processes
Reinforcement Learning
Markov Decision Processes
The Bellman Equation
Value Iteration
Linear Dynamical Systems
General Dynamics as LTV Systems
Stabilizability of Linear Systems
Controllability of LDS
Quantitative Definitions
Optimal Control of Linear Dynamical Systems
The Linear-Quadratic Regulator
Optimal Solution of the LQR
Infinite Horizon LQR
H∞ Control
Basics of Nonstochastic Control
Policy Classes for Dynamical Systems
Relating the Power of Policy Classes
A Quantitative Comparison of Policy Classes for LTI Systems
Policy Classes for Partially Observed LDS
Online Nonstochastic Control
From Optimal and Robust to Online Control
The Online Nonstochastic Control Problem
The Gradient Perturbation Controller
Online Nonstochastic Control with Partial Observation
Disturbance Response Controllers
The Gradient Response Controller
Online Nonstochastic System Identification
Nonstochastic System Identification
Learning and Filtering
Appendix
A Concepts from Online Convex Optimization
Online Gradient Descent