References

Initial research papers on non-stochastic control

1. Online control with adversarial disturbances

N Agarwal, B Bullins, E Hazan, S Kakade, K Singh

International Conference on Machine Learning, 111-119

2. Logarithmic regret for online control

N Agarwal, E Hazan, K Singh

Advances in Neural Information Processing Systems 32 (NeurIPS 2019)

3. The nonstochastic control problem

E Hazan, S Kakade, K Singh

Algorithmic Learning Theory, 408-421

4. Adaptive regret for control of time-varying dynamics

P Gradu, E Hazan, E Minasyan

5. Black-box control for linear dynamical systems

X Chen, E Hazan

6. Improper learning for non-stochastic control

M Simchowitz, K Singh, E Hazan

Conference on Learning Theory, 3320-3436


7. Logarithmic regret for adversarial online control

D Foster, M Simchowitz

International Conference on Machine Learning, 3211-3221


8. Robust Online Control with Model Misspecification

X Chen, U Ghai, E Hazan, A Megretski


Extensions and applications

1. Non-stochastic control with bandit feedback

P Gradu, J Hallman, E Hazan

Advances in Neural Information Processing Systems 33

2. Generating Adversarial Disturbances for Controller Verification

U Ghai, D Snyder, A Majumdar, E Hazan

L4DC 2021

3. A Regret Minimization Approach to Iterative Learning Control

N Agarwal, E Hazan, A Majumdar, K Singh

Learning in non-stochastic dynamical systems

1. Learning linear dynamical systems via spectral filtering

E Hazan, K Singh, C Zhang

Advances in Neural Information Processing Systems 30 (NIPS 2017)

2. Spectral filtering for general linear dynamical systems

E Hazan, H Lee, K Singh, C Zhang, Y Zhang

Neural Information Processing Systems (NIPS), 2018

3. No-regret prediction in marginally stable systems

U Ghai, H Lee, K Singh, C Zhang, Y Zhang

Conference on Learning Theory, 1714-1757


Research papers in online stochastic control

1. Regret Bounds for the Adaptive Control of Linear Quadratic Systems

Y Abbasi-Yadkori, C Szepesvári

COLT 2011, 1-26

2. Model-Free Linear Quadratic Control via Reduction to Expert Prediction

Y Abbasi-Yadkori, N Lazic, C Szepesvári

The 22nd International Conference on Artificial Intelligence and Statistics

3. Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret

A Cassel, T Koren

4. Logarithmic regret for learning linear quadratic regulators efficiently

A Cassel, A Cohen, T Koren

International Conference on Machine Learning, 1328-1337

5. Online linear quadratic control

A Cohen, A Hasidim, T Koren, N Lazic, Y Mansour, K Talwar

International Conference on Machine Learning, 1029-1038

6. Learning without mixing: Towards a sharp analysis of linear system identification

M Simchowitz, H Mania, S Tu, MI Jordan, B Recht

Conference on Learning Theory, 439-473

7. Learning linear dynamical systems with semi-parametric least squares

M Simchowitz, R Boczar, B Recht

Conference on Learning Theory, 2714-2802


8. Logarithmic regret bound in partially observable linear dynamical systems

S Lale, K Azizzadenesheli, B Hassibi, A Anandkumar


9. Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting

S Lale, K Azizzadenesheli, B Hassibi, A Anandkumar


10. Learning nonlinear dynamical systems from a single trajectory

D Foster, T Sarkar, A Rakhlin

Learning for Dynamics and Control, 851-861


11. Global convergence of policy gradient methods for the linear quadratic regulator

M Fazel, R Ge, S Kakade, M Mesbahi

International Conference on Machine Learning, 1467-1476


12. Online convex programming and regularization in adaptive control

M Raginsky, A Rakhlin, S Yüksel

49th IEEE Conference on Decision and Control (CDC), 1957-1962

13. Geometric Exploration for Online Control

O Plevrakis, E Hazan

Advances in Neural Information Processing Systems 33 (NeurIPS 2020)


Competitive ratio for online control

1. Competitive Control with Delayed Imperfect Information

C Yu, G Shi, SJ Chung, Y Yue, A Wierman

2. Online Optimization with Memory and Competitive Control

G Shi, Y Lin, SJ Chung, Y Yue, A Wierman

Proceedings of NeurIPS

Regret minimization for Markov Decision Processes

1. Online Markov decision processes

E Even-Dar, SM Kakade, Y Mansour

Mathematics of Operations Research 34 (3), 726-736

2. Experts in a Markov decision process

E Even-Dar, SM Kakade, Y Mansour

Advances in Neural Information Processing Systems, 401-408

3. Markov decision processes with arbitrary reward processes

JY Yu, S Mannor, N Shimkin

Mathematics of Operations Research 34 (3), 737-757

4. Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions

Y Abbasi-Yadkori, P Bartlett, V Kanade, Y Seldin, C Szepesvári

Neural Information Processing Systems

5. Better rates for any adversarial deterministic MDP

O Dekel, E Hazan

International Conference on Machine Learning, 675-683


Textbooks and other relevant readings (references limited to online availability)

1. Introduction to Online Convex Optimization, by Elad Hazan

2. Underactuated Robotics, by Russ Tedrake

3. Reinforcement Learning: An Introduction, by Richard S. Sutton and Andrew G. Barto

4. Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, Sham M. Kakade, Wen Sun

5. Bandit Algorithms, by Tor Lattimore and Csaba Szepesvári

6. Control Systems & Reinforcement Learning, by Sean Meyn