References
Initial research papers on non-stochastic control
1. Online control with adversarial disturbances
N Agarwal, B Bullins, E Hazan, S Kakade, K Singh
International Conference on Machine Learning, 111-119
2. Logarithmic regret for online control
N Agarwal, E Hazan, K Singh
Advances in Neural Information Processing Systems 32 (NeurIPS 2019)
3. The nonstochastic control problem
E Hazan, S Kakade, K Singh
Algorithmic Learning Theory, 408-421
4. Adaptive regret for control of time-varying dynamics
P Gradu, E Hazan, E Minasyan
5. Black-box control for linear dynamical systems
X Chen, E Hazan
6. Improper learning for non-stochastic control
M Simchowitz, K Singh, E Hazan
Conference on Learning Theory, 3320-3436
7. Logarithmic regret for adversarial online control
D Foster, M Simchowitz
International Conference on Machine Learning, 3211-3221
8. Robust Online Control with Model Misspecification
X Chen, U Ghai, E Hazan, A Megretski
Extensions and applications
1. Non-stochastic control with bandit feedback
P Gradu, J Hallman, E Hazan
Advances in Neural Information Processing Systems 33
2. Generating Adversarial Disturbances for Controller Verification
U Ghai, D Snyder, A Majumdar, E Hazan
L4DC 2021
3. A Regret Minimization Approach to Iterative Learning Control
N Agarwal, E Hazan, A Majumdar, K Singh
Learning in non-stochastic dynamical systems
Learning linear dynamical systems via spectral filtering
E Hazan, K Singh, C Zhang
Advances in Neural Information Processing Systems 30 (NIPS 2017)
Spectral filtering for general linear dynamical systems
E Hazan, H Lee, K Singh, C Zhang, Y Zhang
Neural Information Processing Systems (NIPS), 2018
No-regret prediction in marginally stable systems
U Ghai, H Lee, K Singh, C Zhang, Y Zhang
Conference on Learning Theory, 1714-1757
Research papers in online stochastic control
1. Regret Bounds for the Adaptive Control of Linear Quadratic Systems
Y Abbasi-Yadkori, C Szepesvári
COLT 2011, 1-26
2. Model-Free Linear Quadratic Control via Reduction to Expert Prediction
Y Abbasi-Yadkori, N Lazic, C Szepesvári
The 22nd International Conference on Artificial Intelligence and Statistics
3. Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret
A Cassel, T Koren
4. Logarithmic regret for learning linear quadratic regulators efficiently
A Cassel, A Cohen, T Koren
International Conference on Machine Learning, 1328-1337
5. Online linear quadratic control
A Cohen, A Hasidim, T Koren, N Lazic, Y Mansour, K Talwar
International Conference on Machine Learning, 1029-1038
6. Learning without mixing: Towards a sharp analysis of linear system identification
M Simchowitz, H Mania, S Tu, MI Jordan, B Recht
Conference on Learning Theory, 439-473
7. Learning linear dynamical systems with semi-parametric least squares
M Simchowitz, R Boczar, B Recht
Conference on Learning Theory, 2714-2802
8. Logarithmic regret bound in partially observable linear dynamical systems
S Lale, K Azizzadenesheli, B Hassibi, A Anandkumar
9. Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting
S Lale, K Azizzadenesheli, B Hassibi, A Anandkumar
10. Learning nonlinear dynamical systems from a single trajectory
D Foster, T Sarkar, A Rakhlin
Learning for Dynamics and Control, 851-861
11. Global convergence of policy gradient methods for the linear quadratic regulator
M Fazel, R Ge, S Kakade, M Mesbahi
International Conference on Machine Learning, 1467-1476
12. Online convex programming and regularization in adaptive control
M Raginsky, A Rakhlin, S Yüksel
49th IEEE Conference on Decision and Control (CDC), 1957-1962
13. Geometric Exploration for Online Control
O Plevrakis, E Hazan
Advances in Neural Information Processing Systems 33 (NeurIPS 2020)
Competitive ratio for online control
Competitive Control with Delayed Imperfect Information
C Yu, G Shi, SJ Chung, Y Yue, A Wierman
Online Optimization with Memory and Competitive Control
G Shi, Y Lin, SJ Chung, Y Yue, A Wierman
Proceedings of NeurIPS
Regret minimization for Markov Decision Processes
Online Markov decision processes
E Even-Dar, SM Kakade, Y Mansour
Mathematics of Operations Research 34 (3), 726-736
Experts in a Markov decision process
E Even-Dar, SM Kakade, Y Mansour
Advances in Neural Information Processing Systems, 401-408
Markov decision processes with arbitrary reward processes
JY Yu, S Mannor, N Shimkin
Mathematics of Operations Research 34 (3), 737-757
Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions
Y Abbasi-Yadkori, P Bartlett, V Kanade, Y Seldin, C Szepesvári
Neural Information Processing Systems
Better rates for any adversarial deterministic MDP
O Dekel, E Hazan
International Conference on Machine Learning, 675-683
Textbooks and other relevant readings (limited to those available online)
1. Introduction to Online Convex Optimization, by Elad Hazan
2. Underactuated Robotics, by Russ Tedrake
3. Reinforcement Learning: An Introduction, by Richard S. Sutton, Andrew G. Barto
4. Reinforcement Learning: Theory and Algorithms, by Alekh Agarwal, Nan Jiang, Sham M. Kakade, Wen Sun
5. Bandit Algorithms, by Tor Lattimore and Csaba Szepesvári
6. Control Systems & Reinforcement Learning, by Sean Meyn