Neural Network Controller that Satisfies Desired Disk Margin
Best guaranteed stability, but episodic length and returns stagnates and slow to improve. Future work can look to obtain better coherence between projection and gradient update.
Somewhat compromised or reduced stability guarantees, but leads to improved episodic length and returns, as a tradeoff between stability constraints and PPO gradient update.
(80M Steps)
(50M steps)
Poster