Time Varying LQR and Trajectory Optimization Swing-Up of a Variable Length Acrobot
In Spring quarter 2021 I took a course in linear multivariate control systems. For the final project, we chose to apply non-linear control techniques to a highly non-linear, underactuated, and chaotic system: a variable length acrobot. We combined trajectory optimization with time varying LQR to perform an optimal swing-up and stabilization maneuver on this system.
Variable Length Acrobot:
The variable length acrobot extends the acrobot, a classic problem in non-linear underactuated control. The canonical acrobot is a 2-link pendulum that is actuated only at the elbow joint. To perform a swing-up maneuver, the system needs to "pump" energy, using the interplay of potential and kinetic energy to gradually build up its total energy until it can stand upright. The variable length acrobot is the same system with telescoping links, which change the system's inertial distribution.
Trajectory optimization solves for a dynamically feasible trajectory that is locally optimal and open loop in nature. Using trajectory optimization alone, a variety of swing-up maneuvers can be found, depending on the objective and the imposed constraints. However, because the solution is open loop, nothing holds the pendulum upright once the final conditions are reached, and it falls back down immediately afterward.
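As a rough illustration of the idea, the sketch below sets up a direct transcription problem with trapezoidal collocation and solves it with SciPy's SLSQP solver. It is not the project's implementation: a double integrator stands in for the acrobot dynamics, and the horizon, knot count, boundary states, and control-effort objective are all assumed values chosen to keep the example small.

```python
import numpy as np
from scipy.optimize import minimize

# All values below are assumptions for illustration only.
N = 21                          # number of knot points
T = 1.0                         # fixed time horizon in seconds
h = T / (N - 1)                 # time step between knots
x_start = np.array([0.0, 0.0])  # initial [position, velocity]
x_goal = np.array([1.0, 0.0])   # final [position, velocity]


def dynamics(x, u):
    """Double integrator stand-in for the real dynamics: qddot = u."""
    return np.array([x[1], u])


def unpack(z):
    """Split the decision vector into states (N, 2) and controls (N,)."""
    return z[: 2 * N].reshape(N, 2), z[2 * N:]


def cost(z):
    """Trapezoidal approximation of the integral of u^2 (control effort)."""
    _, u = unpack(z)
    return 0.5 * h * np.sum(u[:-1] ** 2 + u[1:] ** 2)


def defects(z):
    """Collocation defects plus boundary conditions; all must equal zero."""
    x, u = unpack(z)
    f = np.array([dynamics(x[k], u[k]) for k in range(N)])
    collocation = x[1:] - x[:-1] - 0.5 * h * (f[1:] + f[:-1])
    return np.concatenate([collocation.ravel(), x[0] - x_start, x[-1] - x_goal])


def solve_trajectory():
    """Solve the transcribed problem from a straight-line initial guess."""
    x_guess = np.linspace(x_start, x_goal, N)
    z0 = np.concatenate([x_guess.ravel(), np.zeros(N)])
    result = minimize(cost, z0, method="SLSQP",
                      constraints={"type": "eq", "fun": defects},
                      options={"maxiter": 500})
    x_opt, u_opt = unpack(result.x)
    return result, x_opt, u_opt


result, x_opt, u_opt = solve_trajectory()
```

The output `u_opt` is only a schedule of control inputs over time. Nothing in it reacts to the measured state, which is why the solution is open loop.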
Optimal control in the form of time varying LQR can be used to "close the loop" around the control trajectory, driving the system's dynamics to follow the desired trajectory and hold the upright position.
Normal LQR is a form of optimal control in which the system's dynamics are linearized about a fixed point, which is the upright position in the case of our pendulum. Applying regular LQR to our system does not work on its own, because the system is not at that fixed point while performing the initial swing-up maneuver. Simply commanding the system to follow the open loop optimal swing-up trajectory and then applying LQR once it is upright is not a fix either: if the system deviates from the optimal trajectory at any point, nothing corrects the error, and the system becomes unrecoverable and behaves chaotically.
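To make the fixed-point linearization concrete, here is a minimal sketch of regular LQR for a simple pendulum (not the acrobot) linearized about the upright position. The physical parameters and the weights Q and R are assumed values, and the gain comes from SciPy's continuous-time algebraic Riccati solver.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed pendulum parameters: gravity, length, mass.
g, l, m = 9.81, 1.0, 1.0

# State is the deviation from upright: x = [theta - pi, theta_dot].
# Linearizing theta_ddot = -(g/l) sin(theta) + u / (m l^2) about theta = pi:
A = np.array([[0.0, 1.0],
              [g / l, 0.0]])
B = np.array([[0.0],
              [1.0 / (m * l**2)]])

# Assumed state and input weights.
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])

# Solve the algebraic Riccati equation and form the gain for u = -K x.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# Poles of the closed loop A - B K; all real parts should be negative.
closed_loop_poles = np.linalg.eigvals(A - B @ K)
```

The gain K is constant and is only valid near the upright fixed point, where the linear model is a good approximation of the true dynamics.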
Time varying LQR is an extension of LQR in which a moving coordinate system is chosen so that we can linearize the dynamics along the trajectory and apply LQR gains that guide the system's dynamics to follow this locally optimal swing-up trajectory.
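One common way to compute such gains is to integrate the Riccati differential equation backward in time along the nominal trajectory. The sketch below does this for a simple, fully actuated pendulum rather than the acrobot. The nominal trajectory `theta_nominal`, the weights Q, R, and Qf, and the physical parameters are all hypothetical placeholders; in practice the nominal states and inputs would come from the trajectory optimizer.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumed pendulum parameters and horizon.
g, l, m = 9.81, 1.0, 1.0
T = 2.0

# Assumed running weights Q, R and terminal weight Qf.
Q = np.diag([10.0, 1.0])
R = np.array([[1.0]])
Qf = np.diag([100.0, 10.0])
B = np.array([[0.0], [1.0 / (m * l**2)]])
R_inv = np.linalg.inv(R)


def theta_nominal(t):
    """Hypothetical nominal angle: hanging (0) at t = 0, upright (pi) at t = T."""
    return np.pi * t / T


def A_of_t(t):
    """Jacobian of the pendulum dynamics evaluated on the nominal trajectory."""
    return np.array([[0.0, 1.0],
                     [-(g / l) * np.cos(theta_nominal(t)), 0.0]])


def riccati_rhs(t, p_flat):
    """Riccati ODE: -P_dot = A'P + PA - P B R^-1 B' P + Q."""
    P = p_flat.reshape(2, 2)
    A = A_of_t(t)
    P_dot = -(A.T @ P + P @ A - P @ B @ R_inv @ B.T @ P + Q)
    return P_dot.ravel()


# Integrate backward from the terminal condition P(T) = Qf down to t = 0.
riccati = solve_ivp(riccati_rhs, (T, 0.0), Qf.ravel(),
                    dense_output=True, rtol=1e-8, atol=1e-10)


def gain(t):
    """Time varying feedback gain K(t) = R^-1 B' P(t)."""
    P = riccati.sol(t).reshape(2, 2)
    return R_inv @ B.T @ P


def control(t, x, x_nominal, u_nominal):
    """Feedforward input plus feedback on the deviation from the trajectory."""
    return u_nominal - gain(t) @ (x - x_nominal)
```

The feedback acts on the deviation `x - x_nominal`, which is the moving coordinate system mentioned above: the controller regulates the error to zero along the trajectory, so the closed loop tracks the swing-up and then holds the final state.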