seminar series
Upcoming Talk
Optimal Control or Optimal Learning?… How About Both!
Real-time learning strategies whose gains become optimal at infinite time under persistence-of-excitation (PE) conditions are common; they are a particular variety of RL, also called adaptive dynamic programming or "adaptive-optimal" control. What is truly of interest to the practitioner, in contrast, are OPTIMAL-ADAPTIVE controllers (note the commuted order of the two adjectives), which are optimal over the entire infinite horizon. Conceived by the speaker in 1997, this idea failed to spread widely because, in general, its controllers, while given explicitly, have complicated expressions. The exception, where optimal-adaptive controllers become elegant, is the class of so-called "driftless systems," of the form dx/dt = g(x)u (i.e., with f = 0), where dim(x) > dim(u).

Nonholonomic mobile robots are archetypal driftless systems. For such vehicles, we first present globally uniformly Lagrange asymptotically stabilizing (GULAS) feedback laws, which do not contradict Brockett's condition. With their strict control Lyapunov functions (CLFs), adaptive LgV-type feedback laws are then designed for vehicles with completely unknown wheel traction coefficients. These controllers optimize cost functionals that penalize not only the longitudinal and angular velocity inputs and the unicycle's three configuration states, but also the parameter estimation errors, over the entire infinite time horizon (not only in the asymptotic limit).

As a bonus, control designs that complete parking in user-desired time will be shown: (1) a time-varying feedback, with gains that are singular at terminal time but keep the controls bounded, and (2) a static homogeneous feedback, which is nonsmooth at the target values of position and heading.
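As a quick illustration of the driftless class mentioned above (a sketch for readers, not material from the talk), the standard unicycle kinematic model fits dx/dt = g(x)u exactly, with state x = (px, py, theta), inputs u = (v, omega), and no drift term f. The function and step-size names below are illustrative choices, not notation from the speaker's work:

```python
import numpy as np

def g(state):
    """Input matrix g(x) of the driftless unicycle dx/dt = g(x)u (f = 0).

    State x = (px, py, theta): planar position and heading.
    Input u = (v, omega): longitudinal and angular velocities.
    """
    _, _, theta = state
    return np.array([[np.cos(theta), 0.0],
                     [np.sin(theta), 0.0],
                     [0.0,           1.0]])

def step(state, u, dt=0.01):
    """One explicit-Euler step of dx/dt = g(x)u."""
    return state + dt * g(state) @ np.asarray(u)

# dim(x) = 3 > dim(u) = 2, as the driftless class requires.
x = np.zeros(3)
for _ in range(100):            # drive straight for 1 s at v = 1, omega = 0
    x = step(x, (1.0, 0.0))
print(np.round(x, 3))           # -> [1. 0. 0.]
```

Note that with only two inputs steering three states, no smooth time-invariant feedback can asymptotically stabilize the origin (Brockett's condition), which is why the abstract highlights time-varying and nonsmooth designs.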