For the study of autonomous robots, MPC has been increasingly utilized for generating control signals. It is typically placed at a level below the decision-making layer; that is, the decision-making layer outputs a goal that MPC then executes toward. However, as research efforts increasingly focus on learning-based control methods, MPC has become intertwined with decision-making in several ways:
MPC-guided control policy search [12]: To accelerate the decision-making process, MPC serves as a teacher that trains a deep neural network via supervised learning. With the trained network, the robot directly obtains an end-to-end control policy from raw sensor data, skipping the state estimation stage. This method runs very efficiently online, yet it is hard to generalize, requires extensive offline training, and typically lacks safety guarantees. A minimal sketch of the training loop is given after Figure 1.
Figure 1: Diagram of the training phase, which alternates between running MPC to attempt the task and collect data under full state observation, and using this data to train a neural network policy that chooses actions based only on the vehicle's onboard sensors [12].
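As a rough illustration of this alternation, the following Python sketch pairs a placeholder MPC expert with a small network trained by supervised regression onto the expert's actions. The simulator interface, the state and observation dimensions, the toy dynamics, and the linear-feedback stand-in for the MPC solver are all assumptions for illustration, not the implementation in [12].

```python
import torch
import torch.nn as nn

def mpc_action(state):
    # Placeholder expert: in practice this would solve a finite-horizon
    # optimal control problem under full state observation; here a fixed
    # linear feedback law stands in for the MPC solution (assumption).
    K = torch.tensor([[1.0, 0.5, 0.1, 0.05]])
    return -(K @ state)

# End-to-end policy: raw sensor readings -> control action,
# with no intermediate state estimation stage.
policy = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 1),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

dataset = []  # aggregated (sensor observation, expert action) pairs
for episode in range(50):
    # Phase 1: run the MPC expert with full state access and log the
    # sensor observations it would have seen alongside its actions.
    state = torch.randn(4, 1)   # assumed full state from the simulator
    for t in range(100):
        obs = torch.randn(8)    # assumed raw onboard-sensor readout
        u = mpc_action(state)
        dataset.append((obs, u.flatten()))
        state = state + 0.01 * u  # toy dynamics stand-in

    # Phase 2: supervised learning, regressing the sensor-based policy
    # onto the MPC actions collected so far.
    obs_batch = torch.stack([o for o, _ in dataset])
    act_batch = torch.stack([a for _, a in dataset])
    loss = nn.functional.mse_loss(policy(obs_batch), act_batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note that the dataset is aggregated across episodes, so each training phase fits the policy to all MPC demonstrations collected so far; at deployment only `policy(obs)` is evaluated, which is what makes the method efficient online.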
High-level decision learning for MPC [13]: This approach, in turn, learns a high-level decision for different tasks and the corresponding constraints specifically for MPC. By directly outputting the constraints, it couples decision-making and MPC more interactively. Moreover, it indicates that MPC has become a unique tool that receives customized treatment from the decision-making level. A sketch of this constraint-passing interface follows Figure 2.
Figure 2: Diagram of the high-level decision learning approach for MPC [13].
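To illustrate the interface, the Python sketch below lets a high-level module choose constraint bounds that are then imposed inside a simple double-integrator MPC built with cvxpy. The hand-written decision rule is a stand-in for the learned module, and the observation format, dynamics, and bound names are assumptions for illustration rather than details of [13].

```python
import cvxpy as cp
import numpy as np

def high_level_decision(obs):
    # Stand-in for the learned high-level module: it maps the current
    # observation to constraint parameters for the MPC problem
    # (hypothetical rule; in [13] this decision is learned).
    if obs["obstacle_ahead"]:
        return {"x_max": 0.5, "u_max": 0.3}  # tighter bounds near obstacles
    return {"x_max": 2.0, "u_max": 1.0}

def solve_mpc(x0, params, horizon=10):
    # Double-integrator MPC whose state and input bounds are supplied by
    # the high-level decision layer instead of being hard-coded.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.0], [0.1]])
    x = cp.Variable((2, horizon + 1))
    u = cp.Variable((1, horizon))
    cost = 0
    constraints = [x[:, 0] == x0]
    for t in range(horizon):
        cost += cp.sum_squares(x[:, t + 1]) + 0.1 * cp.sum_squares(u[:, t])
        constraints += [
            x[:, t + 1] == A @ x[:, t] + B @ u[:, t],
            cp.abs(x[0, t + 1]) <= params["x_max"],  # decision-layer state bound
            cp.abs(u[:, t]) <= params["u_max"],      # decision-layer input bound
        ]
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return u.value[:, 0]

obs = {"obstacle_ahead": True}
u0 = solve_mpc(np.array([1.0, 0.0]), high_level_decision(obs))
print("first control input:", u0)
```

The design point the sketch tries to capture is that the decision layer no longer hands MPC only a goal: it reshapes the feasible set of the optimization itself, which is what makes the coupling between the two layers more interactive.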