Supervised by Prof. Luca Iocchi and Prof. Marc Hanheide
Master of Science in "Artificial Intelligence and Robotics"
Department of Computer, Control and Management Engineering "Antonio Ruberti"
The thesis was developed in part at L-CAS, University of Lincoln, UK.
In this work I present a human-in-the-loop learning framework for mobile robots that leverages human demonstrations to generate local policies, with the goal of extending the robot's current capabilities.
Robot systems in use today are typically equipped with a stack of modules that provide a capability for each task they are expected to perform; yet those modules are usually pre-designed or pre-trained and do not adapt to the variability of the environments that a robot deployed in the real world encounters. With this work I try to overcome this limitation by introducing a learning system that allows non-expert humans to teach the robot how to deal with situations in which it typically fails (failures) and with situations it is not yet able to handle (opportunities).
The learning approach. From left to right: 1) the robot encounters a situation in which it does not know how to proceed; 2) a human demonstrates the desired behavior; 3) at a later time, the robot encounters the same situation again; 4) the robot executes the behavior learned from the demonstrations.
The proposed framework lets humans interactively demonstrate new behaviors while the robot learns both how to replicate them and when to trigger them. It is composed of two learning layers: one that models the situation in which the recovery/opportunity should occur, and one that generates the behavior learned from the human demonstrations (a sketch of this two-layer structure is given below). The emphasis of this work is on the overall architecture design and on the implementation of the learning-by-demonstration (LbD) modules that generate the behavior; the situation-detection layer is therefore fully implemented for only one experiment and only outlined in the other contexts.
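To make the two-layer structure concrete, here is a minimal Python sketch of how the layers might be composed. All names (SituationModel, BehaviorModel, HumanInTheLoopFramework) are hypothetical and for illustration only; they are not the thesis code, and each placeholder method stands in for a learned model.

```python
# Hypothetical sketch of the two-layer framework described above.

class SituationModel:
    """Layer 1: decides *when* a learned behavior applies."""

    def __init__(self):
        self.examples = []  # state snapshots recorded during demonstrations

    def record(self, state):
        self.examples.append(state)

    def matches(self, state):
        # Placeholder: a real detector would classify the current state
        # (e.g. with a learned model rather than a distance check).
        return any(self._close(state, e) for e in self.examples)

    @staticmethod
    def _close(a, b, tol=0.1):
        return all(abs(x - y) < tol for x, y in zip(a, b))


class BehaviorModel:
    """Layer 2: generates *what* to do, learned from demonstrations (LbD)."""

    def __init__(self):
        self.demos = []  # demonstrated trajectories

    def add_demonstration(self, trajectory):
        self.demos.append(trajectory)

    def generate(self, state):
        # Placeholder: a real model would regress a trajectory/policy
        # conditioned on the current state instead of replaying a demo.
        return self.demos[-1] if self.demos else None


class HumanInTheLoopFramework:
    """Couples the two layers: detect the situation, then act."""

    def __init__(self):
        self.pairs = []  # (situation, behavior) pairs learned so far

    def teach(self, state, trajectory):
        sit, beh = SituationModel(), BehaviorModel()
        sit.record(state)
        beh.add_demonstration(trajectory)
        self.pairs.append((sit, beh))

    def step(self, state):
        for sit, beh in self.pairs:
            if sit.matches(state):
                return beh.generate(state)  # execute the learned behavior
        return None  # fall back to the robot's default module stack
```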
In particular, I extensively evaluated the framework in a real-world scenario in which the robot learns to recover from local navigation failures, using Gaussian Process models both to generate the recovery trajectories and to detect the failure situation (see the sketch below). Further scenarios were evaluated qualitatively: learning symbolic navigation recovery through Inverse Reinforcement Learning, and handling multiple, diverse situations within the same robotic implementation.
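The following is a hedged sketch of the general idea of using Gaussian Process regression to reproduce a demonstrated recovery trajectory; it is not the thesis implementation, and scikit-learn is an assumed stand-in for whatever GP library was actually used. The trajectory data here is synthetic, for illustration only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Demonstrated recovery trajectory: (x, y) waypoints indexed by a phase
# variable t in [0, 1]. Toy data standing in for a human demonstration.
t_demo = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
xy_demo = np.column_stack([np.sin(np.pi * t_demo.ravel()),  # toy x(t)
                           0.5 * t_demo.ravel()])           # toy y(t)

# One GP maps the phase variable to the 2-D pose; the WhiteKernel absorbs
# demonstration noise so repeated demos need not agree exactly.
gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=0.2) + WhiteKernel(noise_level=1e-3),
    normalize_y=True,
)
gp.fit(t_demo, xy_demo)

# At execution time, query a denser phase grid to obtain the recovery path.
# The predictive std gives a confidence band, which can also flag states the
# demonstrations never covered (the failure/situation-detection side).
t_query = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
xy_path, xy_std = gp.predict(t_query, return_std=True)
print(xy_path.shape, xy_std.max())
```

The appeal of a GP here is that a single model yields both a smooth mean trajectory to execute and an uncertainty estimate that grows away from the demonstrated data, which is what makes it usable for detection as well as generation.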
Navigation failures recovery learning
Opportunistic behaviors learning