The ultimate goal of my research is to develop an intelligent companion robot that coexists and interacts with human beings in our daily lives to improve quality of life. Compared with the recent advances in computer vision and computing technologies, intelligent robot technology has so far delivered few tangible personal benefits in the form of physical and mental assistance. I believe this is partially because most existing robotics research focuses on control, motion planning, and rule-based procedure design. For intelligent companion robots to coexist effectively with human beings and provide useful and meaningful services, they must not only be functional enough to carry out assigned tasks, but also guarantee critical safety constraints and interpret possibly noisy and vague human signals. Furthermore, they should be able to evolve and adapt to novel environments without excessive intervention from engineers.
My research includes: (1) robust and reliable machine learning methods that handle noisy, corrupted, or biased training data and guarantee the feasibility of the prediction results, (2) sample-efficient reinforcement learning that lets the intelligent companion robot adapt to its users and environments, and (3) effective human-robot interaction via causal inference.
In this era of machine learning, a tremendous amount of training data is often required to properly train a model, and those data are often assumed to be clean and uncorrupted. However, this assumption does not always hold, as we often rely on crowd-sourcing to collect large datasets. To relax this clean-data assumption, I have focused on robust learning since my Ph.D. studies. In this regard, I first presented a Bayesian approach to handle corruption and inconsistencies in data [1], which was recently extended to use Bayesian deep learning to achieve scalability [2]. I plan to extend this work to different applications, including (1) object detection problems and (2) human behavior/intention inference problems.
I am also interested in reliability in machine learning. In particular, I focus on assessing the uncertainty in the predictions or inferences of a learned model and on guaranteeing the feasibility of those predictions. In this regard, I presented an uncertainty-aware imitation learning method for autonomous driving [3] and a deep latent space model for guaranteeing the safety of generated humanoid robot motion [4]. I believe both topics are essential for machine learning models to be widely used in our daily lives. There are multiple interesting future directions, including (1) how to use the quantified uncertainty information to provide interpretability to end-users, especially those who are not familiar with the technology, and (2) safety guarantees for real-time robotic motion generation, with applications to smart factories and human-robot interaction.
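As a small illustration of what quantifying predictive uncertainty can look like, the sketch below applies the law of total variance to a one-dimensional Gaussian mixture prediction. It separates the spread within the mixture components from the disagreement between component means. This is a generic textbook decomposition and is not claimed to be the exact formulation in [3]; the function name `mixture_variance` and its arguments are hypothetical.

```python
from typing import Sequence, Tuple


def mixture_variance(
    weights: Sequence[float],
    means: Sequence[float],
    variances: Sequence[float],
) -> Tuple[float, float, float]:
    """Split the variance of a 1-D Gaussian mixture into two parts.

    Returns (mean, within, between), where
      within  = sum_k w_k * var_k            (average component variance)
      between = sum_k w_k * (mu_k - mean)^2  (spread of the component means)
    The total predictive variance is within + between.
    Hypothetical helper for illustration only.
    """
    if not (len(weights) == len(means) == len(variances)):
        raise ValueError("weights, means, and variances must have equal length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")

    mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - mean) ** 2 for w, m in zip(weights, means))
    return mean, within, between


if __name__ == "__main__":
    # Two equally weighted components that disagree about the output.
    mean, within, between = mixture_variance([0.5, 0.5], [0.0, 2.0], [1.0, 1.0])
    print(mean, within, between, within + between)
```

A large between-component term signals that the mixture components disagree about the prediction, which is one quantity a system could surface to an end-user or use to trigger a fallback behavior.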
We have witnessed the great successes of reinforcement learning, including AlphaGo from DeepMind. However, it is also well known that a formidable amount of experience is required to successfully train a policy function, which makes most methods inapplicable without simulation environments. When human beings or contact-rich robotic domains are involved in the learning loop, leveraging simulations may not be feasible. In such cases, one remedy is to impose more structure on the learning process (i.e., structured reinforcement learning). Recently, I presented a sample-efficient policy gradient method [5] for learning locomotion skills of a quadruped robot (without simulations) by restricting the policy function to the space of smooth joint trajectories. I plan to achieve even better sample efficiency by meta-learning the prior distributions of joint trajectories. The learned distributions over the robot joint trajectories could be combined effectively with the reliable learning methods in Section 1 (Robust and Reliable Machine Learning) to enable safety guarantees during the exploration phase, such as preventing the robot from damaging itself or its surroundings, including human beings.
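To make the idea of restricting a policy to smooth joint trajectories concrete, the following sketch parameterizes a single joint trajectory by a handful of weights on normalized radial basis functions. This is one common way to obtain a smooth, low-dimensional trajectory space and is an assumption made for illustration; it is not the parameterization used in [5]. The function name `smooth_trajectory` and the parameters `num_steps` and `width` are hypothetical.

```python
import math
from typing import List, Sequence


def smooth_trajectory(
    weights: Sequence[float],
    num_steps: int = 50,
    width: float = 0.15,
) -> List[float]:
    """Map a few weights to a smooth joint-angle trajectory on [0, 1].

    Each weight is attached to a Gaussian basis function whose center is
    evenly spaced over the normalized time interval. The trajectory value at
    each time step is the activation-weighted average of the weights, so it
    always stays between min(weights) and max(weights).
    Hypothetical helper for illustration only.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if not weights:
        raise ValueError("at least one weight is required")
    if width <= 0.0:
        raise ValueError("width must be positive")

    n = len(weights)
    centers = [i / (n - 1) if n > 1 else 0.5 for i in range(n)]

    trajectory = []
    for step in range(num_steps):
        s = step / (num_steps - 1)  # normalized time in [0, 1]
        activations = [math.exp(-0.5 * ((s - c) / width) ** 2) for c in centers]
        total = sum(activations)
        value = sum(w * a for w, a in zip(weights, activations)) / total
        trajectory.append(value)
    return trajectory


if __name__ == "__main__":
    # Four parameters describe a 50-step joint trajectory.
    q = smooth_trajectory([0.0, 0.4, -0.2, 0.0])
    print(len(q), min(q), max(q))
```

With such a parameterization, a learning method searches over a few weights per joint instead of an action at every time step, which is the kind of structure that can reduce the amount of real-robot experience required.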
Human-robot interaction (HRI) is undoubtedly one of the most important research topics in this era of aging populations, as it can potentially provide both physical and mental assistance to human users. However, as the recent failures of social robot companies, including Anki, Jibo, and Kuri, have shown, there are multiple hurdles to overcome before HRI can actually be applied to our everyday lives. At Disney, I worked on developing an attention engine that infers how interested guests are in an interactive robot agent, and I learned that understanding human intention requires considering context over longer time horizons.
In this regard, I focus on two aspects: (1) understanding human intention via causal inference, and (2) adaptive learning via inferred human intentions. Human signals, including voices and gestures, can be identical even when the underlying intents differ, which emphasizes the importance of understanding or inferring those intentions. One possible approach is to leverage causal inference (or counterfactual inference) to interpret the intention better. The inferred intention can further be used for the intelligent agent or robot to evolve adaptively and reliably with the methods in Sections 2 and 3, respectively. In fact, considering that most customer complaints come from the inability of social robots to adapt themselves, this adaptability may play a significant role in the successful deployment of intelligent companion robots.
[1] Sungjoon Choi, Kyungjae Lee, and Songhwai Oh, “Robust Learning from Demonstration with Leveraged Gaussian Processes,” IEEE Transactions on Robotics (T-RO), 2019.
[2] Sungjoon Choi, Sanghoon Hong, Kyungjae Lee, and Sungbin Lim, “Task Agnostic Robust Learning on Corrupt Outputs by Correlation-Guided Mixture Density Network,” in Proc. of Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[3] Sungjoon Choi, Kyungjae Lee, Sungbin Lim, and Songhwai Oh, “Uncertainty-Aware Learning from Demonstration Using Mixture Density Networks with Sampling-Free Variance Modeling,” in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), May 2018.
[4] Sungjoon Choi, Matthew Pan, and Joohyung Kim, “Nonparametric Motion Retargeting for Humanoid Robots on Shared Latent Space,” in Robotics: Science and Systems (RSS), July 2020.
[5] Sungjoon Choi and Joohyung Kim, “Trajectory-based Probabilistic Policy Gradient for Learning Locomotion Behaviors,” in Proc. of the IEEE International Conference on Robotics and Automation (ICRA), May 2019.