2019 ICML Workshop on AI for Autonomous Driving

Talk Abstracts and Speaker Bios:

9.15 - 9.40 Sven Kreiss (EPFL) Compositionality, Confidence and Crowd Modeling for Self-Driving Cars

Abstract: I will present our recent work related to the three AI pillars of a self-driving car: perception, prediction, and planning. For the perception pillar, I will present new human pose estimation and monocular distance estimation methods that use a loss that learns its own confidence, the Laplace loss. For prediction, I will show our investigations into interpretable models, where we apply deep learning techniques within structured, hand-crafted classical models for path prediction in social contexts. For the third pillar, planning, I will show our crowd-robot interaction module that uses attention-based representation learning suitable for planning in an RL environment with multiple people.
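For readers unfamiliar with losses that learn their own confidence, a minimal PyTorch sketch of a Laplace negative log-likelihood is shown below: the network predicts both a value and the log of its spread, and the spread acts as a learned, per-sample confidence. This is a generic illustration with placeholder names, not necessarily the exact formulation used in the talk.

```python
import torch

def laplace_loss(pred, log_b, target):
    """Negative log-likelihood of a Laplace distribution (up to a constant).

    The network predicts both the value `pred` and the log of its spread
    `log_b`; minimizing this loss lets the spread serve as a learned,
    per-sample confidence estimate.
    """
    b = torch.exp(log_b)                                  # keep the scale parameter positive
    return (torch.abs(target - pred) / b + log_b).mean()  # |x - mu| / b + log b
```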

Bio: Sven is a postdoc at the Visual Intelligence for Transportation (VITA) lab at EPFL in Switzerland. Before returning to academia, he was the Senior Data Scientist at Sidewalk Labs (an Alphabet company) and worked on machine learning problems for urban environments. Before that, he led the machine learning efforts at the New York-based startup Wildcard. Prior to his industry experience, Sven developed statistical tools and methods used in particle physics research.

Sven grew up in Germany and studied mathematical physics at the University of Edinburgh before earning his PhD in physics at New York University. For his research in particle physics he spent a year at CERN in Switzerland and was on the core team that discovered the Higgs boson.

9.40 - 10.05 Mayank Bansal (Waymo) ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst

Abstract: Our goal is to train a policy for autonomous driving via imitation learning that is robust enough to drive a real vehicle. We find that standard behavior cloning is insufficient for handling complex driving scenarios, even when we leverage a perception system for preprocessing the input and a controller for executing the output on the car: 30 million examples are still not enough. We propose exposing the learner to synthesized data in the form of perturbations to the expert's driving, which creates interesting situations such as collisions and/or going off the road. Rather than purely imitating all data, we augment the imitation loss with additional losses that penalize undesirable events and encourage progress -- the perturbations then provide an important signal for these losses and lead to robustness of the learned model. We show that the ChauffeurNet model can handle complex situations in simulation, and present ablation experiments that emphasize the importance of each of our proposed changes and show that the model is responding to the appropriate causal factors. Finally, we demonstrate the model driving a real car at our test facility.
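As a rough illustration of augmenting an imitation loss with losses that penalize undesirable events, the sketch below combines a waypoint-matching term with collision and off-road penalties. The tensor names, overlap computations, and weights are placeholders for exposition, not Waymo's implementation.

```python
import torch
import torch.nn.functional as F

def combined_driving_loss(pred_waypoints, expert_waypoints,
                          collision_overlap, offroad_overlap,
                          w_imit=1.0, w_env=1.0):
    """Schematic combined objective: imitate the expert while penalizing
    synthesized bad events, so that perturbed examples carry extra signal.
    """
    imitation = F.l1_loss(pred_waypoints, expert_waypoints)  # match the expert's future poses
    collision = collision_overlap.mean()                     # overlap of the predicted footprint with other agents
    offroad = offroad_overlap.mean()                         # overlap of the predicted footprint with non-road area
    return w_imit * imitation + w_env * (collision + offroad)
```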

Bio: Mayank Bansal is a Staff Research Scientist at Waymo, where he leads research efforts on deep learning for planning and prediction. Before joining Waymo in 2015, he was a Principal Research Scientist at the Center for Vision Technologies, SRI International (Sarnoff), where he spent 11 years leading computer vision and robotics R&D programs for a range of government (DARPA, NGA, FHWA, NIH, etc.) and commercial (Google Inc., Autoliv Inc.) clients. Bansal has more than 15 years of experience in 3D/2D object detection and recognition from LiDAR and EO/IR mono/stereo camera data, as well as deep-learning modeling and research experience for a variety of computer vision and robotics applications. He has significant expertise in geometric and stereo vision, perception for mobile robotics, medical image analysis, and geo-localization techniques from a variety of input sources. Bansal received his PhD in Computer & Information Sciences from the University of Pennsylvania. His Master's and Bachelor's degrees in Computer Science & Engineering are from the Indian Institute of Technology (IIT) Delhi, New Delhi, India.

10.05 - 10.30 Chelsea Finn (UC Berkeley) A Practical View on Generalization and Autonomy in the Real World

10.50 - 11.15 Sergey Levine (UC Berkeley) Imitation, Prediction, and Model-Based Reinforcement Learning for Autonomous Driving

Abstract: While machine learning has transformed passive perception -- computer vision, speech recognition, NLP -- its impact on autonomous control in real-world robotic systems has been limited due to reservations about safety and reliability. In this talk, I will discuss how end-to-end learning for control can be framed in a way that is data-driven, reliable and, crucially, easy to merge with existing model-based control pipelines based on planning and state estimation. The basic building blocks of this approach to control are generative models that estimate which states are safe and familiar, and model-based reinforcement learning, which can utilize these generative models within a planning and control framework to make decisions. By framing the end-to-end control problem as one of prediction and generation, we can make it possible to use large datasets collected by previous behavioral policies, as well as human operators, estimate confidence or familiarity of new observations to detect "unknown unknowns," and analyze the performance of our end-to-end models on offline data prior to live deployment. I will discuss how model-based RL can enable navigation and obstacle avoidance, how generative models can detect uncertain and unsafe situations, and then discuss how these pieces can be put together into the framework of deep imitative models: generative models trained via imitation of human drivers that can be incorporated into model-based control for autonomous driving, and can reason about future behavior and intentions of other drivers on the road. Finally, I will conclude with a discussion of current research that is likely to make an impact on autonomous driving and safety-critical AI systems in the near future, including meta-learning, off-policy reinforcement learning, and pixel-level video prediction models.
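As a rough sketch of the planning idea behind deep imitative models described above (choosing actions that are likely under a generative model of human driving while also satisfying a navigation goal), the gradient-based trajectory optimization below is illustrative only; both callables and the optimization setup are placeholders, not the authors' code.

```python
import torch

def plan_with_imitative_model(traj_log_prob, goal_log_likelihood,
                              init_traj, steps=100, lr=0.1):
    """Schematic trajectory optimization: find a trajectory that is both
    likely under a generative model of human driving (`traj_log_prob`)
    and consistent with the navigation goal (`goal_log_likelihood`).
    """
    traj = init_traj.clone().requires_grad_(True)
    opt = torch.optim.Adam([traj], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -(traj_log_prob(traj) + goal_log_likelihood(traj))  # maximize imitation prior + goal term
        loss.backward()
        opt.step()
    return traj.detach()
```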

Bio: Sergey Levine received a BS and MS in Computer Science from Stanford University in 2009, and a Ph.D. in Computer Science from Stanford University in 2014. He joined the faculty of the Department of Electrical Engineering and Computer Sciences at UC Berkeley in fall 2016. His work focuses on machine learning for decision making and control, with an emphasis on deep learning and reinforcement learning algorithms. Applications of his work include autonomous robots and vehicles, as well as computer vision and graphics. His research includes developing algorithms for end-to-end training of deep neural network policies that combine perception and control, scalable algorithms for inverse reinforcement learning, deep reinforcement learning algorithms, and more. His work has been featured in many popular press outlets, including the New York Times, the BBC, MIT Technology Review, and Bloomberg Business.

11.15 - 11.40 Wolfram Burgard (Toyota Research Institute)

11.40 - 12.05 Dorsa Sadigh (Stanford) Influencing Interactive Mixed-Autonomy Systems

14.30 - 14.55 Alexander Amini (MIT) Learning to Drive with Purpose

Abstract: Deep learning has revolutionized the ability to learn "end-to-end" autonomous vehicle control directly from raw sensory data. In recent years, there have been advances to handle more complex forms of navigational instruction. However, these networks are still trained on biased human driving data (yielding biased models), and are unable to capture the full distribution of possible actions that could be taken. By learning a set of unsupervised latent variables that characterize the training data, we present an online debiasing algorithm for autonomous driving. Additionally, we extend end-to-end driving networks with the ability to drive with purpose and perform point-to-point navigation. We also show how our model can be used to localize the robot according to correspondences between the map and the observed visual road topology, inspired by the rough localization that human drivers can perform even in cases where GPS is noisy or removed altogether. Our results highlight the importance of bridging the benefits of end-to-end learning with classical probabilistic reasoning and Bayesian inference to push the boundaries of autonomous driving.
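A minimal sketch of debiasing by resampling over learned latent variables is given below, assuming latent codes have already been extracted (e.g. by a variational autoencoder): examples from dense regions of the latent space are down-weighted and rare ones up-weighted. The per-dimension histogram density estimate and the smoothing constant are illustrative choices, not the exact algorithm from the talk.

```python
import numpy as np

def debiasing_sample_weights(latent_codes, bins=10, alpha=0.01):
    """Schematic resampling weights for training-set debiasing, given an
    (N, D) array of latent codes learned from the training data.
    """
    weights = np.ones(len(latent_codes))
    for d in range(latent_codes.shape[1]):
        hist, edges = np.histogram(latent_codes[:, d], bins=bins, density=True)
        idx = np.clip(np.digitize(latent_codes[:, d], edges[1:-1]), 0, bins - 1)
        weights *= 1.0 / (hist[idx] + alpha)     # inverse of the estimated density per dimension
    return weights / weights.sum()               # normalized sampling probabilities
```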

Bio: Alexander Amini is a PhD student at the Massachusetts Institute of Technology (MIT) in the Computer Science and Artificial Intelligence Laboratory (CSAIL), working with Prof. Daniela Rus. He is an NSF Fellow and completed his Bachelor of Science and Master of Science in Electrical Engineering and Computer Science at MIT, with a minor in Mathematics. Amini's research focuses on building machine learning algorithms for end-to-end control (i.e., perception to actuation) of autonomous systems and formulating guarantees for these algorithms. He has worked on control of autonomous vehicles, formulating confidence and improving algorithmic bias of deep neural networks, as well as mathematical modeling of human mobility.

14.55 - 15.20 Fisher Yu (UC Berkeley) Motion and Prediction for Autonomous Driving

Bio: Fisher Yu is a postdoctoral researcher at UC Berkeley. He completed his Ph.D. at Princeton University and obtained his bachelor's degree from the University of Michigan, Ann Arbor. His research interest lies in image representation learning and interactive data processing systems. His work focuses on seeking connections between computer vision problems and building unified image representation frameworks. Through the lens of image representation, he also studies high-level understanding of dynamic 3D scenes. More information about his work can be found at his homepage: https://www.yf.io.

15.20 - 15.45 Alfredo Canziani (NYU) Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic

Abstract: Learning a policy using only observational data is challenging because the distribution of states it induces at execution time may differ from the distribution observed during training. In this work, we propose to train a policy while explicitly penalizing the mismatch between these two distributions over a fixed time horizon. We do this by using a learned model of the environment dynamics which is unrolled for multiple time steps, and training a policy network to minimize a differentiable cost over this rolled-out trajectory. This cost contains two terms: a policy cost which represents the objective the policy seeks to optimize, and an uncertainty cost which represents its divergence from the states it is trained on. We propose to measure this second cost by using the uncertainty of the dynamics model about its own predictions, using recent ideas from uncertainty estimation for deep networks. We evaluate our approach using a large-scale observational dataset of driving behavior recorded from traffic cameras, and show that we are able to learn effective driving policies from purely observational data, with no environment interaction.
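A minimal sketch of the rollout objective described above is shown below, assuming a stochastic learned dynamics model (e.g. with dropout) and a differentiable policy cost; the names and the dropout-sample uncertainty estimator are illustrative rather than the paper's exact implementation.

```python
import torch

def uncertainty_regularized_rollout(policy, dynamics, state, horizon,
                                    policy_cost, n_samples=4, lam=0.5):
    """Schematic objective: unroll a learned dynamics model with the policy's
    actions and penalize the model's own predictive uncertainty so the
    policy stays close to the states it was trained on.
    """
    total_cost = 0.0
    s = state
    for _ in range(horizon):
        a = policy(s)
        preds = torch.stack([dynamics(s, a) for _ in range(n_samples)])      # stochastic forward passes
        s = preds.mean(dim=0)                                                # continue rollout from the mean prediction
        total_cost = total_cost + policy_cost(s, a)                          # task term the policy should optimize
        total_cost = total_cost + lam * preds.var(dim=0).mean()              # uncertainty term (divergence from training data)
    return total_cost
```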

Bio: Alfredo Canziani is a Postdoctoral Deep Learning Research Scientist and Lecturer at the NYU Courant Institute of Mathematical Sciences, under the supervision of professors KyungHyun Cho and Yann LeCun. His research mainly focuses on machine learning for autonomous driving. He has been exploring uncertainty estimation and failure detection for the actions of deep policy networks, as well as long-term planning based on latent forward models, which deal nicely with the stochasticity and multimodality of the surrounding environment. In his spare time, Alfredo is a professional musician, dancer, and cook, and keeps expanding his free online video course on Deep Learning and PyTorch.

16.05 - 16.30 Jianxiong Xiao (AutoX) Self-Driving Cars: What Can We Achieve Today?

Bio: Jianxiong Xiao (a.k.a. Professor X) is the Founder and CEO of AutoX Inc., a high-tech company working on self-driving vehicles. AutoX's mission is to democratize autonomy and enable autonomous driving to improve everyone's life. Dr. Xiao has over ten years of research and engineering experience in Computer Vision, Autonomous Driving, and Robotics. In particular, he is a pioneer in the fields of 3D Deep Learning, RGB-D Recognition and Mapping, Big Data, Large-scale Crowdsourcing, and Deep Learning for Robotics. Jianxiong received a BEng. and MPhil. in Computer Science from the Hong Kong University of Science and Technology in 2009. He received his Ph.D. from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT) in 2013. He was then an Assistant Professor at Princeton University and the founding director of the Princeton Computer Vision and Robotics Labs from 2013 to 2016. His work has received the Best Student Paper Award at the European Conference on Computer Vision (ECCV) in 2012 and the Google Research Best Papers Award for 2012, and has appeared in the popular press. He was awarded the Google U.S./Canada Fellowship in Computer Vision in 2012, the MIT CSW Best Research Award in 2011, the NSF/Intel VEC Research Award in 2016, and two Google Faculty Awards, in 2014 and 2015. He co-led the MIT+Princeton joint team in the Amazon Picking Challenge in 2016, winning 3rd and 4th place worldwide. He was named one of the 35 Innovators Under 35 by MIT Technology Review in 2017, and one of the 12 most promising entrepreneurs to keep an eye on in 2018 by Next Visionaries. More information can be found at: http://www.jianxiongxiao.com.

16.30 - 16.55 German Ros (Intel Labs) Fostering Autonomous Driving Research with CARLA

Abstract: This talk focuses on the relevance of open-source solutions for fostering autonomous driving research and development. To this end, we present how CARLA has been used within the research community in the last year and what results it has enabled. We will also cover the CARLA Autonomous Driving Challenge and its relevance as an open benchmark for the driving community. Finally, we will share soon-to-be-released features and the future direction of the CARLA simulation platform.

Bio: German Ros is a Sr. Research Scientist at the Intel Intelligent Systems Lab, working on topics at the intersection of machine learning, simulation, virtual worlds, transfer learning, and intelligent autonomous systems. He also leads the CARLA organization as part of the Open Source Vision Foundation and serves as a product manager of the Open3D organization. Before joining Intel Labs, German was a Research Scientist at Toyota Research Institute (TRI), where he conducted research on simulation, scene understanding, and domain adaptation in the context of autonomous driving. His previous work includes the SYNTHIA dataset and the exploitation of synthetic data for training autonomous driving systems. German Ros obtained his Ph.D. in Computer Science from the Autonomous University of Barcelona & the Computer Vision Center (2016).

16.55 - 17.20 Venkatraman Narayanan (Aurora) The Promise and Challenge of ML in Self-Driving

Abstract: To deliver the benefits of autonomous driving safely, quickly, and broadly, learnability has to be a key element of the solution. In this talk, I will describe Aurora's philosophy towards building learnability into the self-driving architecture, avoiding the pitfalls of applying vanilla ML to problems involving feedback, and leveraging expert demonstrations for learning decision-making models. I will conclude with our approach to testing and validation.

Bio: Venkatraman Narayanan is a Planning Engineer at Aurora Innovation, where he develops production-grade algorithms at the intersection of machine learning and planning for safe and comfortable driving. He received his Ph.D. and M.S. in Robotics from Carnegie Mellon University, researching techniques for "deliberative" robot perception that bring together classical AI and machine learning. He has published in top robotics and AI conferences, won a best poster award at the 2014 International Symposium on Combinatorial Search, and is one of the recipients of the 2014 AAAI Robotics Fellowship. He has previously worked on autonomous driving at the Uber Advanced Technologies Group and Google X (now Waymo).