Benchmark and Dataset for Probabilistic Prediction of Interactive Human Behavior
Accurate prediction of probabilistic and interactive human behavior is a prerequisite for full autonomy of mobile robots (e.g., autonomous vehicles) in complex scenes. To enable accurate predictions, two fundamental problems should be addressed: 1) datasets of human behavior and motions in interactive tasks and scenarios, and 2) evaluation metrics and benchmarks for a wide range of prediction models/algorithms. Datasets are the most important asset since they provide sources for both model learning/training and validation. Similarly, evaluation metrics and benchmarks are of fundamental importance since they provide not only criteria but also guidance for the design of prediction algorithms. Currently, the research community is still working toward building high-quality datasets containing interactive human behavior, such as that of human-driven vehicles, pedestrians, and cyclists. Also, there is not yet a widely accepted evaluation metric that can comprehensively quantify/evaluate the performance of different probabilistic prediction algorithms from the perspectives of both data approximation and fatality/utility impacts on the autonomy of the mobile robots.
Photos Taken on November 4
A photo of the presentation of Yeping Hu
Group photo of speakers
Speakers from Academia
Yeping Hu, UC Berkeley, Interactive behavior prediction for autonomous vehicles
Shashank Srikanth, IIIT Hyderabad, Intermediate representations for trajectory prediction
Interactive motion datasets of road participants are vital to the development of autonomous vehicles in both industry and academia. Research areas such as motion prediction, motion planning, representation learning, imitation learning, behavior modeling, behavior generation, and algorithm testing require support from high-quality motion datasets containing interactive driving scenarios with different driving cultures. In this paper, we present an INTERnational, Adversarial and Cooperative moTION dataset (INTERACTION dataset) in interactive driving scenarios with semantic maps. Five features of the dataset are highlighted. 1) The interactive driving scenarios are diverse, including urban/highway/ramp merging and lane changes, roundabouts with yield/stop signs, signalized intersections, intersections with one/two/all-way stops, etc. 2) Motion data from different countries and different continents are collected so that driving preferences and styles in different cultures are naturally included. 3) The driving behavior is highly interactive and complex, with adversarial and cooperative motions of various traffic participants. Highly complex behavior such as negotiations, aggressive/irrational decisions and traffic rule violations is densely represented in the dataset, while regular behavior is also included, from cautious car-following, stopping, and left/right/U-turns to rational lane changes, cycling, and pedestrian crossings. 4) The levels of criticality span a wide range, from regular safe operations to dangerous, near-collision maneuvers. Real collisions, although relatively minor, are also included. 5) Maps with complete semantic information are provided with physical layers, reference lines, lanelet connections and traffic rules. The data is recorded from drones and traffic cameras, and the processing pipelines for both are briefly described. 
Statistics of the dataset in terms of number of entities and interaction density are also provided, along with some utilization examples in the areas of motion prediction, imitation learning, decision-making and planning, representation learning, interaction extraction and social behavior generation. The dataset can be downloaded via https://interaction-dataset.com
Reasonable and Reliable Interactive Behavior Prediction for Autonomous Vehicles
Accurately predicting future behaviors of surrounding vehicles is an essential capability for autonomous vehicles in order to plan safe and feasible trajectories. A good prediction algorithm is supposed to be both reasonable and reliable. For example, the algorithm should not only be capable of explaining the underlying logic of the predicted results, but also be aware of the various possible behaviors (i.e., rational and irrational) of surrounding drivers. Moreover, the prediction module is expected to generate reasonable results even in unseen and corner-case scenarios. In this talk, we will address the above aspects by considering a combination of learning-based and planning-based prediction models, which is able to predict continuous trajectories that reflect well the possible future situations of other drivers. A case study in a real-world roundabout scenario is provided to demonstrate the performance and capability of the proposed prediction architecture.
Incorporating Relational Reasoning in Multi-agent Trajectory Prediction
Interacting systems are prevalent in nature, from dynamical systems in physics to complex societal dynamics. The interplay of components can give rise to very complex and diversified dynamics at the level of individual constituents and in the system as a whole. In the context of autonomous driving, in order to navigate safely and efficiently in dense traffic scenarios or crowded areas full of pedestrians, it is necessary for autonomous vehicles to forecast future behavior of surrounding interactive agents accurately. However, modeling these types of dynamics is challenging, since generally we only have access to individual trajectories without knowledge of the underlying interactions or dynamical model. In this talk, we will introduce the graph neural network (GNN) and its suitability and superiority in relational reasoning for multi-agent trajectory prediction. Experimental results will be demonstrated and analyzed.
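The core operation behind relational reasoning with a GNN is message passing: each agent collects messages from the agents it is connected to, aggregates them, and updates its own state. The sketch below shows one such round in plain Python. It is only an illustration under stated assumptions: the abstract does not describe the speaker's model, so the fully connected graph, the mean aggregation, and the hand-written functions `relative_offset` and `nudge` (stand-ins for what would normally be learned networks) are all hypothetical.

```python
def message_passing_step(states, edges, message_fn, update_fn):
    """Run one round of message passing over a directed agent graph.

    states: dict mapping agent id -> list of floats (agent features)
    edges:  list of (sender, receiver) pairs
    """
    inbox = {agent: [] for agent in states}
    for sender, receiver in edges:
        inbox[receiver].append(message_fn(states[sender], states[receiver]))

    new_states = {}
    for agent, h in states.items():
        msgs = inbox[agent]
        if msgs:
            # Mean aggregation is permutation-invariant over neighbors.
            agg = [sum(m[k] for m in msgs) / len(msgs) for k in range(len(msgs[0]))]
        else:
            agg = [0.0] * len(h)
        new_states[agent] = update_fn(h, agg)
    return new_states


# Hypothetical placeholders for learned message and update networks.
def relative_offset(sender, receiver):
    return [s - r for s, r in zip(sender, receiver)]


def nudge(h, agg, rate=0.5):
    return [x + rate * a for x, a in zip(h, agg)]


# Three agents with 2-D positions as features, fully connected.
positions = {"a": [0.0, 0.0], "b": [2.0, 0.0], "c": [0.0, 4.0]}
edges = [(i, j) for i in positions for j in positions if i != j]
updated = message_passing_step(positions, edges, relative_offset, nudge)
```

In a trained model, the message and update functions would be neural networks and several rounds would typically be stacked before decoding future trajectories; the structure of the loop stays the same.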
Intermediate representations for trajectory prediction
In urban driving scenarios, forecasting future trajectories of surrounding vehicles is of paramount importance. While several approaches for the problem have been proposed, the best-performing ones tend to require extremely detailed input representations (e.g., image sequences). However, such methods do not generalize to datasets they have not been trained on. We propose intermediate representations that are particularly well-suited for future prediction. As opposed to using texture (color) information, we rely on semantics and train an autoregressive model to accurately predict future trajectories of traffic participants (vehicles) (see fig. above). We demonstrate that using semantics provides a significant boost over techniques that operate over raw pixel intensities/disparities. Unlike typical state-of-the-art approaches, our representations and models generalize to completely different datasets, collected across several cities, and also across countries where people drive on opposite sides of the road (left-hand vs. right-hand driving). Additionally, we demonstrate an application of our approach in multi-object tracking (data association). To foster further research in transferable representations and ensure reproducibility, we release all our code and data.
Speakers from Industry
Dr. Dongchun Ren, Meituan-Dianping, Pedestrian Trajectory Prediction Network for Autonomous Driving
Dr. Friederike Schneemann, Autonomous Intelligent Driving GmbH, Challenges of evaluating Intention Detection Algorithms
Dr. Daniel Graves, Huawei Technologies, Perception as prediction using general value functions in autonomous driving applications
Pedestrian Trajectory Prediction Network for Autonomous Driving
Pedestrian trajectory prediction is an important component in autonomous driving. It improves the experience of passengers and helps the car make wiser path-planning decisions. However, it is a challenging task because of the complex interactions among pedestrians and vehicles. Many previous studies consider the interactions only through the spatial relationship between the target and its nearby objects, which oversimplifies the problem. In our study published at IROS 2019, we propose a novel multi-object trajectory prediction framework. The framework includes a hub network that considers the trajectories of all pedestrians to produce a holistic coding of the interactions among pedestrians. Meanwhile, there is a host network for each pedestrian that takes the holistic coding from the hub network and predicts the future trajectory of that pedestrian. The framework is advantageous over previous studies in two aspects: 1. It considers the mutual interactions among all pedestrians in the hub network. 2. It is computationally efficient since the number of host networks increases linearly with the number of pedestrians. Experiments on benchmark datasets demonstrate the effectiveness of our proposed framework. In another study of ours, a multi-branch LSTM encoder-decoder network is proposed for multi-object trajectory prediction. The multi-branch method shows substantially better performance on the ApolloScape trajectory dataset than state-of-the-art methods.
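The hub/host split can be pictured as a data flow: the hub runs once over all pedestrians and produces a shared code, and then one host per pedestrian combines that code with the pedestrian's own history. The sketch below illustrates only this structure. It is not the published network: `hub_encode` and `host_predict` are hypothetical placeholders (a mean displacement and a blended constant-velocity extrapolation) standing in for the learned hub and host networks, and the `weight` and `steps` parameters are assumptions made for the example.

```python
def hub_encode(histories):
    """Placeholder hub: summarize all pedestrians in one shared code.

    Here the code is the mean last-step displacement of the crowd.
    """
    n = len(histories)
    dx = sum(h[-1][0] - h[-2][0] for h in histories.values()) / n
    dy = sum(h[-1][1] - h[-2][1] for h in histories.values()) / n
    return (dx, dy)


def host_predict(history, crowd_code, steps=2, weight=0.5):
    """Placeholder host: blend own velocity with the hub code, then extrapolate."""
    own_vx = history[-1][0] - history[-2][0]
    own_vy = history[-1][1] - history[-2][1]
    vx = (1 - weight) * own_vx + weight * crowd_code[0]
    vy = (1 - weight) * own_vy + weight * crowd_code[1]
    x, y = history[-1]
    future = []
    for _ in range(steps):
        x, y = x + vx, y + vy
        future.append((x, y))
    return future


# Two pedestrians with two observed positions each.
histories = {
    "p1": [(0.0, 0.0), (1.0, 0.0)],
    "p2": [(0.0, 0.0), (0.0, 1.0)],
}

crowd_code = hub_encode(histories)  # computed once for the whole scene
predictions = {pid: host_predict(h, crowd_code) for pid, h in histories.items()}
```

Because the hub is evaluated once and each host only sees its own history plus the shared code, adding a pedestrian adds one host call rather than a new set of pairwise computations.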
Challenges of evaluating Intention Detection Algorithms
Predicting the behavior of other road users in highly interactive traffic situations is one of the major challenges for today's autonomous vehicles. Part of this challenge is detecting the intention of the other road users in order to increase the situational awareness of the autonomous vehicle. But intention is an internal state of mind, and the execution of the intended behavior depends heavily on potential conflicts with other traffic participants and the interaction between them. Therefore, developing and evaluating intention detection algorithms introduces many new challenges for the research community. This talk will first give an introduction to the concept of "intention detection", including a clear demarcation from other tasks related to behavior prediction. Subsequently, the speaker will share insights and challenges she faced in two projects, when creating labeled datasets to train models for intention detection and when evaluating the performance of these models.
Perception as prediction using general value functions in autonomous driving applications
We propose and demonstrate a framework called perception as prediction for autonomous driving that uses general value functions (GVFs) to learn predictions. Perception as prediction learns data-driven predictions relating to the impact of actions on the agent's perception of the world. It also provides a data-driven approach to predict the impact of the anticipated behavior of other agents on the world without explicitly learning their policy or intentions. We demonstrate perception as prediction by learning to predict an agent's front safety and rear safety with GVFs, which encapsulate anticipation of the behavior of the vehicle in front and in the rear, respectively. The safety predictions are learned through random interactions in a simulated environment containing other agents. We show that these predictions can be used to produce control behavior similar to that of an LQR-based controller in an adaptive cruise control problem, as well as to provide advance warning when the vehicle behind is approaching dangerously. The predictions are compact policy-based predictions that support prediction of the long-term impact on safety when following a given policy. We analyze two controllers that use the learned predictions in a racing simulator to understand the value of the predictions and demonstrate their use in the real world on a Clearpath Jackal robot and an autonomous vehicle platform.
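A general value function predicts the expected discounted sum of a chosen signal (the cumulant) under a given policy, with a continuation factor that can vary per transition. The abstract does not state which learning rule the authors use, so the following is only a minimal sketch assuming tabular TD(0). The three-state chain, the cumulant values (imagine a per-step safety signal), and the function name `td0_gvf` are hypothetical.

```python
def td0_gvf(transitions, n_states, alpha=0.1, sweeps=500):
    """Tabular TD(0) estimate of one general value function.

    transitions: list of (state, cumulant, gamma, next_state) tuples,
                 where gamma is the continuation factor of that transition
                 (0.0 when the prediction terminates).
    """
    v = [0.0] * n_states
    for _ in range(sweeps):
        for s, cumulant, gamma, s_next in transitions:
            target = cumulant + gamma * v[s_next]  # bootstrapped target
            v[s] += alpha * (target - v[s])        # move estimate toward target
    return v


# Hypothetical chain 0 -> 1 -> 2, where state 2 is terminal.
# Each transition emits a cumulant of 1.0; the first continues with gamma 0.5.
transitions = [(0, 1.0, 0.5, 1), (1, 1.0, 0.0, 2)]
values = td0_gvf(transitions, n_states=3)
```

For this chain the exact answers are 1.0 for state 1 and 1.0 + 0.5 * 1.0 = 1.5 for state 0, which the estimates approach as the sweeps proceed. With real sensor data a function approximator would replace the table, but the update has the same form.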
Ernest C. Cheung and Farshid Moussavi, ATG, Samsung Strategy and Innovation Center, Unsupervised Trajectory Extraction Pipeline For Drone Footage
Alex Yuan Gao, Uppsala University, CongreG8: A motion capture dataset of human and robot approach behaviors into small group formations
Gregor Koporec and Janez Perš, Gorenje and University of Ljubljana, Human-Centered Unsupervised Segmentation Fusion
The workshop will be held on Nov 4th at IROS 2019, Macau. More details about the conference can be found here. The tentative schedule can be found below. Details on speakers and talks will be posted soon!