Schedule

Program

Date: October 27th, 2019. Duration: Half-day.

Chair: Senthil Yogamani (Valeo Vision Systems, Ireland) & Lars Kunze (Oxford Robotics Institute, UK)

Tentative Program:

  • 8:45 Opening Remarks

  • 8:50 - 9:10 Accepted Paper #1 - Applying map-masks to Trajectory Prediction for Interacting Traffic-Agents (PDF)

Speaker: Vyshakh Palli Thazha (ENSTA Paris)

  • 9:10 - 9:30 Accepted Paper #2 - DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance (PDF)

Speaker: Ian Miller (University of Pennsylvania, US)

  • 9:30 - 10:00 Invited Talk #1 - Overview of Deep Learning based Perception and Navigation at Oxford Robotics Institute

Speaker: Daniele De Martini (Oxford Robotics Institute, University of Oxford)

  • 10:00 - 10:40 Coffee Break and Networking

  • 10:40 - 11:00 Invited Talk #2 - Moving Object Detection on Surround-view Cameras for Autonomous Driving (PDF)

Speaker: Dr. Letizia Mariotti (Valeo Vision Systems, Ireland)

  • 11:00 - 11:30 Invited Talk #3 - Predictive Motion Models: Designed by an Expert or Learned from Data?

Speaker: Prof. Dariu Gavrila (TU Delft, Netherlands)

  • 11:30 - 12:00 Invited Talk #4 - Coupling deep learning for perception and decision making based control

Speaker: Prof. Arnaud de la Fortelle (Mines Paristech, France)

  • 12:00 Closing

Invited Speakers

Prof. Arnaud de La Fortelle

Center for Robotics - CAOR, MINES ParisTech, France

Prof. Arnaud de La Fortelle holds engineering degrees from the École Polytechnique and the École des Ponts et Chaussées (two top French institutions) and a Ph.D. in Applied Mathematics (Probability Theory). He has managed several French and European projects, twice as coordinator. He moved to MINES ParisTech in 2006, where he became director of the Center for Robotics in 2008. He was a Visiting Professor at UC Berkeley in 2017-2018. In 2009 he was elected to the Board of Governors of the IEEE Intelligent Transportation Systems Society. He has served on several conference program committees and was General Chair of the IEEE Intelligent Vehicles Symposium 2019 in Paris. From 2008 to 2017 he was a member, then president, of the French ANR scientific evaluation committee for sustainable mobility and cities. He has also served as an expert for the European H2020 program. His main topic of interest is cooperative systems (communication, data distribution, control, mathematical certification) and their applications (e.g. collective taxis, cooperative automated vehicles). He chairs the international research chair Drive for All, with sponsors Valeo, Safran and Peugeot and partners UC Berkeley, EPFL and Shanghai Jiao Tong University.

Coupling deep learning for perception and decision making based control

Deep Learning, and especially Neural Networks, has contributed significantly to many advances in automated driving, be it partial or full. However, we claim there is still a significant gap between perception and control. A Neural Network (or any more traditional perception algorithm) can produce several kinds of output, and it is difficult to exploit them directly for decision making or control: most architectures rely on maps as a convenient representation of the world. From these maps, reasoning leads to decision making and ultimately to control. Today's architectures usually rely on Model Predictive Control both to make decisions (i.e. decide maneuvers) and to control the vehicle (i.e. accelerate, steer, brake). In spite of their success, these algorithms depend on modeling assumptions and on rather simple models. This means the maps used for Model Predictive Control are fairly simple, so the output of the perception stage is sometimes simplified, or even discarded. Additionally, the real world is far wider than any dataset, and datasets can produce a false level of confidence (dataset bias). Finding suitable representations of the world, maps that can both be produced by perception and consumed by decision, planning and control algorithms, is ongoing research. This talk aims at giving some more insight into how we could better couple the perception side with the decision making and control side.
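To make the role of Model Predictive Control concrete, the toy sketch below (my own illustration, not material from the talk) runs a brute-force receding-horizon controller for a 1-D vehicle approaching a stop point: at every step it searches short acceleration sequences, applies only the first action, and re-plans. The dynamics, cost weights, and action set are all invented for illustration.

```python
from itertools import product

def mpc_step(pos, vel, target, horizon=3, dt=0.5, actions=(-2.0, 0.0, 2.0)):
    """Return the first acceleration of the lowest-cost action sequence.

    Toy 1-D dynamics; the cost penalises distance to the target,
    residual speed, and control effort at every predicted step.
    """
    best_cost, best_first = float("inf"), 0.0
    for seq in product(actions, repeat=horizon):
        p, v, cost = pos, vel, 0.0
        for a in seq:
            v = max(0.0, v + a * dt)   # no reversing in this toy model
            p += v * dt
            cost += (p - target) ** 2 + 0.1 * v ** 2 + 0.01 * a ** 2
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

# Far away and slow: the controller accelerates.
print(mpc_step(pos=0.0, vel=0.0, target=50.0))    # → 2.0
# Almost at the stop point but still fast: it brakes.
print(mpc_step(pos=49.0, vel=10.0, target=50.0))  # → -2.0
```

The point of the sketch is that the whole decision rests on the world model inside the loop: whatever the perception stage produces must be reduced to something this simple before the controller can use it, which is exactly the gap the abstract describes.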

Prof. Dariu M. Gavrila

Professor, Intelligent Vehicles section, TU Delft.

Dariu M. Gavrila received the MSc degree in computer science from the Vrije Universiteit in Amsterdam, the Netherlands. He received the PhD degree in computer science from the University of Maryland at College Park, USA, in 1996. He was a Visiting Researcher at the MIT Media Laboratory in 1996. From 1997 until 2016 he was with Daimler R&D in Ulm, Germany, where he eventually became a Distinguished Scientist. In 2010, he was appointed professor at the University of Amsterdam, chairing the area of Intelligent Perception Systems (part-time). Since 2016 he has headed the Intelligent Vehicles section at TU Delft as a Full Professor.

Over the past 20 years, Prof. Gavrila has focused on visual systems for detecting humans and their activity, with applications to intelligent vehicles, smart surveillance and social robotics. He led the multi-year pedestrian detection research effort at Daimler, which was incorporated in the Mercedes-Benz S-, E-, and C-Class models (2013-2014). Currently, he performs research on self-driving cars in complex urban environments and is particularly interested in the anticipation of pedestrian and cyclist behavior.

Prof. D. M. Gavrila has graduated eight Ph.D. students and over 20 MSc students. He has published 100+ papers in first-tier conferences and journals and is frequently cited in the areas of computer vision and intelligent vehicles (Google Scholar: 13,000+ citations). He has served as Area Chair and Associate Editor on many occasions and was Program Co-Chair of the IEEE Intelligent Vehicles 2016 conference. He received the I/O 2007 Award from the Netherlands Organisation for Scientific Research (NWO) and the IEEE Intelligent Transportation Systems Application Award 2014. He has made regular appearances in the international broadcast and print media. His personal Web site is www.gavrila.net (until 2016); his group's Web site is www.intelligent-vehicles.org (since 2016).

Predictive Motion Models: Designed by an Expert or Learned from Data?

Sensors have meanwhile become very good at measuring 3D structure for environment perception on self-driving vehicles. Scene labeling and object detection have also made big strides, mainly due to advances in deep learning. The time has now come to focus on the next frontier: modeling and anticipating the motion of road users. The potential benefits are large, such as earlier and more effective system reactions in dangerous traffic situations. To reap these benefits, however, it is necessary to use sophisticated predictive motion models based on intent-relevant (context) cues.

In this talk, I give an overview of predictive motion models and intent-relevant cues with respect to vulnerable road users (i.e. pedestrians, cyclists, and other riders). In particular, I discuss the pros and cons of having these models designed by an expert compared to learning them from data. I present results from a recent case study on cyclist path prediction involving a Dynamic Bayesian Network and a Recurrent Neural Network.
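For reference, the simplest "designed by an expert" model in this space is a constant-velocity baseline: extrapolate the last observed velocity over the prediction horizon. The sketch below is a generic illustration of that baseline (my own code, not from the case study); the Dynamic Bayesian Network and Recurrent Neural Network models discussed in the talk are considerably richer.

```python
def constant_velocity_predict(track, horizon, dt=0.5):
    """Expert-designed baseline: extrapolate the last observed velocity.

    track is a list of (x, y) positions sampled every dt seconds.
    Returns the next `horizon` predicted positions.
    """
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return [(x1 + vx * dt * k, y1 + vy * dt * k) for k in range(1, horizon + 1)]

# A cyclist moving along x at 2 m/s, observed at 2 Hz:
track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(constant_velocity_predict(track, horizon=3))
# → [(3.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
```

Such a baseline fails exactly where intent-relevant cues matter, e.g. a cyclist slowing before a turn, which is what the expert-designed and learned models in the talk are meant to capture.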


3D Semantic Scene Analysis in Urban Traffic

This talk presents recent work at TU Delft on 3D semantic scene analysis in urban traffic using video and/or LiDAR. First, I discuss fast and compact stereo image segmentation using Instance Stixels [1]. These augment single-frame stixels with instance information, which can be extracted by a CNN from the RGB image input. As a result, the novel Instance Stixels method efficiently computes stixels that account for boundaries of individual objects, and represents instances as grouped stixels that express connectivity. Second, I discuss the outcome of an experimental study on video- and LiDAR-based 3D person detection (i.e. pedestrians and cyclists) [2]. I report how the detection performance depends on distance, number of LiDAR points, amount of occlusion, and the optional use of LiDAR intensity cues. I include results on the new EuroCity Persons 2.5D (ECP2.5D) dataset, which is about one order of magnitude larger than KITTI regarding persons. Finally, I cover domain transfer experiments between the KITTI and ECP2.5D datasets, and discuss future challenges.

[1] T. Hehn, J.F.P. Kooij and D.M. Gavrila. “Fast and Compact Image Segmentation using Instance Stixels”. Under review at IEEE Trans. on Intelligent Vehicles, 2020.

[2] J. van der Sluis, E.A.I. Pool and D.M. Gavrila. “An Experimental Study on 3D Person Localization in Traffic Scenes”. Under review at IEEE Trans. on Intelligent Vehicles, 2020.


Dr. Letizia Mariotti

Computer Vision Researcher, Valeo Ireland

Speaker Bio: Letizia Mariotti received her BSc and MSc in Physics at the University of Trieste, Italy. She graduated with a PhD in image processing from the School of Physics of the National University of Ireland, Galway, in 2018. She is currently working as a computer vision research engineer at Valeo Vision Systems, Ireland, where her main research focus is the detection of moving objects.


Moving Object Detection on Surround-view Cameras for Autonomous Driving

The detection and localisation of moving obstacles is critically important for assisted and autonomous driving. The problem is even more complex when using fisheye cameras, because of the extreme non-linear distortion in the image. In this talk, I will present the challenges inherent in the moving object detection problem and compare the classical computer vision approach with recent advances in Deep Learning.
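As a reminder of what the classical side of that comparison looks like, the sketch below implements the most basic motion cue, frame differencing. This is a generic textbook baseline of my own for illustration, not Valeo's pipeline, and it deliberately ignores the fisheye distortion and ego-motion compensation that make the real problem hard.

```python
import numpy as np

def moving_object_mask(prev_frame, frame, threshold=25):
    """Classical baseline: per-pixel absolute difference + threshold.

    Both frames are uint8 grayscale images of the same shape; the
    returned boolean mask marks pixels whose intensity changed a lot.
    """
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

prev_frame = np.zeros((4, 4), dtype=np.uint8)
frame = prev_frame.copy()
frame[1:3, 1:3] = 200          # a bright patch "moved" into view
mask = moving_object_mask(prev_frame, frame)
print(int(mask.sum()))  # → 4 changed pixels
```

On a moving fisheye camera, nearly every pixel changes between frames, so this baseline breaks down immediately, which is precisely why learned approaches are attractive for this setting.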


Dr. Daniele De Martini

Postdoctoral Research Assistant at the Oxford Robotics Institute

Speaker Bio: Dr Daniele De Martini is a Postdoctoral Research Assistant at the Oxford Robotics Institute, part of the Department of Engineering Science, and a Junior Research Fellow of Kellogg College. He holds a degree in Mechanical Engineering from the Università degli Studi di Pavia, an MSc in Mechatronics Engineering from the Politecnico di Torino and a PhD from the Università degli Studi di Pavia, Italy. Daniele joined the Oxford Robotics Institute in 2018, and since then his work has focused on autonomous vehicles, with particular interest in innovative solutions that can lead to the wide adoption of radar technology as a main sensor modality.


Deep Learning Methods for Perception and Navigation in Complex Urban Environments

Autonomous vehicles are closer and closer to being deployed in challenging, real-world scenarios. Among the challenges are varying weather and lighting conditions, which mostly affect sensors such as cameras and LiDARs. Environmental conditions can degrade overall performance in critical tasks such as localisation and semantic segmentation. At the same time, the environment is only partially observable due to occlusions from static and dynamic objects. In this talk, we present Deep Learning as a viable methodology to tackle these challenges. In particular, we report on recent advances in the field made by the Mobile Robotics Group (MRG), with a focus on the work presented at ITSC 2019.

Accepted papers

  • DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance (link) Yilun Zhang, Ty Nguyen, Ian D. Miller, Shreyas S. Shivakumar, Steven Chen, Camillo J. Taylor, Vijay Kumar

  • Applying map-masks to Trajectory Prediction for Interacting Traffic-Agents (link) Vyshakh Palli-Thazha, David Filliat and Javier Ibanez-Guzman

  • Multi Modal Semantic Segmentation using Synthetic Data (link) Kartik Srivastava, Akash Kumar Singh, Guruprasad M. Hegde