Schedule

Program

Date: June 9th, 2019
Duration: 09:00 - 14:30

Venue: MINES ParisTech, Paris

Registration: Registration can be done on the IV 2019 website (link). Workshop-only registration is 200 € and full conference registration is 840 €. The complete conference program is available here (link).

Program :

  • 09:00 - 09:10 Introduction
  • Invited Talks
    • 09:10 - 09:45 : Virtual worlds for 3D Object Detection, Antonio M. López (Computer Vision Center - CVC-UAB, Spain) [slides]
    • 09:45 - 10:15 : HD Maps from LiDAR Mobile Mapping Systems and Paris-Lille-3D dataset, Jean-Emmanuel Deschaud (CAOR, MINES ParisTech, France) [slides]
    • 10:15 - 10:45 : Seeing Through Fog and Snow: Sensor Fusion & Dense Depth in Adverse Weather, Felix Heide (CTO Algolux | Incoming Professor, Princeton University)
  • 10:45 - 11:15 Coffee Break
  • Paper presentations
    • 11:15 - 11:35 : Vehicle Detection based on Deep Learning Heatmap Estimation, Ruddy Theodose, Dieumet Denis, Christophe Blanc, Thierry Chateau, Paul Checchin
    • 11:35 - 11:55 : End-to-End 3D-PointCloud Semantic Segmentation for Autonomous Driving, Mohammed Abdou, Mahmoud Elkhateeb, Ibrahim Sobh, Ahmad Al Sallab
  • 12:00 - 13:00 Lunch Break / Poster presentations
    • Point cloud density impact on performance in 3D object detection, Antoine Gauthier (AKKA Technologies), Victor Talpaert (ENSTA ParisTech) [Poster]
    • Representations for 3D reconstruction and modeling on lidar point clouds (tentative title), Luis Roldao et al., INRIA Paris [Poster]
    • Normal estimation from spherical images as feature for point cloud segmentation, Leonardo Gigli, CMM, Mines ParisTech [Poster]
  • Invited Talks
    • 13:00 - 13:30 : From big scale point cloud registration to globally available HD Maps, Blazej Kubiak (Expert Software Engineer at TomTom Poland)
    • 13:30 - 14:00 : Predicting and Using Monocular Depth for Deep Driving, Adrien Gaidon (Machine Learning Lead at Toyota Research Institute, California) [Slides]
    • 14:00 - 14:30 : Lidar for reconstruction, recognition and understanding: the open source tools, Bastien Jacquet (Computer Vision Group Leader at Kitware SAS, Lyon, France)
  • 14:30 : Closing

Invited Speakers

Computer Vision Center - CVC-UAB, Spain

Speaker-Bio : Antonio M. López is Associate Professor (Tenure) at the Computer Science Department of the Universitat Autònoma de Barcelona (UAB). He is also a founding member of the Computer Vision Center (CVC) at the UAB, where he created the group on ADAS and Autonomous Driving in 2002 and has been its Principal Investigator since. Antonio is a founding member and co-organizer of consolidated international workshops such as “Computer Vision in Vehicle Technology (CVVT)” and “Transferring and Adapting Source Knowledge in Computer Vision (TASK-CV)”. He has also collaborated in numerous research projects with international companies, especially from the automotive sector. His work in the last 10 years has focused on the use of Computer Graphics, Computer Vision, Machine (Deep) Learning, and Domain Adaptation to train and test onboard AI drivers. He is the Principal Investigator at CVC/UAB of well-known projects such as the SYNTHIA dataset and the CARLA simulator, the latter in close collaboration with the Intelligent Systems Lab at Intel and the Toyota Research Institute (TRI). In addition, Antonio is currently focusing on topics such as Active Learning, Lifelong Learning, and Augmented Reality in the context of Autonomous Driving.

Virtual worlds for 3D Object Detection

Training accurate 3D perception algorithms is a challenging task due to the vast amount of annotated data needed. Virtual worlds are gaining popularity for this purpose, as they allow extensive generation of realistic 3D ground truth as well as simulation of vehicle-environment interactions. In this talk we will present our research on the use of synthesized LiDAR data (geometry and semantics) for 3D object detection, monocular depth estimation, and end-to-end driving.
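As a loose illustration of what "virtual-world ground truth" means in practice (this sketch is not from the talk, nor from CARLA or SYNTHIA), the Python snippet below ray-casts a toy 16-beam LiDAR into a hand-built scene consisting of a ground plane and one box, producing 3D points that come with free, perfect semantic labels. All scene and sensor parameters are made up for the example.

```python
# Toy illustration (not from the talk): LiDAR-style ground truth by ray casting
# into a hand-built "virtual world" made of a ground plane and one box.
import numpy as np

def ray_plane(origin, direction, z=0.0):
    """Distance along the ray to the horizontal plane z = const (inf if none)."""
    if abs(direction[2]) < 1e-9:
        return np.inf
    t = (z - origin[2]) / direction[2]
    return t if t > 0 else np.inf

def ray_aabb(origin, direction, box_min, box_max):
    """Slab test: distance along the ray to an axis-aligned box (inf if missed)."""
    inv = 1.0 / np.where(np.abs(direction) < 1e-12, 1e-12, direction)
    t0 = (box_min - origin) * inv
    t1 = (box_max - origin) * inv
    t_near = np.max(np.minimum(t0, t1))
    t_far = np.min(np.maximum(t0, t1))
    return t_near if (t_near <= t_far and t_near > 0) else np.inf

# A hypothetical 16-beam sensor at 1.8 m height scanning 360 degrees.
origin = np.array([0.0, 0.0, 1.8])
box_min, box_max = np.array([8.0, -1.0, 0.0]), np.array([12.0, 1.0, 1.5])  # a "car"

points, labels = [], []
for elev in np.deg2rad(np.linspace(-15, 1, 16)):
    for azim in np.deg2rad(np.arange(0, 360, 0.5)):
        d = np.array([np.cos(elev) * np.cos(azim),
                      np.cos(elev) * np.sin(azim),
                      np.sin(elev)])
        t_ground = ray_plane(origin, d)
        t_box = ray_aabb(origin, d, box_min, box_max)
        t = min(t_ground, t_box)
        if np.isfinite(t) and t < 100.0:              # 100 m max range
            points.append(origin + t * d)
            labels.append("vehicle" if t_box < t_ground else "ground")

points = np.asarray(points)
print(points.shape, labels.count("vehicle"), "vehicle hits")
```

Even at this toy scale the appeal of the synthetic route is visible: every point's label and exact range are known by construction, whereas annotating the same quantities on real scans is expensive.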


Center for Robotics - CAOR, MINES ParisTech, France

Speaker-Bio : An engineer from Telecom ParisTech and a graduate of the Master MVA (Mathematics, Vision and Learning) of ENS Paris-Saclay, J.-E. Deschaud did his thesis (2007-2010) at MINES ParisTech on the processing of 3D point clouds from Mobile Mapping Systems. After a year of post-doc at Carnegie Mellon University on the subject of LiDAR data simulation, he returned to MINES ParisTech as an Assistant Professor working on 3D point clouds and modeling. J.-E. Deschaud has been an Associate Professor at the Robotics Center of MINES ParisTech since 2016 and is interested in LiDAR SLAM, 3D data analysis, and the creation of 3D maps for the autonomous vehicle.

HD Maps from LiDAR Mobile Mapping Systems and Paris-Lille-3D dataset

The precise map (also called HD map) is recognized as essential to the autonomous vehicle problem. We will see how this map can be produced at large scale from mobile mapping systems and what the current limitations of this process are. First, we will present the essential elements of the HD map. We will then see how the production of 3D point clouds from LiDAR-based mobile systems addresses the localization problem for autonomous vehicles, and how the same data could be used for the perception part. We will also present our work on creating a dataset, called Paris-Lille-3D, to improve the essential step of classifying 3D point clouds.
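To make the localization side concrete, here is a generic sketch (not the speaker's pipeline) of the building block behind map-relative localization: point-to-point ICP, which rigidly aligns a live LiDAR scan to a previously built HD map point cloud. The scene and initial pose error below are synthetic stand-ins.

```python
# Generic sketch (not the speaker's method): point-to-point ICP, the basic block
# of localizing a live LiDAR scan against a pre-built HD map point cloud.
import numpy as np
from scipy.spatial import cKDTree

def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src_i + t ~ dst_i (Kabsch / SVD)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:            # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, c_dst - R @ c_src

def icp(scan, map_cloud, iters=20):
    """Iteratively align `scan` to `map_cloud`; returns the accumulated pose (R, t)."""
    tree = cKDTree(map_cloud)
    R_acc, t_acc = np.eye(3), np.zeros(3)
    current = scan.copy()
    for _ in range(iters):
        _, idx = tree.query(current)               # nearest map point per scan point
        R, t = best_rigid_transform(current, map_cloud[idx])
        current = current @ R.T + t
        R_acc, t_acc = R @ R_acc, R @ t_acc + t
    return R_acc, t_acc

# Synthetic check: a map that is a flat grid of points, and a scan that is the same
# grid offset by a small pose error, as after a rough GNSS initialization.
xs, ys = np.meshgrid(np.arange(0.0, 20.0, 0.5), np.arange(0.0, 20.0, 0.5))
map_cloud = np.stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)], axis=-1)
scan = map_cloud - np.array([0.12, -0.08, 0.0])    # true correction is +0.12, -0.08
R_est, t_est = icp(scan, map_cloud)
print(np.round(t_est, 3))                          # ~ [0.12, -0.08, 0.]
```

In a real system this runs inside a robust estimator with motion prediction, outlier rejection, and point-to-plane metrics, but the core loop of "match nearest map points, solve for the rigid pose, repeat" is the same.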

CTO at Algolux | Incoming Professor at Princeton University

Felix Heide has co-authored over 50 publications and filed 6 patents. He received his Ph.D. from the University of British Columbia under the advisement of Professor Wolfgang Heidrich. He obtained his MSc from the University of Siegen, and was a postdoc at Stanford University. His doctoral dissertation won the Alain Fournier Ph.D. Dissertation Award and the SIGGRAPH outstanding doctoral dissertation award.

Seeing Through Fog and Snow: Sensor Fusion and Dense Depth in Adverse Weather

Color and lidar data play a critical role in object detection for autonomous vehicles, which base their decision making on these inputs. While existing perception stacks exploit redundant and complementary information under good imaging conditions, they fail to do so in adverse weather and in imaging conditions where the sensory streams can be distorted. These rare “edge case” scenarios are not represented in available datasets, and existing sensor stacks and fusion methods are not designed to handle them. This talk describes a deep fusion approach for robust detection and depth reconstruction in fog and snow without labeled training data for these scenarios. For robust object detection in adverse weather, departing from proposal-level fusion, we propose a real-time single-shot model that adaptively fuses features, driven by a generic measurement uncertainty signal that is unseen during training. To provide accurate depth information in these scenarios, we do not rely on scanning lidar; instead, we demonstrate that it is possible to turn a low-cost CMOS gated imager into a dense depth camera with at least 80 m range by learning depth from three gated images. We validate the proposed method in simulation and on unseen real-world data acquired over 4,000 km of driving in northern Europe.
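To give a feel for the general idea of uncertainty-driven fusion (this is a deliberately simplified numpy sketch, not the authors' network, and the uncertainty maps here are hand-set rather than estimated), the snippet below mixes camera and lidar feature maps with per-pixel precision weights so that a degraded stream contributes less where it is unreliable.

```python
# Highly simplified sketch (not the authors' architecture): fusing camera and lidar
# feature maps with weights driven by a per-pixel measurement-uncertainty signal,
# so that a degraded stream (e.g. lidar in dense fog) contributes less.
import numpy as np

def uncertainty_weighted_fusion(feat_cam, feat_lidar, sigma_cam, sigma_lidar):
    """
    feat_*:  (C, H, W) feature maps from each sensor branch.
    sigma_*: (H, W) per-pixel uncertainty estimates (higher = less trustworthy).
    Returns a fused (C, H, W) map using inverse-uncertainty (precision) weights.
    """
    w_cam = 1.0 / (sigma_cam + 1e-6)
    w_lidar = 1.0 / (sigma_lidar + 1e-6)
    norm = w_cam + w_lidar
    w_cam, w_lidar = w_cam / norm, w_lidar / norm    # weights sum to 1 per pixel
    return feat_cam * w_cam[None] + feat_lidar * w_lidar[None]

# Example: clear-weather pixels trust both sensors; "foggy" pixels down-weight lidar.
C, H, W = 8, 4, 6
feat_cam = np.random.rand(C, H, W)
feat_lidar = np.random.rand(C, H, W)
sigma_cam = np.full((H, W), 0.2)
sigma_lidar = np.full((H, W), 0.2)
sigma_lidar[:, 3:] = 5.0          # right half of the image: lidar heavily degraded
fused = uncertainty_weighted_fusion(feat_cam, feat_lidar, sigma_cam, sigma_lidar)
print(fused.shape)
```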

Expert Software Engineer at TomTom, Lodz, Poland

Blazej Kubiak is an enthusiast of autonomous driving and of all the technologies that bring this enthusiasm from dream into reality. He works at TomTom on global-scale HD Maps and on the technologies that make this creation process efficient: Deep Learning, SLAM techniques, and point cloud and image processing. Blazej is the author of a patent on the Road DNA concept, a centimeter-level-accuracy localization technology.

From big scale point cloud registration to globally available HD Maps

HD maps have become part of the autonomous driving ecosystem and play an important role in this world. We see some companies creating high-quality HD maps and other companies using those maps to perform successful drives in fully autonomous mode. However, when we think about moving from test drives to day-to-day use of production-ready self-driving cars, we must figure out how to create HD maps at global scale and how to maintain them and keep them up to date. In this talk we will discuss the challenges that arise when moving from preparing a small HD map sample to a global-scale map creation process, and the techniques, such as deep learning and SLAM, that help reach these challenging goals.

Machine Learning Lead at Toyota Research Institute, California, USA

Speaker-Bio : Adrien Gaidon is the Manager of the Machine Learning team and a Senior Research Scientist at the Toyota Research Institute (TRI) in Los Altos, CA, USA, working on open problems in world-scale learning for autonomous driving. He received his PhD from Microsoft Research - Inria Paris in 2012 and has over a decade of experience in academic and industrial Computer Vision, with over 30 publications, top entries in international Computer Vision competitions, multiple best reviewer awards, and international press coverage for his work on Deep Learning with simulation; he was also a guest editor for the International Journal of Computer Vision. You can find him on LinkedIn (https://www.linkedin.com/in/adrien-gaidon-63ab2358/) and Twitter (@adnothing).

Predicting and Using Monocular Depth for Deep Driving

Recent advances in deep learning for monocular depth prediction open up a breadth of new uses for cameras in automated driving. First, we will discuss a new model, called SuperDepth, that uses super-resolution and self-supervised learning to achieve state-of-the-art monocular depth. Second, we will show that these predictions are indeed useful as a prior for a new method, called ROI-10D, for 3D object detection and metric shape retrieval from a single image, vastly improving performance on the standard KITTI benchmark.
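As background on why predicted monocular depth is such a useful geometric prior (this is a generic sketch, not SuperDepth or ROI-10D themselves), a dense depth map plus pinhole camera intrinsics is enough to lift every pixel into a 3D "pseudo point cloud" that downstream 3D reasoning can build on. The intrinsics below are made-up, roughly KITTI-like values.

```python
# Generic sketch (not the talk's methods): back-projecting a predicted monocular
# depth map into a camera-frame 3D point cloud using pinhole intrinsics.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """depth: (H, W) metric depth in meters. Returns (H*W, 3) camera-frame points."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))   # pixel coordinates
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Toy example: made-up KITTI-like intrinsics and a flat 10 m depth plane.
depth = np.full((375, 1242), 10.0)
points = depth_to_points(depth, fx=721.5, fy=721.5, cx=609.6, cy=172.9)
print(points.shape)        # (465750, 3)
```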


Computer Vision Group Leader at Kitware SAS

Bastien Jacquet leads Kitware's European Computer Vision team, which provides professional services for building open source software tools in computer vision.

The team is specialized in Image, Video and Point Cloud processing, Deep Learning, and SLAM for RGB cameras, depth cameras, and Lidar. In 2018 the team published an end-to-end pipeline for 3D reconstruction and classification from satellite imagery under the IARPA program "Creation of Operationally Realistic 3D Environment" (CORE3D): https://blog.kitware.com/3d-reconstruction-from-satellite-images/

His team recently published its open source Lidar-based SLAM, shipped within the VeloView codebase: https://github.com/Kitware/VeloView

Bastien Jacquet is passionate about 3D reconstruction and scene understanding, as well as generally enabling computers to understand the world through pictures, videos or Lidars.

Bastien Jacquet received his Ph.D. in Computer Vision from ETH Zurich (Switzerland), under Marc Pollefeys' supervision. His Ph.D. research investigated different aspects of 3D reconstruction and analysis of multi-body scenes from images. He published several papers in top CV conferences (CVPR, ICCV, ECCV, 3DV, ...).

He obtained an Engineering degree from both École Polytechnique Paris and École des Ponts ParisTech (France), as well as an MSc from the Master MVA (Mathematics, Vision and Learning) of ENS Paris-Saclay (France).

You can find him and several demo videos on LinkedIn.

Lidar for reconstruction, recognition and understanding: the open source tools

Lidars, from cheap ones to expensive ones, are a key component of autonomous vehicle development. They are either used once to acquire ground-truth data or integrated live in each vehicle.

We will discuss how such active sensors can efficiently mitigate GNSS/INS-denied situations, from obscure underground parking garages to urban corridors or cases of active signal scrambling. Simultaneous Localisation and Mapping (SLAM) techniques need some adaptation for such sparse data, and they also enable mapping without an expensive GNSS/INS.

Lidars often come in numbers and are efficiently coupled with other sensors, and our experience shows that calibrating such a multi-sensor setup is hard for end users. We will discuss our self-calibration algorithms, which compute both the spatial and temporal shift between intrinsically different sensors (IMU, GPS, radar, Lidars, cameras).
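To illustrate just the temporal half of that problem (this is a minimal sketch of one common approach, not Kitware's algorithm), one can cross-correlate a quantity that two sensors both observe, for example the yaw rate measured by an IMU and the yaw rate derived from lidar odometry, after resampling both onto a common clock. All signal names and numbers below are synthetic.

```python
# Minimal sketch (not Kitware's algorithm): estimate the temporal shift between two
# sensors by cross-correlating a signal both observe (here: yaw rate), assuming both
# streams have already been resampled to the same rate.
import numpy as np

def estimate_time_offset(reference, delayed, dt):
    """Return the delay (seconds) of `delayed` relative to `reference`."""
    a = reference - reference.mean()
    b = delayed - delayed.mean()
    corr = np.correlate(b, a, mode="full")     # correlation score for every shift
    lag = np.argmax(corr) - (len(a) - 1)       # samples by which `delayed` lags
    return lag * dt

# Synthetic check: the same yaw-rate profile, the second copy delayed by 0.15 s.
dt = 0.01                                      # 100 Hz common sampling
t = np.arange(0, 20, dt)
yaw_rate_imu = np.sin(0.5 * t) + 0.05 * np.random.randn(t.size)
yaw_rate_lidar = np.interp(t - 0.15, t, np.sin(0.5 * t)) + 0.05 * np.random.randn(t.size)
print(estimate_time_offset(yaw_rate_imu, yaw_rate_lidar, dt))   # ~ 0.15
```

The spatial part of the calibration (relative rotations and translations between sensors) requires richer trajectory or feature constraints, but the same principle of aligning co-observed quantities applies.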

We will briefly discuss how having such calibration, fusion, and processing tools in open source software (LidarView), with an integrated Python sandbox, can facilitate further processing, such as combining Deep Learning techniques from the heavily investigated 2D, 2.5D, 3D, and 3D+t settings with more recent point-cloud-based networks for enhanced detection and classification pipelines.