Joint Workshop on Long-Term Visual Localization, Visual Odometry and Geometric and Learning-based SLAM

Workshop Information

  • When: TBD
  • Where: TBD
  • Time: TBA
  • Schedule
    • TBA

Workshop Description

Visual Localization is the problem of estimating the position and orientation, i.e., the camera pose, from which an image was taken. Long-Term Visual Localization is the problem of doing so robustly under changes in the scene. Simultaneous Localization and Mapping (SLAM) is the problem of tracking the motion of a camera (or sensor system) while simultaneously building a (3D) map of the scene. Similarly, Visual Odometry (VO) algorithms track the motion of a sensor system without necessarily creating a map of the scene. Localization, SLAM, and VO are highly related problems: SLAM algorithms can construct maps that are later used by Localization techniques; Localization approaches can detect loop closures in SLAM; and SLAM / VO can integrate frame-to-frame tracking into real-time Localization approaches.
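To make the camera-pose definition above concrete, the following is a minimal sketch of the pinhole projection model that Visual Localization inverts: given camera intrinsics K and a pose (R, t), a 3D world point projects to a 2D pixel. All numeric values here are illustrative, not from any benchmark.

```python
import numpy as np

def project(K, R, t, X_world):
    """Project 3D world points into the image using the pinhole model.
    Visual Localization solves the inverse problem: recovering the
    camera pose (R, t) from known 3D points and their 2D projections."""
    X_cam = (R @ X_world.T).T + t   # transform world -> camera frame
    x = (K @ X_cam.T).T             # apply camera intrinsics
    return x[:, :2] / x[:, 2:3]     # perspective division

# Illustrative values: identity rotation, camera 5 units from the scene
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])
X = np.array([[0.0, 0.0, 0.0],   # point on the optical axis
              [1.0, 0.0, 0.0]])  # point offset 1 unit in x

print(project(K, R, t, X))  # first point lands on the principal point (320, 240)
```

In practice, pose estimation from such 2D-3D correspondences is solved with PnP solvers inside a RANSAC loop, typically on top of local feature matches against a prebuilt 3D map.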

Visual Localization, SLAM, and VO are fundamental capabilities required in many Computer Vision and Robotics applications: Augmented / Mixed / Virtual Reality, emerging applications based on location context (such as scene understanding, city navigation, and tourist recommendation), and autonomous systems such as self-driving cars and other robots. Consequently, Visual Localization, SLAM, and VO are important research areas in both academia and industry. This workshop focuses on the following topics in the context of Localization, SLAM, and VO:

  • Common to existing approaches to the Visual Localization problem, whether they rely on local features or CNNs, is that they generate a representation of the scene from a set of training images. These approaches (implicitly) assume that the set of training images covers all relevant viewing conditions. In practice, this assumption is typically violated, as it is nearly impossible to capture complex scenes under the full range of viewing conditions. Moreover, many scenes are dynamic: their geometry and appearance change significantly over time, e.g., due to seasonal changes in outdoor scenes or rearranged furniture in indoor scenes. This workshop aims to benchmark the current state of visual localization under changing conditions and to encourage new work on this challenging problem.
  • We have seen impressive progress on Visual SLAM (V-SLAM) with both geometric and learning-based methods. However, none of these methods is robust enough for high-reliability robotics, where challenging conditions such as changing or absent illumination, dynamic objects, and texture-less scenes occur and no other sources of odometry are available. Unfortunately, popular benchmarks such as KITTI or TUM RGB-D SLAM are comparatively clean and simple: they have rather restricted motion patterns, usually cover only one type of scene (e.g., urban street or indoor), and are often free of degrading effects such as lighting changes and motion blur. This workshop puts forth a challenge to gather evidence on the robustness of geometric and learning-based SLAM in challenging situations and to push geometric and learning-based SLAM towards real-world applications. To this end, the workshop provides a new benchmark with large, high-quality, and diverse data and accurate labels.
  • The proliferation of smartphones and cameras is also making visual odometry accessible to everyday users. With increasing effort devoted to accurately computing position information, emerging applications based on location context, such as scene understanding, city navigation, and tourist recommendation, have grown significantly. Location information provides rich context that facilitates a number of challenging problems, such as landmark and traffic-sign recognition under varying weather and lighting conditions, as well as location-based entertainment applications such as Pokémon Go. This workshop solicits scalable algorithms and systems that address the ever-increasing demand for accurate, real-time visual odometry, as well as methods and applications based on location cues.

Besides offering concrete challenges, the workshop features invited talks by experts from both academia and industry, providing a detailed view of the current state of Visual Localization, SLAM, and VO algorithms, as well as of open problems and remaining challenges. In addition, the workshop solicits original paper submissions.


This workshop provides challenges on SLAM and Long-Term Visual Localization. More details can be found on the pages describing the challenges.

Topics Covered

This workshop covers a wide range of topics, including, but not limited to:

  • Long-Term Operation of Localization and Mapping
  • Geometric Methods for SLAM in Dynamic Environments
  • Hybrid (Learning + Geometry) SLAM Systems
  • Semantic-context applied to SLAM, VO, and Visual Localization
  • Applications of SLAM, VO, and Visual Localization in challenging domains
  • SLAM / VO / Visual Localization Datasets, Benchmarks, and Metrics
  • (6DOF) Visual Localization
  • Place Recognition
  • Image Retrieval
  • (Deep Learned) Local Features and Matching
  • Deep Learning for Scene Coordinate Regression and Camera Pose Regression
  • 3D Reconstruction for Mapping
  • Augmented / Mixed / Virtual Reality applications based on Visual Localization, SLAM, or VO
  • Applications based on Visual Localization, SLAM, or VO in the area of Robotics and Autonomous Driving
  • Semantic Scene Understanding for Localization and Mapping
  • Simultaneous Localization and Mapping
  • (3D) (Semantic) Scene Understanding and Scene Representations
  • Image-based localization and navigation
  • Monocular and Stereo Visual Odometry
  • Multi-Modal Visual Sensor Data Fusion
  • Real-Time Object Tracking
  • Deep Learning for Visual Odometry and SLAM
  • Large-Scale SLAM
  • Rendering and Visualization of Large-Scale Models
  • Feature Representation, Indexing, Storage, and Analysis
  • Object Detection and Recognition based on Location Context
  • Landmark Mining and Tourism Recommendation
  • Video Surveillance
  • Large-Scale Multi-Modal Datasets Collection
  • Visual Odometry for Night Vision
  • Odometry based on Event Cameras
  • Scale Estimation for Monocular Odometry with Prior Information
  • End-to-End Visual Odometry, SLAM and Localization

For additional details, please also see the Call for Papers.

Important Dates

Localization Challenges:

  • Challenges Open: Feb. 8th
  • Submission deadline: May 1st (papers describing the methods need to be available, e.g., on arXiv, by May 8th)
  • Notification: May 15th

SLAM Challenges:

  • Submission deadline: TBD
  • Notification: TBD

Paper submission:

  • Paper submission deadline: TBD
  • Author notification: TBD