We host challenges to understand the current status of computer vision algorithms in solving the environmental perception problems for autonomous driving. We have prepared a number of large-scale datasets with fine-grained annotations, collected and annotated by Berkeley DeepDrive, nuTonomy, and Didi Chuxing. Based on these datasets, we have defined a set of four realistic problems and encourage participants to invent new algorithms and pipelines for autonomous driving.

Challenge 1: nuScenes 3D Detection

The goal of the nuScenes detection challenge is to use the 360 degree sensor data from camera, lidar and radar to estimate 3D location, size, velocity, and attributes for all objects in the scene. There will be three tracks: LIDAR only, VISION only, and an unrestricted OPEN track.
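As a rough illustration of the quantities listed above, the sketch below models a single predicted object as a small Python record. The class name `Detection3D` and all of its field names are hypothetical, chosen only for this example; they are not the official submission format, which is defined by the challenge organizers.

```python
from dataclasses import dataclass, asdict
from typing import Tuple


@dataclass
class Detection3D:
    """One predicted object. All field names here are illustrative assumptions."""
    center_xyz: Tuple[float, float, float]  # 3D location of the box center, in meters
    size_wlh: Tuple[float, float, float]    # box width, length, height, in meters
    velocity_xy: Tuple[float, float]        # planar velocity, in meters per second
    category: str                           # object class, e.g. "car"
    attribute: str                          # object attribute, e.g. "moving"
    score: float                            # detection confidence in [0, 1]

    def speed(self) -> float:
        """Magnitude of the planar velocity."""
        vx, vy = self.velocity_xy
        return (vx * vx + vy * vy) ** 0.5


# A made-up example detection (values are not taken from any dataset).
example = Detection3D(
    center_xyz=(10.0, -2.5, 0.8),
    size_wlh=(1.9, 4.5, 1.6),
    velocity_xy=(3.0, 4.0),
    category="car",
    attribute="moving",
    score=0.87,
)
record = asdict(example)  # plain dict, convenient for serializing to JSON
```

A flat record like this keeps each estimated quantity (location, size, velocity, attribute) explicit, which makes it straightforward to convert into whatever format the evaluation server actually expects.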

The evaluation server can be found here. For more details and the leaderboard refer to

Challenge 2: BDD100K & D²-City Detection Domain Adaptation

We propose a transfer learning challenge on the object detection task. Given the annotated data from the BDD100K dataset collected in the US, participants are asked to provide object detection results on data from the D²-City dataset collected in China. The data may cover a variety of conditions, including severe or rare ones (e.g., dim light, rain or fog, and traffic congestion). Participants are expected to provide accurate object detection results in all of these challenging situations. More information about the challenge can be found at the BDD Data and D²-City websites.

Challenge 3: D²-City & BDD100K Tracking Domain Adaptation

We propose a transfer learning challenge on the object tracking task. Unlike Challenge 2, participants are required to use the annotated data from D²-City collected in China and provide results on BDD100K collected in the US. Participants are expected to provide accurate tracking results. More information about the challenge can be found at the BDD Data and D²-City websites.

Teaser Challenge: D²-City Large-scale Detection Interpolation

This teaser challenge is designed to reflect the practical applications of object detection, interpolation, tracking, and domain adaptation. Participants will be provided with videos that have object detection annotations on some keyframes and are required to provide detection results on the other video frames. Participants are encouraged to use the annotated video sequences from BDD100K and/or any other publicly available datasets, and may make manual corrections to improve the results. To encourage participants to use algorithms and manpower efficiently, the test videos will be released in stages. More information about the challenge can be found at the D²-City website.
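To make the interpolation idea concrete, here is a minimal sketch of a naive baseline: linearly interpolating one object's 2D bounding box between two annotated keyframes. This is not the challenge's prescribed method. The function names, the `(x1, y1, x2, y2)` pixel box layout, and the assumption that the same object has already been matched across both keyframes are all introduced here for illustration.

```python
from typing import Dict, Tuple

# Axis-aligned box as (x1, y1, x2, y2) in pixels; this layout is an assumption.
Box = Tuple[float, float, float, float]


def interpolate_box(box_a: Box, box_b: Box, t: float) -> Box:
    """Blend two boxes coordinate by coordinate; t=0 gives box_a, t=1 gives box_b."""
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


def fill_between_keyframes(
    frame_a: int, box_a: Box, frame_b: int, box_b: Box
) -> Dict[int, Box]:
    """Return interpolated boxes for the frames strictly between two keyframes.

    Assumes box_a and box_b describe the same object (already matched).
    """
    if frame_b <= frame_a:
        raise ValueError("frame_b must come after frame_a")
    span = frame_b - frame_a
    return {
        frame: interpolate_box(box_a, box_b, (frame - frame_a) / span)
        for frame in range(frame_a + 1, frame_b)
    }


# Made-up keyframe annotations at frames 0 and 4.
filled = fill_between_keyframes(0, (0.0, 0.0, 10.0, 10.0), 4, (8.0, 4.0, 18.0, 14.0))
```

Linear interpolation assumes roughly constant motion between keyframes, so it tends to drift when objects turn, accelerate, or become occluded; a detector or tracker run on the intermediate frames, or the manual corrections mentioned above, would be needed to refine such results.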


D²-City Dataset from Didi Chuxing

D²-City is a large-scale driving video dataset, providing 10K videos recorded in HD 720P or FHD 1080P from dash cameras. Of these, 1K videos come with tracking annotations for all road objects, including the bounding box and tracking ID of each car, van, bus, truck, pedestrian, motorcycle, bicycle, open- and closed-tricycle, forklift, and large- and small-block. The remaining videos come with road object annotations on keyframes.

Compared with existing datasets, D²-City offers a high degree of diversity, as it is collected from several cities in China under various weather, road, and traffic conditions. D²-City pays special attention to challenging cases such as extreme weather conditions and complicated traffic scenes. By bringing more challenging cases to the community, we hope D²-City will encourage and inspire new progress in perception models for autonomous vehicles.

BDD100K Dataset from Berkeley DeepDrive

The BDD100K dataset is a large collection of 100K driving videos with diverse scene types and weather conditions. Along with the video data, we also released annotations at different levels on 100K keyframes, including image tagging, object detection, instance segmentation, drivable area, and lane marking. In 2018, the challenges hosted at CVPR 2018 and AI Challenger 2018 based on BDD data attracted hundreds of teams to compete for the best object recognition and segmentation algorithms for autonomous driving.

nuScenes 3D Detection Dataset from APTIV

nuScenes is a public large-scale dataset for autonomous driving. While previous datasets and challenges [CamVid, Cityscapes, Mapillary Vistas, ApolloScape, Berkeley DeepDrive] have focused primarily on image-based detection and segmentation of objects, nuScenes is truly multi-modal, with 6 cameras, 5 radars and 1 lidar. It includes 1000 driving scenes from Singapore and Boston and covers 1.4M camera images, 400k lidar sweeps and 1.3M radar sweeps. Compared to the KITTI dataset, nuScenes will include 15-20x more keyframes and object annotations. Other datasets also provide vast amounts of sensor data, but without 3D object annotations [Oxford RobotCar, TorontoCity, Berkeley DeepDrive]. Unlike other public datasets, our data is collected from a production-grade self-driving car platform that is approved for autonomous driving on public roads. Furthermore, we provide accurate localization data of the car as well as human-annotated maps that can serve as strong priors for various vision tasks.