We conduct experiments on the TuSimple velocity dataset (10) and the KITTI 3D object detection and tracking datasets (20) to evaluate the performance of our proposed method.
KITTI: Depth is predicted on the 2017 KITTI 3D Object Detection dataset, which is divided into 3707 training and 3769 validation images, each displaying at least one car. The dataset contains object annotations in the form of 3D bounding boxes. The training set has over 14000 cars with an average length of 3.88 m, height of 1.52 m, and width of 1.62 m. Velocity is predicted on the KITTI tracking dataset, which has 21 training sequences and 29 validation sequences benchmarked for 8 classes. We consider only fully visible (non-truncated) cars. Figure 3 shows how the KITTI dataset is processed.
Figure 3
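As an illustration of the filtering step described above, the following minimal sketch selects fully visible cars from annotation lines in the standard KITTI object label layout (type, truncated, occluded, alpha, 2D box, dimensions, location, rotation). Treating "fully visible (non-truncated)" as `truncated == 0` and `occluded == 0` is our assumption here, and the names `KittiObject`, `parse_kitti_label_line`, and `select_visible_cars` are hypothetical helpers, not part of any released code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class KittiObject:
    type: str
    truncated: float  # 0.0 (fully inside the image) to 1.0
    occluded: int     # 0 means fully visible in the KITTI convention
    height: float     # 3D box dimensions in meters
    width: float
    length: float
    x: float          # 3D location in camera coordinates, in meters
    y: float
    z: float          # forward distance from the camera


def parse_kitti_label_line(line: str) -> KittiObject:
    """Parse one line of a KITTI object-detection label file.

    Note: tracking label files prepend a frame index and a track id,
    so the field offsets below would need to shift by two for them.
    """
    f = line.split()
    return KittiObject(
        type=f[0],
        truncated=float(f[1]),
        occluded=int(f[2]),
        height=float(f[8]),
        width=float(f[9]),
        length=float(f[10]),
        x=float(f[11]),
        y=float(f[12]),
        z=float(f[13]),
    )


def select_visible_cars(lines: Iterable[str]) -> List[KittiObject]:
    """Keep cars that are neither truncated nor occluded (assumed criterion)."""
    objects = (parse_kitti_label_line(l) for l in lines if l.strip())
    return [
        o for o in objects
        if o.type == "Car" and o.truncated == 0.0 and o.occluded == 0
    ]
```

The `z` coordinate of each retained object can then serve as a depth target, and the `length`, `height`, and `width` fields correspond to the dimension statistics quoted above.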
TuSimple Velocity Challenge Dataset: This dataset provides 1074 two-second video clips. The clips are captured at 20 fps, so each is 40 frames long. Designated vehicles, at relative distances ranging from 5 m to 90 m, are annotated with bounding boxes on the last frame, along with ground-truth velocity and position generated by range sensors.
Since the bounding box for a given vehicle is defined only in the last frame, the vehicle must be tracked over the temporal extent of the input for all further processing. We evaluated several trackers and settled on the fast, lightweight Median Flow tracker. It precisely localizes the object outline and operates at the pixel level. The tracker provides a tight bounding box over the entire trajectory in the video, as shown in Fig. 4, which is crucial for estimating the relative velocity of the objects. The tracker is initialized on the last frame and run backward to the first frame. This is done separately for each vehicle when multiple vehicles are present in the same video.
Figure 4: The first image shows the ground truth, i.e., the bounding box on the vehicle in the last frame (frame 40). The next two images are randomly chosen frames from frames 1-39, with bounding boxes generated by the Median Flow tracker initialized on frame 40.
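The backward tracking procedure described above can be sketched as follows. This is a minimal illustration, not our exact implementation: it assumes a tracker object exposing `init(frame, box)` and `update(frame) -> (ok, box)`, which is the interface OpenCV trackers follow, and the function names `track_backward` and `track_all_vehicles` are hypothetical. Stopping at the first tracking failure is also an assumption made for simplicity.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height) in pixels


def track_backward(
    frames: Sequence[Any],
    last_box: Box,
    make_tracker: Callable[[], Any],
) -> List[Optional[Box]]:
    """Propagate the annotated last-frame box back to the first frame.

    `make_tracker` returns a fresh tracker with `init` and `update`
    methods. With an OpenCV build that ships the legacy tracking module
    (an assumption about the environment), this could be
    `cv2.legacy.TrackerMedianFlow_create`.
    Returns one box per frame, or None where tracking was lost.
    """
    if not frames:
        return []
    boxes: List[Optional[Box]] = [None] * len(frames)
    tracker = make_tracker()
    tracker.init(frames[-1], last_box)  # initialize on the annotated frame
    boxes[-1] = tuple(last_box)
    # Walk from the second-to-last frame down to the first frame.
    for i in range(len(frames) - 2, -1, -1):
        ok, box = tracker.update(frames[i])
        if not ok:
            break  # earlier frames stay None once the target is lost
        boxes[i] = tuple(box)
    return boxes


def track_all_vehicles(
    frames: Sequence[Any],
    last_boxes: Sequence[Box],
    make_tracker: Callable[[], Any],
) -> List[List[Optional[Box]]]:
    """Run an independent backward tracker for each annotated vehicle."""
    return [track_backward(frames, b, make_tracker) for b in last_boxes]
```

Using a separate tracker instance per vehicle keeps the trajectories independent, so a failure on one vehicle does not affect the others in the same clip.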