Datasets tailored to time-ordered online training for object detection, in which both the training and test sequences are videos that may or may not contain a target object, are few and far between. Here we provide the two datasets used in our paper, PersonFinder and CarChaser.

Both datasets were recorded at Camp Roberts, CA, during the Joint Interagency Field Experimentation (JIFX) events. Image regions showing people who were not involved in the experiment were blurred for privacy reasons.

Each dataset contains a training set and a test set, both created from a video stream. The test set was pared down to contain equal numbers of positive and negative examples, so that binary accuracy can be used directly as a metric.
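
To illustrate why the balanced test set makes binary accuracy easy to interpret: with equal numbers of positives and negatives, a detector that always outputs the same answer scores exactly 0.5, so anything above that reflects real discrimination. The following is a minimal sketch only; the `binary_accuracy` helper and the label lists are made up for illustration and are not part of the datasets or the paper's code.

```python
def binary_accuracy(labels, predictions):
    """Return the fraction of predictions that match the labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Made-up balanced labels (three of each class), for illustration only.
balanced_labels = [0, 1, 0, 1, 1, 0]

# Two trivial "detectors" that ignore the image entirely.
always_one = [1] * len(balanced_labels)
always_zero = [0] * len(balanced_labels)

print(binary_accuracy(balanced_labels, always_one))
print(binary_accuracy(balanced_labels, always_zero))
```

On an unbalanced set, one of these constant predictors would score well above 0.5, which is why accuracy alone would be harder to read there.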


Each ZIP file contains two directories and two text files. Each text file serves as an index for the images in the corresponding directory.

Each text file contains one line per image, and each line is formatted as follows:

<filepath> <label> <x coordinate> <y coordinate>


  • label is 1 when there is no target in the scene, and 0 otherwise.
  • The x and y coordinates give the (human-tagged) centroid of the target, with (0,0) at the top left-hand corner of the image.
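
The line format and label convention described above could be read with a short Python sketch like the one below. It assumes the four fields are separated by whitespace and that the coordinates are written as plain numbers; how the coordinates are filled in for images without a target is not specified here, so the sketch does not interpret them in that case. The names `IndexEntry`, `parse_index_line`, `load_index`, and `has_target` are hypothetical helpers, not part of the released files.

```python
from typing import NamedTuple


class IndexEntry(NamedTuple):
    filepath: str
    label: int
    x: float
    y: float


def parse_index_line(line):
    """Parse one '<filepath> <label> <x coordinate> <y coordinate>' line."""
    # Split from the right so a filepath containing spaces stays intact.
    filepath, label, x, y = line.strip().rsplit(None, 3)
    return IndexEntry(filepath, int(label), float(x), float(y))


def load_index(path):
    """Read an index text file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [parse_index_line(line) for line in handle if line.strip()]


def has_target(entry):
    # Per the description above: label 1 means no target, 0 means a target is present.
    return entry.label == 0


# The path and values below are invented purely to show the call.
example = parse_index_line("train/frame_0001.jpg 0 320 240\n")
print(example, has_target(example))
```

Because (0,0) is the top left-hand corner, `x` grows to the right and `y` grows downward, which matches the usual image-array indexing convention.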


You can download the datasets on Google Drive here.