KAIST Multispectral Pedestrian Detection Benchmark


    We developed imaging hardware consisting of a color camera, a thermal camera, and a beam splitter to capture aligned multispectral (RGB color + thermal) images. With this hardware, we captured a variety of regular traffic scenes during both day and night to cover changes in lighting conditions.

    The KAIST Multispectral Pedestrian Dataset consists of 95k color-thermal pairs (640x480, 20 Hz) taken from a vehicle. All pairs are manually annotated (person, people, cyclist), for a total of 103,128 dense annotations and 1,182 unique pedestrians. As in the Caltech Pedestrian Dataset, the annotations include temporal correspondence between bounding boxes. More information can be found in our CVPR 2015 paper.
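
    To give a rough sense of scale, the figures quoted above can be turned into an approximate footage length and annotation density. This is only a back-of-the-envelope sketch: it treats the rounded "95k" as exactly 95,000 pairs and assumes every pair was recorded at the stated 20 Hz, and the function and constant names are illustrative rather than part of the dataset tools.

```python
# Back-of-the-envelope figures derived from the numbers quoted on this page.
# Assumption: "95k" is treated as exactly 95,000 pairs, so results are approximate.
FRAME_PAIRS = 95_000      # rounded count of color-thermal pairs (assumed exact here)
FRAME_RATE_HZ = 20        # capture rate stated on the page
ANNOTATIONS = 103_128     # total dense annotations stated on the page


def dataset_summary(frame_pairs=FRAME_PAIRS,
                    rate_hz=FRAME_RATE_HZ,
                    annotations=ANNOTATIONS):
    """Return approximate footage length and annotation density."""
    seconds = frame_pairs / rate_hz
    return {
        "duration_minutes": seconds / 60,
        "annotations_per_pair": annotations / frame_pairs,
    }


if __name__ == "__main__":
    for name, value in dataset_summary().items():
        print(f"{name}: {value:.2f}")
```

    Under these assumptions the dataset amounts to roughly 79 minutes of footage, with a little over one annotation per color-thermal pair on average.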

  • Data Format. (Compatible with Caltech Pedestrian Dataset Format)
  • Extended Video Annotation Tool for Multispectral Images.


    Please contact Soonmin Hwang [smhwang at rcv.kaist.ac.kr] with questions or comments.


    Soonmin Hwang, Jaesik Park, Namil Kim, Yukyung Choi and In So Kweon,
    Multispectral Pedestrian Detection: Benchmark Dataset and Baseline,
    CVPR, 2015. [pdf|Ext. Abstract]

Change Log

    2015.11.09. Bug-fixed code is released.
    2015.06.03. The "Multispectral Pedestrian Detection Benchmark" web page opened.