Learning SURF Cascade for Fast and Accurate Object Detection

Release Date: Apr 26, 2012 2:52:45 AM

This paper presents a novel learning framework for training boosting-cascade-based object detectors from large-scale datasets. The framework is derived from the well-known Viola-Jones (VJ) framework but differs from it in three key respects. First, it adopts multi-dimensional SURF features instead of single-dimensional Haar features to describe local patches, which reduces the number of local patches used from hundreds of thousands to several hundred. Second, it adopts logistic regression as the weak classifier for each local patch instead of the decision trees used in the VJ framework. Third, it adopts the area under the ROC curve (AUC) as a single criterion for the convergence test during cascade training, rather than the two trade-off criteria (false-positive rate and hit rate) of the VJ framework. The benefit is that the false-positive rate can adapt across cascade stages, which makes the SURF cascade converge much faster.
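To make the second and third points concrete, here is a minimal Python sketch of one cascade stage that fits a logistic-regression weak classifier per patch and uses AUC as its only stopping criterion. It is a simplified illustration, not the paper's training procedure: the function names (`auc`, `fit_logistic`, `train_stage`), the gradient-descent fit, the exponential reweighting, the tolerance `tol`, and the synthetic stand-in "patch" features are all assumptions made for this example, and no real SURF descriptors are computed.

```python
import numpy as np

def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half.
    Uses O(P*N) memory, which is fine for a small sketch."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels).astype(bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

def _sigmoid(z):
    # Clip to avoid overflow warnings in exp for extreme inputs.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

def _with_bias(X):
    return np.hstack([X, np.ones((len(X), 1))])

def fit_logistic(X, y, w, steps=300, lr=0.1):
    """Weighted logistic regression fitted by plain gradient descent (assumed solver)."""
    Xb = _with_bias(X)
    beta = np.zeros(Xb.shape[1])
    for _ in range(steps):
        grad = Xb.T @ (w * (_sigmoid(Xb @ beta) - y)) / w.sum()
        beta -= lr * grad
    return beta

def weak_output(beta, X):
    """Map the predicted probability to a score in [-1, 1]."""
    return 2.0 * _sigmoid(_with_bias(X) @ beta) - 1.0

def train_stage(patches, y, max_rounds=10, tol=1e-3):
    """patches: list of (n_samples, dim) feature arrays, one per local patch.
    Each round adds the patch whose weak classifier gives the highest AUC for
    the combined score; the stage stops when the AUC gain drops below tol."""
    y = np.asarray(y, dtype=float)
    w = np.full(len(y), 1.0 / len(y))
    strong = np.zeros(len(y))
    chosen, history = [], []
    for _ in range(max_rounds):
        best = None
        for k, X in enumerate(patches):
            h = weak_output(fit_logistic(X, y, w), X)
            a = auc(strong + h, y)
            if best is None or a > best[0]:
                best = (a, k, h)
        a, k, h = best
        if history and a - history[-1] < tol:
            break  # AUC has converged: single stopping criterion
        chosen.append(k)
        history.append(a)
        strong += h
        w *= np.exp(-(2.0 * y - 1.0) * h)  # up-weight misclassified samples
        w /= w.sum()
    return chosen, history

# Synthetic stand-ins for per-patch descriptors (not real SURF features).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
noise_patch = rng.normal(size=(200, 4))
informative_patch = rng.normal(size=(200, 4)) + 2.0 * y[:, None]
chosen, history = train_stage([noise_patch, informative_patch], y)
```

Here `chosen` lists the selected patch indices in order and `history` holds the stage AUC after each accepted round. Because the stage is judged by AUC alone, the operating point on the ROC curve (and hence the false-positive rate) can be picked per stage afterwards instead of being fixed in advance.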

Taken together, these choices give the proposed approach three useful properties. First, the boosting cascade can be trained very efficiently: experiments show that the proposed approach can train object detectors from billions of negative samples within one hour, even on personal computers. Second, the resulting detector is comparable to state-of-the-art algorithms in both accuracy and processing speed. Third, the resulting detector has a small model size owing to its short cascade stages.


NOTE

1) An FAQ on implementation is available below; it will be updated as more questions come in.

2) The training set for frontal face detection can be downloaded from HERE. The training set for multi-view face detection will be available soon.

If you use our collections for any research purpose, please cite this paper.

3) We recently built a new, very fast face detection system derived from our CVPR paper. In pure detection mode with a single computing thread, it runs faster than real time on HD video: 200+ fps for VGA video, 60 fps for HD (720p) video, and 25 fps for Full-HD (1080p) video. This is about 2x faster than the original SURF Cascade. I have ported it to phones and very low-power devices, and it runs in real time on most current mainstream smartphones for HD (720p) input. A demo for the Windows platform and a simple Win32 SDK are also available below. Feel free to try them in your research work.
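As a rough sanity check on the frame rates quoted in note 3, the short Python sketch below converts them into pixel throughput and per-frame time budgets. The pixel dimensions are the standard ones for each format name (640x480, 1280x720, 1920x1080) and are an assumption here, since the note gives only the format names; "200+ fps" is treated as exactly 200, so the VGA figure is a lower bound.

```python
# (width, height, frames per second); resolutions are assumed standard values.
modes = {
    "VGA": (640, 480, 200),
    "HD 720p": (1280, 720, 60),
    "Full-HD 1080p": (1920, 1080, 25),
}

def megapixels_per_second(width, height, fps):
    """Pixels scanned per second, in millions."""
    return width * height * fps / 1e6

throughput = {name: megapixels_per_second(*m) for name, m in modes.items()}
frame_budget_ms = {name: 1000.0 / fps for name, (_, _, fps) in modes.items()}

for name in modes:
    print(f"{name}: {throughput[name]:.2f} Mpix/s, {frame_budget_ms[name]:.2f} ms per frame")
```

Under these assumptions the three figures land between roughly 52 and 61 megapixels per second, which suggests that detection cost scales approximately with the number of pixels per frame.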