Most previous papers on the detection of nude or pornographic images start by applying a skin detector, followed by some kind of shape or geometric modeling. In this work, these two steps are avoided by using a bag-of-features (BoF) approach, in which images are represented by histograms of visual descriptors. BoF approaches have been applied successfully to object recognition tasks, but most of the descriptors used are based on gray-level information. Our approach is based on an extension of the well-known SIFT descriptor, called Hue-SIFT, which adds color information to the original SIFT. Experimental results show that this approach distinguishes between nude and non-nude images from the web better than a similar traditional BoF approach.
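To make the BoF representation concrete, the following minimal sketch shows how one image's local descriptors could be turned into a histogram over a visual vocabulary. It is an illustration only, not the code used in the paper: the function names, the Euclidean nearest-word assignment, and the L1 normalization are assumptions, and the descriptors and vocabulary are taken as already computed.

```python
import math


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor.

    Assumption: squared Euclidean distance is used for the assignment.
    """
    best_index, best_dist = 0, math.inf
    for index, word in enumerate(vocabulary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bof_histogram(descriptors, vocabulary):
    """Build an L1-normalized histogram of visual-word occurrences for one image.

    descriptors: list of local descriptors (e.g. SIFT or Hue-SIFT vectors).
    vocabulary:  list of k visual words (cluster centers), same dimensionality.
    """
    counts = [0] * len(vocabulary)
    for descriptor in descriptors:
        counts[nearest_word(descriptor, vocabulary)] += 1
    total = sum(counts)
    # An image without descriptors yields an all-zero histogram (assumption).
    return [c / total for c in counts] if total else [0.0] * len(vocabulary)
```

With a toy two-word vocabulary `[[0, 0], [10, 10]]`, three descriptors near the origin and one near `(10, 10)` give a histogram with three quarters of the mass on the first word. The resulting fixed-length vector is what a classifier such as an SVM would receive.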

THIS DATABASE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. The images provided were produced by third parties, who may have retained copyrights. They are provided strictly for non-profit research purposes, under limited, controlled distribution, intended to fall under the fair-use limitation. We make no guarantees and accept no responsibility whatsoever arising out of any copyright issue. Use at your own risk.

The database contains 180 images collected from the Web. To download the data, please click here. Please feel free to contact me if you have any questions or comments.

Examples of nude (first row) and non-nude images (second row) from our database.

Results @ EUSIPCO 2009

SIFT and Hue-SIFT descriptors were extracted from all images, and the vocabulary size and SVM model were then searched extensively using a 5-fold cross-validation scheme. For the vocabulary size (k), the experiments spanned values between 50 and 700. For each k value, the penalty of the SVM error term (C) was varied on a logarithmic scale. The k and C values that achieved the best recognition rate were then refined with a finer search over C. This procedure was repeated for SIFT and Hue-SIFT separately.
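The two-stage search described above can be sketched as follows. This is a hedged outline under stated assumptions rather than the authors' implementation: `evaluate(k, c)` is a hypothetical callable standing in for the 5-fold cross-validated recognition rate of an SVM trained on BoF histograms with vocabulary size k, and the grid bounds, the number of fine-grid points, and `toy_evaluate` are illustrative choices that do not come from the paper.

```python
import math


def log_grid(low_exp, high_exp, base=10.0):
    """Coarse logarithmic grid for C: base**low_exp ... base**high_exp."""
    return [base ** e for e in range(low_exp, high_exp + 1)]


def fine_grid(center, points=5):
    """Finer geometric grid spanning one decade centered on `center`."""
    return [center * 10 ** (-0.5 + i / (points - 1)) for i in range(points)]


def search(evaluate, k_values, coarse_c):
    """Coarse search over (k, C), then a finer search over C at the best pair.

    evaluate(k, c) is assumed to return a cross-validated recognition rate.
    """
    # Coarse stage: for each k, keep the best rate over the coarse C grid.
    best_per_k = {}
    for k in k_values:
        best_per_k[k] = max((evaluate(k, c), c) for c in coarse_c)
    best_k = max(best_per_k, key=lambda k: best_per_k[k][0])
    _, best_c = best_per_k[best_k]
    # Fine stage: refine C around the best coarse value, keeping k fixed.
    fine_rate, fine_c = max((evaluate(best_k, c), c) for c in fine_grid(best_c))
    return best_per_k, best_k, fine_c, fine_rate


def toy_evaluate(k, c):
    """Stand-in score (NOT real data): peaks at k = 300 and log10(C) = 1.25."""
    return 1.0 - abs(k - 300) / 1000 - abs(math.log10(c) - 1.25) / 10
```

Calling `search(toy_evaluate, [50, 300, 700], log_grid(-2, 3))` returns the best coarse rate per k along with the refined C. In a real experiment the same call would be made once with SIFT histograms and once with Hue-SIFT histograms, and the per-k best rates are the values one would plot against vocabulary size.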

The best recognition rates for all tested values of k are plotted for both Hue-SIFT and SIFT alone. For each k, the best recognition rate is the highest value achieved while varying the penalty of the SVM error term (C). SIFT recognition rates are consistently lower for all tested vocabulary sizes, indicating the importance of color information.

If you make use of our database, please cite the following reference: 
  • LOPES, Ana P. B.; AVILA, Sandra E. F.; PEIXOTO, Anderson; OLIVEIRA, Rodrigo S.; ARAÚJO, Arnaldo de A. A Bag-of-Features Approach based on Hue-SIFT Descriptor for Nude Detection. In: 17th European Signal Processing Conference (EUSIPCO), Glasgow, 2009.

See also
Nude detection in videos
Pornography detection

This work is supported by Brazilian research funding agencies: CAPES, CNPq and FAPEMIG.