A Probabilistic Approach to People-Centric Photo Selection and Sequencing

We present a large crowdsourcing (CS) study to examine how specific image attributes probabilistically affect the selection and sequencing of images from personal photo collections. We explore 13 image attributes, including 7 human-centric properties. We first propose a novel dataset shaping technique based on Mixed Integer Linear Programming (MILP) to identify a subset of photos in which the attributes of interest are (a) uniformly distributed and (b) minimally correlated. This yields compact, balanced, representative datasets, which allow crowd preferences to be modeled efficiently by learning through CS (i) the selection likelihood of an image and (ii) its relative position in a sequence, given its attributes. We then present a slideshow creation framework based on Integer Linear Programming (ILP) that selects and arranges (a subset of) appealing/interesting images from a personal photo library. A user study confirms that our method considerably outperforms random photo selection and sequencing, while generating slideshows similar in quality to those created by humans.
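To make the two shaping criteria concrete, here is a minimal Python sketch on a toy collection. It is not the MILP formulation: it swaps in an exhaustive search over subsets, which is only feasible for a handful of photos, and the cost function (absolute deviation from uniform counts plus absolute Pearson correlation) is an assumption chosen for illustration. The names `shaping_cost` and `shape_dataset` and the two binary attributes are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)


def shaping_cost(subset, levels):
    """Assumed cost: deviation from uniform counts per attribute, plus |correlation|."""
    target = len(subset) / levels  # ideal number of photos per attribute level
    a = [photo[0] for photo in subset]
    b = [photo[1] for photo in subset]
    deviation = 0.0
    for values in (a, b):
        counts = Counter(values)
        deviation += sum(abs(counts.get(v, 0) - target) for v in range(levels))
    return deviation + abs(pearson(a, b))


def shape_dataset(photos, k, levels):
    """Exhaustive stand-in for the MILP: try every k-subset, keep the cheapest."""
    return min(combinations(photos, k), key=lambda s: shaping_cost(s, levels))


# Toy collection: one (attribute_a, attribute_b) pair per photo, each in {0, 1}.
# The collection is deliberately skewed toward (1, 1) for illustration.
photos = [(1, 1), (1, 1), (1, 1), (1, 1), (1, 0), (0, 1), (0, 0), (1, 1)]
subset = shape_dataset(photos, k=4, levels=2)
print(sorted(subset))
```

On this toy input the cheapest subset contains one photo of each attribute combination, so both attributes are balanced and uncorrelated within it. A real collection would need a proper integer programming solver, since the number of subsets grows combinatorially.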

The supplementary material includes:

  1. The image selection probabilities for 13 image/face attributes, learnt through a large-scale crowdsourcing study.

  2. A MATLAB implementation of the Mixed Integer Linear Programming (MILP) technique for dataset shaping (or balancing), including minimization of the cross-dimensional correlations. You can use the provided function to create different subsets of a dataset that follow particular distributions. Enforcing a uniform distribution balances the resulting subset.

  3. A MATLAB implementation of the Integer Linear Programming (ILP) technique for automatically creating appealing slideshows, based on the selection probabilities learnt from the crowd.

  4. A MATLAB script that estimates the appeal of each image, ranks the images by that estimate, and selects the N most appealing ones.
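As an illustration of the ranking step in item (4), the following Python sketch scores each image from per-attribute selection probabilities and keeps the top N. It is not the provided MATLAB script: the attribute names (`smiling`, `sharp`), the probability values, and the rule of averaging the probabilities into a single appeal score are all assumptions made for the example.

```python
def appeal_score(attributes, selection_prob):
    """Assumed scoring rule: mean selection probability over the image's attribute values."""
    probs = [selection_prob[name][value] for name, value in attributes.items()]
    return sum(probs) / len(probs)


def top_n(images, selection_prob, n):
    """Rank image names by descending appeal score and return the first n."""
    ranked = sorted(
        images,
        key=lambda name: appeal_score(images[name], selection_prob),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical selection probabilities per attribute value (not the learnt ones).
selection_prob = {
    "smiling": {True: 0.7, False: 0.4},
    "sharp": {True: 0.6, False: 0.2},
}

# Hypothetical images with their attribute values.
images = {
    "a.jpg": {"smiling": True, "sharp": True},
    "b.jpg": {"smiling": False, "sharp": True},
    "c.jpg": {"smiling": True, "sharp": False},
    "d.jpg": {"smiling": False, "sharp": False},
}

best = top_n(images, selection_prob, n=2)
print(best)
```

With these made-up numbers, `a.jpg` and `b.jpg` rank highest. Substituting the learnt probabilities from item (1) would only change the `selection_prob` table, not the ranking logic.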

The provided code for (3) and (4) requires the images from the Gallagher dataset to demonstrate the results. For copyright reasons, we cannot release these images. You may download them from this link: http://chenlab.ece.cornell.edu/people/Andy/GallagherDataset.html
