Shaping Datasets

This MATLAB code replicates the results of the paper: V. Vonikakis, R. Subramanian, S. Winkler. (2016). Shaping Datasets: Optimal Data Selection for Specific Target Distributions. Proc. ICIP2016, Phoenix, USA, Sept. 25-28.

It includes:

1. SHAPE_DATASET.m is the main function that shapes (or balances) a dataset. It is generic and can be used with many different datasets.

2. test_shaping_Galllagher.m and test_shaping_Helen.m are scripts that apply the technique to 2 different datasets.

3. GHALLAGHER.mat and HELEN.mat contain the filenames and attributes we used for these 2 datasets. No actual images are included, due to copyright restrictions.

4. test_shaping_random.m is a script that demonstrates the technique on randomly generated data points drawn from different initial distributions.
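To give a rough feel for what "shaping" a dataset towards a target distribution means, here is a minimal sketch that balances randomly generated 1-D data to a uniform histogram. It is a naive per-bin random subsampling written for illustration only. It does not call SHAPE_DATASET.m and is not the optimization described in the paper. The variable names, the bin edges and the uniform target are all assumptions.

```matlab
% Naive illustration only: NOT the optimization used by SHAPE_DATASET.m.
rng(0);                                % fixed seed for reproducibility
x = randn(5000, 1);                    % hypothetical 1-D attribute, Gaussian distributed
edges = linspace(-2, 2, 9);            % 8 equal-width bins (assumed binning)
bin = discretize(x, edges);            % bin index per sample, NaN outside the edges
counts = accumarray(bin(~isnan(bin)), 1, [numel(edges) - 1, 1]);
quota = min(counts);                   % uniform target: same count in every bin

selected = false(size(x));             % logical mask of the samples we keep
for b = 1:numel(counts)
    idx = find(bin == b);              % samples falling in bin b
    keep = idx(randperm(numel(idx), quota));  % random subset of size quota
    selected(keep) = true;
end

shapedCounts = histcounts(x(selected), edges);  % histogram of the kept subset
```

After running it, every entry of shapedCounts equals quota, so the selected subset is uniformly distributed over the chosen bins, at the cost of discarding most samples near the mode of the original distribution.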

The project is also available on GitHub: https://github.com/bbonik/shaping_datasets

UPDATE: This work is outdated! A newer, improved version, which also minimizes inter-dimensional correlations, is described in our IEEE TMM paper: "A Probabilistic Approach to People-Centric Photo Selection and Sequencing".

The new code is available here.
