Massive Scale Data Suite
The suite includes 48 synthetic data files created by Jose Dulá. Detailed information about the data generation process can be found in Dulá (2008).
The data sets vary across four cardinality levels (n = 25K, 50K, 75K, 100K), four dimensions (m = 5, 10, 15, 20), and three density levels (d = 1%, 10%, 25%), yielding 4 × 4 × 3 = 48 combinations; the full grid is enumerated in the sketch below.
Dulá, J. H. (2008). A computational study of DEA with massive data sets. Computers & Operations Research, 35, 1191–1203.
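For concreteness, the parameter grid can be enumerated programmatically. Below is a minimal Python sketch; the file-name pattern it generates is a hypothetical placeholder, not the suite's actual naming convention.

```python
from itertools import product

# Parameter levels of the suite, as described above.
cardinalities = [25_000, 50_000, 75_000, 100_000]  # n: number of points (DMUs)
dimensions = [5, 10, 15, 20]                       # m: dimension of each point
densities = [1, 10, 25]                            # d: density level, in percent

# All 4 x 4 x 3 = 48 parameter combinations.
# NOTE: the file-name pattern is purely illustrative, not the suite's real one.
files = [
    f"dea_n{n}_m{m}_d{d}.txt"
    for n, m, d in product(cardinalities, dimensions, densities)
]

assert len(files) == 48  # one file per combination
```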
The data sets have already been widely used in the literature, for example in the following studies:
Korhonen, P. J., & Siitari, P. A. (2007). Using lexicographic parametric programming for identifying efficient units in DEA. Computers & Operations Research, 34(7), 2177–2190.
Korhonen, P. J., & Siitari, P. A. (2009). A dimensional decomposition approach to identifying efficient units in large-scale DEA models. Computers & Operations Research, 36(1), 234–244.
Dulá, J. H., & López, F. J. (2009). Preprocessing DEA. Computers & Operations Research, 36(4), 1204–1220.
Dulá, J. H. (2011). An algorithm for data envelopment analysis. INFORMS Journal on Computing, 23(2), 284–296.
Jie, T. (2020). Parallel processing of the Build Hull algorithm to address the large-scale DEA problem. Annals of Operations Research, 295(1), 453–481.
The 48 files are grouped by cardinality, with each group covering all 12 dimension-density combinations:
12 data sets of cardinality 25,000
12 data sets of cardinality 50,000
12 data sets of cardinality 75,000
12 data sets of cardinality 100,000