Massive Scale Data Suite
The suite includes 48 synthetic data files created by Jose Dulá. Detailed information about the data generation process can be found in Dulá (2008).
The data sets vary across four cardinality levels (n = 25K, 50K, 75K, 100K), four dimensions (m = 5, 10, 15, 20), and three density levels (d = 1%, 10%, 25%), yielding 4 × 4 × 3 = 48 combinations; the full grid is enumerated in the sketch below.
Dulá, J. H. (2008). A computational study of DEA with massive data sets. Computers & Operations Research, 35, 1191–1203.
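For concreteness, the parameter grid can be enumerated programmatically. Below is a minimal Python sketch; the file-name pattern it generates is a hypothetical placeholder, not the suite's actual naming convention.

```python
from itertools import product

# Parameter levels of the suite, as described above.
cardinalities = [25_000, 50_000, 75_000, 100_000]  # n: number of points (DMUs)
dimensions = [5, 10, 15, 20]                       # m: dimension of each point
densities = [1, 10, 25]                            # d: density level, in percent

# All 4 x 4 x 3 = 48 parameter combinations.
# NOTE: the file-name pattern is purely illustrative, not the suite's real one.
files = [
    f"dea_n{n}_m{m}_d{d}.txt"
    for n, m, d in product(cardinalities, dimensions, densities)
]

assert len(files) == 48  # one file per combination
```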
The data sets have already been widely used in the literature, for example in the following studies:
Korhonen, P. J., & Siitari, P. A. (2007). Using lexicographic parametric programming for identifying efficient units in DEA. Computers & Operations Research, 34(7), 2177–2190.
Korhonen, P. J., & Siitari, P. A. (2009). A dimensional decomposition approach to identifying efficient units in large-scale DEA models. Computers & Operations Research, 36(1), 234–244.
Dulá, J. H., & López, F. J. (2009). Preprocessing DEA. Computers & Operations Research, 36(4), 1204–1220.
Dulá, J. H. (2011). An algorithm for data envelopment analysis. INFORMS Journal on Computing, 23(2), 284–296.
Jie, T. (2020). Parallel processing of the Build Hull algorithm to address the large-scale DEA problem. Annals of Operations Research, 295(1), 453–481.
The 48 files are grouped by cardinality, with each group covering all 12 dimension-density combinations:
12 data sets of cardinality 25,000
12 data sets of cardinality 50,000
12 data sets of cardinality 75,000
12 data sets of cardinality 100,000