Title of the academic study:
Meta-Learning Analysis of Deep Neural Network Architectures on Diverse Numeric Datasets via Geometric Complexity Descriptors
Authors: Faruk BULUT, İlknur DÖNMEZ
The study has been submitted to a journal and is currently under review. All C++, MATLAB, and Python code, the pre-processed datasets, and the experimental outputs in text and Excel formats are shared here publicly to support further research and processing.
Questions about this study can be sent to the e-mail addresses below:
faruk (dot) bulut (at) essex (dot) ac (dot) uk
ilknur (dot) donmez (at) istun (dot) edu (dot) tr
The DOI, WoS record, and citation will be available once the paper is accepted.
The Codes, Datasets, and Related Files
PCA for CNN and the Correlation Matrix are here.
PCA for Transformer and the Correlation Matrix are here.
All the experimental results, including the Transformer and CNN results and the features of each dataset, can be downloaded here.
The CNN and Transformer datasets (with noise removed) can be downloaded here.
Extended tables of p-values for CNN and Transformer are here as a Word file.
The whole dataset is in three parts here. Download the 1st one, the 2nd one, and the 3rd one.
The Python code for Transformer and CNN is here.
DT_regression.arff : This file is in Weka (ARFF) format and contains the meta-features of the datasets. The last attribute in this ARFF file is the DT accuracy rate. The Linear Regression model is computed from this file. Download it from here.
DT_regression_Normalized : In this file, all attributes of DT_regression.arff are normalized. Download it from here.
Dataset spesifications.xlsx : This file contains the external features of all datasets. Download it from here.
Normalized Datasets.xlsx : Some tables from the article. Download it from here.
MATLAB code for the UCI dataset analyzer : Source code. Download it from here.
The DCoL software download URL is: http://dcol.sourceforge.net/
You need the DCoL-v1.1.tar.gz file. If you cannot download it, please send me an e-mail: bulutfaruk [at] gmail [dot] com. Remember to run this software on a Linux platform; I suggest Ubuntu.
115_UCI_Datasets.rar : The dataset collection, about 11 MB. All datasets are derived from the UCI Repository and are packed in a compressed archive, so extract it first. Download it from here.
MATLAB_Code_Decision_Tree.m : MATLAB code file that calculates the Decision Tree accuracy for each dataset. Download it from here. You need MATLAB; the version should be newer than R2015.
The names of the datasets, written as a MATLAB cell array:
alldatasets={'hillValley','bank','liver-disorders','bupa','cmc.2c2','liv','cmc.2c2','bpa','hab','cmc.2c0','breast-cancer','haberman','credit-g','yea.2c0','cylinder-bands','sonar','pim','glass.2c1','diabetes','lung-cancer','cmc.2c1','cmc.2c1','transfusion','vehicle.2c1','h-s','abalone.2c6','vehicle.2c0','veh.2c0','heart-statlog','abalone.2c7','gls.2c0','glass.2c0','colic','hepatitis','primary-tumor.2c0','abalone.2c5','column3C.2c0','column3C.2c2','lymph','abalone.2c8','waveform.2c0','wav40.2c0','wav21.2c0','mag','autos.2c1','balance-scale.2c0','autos.2c2','bankruptcy','waveform.2c2','bal.2c0','waveform.2c1','credit-a','abalone.2c4','glass.2c2','labor','ionosphere','col10.2c4','ecoli.2c1','audiology.2c3','ringnorm','col10.2c5','spambase','tic-tac-toe','ecoli.2c3','audiology.2c4','balance-scale.2c1','spa','monk','thy.2c0','ecoli.2c2','vehicle.2c3','wineCultivars.2c1','wdbc','ecoli.2c0','wne.2c0','wineCultivars.2c2','wineCultivars.2c0','vote','win.2c0','iris.2c1','splice.2c2','authors.2c0','vehicle.2c2','iris.2c2','tao','ozone','zoo.2c2','column3C.2c1','audiology.2c0','pageblocks.2c0','pbc.2c0','ecoli.2c4','solar-flare_1','d159','sick','pageblocks.2c4','pageblocks.2c1','anneal.2c1','kr-vs-kp','opt.2c0','statlog-sgm.2c0','seg.2c0','col10.2c6','zoo.2c0','soybean.2c3','pageblocks.2c3','pen.2c0','pageblocks.2c2','hypothyroid.2c0','mushroom','badges','badges2','col10.2c0','iris.2c0','zoo.2c3'};
The complete experimental results in Excel format are here.
The complete normalized experimental results (the long form of Table 1), comprising 114 rows, can be accessed here.
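As a rough illustration of how a file like DT_regression.arff can be used, the sketch below parses a small all-numeric ARFF text and fits a linear regression with the last attribute as the target. It is only a sketch under stated assumptions: the sample values are made up, the toy file has just three attributes, and plain ordinary least squares is used, which may differ from the model settings used in the study. The helper names parse_arff, solve, and fit_linear_regression are hypothetical and are not part of the shared code.

```python
# Toy ARFF text with MADE-UP values. The attribute names reuse labels from
# this page, but the real DT_regression.arff contains more attributes.
SAMPLE_ARFF = """\
@relation dt_regression_toy
@attribute F1 numeric
@attribute N1 numeric
@attribute DTAcc numeric
@data
0.0,0.0,0.5
1.0,0.0,0.7
0.0,1.0,0.4
1.0,1.0,0.6
2.0,1.0,0.8
"""


def parse_arff(text):
    """Parse a dense, all-numeric ARFF string into (attribute names, rows)."""
    names, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            rows.append([float(v) for v in line.split(",")])
        elif lower.startswith("@attribute"):
            names.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(rows):
    """Ordinary least squares via the normal equations; last column is the target."""
    X = [[1.0] + r[:-1] for r in rows]  # prepend a constant for the intercept
    y = [r[-1] for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


names, rows = parse_arff(SAMPLE_ARFF)
coeffs = fit_linear_regression(rows)
print(dict(zip(["intercept"] + names[:-1], coeffs)))
```

To try this on the real file, read its contents into a string and pass that string to parse_arff. The parser assumes every attribute is numeric and that attribute names contain no spaces.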
The Correlation Matrix
% Correlation Matrix
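% Load the whole experimental results into A before running this script.
% A is expected to be a numeric matrix with 15 columns, one per label below.
r = corr(A);                               % pairwise Pearson correlation of the columns
isupper = logical(triu(ones(size(r)), 1)); % mask of the entries strictly above the diagonal
r(isupper) = NaN;                          % blank the redundant upper triangle
% Plot results; the NaN cells are drawn in white
h = heatmap(r, 'MissingDataColor', 'w');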
labels = ["F1", "F1v", "F2", "F3", "F4", "L1", "L2", "L3", "N1", "N2", "N3", "N4", "T1", "T2", "DTAcc"];
h.XDisplayLabels = labels;
h.YDisplayLabels = labels;
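
For readers without MATLAB, the same computation can be sketched in plain Python: build the pairwise Pearson correlation matrix of the columns, then set the entries strictly above the diagonal to NaN so only the lower triangle and the diagonal remain. This is a minimal sketch and not the authors' code. The function names pearson and lower_triangle_corr and the toy data are hypothetical, and the heatmap plot is not reproduced.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lower_triangle_corr(rows):
    """Correlation matrix of the columns with the upper triangle set to NaN."""
    cols = list(zip(*rows))  # transpose: one tuple per column
    k = len(cols)
    r = [[pearson(cols[i], cols[j]) for j in range(k)] for i in range(k)]
    for i in range(k):
        for j in range(i + 1, k):  # strictly above the diagonal
            r[i][j] = math.nan
    return r


# Toy data with MADE-UP values: three columns, four rows.
toy_rows = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 3.0],
    [3.0, 6.0, 2.0],
    [4.0, 8.0, 1.0],
]
r = lower_triangle_corr(toy_rows)
for row in r:
    print(row)
```

With the real results, each row of the input would hold the 15 values named in the labels above, in the same order.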