Mushroom Data Set (UCI)
Number of features = 22
Total number of instances = 8124
Number of classes: 2
Probability of class 0 in the whole set: 0.482029
Probability of class 1 in the whole set: 0.517971
feature 0, with number of values = 6
feature 1, with number of values = 4
feature 2, with number of values = 10
feature 3, with number of values = 2
feature 4, with number of values = 9
feature 5, with number of values = 2
feature 6, with number of values = 2
feature 7, with number of values = 2
feature 8, with number of values = 12
feature 9, with number of values = 2
feature 10, with number of values = 4
feature 11, with number of values = 4
feature 12, with number of values = 4
feature 13, with number of values = 9
feature 14, with number of values = 9
feature 15, with number of values = 1
feature 16, with number of values = 4
feature 17, with number of values = 3
feature 18, with number of values = 5
feature 19, with number of values = 9
feature 20, with number of values = 6
feature 21, with number of values = 7
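The class probabilities and value counts above are enough to sanity-check the conditional probability table that follows. The sketch below recovers per-class instance counts from the priors and computes the probability that add-one (Laplace) smoothing would give to a value never observed with a class. The helper names are hypothetical, and add-one smoothing is an assumption: the listing does not say how the table was estimated, although its smallest entries for feature 0 (0.000254972 for class 0 and 0.000237304 for class 1) agree with this formula.

```python
N_INSTANCES = 8124
PRIORS = {0: 0.482029, 1: 0.517971}  # class probabilities from the header


def class_counts(n_instances, priors):
    """Recover integer per-class counts from the rounded priors."""
    return {c: round(n_instances * p) for c, p in priors.items()}


def unseen_value_prob(class_count, n_values):
    """Add-one estimate (0 + 1) / (class_count + n_values) for an unseen value.

    Assumption: the table was built with add-one smoothing.
    """
    return 1.0 / (class_count + n_values)


counts = class_counts(N_INSTANCES, PRIORS)
for c in sorted(counts):
    # Feature 0 has 6 values, so n_values = 6.
    print(c, counts[c], unseen_value_prob(counts[c], 6))
```

The same check can be repeated for any feature by passing its number of values, for example 9 for feature 4.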
Conditional probability table for naive Bayes classifier trained on the whole data set:
0/1 loss if no feature is used (error of always predicting the majority class): 0.482029
feature: 0, with gini loss: 0.469583, with 0/1 loss: 0.43582 (when used alone for prediction)
class: 0
value: 0, prob: 0.435747
value: 1, prob: 0.0124936
value: 2, prob: 0.000254972
value: 3, prob: 0.396991
value: 4, prob: 0.153238
value: 5, prob: 0.00127486
class: 1
value: 0, prob: 0.462506
value: 1, prob: 0.0961082
value: 2, prob: 0.00783104
value: 3, prob: 0.378975
value: 4, prob: 0.0543427
value: 5, prob: 0.000237304
feature: 1, with gini loss: 0.480107, with 0/1 loss: 0.419585 (when used alone for prediction)
class: 0
value: 0, prob: 0.360459
value: 1, prob: 0.444133
value: 2, prob: 0.194133
value: 3, prob: 0.00127551
class: 1
value: 0, prob: 0.271842
value: 1, prob: 0.357312
value: 2, prob: 0.370608
value: 3, prob: 0.000237417
feature: 2, with gini loss: 0.475891, with 0/1 loss: 0.404948 (when used alone for prediction)
class: 0
value: 0, prob: 0.260061
value: 1, prob: 0.171421
value: 2, prob: 0.0817626
value: 3, prob: 0.206062
value: 4, prob: 0.223383
value: 5, prob: 0.0226694
value: 6, prob: 0.0308202
value: 7, prob: 0.000254712
value: 8, prob: 0.00331126
value: 9, prob: 0.000254712
class: 1
value: 0, prob: 0.299905
value: 1, prob: 0.0950688
value: 2, prob: 0.170934
value: 3, prob: 0.244903
value: 4, prob: 0.148174
value: 5, prob: 0.0135135
value: 6, prob: 0.0116169
value: 7, prob: 0.00403035
value: 8, prob: 0.00782361
value: 9, prob: 0.00403035
feature: 3, with gini loss: 0.37388, with 0/1 loss: 0.256153 (when used alone for prediction)
class: 0
value: 0, prob: 0.15952
value: 1, prob: 0.84048
class: 1
value: 0, prob: 0.653919
value: 1, prob: 0.346081
feature: 4, with gini loss: 0.0306557, with 0/1 loss: 0.0158426 (when used alone for prediction)
class: 0
value: 0, prob: 0.0654777
value: 1, prob: 0.000254777
value: 2, prob: 0.000254777
value: 3, prob: 0.030828
value: 4, prob: 0.550573
value: 5, prob: 0.049172
value: 6, prob: 0.147006
value: 7, prob: 0.147006
value: 8, prob: 0.00942675
class: 1
value: 0, prob: 0.000237135
value: 1, prob: 0.0950913
value: 2, prob: 0.0950913
value: 3, prob: 0.808395
value: 4, prob: 0.000237135
value: 5, prob: 0.000237135
value: 6, prob: 0.000237135
value: 7, prob: 0.000237135
value: 8, prob: 0.000237135
feature: 5, with gini loss: 0.491106, with 0/1 loss: 0.482029 (when used alone for prediction)
class: 0
value: 0, prob: 0.995151
value: 1, prob: 0.00484941
class: 1
value: 0, prob: 0.954157
value: 1, prob: 0.0458432
feature: 6, with gini loss: 0.438862, with 0/1 loss: 0.38411 (when used alone for prediction)
class: 0
value: 0, prob: 0.971159
value: 1, prob: 0.0288412
class: 1
value: 0, prob: 0.714727
value: 1, prob: 0.285273
feature: 7, with gini loss: 0.353892, with 0/1 loss: 0.243845 (when used alone for prediction)
class: 0
value: 0, prob: 0.567892
value: 1, prob: 0.432108
class: 1
value: 0, prob: 0.0686461
value: 1, prob: 0.931354
feature: 8, with gini loss: 0.269366, with 0/1 loss: 0.195867 (when used alone for prediction)
class: 0
value: 0, prob: 0.0165479
value: 1, prob: 0.0287678
value: 2, prob: 0.128564
value: 3, prob: 0.163187
value: 4, prob: 0.0628819
value: 5, prob: 0.134674
value: 6, prob: 0.0124745
value: 7, prob: 0.000254582
value: 8, prob: 0.440173
value: 9, prob: 0.00636456
value: 10, prob: 0.0058554
value: 11, prob: 0.000254582
class: 1
value: 0, prob: 0.0817536
value: 1, prob: 0.222038
value: 2, prob: 0.0590047
value: 3, prob: 0.202133
value: 4, prob: 0.226777
value: 5, prob: 0.0485782
value: 6, prob: 0.10545
value: 7, prob: 0.0229858
value: 8, prob: 0.000236967
value: 9, prob: 0.000236967
value: 10, prob: 0.0154028
value: 11, prob: 0.0154028
feature: 9, with gini loss: 0.494162, with 0/1 loss: 0.447095 (when used alone for prediction)
class: 0
value: 0, prob: 0.485197
value: 1, prob: 0.514803
class: 1
value: 0, prob: 0.384086
value: 1, prob: 0.615914
feature: 10, with gini loss: 0.43515, with 0/1 loss: 0.352562 (when used alone for prediction)
class: 0
value: 0, prob: 0.118981
value: 1, prob: 0.0208333
value: 2, prob: 0.859722
value: 3, prob: 0.000462963
class: 1
value: 0, prob: 0.247709
value: 1, prob: 0.146907
value: 2, prob: 0.550115
value: 3, prob: 0.0552692
feature: 11, with gini loss: 0.327061, with 0/1 loss: 0.225768 (when used alone for prediction)
class: 0
value: 0, prob: 0.392092
value: 1, prob: 0.0369898
value: 2, prob: 0.568622
value: 3, prob: 0.00229592
class: 1
value: 0, prob: 0.864435
value: 1, prob: 0.0971035
value: 2, prob: 0.0344255
value: 3, prob: 0.00403609
feature: 12, with gini loss: 0.334679, with 0/1 loss: 0.234129 (when used alone for prediction)
class: 0
value: 0, prob: 0.392092
value: 1, prob: 0.0369898
value: 2, prob: 0.0196429
value: 3, prob: 0.551276
class: 1
value: 0, prob: 0.807455
value: 1, prob: 0.1085
value: 2, prob: 0.0496201
value: 3, prob: 0.0344255
feature: 13, with gini loss: 0.362933, with 0/1 loss: 0.284071 (when used alone for prediction)
class: 0
value: 0, prob: 0.436433
value: 1, prob: 0.000254777
value: 2, prob: 0.330446
value: 3, prob: 0.110318
value: 4, prob: 0.110318
value: 5, prob: 0.000254777
value: 6, prob: 0.000254777
value: 7, prob: 0.00942675
value: 8, prob: 0.00229299
class: 1
value: 0, prob: 0.652834
value: 1, prob: 0.136827
value: 2, prob: 0.136827
value: 3, prob: 0.0040313
value: 4, prob: 0.000237135
value: 5, prob: 0.0230021
value: 6, prob: 0.0457671
value: 7, prob: 0.000237135
value: 8, prob: 0.000237135
feature: 14, with gini loss: 0.368156, with 0/1 loss: 0.286037 (when used alone for prediction)
class: 0
value: 0, prob: 0.42828
value: 1, prob: 0.330446
value: 2, prob: 0.000254777
value: 3, prob: 0.110318
value: 4, prob: 0.114395
value: 5, prob: 0.000254777
value: 6, prob: 0.00636943
value: 7, prob: 0.000254777
value: 8, prob: 0.00942675
class: 1
value: 0, prob: 0.641451
value: 1, prob: 0.136827
value: 2, prob: 0.136827
value: 3, prob: 0.000237135
value: 4, prob: 0.0154138
value: 5, prob: 0.0230021
value: 6, prob: 0.000237135
value: 7, prob: 0.0457671
value: 8, prob: 0.000237135
feature: 15, with gini loss: 0.499354, with 0/1 loss: 0.482029 (when used alone for prediction)
class: 0
value: 0, prob: 1
class: 1
value: 0, prob: 1
feature: 16, with gini loss: 0.487951, with 0/1 loss: 0.481045 (when used alone for prediction)
class: 0
value: 0, prob: 0.997194
value: 1, prob: 0.000255102
value: 2, prob: 0.000255102
value: 3, prob: 0.00229592
class: 1
value: 0, prob: 0.953704
value: 1, prob: 0.0230294
value: 2, prob: 0.0230294
value: 3, prob: 0.000237417
feature: 17, with gini loss: 0.476525, with 0/1 loss: 0.461881 (when used alone for prediction)
class: 0
value: 0, prob: 0.971932
value: 1, prob: 0.0186272
value: 2, prob: 0.00944118
class: 1
value: 0, prob: 0.874139
value: 1, prob: 0.125623
value: 2, prob: 0.000237473
feature: 18, with gini loss: 0.318251, with 0/1 loss: 0.224859 (when used alone for prediction)
class: 0
value: 0, prob: 0.208365
value: 1, prob: 0.45116
value: 2, prob: 0.330783
value: 3, prob: 0.000255037
value: 4, prob: 0.00943637
class: 1
value: 0, prob: 0.748398
value: 1, prob: 0.239497
value: 2, prob: 0.000237361
value: 3, prob: 0.0116307
value: 4, prob: 0.000237361
feature: 19, with gini loss: 0.217985, with 0/1 loss: 0.13277 (when used alone for prediction)
class: 0
value: 0, prob: 0.0573248
value: 1, prob: 0.0573248
value: 2, prob: 0.000254777
value: 3, prob: 0.403822
value: 4, prob: 0.461911
value: 5, prob: 0.0185987
value: 6, prob: 0.000254777
value: 7, prob: 0.000254777
value: 8, prob: 0.000254777
class: 1
value: 0, prob: 0.391036
value: 1, prob: 0.413801
value: 2, prob: 0.0116196
value: 3, prob: 0.0116196
value: 4, prob: 0.136827
value: 5, prob: 0.000237135
value: 6, prob: 0.0116196
value: 7, prob: 0.0116196
value: 8, prob: 0.0116196
feature: 20, with gini loss: 0.381268, with 0/1 loss: 0.278515 (when used alone for prediction)
class: 0
value: 0, prob: 0.0940847
value: 1, prob: 0.000254972
value: 2, prob: 0.000254972
value: 3, prob: 0.726415
value: 4, prob: 0.165477
value: 5, prob: 0.0135135
class: 1
value: 0, prob: 0.209065
value: 1, prob: 0.095159
value: 2, prob: 0.0913621
value: 3, prob: 0.283104
value: 4, prob: 0.252729
value: 5, prob: 0.0685809
feature: 21, with gini loss: 0.403113, with 0/1 loss: 0.310014 (when used alone for prediction)
class: 0
value: 0, prob: 0.0695896
value: 1, prob: 0.188886
value: 2, prob: 0.00943156
value: 3, prob: 0.323477
value: 4, prob: 0.257201
value: 5, prob: 0.000254907
value: 6, prob: 0.15116
class: 1
value: 0, prob: 0.023013
value: 1, prob: 0.334282
value: 2, prob: 0.0609727
value: 3, prob: 0.446263
value: 4, prob: 0.032503
value: 5, prob: 0.0457888
value: 6, prob: 0.0571767
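The per-feature gini loss and 0/1 loss figures in the table can be reproduced from the class priors and the conditional probabilities alone. The sketch below assumes that the 0/1 loss is the error of predicting the most probable class for each feature value, and that the gini loss is the expectation over values v of 2 * P(0|v) * P(1|v). These definitions are not stated in the listing, but they reproduce the reported numbers for features 3 and 15. Function and variable names are hypothetical.

```python
PRIORS = [0.482029, 0.517971]  # P(class 0), P(class 1)


def single_feature_losses(priors, cpt):
    """Return (gini_loss, zero_one_loss) for one feature used alone.

    cpt[c][v] is P(value v | class c). Two classes are assumed.
    """
    gini = 0.0
    zero_one = 0.0
    n_values = len(cpt[0])
    for v in range(n_values):
        # Joint probabilities P(class c, value v).
        joint = [priors[c] * cpt[c][v] for c in range(2)]
        p_v = sum(joint)
        if p_v == 0.0:
            continue
        # Expected Gini impurity: P(v) * 2 * P(0|v) * P(1|v).
        gini += 2.0 * joint[0] * joint[1] / p_v
        # Predicting the likelier class errs with the smaller joint mass.
        zero_one += min(joint)
    return gini, zero_one


FEATURE_3 = [[0.15952, 0.84048], [0.653919, 0.346081]]
FEATURE_15 = [[1.0], [1.0]]  # single-valued, so uninformative

gini_3, err_3 = single_feature_losses(PRIORS, FEATURE_3)
gini_15, err_15 = single_feature_losses(PRIORS, FEATURE_15)
print(round(gini_3, 5), round(err_3, 6))
print(round(gini_15, 6), round(err_15, 6))
```

For the single-valued feature 15 the 0/1 loss falls back to the no-feature baseline of 0.482029, and the gini loss reduces to 2 * P(0) * P(1).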
Average number of times each feature was queried by the single-feature look-ahead algorithm (depth 50) under a budget of 300. Feature 4 was queried the most; it is also the most discriminative feature, with the lowest gini loss (0.0306557) and 0/1 loss (0.0158426) when used alone.
Feature query statistics:
0: 5.53333
1: 1.23333
2: 24.3667
3: 0.166667
4: 110.367
5: 0.133333
6: 0.3
7: 0.166667
8: 56.6333
9: 0.3
10: 1.36667
11: 1.06667
12: 1.53333
13: 19.7667
14: 20.1333
15: 0
16: 1.5
17: 0.933333
18: 2.83333
19: 39.4
20: 4.93333
21: 7.33333
Average number of times each feature was queried by the biased-robin algorithm under a budget of 300. Feature 4, the most discriminative feature, was again queried the most.
Feature query statistics:
0: 8.94
1: 8.84
2: 10.48
3: 12.5
4: 72.3
5: 7.54
6: 9.42
7: 12.06
8: 17.2
9: 7.56
10: 4.02
11: 15.74
12: 12.46
13: 11
14: 12.02
15: 2.64
16: 7.74
17: 7.9
18: 13.04
19: 21.72
20: 11.5
21: 11.5
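To compare how strongly the two policies concentrate the budget, the sketch below computes the share of all recorded queries that went to the three features with the lowest single-feature 0/1 loss in the conditional probability table (features 4, 19, and 8). The numbers are copied from the two listings above; the helper name and the top-three cutoff are illustrative assumptions.

```python
LOOKAHEAD = [5.53333, 1.23333, 24.3667, 0.166667, 110.367, 0.133333, 0.3,
             0.166667, 56.6333, 0.3, 1.36667, 1.06667, 1.53333, 19.7667,
             20.1333, 0, 1.5, 0.933333, 2.83333, 39.4, 4.93333, 7.33333]
BIASED_ROBIN = [8.94, 8.84, 10.48, 12.5, 72.3, 7.54, 9.42, 12.06, 17.2, 7.56,
                4.02, 15.74, 12.46, 11, 12.02, 2.64, 7.74, 7.9, 13.04, 21.72,
                11.5, 11.5]
TOP_FEATURES = [4, 19, 8]  # lowest single-feature 0/1 loss in the table


def query_share(avg_queries, features):
    """Fraction of the recorded average queries spent on `features`."""
    total = sum(avg_queries)
    return sum(avg_queries[f] for f in features) / total


share_lookahead = query_share(LOOKAHEAD, TOP_FEATURES)
share_biased_robin = query_share(BIASED_ROBIN, TOP_FEATURES)
print(f"look-ahead:   {share_lookahead:.1%}")
print(f"biased-robin: {share_biased_robin:.1%}")
```

With these inputs the look-ahead policy spends roughly 69% of its queries on those three features, against roughly 37% for biased robin, which spreads its budget more evenly across all features.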