Mushroom Data Set (UCI)

Number of features = 22

Total number of instances = 8124

Number of classes: 2


Probability of class 0 in the whole set: 0.482029

Probability of class 1 in the whole set: 0.517971

feature 0, with number of values = 6

feature 1, with number of values = 4

feature 2, with number of values = 10

feature 3, with number of values = 2

feature 4, with number of values = 9

feature 5, with number of values = 2

feature 6, with number of values = 2

feature 7, with number of values = 2

feature 8, with number of values = 12

feature 9, with number of values = 2

feature 10, with number of values = 4

feature 11, with number of values = 4

feature 12, with number of values = 4

feature 13, with number of values = 9

feature 14, with number of values = 9

feature 15, with number of values = 1

feature 16, with number of values = 4

feature 17, with number of values = 3

feature 18, with number of values = 5

feature 19, with number of values = 9

feature 20, with number of values = 6

feature 21, with number of values = 7


Conditional probability table for naive Bayes classifier trained on the whole data set:

0/1 loss if no feature is used (error of always predicting the majority class): 0.482029

feature: 0, with gini loss: 0.469583, with 0/1 loss: 0.43582 (when used alone for prediction)


class: 0

value: 0, prob: 0.435747

value: 1, prob: 0.0124936

value: 2, prob: 0.000254972

value: 3, prob: 0.396991

value: 4, prob: 0.153238

value: 5, prob: 0.00127486

class: 1

value: 0, prob: 0.462506

value: 1, prob: 0.0961082

value: 2, prob: 0.00783104

value: 3, prob: 0.378975

value: 4, prob: 0.0543427

value: 5, prob: 0.000237304
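
The very small but non-zero entries above (for example 0.000254972 for class 0, value 2) are consistent with add-one (Laplace) smoothing of the raw counts, although the training code is not shown here. The sketch below assumes that scheme. The class sizes 3916 and 4208 follow from the class probabilities and the 8124 instances; the function name `smoothed_prob` and the zero counts are illustrative assumptions.

```python
def smoothed_prob(count, class_total, n_values, alpha=1.0):
    """Add-alpha estimate of P(value | class) for one categorical feature."""
    return (count + alpha) / (class_total + alpha * n_values)

# Class sizes implied by the priors: 0.482029 * 8124 and 0.517971 * 8124.
N_CLASS0, N_CLASS1 = 3916, 4208
N_VALUES_F0 = 6  # feature 0 has 6 values

# Assumption: a value never observed with a class has count 0, yet it
# still receives a small non-zero probability after smoothing.
p_unseen_c0 = smoothed_prob(0, N_CLASS0, N_VALUES_F0)
p_unseen_c1 = smoothed_prob(0, N_CLASS1, N_VALUES_F0)
print(f"{p_unseen_c0:.6g} {p_unseen_c1:.6g}")
```

Under this assumption the two results agree with the table entries 0.000254972 (class 0, value 2) and 0.000237304 (class 1, value 5) to the printed precision.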


feature: 1, with gini loss: 0.480107, with 0/1 loss: 0.419585 (when used alone for prediction)

class: 0

value: 0, prob: 0.360459

value: 1, prob: 0.444133

value: 2, prob: 0.194133

value: 3, prob: 0.00127551

class: 1

value: 0, prob: 0.271842

value: 1, prob: 0.357312

value: 2, prob: 0.370608

value: 3, prob: 0.000237417


feature: 2, with gini loss: 0.475891, with 0/1 loss: 0.404948 (when used alone for prediction)

class: 0

value: 0, prob: 0.260061

value: 1, prob: 0.171421

value: 2, prob: 0.0817626

value: 3, prob: 0.206062

value: 4, prob: 0.223383

value: 5, prob: 0.0226694

value: 6, prob: 0.0308202

value: 7, prob: 0.000254712

value: 8, prob: 0.00331126

value: 9, prob: 0.000254712

class: 1

value: 0, prob: 0.299905

value: 1, prob: 0.0950688

value: 2, prob: 0.170934

value: 3, prob: 0.244903

value: 4, prob: 0.148174

value: 5, prob: 0.0135135

value: 6, prob: 0.0116169

value: 7, prob: 0.00403035

value: 8, prob: 0.00782361

value: 9, prob: 0.00403035


feature: 3, with gini loss: 0.37388, with 0/1 loss: 0.256153 (when used alone for prediction)

class: 0

value: 0, prob: 0.15952

value: 1, prob: 0.84048

class: 1

value: 0, prob: 0.653919

value: 1, prob: 0.346081
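
The gini and 0/1 losses reported for a feature used alone can be reproduced from the class priors and that feature's conditional table. The sketch below is an assumption about how they are computed, since the original code is not shown: the 0/1 loss is taken as the expected error of predicting the more probable class for each value, and the gini loss as the expected value of 1 - sum_c P(c|v)^2. The function name `single_feature_losses` is hypothetical.

```python
def single_feature_losses(priors, cond):
    """Return (gini_loss, zero_one_loss) for one feature used alone.

    priors[c]  = P(class c)
    cond[c][v] = P(value v | class c)
    """
    gini = 0.0
    zero_one = 0.0
    n_classes = len(priors)
    n_values = len(cond[0])
    for v in range(n_values):
        # Joint probability P(class c, value v) for every class.
        joint = [priors[c] * cond[c][v] for c in range(n_classes)]
        p_v = sum(joint)
        if p_v == 0.0:
            continue
        posterior = [j / p_v for j in joint]
        # Error mass left after predicting the most probable class.
        zero_one += p_v - max(joint)
        # Gini impurity of the posterior, weighted by P(value v).
        gini += p_v * (1.0 - sum(p * p for p in posterior))
    return gini, zero_one

priors = [0.482029, 0.517971]
feature3 = [[0.15952, 0.84048], [0.653919, 0.346081]]
gini3, zero_one3 = single_feature_losses(priors, feature3)
print(f"gini={gini3:.5f}  0/1={zero_one3:.6f}")
```

With the feature 3 numbers above, both formulas agree with the reported 0.37388 and 0.256153 up to rounding, which supports (but does not prove) that this is how the losses were obtained.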


feature: 4, with gini loss: 0.0306557, with 0/1 loss: 0.0158426 (when used alone for prediction)

class: 0

value: 0, prob: 0.0654777

value: 1, prob: 0.000254777

value: 2, prob: 0.000254777

value: 3, prob: 0.030828

value: 4, prob: 0.550573

value: 5, prob: 0.049172

value: 6, prob: 0.147006

value: 7, prob: 0.147006

value: 8, prob: 0.00942675

class: 1

value: 0, prob: 0.000237135

value: 1, prob: 0.0950913

value: 2, prob: 0.0950913

value: 3, prob: 0.808395

value: 4, prob: 0.000237135

value: 5, prob: 0.000237135

value: 6, prob: 0.000237135

value: 7, prob: 0.000237135

value: 8, prob: 0.000237135


feature: 5, with gini loss: 0.491106, with 0/1 loss: 0.482029 (when used alone for prediction)

class: 0

value: 0, prob: 0.995151

value: 1, prob: 0.00484941

class: 1

value: 0, prob: 0.954157

value: 1, prob: 0.0458432


feature: 6, with gini loss: 0.438862, with 0/1 loss: 0.38411 (when used alone for prediction)

class: 0

value: 0, prob: 0.971159

value: 1, prob: 0.0288412

class: 1

value: 0, prob: 0.714727

value: 1, prob: 0.285273


feature: 7, with gini loss: 0.353892, with 0/1 loss: 0.243845 (when used alone for prediction)

class: 0

value: 0, prob: 0.567892

value: 1, prob: 0.432108

class: 1

value: 0, prob: 0.0686461

value: 1, prob: 0.931354


feature: 8, with gini loss: 0.269366, with 0/1 loss: 0.195867 (when used alone for prediction)

class: 0

value: 0, prob: 0.0165479

value: 1, prob: 0.0287678

value: 2, prob: 0.128564

value: 3, prob: 0.163187

value: 4, prob: 0.0628819

value: 5, prob: 0.134674

value: 6, prob: 0.0124745

value: 7, prob: 0.000254582

value: 8, prob: 0.440173

value: 9, prob: 0.00636456

value: 10, prob: 0.0058554

value: 11, prob: 0.000254582

class: 1

value: 0, prob: 0.0817536

value: 1, prob: 0.222038

value: 2, prob: 0.0590047

value: 3, prob: 0.202133

value: 4, prob: 0.226777

value: 5, prob: 0.0485782

value: 6, prob: 0.10545

value: 7, prob: 0.0229858

value: 8, prob: 0.000236967

value: 9, prob: 0.000236967

value: 10, prob: 0.0154028

value: 11, prob: 0.0154028


feature: 9, with gini loss: 0.494162, with 0/1 loss: 0.447095 (when used alone for prediction)

class: 0

value: 0, prob: 0.485197

value: 1, prob: 0.514803

class: 1

value: 0, prob: 0.384086

value: 1, prob: 0.615914


feature: 10, with gini loss: 0.43515, with 0/1 loss: 0.352562 (when used alone for prediction)

class: 0

value: 0, prob: 0.118981

value: 1, prob: 0.0208333

value: 2, prob: 0.859722

value: 3, prob: 0.000462963

class: 1

value: 0, prob: 0.247709

value: 1, prob: 0.146907

value: 2, prob: 0.550115

value: 3, prob: 0.0552692


feature: 11, with gini loss: 0.327061, with 0/1 loss: 0.225768 (when used alone for prediction)

class: 0

value: 0, prob: 0.392092

value: 1, prob: 0.0369898

value: 2, prob: 0.568622

value: 3, prob: 0.00229592

class: 1

value: 0, prob: 0.864435

value: 1, prob: 0.0971035

value: 2, prob: 0.0344255

value: 3, prob: 0.00403609


feature: 12, with gini loss: 0.334679, with 0/1 loss: 0.234129 (when used alone for prediction)

class: 0

value: 0, prob: 0.392092

value: 1, prob: 0.0369898

value: 2, prob: 0.0196429

value: 3, prob: 0.551276

class: 1

value: 0, prob: 0.807455

value: 1, prob: 0.1085

value: 2, prob: 0.0496201

value: 3, prob: 0.0344255


feature: 13, with gini loss: 0.362933, with 0/1 loss: 0.284071 (when used alone for prediction)

class: 0

value: 0, prob: 0.436433

value: 1, prob: 0.000254777

value: 2, prob: 0.330446

value: 3, prob: 0.110318

value: 4, prob: 0.110318

value: 5, prob: 0.000254777

value: 6, prob: 0.000254777

value: 7, prob: 0.00942675

value: 8, prob: 0.00229299

class: 1

value: 0, prob: 0.652834

value: 1, prob: 0.136827

value: 2, prob: 0.136827

value: 3, prob: 0.0040313

value: 4, prob: 0.000237135

value: 5, prob: 0.0230021

value: 6, prob: 0.0457671

value: 7, prob: 0.000237135

value: 8, prob: 0.000237135


feature: 14, with gini loss: 0.368156, with 0/1 loss: 0.286037 (when used alone for prediction)

class: 0

value: 0, prob: 0.42828

value: 1, prob: 0.330446

value: 2, prob: 0.000254777

value: 3, prob: 0.110318

value: 4, prob: 0.114395

value: 5, prob: 0.000254777

value: 6, prob: 0.00636943

value: 7, prob: 0.000254777

value: 8, prob: 0.00942675

class: 1

value: 0, prob: 0.641451

value: 1, prob: 0.136827

value: 2, prob: 0.136827

value: 3, prob: 0.000237135

value: 4, prob: 0.0154138

value: 5, prob: 0.0230021

value: 6, prob: 0.000237135

value: 7, prob: 0.0457671

value: 8, prob: 0.000237135


feature: 15, with gini loss: 0.499354, with 0/1 loss: 0.482029 (when used alone for prediction)

class: 0

value: 0, prob: 1

class: 1

value: 0, prob: 1


feature: 16, with gini loss: 0.487951, with 0/1 loss: 0.481045 (when used alone for prediction)

class: 0

value: 0, prob: 0.997194

value: 1, prob: 0.000255102

value: 2, prob: 0.000255102

value: 3, prob: 0.00229592

class: 1

value: 0, prob: 0.953704

value: 1, prob: 0.0230294

value: 2, prob: 0.0230294

value: 3, prob: 0.000237417


feature: 17, with gini loss: 0.476525, with 0/1 loss: 0.461881 (when used alone for prediction)

class: 0

value: 0, prob: 0.971932

value: 1, prob: 0.0186272

value: 2, prob: 0.00944118

class: 1

value: 0, prob: 0.874139

value: 1, prob: 0.125623

value: 2, prob: 0.000237473


feature: 18, with gini loss: 0.318251, with 0/1 loss: 0.224859 (when used alone for prediction)

class: 0

value: 0, prob: 0.208365

value: 1, prob: 0.45116

value: 2, prob: 0.330783

value: 3, prob: 0.000255037

value: 4, prob: 0.00943637

class: 1

value: 0, prob: 0.748398

value: 1, prob: 0.239497

value: 2, prob: 0.000237361

value: 3, prob: 0.0116307

value: 4, prob: 0.000237361


feature: 19, with gini loss: 0.217985, with 0/1 loss: 0.13277 (when used alone for prediction)

class: 0

value: 0, prob: 0.0573248

value: 1, prob: 0.0573248

value: 2, prob: 0.000254777

value: 3, prob: 0.403822

value: 4, prob: 0.461911

value: 5, prob: 0.0185987

value: 6, prob: 0.000254777

value: 7, prob: 0.000254777

value: 8, prob: 0.000254777

class: 1

value: 0, prob: 0.391036

value: 1, prob: 0.413801

value: 2, prob: 0.0116196

value: 3, prob: 0.0116196

value: 4, prob: 0.136827

value: 5, prob: 0.000237135

value: 6, prob: 0.0116196

value: 7, prob: 0.0116196

value: 8, prob: 0.0116196


feature: 20, with gini loss: 0.381268, with 0/1 loss: 0.278515 (when used alone for prediction)

class: 0

value: 0, prob: 0.0940847

value: 1, prob: 0.000254972

value: 2, prob: 0.000254972

value: 3, prob: 0.726415

value: 4, prob: 0.165477

value: 5, prob: 0.0135135

class: 1

value: 0, prob: 0.209065

value: 1, prob: 0.095159

value: 2, prob: 0.0913621

value: 3, prob: 0.283104

value: 4, prob: 0.252729

value: 5, prob: 0.0685809


feature: 21, with gini loss: 0.403113, with 0/1 loss: 0.310014 (when used alone for prediction)

class: 0

value: 0, prob: 0.0695896

value: 1, prob: 0.188886

value: 2, prob: 0.00943156

value: 3, prob: 0.323477

value: 4, prob: 0.257201

value: 5, prob: 0.000254907

value: 6, prob: 0.15116

class: 1

value: 0, prob: 0.023013

value: 1, prob: 0.334282

value: 2, prob: 0.0609727

value: 3, prob: 0.446263

value: 4, prob: 0.032503

value: 5, prob: 0.0457888

value: 6, prob: 0.0571767


Average number of times each feature was queried by the single-feature look-ahead algorithm (depth 50) under a budget of 300. Feature 4 was queried the most (about 110 times on average), and it is also the most discriminative feature, with the lowest single-feature 0/1 loss (0.0158426).

Feature query statistics:

0: 5.53333

1: 1.23333

2: 24.3667

3: 0.166667

4: 110.367

5: 0.133333

6: 0.3

7: 0.166667

8: 56.6333

9: 0.3

10: 1.36667

11: 1.06667

12: 1.53333

13: 19.7667

14: 20.1333

15: 0

16: 1.5

17: 0.933333

18: 2.83333

19: 39.4

20: 4.93333

21: 7.33333
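
To make the link between query counts and discriminative power concrete, the following sketch ranks features by average query count and prints each one's share of the budget next to its single-feature 0/1 loss. All numbers are copied from the listings above; the variable names are illustrative.

```python
# Average query counts (single-feature look-ahead, budget 300), indexed by feature.
lookahead_queries = [
    5.53333, 1.23333, 24.3667, 0.166667, 110.367, 0.133333, 0.3, 0.166667,
    56.6333, 0.3, 1.36667, 1.06667, 1.53333, 19.7667, 20.1333, 0,
    1.5, 0.933333, 2.83333, 39.4, 4.93333, 7.33333,
]
# 0/1 loss of each feature when used alone for prediction.
zero_one_loss = [
    0.43582, 0.419585, 0.404948, 0.256153, 0.0158426, 0.482029, 0.38411,
    0.243845, 0.195867, 0.447095, 0.352562, 0.225768, 0.234129, 0.284071,
    0.286037, 0.482029, 0.481045, 0.461881, 0.224859, 0.13277, 0.278515,
    0.310014,
]
BUDGET = 300

# Five most frequently queried features, most queried first.
top_queried = sorted(
    range(len(lookahead_queries)),
    key=lambda f: lookahead_queries[f],
    reverse=True,
)[:5]

for f in top_queried:
    share = lookahead_queries[f] / BUDGET
    print(f"feature {f:2d}: {share:6.1%} of budget, 0/1 loss {zero_one_loss[f]:.4f}")
```

Features 4, 8 and 19 lead the ranking and are also the three features with the lowest single-feature 0/1 loss, while features 2 and 14 receive more queries than their single-feature loss alone would suggest.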

Average number of times each feature was queried by the biased-robin algorithm under a budget of 300. Feature 4 was again queried the most (72.3 times on average), and it is also the most discriminative feature.


Feature query statistics:

0: 8.94

1: 8.84

2: 10.48

3: 12.5

4: 72.3

5: 7.54

6: 9.42

7: 12.06

8: 17.2

9: 7.56

10: 4.02

11: 15.74

12: 12.46

13: 11

14: 12.02

15: 2.64

16: 7.74

17: 7.9

18: 13.04

19: 21.72

20: 11.5

21: 11.5
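
For readers unfamiliar with biased robin, the sketch below follows the policy as it is usually described in the budgeted-learning literature: keep buying the current feature while the last purchase lowered the loss, otherwise move on to the next feature in round-robin order. Whether this is the exact variant behind the numbers above is an assumption. `purchase` is a hypothetical callback standing in for "query one value of this feature and report whether the classifier's loss went down", and the toy callback is purely illustrative.

```python
from typing import Callable, List

def biased_robin(n_features: int, budget: int,
                 purchase: Callable[[int], bool]) -> List[int]:
    """Spend `budget` queries; return how often each feature was queried."""
    counts = [0] * n_features
    current = 0
    for _ in range(budget):
        improved = purchase(current)  # hypothetical: True if loss decreased
        counts[current] += 1
        if not improved:
            # Last purchase did not help: move to the next feature.
            current = (current + 1) % n_features
    return counts

def make_toy_purchase() -> Callable[[int], bool]:
    """Toy stand-in: only feature 1 helps, and only for its first 3 purchases."""
    remaining = {"helpful": 3}
    def purchase(feature: int) -> bool:
        if feature == 1 and remaining["helpful"] > 0:
            remaining["helpful"] -= 1
            return True
        return False
    return purchase

toy_counts = biased_robin(n_features=3, budget=10, purchase=make_toy_purchase())
print(toy_counts)
```

In the toy run the helpful feature collects the most queries while every other feature is still visited, which mirrors the flatter distribution seen in the biased-robin statistics compared with the look-ahead ones.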