Initial Guessing Bias

If we pass a binary dataset through an untrained network, does it assign half of the examples to each class, or does it privilege one class? The answer depends on the architecture.
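As a concrete illustration of the question, here is a minimal sketch (PyTorch is our choice here, and the single standardized Gaussian blob used as input is an illustrative assumption, not a detail taken from the text): it passes random inputs through an untrained two-output MLP and measures what fraction of examples gets assigned to each class.

```python
# Minimal sketch: initial class fractions of an untrained network.
# Assumptions: PyTorch, a standardized Gaussian blob as input, default init.
import torch
import torch.nn as nn

torch.manual_seed(0)

n_samples, n_features = 10_000, 100
x = torch.randn(n_samples, n_features)   # standardized synthetic "dataset"

model = nn.Sequential(                    # untrained network, two output nodes
    nn.Linear(n_features, 512),
    nn.ReLU(),
    nn.Linear(512, 2),
)

with torch.no_grad():
    preds = model(x).argmax(dim=1)

frac_class0 = (preds == 0).float().mean().item()
print(f"fraction of examples assigned to class 0: {frac_class0:.3f}")
# Without IGB we would expect ~0.5; with IGB the fraction can sit far from 0.5
# (which class is favored varies from one random initialization to another).
```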

Things that we already know: 

We prove that untrained neural networks can unevenly distribute their guesses among the different classes. This is due to a breaking of the node-permutation symmetry, caused by architectural elements such as activations, depth, and max-pooling.
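To make the role of architecture concrete, here is a hedged sketch (the widths, depths, and Gaussian-blob input are our own illustrative choices, not the paper's exact experimental setup): it compares how far the class-0 fraction of an untrained network drifts from 0.5 across random initializations, for a linear model versus shallow and deep ReLU MLPs.

```python
# Hedged sketch: probing how activations and depth affect the initial guess fractions.
import torch
import torch.nn as nn

def class0_fraction(model, x):
    """Fraction of inputs an untrained model assigns to class 0."""
    with torch.no_grad():
        return (model(x).argmax(dim=1) == 0).float().mean().item()

def mlp(depth, width, n_features, activation):
    layers, d_in = [], n_features
    for _ in range(depth):
        layers += [nn.Linear(d_in, width), activation()]
        d_in = width
    layers.append(nn.Linear(d_in, 2))
    return nn.Sequential(*layers)

torch.manual_seed(0)
x = torch.randn(5_000, 100)

for name, depth, act in [("shallow, linear (Identity)", 1, nn.Identity),
                         ("shallow, ReLU",              1, nn.ReLU),
                         ("deep, ReLU",                 6, nn.ReLU)]:
    fracs = torch.tensor([class0_fraction(mlp(depth, 512, 100, act), x)
                          for _ in range(20)])        # 20 random initializations
    # The typical distance of the class-0 fraction from 0.5 is a rough empirical
    # proxy for IGB: architectures that break the node-permutation symmetry
    # tend to push it away from 0.5.
    print(f"{name:28s} mean |f0 - 0.5| = {(fracs - 0.5).abs().mean().item():.3f}")
```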

Some of the questions we want to address:

Once we understand how architecture design causes predictive bias in an untrained neural network, a question that naturally arises is how this bias affects the learning dynamics. In this direction, it may be possible to exploit Initial Guessing Bias (IGB) to counterbalance other effects (e.g., class imbalance) and improve performance.

Our work shows how the design of a neural network can give rise to IGB. At the same time, our analysis shows that the input distribution also plays a key role in the phenomenon. A more in-depth study of the interplay between these two elements, besides further completing our understanding of IGB, may be an important first step toward including models of real data in the analysis. Understanding the role of the dataset distribution could also inform pre-processing procedures (e.g., how to standardize the dataset).
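As one way to probe the role of the input distribution, the following sketch (again PyTorch and synthetic data; the specific offsets are arbitrary illustrative choices) shifts the mean of the inputs, mimicking datasets that are or are not standardized, and checks how the untrained network's class fractions respond.

```python
# Illustrative sketch: effect of (not) centering the data on the initial guess fractions.
# Assumptions: same kind of untrained ReLU MLP as above; offsets chosen arbitrarily.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_features = 100
model = nn.Sequential(nn.Linear(n_features, 512), nn.ReLU(), nn.Linear(512, 2))

for offset in [0.0, 1.0, 5.0]:
    # A non-zero offset mimics skipping standardization of the inputs.
    x = torch.randn(10_000, n_features) + offset
    with torch.no_grad():
        f0 = (model(x).argmax(dim=1) == 0).float().mean().item()
    print(f"input mean offset {offset:>4}: fraction assigned to class 0 = {f0:.3f}")
```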

Our analysis quantitatively describes IGB for multilayer perceptrons (MLPs) with a Gaussian blob as input. Since the phenomenon is empirically observed in broader settings, the natural next step is to extend our analysis to them.

The condition driving the emergence of IGB suggests that some forms of network regularization might be effective in eliminating it. Understanding the effect of regularization on the phenomenon can then inform whether or not it should be included in the network design, depending on whether or not we want to eliminate IGB.
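The text does not specify which regularizations are meant; as one exploratory probe (batch normalization is our own illustrative pick, not necessarily the mechanism intended above), the following compares the initial class fractions of an untrained MLP with and without a normalization layer.

```python
# Exploratory sketch: does inserting a normalization layer change the initial class fractions?
# Assumptions: PyTorch, Gaussian-blob input, BatchNorm chosen only as an example.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_features = 100
x = torch.randn(10_000, n_features)

plain = nn.Sequential(nn.Linear(n_features, 512), nn.ReLU(),
                      nn.Linear(512, 2))
normed = nn.Sequential(nn.Linear(n_features, 512), nn.BatchNorm1d(512), nn.ReLU(),
                       nn.Linear(512, 2))

for name, model in [("no normalization", plain), ("with BatchNorm", normed)]:
    with torch.no_grad():   # default train mode: BatchNorm uses batch statistics
        f0 = (model(x).argmax(dim=1) == 0).float().mean().item()
    print(f"{name:18s}: fraction assigned to class 0 = {f0:.3f}")
```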