To this end, this tutorial will first review recent work that provides a mathematical justification for the geometric and statistical properties of different normalization methods applied across different ensembles of input-output channels. The theoretical analysis presented draws on mathematical tools that can guide researchers in developing novel normalization methods and help improve our understanding of their theoretical foundations. In addition, we will consider practical methods for implementing particular normalization methods, such as batch normalization, block orthogonal weight and gradient normalization, in CNNs and RNNs trained with small batch sizes, in the context of important vision applications.