Normalized Initialization
Normalization refers to rescaling the data dimensions so that they are on approximately the same scale. Deep multi-layered neural networks often require experiments that interrogate different initialization routines, different activations, and the variation of gradients across layers during training. Glorot and Bengio (2010) observed that, in networks with a linear activation at each layer, the variance of the back-propagated gradients decreases as we go backwards through the network. Because of this multiplicative effect through the layers, the normalization factor can matter a great deal when initializing deep networks.

The suggested remedy, which brings substantially faster convergence, is an initialization procedure that maintains roughly constant variances of activations and back-propagated gradients as one moves up or down the network. This is known as the normalized initialization. (Source: http://www.jefkine.com/deep/2016/08/01/initialization-of-deep-feedfoward-networks/)
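As a concrete illustration, here is a minimal NumPy sketch of the normalized initialization rule from Glorot and Bengio (2010), which draws each weight matrix from a uniform distribution $W \sim U\left[-\sqrt{\tfrac{6}{n_j + n_{j+1}}},\; \sqrt{\tfrac{6}{n_j + n_{j+1}}}\right]$, where $n_j$ and $n_{j+1}$ are the fan-in and fan-out of the layer. The layer sizes below are made up for the example.

```python
import numpy as np

def normalized_init(fan_in, fan_out, rng=None):
    """Normalized (Glorot/Xavier) initialization:
    W ~ U[-sqrt(6/(fan_in+fan_out)), +sqrt(6/(fan_in+fan_out))].
    The symmetric uniform range keeps the variance of activations
    (forward pass) and gradients (backward pass) roughly constant
    across layers, countering the multiplicative shrinkage described above."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Hypothetical 784 -> 256 -> 64 -> 10 network (sizes chosen for illustration)
layer_sizes = [784, 256, 64, 10]
weights = [normalized_init(n_in, n_out)
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

for W in weights:
    limit = np.sqrt(6.0 / sum(W.shape))
    print(f"shape={W.shape}, limit={limit:.4f}, empirical std={W.std():.4f}")
```

Note that because the bound depends on both fan-in and fan-out, wider layers receive smaller initial weights, which is what prevents the variance from growing or shrinking as signals pass through the network.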