
AI · ICML 2015 (32nd International Conference on Machine Learning) · Jul 2015 · Open access

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe and Christian Szegedy

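This paper introduced batch normalization, a technique that normalizes each layer's inputs using the mean and variance of the current mini-batch, with the aim of reducing internal covariate shift (the change in the distribution of layer inputs as the parameters of earlier layers are updated during training). It permits higher learning rates and less careful initialization, accelerates convergence, and acts as a regularizer, in some cases removing the need for dropout. Applied to a state-of-the-art image classification model, it matched the original accuracy with 14 times fewer training steps and then surpassed it.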
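The normalization step summarized above can be illustrated with a minimal NumPy sketch of the training-time forward pass. This is an illustration under stated assumptions, not the authors' implementation: the function name `batch_norm_train`, the `(batch, features)` input layout, and the default `eps=1e-5` are hypothetical choices made here, and the learned scale `gamma` and shift `beta` follow the paper's formulation rather than anything stated in this summary. The running statistics used at inference time are omitted.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for x of shape (batch, features).

    Hypothetical helper for illustration only; gamma and beta are the
    per-feature scale and shift, eps guards against division by zero.
    """
    mu = x.mean(axis=0)                      # per-feature mini-batch mean
    var = x.var(axis=0)                      # per-feature biased mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)    # normalize to roughly zero mean, unit variance
    return gamma * x_hat + beta, mu, var     # scale and shift, plus the batch statistics

# Example: activations with mean about 3 and standard deviation about 2.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 4))
y, mu, var = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
```

With `gamma` set to ones and `beta` to zeros, each output feature has approximately zero mean and unit variance over the mini-batch, whatever the scale of the incoming activations. In a trained network `gamma` and `beta` would be learned, so the layer can recover the original activation scale if that is useful.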

batch normalization · deep learning · neural network training · regularization