Batch Normalization (BN) is a popular technique for training Deep Neural Networks (DNNs). BN standardizes the activations of each mini-batch to zero mean and unit variance, then applies a learned scale and shift, which accelerates convergence and improves generalization. Recent studies have shown that whitening the activations, that is, decorrelating them in addition to standardizing them, can further reduce training time and improve generalization. However, existing whitening methods compute the whitening matrices with eigen-decomposition, Singular Value Decomposition (SVD), or Newton's iteration, and this cost becomes the bottleneck for high-dimensional data. In this talk, I will present our recently proposed Stochastic Whitening Batch Normalization (SWBN) method, which learns the whitening matrices online, without expensive matrix decomposition or inversion. We show that SWBN improves the convergence rate and generalization of DNNs in both many-shot and few-shot classification tasks. Because SWBN is designed as an efficient drop-in replacement for BN, it can be easily employed in most DNN architectures, even those with a large number of layers.
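
To make the contrast between decomposition-based and online whitening concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the SWBN implementation: the update rule `W <- W - lr * (W C W^T - I) W` is one plausible decomposition-free choice, and every name in it (`standardize`, `eig_whitening_matrix`, `online_whitening_step`, `learn_whitening`, `whitening_error`, `MIXING`) as well as the toy data are hypothetical.

```python
import numpy as np


def standardize(x, eps=1e-5):
    """BN-style step: zero mean and unit variance per feature over the batch."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)


def eig_whitening_matrix(cov, eps=1e-5):
    """Baseline: ZCA whitening matrix cov^(-1/2) via eigen-decomposition."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T


def online_whitening_step(W, cov_batch, lr=0.05):
    """One decomposition-free update that nudges W @ cov @ W.T toward I.

    Assumed update rule for illustration; it uses only matrix products.
    """
    residual = W @ cov_batch @ W.T - np.eye(W.shape[0])
    return W - lr * residual @ W


def learn_whitening(mixing, steps=2000, batch_size=256, lr=0.05, seed=0):
    """Learn W from a stream of mini-batches with correlated features."""
    rng = np.random.default_rng(seed)
    d = mixing.shape[0]
    W = np.eye(d)
    for _ in range(steps):
        # Each row is mixing @ z with z ~ N(0, I), so features are correlated.
        x = standardize(rng.normal(size=(batch_size, d)) @ mixing.T)
        W = online_whitening_step(W, x.T @ x / batch_size, lr)
    return W


def whitening_error(W, mixing, n=100_000, seed=1):
    """Largest absolute deviation of the whitened covariance from identity."""
    rng = np.random.default_rng(seed)
    d = mixing.shape[0]
    x = standardize(rng.normal(size=(n, d)) @ mixing.T)
    z = x @ W.T
    return np.abs(z.T @ z / n - np.eye(d)).max()


# Hypothetical toy data: a fixed mixing matrix that correlates neighbouring features.
MIXING = np.eye(4) + 0.5 * np.eye(4, k=1)
W_online = learn_whitening(MIXING)
print("standardized only :", whitening_error(np.eye(4), MIXING))
print("online whitening  :", whitening_error(W_online, MIXING))
```

In this sketch the per-step cost is a few matrix multiplications, whereas `eig_whitening_matrix` would have to be recomputed from scratch for every mini-batch. The learning rate and step count are arbitrary choices for the toy problem and would need tuning in a real network.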