Dounia Lakhmiri – Polytechnique Montréal, Canada
Deep neural networks have undergone a wide array of developments over the past decade. Their complexity grows as precision requirements increase, as do the energy consumption and ecological footprint of these models. Some applications impose strict constraints on overall size, memory, and acceptable latency, especially when deploying neural networks on edge and IoT devices. Network pruning encompasses a vast collection of techniques that reduce the number of parameters of a network while maintaining as much of its accuracy as possible.
We sparsify neural networks using non-smooth regularization. Our solver, called SR2, is based on stochastic proximal gradient principles but does not require prior knowledge of the gradient's Lipschitz constant. We illustrate two instances trained with \(\ell_0\) regularization and compare our method against ProxSGD and ProxGEN in terms of pruning ratio and accuracy. Ongoing work seeks to establish non-asymptotic convergence and complexity properties of SR2.
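
To make the proximal gradient idea concrete, the following is a minimal sketch of one generic proximal stochastic gradient step with \(\ell_0\) regularization. It is not SR2: it assumes a fixed, user-chosen step size, whereas SR2 is described above as not needing the gradient's Lipschitz constant. The function names `prox_l0`, `prox_sgd_step`, and `sparsity`, and the parameters `step` and `lam`, are hypothetical and chosen for illustration only.

```python
import numpy as np


def prox_l0(v, weight):
    """Proximal operator of weight * ||x||_0, i.e. hard thresholding.

    Minimizes 0.5 * ||x - v||^2 + weight * ||x||_0 componentwise:
    an entry is kept when |v_i| > sqrt(2 * weight) and set to zero otherwise.
    """
    cutoff = np.sqrt(2.0 * weight)
    return np.where(np.abs(v) > cutoff, v, 0.0)


def prox_sgd_step(w, grad, step, lam):
    """One generic proximal stochastic gradient step on f(w) + lam * ||w||_0.

    `grad` is a stochastic gradient estimate of the smooth loss f at w.
    `step` is an assumed fixed step size (an illustrative choice, not SR2's rule).
    """
    return prox_l0(w - step * grad, step * lam)


def sparsity(w):
    """Fraction of entries that are exactly zero (one way to measure pruning)."""
    return float(np.mean(w == 0.0))
```

For instance, calling `prox_sgd_step(w, grad, step=0.5, lam=0.1)` zeroes every entry of `w - 0.5 * grad` whose magnitude does not exceed \(\sqrt{2 \cdot 0.5 \cdot 0.1} \approx 0.316\), which is how the regularizer drives weights to exactly zero rather than merely shrinking them.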