Sparse Training with Lipschitz Continuous Loss Functions and a Weighted Group L0-norm Constraint
This talk focuses on training neural networks with structured sparsity via a constrained optimization approach. We consider a weighted group L0-norm constraint and present the Euclidean projection onto, and the normal cone of, the corresponding constraint set. Using randomized smoothing, we develop zeroth- and first-order algorithms for minimizing a general stochastic Lipschitz continuous loss function over any closed set onto which projection is tractable. We prove non-asymptotic convergence guarantees for the proposed algorithms, both in expectation and with high probability, under two related convergence criteria. Finally, building on these algorithms, we present a method with an asymptotic guarantee of almost-sure convergence to a stationary point.
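To illustrate the group L0 projection mentioned above, here is a minimal sketch of the unit-weight special case, where projecting amounts to keeping the k groups of largest Euclidean norm; the function name, the `groups` encoding, and the budget `k` are illustrative assumptions, and with general weights the group selection becomes a knapsack-type subproblem rather than a simple sort.

```python
import numpy as np

def project_group_l0(x, groups, k):
    """Sketch: Euclidean projection onto {x : at most k groups of x are nonzero}
    (unit-weight case). Keep the k groups with the largest l2 norm, zero the rest.
    With general weights w_i, selecting which groups to keep is a knapsack-type
    problem over the squared group norms rather than a sort."""
    norms = np.array([np.linalg.norm(x[g]) for g in groups])
    keep = np.argsort(norms)[-k:]  # indices of the k largest-norm groups
    out = np.zeros_like(x)
    for i in keep:
        out[groups[i]] = x[groups[i]]  # retain selected groups unchanged
    return out
```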
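The zeroth-order algorithms rest on randomized smoothing; as a rough sketch (not the exact method presented in the talk), one can replace the loss f by its Gaussian smoothing f_mu(x) = E[f(x + mu*u)], u ~ N(0, I), estimate the smoothed gradient by two-point finite differences, and take a projected step. The step size, smoothing radius, and batch size below are placeholder values, not the ones analyzed in the talk.

```python
import numpy as np

def zo_projected_step(loss, x, project, mu=1e-3, lr=1e-2, batch=10, rng=None):
    """Sketch: one projected zeroth-order step using a two-point
    finite-difference estimate of the gradient of the Gaussian
    smoothing f_mu(x) = E[f(x + mu*u)], u ~ N(0, I)."""
    rng = np.random.default_rng() if rng is None else rng
    fx = loss(x)
    g = np.zeros_like(x)
    for _ in range(batch):
        u = rng.standard_normal(x.shape)
        g += (loss(x + mu * u) - fx) / mu * u  # unbiased for grad of f_mu
    g /= batch
    return project(x - lr * g)  # projected (sub)gradient-style update
```

Combined with the projection above, one iteration of the constrained method could read, for example, `x = zo_projected_step(loss, x, lambda z: project_group_l0(z, groups, k))`.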