Variance reduction under restricted strong convexity
Computational challenges of statistical estimators and machine learning algorithms are an active area of study, thanks to countless applications involving big data. In particular, several variants of stochastic first-order methods based on variance reduction, namely SVRG, SDCA, and SAGA, have attracted significant attention because of their light per-iteration computational cost and fast convergence: for strongly convex problems they all converge at a linear rate. Yet many powerful statistical and machine learning models (e.g., the Lasso, the group Lasso, and nuclear-norm-regularized matrix completion) are not strongly convex, and are sometimes even non-convex (a prominent example being the SCAD regularizer). In this talk, I will argue that a favorable statistical property of these problems, namely restricted strong convexity, ensures linear convergence of variance-reduced stochastic first-order methods for these statistical models. At a high level, our results reconfirm a well-observed phenomenon: statistically easy problems tend to be computationally friendly. We believe this intriguing fact points to fundamental relationships between statistics and optimization, and that understanding such relationships may yield deep insights.
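
To make the setting concrete, below is a minimal proximal-SVRG sketch for the Lasso, one of the non-strongly-convex models mentioned above. It is an illustrative sketch only: the step size rule, epoch length, and helper names (prox_svrg_lasso, soft_threshold) are my own assumptions and not the specific algorithmic choices analyzed in the talk.

import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_svrg_lasso(A, b, lam, eta=None, n_epochs=30, seed=0):
    # Prox-SVRG sketch for min_x (1/2n)||Ax - b||^2 + lam * ||x||_1.
    # Step size and epoch length are illustrative, not tuned.
    rng = np.random.default_rng(seed)
    n, d = A.shape
    if eta is None:
        # crude step size from the largest squared row norm (a smoothness proxy)
        eta = 1.0 / np.max(np.sum(A ** 2, axis=1))
    x = np.zeros(d)
    for _ in range(n_epochs):
        x_ref = x.copy()
        full_grad = A.T @ (A @ x_ref - b) / n      # full gradient at the reference point
        for _ in range(n):                          # inner loop of length n
            i = rng.integers(n)
            a_i = A[i]
            # variance-reduced stochastic gradient: component gradient at x,
            # corrected by the same component at x_ref plus the full gradient
            g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ x_ref - b[i]) + full_grad
            x = soft_threshold(x - eta * g, eta * lam)
    return x

Only the soft-thresholding proximal step distinguishes this from plain SVRG. The corrected gradient g is unbiased and its variance shrinks as x and x_ref approach the optimum, which is what opens the door to linear rates once (restricted) strong convexity holds.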