Stochastic first-order methods for empirical risk minimization employ gradient approximations based on sampled data in lieu of exact gradients.
The recently proposed variance-reduction technique alpha-SVRG enables fine-grained control of the residual noise in the learning dynamics and has been reported to outperform both SGD and SVRG in modern deep learning settings.
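Concretely, alpha-SVRG scales the SVRG control variate by a coefficient alpha; a sketch of the estimator in the usual SVRG notation (iterate \(x_k\), snapshot \(\tilde{x}\), sampled component index \(i_k\), full objective \(F\); these symbols are introduced here only for illustration):
\[
g_k \;=\; \nabla f_{i_k}(x_k) \;-\; \alpha \bigl( \nabla f_{i_k}(\tilde{x}) - \nabla F(\tilde{x}) \bigr), \qquad \alpha \in [0,1],
\]
so that \(\alpha = 0\) recovers SGD, \(\alpha = 1\) recovers SVRG, and intermediate values control how much of the stochastic gradient's noise is cancelled.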
In the strongly convex setting, alpha-SVRG achieves a faster convergence rate than both SGD and SVRG under a suitable choice of alpha.
Simulation results on linear regression validate the theory.
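A minimal simulation sketch of this setup, assuming a least-squares objective and the interpolating estimator above; the problem sizes, step size, epoch count, and function names below are illustrative rather than the paper's experimental configuration:

```python
import numpy as np

# Illustrative alpha-SVRG sketch on least-squares linear regression.
# All sizes and hyperparameters are placeholders, not the paper's setup.
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star + 0.1 * rng.standard_normal(n)

def full_grad(x):
    # Gradient of the averaged least-squares loss (1/2n)||Ax - b||^2.
    return A.T @ (A @ x - b) / n

def stoch_grad(x, i):
    # Gradient of the i-th component loss (1/2)(a_i^T x - b_i)^2.
    return A[i] * (A[i] @ x - b[i])

def alpha_svrg(alpha, lr=0.02, epochs=30, inner_steps=None):
    inner_steps = inner_steps or n
    x = np.zeros(d)
    for _ in range(epochs):
        snap = x.copy()            # snapshot point x_tilde
        mu = full_grad(snap)       # full gradient at the snapshot
        for _ in range(inner_steps):
            i = rng.integers(n)
            # alpha = 0 gives SGD, alpha = 1 gives SVRG.
            g = stoch_grad(x, i) - alpha * (stoch_grad(snap, i) - mu)
            x -= lr * g
    return x

for alpha in (0.0, 0.5, 1.0):
    x_hat = alpha_svrg(alpha)
    print(f"alpha={alpha}: ||x - x*|| = {np.linalg.norm(x_hat - x_star):.4f}")
```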