From Gradient Clipping to Normalization for Heavy Tailed SGD

  • Recent empirical evidence shows that gradient noise in machine learning is often heavy-tailed, violating the standard bounded-variance assumption of stochastic optimization.
  • Gradient clipping is the common remedy for heavy-tailed noise, but its current theory has limitations, such as reliance on large clipping thresholds and sub-optimal sample complexity.
  • Normalized SGD (NSGD) is studied as an alternative that overcomes these issues, establishing a parameter-free sample complexity and improving on existing rates even when problem parameters are known (a sketch of the two update rules follows this list).
  • The study of NSGD yields improved sample complexities that match lower bounds for first-order methods, and guarantees high-probability convergence with only a mild dependence on the failure probability.

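To make the contrast between the two methods concrete, below is a minimal NumPy sketch of a clipped SGD step versus a normalized SGD step on a toy quadratic with heavy-tailed (Student-t) gradient noise. The objective, step size, clipping threshold, and noise model are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x):
    # Toy objective f(x) = 0.5 * ||x||^2; Student-t noise (df=2) stands in
    # for the heavy-tailed gradient noise discussed above (an assumption).
    return x + rng.standard_t(df=2.0, size=x.shape)

def clipped_sgd_step(x, lr=0.1, tau=1.0):
    # Gradient clipping: rescale the gradient only when its norm exceeds
    # the threshold tau, which must be chosen by the user.
    g = noisy_grad(x)
    norm = np.linalg.norm(g)
    if norm > tau:
        g = g * (tau / norm)
    return x - lr * g

def normalized_sgd_step(x, lr=0.1, eps=1e-12):
    # Normalized SGD: always divide by the gradient norm, so no clipping
    # threshold needs to be tuned.
    g = noisy_grad(x)
    return x - lr * g / (np.linalg.norm(g) + eps)

x_clip = x_norm = np.ones(10)
for _ in range(100):
    x_clip = clipped_sgd_step(x_clip)
    x_norm = normalized_sgd_step(x_norm)

print("clipped SGD iterate norm:   ", np.linalg.norm(x_clip))
print("normalized SGD iterate norm:", np.linalg.norm(x_norm))
```

The design difference visible here is that normalized SGD removes the clipping threshold entirely, which mirrors the parameter-free aspect highlighted above.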