
Arxiv


Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner

  • The success of Shampoo in the AlgoPerf contest has led to a resurgence of interest in Kronecker-factorization-based optimization algorithms for training neural networks.
  • To perform well at scale, Shampoo relies on heuristics such as learning rate grafting and stale preconditioning. These add complexity and hyperparameters to tune, and they lack solid theoretical justification.
  • This study examines these heuristics through the lens of Frobenius norm approximation to full-matrix Adam, and decouples the updates of the preconditioner's eigenvalues from the updates of its eigenbasis.
  • The research shows that grafting from Adam mitigates the staleness and mis-scaling of the preconditioner's eigenvalues, and that correcting the eigenvalues directly can eliminate the need for learning rate grafting. It also proposes an adaptive criterion for how often to recompute the eigenbasis, to better manage the resulting error and improve convergence.
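
To make "learning rate grafting" concrete, here is a minimal NumPy sketch, not the paper's implementation. It assumes a single gradient matrix, builds the Kronecker factors from that one gradient (real Shampoo accumulates them over many steps), and uses a momentum-free Adam-like update as the grafting source. The function names `inverse_root`, `shampoo_direction`, and `graft` are hypothetical and chosen only for illustration.

```python
import numpy as np


def inverse_root(mat, power, eps=1e-12):
    """Return mat^(-1/power) for a symmetric PSD matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.maximum(eigvals, 0.0) + eps  # guard against tiny negative values
    # V diag(d) V^T: scale column j of V by d[j], then multiply by V^T.
    return (eigvecs * eigvals ** (-1.0 / power)) @ eigvecs.T


def shampoo_direction(grad, left, right):
    """Kronecker-factored preconditioning: L^(-1/4) @ G @ R^(-1/4)."""
    return inverse_root(left, 4) @ grad @ inverse_root(right, 4)


def graft(direction, grafting_update, eps=1e-12):
    """Keep the direction, but rescale it to the Frobenius norm of grafting_update."""
    scale = np.linalg.norm(grafting_update) / (np.linalg.norm(direction) + eps)
    return direction * scale


rng = np.random.default_rng(0)
grad = rng.standard_normal((4, 3))

# Assumption: one-step statistics with a small damping term, for illustration only.
damping = 1e-6
left = grad @ grad.T + damping * np.eye(4)
right = grad.T @ grad + damping * np.eye(3)

# Assumption: Adam-like update without momentum, as a stand-in grafting source.
adam_like = grad / (np.sqrt(grad ** 2) + 1e-8)

direction = shampoo_direction(grad, left, right)
grafted = graft(direction, adam_like)
```

The grafted update points the same way as the Shampoo direction but takes its overall step size from the Adam-like update, which is the role the summary attributes to grafting: compensating for badly scaled eigenvalues in the preconditioner.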
