A new framework called GALA (Gradient Alignment-based Learning rate Adaptation) has been proposed for dynamically adjusting the learning rate in large-scale deep learning models.
GALA tracks the alignment between consecutive gradients and combines it with a local curvature estimate to adapt the learning rate: sustained positive alignment suggests the current step size is too conservative and can be increased, while negative alignment signals overshooting and calls for a decrease.
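As a rough illustration of these two signals (a minimal sketch, not the paper's implementation: the helper names, the cosine-similarity choice for alignment, and the secant-style curvature proxy are assumptions made here), one might compute them as follows:

```python
import torch

def gradient_alignment(g_prev: torch.Tensor, g_curr: torch.Tensor) -> torch.Tensor:
    """Cosine similarity between consecutive (flattened) gradient vectors."""
    return torch.dot(g_prev, g_curr) / (g_prev.norm() * g_curr.norm() + 1e-12)

def local_curvature(g_prev: torch.Tensor, g_curr: torch.Tensor,
                    step: torch.Tensor) -> torch.Tensor:
    """Secant-style curvature proxy along the last step:
    ||g_t - g_{t-1}|| / ||x_t - x_{t-1}|| (one common choice, assumed here)."""
    return (g_curr - g_prev).norm() / (step.norm() + 1e-12)

# Toy illustration on flattened gradients from two consecutive iterations.
g0, g1 = torch.randn(10), torch.randn(10)
dx = -0.01 * g0  # the parameter step that produced g1
print(float(gradient_alignment(g0, g1)), float(local_curvature(g0, g1, dx)))
```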
The method casts learning rate selection as a one-dimensional online learning problem and solves it with an online learning algorithm such as Follow-the-Regularized-Leader (FTRL).
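To make the online learning view concrete, the sketch below runs FTRL on a single scalar, taken here to be the log learning rate, with a quadratic regularizer and linearized losses. The class name, the anchoring at the initial learning rate, and the sign convention for the hypergradient are illustrative assumptions, not GALA's actual formulation:

```python
import math

class FTRL1D:
    """FTRL on a scalar decision variable x (here: the log learning rate).

    With linearized losses g_s * x and quadratic regularizer
    (lam/2) * (x - x0)^2, the FTRL iterate has the closed form
        x_{t+1} = argmin_x  sum_{s<=t} g_s * x + (lam/2) * (x - x0)^2
                = x0 - (1/lam) * sum_{s<=t} g_s.
    """
    def __init__(self, lam: float = 10.0, x0: float = math.log(1e-3)):
        self.lam = lam
        self.x0 = x0          # anchor: log of the initial learning rate
        self.grad_sum = 0.0   # running sum of per-step loss gradients

    def step(self, g: float) -> float:
        """Feed the latest loss gradient w.r.t. x; return the new learning rate."""
        self.grad_sum += g
        x = self.x0 - self.grad_sum / self.lam  # closed-form FTRL iterate
        return math.exp(x)

# Illustrative use: positive alignment pushes the rate up, negative alignment
# (overshooting) pushes it down. The sign convention g = -alignment is assumed.
ftrl = FTRL1D()
for alignment in [0.9, 0.7, -0.3, -0.8]:
    lr = ftrl.step(-alignment)
    print(f"alignment={alignment:+.1f} -> lr={lr:.2e}")
```

Working in log space keeps the learning rate positive and turns multiplicative adjustments into additive ones, a common design choice in learning rate adaptation schemes.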
Empirical results show that optimizers such as SGD and Adam, when combined with GALA, perform well across a wide range of initial learning rates without extensive tuning.