Source: Arxiv

AutoScale: Scale-Aware Data Mixing for Pre-Training LLMs

  • Domain reweighting is an emerging research area aimed at adjusting the relative weights of different data sources to improve the effectiveness and efficiency of LLM pre-training.
  • Data mixtures found to be competitive in small-scale experiments may not retain their advantage when applied directly at larger scales, as is common practice.
  • AutoScale is a two-stage, scale-aware data composition framework: it fits a parametric model that predicts loss under different data compositions, finds an approximately optimal allocation at smaller training budgets, and then extrapolates that optimal composition to larger budgets.
  • Empirical results show that AutoScale accelerates convergence, improves downstream performance, and achieves a 28% faster perplexity reduction than baselines when pre-training GPT-2 Large.
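The two-stage idea above can be sketched in code. This is a minimal illustration only: the toy power-law loss model, the grid search over the mixture simplex, the domain names, and the per-domain power-law extrapolation are all assumptions made here for clarity, not the paper's actual parametric form or optimizer.

```python
import math

# Toy per-domain scaling law: loss_i(n_i) = b_i + c_i * n_i**(-alpha_i)
# (an illustrative assumption, not the paper's exact parametric model).
DOMAINS = {
    # name: (irreducible loss b, coefficient c, exponent alpha)
    "web":  (1.8, 6.0, 0.30),
    "code": (1.2, 9.0, 0.25),
    "wiki": (1.5, 4.0, 0.35),
}

def mixture_loss(weights, budget):
    """Predicted loss when `budget` tokens are split across domains by `weights`."""
    return sum(
        w * (b + c * max(w * budget, 1.0) ** (-a))
        for w, (b, c, a) in zip(weights, DOMAINS.values())
    )

def best_mixture(budget, step=0.05):
    """Stage 1: grid-search the weight simplex for the lowest predicted loss."""
    best_w, best_l = None, float("inf")
    steps = int(round(1 / step))
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            w = (i * step, j * step, 1.0 - (i + j) * step)
            l = mixture_loss(w, budget)
            if l < best_l:
                best_w, best_l = w, l
    return best_w, best_l

def extrapolate(n1, n2, target):
    """Stage 2: find the optimum at two small budgets n1 < n2 and extrapolate
    each domain's optimal token count to `target` as a power law of budget."""
    w1, _ = best_mixture(n1)
    w2, _ = best_mixture(n2)
    alloc = []
    for a, b in zip(w1, w2):
        a, b = max(a, 1e-9), max(b, 1e-9)  # guard zero weights before log
        # Fit n_i(N) ~ k * N**beta through the two optimal points.
        beta = math.log((b * n2) / (a * n1)) / math.log(n2 / n1)
        k = (a * n1) / n1 ** beta
        alloc.append(k * target ** beta)
    total = sum(alloc)
    return tuple(n / total for n in alloc)  # renormalized target weights
```

For example, `extrapolate(1e8, 1e9, 1e11)` returns a mixture for the 100B-token budget whose weights sum to one; under this toy model the composition shifts with scale rather than staying fixed, which is the phenomenon the paper's framework is built to handle.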
