Source: Arxiv

1d

read

27

img
dot

Image Credit: Arxiv

AdaSTaR: Adaptive Data Sampling for Training Self-Taught Reasoners

  • Self-Taught Reasoner (STaR) training, also known as Rejection sampling Fine-Tuning (RFT), is central to building self-improving reasoning Language Models (LMs).
  • Random observation sampling often produces an imbalanced training set: the model over-trains on examples it already solves and under-trains on challenging ones.
  • AdaSTaR is a new algorithm that addresses this imbalance with two principles: Adaptive Sampling for Diversity, which balances how often each observation is trained on, and Adaptive Sampling for Curriculum, which adjusts data difficulty to the model's current strength.
  • Across six benchmarks, AdaSTaR outperforms other methods with best test accuracy in all instances (6/6) and reduces training FLOPs by an average of 58.6%, showing promise for more efficient and effective self-improving LMs.
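The two sampling principles in the bullets above can be sketched in code. This is a minimal illustration, not the paper's actual algorithm: the function names, the inverse-count diversity weight, and the distance-to-strength curriculum weight are all hypothetical stand-ins for whatever scoring AdaSTaR actually uses.

```python
import random

def adastar_style_weights(train_counts, difficulties, model_strength, eps=1e-6):
    """Hypothetical sampling weights combining the two ideas above:
    - diversity: observations trained on less often get higher weight
    - curriculum: observations whose difficulty matches the model's
      current strength get higher weight
    """
    weights = []
    for count, difficulty in zip(train_counts, difficulties):
        diversity = 1.0 / (count + 1.0)                        # under-trained -> up-weight
        curriculum = 1.0 / (abs(difficulty - model_strength) + eps)  # match difficulty
        weights.append(diversity * curriculum)
    return weights

def sample_batch(observations, weights, k, rng=random):
    """Draw a training batch proportionally to the combined weights."""
    return rng.choices(observations, weights=weights, k=k)
```

Under this sketch, an example the model has rarely trained on, at a difficulty near the model's current strength, is sampled most often, which is the balancing behavior the summary describes.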

