Arxiv

Actor-Critic based Online Data Mixing For Language Model Pre-Training

  • A new method for language model pre-training, Actor-Critic based Online Data Mixing (AC-ODM), has been proposed.
  • AC-ODM uses auxiliary actor-critic networks to capture domain weights that change during training, and accounts for intra-domain interactions through its reward function.
  • For efficiency, the actor is trained with a small proxy language model serving as the environment, and is then used as the data sampling strategy.
  • Numerical results show that AC-ODM-410M achieves significantly better convergence and accuracy than existing methods.
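
The idea in the bullets above can be illustrated with a toy sketch. This is not the paper's implementation: the names `ToyProxyEnv` and `train_domain_actor` are hypothetical, the critic is simplified to a single scalar baseline instead of a network, and the reward (the loss reduction on the sampled domain) is an assumption made for illustration. The sketch only shows the general pattern of an actor producing domain sampling weights while a critic supplies the baseline for the update.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array of logits.
    z = np.exp(x - np.max(x))
    return z / z.sum()


class ToyProxyEnv:
    """Hypothetical stand-in for a small proxy LM: one loss value per domain."""

    def __init__(self, initial_loss, decay):
        self.loss = np.array(initial_loss, dtype=float)
        self.decay = np.array(decay, dtype=float)

    def step(self, domain):
        # "Training" on a domain shrinks that domain's loss by a fixed fraction.
        # The reward is the loss reduction (an assumption, not the paper's reward).
        before = self.loss[domain]
        self.loss[domain] *= 1.0 - self.decay[domain]
        return before - self.loss[domain]


def train_domain_actor(reward_fn, num_domains, steps=2000,
                       actor_lr=0.1, critic_lr=0.1, seed=0):
    """Learn domain sampling weights with a minimal actor-critic update."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(num_domains)  # actor parameters (softmax policy)
    value = 0.0                     # critic: scalar estimate of expected reward

    for _ in range(steps):
        probs = softmax(logits)
        domain = rng.choice(num_domains, p=probs)  # sample a domain to train on
        reward = reward_fn(domain)

        # Advantage: how much better this reward was than the critic expected.
        advantage = reward - value

        # Gradient of log pi(domain) w.r.t. the logits is one_hot(domain) - probs.
        grad_log_prob = -probs
        grad_log_prob[domain] += 1.0

        logits += actor_lr * advantage * grad_log_prob  # actor update
        value += critic_lr * advantage                  # critic update

    return softmax(logits)


# Usage: learn mixing weights against the toy proxy environment.
env = ToyProxyEnv(initial_loss=[3.0, 3.0, 3.0], decay=[0.001, 0.01, 0.005])
weights = train_domain_actor(env.step, num_domains=3, steps=500)
print("domain weights:", weights)
```

In this sketch the learned `weights` would be the sampling distribution over domains. The summary describes reusing an actor trained against a small proxy model as the data sampling strategy; how the weights are transferred in practice is not covered here.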
