Source: arXiv

HALoS: Hierarchical Asynchronous Local SGD over Slow Networks for Geo-Distributed Large Language Model Training

  • HALoS is a hierarchical asynchronous optimization framework designed for training large language models (LLMs) in geo-distributed environments.
  • It introduces local parameter servers (LPSs) within each region and a global parameter server (GPS) that merges updates across regions, minimizing costly inter-region communication and improving training efficiency (a minimal sketch of this hierarchy appears after this list).
  • In geo-distributed LLM training, HALoS converges up to 7.5x faster than synchronous baselines and up to 2.1x faster than existing asynchronous methods.
  • The framework maintains model quality while reducing total training time, making it a powerful tool for scalable and efficient training of large language models in heterogeneous, geo-distributed settings.
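To make the hierarchy concrete, below is a minimal single-process sketch of hierarchical asynchronous local SGD in the HALoS style: workers run a few local SGD steps and push deltas to a regional LPS, while LPSs sync with the GPS only occasionally, modeling the slow inter-region links. It uses a toy quadratic loss, and all names here (ParameterServer, worker_round, merge_lr, SYNC_EVERY) are illustrative assumptions, not HALoS's actual implementation or API.

```python
# Minimal single-process sketch of hierarchical asynchronous local SGD.
# Toy quadratic loss; all names are illustrative, not HALoS's API.
import numpy as np

rng = np.random.default_rng(0)
DIM = 10
TARGET = rng.normal(size=DIM)  # optimum of the toy loss

def grad(w):
    # Gradient of 0.5 * ||w - TARGET||^2 plus noise, standing in for
    # a minibatch gradient of an LLM training loss.
    return (w - TARGET) + 0.1 * rng.normal(size=DIM)

class ParameterServer:
    """Holds a model copy and merges incoming deltas without barriers."""
    def __init__(self, w, merge_lr):
        self.w = w.copy()
        self.merge_lr = merge_lr  # damping applied to (possibly stale) deltas

    def push(self, delta):
        self.w += self.merge_lr * delta  # apply immediately, no waiting

    def pull(self):
        return self.w.copy()

def worker_round(lps, local_steps=4, lr=0.1):
    # One worker round: pull from the regional LPS, take a few local
    # SGD steps, then push the accumulated delta back.
    w0 = lps.pull()
    w = w0.copy()
    for _ in range(local_steps):
        w -= lr * grad(w)
    lps.push(w - w0)

# One GPS and two regional LPSs. Intra-region pushes happen every round;
# LPS <-> GPS syncs are rare, modeling the slow inter-region links.
gps = ParameterServer(np.zeros(DIM), merge_lr=0.5)
regions = [ParameterServer(gps.pull(), merge_lr=1.0) for _ in range(2)]

SYNC_EVERY = 5  # LPS <-> GPS sync period, in worker rounds
for t in range(100):
    for lps in regions:
        worker_round(lps)  # regions progress independently
    if (t + 1) % SYNC_EVERY == 0:
        for lps in regions:
            gps.push(lps.w - gps.w)  # send the region's progress upward
            lps.w = gps.pull()       # refresh the region from the global model

print("distance to optimum:", np.linalg.norm(gps.w - TARGET))
```

The two knobs that matter on slow networks show up directly: local_steps controls how often workers talk to their LPS, and SYNC_EVERY controls how often regions pay the expensive cross-region round trip. The merge_lr damping is only a crude stand-in for how HALoS actually handles stale updates.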
