Accelerating AllReduce with a Persistent Straggler

  • Distributed machine learning workloads rely on the AllReduce collective to synchronize gradients or activations during training and inference.
  • A new algorithm called StragglAR has been proposed to accelerate distributed training and inference in the presence of persistent stragglers.
  • StragglAR runs a ReduceScatter among the remaining GPUs during the delay caused by the straggler, achieving a 2x theoretical speedup over popular AllReduce algorithms for large GPU clusters (a simplified sketch of this idea follows the list).
  • On an 8-GPU server, an implementation of StragglAR achieves a 22% speedup over state-of-the-art AllReduce algorithms.
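
The bullets above describe the trick only at a high level. The NumPy sketch below is a toy, in-memory simulation of that idea, not the paper's actual communication schedule: the non-straggler ranks pre-reduce their data with a ReduceScatter among themselves while the straggler is delayed, and the straggler's contribution is folded in once it arrives. The function name, the shard assignment, and the 8-rank setup are illustrative assumptions.

```python
import numpy as np

def straggler_aware_allreduce(buffers, straggler_rank):
    """Toy simulation of the StragglAR idea on in-memory buffers.

    Phase 1 happens while the straggler is delayed: the remaining ranks
    ReduceScatter their data so each helper owns one fully reduced shard.
    Phase 2 happens once the straggler arrives: its data is folded into
    each shard and the completed result is shared with every rank.
    The real algorithm uses a more elaborate GPU communication schedule;
    this only mirrors the high-level structure.
    """
    n = len(buffers)
    helpers = [r for r in range(n) if r != straggler_rank]

    # Phase 1 (overlapped with the straggler's delay):
    # ReduceScatter among the non-straggler ranks.
    shards = np.array_split(np.arange(buffers[0].size), len(helpers))
    partial = {}
    for owner, shard in zip(helpers, shards):
        partial[owner] = sum(buffers[r][shard] for r in helpers)

    # Phase 2 (after the straggler arrives): add the straggler's
    # contribution shard by shard, then give every rank the full result.
    result = np.empty_like(buffers[0])
    for owner, shard in zip(helpers, shards):
        result[shard] = partial[owner] + buffers[straggler_rank][shard]
    return [result.copy() for _ in range(n)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    world = [rng.standard_normal(16) for _ in range(8)]  # 8 simulated "GPUs"
    out = straggler_aware_allreduce(world, straggler_rank=3)
    assert np.allclose(out[0], sum(world))  # matches a plain AllReduce
```

The point of the schedule is that Phase 1 costs nothing extra when a persistent straggler would stall a conventional AllReduce anyway, so only the shorter Phase 2 remains on the critical path after the straggler catches up.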
