We revisit two fundamental decentralized optimization methods, Decentralized Gradient Tracking (DGT) and Decentralized Gradient Descent (DGD), with multiple local updates.
We show that incorporating local update steps can reduce communication complexity for strongly convex and smooth loss functions.
In particular, increasing the number of local updates can effectively reduce communication costs when data heterogeneity is low and the network is well connected.
We further show that DGD with local updates achieves exact linear convergence under the Polyak-Łojasiewicz (PL) condition.
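To illustrate the algorithmic pattern of local updates interleaved with communication, the sketch below runs DGD with multiple local gradient steps per gossip round on a toy decentralized least-squares problem. This is a minimal illustrative sketch, not the paper's setup: the function name `local_dgd`, the mixing matrix `W`, and all parameter choices are assumptions made only for the example.

```python
# Minimal sketch (assumed setup, not the paper's implementation): DGD where
# each agent performs several local gradient steps between communication
# (gossip averaging) rounds with a doubly stochastic mixing matrix W.
import numpy as np

def local_dgd(A, b, W, step=0.01, local_steps=5, rounds=200):
    """Each agent i holds f_i(x) = 0.5 * ||A_i x - b_i||^2 and takes
    `local_steps` gradient steps before one round of gossip averaging."""
    n_agents, _, dim = A.shape
    X = np.zeros((n_agents, dim))           # one iterate per agent (rows)
    for _ in range(rounds):
        for _ in range(local_steps):        # local updates: no communication
            residual = np.einsum('nij,nj->ni', A, X) - b
            grads = np.einsum('nij,nj->ni', A.transpose(0, 2, 1), residual)
            X = X - step * grads
        X = W @ X                           # one communication (gossip) round
    return X

# Toy usage: 4 agents on a ring graph, each with its own local quadratic.
rng = np.random.default_rng(0)
n, m, d = 4, 10, 3
A = rng.standard_normal((n, m, d))
b = rng.standard_normal((n, m))
W = np.array([[0.50, 0.25, 0.00, 0.25],    # symmetric, doubly stochastic
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
X = local_dgd(A, b, W)
print("max disagreement across agents:", np.abs(X - X.mean(axis=0)).max())
```

The inner loop captures the communication savings discussed above: gradient work is done locally, and agents exchange iterates only once per outer round.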