We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.LG

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Machine Learning

Title: Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays

Abstract: Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients") under the coordination of a central server. Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift"). In this work, we propose and analyze Asynchronous Exact Averaging (AREA), a new stochastic (sub)gradient algorithm that utilizes asynchronous communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies. Moreover, AREA is, to the best of our knowledge, the first method that is guaranteed to converge under arbitrarily long delays, without the use of delay-adaptive stepsizes, and (i) for strongly convex, smooth functions, asymptotically converges to an error neighborhood whose size depends only on the variance of the stochastic gradients used with respect to the number of iterations, and (ii) for convex, non-smooth functions, matches the convergence rate of the centralized stochastic subgradient method up to a constant factor, which depends on the average of the individual client update frequencies instead of their minimum (or maximum). Our numerical results validate our theoretical analysis and indicate AREA outperforms state-of-the-art methods when local data are highly non-iid, especially as the number of clients grows.
Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC)
Cite as: arXiv:2405.10123 [cs.LG]
  (or arXiv:2405.10123v2 [cs.LG] for this version)

Submission history

From: Charikleia Iakovidou [view email]
[v1] Thu, 16 May 2024 14:22:49 GMT (3290kb,D)
[v2] Tue, 28 May 2024 18:27:41 GMT (3291kb,D)

Link back to: arXiv, form interface, contact.