Escaping saddle points in heterogeneous federated learning via distributed SGD with communication compression
We consider the problem of finding second-order stationary points in the optimization of heterogeneous federated learning (FL). Previous works in FL mostly focus on first-order convergence guarantees, which do not rule out the scenario of unstable saddle points. Meanwhile, it is a key bottleneck of FL to achieve communication efficiency without compromising learning accuracy, especially when local data are highly heterogeneous across different clients. Given this, we propose a novel algorithm, PowerEF-SGD, that communicates only compressed information via a novel error-feedback scheme. To our knowledge, PowerEF-SGD is the first distributed and compressed SGD algorithm that provably escapes saddle points in heterogeneous FL without any data homogeneity assumptions. In particular, PowerEF-SGD improves to second-order stationary points after visiting first-order (possibly saddle) points, using additional gradient queries and communication rounds of almost the same order as required for first-order convergence, and its convergence rate exhibits a linear speedup in the number of workers. Our theory improves upon or recovers previous results while extending to much more tolerant settings on the local data. Numerical experiments are provided to complement the theory.
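The compressed-communication-with-error-feedback paradigm that the abstract builds on can be illustrated with a short, self-contained sketch. The code below simulates generic distributed SGD with top-k compression and classical error feedback; it is not the paper's PowerEF-SGD (whose compressor and error-feedback scheme are the paper's own contributions), it uses deterministic gradients for simplicity, and all names, constants, and the toy quadratic objectives are illustrative assumptions. It demonstrates only the communication mechanism, not the second-order (saddle-escaping) guarantees established in the paper.

```python
import numpy as np

def top_k(v, k):
    """Contractive compressor: keep the k largest-magnitude entries of v."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def ef_compressed_sgd(grad_fns, x0, lr=0.05, k=5, steps=500):
    """Distributed SGD where each worker i communicates only the compressed
    vector top_k(g_i + e_i); the dropped residual e_i stays local and is
    re-injected into the next round (classical error feedback)."""
    x = x0.copy()
    errors = [np.zeros_like(x0) for _ in grad_fns]  # per-worker residuals
    for _ in range(steps):
        msgs = []
        for i, grad in enumerate(grad_fns):
            g = grad(x) + errors[i]   # error-corrected local gradient
            c = top_k(g, k)           # compress before communication
            errors[i] = g - c         # remember what the compressor dropped
            msgs.append(c)
        x = x - lr * np.mean(msgs, axis=0)  # server averages and steps
    return x

# Toy usage: two workers with heterogeneous quadratic objectives
# f_i(x) = 0.5 * x^T A_i x, so that grad_i(x) = A_i @ x.
rng = np.random.default_rng(0)
A1 = np.diag(rng.uniform(1.0, 2.0, size=20))
A2 = np.diag(rng.uniform(2.0, 3.0, size=20))
grads = [lambda x: A1 @ x, lambda x: A2 @ x]
x_final = ef_compressed_sgd(grads, rng.normal(size=20))
print(np.linalg.norm(x_final))  # should approach 0, the common minimizer
```

The design point worth noting is that each worker stores the compression residual g - top_k(g) locally and re-injects it at the next round; this compensates the bias of a contractive compressor and is what allows aggressive compression without stalling convergence, even when the local objectives are heterogeneous.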
Saved in:
Main Authors: | CHEN, Sijin; LI, Zhize; CHI, Yuejie |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2024 |
Subjects: | Federated learning; Machine learning; Communication compression; Artificial Intelligence and Robotics |
Online Access: | https://ink.library.smu.edu.sg/sis_research/9493 https://ink.library.smu.edu.sg/context/sis_research/article/10493/viewcontent/1_s2.0_S2214140524001233_pvoa_cc_by_nc.pdf |
Published Version: | https://proceedings.mlr.press/v238/chen24d.html |
License: | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
Collection: | Research Collection School Of Computing and Information Systems |
Institution: | Singapore Management University |