Data source selection in federated learning: A submodular optimization approach
Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The nece...
Saved in:
Main Authors: | , , , , , |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2022
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/7219 https://ink.library.smu.edu.sg/context/sis_research/article/8222/viewcontent/dasfaa22_zhang.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Singapore Management University |
Language: | English |
id |
sg-smu-ink.sis_research-8222 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-82222023-08-21T08:28:35Z Data source selection in federated learning: A submodular optimization approach ZHANG, Ruisheng WANG, Yansheng ZHOU, Zimu REN, Ziyao TONG, Yongxin XU, Ke Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The necessity of data source selection is further amplified in presence of data heterogeneity among clients. Prior solutions are either low in efficiency with exponential time cost or lack theoretical guarantees. Inspired by the diminishing marginal accuracy phenomenon in federated learning, we study the problem from the perspective of submodular optimization. In this paper, we aim at efficient data source selection with theoretical guarantees. We prove that data source selection in federated learning is a monotone submodular maximization problem and propose FDSS, an efficient algorithm with a constant approximate ratio. Furthermore, we extend FDSS to FDSS-d for dynamic data source selection. Extensive experiments on CIFAR10 and CIFAR100 validate the efficiency and effectiveness of our algorithms. 2022-04-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7219 info:doi/10.1007/978-3-031-00126-0_43 https://ink.library.smu.edu.sg/context/sis_research/article/8222/viewcontent/dasfaa22_zhang.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Federated learning Data source selection Submodularity Databases and Information Systems Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Federated learning Data source selection Submodularity Databases and Information Systems Software Engineering |
spellingShingle |
Federated learning Data source selection Submodularity Databases and Information Systems Software Engineering ZHANG, Ruisheng WANG, Yansheng ZHOU, Zimu REN, Ziyao TONG, Yongxin XU, Ke Data source selection in federated learning: A submodular optimization approach |
description |
Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The necessity of data source selection is further amplified in presence of data heterogeneity among clients. Prior solutions are either low in efficiency with exponential time cost or lack theoretical guarantees. Inspired by the diminishing marginal accuracy phenomenon in federated learning, we study the problem from the perspective of submodular optimization. In this paper, we aim at efficient data source selection with theoretical guarantees. We prove that data source selection in federated learning is a monotone submodular maximization problem and propose FDSS, an efficient algorithm with a constant approximate ratio. Furthermore, we extend FDSS to FDSS-d for dynamic data source selection. Extensive experiments on CIFAR10 and CIFAR100 validate the efficiency and effectiveness of our algorithms. |
format |
text |
author |
ZHANG, Ruisheng WANG, Yansheng ZHOU, Zimu REN, Ziyao TONG, Yongxin XU, Ke |
author_facet |
ZHANG, Ruisheng WANG, Yansheng ZHOU, Zimu REN, Ziyao TONG, Yongxin XU, Ke |
author_sort |
ZHANG, Ruisheng |
title |
Data source selection in federated learning: A submodular optimization approach |
title_short |
Data source selection in federated learning: A submodular optimization approach |
title_full |
Data source selection in federated learning: A submodular optimization approach |
title_fullStr |
Data source selection in federated learning: A submodular optimization approach |
title_full_unstemmed |
Data source selection in federated learning: A submodular optimization approach |
title_sort |
data source selection in federated learning: a submodular optimization approach |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2022 |
url |
https://ink.library.smu.edu.sg/sis_research/7219 https://ink.library.smu.edu.sg/context/sis_research/article/8222/viewcontent/dasfaa22_zhang.pdf |
_version_ |
1779156950985998336 |