Data source selection in federated learning: A submodular optimization approach

Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The nece...

Full description

Saved in:
Bibliographic Details
Main Authors: ZHANG, Ruisheng, WANG, Yansheng, ZHOU, Zimu, REN, Ziyao, TONG, Yongxin, XU, Ke
Format: text
Language:English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/7219
https://ink.library.smu.edu.sg/context/sis_research/article/8222/viewcontent/dasfaa22_zhang.pdf
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-8222
record_format dspace
spelling sg-smu-ink.sis_research-82222023-08-21T08:28:35Z Data source selection in federated learning: A submodular optimization approach ZHANG, Ruisheng WANG, Yansheng ZHOU, Zimu REN, Ziyao TONG, Yongxin XU, Ke Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The necessity of data source selection is further amplified in presence of data heterogeneity among clients. Prior solutions are either low in efficiency with exponential time cost or lack theoretical guarantees. Inspired by the diminishing marginal accuracy phenomenon in federated learning, we study the problem from the perspective of submodular optimization. In this paper, we aim at efficient data source selection with theoretical guarantees. We prove that data source selection in federated learning is a monotone submodular maximization problem and propose FDSS, an efficient algorithm with a constant approximate ratio. Furthermore, we extend FDSS to FDSS-d for dynamic data source selection. Extensive experiments on CIFAR10 and CIFAR100 validate the efficiency and effectiveness of our algorithms. 2022-04-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7219 info:doi/10.1007/978-3-031-00126-0_43 https://ink.library.smu.edu.sg/context/sis_research/article/8222/viewcontent/dasfaa22_zhang.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Federated learning Data source selection Submodularity Databases and Information Systems Software Engineering
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Federated learning
Data source selection
Submodularity
Databases and Information Systems
Software Engineering
spellingShingle Federated learning
Data source selection
Submodularity
Databases and Information Systems
Software Engineering
ZHANG, Ruisheng
WANG, Yansheng
ZHOU, Zimu
REN, Ziyao
TONG, Yongxin
XU, Ke
Data source selection in federated learning: A submodular optimization approach
description Federated learning is a new learning paradigm that jointly trains a model from multiple data sources without sharing raw data. For the practical deployment of federated learning, data source selection is compulsory due to the limited communication cost and budget in real-world applications. The necessity of data source selection is further amplified in presence of data heterogeneity among clients. Prior solutions are either low in efficiency with exponential time cost or lack theoretical guarantees. Inspired by the diminishing marginal accuracy phenomenon in federated learning, we study the problem from the perspective of submodular optimization. In this paper, we aim at efficient data source selection with theoretical guarantees. We prove that data source selection in federated learning is a monotone submodular maximization problem and propose FDSS, an efficient algorithm with a constant approximate ratio. Furthermore, we extend FDSS to FDSS-d for dynamic data source selection. Extensive experiments on CIFAR10 and CIFAR100 validate the efficiency and effectiveness of our algorithms.
format text
author ZHANG, Ruisheng
WANG, Yansheng
ZHOU, Zimu
REN, Ziyao
TONG, Yongxin
XU, Ke
author_facet ZHANG, Ruisheng
WANG, Yansheng
ZHOU, Zimu
REN, Ziyao
TONG, Yongxin
XU, Ke
author_sort ZHANG, Ruisheng
title Data source selection in federated learning: A submodular optimization approach
title_short Data source selection in federated learning: A submodular optimization approach
title_full Data source selection in federated learning: A submodular optimization approach
title_fullStr Data source selection in federated learning: A submodular optimization approach
title_full_unstemmed Data source selection in federated learning: A submodular optimization approach
title_sort data source selection in federated learning: a submodular optimization approach
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
url https://ink.library.smu.edu.sg/sis_research/7219
https://ink.library.smu.edu.sg/context/sis_research/article/8222/viewcontent/dasfaa22_zhang.pdf
_version_ 1779156950985998336