Joint client-and-sample selection for federated learning via bi-level optimization

Federated Learning (FL) enables a large number of local data owners to collaboratively train a deep learning model without disclosing their private data. The importance of local data samples from various data owners to FL models varies widely. This is exacerbated by the presence of noisy data that exhibit large losses similar to important (hard) samples. Currently, no FL approach can effectively distinguish hard samples (which are beneficial) from noisy samples (which are harmful). To bridge this gap, we propose the joint Federated Meta-Weighting based Client and Sample Selection (FedMW-CSS) approach to simultaneously perform hard sample selection and mitigate label noise. It is a bi-level optimization approach for FL client-and-sample selection and global model construction, designed to achieve hard-sample-aware, noise-robust learning in a privacy-preserving manner. It performs meta-learning-based online approximation to iteratively update the global FL model, select the most positively influential samples, and handle training data noise. To utilize both instance-level and class-level information for better performance, FedMW-CSS efficiently learns class-level weights by manipulating gradients at the class level, e.g., it performs a gradient descent step on the class-level weights that relies only on intermediate gradients. Theoretically, we analyze the privacy guarantees and convergence of FedMW-CSS. Extensive experimental comparisons against eight state-of-the-art baselines on six real-world datasets, in the presence of data noise and heterogeneity, show that FedMW-CSS achieves up to 28.5% higher test accuracy while reducing communication and computation costs by at least 49.3% and 1.2%, respectively.
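To make the method description above concrete: the abstract describes a bi-level, meta-learning formulation in which class-level sample weights are learned by differentiating a meta objective through a one-step online approximation of the model update. A standard way to write this pattern (our reading of the abstract, not necessarily the paper's exact objective) is:

```latex
\begin{aligned}
&\min_{w \ge 0}\ \mathcal{L}_{\mathrm{meta}}\big(\hat\theta(w)\big),
\qquad
\hat\theta(w) = \theta_t - \alpha \nabla_\theta \sum_i w_{c(i)}\, \ell_i(\theta_t), \\
&w_{t+1} = w_t - \beta\, \nabla_w \mathcal{L}_{\mathrm{meta}}\big(\hat\theta(w)\big)\Big|_{w = w_t},
\end{aligned}
```

where $\theta_t$ is the current model, $\ell_i$ the training loss of sample $i$, $c(i)$ its class, $w$ the class-level weights, and $\mathcal{L}_{\mathrm{meta}}$ is evaluated on a small trusted set. The sketch below is a minimal, centralized PyTorch illustration of this one-step approximation, assuming a linear model and a hypothetical clean meta set (`meta_x`, `meta_y`); it is not the authors' FedMW-CSS implementation, which additionally performs client-and-sample selection, federation, and privacy protection.

```python
import torch
import torch.nn.functional as F

# Sketch of one-step meta-weighting (bi-level) optimization: class-level
# weights are tuned by differentiating a clean "meta" loss through a single
# inner SGD step. Illustrative assumption, not the paper's actual code.

torch.manual_seed(0)
n_classes, dim = 3, 5
W = torch.zeros(n_classes, dim, requires_grad=True)   # linear model parameters
class_w = torch.ones(n_classes, requires_grad=True)   # class-level weights

# Noisy training data and a small trusted meta (validation) set.
train_x, train_y = torch.randn(64, dim), torch.randint(0, n_classes, (64,))
meta_x, meta_y = torch.randn(16, dim), torch.randint(0, n_classes, (16,))

alpha, beta = 0.1, 0.1  # inner (model) and outer (weight) step sizes

for step in range(100):
    # Inner problem: class-weighted training loss; keep the graph so the
    # one-step lookahead model stays differentiable w.r.t. class_w.
    per_sample = F.cross_entropy(train_x @ W.t(), train_y, reduction="none")
    train_loss = (class_w[train_y] * per_sample).mean()
    grad_W = torch.autograd.grad(train_loss, W, create_graph=True)[0]
    W_hat = W - alpha * grad_W  # virtual one-step model update

    # Outer problem: unweighted loss on the trusted meta set drives the
    # class-level weights, using only the intermediate gradient grad_W.
    meta_loss = F.cross_entropy(meta_x @ W_hat.t(), meta_y)
    grad_cw = torch.autograd.grad(meta_loss, class_w)[0]

    with torch.no_grad():
        class_w -= beta * grad_cw
        class_w.clamp_(min=0.0)   # keep weights non-negative
        W -= alpha * grad_W       # commit the inner update
```

Note how the gradient step on `class_w` relies only on the intermediate gradient `grad_W`, matching the abstract's remark that the class-level update only relies on intermediate gradients.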

Bibliographic Details
Main Authors: Li, Anran, Wang, Guangjing, Hu, Ming, Sun, Jianfei, Zhang, Lan, Tuan, Luu Anh, Yu, Han
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Bi-level optimization; Federated learning
Online Access:https://hdl.handle.net/10356/181061
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-181061
Record Format: dspace
Type: Journal Article
Citation: Li, A., Wang, G., Hu, M., Sun, J., Zhang, L., Tuan, L. A. & Yu, H. (2024). Joint client-and-sample selection for federated learning via bi-level optimization. IEEE Transactions on Mobile Computing, 23(12), 15196-15209. https://dx.doi.org/10.1109/TMC.2024.3455331
Journal: IEEE Transactions on Mobile Computing, vol. 23, no. 12, pp. 15196-15209
ISSN: 1536-1233
DOI: 10.1109/TMC.2024.3455331
Scopus ID: 2-s2.0-85203416646
Handle: https://hdl.handle.net/10356/181061
Date Issued: 2024
Date Available: 2024-11-13
Funding Agencies: Agency for Science, Technology and Research (A*STAR); Nanyang Technological University; National Research Foundation (NRF)
Funding Text: This research was supported in part by Nanyang Technological University (NTU) under Grant 020724-00001; in part by the RIE2025 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) under Grant I2301E0026, administered by A*STAR, and supported in part by Alibaba Group and NTU Singapore; in part by the National Research Foundation, Singapore, and DSO National Laboratories under the AI Singapore Programme (AISG) under Grant AISG2-RP-2020-019; in part by the National Key R&D Program of China under Grant 2021YFB2900103; in part by the National Natural Science Foundation of China under Grant 61932016; and in part by the Fundamental Research Funds for the Central Universities under Grant WK2150110024.
Grant Numbers: 020724-00001; I2301E0026; AISG2-RP-2020-019
Collection: DR-NTU (NTU Library, Nanyang Technological University, Singapore)
Rights: © 2024 IEEE. All rights reserved.